Click to See Complete Forum and Search --> : Duplicates
Vanny
June 21st, 2005, 01:14 AM
I am writing some code to go through and pick up duplicate records, and display them for the end user to the validate (keep or delete)
When I only need to check one condition say first name it is quite easy, but is there any tricky SQL that can be used if i want to check for same first name, surname &/OR company.
All the records are in the same table.
Table Details (MS ACCESS DB - not by my choice)
ID FIRST NAME SURNAME etc
1 Bill Smith
2 Bill Jones
3 Steve Smith
4 Bill Smith
Now I know that checks should be done on entering data, but there are quite logical reasons that there may be duplicates in the database (eg different people with a common name).
Any help would be appreciated. This SQL will then be plugged into an ASP page.
JeffB
June 21st, 2005, 01:38 AM
Something like... :
SELECT details.FIRST_NAME, details.SUR_NAME, details.ID
FROM details
WHERE (((details.FIRST_NAME) In
(SELECT FIRST_NAME FROM details As Tmp GROUP BY FIRST_NAME,SUR_NAME HAVING Count(*)>1 And SUR_NAME = details.SUR_NAME)))
ORDER BY details.FIRST_NAME, details.SUR_NAME
Hope it helps (and hope it works :D)
JeffB
cjard
June 21st, 2005, 10:51 AM
jeff.. it will cause non duplicated records to be returned too
consider
bill smith
bill jones
bill evans
bill billson
of you ask the db to count and group by bill.. it will report 4, hence > 1, but none of these are duplicates, yet your query will return them
the spirit is correct thouhg:
select *
from
people p,
(select firstname, surname, company from people
group by firstname,surname,company
having count(*) > 1) dupfind
where
(p.firstname = dupfind.firstname and p.surname = dupfind.surname)
or (p.firstname = dupfind.firstname and p.company = dupfind.company)
or (p.surname = dupfind.surname and p.company = dupfind.company)
the original spec of:
first name, surname &/OR company
is a bit woolly? i cant quite make sense of it, rewrite it in logical computer terms
JeffB
June 21st, 2005, 02:11 PM
Hum, I tried it and it works, the key is that the GROUP BY is done on both field:
GROUP BY FIRST_NAME,SUR_NAME
And in:
And SUR_NAME = details.SUR_NAME
Considering the second query table is named Tmp.
So if there is two bill jones, return both record, if there is hundred of bill with different surname, return none of them.
JeffB
cjard
June 21st, 2005, 05:35 PM
indeed; i didnt see the surname in the group by ;)
its a bit of an odd way to write sql, but it probably doesnt compile to anything different than mine
Vanny
June 21st, 2005, 07:33 PM
The First Name, Surname &/OR Company,
Means that the user can select any of these for their search. Simply put
- First Name
- First Name, Surname
- First Name, Company
- Surname, Company
- First Name, Surname, Company
- Company
I have ended up writing an ASP script to handle it, this allows me some more indepth processing when finding a match, i was just hoping that there was a simple method. I have used some of the above suggestions, so thank you, I have just written a SQL statement for each of the above options.
cjard
June 23rd, 2005, 04:57 AM
that's what you'd have to do, really..
though i suppose you could use a grouping set or a cube.. i dont use grouping sets though..
and i'd have been tempted to break them all out anyway, a grouping set would return a lot of data,.. possibly too much to handle humanly
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.