iTunes SQL Oct31 '04

Show Duplicate Songs

The latest version of iTunes, as of this date, is 4.7. This version contains a feature called "Show Duplicate Songs."

This is located under the Edit menu:

Edit, Show Duplicate Songs

This is a great feature, because often times duplicate songs start to appear in iTunes playlists, for one reason or another. Sometimes the same song is on different CDs, and it’s hard to remember if you already have that song, etc.

How do the results happen?

Even though this is a great feature, I’m more interested in the SQL behind it.

I’m curious as to how the results of "Show Duplicate Songs" is accomplished.

It looks as though iTunes is comparing the "Song Name," and "Artist" columns, which makes sense, because duplicate songs would obviously have the same names in both columns.

For example, take a look at some of the results of "Show Duplicate Songs" on my computer, as of today:

My duplicate songs

As you can see, it lists the songs that are duplicates, and that have the same name in the "Song Name" column, as well as the "Artist" column.

This is an effective approach to use, as it should find most duplicates. The only way it would miss a duplicate is if the song wasn’t named properly, etc.

Real world SQL

But, if you had to guess, what would the SQL query look like, to produce such a result? If you think about it, it should be a very common query – comparing two columns to find duplicate records.

I know I’ve come across this situation many times, in my own SQL work.

And, in databases, it’s quite common to have a column that pertains to a person’s first name, and a person’s last name.

For the sake of this example, let’s say we wanted to find duplicate names in a fictional database with a "First Name" column, and a "Last Name" column. How would we do that?

Well, we could write a query that pertains to a specific person:

SELECT first_name, last_name
FROM persons_table
WHERE first_name = 'Matt'
AND last_name = 'Thommes';

This query should display all rows that meet the criteria – a person with a first name of "Matt," and a last name of "Thommes." If there are duplicates, we should be able to spot them quickly.

Even though this query works fine for that specific person, it is not very efficient at all.

In order for us to find the duplicate names of a database that has 10,000 rows, we’d have to write 10,000 separate queries – one for each name in the database.

Good grief.

There has to be a way to find the duplicate names, without using a specific name – but to have it show ALL the names that are duplicated – with one query.

In other words, we want to compare the first_name column with the last_name column – arbitrarily – meaning we don’t want to specify WHO, we just want to specify HOW.

I don’t know this query

I don’t know this query. I am asking for help.

If I had to attempt to write this query, this is how I would start:

SELECT first_name, last_name
FROM persons_table
WHERE first_name = first_name
AND last_name = last_name;

That’s a pathetic attempt. It makes absolutely no sense at all.

Something tells me DISTINCT would be useful in the query, but I’m not sure how.

Categories: Software , SQL

Add Feedback (view all)

Leave feedback

Feedback

Input format: The editor controls below will assist with Markdown syntax.

Status

Sub-status

Your info

SELECT first_name, last_name, count(first_name) AS count, count(last_name) AS count2 FROM persons_table GROUP BY last_name having count ... Read more.

opps. Change count2 > 2 to count2 > 1 ... Read more.

Excellent. Thanks. ... Read more.

It's actually simpler than that: select count(*), first, last from yourtable group by first, last having count(*) > 1 ... Read more.

slight improvement to previous comments select * from personstable where (firstname, lastname) in ( select firstname, ... Read more.

matthom is published and produced by Matt Thommes - an independent publishing enthusiast, mobile blogger, content creator, informative writer, web developer from Chicago. Never one to conform, Matt intends to promote the effect the web has on our lives, in an effort to intensify, instruct, and clarify all that is happening around us.

Contact Matt

Similar Entries

Stats

128 unique visits since August 2008

Syndicate

Advertisements