MySQL FULLTEXT searches May12 '05

Prior to the release of MySQL 4.0, searching for keywords within text or char fields required some complicated techniques.

For me, I would just write the "poor man’s" search, which looked something like this:

SELECT field FROM table WHERE field LIKE '%keyword%';

This method is extremely ineffective, and has many short–comings.

The LIKE function

The LIKE function (used above) in MySQL only searches for patterns of words, much like a regular expression would. For example, a search for the word "and" would return all rows with the pattern "and" in the field.

It would return entries such as "sand," "hand," "anderson," etc. You get the idea.

This is typically not ideal to a search situation.

FULLTEXT searches

Really what we want is an "aggregate synopsis of all the entries with that specific keyword," and have them ordered by how relevant they are.

And, for keyword searches such as "and," we don’t want any results returned, because "and" is such a vague and short word – there are far too many matches! A search shouldn’t produce a zillion matches. It should be concise.

Enter FULLTEXT searches.

Each search conducted with FULLTEXT syntax will provide extremely accurate results, based on relevancy. And – if a keyword is too short (three letters, or less), or if a keyword produces results that exist in more than 50% of the rows – the search request returns nothing, which is good!

The LIKE function, on the other hand, would return every single row, if the keyword pattern was in each. Also, it the keyword was just one letter, the LIKE function would still give us a zillion rows with that single letter in it, which is not helpful at all.

The syntax for FULLTEXT searches looks something like this:

SELECT MATCH (headline, entry) AGAINST ('keyword') AS score, id, headline, entry FROM table WHERE MATCH (headline, entry) AGAINST ('keyword') ORDER BY score DESC;

This query could produce something like this, if the keyword was php:

Notice the first column, score. This is the relevancy rating that MySQL applies to each row returned. It’s a complex mathematical calculation that MySQL performs. The higher the relevancy rating, the more relevant the row is.

For our search results example, the higher relevancy ratings indicate an entry with a great match, according to our search keyword.

Try it out!

I just applied this technique to my search form, for this site. Try out my search form, for fast, accurate results.

And no more sand.

Categories: SQL

Add Feedback (view all)

Leave feedback

Feedback

Input format: The editor controls below will assist with Markdown syntax.

Status

Sub-status

Your info

Thanks for this post Matt, I will definitely use it! The new site is looking good too! ... Read more.

matthom is published and produced by Matt Thommes - an independent publishing enthusiast, mobile blogger, content creator, informative writer, web developer from Chicago. Never one to conform, Matt intends to promote the effect the web has on our lives, in an effort to intensify, instruct, and clarify all that is happening around us.

Contact Matt

Similar Entries

Stats

51 unique visits since August 2008

Syndicate

Advertisements