SOUNDS LIKE vs. MATCH AGAINST

A friend of mine asked me:

I’m hoping you can help me out with something: I’m trying to optimize a search feature. It uses a MySQL database, and the search already uses the LIKE operator to get matches for a search query, but we might need something more flexible. I found mention on MySQL’s website of something called the SOUNDS LIKE expression that can be more flexible than LIKE. Do you know anything about this? If you do, can you point me in a direction where I might be able to learn more about it? Thanks in advance for your help!

My response:

I haven’t used it, but the MySQL manual’s pretty good, so here are my thoughts:

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_sounds_like
says:

expr1 SOUNDS LIKE expr2

This is the same as SOUNDEX(expr1) = SOUNDEX(expr2).

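So let’s say you’re searching for “Sheeri”. You’d do
WHERE field SOUNDS LIKE "%Sheeri%";

And maybe you’re hoping to get fields that contain “Sheeri” and “cheery”. However, what this will actually do is
WHERE SOUNDEX(field) = SOUNDEX('%Sheeri%');
which compares the soundex of the entire field with the soundex of the search term. SOUNDEX() ignores non-alphabetic characters, so the % signs are not wildcards here; a row matches only if the whole field sounds like “Sheeri”.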
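It can help to look at the soundex values directly. This is a small sketch that should run in any mysql client; the values in the comments follow from the Soundex rules (the first letter is kept, vowels and “h” are dropped, “r” is coded as 6), and “Sherry” is just an extra name I picked for contrast:

```sql
-- SOUNDEX() ignores non-alphabetic characters, so the % signs change nothing.
SELECT SOUNDEX('Sheeri');      -- S600
SELECT SOUNDEX('%Sheeri%');    -- S600
SELECT SOUNDEX('cheery');      -- C600

-- SOUNDS LIKE is just a comparison of those values.
SELECT 'Sheeri' SOUNDS LIKE 'cheery';   -- 0, because S600 <> C600
SELECT 'Sheeri' SOUNDS LIKE 'Sherry';   -- 1, both are S600
```

Because Soundex keeps the first letter, “Sheeri” and “cheery” would not match each other even when each one is the entire contents of the field.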

There is some important information in the manual here:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex

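which says that SOUNDEX() returns an arbitrarily long string, and that you can use SUBSTRING() on the result to get a standard four-character soundex string. That does not help here, though: you want to search on some part of a field, and you don’t know where in that field the string might be.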

One of the biggest problems with a search feature using LIKE is that inevitably you use search terms like
WHERE field LIKE "%Sheeri%";

MySQL can use an index on a text field, but the index is ordered by the first character, then the second character, and so on. It works just as you search for a word in a dictionary: first you get to the section for the first letter, then flip pages to get to the second letter, etc.

However, just as it would be impossible for you to search a dictionary for all words ending in “th”, MySQL cannot use an index on a string/text field if you have a wildcard at the beginning of your comparison. MySQL can use an index if you search

WHERE field LIKE "Sheeri%";
and
WHERE field LIKE "Sh%ri%";
but not
WHERE field LIKE "%Sheeri";
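One way to check this on your own data is EXPLAIN. The sketch below uses a hypothetical table (people, idx_name and the columns are names I made up for illustration); the exact plan depends on your MySQL version and how many rows the table holds, so treat the comments as what I would typically expect rather than guaranteed output:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE people (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  bio  TEXT,
  INDEX idx_name (name)
);

-- Fixed prefix: the optimizer can narrow the search to a range of idx_name.
EXPLAIN SELECT * FROM people WHERE name LIKE 'Sheeri%';

-- Leading wildcard: there is no prefix to look up, so expect a full scan.
EXPLAIN SELECT * FROM people WHERE name LIKE '%Sheeri';
```

In the EXPLAIN output, compare the type and key columns of the two queries: a range access on idx_name for the first, and usually ALL with no key for the second.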

I think what you really want is a fulltext search: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

In MySQL 5.0, FULLTEXT search can only be done against MyISAM tables (InnoDB gained FULLTEXT indexes in MySQL 5.6). With FULLTEXT search you can sort rows in order of relevance.
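A minimal sketch of what that could look like, assuming a hypothetical articles table (the table and column names are mine, not from your application):

```sql
-- Hypothetical table; MyISAM because that is what MySQL 5.0 requires for FULLTEXT.
CREATE TABLE articles (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(200),
  body  TEXT,
  FULLTEXT (title, body)
) ENGINE=MyISAM;

-- MATCH() returns a relevance score; its column list must match the index.
SELECT id, title,
       MATCH (title, body) AGAINST ('Sheeri') AS score
FROM articles
WHERE MATCH (title, body) AGAINST ('Sheeri')
ORDER BY score DESC;
```

On a tiny test table this may return nothing, since natural-language searches on MyISAM ignore words that appear in half or more of the rows.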

FULLTEXT search on its own won’t address the issue where someone spells something wrong in a search, but the WITH QUERY EXPANSION mode can help with that. The manual at
http://dev.mysql.com/doc/refman/5.0/en/fulltext-query-expansion.html

says it best:


It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few most highly relevant documents from the first search. Thus, if one of these documents contains the word “databases” and the word “MySQL”, the second search finds the documents that contain the word “MySQL” even if they do not contain the word “database”.
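In a query, that mode is just a modifier inside AGAINST(). A sketch, again assuming a hypothetical articles table with a FULLTEXT index on (title, body):

```sql
-- Hypothetical table, created only if it is not already there.
CREATE TABLE IF NOT EXISTS articles (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(200),
  body  TEXT,
  FULLTEXT (title, body)
) ENGINE=MyISAM;

-- Second pass also searches for words taken from the top matches of the first pass.
SELECT id, title
FROM articles
WHERE MATCH (title, body)
      AGAINST ('database' WITH QUERY EXPANSION);
```

Since the second pass pulls in extra words, expect more (and noisier) results than a plain search, so it tends to suit short search phrases best.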

Make sure you read all the gotchas at:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-restrictions.html

I hope this helps!

Sheeri’s feed of articles from the pythian group


Major Donor Fundraising

The largest revenue category in most nonprofit budgets comes from those most dedicated and prized stakeholders – major donors. These donors make large individual contributions, thousands, and even millions of dollars to support the work of groups about which they are passionate. We will learn how to find these crucial supporters, develop your group’s relationship with them, and to work with them to expand the reach of your work and your budget. Familiarity with the basics of fundraising is recommended prior to taking this class. Limited to twenty students.

6:00 pm – 7:45 pm
Dates To Be Announced

How to Start a Planned Giving Program

Many nonprofit fundraisers have heard of planned giving, easy-to-implement ways of incorporating this powerful development tool into your work. No prior experience with planned giving is required. Limited to twenty students.

6:00 pm – 7:45 pm
Dates To Be Announced

Forging the Board of Your Dreams

An effective board of directors is essential to a healthy organization. Executive Directors and Boards must work together, and understand the keys to achieving shared success. This workshop will teach participants:


  • The roles and responsibilities of the board

  • The role of board committees

  • Roles and expectations for individual board members

  • The role of the board versus that of paid staff

  • Efficient board structure and management

  • How to develop leaders and an ongoing leadership succession

We will also cover the do’s and don’ts of effective board operations. This seminar will provide board members and executive staff with the tools and knowledge necessary to build the effective board that you all desire.

Tuition: $75.00
9:15 am – 12:15 pm
One Monday Morning Session: 2/4