python - Using a soundex feature with a Django search engine


I'm building a search engine for a Django/Python site. One requirement is a soundex feature: if someone searches for "smith" or "johnson", the search should also return homonyms such as "smyth" or "jonsen". The database is MySQL, FWIW.

What's the recommended approach? Right now I'm leaning towards Haystack + Whoosh, but I still need to capture the soundex feature.

Thanks in advance for any help.

MySQL has a SOUNDEX() function (see the MySQL docs). The Soundex algorithm was developed to aid in searching Anglo-Saxon names in English. It's not the best choice these days.
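To make the matching concrete, here is a minimal sketch of the classic American Soundex rules in Python. It is an illustration only: the function name `soundex` is my own, and MySQL's SOUNDEX() uses a slightly different variant, so its output may not match this in every case.

```python
# Letter groups for American Soundex; vowels, h, w and y have no code.
CODES = {}
for digit, group in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(name):
    """Return a 4-character Soundex key, or "" if name has no ASCII letters."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (key + "000")[:4]  # pad with zeros, truncate to 4 characters


print(soundex("smith"), soundex("smyth"))      # both S530
print(soundex("johnson"), soundex("jonsen"))   # both J525
```

Names that share a key are treated as sounding alike, which is why "smith" finds "smyth".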

You're better off with either Metaphone or Double Metaphone.

In either case, people usually store the result. That makes it easy to index, and searching is pretty fast.

Data integrity is a problem, though. Ideally, I'd want something like this.

    CREATE TABLE persons (
      ...
      last_name VARCHAR(25) NOT NULL,
      last_name_phonetic VARCHAR(6) NOT NULL,  -- not sure about the length
      CHECK (last_name_phonetic = double_metaphone(last_name))
      ...
    );

But that requires the DBMS to have either an intrinsic double_metaphone() function, or support for user-defined functions in CHECK() constraints. MySQL didn't enforce CHECK() constraints at all before 8.0.16, and newer versions still don't allow stored functions in them, so you'd need to implement it in triggers if your application needs that kind of data integrity.
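The store-and-index idea with trigger-maintained integrity can be sketched as follows. This is not MySQL: it uses Python's built-in sqlite3 module as a stand-in so the example runs anywhere, and `phonetic_key` is a compact Soundex standing in for Double Metaphone. The table layout, trigger names, and the `search` helper are all assumptions for illustration.

```python
import sqlite3

# Stand-in phonetic function (compact American Soundex). Swap in a
# Metaphone or Double Metaphone implementation if you have one.
_CODES = {ch: str(d)
          for d, grp in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1)
          for ch in grp}


def phonetic_key(name):
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key, prev = letters[0].upper(), _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code
    return (key + "000")[:4]


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL so the triggers can call it.
conn.create_function("phonetic_key", 1, phonetic_key)
conn.executescript("""
CREATE TABLE persons (
    id INTEGER PRIMARY KEY,
    last_name TEXT NOT NULL,
    last_name_phonetic TEXT NOT NULL DEFAULT ''
);
CREATE INDEX persons_phonetic_idx ON persons (last_name_phonetic);

-- Keep the stored key in step with last_name on insert and on rename.
CREATE TRIGGER persons_phonetic_ai AFTER INSERT ON persons
BEGIN
    UPDATE persons SET last_name_phonetic = phonetic_key(NEW.last_name)
    WHERE id = NEW.id;
END;
CREATE TRIGGER persons_phonetic_au AFTER UPDATE OF last_name ON persons
BEGIN
    UPDATE persons SET last_name_phonetic = phonetic_key(NEW.last_name)
    WHERE id = NEW.id;
END;
""")

conn.executemany("INSERT INTO persons (last_name) VALUES (?)",
                 [("Smith",), ("Smyth",), ("Johnson",), ("Jonsen",), ("Brown",)])


def search(term):
    """Return last names whose stored key equals the key of the search term."""
    rows = conn.execute(
        "SELECT last_name FROM persons WHERE last_name_phonetic = ? "
        "ORDER BY last_name",
        (phonetic_key(term),))
    return [row[0] for row in rows]


print(search("smith"))
print(search("jonson"))
```

The lookup is a plain equality test on an indexed column, which is what keeps searching fast. In MySQL the same shape would need BEFORE INSERT and BEFORE UPDATE triggers calling a stored function you supply yourself.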

For what it's worth, PostgreSQL has a contrib module, fuzzystrmatch, that implements soundex, metaphone, double metaphone, and Levenshtein distance functions. If it were me, I'd build this in PostgreSQL rather than MySQL.

