Implementation of Lawrence Philips’ Metaphone and Double Metaphone algorithms.
Metaphone encodes names into a phonetic form such that similar-sounding names have the same or similar Metaphone encodings.
The original system was described by Lawrence Philips in Computer Language Vol. 7 No. 12, December 1990, pp 39-43.
There are multiple implementations of Metaphone, each with its own quirks; I have based this one on my interpretation of the algorithm specification. Even LP’s original BASIC implementation appears to contain bugs (specifically in the handling of CC and MB) when compared with his explanation of the algorithm.
I have also compared this implementation with that found in PHP’s standard library, which appears to mimic the behaviour of LP’s original BASIC implementation. For compatibility, these rules can also be used by passing :alternate=>true to the methods.
The Double Metaphone algorithm was originally published in the June 2000 issue of C/C++ Users Journal. The Ruby version is based on Stephen Woodbridge’s PHP version: swoodbridge.com/DoubleMetaPhone/
Ruby implementation of Double Metaphone based on work by Paul Battley (pbattley@gmail.com).
Metaphone rules. These are simply applied in order.
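To illustrate what "applied in order" means, here is a minimal sketch of an ordered rule table. The constant `TOY_RULES` and the method `apply_rules` are hypothetical names, and the five rules are a small hand-picked subset for illustration, not the rule table this library actually uses.

```ruby
# Toy rule table (an assumption for illustration, NOT the library's rules).
# Each entry is a [pattern, replacement] pair; entries are applied top to bottom.
TOY_RULES = [
  [/\Akn/, 'n'],            # initial KN sounds like N
  [/mb\z/, 'm'],            # final B is silent after M
  [/ph/, 'f'],              # PH sounds like F
  [/ck/, 'k'],              # CK sounds like K
  [/(?<=.)[aeiou]/, '']     # drop vowels unless they start the word
].freeze

# Applies every rule in turn, feeding each rule the output of the previous one.
def apply_rules(word, rules = TOY_RULES)
  rules.reduce(word.downcase) do |w, (pattern, replacement)|
    w.gsub(pattern, replacement)
  end.upcase
end

puts apply_rules('knack')  # prints "NK"
puts apply_rules('phone')  # prints "FN"
puts apply_rules('lamb')   # prints "LM"
```

Because each rule sees the result of the one before it, the position of a rule in the table is part of its meaning; swapping in a different table (such as an alternate rule set) changes the encoding without changing the driver method.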
The rules for the ‘buggy’ alternate implementation used by PHP etc.
Returns the primary and secondary double metaphone tokens (the secondary will be nil if equal to the primary).
# File lib/english/metaphone.rb, line 99
def self.double_metaphone(string)
  string = string.to_s
  @db_memo ||= Hash.new
  # Reuse the cached result when this string has been seen before.
  @db_memo[string] ||= calculate_double_metaphone(string)
end
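One possible way to use a primary/secondary pair is to treat two names as a match when they share any non-nil code. The sketch below assumes each argument is a two-element array shaped like the return value described above; `codes_match?` is a hypothetical helper, and the code pairs are hand-written illustrative values rather than output captured from this library.

```ruby
# Hypothetical helper: true when two [primary, secondary] pairs share a code.
# compact drops a nil secondary so that two nils never count as a match.
def codes_match?(pair_a, pair_b)
  (pair_a.compact & pair_b.compact).any?
end

smith   = ['SM0', 'XMT']  # illustrative values only
schmidt = ['XMT', 'SMT']  # illustrative values only
jones   = ['JNS', nil]    # secondary is nil when equal to the primary

puts codes_match?(smith, schmidt)  # prints "true"
puts codes_match?(smith, jones)    # prints "false"
```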
Returns the Metaphone representation of a string. If the string contains multiple words, each word in turn is converted into its Metaphone representation. Note that only the letters A-Z are supported, so any language-specific processing should be done beforehand.
If alt is set to true, the alternate ‘buggy’ rules are used.
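Since only A-Z is supported, accented input needs to be reduced to plain letters first. The following is a rough sketch of one such preprocessing step, assuming UTF-8 input; `ascii_letters` is a hypothetical helper that is not part of this library, and real language-specific handling (for example, German ß) would need extra rules.

```ruby
# Hypothetical preprocessing helper (not part of this library).
def ascii_letters(string)
  string.to_s
        .unicode_normalize(:nfd)   # split accented letters into base letter + combining mark
        .gsub(/\p{Mn}/, '')        # drop the combining marks
        .gsub(/[^A-Za-z\s]/, '')   # drop anything that is still not A-Z or whitespace
end

puts ascii_letters("Jos\u00e9 M\u00fcller")  # prints "Jose Muller"
```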
# File lib/english/metaphone.rb, line 80
def self.metaphone(string, alt=nil)
  string.to_s.strip.split(/\s+/).map { |w| metaphone_word(w, alt) }.join(' ')
end
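The word-by-word behaviour can be seen in isolation with a stand-in encoder. In this sketch `encode_phrase` is a hypothetical method that mirrors the strip/split/map/join pipeline above, and the block passed to it is a placeholder, not a real Metaphone encoder.

```ruby
# Hypothetical stand-alone version of the pipeline; the block plays the
# role of the per-word encoder.
def encode_phrase(string)
  string.to_s.strip.split(/\s+/).map { |word| yield(word) }.join(' ')
end

# Placeholder encoder: the first two letters, upper-cased.
result = encode_phrase("  hello   world ") { |word| word[0, 2].upcase }
puts result  # prints "HE WO"
```

Leading, trailing, and repeated whitespace is discarded, so the output always has exactly one space between encoded words, and a nil or empty input yields an empty string.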