»
Deletion: Removing a letter from a word
»
Insertion: Inserting a letter into a word and obtaining another word
»
Substitution: Replacing one letter with another, such as changing the p letter
into an f letter and obtaining fan from pan
Each edit has a cost, which Levenshtein defines as 1 for each transformation.
However, depending on how you apply the algorithm, you could set the cost dif-
ferently for deletion, insertion, and substitution. For example, when searching for
similar street names, misspellings are more common than outright differences in
lettering, so substitution might incur only a cost of 1, and deletion or insertion
might incur a cost of 2. On the other hand, when looking for monetary amounts,
similar values quite possibly will have different numbers of numbers. Someone
could enter $123 or $123.00 into the database. The numbers are the same, but the
number of numbers is different, so insertion and deletion might cost less than
substitution (a value of $124 is not quite the same as a value of $123, so substitut-
ing 3 for 4 should cost more).
318
PART 5
Do'stlaringiz bilan baham: |