# Levenshtein Distance

Given a misspelled string s, a spell checker should return words in the dictionary which are close to s.

In 1965, Vladimir Levenshtein defined the distance between 2 words as the minimum number of “edits” it would take to transform the misspelled word into a correct word, where a single edit is the insertion, deletion, or substitution of a single character.

Example: Singapore
1. Insertion ‘a’ : Singapoare

2. Deletion (or missing) ‘r’: Singapo_e

3. Substitution ‘e’ to ‘a’: Singapora

Given 2 strings (represented as arrays of characters) A and B.
Length of A = a
Length of B = b
Distance =D(A, B) = The minimum number of edits to transform A to B.
$\boxed{ D(A, B) = 1 + \min \begin{cases} D(A[0: a-2] , B[0: b-2]) \\ D(A[0: a-2], B)\\ D(A , B[0: b-2])\\ \end{cases} }$

Note 1: Below is the C++ code for computing D(A,B):

Note 2:
A = Orchestra
B = Carthorse
D(A,B) = 8
Longest: ‘r, h, s’

Remark: a common misspell error ‘Transposition’ could also be considered as an edit:

Signapore -> Singapore

This is the “better” Damerau-Levenshtein Distance:

http://en.m.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

[References]:

http://en.m.wikipedia.org/wiki/Levenshtein_distance

Elements of Programming Interviews: 300 Questions and Solutions

elementsofprogramminginterviews.com

NLB Call No: 005.1 AZI