Edit Distance (Levenshtein) (C)
Topics: dynamic programming, strings
Problem
Compute the Levenshtein distance between two strings: the minimum number of
single-character insertions, deletions, and substitutions to transform
one into the other.
int edit_distance(const char *a, size_t alen, const char *b, size_t blen);
For example, kitten to sitting is distance 3 (substitute k→s, substitute e→i,
insert g).
The dynamic program
Build a grid where d[i][j] is the distance between the first i characters of a
and the first j of b. The recurrence is the whole idea:
d[i][j] = min(
d[i-1][j] + 1, // delete a[i-1]
d[i][j-1] + 1, // insert b[j-1]
d[i-1][j-1] + (a[i-1] != b[j-1]) // substitute (free if the chars match)
)
with the base cases d[i][0] = i and d[0][j] = j (turning a prefix into the empty
string costs one deletion per character).
Key concepts
- Each cell depends only on three neighbors: the one above, the one to the left,
and the diagonal. That local structure is what makes it a clean DP.
- Two rows are enough. Because a cell only needs the previous row and the current
row so far, you do not store the whole grid, just a rolling pair of rows. Use the
shorter string for the columns to keep that row small.
- The same edit-distance DP underlies spell-checkers,
diff, DNA sequence alignment,
and fuzzy search ranking.
Run
cc -std=c11 -O2 -Wall tests.c -o /tmp/edit && /tmp/edit
Sign in to submit your solution.