Minimum Edit Distance/README.markdown
Lines changed: 17 additions & 18 deletions
Original file line number
Diff line number
Diff line change
@@ -6,7 +6,7 @@ The minimum edit distance is a possibility to measure the similarity of two stri
6
6
7
7
A common distance measure is given by the *Levenshtein distance*, which allows the following three transformation operations:
8
8
9
-
* **Inseration** (*ε→x*) of a single symbol *x* with **cost 1**,
9
+
* **Insertion** (*ε→x*) of a single symbol *x* with **cost 1**,
10
10
* **Deletion** (*x→ε*) of a single symbol *x* with **cost 1**, and
11
11
* **Substitution** (*x→y*) of two single symbols *x, y* with **cost 1** if *x≠y* and with **cost 0** otherwise.
12
12
@@ -15,37 +15,38 @@ When transforming a string by a sequence of operations, the costs of the single
15
15
To avoid exponential time complexity, the minimum edit distance of two strings in the usual is computed using *dynamic programming*. For this in a matrix
16
16
17
17
```swift
18
-
var matrix = [[Int]](count: m+1, repeatedValue: [Int](count: n+1, repeatedValue: 0))
18
+
var matrix = [[Int]](repeating: [Int](repeating: 0, count: n+1), count: m +1)
19
19
```
20
20
21
21
already computed minimal edit distances of prefixes of *w* and *u* (of length *m* and *n*, respectively) are used to fill the matrix. In a first step the matrix is initialized by filling the first row and the first column as follows:
22
22
23
23
```swift
24
24
// initialize matrix
25
25
for index in 1...m {
26
-
// the distance of any prefix of the first string to an empty second string
27
-
matrix[index][0]=index
26
+
// the distance of any first string to an empty second string
27
+
matrix[index][0]=index
28
28
}
29
+
29
30
for index in 1...n {
30
-
// the distance of any prefix of the second string to an empty first string
31
-
matrix[0][index]=index
31
+
// the distance of any second string to an empty first string
32
+
matrix[0][index]=index
32
33
}
33
34
```
35
+
34
36
Then in each cell the minimum of the cost of insertion, deletion, or substitution added to the already computed costs in the corresponding cells is chosen. In this way the matrix is filled iteratively:
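
For reviewers who want to check the recurrence described here in isolation, the following is a minimal, self-contained sketch in current Swift. It is not the code from this diff: the free function name `levenshtein` and the local names `a`, `b`, `substitution`, `deletion`, and `insertion` are assumptions made for illustration, and it iterates over `Array(w)` rather than `self.characters`.

```swift
// Hypothetical standalone helper (not part of the README under review).
// Computes the Levenshtein distance between w and u with unit costs.
func levenshtein(_ w: String, _ u: String) -> Int {
    let a = Array(w)
    let b = Array(u)
    let m = a.count
    let n = b.count

    // matrix[i][j] holds the distance between the first i symbols of w
    // and the first j symbols of u.
    var matrix = [[Int]](repeating: [Int](repeating: 0, count: n + 1), count: m + 1)

    // Distance of each prefix to the empty string: i deletions or j insertions.
    for i in 0...m { matrix[i][0] = i }
    for j in 0...n { matrix[0][j] = j }

    for (i, x) in a.enumerated() {
        for (j, y) in b.enumerated() {
            // Substitution costs 0 for equal symbols and 1 otherwise.
            let substitution = matrix[i][j] + (x == y ? 0 : 1)
            let deletion = matrix[i][j + 1] + 1
            let insertion = matrix[i + 1][j] + 1
            matrix[i + 1][j + 1] = min(substitution, deletion, insertion)
        }
    }
    return matrix[m][n]
}

print(levenshtein("kitten", "sitting")) // prints 3
```

Starting the initialization loops at 0 and using `enumerated()` for the main loops keeps the sketch from trapping on empty input, where a range such as `1...m` with `m == 0` would be invalid.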
35
37
36
38
```swift
37
39
// compute Levenshtein distance
38
-
for (i, selfChar) in self.characters.enumerate() {
39
-
for (j, otherChar) in other.characters.enumerate() {
40
+
for (i, selfChar) in self.characters.enumerated() {
41
+
for (j, otherChar) in other.characters.enumerated() {
40
42
if otherChar == selfChar {
41
43
// substitution of equal symbols with cost 0
42
-
matrix[i+1][j+1] = matrix[i][j]
44
+
matrix[i+1][j+1] = matrix[i][j]
43
45
} else {
44
-
// minimum of the cost of insertion, deletion, or substitution added
45
-
// to the already computed costs in the corresponing cells