``````"""
LCS Problem Statement: Given two sequences, find the length of longest subsequence
present in both of them.  A subsequence is a sequence that appears in the same relative
order, but not necessarily continuous.
Example:"abc", "abg" are subsequences of "abcdefgh".
"""

def longest_common_subsequence(x: str, y: str):
"""
Finds the longest common subsequence between two strings. Also returns the
The subsequence found

Parameters
----------

x: str, one of the strings
y: str, the other string

Returns
-------
L[m][n]: int, the length of the longest subsequence. Also equal to len(seq)
Seq: str, the subsequence found

>>> longest_common_subsequence("programming", "gaming")
(6, 'gaming')
>>> longest_common_subsequence("physics", "smartphone")
(2, 'ph')
>>> longest_common_subsequence("computer", "food")
(1, 'o')
"""
# find the length of strings

assert x is not None
assert y is not None

m = len(x)
n = len(y)

# declaring the array for storing the dp values
L = [ * (n + 1) for _ in range(m + 1)]

for i in range(1, m + 1):
for j in range(1, n + 1):
if x[i - 1] == y[j - 1]:
match = 1
else:
match = 0

L[i][j] = max(L[i - 1][j], L[i][j - 1], L[i - 1][j - 1] + match)

seq = ""
i, j = m, n
while i > 0 and j > 0:
if x[i - 1] == y[j - 1]:
match = 1
else:
match = 0

if L[i][j] == L[i - 1][j - 1] + match:
if match == 1:
seq = x[i - 1] + seq
i -= 1
j -= 1
elif L[i][j] == L[i - 1][j]:
i -= 1
else:
j -= 1

return L[m][n], seq

if __name__ == "__main__":
a = "AGGTAB"
b = "GXTXAYB"
expected_ln = 4
expected_subseq = "GTAB"

ln, subseq = longest_common_subsequence(a, b)
print("len =", ln, ", sub-sequence =", subseq)
import doctest

doctest.testmod()
``````

#### Problem Statement

Given two strings `S` and `T`, find the length of the longest common subsequence (LCS).

#### Approach

Let the `dp[i][j]` be the length of the longest common subsequence of prefixes `S[1..i]` and `T[1..j]`. Our answer (the length of LCS) is `dp[|S|][|T|]` since the prefix of the length of string is the string itself.

Both `dp[i]` and `dp[i]` are `0` for any `i` since the LCS of empty prefix and anything else is an empty string.

Now let's try to calculate `dp[i][j]` for any `i`, `j`. Let's say `S[1..i] = *A` and `T[1..j] = *B` where `*` stands for any sequence of letters (could be different for `S` and `T`), `A` stands for any letter and `B` stands for any letter different from `A`. Since `A != B`, our LCS doesn't include `A` or `B` as a last character. So we could try to throw away `A` or `B` character. If we throw `A`, our LCS length will be `dp[i - 1][j]` (since we have prefixes `S[1..i - 1]` and `T[1..j]`). If we try to throw `B` character, we will have prefixes `S[1..i]` and `T[1..j - 1]` so the length of LCS will be `dp[i][j - 1]`. As we are looking for the Longest common subsequence, we will pick the maximum value from `dp[i][j - 1]` and `dp[i - 1][j]`.

But what if `S[1..i] = *A` and `T[1..j] = *A`? We could say that the LCS of our prefixes is LCS of prefixes `S[1..i - 1]` and `T[1..j - 1]` plus the letter `A`. So `dp[i][j] = dp[i - 1][j - 1] + 1` if `S[i] = T[j]`.

We could see that we can fill our `dp` table row by row, column by column. So our algorithm will be like:

• Let's say that we have strings `S` of the length N and `T` of the length M (numbered from 1). Let's create the table `dp` of size `(N + 1) x (M + 1)` numbered from 0.
• Let's fill the 0th row and the 0th column of `dp` with 0.
• Then, we follow the algorithm:
``````for i in range(1..N):
for j in range(1..M):
if(S[i] == T[j])
dp[i][j] = dp[i - 1][j - 1] + 1
else
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
``````

#### Time Complexity

`O(N * M)` In any case

#### Space Complexity

`O(N * M)` - simple implementation `O(min {N, M})` - two-layers implementation (as `dp[i][j]` depends on only i-th and i-th layers, we coudld store only two layers).

#### Example

Let's say we have strings `ABCB` and `BBCB`. We will build such a table:

``````# # A B C B
# 0 0 0 0 0
B 0 ? ? ? ?
B 0 ? ? ? ?
C 0 ? ? ? ?
B 0 ? ? ? ?
``````

Now we will start to fill our table from 1st row. Since `S = A` and `T = B`, the `dp` will be the maximal value of `dp = 0` and `dp = 0`. So `dp = 0`. But now `S = B = T`, so `dp = dp + 1 = 1`. `dp` is `1` since `A != C` and we pick `max{dp, dp}`. And `dp = dp + 1 = 1`.

``````# # A B C B
# 0 0 0 0 0
B 0 0 1 1 1
B 0 ? ? ? ?
C 0 ? ? ? ?
B 0 ? ? ? ?
``````

Now let's fill the other part of the table:

``````# # A B C B
# 0 0 0 0 0
B 0 0 1 1 1
B 0 0 1 1 2
C 0 0 1 2 2
B 0 0 1 2 3
``````

So the length of LCS is `dp = 3`.  