Commit 5089669

Start of CYK parser. The grammar still needs to be updated accordingly.
1 parent 8bcc7b0 commit 5089669

1 file changed: +24 -0 lines changed
nlp.py

Lines changed: 24 additions & 0 deletions
@@ -182,3 +182,27 @@ def extender(self, edge):
         for (i, j, A, alpha, B1b) in self.chart[j]:
             if B1b and B == B1b[0]:
                 self.add_edge([i, k, A, alpha + [edge], B1b[1:]])
+
+
+# ______________________________________________________________________________
+# CYK Parsing
+
+def CYK_parse(words, grammar):
+    "[Figure 23.5]"
+    # We use 0-based indexing instead of the book's 1-based.
+    N = len(words)
+    P = defaultdict(float)
+    # Insert lexical rules for each word.
+    for (i, word) in enumerate(words):
+        for (X, p) in grammar.categories[word]: # XXX grammar.categories needs changing, above
+            P[X, i, 1] = p
+    # Combine first and second parts of right-hand sides of rules,
+    # from short to long.
+    for length in range(2, N+1):
+        for start in range(N-length+1):
+            for len1 in range(1, length): # N.B. the book incorrectly has N instead of length
+                len2 = length - len1
+                for (X, Y, Z, p) in grammar.cnf_rules(): # XXX grammar needs this method
+                    P[X, start, length] = max(P[X, start, length],
+                                              P[Y, start, len1] * P[Z, start+len1, len2] * p)
+    return P
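
The commit notes that the grammar does not yet provide what `CYK_parse` needs. As a rough sketch of how the function could be exercised in the meantime, the example below pairs it with a hypothetical `ToyGrammar` class that is not part of `nlp.py`. It assumes `categories` maps each word to a list of `(category, probability)` pairs and that `cnf_rules()` returns `(X, Y, Z, p)` tuples for rules of the form X -> Y Z, which is what the two `XXX` comments imply. The lexicon, rules, and probabilities are invented for illustration.

```python
from collections import defaultdict


class ToyGrammar:
    """Hypothetical stand-in for the interface CYK_parse expects (an assumption)."""

    def __init__(self, lexicon, rules):
        # lexicon: word -> list of (category, probability) pairs.
        # Unknown words fall back to an empty list.
        self.categories = defaultdict(list, lexicon)
        # rules: list of (X, Y, Z, probability), each meaning X -> Y Z.
        self._rules = list(rules)

    def cnf_rules(self):
        return self._rules


def CYK_parse(words, grammar):
    # Same logic as the committed function, repeated so the sketch runs alone.
    N = len(words)
    P = defaultdict(float)  # P[X, start, length] -> best probability, 0.0 if absent
    for i, word in enumerate(words):
        for X, p in grammar.categories[word]:
            P[X, i, 1] = p
    for length in range(2, N + 1):
        for start in range(N - length + 1):
            for len1 in range(1, length):
                len2 = length - len1
                for X, Y, Z, p in grammar.cnf_rules():
                    P[X, start, length] = max(
                        P[X, start, length],
                        P[Y, start, len1] * P[Z, start + len1, len2] * p)
    return P


# Made-up toy grammar, for illustration only.
grammar = ToyGrammar(
    lexicon={
        'the': [('Det', 1.0)],
        'dog': [('Noun', 0.5)],
        'barks': [('Verb', 1.0)],
    },
    rules=[
        ('NP', 'Det', 'Noun', 1.0),
        ('S', 'NP', 'Verb', 1.0),
    ],
)

P = CYK_parse(['the', 'dog', 'barks'], grammar)
print(P['S', 0, 3])  # 0.5
```

Keys are `(category, start, length)` with a 0-based `start`, so `P['S', 0, 3]` is the probability of the best parse of all three words as a sentence. Because `P` is a `defaultdict(float)`, any span that cannot be derived reads as 0.0.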
