Commit 4f109c1

Added a stop-list to reduce the size of the full text search index. Fred,
populate the "stop_list" triple-quoted string with your favorite handful of stop words.
1 parent e6b63e6 commit 4f109c1

1 file changed

Lines changed: 29 additions & 3 deletions

Doc/tools/prechm.py

@@ -1,4 +1,4 @@
-'''
+"""
 Makes the necesary files to convert from plain html of
 Python 1.5 and 1.5.x Documentation to
 Microsoft HTML Help format version 1.1
@@ -13,7 +13,7 @@
 project, 19-Apr-2002 by Tim Peters. Assorted modifications by Tim
 and Fred Drake. Obtained from Robin Dunn's .chm packaging of the
 Python 2.2 docs, at <http://alldunn.com/python/>.
-'''
+"""
 
 import sys
 import os
@@ -38,12 +38,12 @@
 # user-visible features (visible buttons, tabs, etc).
 project_template = '''
 [OPTIONS]
-Compatibility=1.1
 Compiled file=%(arch)s.chm
 Contents file=%(arch)s.hhc
 Default Window=%(arch)s
 Default topic=index.html
 Display compile progress=No
+Full text search stop list file=%(arch)s.stp
 Full-text search=Yes
 Index file=%(arch)s.hhk
 Language=0x409
@@ -80,6 +80,23 @@
 </OBJECT>
 '''
 
+
+# List of words the full text search facility shouldn't index. This
+# becomes file ARCH.stp. Note that this list must be pretty small!
+# Different versions of the MS docs claim the file has a maximum size of
+# 256 or 512 bytes (including \r\n at the end of each line).
+# Note that "and", "or", "not" and "near" are operators in the search
+# language, so not point indexing them even if wanted to.
+stop_list = '''
+a an and
+is
+near
+not
+of
+or
+the
+'''
+
 # Library Doc list of tuples:
 # each 'book' : ( Dir, Title, First page, Content page, Index page)
 #
@@ -335,6 +352,15 @@ def do_it(args = None) :
     library = supported_libraries[ version ]
 
     if not (('-p','') in optlist) :
+        fname = arch + '.stp'
+        f = openfile(fname)
+        print "Building stoplist", fname, "..."
+        words = stop_list.split()
+        words.sort()
+        for word in words:
+            print >> f, word
+        f.close()
+
         f = openfile(arch + '.hhp')
         print "Building Project..."
         do_project(library, f, arch, version)
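
The last hunk splits `stop_list` on whitespace, sorts the words, and writes one word per line to ARCH.stp. Since the comment in the diff warns that the file may be limited to 256 or 512 bytes, a size check before writing might be useful. The following is a minimal Python 3 sketch of that idea, not part of the commit: `build_stp_bytes`, `check_stp_size`, and `MAX_STP_BYTES` are hypothetical names, the 256-byte limit is simply the stricter of the two figures quoted in the comment, and the explicit `\r\n` line endings are an assumption based on that same comment.

```python
# Python 3 sketch; the commit itself is Python 2 and uses its own openfile() helper.
STOP_LIST = """
a an and
is
near
not
of
or
the
"""

# Assumption: use the stricter of the two limits mentioned in the diff comment.
MAX_STP_BYTES = 256


def build_stp_bytes(stop_list: str) -> bytes:
    """Return the .stp body: sorted words, one per line, each ending in CRLF."""
    words = sorted(stop_list.split())
    return "".join(word + "\r\n" for word in words).encode("ascii")


def check_stp_size(body: bytes, limit: int = MAX_STP_BYTES) -> None:
    """Raise ValueError if the stop list body is larger than the assumed limit."""
    if len(body) > limit:
        raise ValueError(f"stop list is {len(body)} bytes; limit is {limit}")


body = build_stp_bytes(STOP_LIST)
check_stp_size(body)
print(len(body), "bytes")
```

With the nine words from the diff, the body comes to 40 bytes, so there is room for more stop words under either limit.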

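The new `Full text search stop list file=%(arch)s.stp` line relies on the same `%(arch)s` mapping substitution as the rest of `project_template`. As a rough illustration only, the Python 3 sketch below fills in a two-line excerpt of the `[OPTIONS]` section; `TEMPLATE_EXCERPT`, `render_options`, and the archive name `pydoc` are hypothetical and do not appear in the commit.

```python
# Hypothetical two-line excerpt of the [OPTIONS] section shown in the diff.
TEMPLATE_EXCERPT = """\
Full text search stop list file=%(arch)s.stp
Full-text search=Yes
"""


def render_options(arch: str) -> str:
    """Substitute the archive base name into the template excerpt."""
    return TEMPLATE_EXCERPT % {"arch": arch}


# "pydoc" is a made-up archive name used only for this example.
print(render_options("pydoc"), end="")
```

The stop list file name therefore follows the archive name, matching the `arch + '.stp'` expression used when the file is written.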