Commit 7627f64

Doc: improve documentation about ts_headline() function.
Now that I've had my nose in that code, I thought the docs about it left something to be desired.
1 parent 91be1d1 commit 7627f64

1 file changed: +57 -47 lines changed

doc/src/sgml/textsearch.sgml

Lines changed: 57 additions & 47 deletions
@@ -1301,64 +1301,75 @@ ts_headline(<optional> <replaceable class="parameter">config</replaceable> <type
 <itemizedlist spacing="compact" mark="bullet">
 <listitem>
 <para>
-<literal>StartSel</literal>, <literal>StopSel</literal>: the strings with
-which to delimit query words appearing in the document, to distinguish
-them from other excerpted words. You must double-quote these strings
-if they contain spaces or commas.
+<literal>MaxWords</literal>, <literal>MinWords</literal> (integers):
+these numbers determine the longest and shortest headlines to output.
+The default values are 35 and 15.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxWords</literal>, <literal>MinWords</literal>: these numbers
-determine the longest and shortest headlines to output.
+<literal>ShortWord</literal> (integer): words of this length or less
+will be dropped at the start and end of a headline, unless they are
+query terms. The default value of three eliminates common English
+articles.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>ShortWord</literal>: words of this length or less will be
-dropped at the start and end of a headline. The default
-value of three eliminates common English articles.
+<literal>HighlightAll</literal> (boolean): if
+<literal>true</literal> the whole document will be used as the
+headline, ignoring the preceding three parameters. The default
+is <literal>false</literal>.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>HighlightAll</literal>: Boolean flag; if
-<literal>true</literal> the whole document will be used as the
-headline, ignoring the preceding three parameters.
+<literal>MaxFragments</literal> (integer): maximum number of text
+fragments to display. The default value of zero selects a
+non-fragment-based headline generation method. A value greater
+than zero selects fragment-based headline generation (see below).
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxFragments</literal>: maximum number of text excerpts
-or fragments to display. The default value of zero selects a
-non-fragment-oriented headline generation method. A value greater than
-zero selects fragment-based headline generation. This method
-finds text fragments with as many query words as possible and
-stretches those fragments around the query words. As a result
-query words are close to the middle of each fragment and have words on
-each side. Each fragment will be of at most <literal>MaxWords</literal> and
-words of length <literal>ShortWord</literal> or less are dropped at the start
-and end of each fragment. If not all query words are found in the
-document, then a single fragment of the first <literal>MinWords</literal>
-in the document will be displayed.
+<literal>StartSel</literal>, <literal>StopSel</literal> (strings):
+the strings with which to delimit query words appearing in the
+document, to distinguish them from other excerpted words. The
+default values are <quote><literal>&lt;b&gt;</literal></quote> and
+<quote><literal>&lt;/b&gt;</literal></quote>, which can be suitable
+for HTML output.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>FragmentDelimiter</literal>: When more than one fragment is
-displayed, the fragments will be separated by this string.
+<literal>FragmentDelimiter</literal> (string): When more than one
+fragment is displayed, the fragments will be separated by this string.
+The default is <quote><literal> ... </literal></quote>.
 </para>
 </listitem>
 </itemizedlist>
 
 These option names are recognized case-insensitively.
-Any unspecified options receive these defaults:
+You must double-quote string values if they contain spaces or commas.
+</para>
 
-<programlisting>
-StartSel=&lt;b&gt;, StopSel=&lt;/b&gt;,
-MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE,
-MaxFragments=0, FragmentDelimiter=" ... "
-</programlisting>
+<para>
+In non-fragment-based headline
+generation, <function>ts_headline</function> locates matches for the
+given <replaceable class="parameter">query</replaceable> and chooses a
+single one to display, preferring matches that have more query words
+within the allowed headline length.
+In fragment-based headline generation, <function>ts_headline</function>
+locates the query matches and splits each match
+into <quote>fragments</quote> of no more than <literal>MaxWords</literal>
+words each, preferring fragments with more query words, and when
+possible <quote>stretching</quote> fragments to include surrounding
+words. The fragment-based mode is thus more useful when the query
+matches span large sections of the document, or when it's desirable to
+display multiple matches.
+In either mode, if no query matches can be identified, then a single
+fragment of the first <literal>MinWords</literal> words in the document
+will be displayed.
 </para>
 
 <para>
@@ -1370,25 +1381,24 @@ SELECT ts_headline('english',
 is to find all documents containing given query terms
 and return them in order of their similarity to the
 query.',
-to_tsquery('query &amp; similarity'));
-ts_headline
+to_tsquery('english', 'query &amp; similarity'));
+ts_headline
 ------------------------------------------------------------
-containing given &lt;b&gt;query&lt;/b&gt; terms
-and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the
+containing given &lt;b&gt;query&lt;/b&gt; terms +
+and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the+
 &lt;b&gt;query&lt;/b&gt;.
 
 SELECT ts_headline('english',
-'The most common type of search
-is to find all documents containing given query terms
-and return them in order of their similarity to the
-query.',
-to_tsquery('query &amp; similarity'),
-'StartSel = &lt;, StopSel = &gt;');
-ts_headline
--------------------------------------------------------
-containing given &lt;query&gt; terms
-and return them in order of their &lt;similarity&gt; to the
-&lt;query&gt;.
+'Search terms may occur
+many times in a document,
+requiring ranking of the search matches to decide which
+occurrences to display in the result.',
+to_tsquery('english', 'search &amp; term'),
+'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=&lt;&lt;, StopSel=&gt;&gt;');
+ts_headline
+------------------------------------------------------------
+&lt;&lt;Search&gt;&gt; &lt;&lt;terms&gt;&gt; may occur +
+many times ... ranking of the &lt;&lt;search&gt;&gt; matches to decide
 </screen>
 </para>
 
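The revised text says that string option values must be double-quoted when they contain spaces or commas, but neither example in the diff exercises that rule. The query below is a minimal sketch of what such a call might look like. It is not part of commit 7627f64. The delimiter strings "[[ ", " ]]" and " | " are illustrative assumptions chosen only because they contain spaces, and the document text and query are borrowed from the second example in the diff. The SGML source writes the ampersand as &amp;amp;; plain SQL uses a bare &.

```sql
-- Illustrative only; not taken from the commit.
-- StartSel, StopSel and FragmentDelimiter contain spaces, so each value
-- is wrapped in double quotes inside the single-quoted options string.
SELECT ts_headline('english',
  'Search terms may occur
many times in a document,
requiring ranking of the search matches to decide which
occurrences to display in the result.',
  to_tsquery('english', 'search & term'),
  'MaxFragments=2, MaxWords=7, MinWords=3, StartSel="[[ ", StopSel=" ]]", FragmentDelimiter=" | "');
```

Because MaxFragments is greater than zero, this call would use the fragment-based mode described in the diff. No expected output is shown here, since the exact fragments chosen depend on the server's headline selection.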