Thanks to visit codestin.com
Credit goes to Github.com

Skip to content

Randomly sampling vaguely grammatical nonsense since 2009

Notifications You must be signed in to change notification settings

samer--/babbler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

babbler

Randomly sampling vaguely grammatical nonsense since 2009

This repository contains some experiments in sampling English text with stylistic structure (but very little semantic sense). It is built on a fairly complex fragment of English grammar encoded in a 3 different ways:

* `grammar.pl` -- A fairly straightforward representation as a DCG
* `grammar_attr.pl` -- An attribute or feature grammar using graph unification
* `grammar_gap.pl` -- Using the gap threading version of the sampling meta-interpreter from sampledcg

The latter two are untested since several years ago and probably don't work. The graph unification algorithm is based on code by Bob Carpenter.

It requires the Prolog lexical databases which are available at https://code.soundsoftware.ac.uk/hg/plex

Also requires the following SWI Prolog packs:

plrand
genutils
textutils
dcgutils
sampledcg (available at https://github.com/samer--/sampledcg)

Some sample outputs can be found in the examples directory.

The top level file poetry.pl can be set to use either biglex or littlelex (from the plex repository referenced above). The former uses a big lexicon derived mainly from the Oxford Advanced Learners Dictionary, and therefore has lots of great words in it, but can result in a somewhat shotgun approach to vocabulary selection. The smaller lexicon was handwritten.

Sample session:

$ swipl -s poetry.pl
?- main(100,s).
>> resample_probs(dirichlet(2)).
>> render(fmt,rep(5,line(s))).

This will (1) generate and throw away 100 random sentences, (2) enter a stateful shell that keeps track of probability distributions over choices in the grammar, initialised with the choices made while sampling the initial 100 sentences, (3) sample new probability distributions for all known distributions from a Dirichlet process with concentration parameter 3 and (4) sample 5 lines from the resulting grammar. I take no responsibility for any of the nonsense and occasional filth that might emerge from this.

The 'renderer' fmt writes the output to the terminal. If you are using a Mac, you can get the machine to recite the output as well by using fmt && async(say(Voice)) where Voice is one of the voice names. You can log the output to a file with fmt && append(Filename). See render/2 in the textutils pack for more information.

About

Randomly sampling vaguely grammatical nonsense since 2009

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages