cfg

Context-free grammar tools.

Rust library for manipulating context-free grammars. You can check the documentation here.

Motivation

The purpose of cfg is to provide a reusable, well-tested foundation for working with context-free grammars. It simplifies the development of higher-level tools such as Earley, LL(1), or LR(1) parsers, as well as other tools that rely on grammars.

We believe developers often need a common foundation to avoid reinventing the wheel.

Our current focus is supporting gearley, an Earley parser built on top of this library.

Usage

Add this to your Cargo.toml:

[dependencies]
cfg = "0.10"

If you want grammar serialization support with miniserde, include the feature like this:

[dependencies]
cfg = { version = "0.10", features = ["serialize"] }

If you want weighted generation support, include the weighted-generation feature.

If you want LL(1) classification support, include the ll feature.

List of features

The following tools are implemented thus far:

- rich rule building
  - sequence rules,
  - precedenced rules.
- conversions to a shape similar to Chomsky Normal Form
  - grammar binarization,
  - nulling rule elimination for binarized grammars.
- sanity
  - cycle detection and elimination,
  - useless rule detection and elimination,
  - unused symbol removal.
- analysis for LR(1), LL(1) and others
  - FIRST and FOLLOW set computation,
  - minimal distance computation,
  - LL(1) classification.
- tools for probabilistic grammars
  - generation for PCFGs + negative zero-width lookahead.
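
To give a feel for what grammar binarization means, here is a minimal, standard-library-only sketch. It does not use cfg and is not the crate's implementation: the `Rule` struct, the `binarize` function, and the use of plain `u32` values as symbols are all assumptions made for illustration, and the crate may split rules in a different direction.

```rust
/// A grammar rule over integer symbol IDs (hypothetical type, not cfg's).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Rule {
    pub lhs: u32,
    pub rhs: Vec<u32>,
}

/// Splits every rule with more than two RHS symbols into a chain of rules
/// with at most two RHS symbols each. Fresh symbols are numbered from
/// `*next_symbol` upward, so the caller must pass an unused ID.
pub fn binarize(rules: &[Rule], next_symbol: &mut u32) -> Vec<Rule> {
    let mut out = Vec::new();
    for rule in rules {
        let mut lhs = rule.lhs;
        let mut rest: &[u32] = &rule.rhs;
        while rest.len() > 2 {
            // A ::= x1 x2 ... xn  becomes  A ::= x1 G  and  G ::= x2 ... xn
            let fresh = *next_symbol;
            *next_symbol += 1;
            out.push(Rule { lhs, rhs: vec![rest[0], fresh] });
            lhs = fresh;
            rest = &rest[1..];
        }
        // Rules with zero, one, or two RHS symbols are kept as they are.
        out.push(Rule { lhs, rhs: rest.to_vec() });
    }
    out
}
```

Calling `binarize` on a single rule with four RHS symbols yields three rules with two RHS symbols each and consumes two fresh symbol IDs.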

Building grammars

cfg includes an interface that simplifies grammar construction.

Generating symbols

The easiest way of generating symbols is with the sym method.

let mut grammar = Cfg::new();
let [start, expr, identifier, number,
     plus, multiply, power, l_paren, r_paren, digit] = grammar.sym();

You may set one or more start symbols we call roots.

grammar.set_roots([start]);
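
A single `sym` call can hand back exactly as many symbols as the array pattern asks for. The sketch below shows one way such an interface can be built with const generics. It is a simplified model, not cfg's code: `SymbolSource` and this `Symbol` newtype are hypothetical names, and the crate's real symbol representation may differ.

```rust
/// In this sketch a symbol is a small integer ID (an assumption; the
/// crate's own symbol type may be represented differently).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Symbol(pub u32);

/// Hypothetical stand-in for the symbol-allocating part of a grammar.
#[derive(Default)]
pub struct SymbolSource {
    next_id: u32,
}

impl SymbolSource {
    pub fn new() -> Self {
        Self::default()
    }

    /// Allocates `N` fresh symbols. `N` is inferred from the array
    /// pattern on the left-hand side of the `let` binding.
    pub fn sym<const N: usize>(&mut self) -> [Symbol; N] {
        std::array::from_fn(|_| {
            let symbol = Symbol(self.next_id);
            self.next_id += 1;
            symbol
        })
    }
}
```

With this model, `let [start, expr, digit] = source.sym();` binds three distinct symbols, and a later `let [plus] = source.sym();` continues numbering where the first call stopped.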

Building grammar rules

Rules have a LHS symbol and zero or more RHS symbols.

Example BNF:

start ::= expr | identifier l_paren expr r_paren

With our library:

grammar.rule(start).rhs([expr])
                   .rhs([identifier, l_paren, expr, r_paren]);

Building sequence rules

Sequence rules have a LHS symbol, a RHS symbol, a range of repetitions, and optional separation. Aside from separation, they closely resemble regular expression repetitions.

Example BNF:

number ::= digit+

With our library:

use cfg_sequence::CfgSequenceExt;
grammar.sequence(number).inclusive(1, None).rhs(digit);
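
A sequence rule can be expressed with plain rules, and the following standard-library-only sketch shows one possible rewrite for sequences without an upper bound. It is not taken from cfg: `Rule`, `rewrite_sequence`, and the `u32` symbol IDs are assumptions for illustration, and bounded ranges are left out to keep it short.

```rust
/// A grammar rule over integer symbol IDs (hypothetical type, not cfg's).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Rule {
    pub lhs: u32,
    pub rhs: Vec<u32>,
}

/// Rewrites `lhs ::= item{min,}` with an optional separator into plain
/// rules. Fresh symbols, if needed, are numbered from `*next_symbol`.
pub fn rewrite_sequence(
    lhs: u32,
    item: u32,
    min: usize,
    separator: Option<u32>,
    next_symbol: &mut u32,
) -> Vec<Rule> {
    if min == 0 {
        // Zero repetitions: either nothing at all, or a nonempty sequence
        // described by a fresh symbol.
        let nonempty = *next_symbol;
        *next_symbol += 1;
        let mut rules = vec![
            Rule { lhs, rhs: vec![] },
            Rule { lhs, rhs: vec![nonempty] },
        ];
        rules.extend(rewrite_sequence(nonempty, item, 1, separator, next_symbol));
        return rules;
    }
    // Base case: exactly `min` items, with separators between them.
    let mut base = vec![item];
    for _ in 1..min {
        base.extend(separator);
        base.push(item);
    }
    // Recursive case: a shorter sequence followed by one more item.
    let mut step = vec![lhs];
    step.extend(separator);
    step.push(item);
    vec![Rule { lhs, rhs: base }, Rule { lhs, rhs: step }]
}
```

For `number ::= digit+` this produces the two rules `number ::= digit` and `number ::= number digit`, which is the usual left-recursive encoding of one-or-more repetition.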

Building precedenced rules

Precedenced rules are the most convenient way to describe operators. Once built, they are immediately rewritten into basic grammar rules, and unique symbols are generated. Operator associativity can be set to Right or Group. It's Left by default.

use cfg::precedence::Associativity::{Right, Group};

grammar.precedenced_rule(expr)
           .rhs([number])
           .rhs([identifier])
           .associativity(Group)
           .rhs([l_paren, expr, r_paren])
       .lower_precedence()
           .associativity(Right)
           .rhs([expr, power, expr])
       .lower_precedence()
           .rhs([expr, multiply, expr])
       .lower_precedence()
           .rhs([expr, plus, expr]);
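
The rewrite into basic rules can be pictured with the textbook scheme of one fresh symbol per precedence level. The sketch below follows that scheme using only the standard library. It is an illustration under stated assumptions, not cfg's algorithm: `Rule`, `Alternative`, `rewrite_precedenced`, and this local `Associativity` enum are hypothetical, and the handling of recursive operands at the tightest level is a simplification.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Associativity {
    Left,
    Right,
    Group,
}

/// A grammar rule over integer symbol IDs (hypothetical type, not cfg's).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Rule {
    pub lhs: u32,
    pub rhs: Vec<u32>,
}

/// One alternative of a precedenced rule, before rewriting.
pub struct Alternative {
    pub assoc: Associativity,
    pub rhs: Vec<u32>,
}

/// Rewrites a precedenced rule into basic rules. `levels[0]` is the
/// tightest-binding level; each later entry binds more loosely.
pub fn rewrite_precedenced(
    lhs: u32,
    levels: &[Vec<Alternative>],
    next_symbol: &mut u32,
) -> Vec<Rule> {
    // Allocate one fresh symbol per precedence level.
    let level_syms: Vec<u32> = levels
        .iter()
        .map(|_| {
            let fresh = *next_symbol;
            *next_symbol += 1;
            fresh
        })
        .collect();
    let mut rules = Vec::new();
    for (k, alternatives) in levels.iter().enumerate() {
        let current = level_syms[k];
        // Assumption: below level 0 there is no tighter level, so level 0
        // refers back to itself.
        let tighter = if k == 0 { current } else { level_syms[k - 1] };
        if k > 0 {
            // Anything at a tighter level is also valid at this level.
            rules.push(Rule { lhs: current, rhs: vec![tighter] });
        }
        for alt in alternatives {
            // The operand that may recurse at the same level.
            let same_level = match alt.assoc {
                Associativity::Left => alt.rhs.iter().position(|&s| s == lhs),
                Associativity::Right => alt.rhs.iter().rposition(|&s| s == lhs),
                Associativity::Group => None,
            };
            let rhs = alt
                .rhs
                .iter()
                .enumerate()
                .map(|(i, &s)| {
                    if s != lhs {
                        s
                    } else if alt.assoc == Associativity::Group {
                        lhs // grouping restarts at the loosest level
                    } else if Some(i) == same_level {
                        current
                    } else {
                        tighter
                    }
                })
                .collect();
            rules.push(Rule { lhs: current, rhs });
        }
    }
    // The original symbol derives the loosest level.
    if let Some(&loosest) = level_syms.last() {
        rules.push(Rule { lhs, rhs: vec![loosest] });
    }
    rules
}
```

Applied to the four-level expression grammar above, this scheme turns the right-associative `expr power expr` into a rule whose right operand stays on the power level while its left operand drops to the tighter level, and the left-associative rules do the opposite.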

Limitations

We've removed the option to plug in your custom grammar type through traits.

Contributing

Contributions of any feature are welcome. Feel free to open a pull request on GitHub.

See CONTRIBUTING.md for more.

License

Dual-licensed for compatibility with the Rust project.

Licensed under the Apache License Version 2.0: http://www.apache.org/licenses/LICENSE-2.0, or the MIT license: http://opensource.org/licenses/MIT, at your option.
