Lex Graph 🕸️

This repo builds a knowledge graph from UK legislation and provides an application for exploring the graph.

Originally prototyped by @livadlivesey and @GavEdwards.

Accessing the Graph ⬇️📄

This repo is the code used to build Lex Graph from scratch. If you wish to simply access the produced knowledge graph, Lex Graph can be downloaded from i.AI's Hugging Face Datasets.

Setup 🛠️

To build the Lex Graph from scratch, please follow these steps:

Clone the repo

git clone https://github.com/i-dot-ai/lex-graph-build.git

Install poetry if you don't have it

pip install poetry

Install the dependencies and create a virtual environment

poetry install

Download raw legislation data 📥

A dump of the latest XML versions of legislation is available from the new Legislation Research website from the National Archives. At the time of publishing this is in beta. If you are interested in gaining access to the raw data to build the graph from scratch, please contact the Legislation Data Team ([email protected]).

Once you have access, download the Legislative Texts Enacted CLML data and unzip it into data/raw.
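If you prefer to script the unzip step, the following is a minimal sketch using only the Python standard library. The function name unzip_raw_data and the archive file name in the usage note are hypothetical; the real download may be named and structured differently.

```python
import zipfile
from pathlib import Path


def unzip_raw_data(archive_path, target_dir="data/raw"):
    """Extract a downloaded archive into target_dir and list the extracted files."""
    target = Path(target_dir)
    # Create data/raw (or the chosen directory) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Return relative paths so the result is easy to inspect.
    return sorted(
        p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()
    )
```

For example, unzip_raw_data("legislation_download.zip") would extract a (hypothetically named) archive into data/raw and return the list of files it now contains.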

Build the graph 🚀

The graph build process consists of two steps:

Pre-processing the raw data
Building the graph

The processed data is saved in the data/processed directory and the graph is saved in the data/graph directory.

Processing raw data

Process a single test file

poetry run python scripts/preprocess.py --test

Process a custom file

poetry run python scripts/preprocess.py --file <file_path>

Process a subset of files

poetry run python scripts/preprocess.py --year <year> --type <type>

Process all files

poetry run python scripts/preprocess.py --all

Use different input and output paths (defaults: data/raw for input, data/processed for output)

poetry run python scripts/preprocess.py --input_path <input_path> --output_path <output_path>

You can also use a YAML configuration file instead of, or alongside, the command line arguments

poetry run python scripts/preprocess.py --config configs/preprocess_config.yaml
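To illustrate what combining a configuration file with command line arguments can look like, here is a minimal sketch. It is not the repo's implementation: the precedence rule (command line values override file values, which override defaults), the key names, and the function names are all assumptions made for this example. The file configuration is shown as an already-loaded dictionary to keep the sketch free of third-party dependencies.

```python
import argparse

# Defaults taken from the README; everything else here is illustrative.
DEFAULTS = {"input_path": "data/raw", "output_path": "data/processed"}


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative preprocess options")
    # Defaults are None so "not given" can be told apart from a real value.
    parser.add_argument("--year", type=int, default=None)
    parser.add_argument("--type", dest="doc_type", default=None)
    parser.add_argument("--input_path", default=None)
    parser.add_argument("--output_path", default=None)
    return parser


def resolve_options(argv, file_config):
    """Merge defaults, file config, and CLI arguments (assumed precedence: CLI wins)."""
    cli = vars(build_parser().parse_args(argv))
    options = dict(DEFAULTS)
    options.update(file_config)
    # Only arguments actually passed on the command line override the file.
    options.update({key: value for key, value in cli.items() if value is not None})
    return options
```

With this sketch, resolve_options(["--year", "2004"], {"year": 1998, "doc_type": "ukpga"}) keeps the document type from the file but takes the year from the command line. The value "ukpga" is only a placeholder for a legislation type.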

Building the graph

Build graph from a single test file

poetry run python scripts/build_graph.py --test

Build graph from a custom file

poetry run python scripts/build_graph.py --file <file_path>

Build graph from a subset of files

poetry run python scripts/build_graph.py --year <year> --type <type>

Build graph from all files

poetry run python scripts/build_graph.py --all

You can also use a YAML configuration file instead of, or alongside, the command line arguments

poetry run python scripts/build_graph.py --config configs/graph_config.yaml
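Because the graph is built from the processed data, the two scripts need to run in order. The sketch below chains them from Python using the flags shown above. The function names pipeline_commands and run_pipeline are hypothetical helpers, not part of the repo, and the sketch assumes that --year and --type are always passed together.

```python
import subprocess


def pipeline_commands(year=None, doc_type=None):
    """Return the preprocess and build commands in the order they must run."""
    if year is None and doc_type is None:
        selection = ["--all"]
    elif year is not None and doc_type is not None:
        selection = ["--year", str(year), "--type", doc_type]
    else:
        raise ValueError("pass both year and doc_type, or neither")
    base = ["poetry", "run", "python"]
    return [
        base + ["scripts/preprocess.py"] + selection,
        base + ["scripts/build_graph.py"] + selection,
    ]


def run_pipeline(year=None, doc_type=None):
    """Run preprocessing, then the graph build, stopping at the first failure."""
    for command in pipeline_commands(year, doc_type):
        # check=True raises CalledProcessError if a step exits with an error,
        # so the graph is never built from a failed preprocessing run.
        subprocess.run(command, check=True)
```

Calling run_pipeline() with no arguments processes everything. Calling run_pipeline(2004, "ukpga") restricts both steps to one year and type, where "ukpga" is a placeholder value that the scripts may or may not accept.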

Streamlit App 🌐

The Streamlit app in the demo folder provides an interactive interface for exploring the UK legislation graph. Its main entry point is demo/app.py, which offers several ways to explore and visualise the graph. See the README in the demo folder for more details.

Limitations

This is a prototype and does not guarantee accurate data. The codebase and features are subject to change. Some functionality may be experimental and require further testing and validation.

Data Coverage: This prototype currently processes UK legislation data from the National Archives, but may not capture all legislative documents or their complete revision history. Some older or specialised documents might be missing or incompletely processed.
Graph Completeness: The relationships between legislative documents are primarily based on explicit references found in the XML files. Implicit connections, contextual relationships, or references using non-standard formats may be missed.
Data Accuracy: While we strive for accuracy, the automated parsing and graph construction process may contain errors, particularly when handling:
- Complex nested legislative structures
- Unusual formatting or non-standard XML structures
- Cross-references using ambiguous or incomplete citations
- Amendments and repeals that are conditionally applied
Performance Considerations: Processing the complete legislative dataset can be computationally intensive and time-consuming. On a well-powered laptop (e.g., Apple M3 MacBook Pro), we have found it takes up to 30 minutes in total: roughly 15 minutes to preprocess the full set of XML files and roughly 15 minutes to build the graph. Users working with the full dataset should ensure adequate system resources are available.
Visualisation Constraints: The Streamlit visualisation interface may experience performance limitations when displaying very large subgraphs or handling complex queries on the full dataset.
Legal Disclaimer: This tool is intended for research and analysis purposes only. It should not be relied upon for legal advice or as an authoritative source of legislation. Users should always refer to official sources for current and accurate legislative information.
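The Graph Completeness point above can be made concrete with a toy example. The sketch below extracts edges from explicit citation elements in a small made-up XML document. The element and attribute names (Legislation, Citation, URI), the identifiers, and the function name are illustrative assumptions and do not reproduce the project's parser or the full CLML schema.

```python
import xml.etree.ElementTree as ET

# A made-up document: two explicit citations and one implicit reference.
SAMPLE = """<Legislation id="example/2004/1">
  <P>See <Citation URI="example/1985/2">the Example Act 1985</Citation>.</P>
  <P>As amended by <Citation URI="example/1996/3">the Example Act 1996</Citation>.</P>
  <P>Subject to the earlier housing legislation.</P>
</Legislation>"""


def explicit_edges(xml_text):
    """Return (source, target) pairs for every explicit citation element."""
    root = ET.fromstring(xml_text)
    source = root.get("id")
    edges = []
    for citation in root.iter("Citation"):
        target = citation.get("URI")
        if target:  # skip citations that carry no machine-readable target
            edges.append((source, target))
    return edges


print(explicit_edges(SAMPLE))
```

The third paragraph refers to "the earlier housing legislation" without any markup, so it produces no edge. This is the kind of implicit connection that an approach based on explicit references will miss.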

Credits

This project builds upon and was inspired by the work of the Graphie team at King’s Quantitative and Digital Law Lab (QuantLaw), King's College London. Their original project Graphie demonstrated innovative approaches to legal knowledge graph construction and analysis of UK legislation, based on the Housing Act 2004. We encourage those interested in legal knowledge graphs to explore the original Graphie project available at: https://github.com/kclquantlaw/graphie.

All data is sourced from The National Archives legislation website. Crown © and database right material reused under the Open Government Licence v3.0. Material derived from the European Institutions © European Union, 1998-2019, reused under the terms of Commission Decision 2011/833/EU.
