Data Structure Course Project: Lit(erature) Man(agement) (System).
- Import literature from DBLP XML format datasets.
- Search literature by author, title, or keyword (extracted from the title), backed by a B+ tree.
- Count the publications of every author and sort authors by that count.
- Calculate the top 10 keywords by frequency for each year.
- Obtain the collaboration network of authors.
- Calculate the exact number of cliques in the author collaboration graph using the Pivoter algorithm.
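To make the statistics features above concrete, here is a minimal Python sketch on three made-up records. It is not Litman's implementation: the record layout, the stopword list, and the keyword rule (lowercased alphabetic title words) are assumptions for illustration, and the clique counter is a brute-force check that only works on tiny graphs, not the Pivoter algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample records; real data would come from a DBLP import.
RECORDS = [
    {"title": "Fast Graph Clique Counting", "authors": ["Ann", "Bob", "Cid"], "year": 2020},
    {"title": "Graph Indexing with Trees", "authors": ["Ann", "Bob"], "year": 2020},
    {"title": "Clique Counting Revisited", "authors": ["Cid", "Dee"], "year": 2021},
]
# Assumed stopword list; the project's actual keyword rule is not documented here.
STOPWORDS = {"a", "an", "the", "of", "with", "for", "and", "in", "on"}


def keywords(title):
    """Lowercase the title and keep alphabetic tokens that are not stopwords."""
    tokens = (word.strip(".,:;()").lower() for word in title.split())
    return [word for word in tokens if word.isalpha() and word not in STOPWORDS]


def publication_counts(records):
    """Return (author, count) pairs, most prolific first, ties broken by name."""
    counts = Counter(author for rec in records for author in rec["authors"])
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def top_keywords_by_year(records, n=10):
    """Return the n most frequent title keywords for each year."""
    per_year = defaultdict(Counter)
    for rec in records:
        per_year[rec["year"]].update(keywords(rec["title"]))
    return {year: counter.most_common(n) for year, counter in per_year.items()}


def collaboration_edges(records):
    """Map each co-author pair (sorted tuple) to its number of joint papers."""
    edges = Counter()
    for rec in records:
        for pair in combinations(sorted(set(rec["authors"])), 2):
            edges[pair] += 1
    return edges


def count_cliques(edges, k):
    """Brute force: count k-subsets of authors in which every pair is connected."""
    nodes = sorted({author for pair in edges for author in pair})
    return sum(
        all(pair in edges for pair in combinations(group, 2))
        for group in combinations(nodes, k)
    )
```

For example, `count_cliques(collaboration_edges(RECORDS), 3)` counts the triangles in the sample graph. The brute-force check enumerates every k-subset, which is why a dedicated algorithm is needed for a graph the size of DBLP.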
We use uv to manage project dependencies. If you don't have uv installed, please follow the instructions on the uv installation page.
The B+ tree and the Pivoter algorithm are implemented in C++, built with CMake, and exposed to Python through pybind11. Additionally, the Pivoter algorithm uses the GMP library for some calculations, so please make sure that both CMake and the GMP library are installed. For these reasons, we recommend building in a Linux environment.
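A quick way to see whether the two prerequisites are visible on your machine is the following sketch, which uses only the Python standard library. The function name `check_build_prerequisites` is made up for this example and is not part of the repository. Note that `ctypes.util.find_library` looks for the GMP shared library only, so it may report a hit even when the development headers needed for compilation are missing, or miss a library installed in a non-standard location.

```python
import ctypes.util
import shutil


def check_build_prerequisites():
    """Report whether the cmake executable and a GMP shared library can be located."""
    return {
        # shutil.which searches PATH for the cmake executable.
        "cmake": shutil.which("cmake") is not None,
        # find_library returns a library name or path, or None if nothing is found.
        "gmp": ctypes.util.find_library("gmp") is not None,
    }
```

Calling `check_build_prerequisites()` returns a dictionary with one boolean per prerequisite; a `False` entry suggests installing that package before building.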
The web frontend is built using GitHub Actions, and the built static pages have already been uploaded to the repository, so you don't need to worry about this.
Install dependencies:
    uv sync
Enter uv venv:
    uv venv
After entering uv venv, run
    python main.py
Then open the URL bound to Flask, which is http://127.0.0.1:2747 by default.
After launching Litman, simply run
    python load_full_dblp.py
to download, unarchive, split, and finally load the DBLP dataset into Litman.
Note: On my computer, this took about seven hours.
Alternatively, you can use the "Import Literature" feature under the "Manage Literature" tab of the web frontend to load DBLP XML format datasets manually.
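For orientation, a DBLP XML dataset is a `<dblp>` root element containing one element per publication (`article`, `inproceedings`, and so on), each with `author`, `title`, and `year` children. The sketch below streams such a file with the standard library's `iterparse`. It is an illustration, not Litman's importer: `iter_records` and the inline sample are hypothetical, and the full DBLP dump also references character entities defined in `dblp.dtd`, which this sketch does not handle.

```python
import io
import xml.etree.ElementTree as ET

# Publication element names used by DBLP.
RECORD_TAGS = {"article", "inproceedings", "proceedings", "book",
               "incollection", "phdthesis", "mastersthesis", "www"}

# Made-up sample in DBLP's layout, without entities or a DTD reference.
SAMPLE = b"""<?xml version="1.0"?>
<dblp>
<article key="journals/x/Doe20" mdate="2020-01-01">
<author>Jane Doe</author><author>John Roe</author>
<title>An Example Title.</title><year>2020</year>
</article>
<inproceedings key="conf/y/Roe21">
<author>John Roe</author><title>Another Example.</title><year>2021</year>
</inproceedings>
</dblp>"""


def iter_records(source):
    """Yield one dict per publication without loading the whole file at once."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in RECORD_TAGS:
            continue
        title = elem.find("title")
        yield {
            "key": elem.get("key"),
            "type": elem.tag,
            "authors": [author.text for author in elem.findall("author")],
            # Titles may contain nested markup such as <i>, so join all text nodes.
            "title": "".join(title.itertext()) if title is not None else "",
            "year": elem.findtext("year"),
        }
        # Drop the record's children once it has been consumed to bound memory use.
        elem.clear()


for record in iter_records(io.BytesIO(SAMPLE)):
    print(record["year"], record["title"], record["authors"])
```

Streaming matters here because the complete DBLP dump is far too large to parse comfortably into a single in-memory tree.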
The repository is published under the GNU General Public License v3.