LLAssist is a tool for processing and analyzing research articles using Natural Language Processing (NLP) techniques and Large Language Models (LLMs).
Note:
- The paper titled "LLAssist: Simple Tools for Automating Literature Review Using Large Language Models" uses commit versions prior to 07caad7, specifically commit 3bf51a6.
- Read articles from CSV files
- Extract key semantics (topics, entities, keywords) from article titles and abstracts
- Estimate relevance of articles to research questions
- Generate embeddings for keywords
- Output results in both JSON and CSV formats
The main entry point of the application. It orchestrates the process of:
- Reading articles from a CSV file
- Processing each article to extract semantics and estimate relevance
- Writing results incrementally to a CSV file
- Generating a final JSON output
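The flow above can be sketched in a few lines. The project itself is a .NET application, so this Python version is only a language-neutral illustration under stated assumptions: `run_pipeline`, `analyze`, and the added column names are hypothetical and do not come from the LLAssist code base.

```python
import csv
import json
from pathlib import Path
from typing import Callable


def run_pipeline(input_csv, output_csv, output_json,
                 questions: list[str],
                 analyze: Callable[[dict, list[str]], dict]) -> list[dict]:
    """Read articles, process each one, append to CSV, then dump JSON."""
    results = []
    with Path(input_csv).open(newline="", encoding="utf-8") as src, \
         Path(output_csv).open("w", newline="", encoding="utf-8") as dst:
        writer = None
        for article in csv.DictReader(src):
            # Hypothetical LLM-backed step: semantics plus relevance.
            row = analyze(article, questions)
            if writer is None:
                # Header is taken from the first processed row.
                writer = csv.DictWriter(dst, fieldnames=list(row))
                writer.writeheader()
            writer.writerow(row)
            dst.flush()  # incremental write: finished rows survive a crash
            results.append(row)
    # Final JSON output with every processed article.
    Path(output_json).write_text(json.dumps(results, indent=2), encoding="utf-8")
    return results


def stub_analyze(article: dict, questions: list[str]) -> dict:
    """Placeholder that adds empty fields instead of calling a model."""
    return {**article, "keywords": "", "relevance": ""}
```

Writing each row as soon as it is processed means a long run that fails partway still leaves usable partial results in the CSV file, while the JSON file is only written once everything has finished.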
Handles Natural Language Processing tasks:
- Extracting key semantics from text
- Estimating relevance of content to research questions
- Generating embeddings for keywords
Manages connections to various Large Language Models:
- Ollama Gemma 2 (local)
- GPT-3.5 Turbo (OpenAI)
- GPT-4 (OpenAI)
- Text Embedding model (OpenAI)
Handles file I/O operations:
- Reading articles from CSV files
- Writing articles to JSON files
- Writing results to CSV files
dotnet run --project llassist.AppConsole <input_csv_file> <research_questions_file>
Where:
- `<input_csv_file>` is the path to the CSV file containing the articles
- `<research_questions_file>` is the path to a text file containing the research questions (one per line)
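As a hedged illustration of preparing the two arguments, the sketch below writes a questions file with one question per line and assembles the command shown above. The helper names, file names, and sample questions are hypothetical; only the `dotnet run` invocation comes from this README.

```python
from pathlib import Path


def write_questions(path: str, questions: list[str]) -> Path:
    """Write non-empty research questions to a text file, one per line."""
    target = Path(path)
    cleaned = [q.strip() for q in questions if q.strip()]
    target.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return target


def build_command(input_csv: str, questions_file: str) -> list[str]:
    """Return the argument list for the console application."""
    return ["dotnet", "run", "--project", "llassist.AppConsole",
            input_csv, questions_file]


if __name__ == "__main__":
    # Hypothetical file names and questions, for illustration only.
    write_questions("questions.txt", [
        "How are LLMs used to screen papers for literature reviews?",
        "What evaluation methods are reported?",
    ])
    print(" ".join(build_command("articles.csv", "questions.txt")))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems when paths contain spaces.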
Run Docker Compose from the root directory:
docker-compose up -d
Run the database migrations from the ApiService directory:
dotnet ef database update
The program generates two output files:
- A JSON file (`<input_filename>-result.json`) containing all processed articles with their semantics and relevance scores
- A CSV file (`<input_filename>-result.csv`) with the same information in a tabular format
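To locate the results from a script, the naming pattern above can be expressed as a small helper. This is a sketch resting on two assumptions that the README does not confirm: that `<input_filename>` means the input file name without its extension, and that the outputs are written next to the input file. The function name `result_paths` is hypothetical.

```python
from pathlib import Path


def result_paths(input_csv: str) -> tuple[Path, Path]:
    """Return the assumed (JSON, CSV) output paths for an input CSV file."""
    source = Path(input_csv)
    # Assumption: the extension is dropped and outputs sit beside the input.
    json_path = source.with_name(f"{source.stem}-result.json")
    csv_path = source.with_name(f"{source.stem}-result.csv")
    return json_path, csv_path
```

If the actual outputs land elsewhere (for example, in the working directory), only the directory part of the helper needs to change.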
- Microsoft.SemanticKernel
- CsvHelper
- Microsoft.Extensions.Logging
- The program uses a local Ollama instance for the Gemma 2 model. Ensure it is running on `http://localhost:11434` before executing the program.
- An OpenAI API key is required for GPT models and embeddings. Set it in the `LLMService` constructor.
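A quick pre-flight check can confirm that Ollama is reachable before a long run starts. The sketch below is not part of LLAssist; it queries Ollama's `/api/tags` endpoint (which lists locally available models) and treats any connection failure as "not running". The function name and timeout value are illustrative choices.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


if __name__ == "__main__":
    if not ollama_is_running():
        print("Ollama does not appear to be running on http://localhost:11434")
```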
This tool is for research purposes. Ensure you have the necessary rights and permissions to process and analyze the articles.
- This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
- This project is a fork of llassist, which was originally MIT licensed. The original MIT-licensed code has been incorporated into this project. As per the terms of the MIT license, we have included the original MIT license text and copyright notice in our NOTICE file.
- All modifications and additions to the original code, as well as the project as a whole, are licensed under AGPL-3.0.
- For full license text and attribution details, please see the LICENSE and NOTICE files.