"Artificial intelligence is only as good as the data it learns from."
- Unknown
Tuatara is a library for generating fine-tuning pairs for large language model (LLM) post-training.
Fine-tuning large language models requires high-quality training data pairs that are well grounded in their source documents. Creating these pairs manually is laborious and error-prone, and existing tools often lack flexibility or fail to scale across different document types and domains. Tuatara addresses these challenges directly.
Run the following command to install Tuatara:
pip install git+https://github.com/dross20/tuatara

The following example demonstrates how to use Tuatara's preconfigured pipeline to create fine-tuning pairs from multiple documents. By default, default_pipeline uses the OpenAI API for LLM inference and looks for your OpenAI API key in your environment variables.
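Before running the example, make sure your key is visible to the process. A minimal sketch, assuming the pipeline reads the standard OPENAI_API_KEY environment variable used by the OpenAI SDK (you could also export it from your shell instead):

import os

# Assumption: the default pipeline picks up the key from OPENAI_API_KEY,
# the variable the official OpenAI SDK reads. Replace the placeholder below
# with your actual key.
os.environ["OPENAI_API_KEY"] = "sk-..."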
from tuatara import default_pipeline
# Source documents to generate fine-tuning pairs from
documents = [
"./document1.pdf",
"./document2.pdf",
"./document3.txt"
]
# Build the preconfigured pipeline, using GPT-4o for inference
pipeline = default_pipeline(model="gpt-4o")
# Run the pipeline; it returns the generated pairs along with a history object
pairs, history = pipeline(documents)

This project is licensed under the MIT license.