Stars
Neo2RDF is a command line application that converts a Neo4j database into an RDF file in Turtle format.
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
📝 Python package to calculate readability statistics of a text object (paragraphs, sentences, articles).
This repository contains a dataset for hate speech detection on social media platforms.
Perplexica is an AI-powered answering engine.
A curated list of resources on retrieval-augmented generation (RAG) in large language models
Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs
An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG)
📚 A curated list of Awesome LLM/VLM Inference Papers with Code: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
AGDISTIS - Agnostic Named Entity Disambiguation
A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
SpanMarker for Named Entity Recognition
PyTorch Geometric Signed Directed is a signed/directed graph neural network extension library for PyTorch Geometric. The paper was accepted at LoG 2023.
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!
This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations".