- Universidade de Brasília (UnB)
- Brasília, Brasil
- jefersonalves.com
- in/ferreirajeferson
Stars
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Deepagents is an agent harness built on langchain and langgraph. Deep agents are equipped with a planning tool, a filesystem backend, and the ability to spawn subagents - making them well-equipped …
An open-source AI agent that brings the power of Gemini directly into your terminal.
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
DSPy: The framework for programming—not prompting—language models
🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines
MCP Toolbox for Databases is an open source MCP server for databases.
Robyn is a Super Fast Async Python Web Framework with a Rust runtime.
Code snippets for Data Engineering Design Patterns book
A comprehensive collection of Model Context Protocol (MCP) servers
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Examples and guides for using the OpenAI API
Structured data extraction and instruction calling with ML, LLM and Vision LLM
Get your documents ready for gen AI
A concise API for exploratory data visualization implementing a layered grammar of graphics
A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front-end for interactive graphics with any language on the back-end for data a…
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Open, Multi-modal Catalog for Data & AI
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
Playing with different Apache Spark packages
Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)