Stars
A Python library for decision tree visualization and model interpretation.
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
A Collection of Awesome Large Weather Models (LWMs) | AI for Earth (AI4Earth) | AI for Science (AI4Science)
ML algorithms implemented and derived from first-principles in Jupyter Notebooks and NumPy
AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards
A Large-Scale Climate Model Dataset for Machine Learning
Curated resources for discovering, reading, and working with arXiv papers
🪐 Markdown with superpowers: from ideas to papers, presentations, websites, books, and knowledge bases.
Just a super thin wrapper for Python tasks that form a flow.
A fully functional and simple Machine Learning library made entirely from scratch with Python.
CleverBee - The Open Source Deep Researcher Tool
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
An example starter repo for Python projects
Toolkit for linearizing PDFs for LLM datasets/training
A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 50+ formats. Available for Rust, Python, Rub…
GPU Accelerated t-SNE for CUDA with Python bindings
Curated Data Science resources (Free & Paid) to help aspiring and experienced data scientists learn, grow, and advance their careers.
🌦️ A catalogue and categorization of AI-based weather forecasting models.
Interactive Tools for Machine Learning, Deep Learning and Math
AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing that's as simple as writing Pandas code
TimeGPT-1: production ready pre-trained Time Series Foundation Model for forecasting and anomaly detection. Generative pretrained transformer for time series trained on over 100B data points. It's …
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
A simplified library for decentralized, privacy preserving machine learning
Scripts for figures and calculations of the manuscript by Warnat-Herresthal et al. 2020