Stars
Accelerated BLAST compatible local sequence aligner.
MMseqs2: ultra fast and sensitive search and clustering suite
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Plotting scripts for long read sequencing data
Accelerate your web app development | Build fast. Run fast.
A comprehensive pipeline for short read metagenomic data
Scrapy Chrono24 Watch Scraper: A Python web scraping project that collects detailed watch information from Chrono24 and saves it in JSON format. Ideal for watch enthusiasts and data analysis
Discover real-time weather analysis through stream and batch processing with Apache Kafka, Apache Spark, and MySQL. This project seamlessly integrates both techniques to compute essential weather …
Free Weather Forecast API for non-commercial use
📚 Openblog is an elegant, simple, and user-friendly blog. Focused on accessibility, SEO and performance.
Roadmap for Data Engineers. The goal of this roadmap is to get you hired!
Open-Source Web UI for Apache Kafka Management
Building a Data Lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the Lakehouse, plus visualization and a recommendation app.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
This project was created as part of the Data Engineering Zoomcamp Cohort 2025.
E2E DE solution for monitoring and analyzing global air quality
Exploring bluesky.social for seasonal digital disease detection
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
A lightweight data processing framework built on DuckDB and 3FS.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.