Stars
Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry thr…
Uplift modeling and causal inference with machine learning algorithms
advertools - online marketing productivity and analysis tools
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, P…
PySpark methods to enhance developer productivity 📣 👯 🎉
PySpark test helper methods with beautiful error messages
Database Markup Language (DBML), designed to define and document database structures
Template for a data science project
Free MLOps course from DataTalks.Club
🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
Examples for the blog post on pytest-mock
Demo of using Nutter to test Databricks notebooks in a CI/CD pipeline
Dataset extracted from the Jira ITS of four popular open source ecosystems i.e., the Apache Software Foundation, Spring, JBoss and CodeHaus communities.
An ultra-simplified explanation of design patterns
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Generate and Visualize Data Lineage from query history
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
An open-source Python library for automated feature engineering
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Examples surrounding Databricks.
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
A collection of learning resources for curious software engineers
📙 Awesome Data Catalogs and Observability Platforms.
WIP: Roadmap to becoming a machine learning engineer in 2020
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Panel: The powerful data exploration & web app framework for Python
Always know what to expect from your data.