- New York, U.S.
- http://lichangny.github.io/
Stars
Google Cloud Storage emulator & testing library.
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
A List of Recommender Systems and Resources
DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector
A shell script to set up a macOS laptop for web and mobile development.
Track changes to your Rails models
Do some browser detection with Ruby. Includes ActionController integration.
Python module installed with setup.py
Google BigQuery connector for pandas
Samples for the DoubleClick for Advertisers Reporting and Trafficking API
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Repository with examples and smoke tests for the GCP Airflow operators and hooks
Data pipeline is a tool to run data loading pipelines. It is an open-source App Engine app that users can extend to suit their own needs. Out of the box it will load files from a source, transform…
Apache Beam is a unified programming model for Batch and Streaming data processing.
DonorsChoose.org Data Science Team Open Source Code
Pentaho Data Integration (ETL), a.k.a. Kettle
Upserts, Deletes And Incremental Processing on Big Data.
Adds static typing to JavaScript to improve developer productivity and code quality.
Streaming MapReduce with Scalding and Storm
Ansible playbook to deploy distributed technologies
A short guide for transitioning from Python to Scala
Repo to migrate the old wiki to, especially for developers and code examples
Apache Superset is a Data Visualization and Data Exploration Platform
Docker image for Airbnb's Superset
Content for Udacity's Machine Learning curriculum
An extension of GeoJSON that encodes topology! 🌐