
Complete training data platform for machine learning delivered as a single application.


DataLearns/diffgram

 
 


Diffgram - Open AI Data Platform

Diffgram is annotation and training data software.

What is Diffgram?

Diffgram is a platform for Data Annotation and Data Science. Diffgram is one integrated system that handles most things in the new Training Data (Machine Learning) domain. Diffgram integrates with adjacent tools.

Who is Diffgram for?

Data Scientists, Project Admins, Software Engineers, Data Annotators and Subject Matter Experts.

What is Diffgram a drop-in replacement for?

Diffgram is a drop-in replacement for the following systems: Labelbox, CVAT, SuperAnnotate, Label Studio (Heartex), V7 Labs (Darwin), BasicAI, SuperbAI, Kili-Technology, HastyAI, Dataloop, Keymakr.

If you find missing features or bugs, please report them to diffgram/issues as soon as possible. See the Contribution Guide for more. For background, see Understanding Diffgram High Level.

Features

First, the Full Platform is Open Source. There is no trick where it "sort of works" but you need to pay for a SaaS service to really use it.

This is the full core product. Managed Services and Enterprise support are optional.

This is an ACTIVE project. We are very open to feedback and encourage you to create Issues and help us grow!

User Friendly

  • NEW Import Wizard saves you hours of mapping your data.
  • NEW Streamlined Annotation UI suitable both for "First Time" Subject Matter Experts and, with its powerful options, for Professional Full Time Annotators.

Annotation

Diffgram is a fully featured annotation tool for images and video to create, update, and maintain high quality training datasets.

Schema (Ontology): Diffgram supports all popular attributes and spatial types including Custom Spatial types.

Data Science

Diffgram is an amazing way to access, view, compare, and collaborate on datasets to create the highest quality models.

Because these features are fully integrated with the Annotation Tooling, you can go directly from spotting an issue to creating a labeling campaign or updating the schema to correct it.

  • Store virtually any scale of dataset and instantly access slices of the data to avoid having to download/unzip/load.
  • Fast access to datasets from multiple machines. Have multiple Data Scientists working on the same data.
  • NEW Data Explorer: Visualize multiple datasets (including video) in seconds and compare models easily without extra computation.
  • Automatic Dataset Versioning and user-definable datasets.
  • Collaborate, share, and comment on specific instances with a Diffgram Permalink.

And coming soon:

  • Load streaming data from Diffgram directly into PyTorch and TensorFlow with one line
  • Play with model parameters and see the results in real time with Userscripts

Workflow

Manages Annotation Workflow, Tasks, Quality Assurance and more.

  • Create human review Pipelines in one click.
  • Webhooks with Actions
  • Easily annotate a single dataset, or scale to hundreds of projects with thousands of subdivided task sets. Includes easy search and filtering.
  • Fully integrated customizable Annotation Reporting.
  • Continually upgrade your data, including easily adding more depth to existing partially annotated sets.
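
As a rough illustration of the "Webhooks with Actions" item above, the sketch below shows a minimal HTTP endpoint that could receive a JSON webhook POST. It is a generic receiver built on the Python standard library, not Diffgram's documented interface: the payload field name `kind`, the port, and the helper names are all assumptions made for illustration. Check the Diffgram docs for the actual event format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(body: bytes) -> dict:
    """Decode a JSON webhook body; return {} if it is not a JSON object."""
    try:
        data = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts POST requests and logs the event type."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        event = parse_event(self.rfile.read(length))
        # "kind" is a hypothetical field name, not a documented Diffgram key.
        print("received event:", event.get("kind", "<unknown>"))
        # 204 for a parsable JSON object, 400 for anything else.
        self.send_response(204 if event else 400)
        self.end_headers()


def serve(port: int = 9000) -> None:
    """Block forever, handling webhook POSTs on the given (arbitrary) port."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. Parsing is kept in its own function so it can be tested without opening a socket.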

Database & Software Engineering

It's a database for your training data, covering both metadata and access to raw BLOB data (on top of your storage choice).

  • Runs on your local system or cloud. Less lag, more secure, more control. See Security and Privacy.
  • Integrates with your tools and 3rd party workforces. See Integrations.

Tested and Stable Core

Fully integrated automatic test suite, with comprehensive End to End tests and many unit tests.

Quickstart

Try Diffgram Online (Hosted Service, No Setup.)

Diffgram Dev Installer Quickstart

Requires Docker and Docker Compose

git clone https://github.com/diffgram/diffgram.git
cd diffgram
pip install -r requirements.txt
python install.py
# Follow the installer instructions.
# After install, view the Web UI at: http://localhost:8085
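
After the installer finishes, the containers may take a little while to start. The sketch below is one way to confirm that the Web UI is answering at the address given above. It only checks that something responds over HTTP on that URL. It does not use any Diffgram API, and the function names, retry count, and delay are arbitrary choices for illustration.

```python
import time
import urllib.error
import urllib.request


def ui_is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` gets any response from a server."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_for_ui(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll `url` until it responds or the attempts run out."""
    for _ in range(attempts):
        if ui_is_up(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_ui("http://localhost:8085")` returns `True` once the UI responds and `False` if it never does within the allotted attempts.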

Cloud

Other Getting Started Docs:

Benefits

  1. Flexible deploy and many integrations - run Diffgram anywhere in the way you want.
  2. Scale every aspect - from volume of data, to number of supervisors, to ML speed up approaches.
  3. Fully featured - 'batteries included'.

Docs

Support & Community

  1. Open an issue (Technical, bugs, etc)
  2. Chat On Slack (Coming soon)
  3. Forum (Coming Soon)

Security issues: Do not create a public issue. Email [email protected] with the details. See the Docs.

Vision

  1. Application: Support all popular media types for raw data; all popular schema, label, and attribute needs; and all annotation assist speed up approaches
  2. Support all popular training data management and organizational needs
  3. Integrate with all popular 3rd party applications and related offerings
  4. Support modification of source code
  5. Run on any hardware, any cloud, and anywhere

Technical Direction

Speed Ups & AI

Latest AI + More

Integrations

Note: for the initial open core release, Actions Hooks are not yet available. Please see Diffgram.com and use them there if needed.

Contributing

We welcome contributions! Please see our contributing documentation.

Architecture & Design Docs

We plan to release more internal architecture docs over time. Please see the general docs in the meantime.

Stack Example

As a loose analogy to the LAMP or MEAN stacks, one example is to use a pre-processing tool like Lightly, then annotate in Diffgram, and train models with Determined AI. This is like an "LDD" stack: Lightly, Diffgram, Determined AI.

You can use Diffgram with your choice of surrounding tools - the ones shown are examples and optional.
