Roadmap after v0.3 #64

@gavento

A document to track the directions after v0.3, replacing #26. It lists our mid- and long-term goals with their [priority], (assignee), and any sub-tasks.

Any help is welcome, and mentoring is available for most tasks!

Remaining enhancements from v0.3

Will be updated after prioritization discussion.

Client-side protocols

Replace the capnp RPC and the current monitoring dashboard HTTP API with a common protocol.
Part of #11 (more discussion there) but specific to the public API.

Improve the dashboard with more information and post-mortem analysis

Fix current bugs

Custom tasks (subworkers) in more languages

  • Python subworker as a library [low] (run standalone scripts as opposed to defining them in the client only)

Easier deployment in the cloud

Packaging for easier deployment

Multiple options, priorities may vary. (@spirali)

  • AppImage/Snap packages [low] (we already have static binaries)
  • Deb/other distro packages [low]

Improve Python API

Pythonize the client API.

Improve testing infrastructure

More real-world code examples

Lower priority; best based on real use cases. Ideas: numpy subtasks, C++/Rust subworkers.

Enhancements to revisit in the (not so distant) future

  • Integration with some popular libraries
    • Apache Arrow content-type
      • The basic type and loading are implemented. We could add more operations (filter, split, merge, ...)
    • XGBoost tasks, etc.
    • Why not now: It is not clear what the demand would be
  • Worker configuration files (needed for common resources such as CPUs and special resources such as GPUs, different subworker locations and configurations, ...)
    • Partially done
    • Why not now: Needs to be thought through (esp. w.r.t. resources), not needed now
  • Separate session construction and running (save/load session)
    • Why not now: It is not clear what the use cases would be; not difficult once the API has stabilized
  • Clients in other languages: Rust, C++, Java, ...
    • Why not now: Not clear what would be the demand. Easier after the protocol/Python API stabilization.
  • Scale the scheduler, benchmarks
    • There is a benchmark in utils/bench/simple_task_scaling.py. The results as of 0.2 are here.
    • Why not now: While eventually crucial, the scheduler is sufficient when there are <1000 tasks to be scheduled at once.
