
MMDA canonical handling of Image-type data

Open
No due date
Last updated Jul 27, 2022
0% complete

Define a policy around how MMDA will:

  • Represent Images in Document
  • Serialize/Load Images
  • Integrate with other vision libraries like LayoutParser. On this point in particular, aim for two options:
      (1) Indirect integration, where the user runs their vision models outside of MMDA, formats the resulting image data to be compatible with MMDA, then loads it in to manipulate within MMDA. This suits libraries like LayoutParser that depend on detectron2 and may have environments incompatible with the rest of MMDA.
      (2) Direct integration, where the user runs vision models directly in the same environment as other MMDA.Predictors. This suits libraries like Huggingface, which are adding vision models.
  • Include MMDA Image fields, such as Tables/Figures and their associated Captions
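
As a rough illustration of the indirect-integration option and the serialize/load item, the sketch below shows one way an external vision pipeline might hand page images and predicted regions to another tool through a JSON payload. It is not MMDA's API: `ImageBox`, `dump_payload`, `load_payload`, and the payload layout are all hypothetical names chosen for this example, and base64-encoded bytes are only one possible encoding for images.

```python
import base64
import json
from dataclasses import dataclass, asdict
from typing import List, Tuple


@dataclass
class ImageBox:
    """Hypothetical region predicted by a vision model run outside MMDA."""
    page: int
    left: float
    top: float
    width: float
    height: float
    label: str  # e.g. "Figure", "Table", "Caption"


def encode_image_bytes(raw: bytes) -> str:
    # JSON cannot hold raw bytes, so store them as base64 text.
    return base64.b64encode(raw).decode("ascii")


def decode_image_bytes(encoded: str) -> bytes:
    return base64.b64decode(encoded.encode("ascii"))


def dump_payload(page_images: List[bytes], boxes: List[ImageBox]) -> str:
    """Serialize page images and predicted boxes to a JSON string."""
    return json.dumps({
        "images": [encode_image_bytes(b) for b in page_images],
        "boxes": [asdict(b) for b in boxes],
    })


def load_payload(text: str) -> Tuple[List[bytes], List[ImageBox]]:
    """Inverse of dump_payload: recover image bytes and boxes."""
    data = json.loads(text)
    images = [decode_image_bytes(s) for s in data["images"]]
    boxes = [ImageBox(**b) for b in data["boxes"]]
    return images, boxes
```

Under this assumed layout, the vision environment would call `dump_payload` and write the result to disk, and the MMDA-side environment would call `load_payload` without needing detectron2 or any other vision dependency installed.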
