Better predict api design #21

lolipopshock · 2022-06-29T07:04:47Z

This PR introduces two improvements to the predict APIs:

The predict API can now specify its return type: either a layoutparser Layout or just a list of category predictions. This makes the API more general and supports downstream uses such as mmda.
A new predict_page API is dedicated to the vila datamodels. It reduces the prediction process to a single call:

for idx, page_token in enumerate(page_tokens):
    
    # New
    predicted_tokens1 = pdf_predictor.predict_page(
        page_token, page_image=page_images[idx], visual_group_detector=vision_model
    )

    # Previous
    blocks = vision_model.detect(page_images[idx])
    page_token.annotate(blocks=blocks)
    pdf_data = page_token.to_pagedata().to_dict()
    predicted_tokens2 = pdf_predictor.predict(pdf_data, page_token.page_size)

    assert predicted_tokens1 == predicted_tokens2
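
To make the relationship between the two call styles concrete, here is a minimal, self-contained sketch. It is not the vila implementation: `ToyPage`, `ToyDetector`, `ToyPredictor`, the `return_type` parameter name, and its `"tokens"` / `"labels"` values are all hypothetical stand-ins chosen for illustration. It only shows the general pattern of a page-level convenience method that wraps the detect / annotate / convert / predict steps, plus a switch on the return type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyPage:
    """Hypothetical stand-in for a page-level token container."""
    words: List[str]
    page_size: Tuple[int, int]
    blocks: List[str] = field(default_factory=list)

    def annotate(self, blocks: List[str]) -> None:
        # Attach detected visual groups to the page.
        self.blocks = blocks

    def to_dict(self) -> Dict[str, list]:
        return {"words": self.words, "blocks": self.blocks}


class ToyDetector:
    """Hypothetical visual group detector: returns one dummy block per image."""

    def detect(self, page_image) -> List[str]:
        return ["block-0"]


class ToyPredictor:
    """Hypothetical predictor; the 'model' labels uppercase words as titles."""

    def predict(self, page_data, page_size, return_type="tokens"):
        labels = [
            "title" if word.isupper() else "paragraph"
            for word in page_data["words"]
        ]
        if return_type == "labels":
            # Bare category predictions, one per word.
            return labels
        if return_type == "tokens":
            # Richer output: each word paired with its predicted category.
            return list(zip(page_data["words"], labels))
        raise ValueError(f"unknown return_type: {return_type!r}")

    def predict_page(self, page, page_image, visual_group_detector,
                     return_type="tokens"):
        # Wrap the multi-step workflow in one call.
        page.annotate(blocks=visual_group_detector.detect(page_image))
        return self.predict(page.to_dict(), page.page_size,
                            return_type=return_type)


page = ToyPage(words=["ABSTRACT", "hello"], page_size=(612, 792))
predictor = ToyPredictor()
one_call = predictor.predict_page(page, None, ToyDetector())
label_only = predictor.predict_page(page, None, ToyDetector(),
                                    return_type="labels")
```

In this sketch the wrapper adds no logic of its own, so its output matches the step-by-step path by construction, which is the same property the assertion in the PR description checks.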

lolipopshock added 10 commits June 29, 2022 00:01

Allow specifying the return types

109b52d

change to variable names that make more sense

5005f16

Add predictor test

356a74c

Implement predict_page for directly running predictions on vila datamodel

a042685

Update vila_run test

2059911

fix spacing

37c3c5f

Make line detection more explicit

e38f52e

Fix typo

f17d220

Improve tests

d7fe035

fix readme typo

95b2485

lolipopshock merged commit 25eeeb5 into master Jun 30, 2022
