
@lolipopshock lolipopshock commented Jul 7, 2022

The previous version had two errors in the box normalization step:

  1. it scaled the bbox along both the height and width dimensions even if only one dimension was oversized
  2. it performed the scaling after inserting the special tokens, which may carry the `[1000,1000,1000,1000]` box

For example, take a page of size (700, 1024) (width, height). The previous normalization would:

  1. scale the width dimension to 1000, even though 700 is already within range
  2. since it normalized after injecting a special token with coordinate 1000, rescale that coordinate to 1000*1000/700≈1428, which defeats the purpose of resizing.

In the new fix, we

  1. make sure to scale only the oversized side
  2. do the scaling before injecting the special tokens.
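The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper names `normalize_box` and `prepare_boxes`, the `SPECIAL_TOKEN_BOX` constant, the 1000x1000 target, and appending the special-token box at the end are all assumptions made for the example.

```python
# Hypothetical sketch: names and the 1000x1000 target are assumptions.
SPECIAL_TOKEN_BOX = [1000, 1000, 1000, 1000]  # assumed box attached to a special token


def normalize_box(box, page_width, page_height, target_width=1000, target_height=1000):
    """Scale a box only along the dimensions where the page exceeds the target."""
    x1, y1, x2, y2 = (float(v) for v in box)
    if page_width > target_width:
        x1 = x1 / page_width * target_width
        x2 = x2 / page_width * target_width
    if page_height > target_height:
        y1 = y1 / page_height * target_height
        y2 = y2 / page_height * target_height
    return [x1, y1, x2, y2]


def prepare_boxes(word_boxes, page_width, page_height):
    """Normalize the word boxes first, then add the special-token box."""
    normalized = [normalize_box(b, page_width, page_height) for b in word_boxes]
    # The special-token box is already in target coordinates, so it is
    # added only after normalization and is never rescaled.
    return normalized + [SPECIAL_TOKEN_BOX]


# Page of size (700, 1024): only the height exceeds 1000, so only y is scaled.
boxes = prepare_boxes([[350, 512, 700, 1024]], page_width=700, page_height=1024)
```

With the (700, 1024) page from the example, the x coordinates stay untouched, the y coordinates are scaled by 1000/1024, and the special-token box keeps its original value of 1000.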

x1 = float(x1) / page_width * target_width
x2 = float(x2) / page_width * target_width

if page_height > target_height:


Does it matter that this will result in disproportionate scaling? Width could shrink by a greater factor than height, or vice versa. That's guaranteed if only one of these if clauses is True, and seems very likely even if both are, unless target_width / target_height matches the aspect ratio of the source document.
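To make the concern concrete, here is a small sketch of the per-axis factors that result from scaling each dimension independently. The helper name `per_axis_factors` and the 1000x1000 target are assumptions for illustration only.

```python
# Hypothetical helper: computes the factor applied along each axis when
# each dimension is scaled only if it exceeds its target.
def per_axis_factors(page_width, page_height, target_width=1000, target_height=1000):
    sx = target_width / page_width if page_width > target_width else 1.0
    sy = target_height / page_height if page_height > target_height else 1.0
    return sx, sy


# Only the height is oversized here, so the two factors differ.
sx, sy = per_axis_factors(700, 1024)

# A box that is square on the page is no longer square afterwards.
box_width, box_height = 100 * sx, 100 * sy
```

Whenever `sx != sy`, every box on the page changes its aspect ratio by the same amount.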

Collaborator Author


That's a good point

Collaborator Author


Maybe I should further change the scaling logic


@cmwilhelm cmwilhelm Jul 8, 2022


If you want to preserve the aspect ratio I think you can do something like

if page_width > target_width or page_height > target_height:
    scale_factor = target_width / page_width if page_width > page_height else target_height / page_height

    x1 = float(x1) * scale_factor
    x2 = float(x2) * scale_factor
    y1 = float(y1) * scale_factor
    y2 = float(y2) * scale_factor


(Though again I don't know what the consequences are of changing the aspect ratio)
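One variant of the aspect-ratio-preserving idea, sketched under assumptions: it takes the smaller of the two per-axis ratios as a single scale factor, so the result fits the target even when the target is not square. The function name `scale_box_uniform` and the 1000x1000 default are hypothetical, and this is not the code merged in this PR.

```python
# Hypothetical sketch of uniform (aspect-ratio-preserving) scaling.
def scale_box_uniform(box, page_width, page_height, target_width=1000, target_height=1000):
    # Use the most restrictive ratio; cap at 1.0 so pages that already fit
    # inside the target are left unchanged.
    scale = min(target_width / page_width, target_height / page_height, 1.0)
    return [float(v) * scale for v in box]


# Page of size (700, 1024): the height ratio 1000/1024 is the limiting one,
# and it is applied to both x and y.
scaled = scale_box_uniform([0, 0, 700, 1024], page_width=700, page_height=1024)
```

Because both axes share one factor, box shapes are preserved, at the cost of leaving part of the target range unused along the non-limiting axis.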

@lolipopshock lolipopshock merged commit 668f17d into master Jul 8, 2022

3 participants