Tags: allenai/mmda
Allows use of pydantic 2 by dependents. (#283)
Some applications want to use pydantic 2 but have been blocked by the strict pydantic 1 constraint in mmda. This changeset allows the basic library to be used with either major version, but preserves the 1.x requirement for specific models so as not to break anything in S2's SPP pipeline.
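For dependents that need to run against either pydantic major version, a small compatibility shim can help. The sketch below is an illustration only and is not taken from mmda: `pydantic_major` and `dump_model` are hypothetical helper names. It relies on `importlib.metadata` from the standard library and on the fact that pydantic 2 models expose `model_dump()` while pydantic 1 models expose `dict()`.

```python
from importlib import metadata


def pydantic_major(version_string=None):
    """Return the installed pydantic major version as an int.

    Hypothetical helper: returns None when pydantic is not installed.
    A version string can be passed in explicitly, mainly for testing.
    """
    if version_string is None:
        try:
            version_string = metadata.version("pydantic")
        except metadata.PackageNotFoundError:
            return None
    return int(version_string.split(".")[0])


def dump_model(model):
    """Serialize a pydantic-style model to a dict under either major version.

    Hypothetical helper: prefers the pydantic 2 method `model_dump()` and
    falls back to the pydantic 1 method `dict()`.
    """
    if hasattr(model, "model_dump"):
        return model.model_dump()
    return model.dict()
```

Checking for the method rather than the version number keeps `dump_model` working with any object that follows either interface.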
Fix for mention predictor span_group overlap (#252)
* buncha comments
* getting somewhere... able to get the correct previous_word_id but i think i need to keep the Nones and filter later
* i think things are working somewhat as i hoped for -- i'm seeing 433 word id filled up with multiple label ids correctly i think, but there are no multi span spangroups in the end...
* omg wow not getting 1000 single span mentions and no overlap!! but lower number of total mentions than before. need to fix
* append final acc of page
* delete the many debug statements and make comments better
* annotate onto doc before returning stuffs
* bump mmda version
Authors in grobid augmenter (#246)
* tests are set up, use plumber doc fixture instead of parsing pdf, new xml for no-authors test
* refactored still gets bibs
* test passes with page size caching
* authors
* simplify to 1 get_box_groups method
* remove junk
* rename fixture with whole sha
* bump version
* why did all of these become untracked?? added back
* forgot it needed this pdf...
Egork/egork/figure table/fix list to extend (#245)
* Changes made:
  1. Added a default value for `merged_boxes_vila_dict` in case it's not provided.
  2. Replaced the nested loops for filtering out vila spans with list comprehensions.
  3. Changed the `update` method to `assign` for updating `merged_boxes_vila_dict[page]` since it's a list, not a set.
* Adding files for unit test and the test
* Added json to the fixture, added names and docstring to the test
* Changing version to 0.4.8
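The three changes listed for #245 follow a common Python pattern, sketched below under stated assumptions. Only the name `merged_boxes_vila_dict` comes from the commit message; the function name `merge_page_boxes`, the `skip_types` parameter, and the dict-shaped boxes are hypothetical stand-ins. The sketch uses `list.extend`, as the tag title suggests, because a list has no `update` method (that belongs to `set` and `dict`).

```python
def merge_page_boxes(new_boxes_by_page, merged_boxes_vila_dict=None,
                     skip_types=("Text",)):
    """Merge per-page boxes into a page -> list-of-boxes mapping.

    Hypothetical sketch: boxes are plain dicts with a "type" key, and
    `skip_types` names the box types to filter out.
    """
    # 1. Default value: use None as the sentinel so that each call gets a
    #    fresh dict instead of sharing one mutable default across calls.
    if merged_boxes_vila_dict is None:
        merged_boxes_vila_dict = {}

    for page, boxes in new_boxes_by_page.items():
        # 2. A list comprehension replaces a nested filtering loop.
        kept = [box for box in boxes if box["type"] not in skip_types]
        # 3. The per-page value is a list, so append items with extend().
        merged_boxes_vila_dict.setdefault(page, []).extend(kept)

    return merged_boxes_vila_dict
```

Calling the function twice with the returned dict accumulates boxes per page while preserving insertion order, which a set would not guarantee.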
Bxgrps to spngrps in bibs (#243)
* basic tests pass for bib predictor mmda, more work to be done
* remove straggling vila stuffs (no longer using in bib detector model)
* bib predictor tests passing w spangroups instead of boxgroups
* bib detector tests pass using new tool
* version
* update for grobid bibs as well
Boxgrps to spangrps tool (#242)
* copy _annotate_box_groups into tools, add test (doesn't pass)
* only if center + fix the probs + add test
* lol don't assert 1 == 0
* version
* no need for rows
* default to False in line w/ OG behavior
* rm junk enumeration
* pad_x optional
* moved test fixtures, added test for current .annotate of box_groups which does not use the new code
* change Document to use new stuffs, still passes test!
* delete the now moved and changed code
* remove print
* docstring on test
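The general idea of turning a box into token spans can be sketched as follows. This is a minimal illustration and not the mmda implementation: the `Box` and `Token` dataclasses, the function `spans_for_box`, and the exact meaning given to the `center` and `pad_x` parameters are assumptions inferred from the commit notes ("only if center", "default to False", "pad_x optional").

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Stand-in box: left, top, width, height in page-relative units."""
    l: float
    t: float
    w: float
    h: float
    page: int


@dataclass(frozen=True)
class Token:
    """Stand-in token: character span [start, end) plus its box."""
    start: int
    end: int
    box: Box


def spans_for_box(box: Box, tokens: List[Token], center: bool = False,
                  pad_x: float = 0.0) -> List[Tuple[int, int]]:
    """Return (start, end) spans of the tokens matched by `box`.

    center=False (assumed default): a token matches if its box overlaps.
    center=True: a token matches only if its center lies inside the box.
    pad_x widens the box on both sides before matching.
    """
    left, right = box.l - pad_x, box.l + box.w + pad_x
    top, bottom = box.t, box.t + box.h
    spans = []
    for tok in tokens:
        tb = tok.box
        if tb.page != box.page:
            continue
        if center:
            cx, cy = tb.l + tb.w / 2, tb.t + tb.h / 2
            hit = left <= cx <= right and top <= cy <= bottom
        else:
            hit = (tb.l < right and left < tb.l + tb.w
                   and tb.t < bottom and top < tb.t + tb.h)
        if hit:
            spans.append((tok.start, tok.end))
    return spans
```

Matching on token centers avoids assigning a token to two adjacent boxes when it straddles their shared edge, at the cost of dropping tokens that only partially overlap.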