Multiple images for an identical matched_text_index #11

@fauconnier

Dear authors,

Thanks for releasing MMC4.

In the paper the following is stated:

"we use [14] to compute a bipartite assignment of images to sentences, under the constraint that each sentence can only be assigned a single image."
"For documents with more images than sentences, after assigning an image to each sentence, we assign according to max similarity."
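
To make the quoted constraint concrete, here is a minimal sketch of a one-to-one assignment on a toy matrix. It is not the method of [14]: it simply brute-forces all assignments with the standard library. The matrix `toy_sim`, the function name `one_to_one_assignment`, and the orientation (rows are images, columns are sentences) are assumptions made for illustration.

```python
from itertools import permutations

def one_to_one_assignment(sim):
    """Return, for each image, the sentence it is assigned to, so that no
    sentence is used twice and the total similarity is maximal.

    sim[i][j] is the similarity between image i and sentence j (assumed
    orientation). Brute force: only suitable for tiny matrices with
    len(sim) <= len(sim[0]).
    """
    n_images, n_sentences = len(sim), len(sim[0])
    best = max(
        permutations(range(n_sentences), n_images),
        key=lambda cols: sum(sim[i][j] for i, j in enumerate(cols)),
    )
    return list(best)

# Hypothetical similarities: 3 images (rows) x 3 sentences (columns).
toy_sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.6, 0.4, 0.5],
]

assignment = one_to_one_assignment(toy_sim)

# For contrast: each image independently picks its most similar sentence.
argmax = [max(range(len(row)), key=row.__getitem__) for row in toy_sim]

print(assignment)  # [0, 1, 2]
print(argmax)      # [0, 0, 0]
```

Under the one-to-one constraint every sentence index appears at most once, whereas an unconstrained per-image maximum can send all images to the same sentence.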

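However, we found examples where multiple images are aligned to the same text span, even though the document does not have more images than sentences.
For instance, in the following record from ./docs_shard_10063_v3.jsonl, which has four sentences and four images, three of the images have matched_text_index 0.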

{
    "url": "http://easydesigns.biz/easydesigns-wins-3rd-consecutive-best-of-houzz-award/",
    "text_list": [
        "cherry hill, nj, january 19, 2015 \u2013 easydesigns of cherry hill, nj has been awarded \u201cbest of houzz\u201d for customer satisfaction by houzz, the leading platform for home remodeling and design.",
        "the interior design and real estate staging firm, in business since 2005, was chosen by the more than 25 million monthly unique users that comprise the houzz community from among more than 500,000 active home building, remodeling and design industry professionals.",
        "\u201ci am so happy to be selected for the 3rd consecutive year.",
        "customer satisfaction is a primary goal of my firm so i am thrilled to be recognized by such a large and prominent community\u201d, said beth secosky, owner of easydesigns."
    ],
    "image_info": [
        {
            "image_name": "202706b24ac4.png",
            "raw_url": "https://st.hzcdn.com/static/[email protected]",
            "matched_text_index": 0,
            "matched_sim": 0.33591771125793457,
            "face_detections": null
        },
        {
            "image_name": "a97c58871c38.png",
            "raw_url": "https://st.hzcdn.com/static/[email protected]",
            "matched_text_index": 0,
            "matched_sim": 0.31495240330696106,
            "face_detections": null
        },
        {
            "image_name": "ce3c9aa070ce.png",
            "raw_url": "https://st.hzcdn.com/static/[email protected]",
            "matched_text_index": 1,
            "matched_sim": 0.2770630717277527,
            "face_detections": null
        },
        {
            "image_name": "c22425c7d977.png",
            "raw_url": "https://st.hzcdn.com/static/[email protected]",
            "matched_text_index": 0,
            "matched_sim": 0.3448386490345001,
            "face_detections": null
        }
    ],
    "similarity_matrix": [
        [
            0.33591771125793457,
            0.2377069592475891,
            0.17204634845256805,
            0.22403109073638916
        ],
        [
            0.31495240330696106,
            0.27460938692092896,
            0.12367681413888931,
            0.17759563028812408
        ],
        [
            0.3045308589935303,
            0.2770630717277527,
            0.15680742263793945,
            0.21054978668689728
        ],
        [
            0.3448386490345001,
            0.26175469160079956,
            0.16365793347358704,
            0.237198144197464
        ]
    ]
}
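
The observation can be checked with a short script. This is a sketch using only the values copied from the record above; the variable names are ours, and the reading of `similarity_matrix` as rows indexing images and columns indexing sentences is an assumption that the script itself tests against `matched_sim`.

```python
from collections import Counter

# Values copied from the record above, in image_info order.
matched_text_index = [0, 0, 1, 0]
matched_sim = [
    0.33591771125793457,
    0.31495240330696106,
    0.2770630717277527,
    0.3448386490345001,
]
# Assumed orientation: sim[image][sentence].
sim = [
    [0.33591771125793457, 0.2377069592475891, 0.17204634845256805, 0.22403109073638916],
    [0.31495240330696106, 0.27460938692092896, 0.12367681413888931, 0.17759563028812408],
    [0.3045308589935303, 0.2770630717277527, 0.15680742263793945, 0.21054978668689728],
    [0.3448386490345001, 0.26175469160079956, 0.16365793347358704, 0.237198144197464],
]

# Sentences that received more than one image.
shared = {idx: n for idx, n in Counter(matched_text_index).items() if n > 1}

# Orientation check: matched_sim should equal sim[image][matched sentence].
orientation_ok = all(
    sim[i][j] == s
    for i, (j, s) in enumerate(zip(matched_text_index, matched_sim))
)

# Most similar sentence for each image, ignoring any constraint.
row_argmax = [max(range(len(row)), key=row.__getitem__) for row in sim]

print(shared)          # {0: 3}
print(orientation_ok)  # True
print(row_argmax)      # [0, 0, 0, 0]
```

Sentence 0 receives three images, so the record does not satisfy the one-image-per-sentence constraint. It is also not a plain per-image maximum: the third image is matched to sentence 1 although its highest similarity is with sentence 0.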

Is that intended?

Thanks for any pointers.
