A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation.


trng28/ViMRHP


ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation [NLDB 2025]

Hugging Face Dataset | arXiv Paper | Springer Paper | Open in Colab

Dataset Usage Note

This dataset was originally constructed and provided for the Multimodal Review Helpfulness Prediction (MRHP) task. In addition to the main fields designed for MRHP, we have included several additional attributes and metadata to support other potential research tasks.

⚠️ Important Note:
The train / dev / test split is based on the product/review list and is designed specifically for the MRHP (ranking) task. If you intend to use this dataset for other tasks, please consider creating a custom data split that better fits your task requirements.
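
One way to build such a custom split is sketched below. It is a minimal illustration, assuming the records from the provided files have been pooled into a single list of dicts; the function name `resplit`, the ratios, and the seed are hypothetical choices and not part of the dataset's tooling. The sketch shuffles at the review level, which may or may not suit your task.

```python
import random
from typing import Any

Record = dict[str, Any]


def resplit(
    records: list[Record],
    dev_ratio: float = 0.1,
    test_ratio: float = 0.1,
    seed: int = 0,
) -> dict[str, list[Record]]:
    """Shuffle pooled records and cut them into train/dev/test at the review level."""
    if dev_ratio < 0 or test_ratio < 0 or dev_ratio + test_ratio >= 1:
        raise ValueError("dev_ratio and test_ratio must be non-negative and sum to < 1")

    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_dev = int(len(shuffled) * dev_ratio)
    n_test = int(len(shuffled) * test_ratio)
    return {
        "dev": shuffled[:n_dev],
        "test": shuffled[n_dev:n_dev + n_test],
        "train": shuffled[n_dev + n_test:],
    }
```

If your task requires that all reviews of one product stay in the same split, shuffle the distinct `ProductId` values instead of the individual records.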

Task

Given product information and a list of user-generated reviews, the objective is to rank the reviews based on their predicted helpfulness scores.

| Input | Output |
|---|---|
| Product Information & User-generated Reviews | Helpfulness Score (per review) |
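
The ranking step can be sketched as follows. This is a minimal illustration, assuming each review is a dict with the `ProductId` and `CommentId` fields from this dataset and that a model has already produced one score per review; the function name `rank_reviews` is hypothetical.

```python
from collections import defaultdict
from typing import Any


def rank_reviews(
    reviews: list[dict[str, Any]], scores: list[float]
) -> dict[int, list[int]]:
    """Group reviews by ProductId and order their CommentIds by descending score."""
    if len(reviews) != len(scores):
        raise ValueError("reviews and scores must have the same length")

    by_product: dict[int, list[tuple[float, int]]] = defaultdict(list)
    for review, score in zip(reviews, scores):
        by_product[review["ProductId"]].append((score, review["CommentId"]))

    # sorted() is stable, so reviews with equal scores keep their input order.
    return {
        product_id: [cid for _, cid in sorted(pairs, key=lambda p: p[0], reverse=True)]
        for product_id, pairs in by_product.items()
    }
```

The result maps each product to its review IDs, most helpful first, which is the form a ranking metric would consume.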

Example



 {
        "Rating": 5,
        "Region": "VN",
        "ShopId": 740144125,
        "UserId": 1236172816,
        "Comment": "Chất lượng sản phẩm: không thể nào chê vào đâu được nữa ,chất lượngthực sự kkkk",
        "Response": "Dựa trên thông tin văn bản và hình ảnh được cung cấp, phân tích như sau:\n\n### Key aspects\n- Chất lượng sản phẩm rất tốt.\n- Thiết kế sản phẩm hiện đại.\n- Tính năng sử dụng dễ dàng.\n- Độ bền cao.\n\n### Decision-making advice\n- Nên mua sản phẩm này cho những ai cần một thiết bị chất lượng.\n- Phù hợp cho người dùng yêu thích công nghệ.\n- Không nên mua nếu không cần thiết hoặc không sử dụng thường xuyên.\n\n### Image helpfulness\n- **Relevance**: Hình ảnh liên quan đến sản phẩm được mô tả.\n- **Clarity**: Hình ảnh rõ ràng, thể hiện sản phẩm tốt.\n- **Illustrative Value**: Hình ảnh giúp người tiêu dùng hiểu rõ hơn về sản phẩm.\n- **Engagement**: Hình ảnh thu hút người xem.\n\n### Helpfulness_Score\n- **Key aspects**: 5.0\n- **Decision-making advice**: 4.0\n- **Image helpfulness**: 5.0\n\nTính trung bình: (5.0 + 4.0 + 5.0) / 3 = 4.67\n\n**Helpfulness_Score**: 4.67",
        "Anonymous": "Yes",
        "CommentId": 15035263733,
        "ProductId": 25376374972,
        "ScrapedAt": "2024-08-22T19:58:54.015Z",
        "ProductUrl": "https://shopee.vn/Loa-Bluetooth-Ch%E1%BB%AF-G-Led-RGB-REMI-OFFICIAL-S%E1%BA%A1c-Nhanh-Kh%C3%B4ng-D%C3%A2y-%C4%90%C3%A8n-Nh%C3%A1y-Theo-Nh%E1%BA%A1c-Thi%E1%BA%BFt-K%E1%BA%BF-Sang-Tr%E1%BB%8Dng-i.740144125.25376374972?sp_atk=98f8d732-b119-4f00-8ce1-50c336957647&xptdk=98f8d732-b119-4f00-8ce1-50c336957647",
        "UserShopId": 1235759082.0,
        "CommentDate": "2024-04-30T04:56:11.000Z",
        "ProductName": "Loa Bluetooth Chữ G Led RGB,  REMI OFFICIAL Sạc Nhanh Không Dây,Đèn Nháy Theo Nhạc,Thiết Kế Sang Trọng",
        "ProductImage": [
            "https://down-bs-us.img.susercontent.com/vn-11134207-7r98o-lvu9zppbukm195.webp"
        ],
        "CommentImages": [
            "https://down-bs-us.img.susercontent.com/vn-11134103-7r98o-lupxpni2g6qa85.webp",
            "https://down-bs-us.img.susercontent.com/vn-11134103-7r98o-lupxpni2hlaqc7.webp",
            "https://down-bs-us.img.susercontent.com/vn-11134103-7r98o-lupxpni2izv6b2.webp",
            "https://down-bs-us.img.susercontent.com/vn-11134103-7r98o-lupxpni2kefm62.webp",
            "https://down-bs-us.img.susercontent.com/vn-11134103-7r98o-lupxpni2lt0261.webp"
        ],
        "CommentVideos": null,
        "BoughtProducts": "Loa Bluetooth Chữ G Led RGB,  REMI OFFICIAL Sạc Nhanh Không Dây,Đèn Nháy Theo Nhạc,Thiết Kế Sang Trọng, Loa G Đồng Hồ",
        "CommentImagesPath": [
            "ReviewImages/25376374972/15035263733/15035263733_3.jpg",
            "ReviewImages/25376374972/15035263733/15035263733_1.jpg",
            "ReviewImages/25376374972/15035263733/15035263733_4.jpg",
            "ReviewImages/25376374972/15035263733/15035263733_5.jpg",
            "ReviewImages/25376374972/15035263733/15035263733_2.jpg"
        ],
        "ProductImagesPath": [
            "ProductImages/25376374972/25376374972_8.jpg",
            "ProductImages/25376374972/25376374972_7.jpg",
            "ProductImages/25376374972/25376374972_5.jpg",
            "ProductImages/25376374972/25376374972_9.jpg",
            "ProductImages/25376374972/25376374972_2.jpg",
            "ProductImages/25376374972/25376374972_1.jpg",
            "ProductImages/25376374972/25376374972_6.jpg",
            "ProductImages/25376374972/25376374972_3.jpg",
            "ProductImages/25376374972/25376374972_4.jpg"
        ],
        "Helpfulness_Score": 3,
        "DetailRating": "product_quality: 5\nseller_service: 5\ndelivery_service: 5",
        "Id": 134482501,
        "KeyAspects": "2",
        "DecisionMakingAdvice": "3",
        "ImageHelpfulness": "5",
        "SubCategory": "Loa"
    }
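
In the sample record above, `DetailRating` is a newline-separated string of `name: value` pairs, and `KeyAspects`, `DecisionMakingAdvice`, and `ImageHelpfulness` are stored as strings. The sketch below shows one way to normalize these fields after loading. It assumes every value is a whole number, as in the sample; the helper names `parse_detail_rating` and `normalize_scores` are hypothetical.

```python
from typing import Any

SCORE_FIELDS = (
    "KeyAspects",
    "DecisionMakingAdvice",
    "ImageHelpfulness",
    "Helpfulness_Score",
)


def parse_detail_rating(raw: str) -> dict[str, int]:
    """Parse a newline-separated 'name: value' string into a dict of ints."""
    result: dict[str, int] = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        result[key.strip()] = int(value.strip())  # assumes integer values
    return result


def normalize_scores(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with the score fields cast to int."""
    out = dict(record)
    for field in SCORE_FIELDS:
        if out.get(field) is not None:
            out[field] = int(out[field])
    return out
```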

Data Field

| Field | Type | Explanation |
|---|---|---|
| Anonymous | str | Indicates whether the user posted the review anonymously |
| BoughtProducts | str | List of products purchased by the user, as mentioned in the review |
| Comment | str | Text content of the user’s review |
| CommentDate | str | Date when the user wrote the review |
| CommentId | int | Unique identifier of the review |
| CommentImages | list | List of URLs for images uploaded by the user |
| CommentImagesPath | list | Local storage path for user-uploaded review images |
| CommentVideos | str | URLs of videos (if any) uploaded with the review |
| DecisionMakingAdvice | int | Score reflecting how helpful the review is for purchase decision-making (scale 1–5) |
| DetailRating | str | Detailed scores for product quality, seller service, and delivery service |
| Helpfulness_Score (Ground-truth) | int | Final score for overall helpfulness (scale 1–5) |
| Id | int | Id for each data sample |
| ImageHelpfulness | int | Score assessing the usefulness of attached images (scale 1–5) |
| KeyAspects | int | Score evaluating how well the review covers key product aspects (scale 1–5) |
| ProductId | int | Unique identifier of the reviewed product |
| ProductImage | list | List of product image URLs scraped from the e-commerce platform |
| ProductImagesPath | list | Local storage path of product images |
| ProductName | str | Name of the reviewed product |
| ProductUrl | str | URL of the product’s page on the platform |
| Rating | int | Overall rating score provided by the user (scale from 1 to 5) |
| Region | str | Country or region where the review was made |
| Response | str | AI-generated analysis response for review |
| ShopId | int | Unique identifier of the shop |
| SubCategory | str | Subcategory of the product |
| UserId | int | Unique identifier of the user who wrote the review |

Download from Google Drive

| Category | Drive Link | Quick Command |
|---|---|---|
| Fashion | Link | !gdown 1EsNnvAGUNJJd_XtthaENLywz2DPtoUIC |
| Electronic | Link | !gdown 1-D29tOyissD9z1qD6Q0eHQIuXFJkK8hp |
| Health & Beauty | Link | !gdown 1-IpBJaQyIQawIr8I-xJcSctnUsP_4dhd |
| Home & Lifestyle | Link | !gdown 1-I5e9iUINj1b8CnRdQn6wi45sObI6W31 |
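
Outside a notebook, the same downloads can be scripted with the `gdown` Python package. The sketch below is a hedged illustration: the file IDs are copied from the quick commands above, the dictionary keys follow the folder names used in this repository, and the helper name `download_category` is hypothetical. It assumes `gdown` is installed (`pip install gdown`) and leaves the output filename to gdown's default.

```python
from typing import Optional

# File IDs copied from the quick commands in the table above.
DRIVE_FILE_IDS = {
    "Fashion": "1EsNnvAGUNJJd_XtthaENLywz2DPtoUIC",
    "Electronic": "1-D29tOyissD9z1qD6Q0eHQIuXFJkK8hp",
    "HealthBeauty": "1-IpBJaQyIQawIr8I-xJcSctnUsP_4dhd",
    "HomeLifestyle": "1-I5e9iUINj1b8CnRdQn6wi45sObI6W31",
}


def download_category(category: str) -> Optional[str]:
    """Download one category from Google Drive and return the saved file path."""
    import gdown  # third-party; imported lazily so the mapping is usable without it

    file_id = DRIVE_FILE_IDS[category]  # raises KeyError for unknown categories
    return gdown.download(id=file_id, quiet=False)
```

For example, `download_category("Electronic")` fetches the same file as `!gdown 1-D29tOyissD9z1qD6Q0eHQIuXFJkK8hp`.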

Dataset Structure

└── 📁ViMRHP

    └── 📁Fashion
        └── 📁ProductImages
        └── 📁ReviewImages
        └── Fashion-train.json
        └── Fashion-dev.json
        └── Fashion-test.json

    └── 📁Electronic
        └── 📁ProductImages
        └── 📁ReviewImages
        └── Electronic-train.json
        └── Electronic-dev.json
        └── Electronic-test.json

    └── 📁HomeLifestyle
        └── 📁ProductImages
        └── 📁ReviewImages
        └── HomeLifestyle-train.json
        └── HomeLifestyle-dev.json
        └── HomeLifestyle-test.json

    └── 📁HealthBeauty
        └── 📁ProductImages
        └── 📁ReviewImages
        └── HealthBeauty-train.json
        └── HealthBeauty-dev.json
        └── HealthBeauty-test.json
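
Given the layout above, a split can be loaded and its image paths resolved roughly as follows. This is a sketch under two assumptions that the repository does not state explicitly: each `*-train.json` / `*-dev.json` / `*-test.json` file holds a JSON array of records, and the entries in `CommentImagesPath` are relative to the category folder. The helper names `load_split` and `review_image_paths` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def load_split(root: str, category: str, split: str) -> list[dict[str, Any]]:
    """Load e.g. <root>/Electronic/Electronic-train.json (assumed to be a JSON array)."""
    path = Path(root) / category / f"{category}-{split}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def review_image_paths(root: str, category: str, record: dict[str, Any]) -> list[Path]:
    """Resolve a record's CommentImagesPath entries against the category folder."""
    base = Path(root) / category
    # The field may be missing or null for reviews without images.
    return [base / rel for rel in record.get("CommentImagesPath") or []]
```

Usage would look like `records = load_split("ViMRHP", "Electronic", "train")`, followed by `review_image_paths("ViMRHP", "Electronic", records[0])`.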

Citation

If you use ViMRHP in your work, please cite the following:

@inproceedings{nguyen2025vimrhp,
    author="Nguyen, Truc Mai-Thanh and Nguyen, Dat Minh and Luu, Son T. and Nguyen, Kiet Van",
    title="ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation",
    booktitle="Natural Language Processing and Information Systems",
    year="2025",
    publisher="Springer Nature Switzerland",
    address="Cham",
    pages="291--305"
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
