A retrieval-augmented generation-based framework for automatically constructing a hierarchy of aspects typically considered when addressing a nuanced claim and enriching them with corpus-specific perspectives.

pkargupta/claimspect

Official Repo of ClaimSpect

Official implementation of the ACL 2025 main-track paper "Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims".

🪧 Paper Abstract

Claims made by individuals or entities are oftentimes nuanced and cannot be clearly labeled as entirely "true" or "false"---as is frequently the case with scientific and political claims. However, a claim (e.g., "vaccine A is better than vaccine B") can be dissected into its integral aspects and sub-aspects (e.g., efficacy, safety, distribution), which are individually easier to validate. This enables a more comprehensive, structured response that provides a well-rounded perspective on a given problem while also allowing the reader to prioritize specific angles of interest within the claim (e.g., safety towards children). Thus, we propose ClaimSpect, a retrieval-augmented generation-based framework for automatically constructing a hierarchy of aspects typically considered when addressing a claim and enriching them with corpus-specific perspectives. This structure hierarchically partitions an input corpus to retrieve relevant segments, which assist in discovering new sub-aspects. Moreover, these segments enable the discovery of varying perspectives towards an aspect of the claim (e.g., support, neutral, or oppose) and their respective prevalence (e.g., "how many biomedical papers believe vaccine A is more transportable than B?"). We apply ClaimSpect to a wide variety of real-world scientific and political claims featured in our constructed dataset, showcasing its robustness and accuracy in deconstructing a nuanced claim and representing perspectives within a corpus. Through real-world case studies and human evaluation, we validate its effectiveness over multiple baselines.
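The aspect hierarchy described in the abstract can be pictured as a tree in which each node holds the stances of the corpus segments mapped to it. The following is a minimal sketch of that idea, not the repository's actual data model: the `AspectNode` class, its fields, the `render` helper, and the toy stance labels are all hypothetical names and data chosen for illustration.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AspectNode:
    """One aspect of a claim (hypothetical structure, for illustration only)."""
    name: str
    # Stance labels ("support", "neutral", "oppose") of segments mapped to this aspect.
    stances: list[str] = field(default_factory=list)
    children: list[AspectNode] = field(default_factory=list)

    def add_child(self, name: str) -> AspectNode:
        """Attach a new sub-aspect and return it."""
        child = AspectNode(name)
        self.children.append(child)
        return child

    def prevalence(self) -> Counter:
        """Count stances at this node and at all of its descendants."""
        counts = Counter(self.stances)
        for child in self.children:
            counts.update(child.prevalence())
        return counts


def render(node: AspectNode, depth: int = 0) -> list[str]:
    """Return one indented line per aspect with its aggregated stance counts."""
    counts = node.prevalence()
    summary = ", ".join(
        f"{label}={counts[label]}" for label in ("support", "neutral", "oppose")
    )
    lines = [f"{'  ' * depth}{node.name}: {summary}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Toy data, invented for illustration; it does not come from the paper's dataset.
root = AspectNode("vaccine A is better than vaccine B")
efficacy = root.add_child("efficacy")
safety = root.add_child("safety")
children_safety = safety.add_child("safety towards children")
efficacy.stances += ["support", "support", "oppose"]
children_safety.stances += ["neutral", "oppose"]

print("\n".join(render(root)))
```

Aggregating counts upward means a reader can inspect perspective prevalence either for the whole claim or for a single sub-aspect such as safety towards children.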

📦 Repo Setup

  1. Clone the repository:
git clone https://github.com/pkargupta/claimspect.git
cd claimspect
  2. Install the required dependencies:
pip install -r requirements.txt

📊 Data Construction

The data construction process is implemented in the data/dtra and data/vaccine directories. It involves four steps (paths shown for data/dtra):

  1. Claim Construction: Generate initial claims using data/dtra/raw_claims/generate_claims.py
  2. Literature Searching: Search for relevant papers using data/dtra/get_literature/get_literature_meta_info.py
  3. Literature Download: Download paper content using data/dtra/get_literature/get_literature_body_from_url.py
  4. Literature Chunking: Split papers into manageable chunks using data/dtra/chunking/run_chunking.sh

Each step builds upon the previous one to create a comprehensive dataset for claim analysis.

Alternatively, you can skip these steps and download the constructed data directly: first split, second split.
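To make the chunking step (step 4 above) more concrete, here is a minimal sketch of one common approach: splitting a paper's text into fixed-size word windows that overlap. This is an assumption about what chunking could look like, not a description of data/dtra/chunking/run_chunking.sh; the function `chunk_words` and its `chunk_size` and `overlap` parameters are hypothetical.

```python
from __future__ import annotations


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words, so a sentence that straddles
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Toy input: the "paper" is just the numbers 0..9 as words.
sample = " ".join(str(i) for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Larger overlaps preserve more context across chunk boundaries at the cost of more chunks to index and retrieve.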

🔍 Claim Analysis

To run the claim analysis experiments:

bash script/run_experiments.sh

This script will execute the main claim analysis pipeline.

📈 Evaluation

We provide multiple evaluation scripts for different aspects of the system:

Baseline Evaluation

Run the baseline evaluation scripts in the eval/baseline/ directory.

LLM-as-Judge Evaluation

bash eval/eval_dtra.sh

Retrieval Corpus Relevance Check

python eval/claim_examine/main.py

Human Judge Evaluation

python eval/human_judge/main.py

Human-Machine Alignment Evaluation

python eval/human_machine_align.py

📖 Citations

If you use ClaimSpect and find it interesting or useful, please cite the paper and star this repo. Feel free to open an issue if you have any questions.

@article{kargupta2025beyond,
  title={Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims},
  author={Kargupta, Priyanka and Tian, Runchu and Han, Jiawei},
  journal={arXiv preprint arXiv:2506.10728},
  year={2025}
}
