GenFEND

Let Silence Speak: Enhancing Fake News Detection with Generated Comments from Large Language Models (ACM CIKM 2024 Research Track Paper)

Preprint Chinese Blog

Data Preparation

The /data/ folder in GenFEND_release_ch and GenFEND_release_en is where the data used for training and testing is stored.

For GenFEND_release_ch, we experiment on Weibo21. The folder /data/Weibo21/ contains the data with real comments, and the folder /data/Weibo21/Dmeta-embedding-comments-feature/ contains the features extracted from the real comments. The folder /data/role_virtual_comments/ contains the Weibo21 data with generated comments and the corresponding extracted comment features.

For GenFEND_release_en, we experiment on LLM-mis and GossipCop. The folder /data/LLM-mis/ contains the LLM-mis data with generated comments, and the folder /data/LLM-mis/bge-large-en-v1.5/ contains the features extracted from the generated comments. The folder /data/GossipCop/ contains the GossipCop data with real comments, and the folder /data/GossipCop/bge-large-en-v1.5/ contains the features extracted from the real comments. The folder /data/role_virtual_comments/ contains the GossipCop data with generated comments and the corresponding extracted comment features.
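As a quick sanity check before running anything, the folder layout described above can be verified with a short script. This is only a sketch: the helper name `missing_dirs` and the `EXPECTED_DIRS` table are illustrative and not part of the repository, and the paths are simply copied from the description above. Some feature folders may only be complete after STEP I has been run.

```python
from pathlib import Path

# Folder layout copied from the README text; paths are relative to the repo root.
# EXPECTED_DIRS and missing_dirs are hypothetical helpers, not part of GenFEND.
EXPECTED_DIRS = {
    "GenFEND_release_ch": [
        "data/Weibo21",
        "data/Weibo21/Dmeta-embedding-comments-feature",
        "data/role_virtual_comments",
    ],
    "GenFEND_release_en": [
        "data/LLM-mis",
        "data/LLM-mis/bge-large-en-v1.5",
        "data/GossipCop",
        "data/GossipCop/bge-large-en-v1.5",
        "data/role_virtual_comments",
    ],
}


def missing_dirs(repo_root="."):
    """Return the expected data folders that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for release, subdirs in EXPECTED_DIRS.items():
        for sub in subdirs:
            path = root / release / sub
            if not path.is_dir():
                missing.append(str(path))
    return missing
```

Calling `missing_dirs(".")` from the repository root returns an empty list when every folder is in place, and otherwise lists the paths that still need to be prepared.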

Note that we only list some example instances in train.json, val.json, and test.json. You should prepare the whole dataset in the same format as example instances, and follow STEP I in the How To Run section to generate the complete dataset.

We cannot provide the original datasets that we used because we did not collect them and are not authorized to redistribute them. Only the generated comments originate from us. Please visit the links provided above to obtain the original datasets.

Sentence Encoder Preparation

Download model files from Dmeta-embedding and put them in the folder GenFEND_release_ch/pretrained_model/Dmeta-embedding/.

Download model files from bge-large-en-v1.5 and put them in the folder GenFEND_release_en/pretrained_model/bge-large-en-v1.5/.

How To Run

STEP I: Comment Encoding

Go to the folder GenFEND_release_ch/data/ or GenFEND_release_en/data/ and run the following commands in order:

python cmts_fea_ext.py
python add_file_index.py
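If you prefer to run STEP I for a given release from a single script, the two commands can be chained as sketched below. The script names and their order come from this README; the wrapper functions `step1_commands` and `run_step1` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script names and their order are taken from the README.
STEP1_SCRIPTS = ["cmts_fea_ext.py", "add_file_index.py"]


def step1_commands(python=sys.executable):
    """Build one argument list per STEP I script."""
    return [[python, script] for script in STEP1_SCRIPTS]


def run_step1(data_dir):
    """Run the STEP I scripts inside data_dir, stopping at the first failure."""
    data_dir = Path(data_dir)
    for cmd in step1_commands():
        # check=True raises CalledProcessError if a script exits with a non-zero status,
        # so add_file_index.py is not run after a failed feature extraction.
        subprocess.run(cmd, cwd=data_dir, check=True)
```

For example, `run_step1("GenFEND_release_en/data")` would run both scripts in the English release's data folder.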

STEP II: Training and Testing

To experiment on the Weibo21 dataset, go to the folder GenFEND_release_ch and run the following command:

python main.py --model_name bert_genfend

or

python main.py --model_name defend_genfend

To experiment on the GossipCop dataset, go to the folder GenFEND_release_en and run the following command:

python main.py --root_path './data/GossipCop/' --model_name bert_genfend

or

python main.py --root_path './data/GossipCop/' --model_name defend_genfend

To experiment on the LLM-mis dataset, go to the folder GenFEND_release_en and run the following command:

python main.py --root_path './data/LLM-mis/' --model_name bert_genfend
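To keep track of the runs listed above, the following sketch enumerates them and prints one shell line per run. The `RUNS` table and `build_commands` are illustrative names, not part of the repository. The sketch only mirrors the combinations shown in this README; it does not assume that other dataset and model pairings are supported.

```python
import shlex

# (release folder, --root_path value or None, model names), as listed in the README.
# For Weibo21 the README passes no --root_path, so None is used here.
RUNS = [
    ("GenFEND_release_ch", None, ["bert_genfend", "defend_genfend"]),
    ("GenFEND_release_en", "./data/GossipCop/", ["bert_genfend", "defend_genfend"]),
    ("GenFEND_release_en", "./data/LLM-mis/", ["bert_genfend"]),
]


def build_commands():
    """Return (folder, argument list) pairs for every run listed in RUNS."""
    commands = []
    for folder, root_path, models in RUNS:
        for model in models:
            args = ["python", "main.py"]
            if root_path is not None:
                args += ["--root_path", root_path]
            args += ["--model_name", model]
            commands.append((folder, args))
    return commands


# Print each run as a shell line that first changes into the release folder.
for folder, args in build_commands():
    print(f"(cd {folder} && {shlex.join(args)})")
```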

How To Integrate with Other Models

MultiSubpp.py serves as a plug-in module for integrating with both content-only and comment-based models.

Refer to how MultiSubppModel is used in BERTMtiSppModel.py and dEFENDMtiSppModel.py to see how to integrate it with other models.

How To Cite

@inproceedings{nan2024let,
  title={{Let Silence Speak: Enhancing Fake News Detection with Generated Comments from Large Language Models}},
  author={Nan, Qiong and Sheng, Qiang and Cao, Juan and Hu, Beizhe and Wang, Danding and Li, Jintao},
  booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
  pages={1732--1742},
  doi={10.1145/3627673.3679519},
  year={2024}
}

Relevant Resources

Paper List LLM-for-misinformation-research: https://github.com/ICTMCG/LLM-for-misinformation-research/
Tutorial @SIGIR 2024 Preventing and Detecting Misinformation Generated by Large Language Models: https://sigir24-llm-misinformation.github.io/
