WALNUT: Benchmark on Semi-weakly Supervised Learning for NLU
This benchmark provides a publicly accessible framework for advocating and facilitating research on weak supervision for NLU. We expect WALNUT to stimulate further research on methodologies that leverage weak supervision more effectively. The benchmark and code for baselines are available at
Website
References
@inproceedings{zheng2022walnut,
title={WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding},
author={Zheng, Guoqing and Karamanolakis, Giannis and Shu, Kai and Awadallah, Ahmed Hassan},
booktitle={Proceedings of 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year={2022},
organization={ACL}
}
Graph Neural Networks for Fake News Detection
This repository offers a publicly accessible platform and benchmark for a series of Graph Neural Network (GNN) based fake news detection models. We welcome contributions of results for existing models and of SOTA results for new models evaluated on our dataset. The benchmark hosted by Papers with Code lists the SOTA models and their performance.
Benchmark GitHub
References
@inproceedings{dou2021user,
title={User Preference-aware Fake News Detection},
author={Dou, Yingtong and Shu, Kai and Xia, Congying and Yu, Philip S. and Sun, Lichao},
booktitle={Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year={2021},
organization={ACM}
}
COVID-19 Data Repository
This repository offers a publicly accessible platform for gathering and curating COVID-19 datasets from multiple disciplines, including spatial-temporal epidemic data, fact-checked content covering different types of disinformation (e.g., fraud URLs, false news), social media content and network data from Twitter, scholarly articles, etc. The repository also encourages data donation from the research community and promotes collaboration.
GitHub
dEFEND: Explainable Fake News Detection
In recent years, computational detection of fake news has been studied as a way to mitigate the problem, producing some promising early results. We argue, however, that a critical missing piece is the explainability of such detection, i.e., why a particular piece of news is detected as fake. In this paper, therefore, we study the explainable detection of fake news. We develop a sentence-comment co-attention sub-network that exploits both news contents and user comments to jointly capture the explainable top-k check-worthy sentences and user comments for fake news detection. We conduct extensive experiments on real-world datasets and demonstrate that the proposed method significantly outperforms several state-of-the-art fake news detection methods.
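To make the co-attention idea concrete, here is a minimal sketch, not the dEFEND implementation. It assumes sentences and comments are already encoded as small vectors, uses a plain dot product with no learned weights as the affinity, and ranks items by their average affinity to the other side. The function names (`co_attention`, `top_k`) and the toy vectors are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def co_attention(sentences, comments):
    """Return attention weights over sentences and over comments.

    Simplification (assumption): the affinity is tanh(s . c) with no
    learned projection, and each item is scored by its mean affinity.
    """
    affinity = [[math.tanh(dot(s, c)) for c in comments] for s in sentences]
    sent_scores = [sum(row) / len(row) for row in affinity]
    comm_scores = [
        sum(affinity[i][j] for i in range(len(sentences))) / len(sentences)
        for j in range(len(comments))
    ]
    return softmax(sent_scores), softmax(comm_scores)

def top_k(weights, k):
    # Indices of the k largest attention weights, highest first.
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]

# Toy 2-d encodings (hypothetical): sentence 0 is what the comments discuss.
sentences = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]
comments = [[1.0, 0.0], [0.9, 0.1]]
sent_w, comm_w = co_attention(sentences, comments)
print(top_k(sent_w, 1), top_k(comm_w, 1))
```

In a trained model the affinity would pass through learned weight matrices, and the attended representations would feed a classifier; the sketch only shows how one affinity matrix yields rankings on both sides at once.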
Code and Results.
References
@inproceedings{shu2019defend,
title={dEFEND: Explainable Fake News Detection},
author={Shu, Kai and Cui, Limeng and Wang, Suhang and Lee, Dongwon and Liu, Huan},
booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
year={2019},
organization={ACM}
}
Unsupervised Fake News Detection
Most existing fake news detection methods are supervised and require an extensive amount of time and labor to build a reliably annotated dataset. In search of an alternative, in this paper we investigate whether fake news can be detected in an unsupervised manner. We treat the truths of news and users’ credibility as latent random variables, and exploit users’ engagements on social media to identify their opinions towards the authenticity of news.
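The idea of jointly estimating news truth and user credibility without labels can be illustrated with a small alternating-update sketch. This is a simplified truth-discovery style iteration, not the generative model or the inference procedure from the paper. The function name `infer_truth`, the initial credibility of 0.7, and the toy engagement data are assumptions made for illustration.

```python
from collections import defaultdict

def infer_truth(engagements, n_iters=20):
    """Estimate P(news is real) and user credibility from opinions alone.

    engagements: list of (user, news, opinion), where opinion is +1 if the
    user treats the news as real and -1 if the user treats it as fake.
    """
    users = {u for u, _, _ in engagements}
    news = {n for _, n, _ in engagements}
    credibility = {u: 0.7 for u in users}  # assumed prior: users are mostly reliable
    p_real = {n: 0.5 for n in news}

    for _ in range(n_iters):
        # Step 1: update news truth with a credibility-weighted vote.
        support = defaultdict(float)
        total = defaultdict(int)
        for u, n, o in engagements:
            w = credibility[u]
            support[n] += w if o > 0 else 1.0 - w
            total[n] += 1
        for n in news:
            p_real[n] = support[n] / total[n]

        # Step 2: update credibility as average agreement with current truths.
        agree = defaultdict(float)
        count = defaultdict(int)
        for u, n, o in engagements:
            agree[u] += p_real[n] if o > 0 else 1.0 - p_real[n]
            count[u] += 1
        for u in users:
            credibility[u] = agree[u] / count[u]

    return p_real, credibility

# Toy data (hypothetical): a, b, c agree with each other; d contradicts them.
engagements = [
    ("a", "n1", +1), ("a", "n2", -1),
    ("b", "n1", +1), ("b", "n2", -1),
    ("c", "n1", +1), ("c", "n2", -1),
    ("d", "n1", -1), ("d", "n2", +1),
]
p_real, credibility = infer_truth(engagements)
print(p_real, credibility)
```

The two quantities are never observed directly; each is re-estimated from the other, which is the sense in which both act as latent variables here.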
Code
@inproceedings{yang2019unsupervised,
title={Unsupervised fake news detection on social media: A generative approach},
author={Yang, Shuo and Shu, Kai and Wang, Suhang and Gu, Renjie and Wu, Fan and Liu, Huan},
booktitle={Proceedings of the AAAI conference on artificial intelligence},
year={2019},
organization={AAAI}
}
Fake News Detection Data Repository
We released a tool, FakeNewsTracker, for collecting, analyzing, and visualizing fake news and its dissemination on social media.
The latest dataset paper, with a detailed analysis of the dataset, can be found at FakeNewsNet.
FakeNewsNet is a benchmark data repository for fake news detection. It contains news content, social context, and spatiotemporal information for studying fake news on social media. Data and APIs are available at GitHub.
References
If you use this dataset, please consider citing the following papers:
@article{shu2018fakenewsnet,
title={FakeNewsNet: A Data Repository with News Content, Social Context and Dynamic Information for Studying Fake News on Social Media},
author={Shu, Kai and Mahudeswaran, Deepak and Wang, Suhang and Lee, Dongwon and Liu, Huan},
journal={arXiv preprint arXiv:1809.01286},
year={2018}
}
@article{shu2017fake,
title={Fake News Detection on Social Media: A Data Mining Perspective},
author={Shu, Kai and Sliva, Amy and Wang, Suhang and Tang, Jiliang and Liu, Huan},
journal={ACM SIGKDD Explorations Newsletter},
volume={19},
number={1},
pages={22--36},
year={2017},
publisher={ACM}
}