What is Targeted-WTWT?

We find that models can shortcut the stance detection task on publicly available Twitter stance detection datasets without looking at the target sentence. Targeted-WTWT, a derivative of the WTWT dataset, is a new large-scale stance detection dataset on which target-oblivious models fail to perform well. This makes it a better dataset for training and evaluating stance detection models.

For more details about Targeted-WTWT, please refer to our NAACL 2021 paper:

Our code for reproducing this finding on the other existing stance detection datasets can be found at the following link.

Targeted-WTWT Dataset

Targeted-WTWT is distributed under a CC BY-SA 4.0 License. The link to the dataset is provided below.

We experiment with several baselines on the new dataset. The link to the baseline implementations is provided below.

To add your model to the leaderboard, please submit your request to the first author. We follow the same release format as the WTWT dataset: obtain the tweets for the given tweet IDs using the Twitter API (a sketch is given below). Feel free to use target sentences of your choice for each merger.
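Tweet texts are not distributed with the dataset, so they must be hydrated from the released tweet IDs. Below is a minimal hydration sketch using the tweepy library (Twitter API v2); the input file name and its field names are illustrative assumptions, not the released schema.

    import json
    import tweepy

    # Assumption: you have a valid Twitter API v2 bearer token.
    client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

    # Assumption: records shaped like {"tweet_id": ..., "merger": ..., "stance": ...};
    # adapt the file name and field names to the actual release format.
    with open("targeted_wtwt.json") as f:
        records = json.load(f)

    ids = [r["tweet_id"] for r in records]
    texts = {}
    for start in range(0, len(ids), 100):    # the API accepts at most 100 ids per call
        response = client.get_tweets(ids=ids[start:start + 100])
        for tweet in response.data or []:    # data is None if no tweet was found
            texts[str(tweet.id)] = tweet.text

    for r in records:
        # Tweets that were deleted or made private hydrate to None.
        r["text"] = texts.get(str(r["tweet_id"]))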

Evaluation

As explained in Sec. 3.1 of the paper, we propose a cross-target evaluation setting for Targeted-WTWT similar to that of WTWT. F1 is macro-averaged across the classes; a code sketch of the in-domain protocol is given after this list.
-> For the in-domain (health) mergers, train on three health mergers (six targets in total, including the negated target for each merger) and test F1 on the fourth health merger. Repeat so that each of the four health mergers serves as the test set once. The two final in-domain scores are obtained by averaging and weighted-averaging the four F1 scores.
-> For the out-of-domain evaluation, train on the eight targets corresponding to the four health mergers and test on the two targets for the entertainment merger.
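The following sketch shows one way to implement the in-domain protocol, using scikit-learn for macro F1. The merger identifiers follow the WTWT release; weighting the average by test-set size is our assumption about the weighted average, and train_and_predict is a hypothetical hook for your own model.

    import numpy as np
    from sklearn.metrics import f1_score

    # The four health mergers from the WTWT release; each contributes two
    # targets (the merger and its negated target) at training time.
    HEALTH_MERGERS = ["CVS_AET", "CI_ESRX", "ANTM_CI", "AET_HUM"]

    def in_domain_scores(train_and_predict, gold_labels):
        """Leave-one-merger-out evaluation over the health mergers.

        train_and_predict(train_mergers, test_merger): hypothetical hook that
            trains on the given mergers and returns predicted labels for the
            held-out merger.
        gold_labels: dict mapping merger -> list of gold stance labels.
        """
        f1s, sizes = [], []
        for held_out in HEALTH_MERGERS:
            train_mergers = [m for m in HEALTH_MERGERS if m != held_out]
            preds = train_and_predict(train_mergers, held_out)
            gold = gold_labels[held_out]
            f1s.append(f1_score(gold, preds, average="macro"))
            sizes.append(len(gold))
        avg = float(np.mean(f1s))
        wtd = float(np.average(f1s, weights=sizes))  # assumed: weighted by test size
        return avg, wtd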

Citation

If you use the Targeted-WTWT dataset in your research, please cite our paper with the following BibTeX entry:

          @inproceedings{kaushal2020stance,
            title={tWT–WT: A Dataset to Assert the Role of Target Entities for Detecting Stance},
            author={Kaushal, Ayush and Saha, Avirup and Ganguly, Niloy},
            booktitle={Proceedings of the 2021 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021)},
            year={2021}
          }
        
Leaderboard
Rank  Date          Model                         Team           Reference            In-Domain F1 Avg  In-Domain F1 Wtd  Out-of-Domain F1
1     Apr 10, 2021  Bert                          IIT Kharagpur  Kaushal et al. 2021  0.510             0.527             0.365
2     Apr 10, 2021  SiamNet (with Bert features)  IIT Kharagpur  Kaushal et al. 2021  0.312             0.310             0.150
3     Apr 10, 2021  Bert (target oblivious)       IIT Kharagpur  Kaushal et al. 2021  0.264             0.260             0.163
4     Apr 10, 2021  TAN (with Bert features)      IIT Kharagpur  Kaushal et al. 2021  0.258             0.260             0.150