A counterfactual dataset

for frame detection

Chung-hong Chan

GESIS

2023-05-29

Title

Frame? No Topic! The validity of inductive frame detection in texts with multiple topics

Older slides

A counterfactual dataset for evaluating frame detection methods in multi-topical news content

  • Chung-hong Chan
  • Rainer Freudenthaler
  • Philipp Müller

Acknowledgement

The authors would like to thank the Open Science Office, University of Mannheim for the financial support of this project.

More information: https://www.uni-mannheim.de/open-science/open-science-office/

Student helpers: Filippo Borsato, Hannah Erb, Hyosun Jang, Fatih Ozhasar, Zeynep Özgülec, and Jonathan Vincent.

Special thanks: Valerie Hase (LMU Munich)

The current state of framing literature

“Some studies employ the concept only in a metaphoric sense, whereas others reduce frames to story topics, attributes, or issue position”

Carragee & Roefs (2004)

“(Framing researchers) give an obligatory nod to the literature before proceeding to do whatever they were going to do in the first place.”

Reese (2007)

What’s a frame?

Entmanian Frame: “Select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described.”

Entman (1993)

Chan’s restatement: A frame is the result of an act of selecting certain aspects of a perceived reality by a communicator, whose intention is to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation.

Raison d’être: Generic frames

“transcend thematic limitations and can be identified in relation to different topics, some even over time and in different cultural contexts”

de Vreese (2005)

Raison d’être: Factual data

Semetko & Valkenburg (2000)

Raison d’être: Counterfactual data

Generation of counterfactual data

Distribution

Topic    Conflict  Consequences  Human interest  Morality  Responsibility
Climate  6         1             7               2         4
Corona   5         3             4               5         3
Joker    1         5             5               6         3
Tech     4         7             2               3         4
Ukraine  5         6             2               2         5
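
The margins of this table are quick to check. The sketch below copies the counts from the table into a dictionary and tallies articles per topic and per frame; the variable names are illustrative and not taken from the project.

```python
# Counts copied from the distribution table above; all names are illustrative.
FRAMES = ["conflict", "consequences", "human_interest", "morality", "responsibility"]
COUNTS = {
    "Climate": [6, 1, 7, 2, 4],
    "Corona": [5, 3, 4, 5, 3],
    "Joker": [1, 5, 5, 6, 3],
    "Tech": [4, 7, 2, 3, 4],
    "Ukraine": [5, 6, 2, 2, 5],
}

# Row totals: number of articles per topic.
per_topic = {topic: sum(row) for topic, row in COUNTS.items()}

# Column totals: number of articles per frame.
per_frame = {
    frame: sum(row[i] for row in COUNTS.values())
    for i, frame in enumerate(FRAMES)
}

total = sum(per_topic.values())

print(per_topic)
print(per_frame)
print(total)
```

Each topic contributes 20 articles, for 100 in total, while the frames are only roughly balanced (18 to 22 articles each).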

Example: Evaluation I

Semetko & Valkenburg (2000)

“Gold Standard”

2 x trained students

2 x experts with PhD in communication

van Atteveldt, van der Velden, & Boukes (2021)

Multiverse analysis (Preregistered)

Pipal, Song, & Boomgaarden (2022)
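
A multiverse analysis runs the same evaluation under every combination of defensible analytic choices instead of a single hand-picked one. A minimal sketch of how such a grid can be enumerated follows. The choice names and their options are hypothetical placeholders, not the preregistered specification.

```python
from itertools import product

# Hypothetical analytic choices; the preregistered grid may differ.
CHOICES = {
    "lowercase": [True, False],
    "remove_stopwords": [True, False],
    "stemming": [True, False],
    "min_doc_freq": [1, 5],
}


def enumerate_universes(choices):
    """Return one dict per combination of options (one 'universe' each)."""
    keys = list(choices)
    combos = product(*(choices[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


universes = enumerate_universes(CHOICES)
print(len(universes))  # 2 * 2 * 2 * 2 = 16 universes
print(universes[0])
```

Each resulting dictionary would then be passed to the same evaluation routine, so that the spread of results across universes can be reported rather than a single number.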

Example: Evaluation II

Inductive (automatic) methods

  • K-means with TF-IDF
  • PCA with TF-IDF
  • LDA
  • STM
  • ANTMN

(Burscher et al., 2016; Greussing & Boomgaarden, 2017; DiMaggio et al., 2013; Nicholls & Culpepper, 2021; Walter & Ophir, 2019)

Semi-supervised methods

  • Seeded-LDA
  • keyATM

(Watanabe & Zhou, 2020; Eshima et al., 2020)

Keywords from two journalism researchers

k = 5
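
Inductive methods run with k = 5 return unlabeled clusters or topics, so their output has to be aligned with the five generic frames before it can be scored against a gold standard. One possible scoring rule, shown here as an assumption and not as the procedure used in the study, is to try every one-to-one assignment of clusters to frames and keep the most accurate one. The function name and the toy labels are made up for illustration.

```python
from itertools import permutations

FRAMES = ["conflict", "consequences", "human_interest", "morality", "responsibility"]


def best_matching_accuracy(gold, clusters, k=5):
    """Best accuracy over all one-to-one maps from cluster ids 0..k-1 to frames."""
    best_acc, best_map = -1.0, None
    for perm in permutations(FRAMES, k):
        mapping = dict(enumerate(perm))  # cluster id -> frame label
        hits = sum(mapping[c] == g for g, c in zip(gold, clusters))
        acc = hits / len(gold)
        if acc > best_acc:
            best_acc, best_map = acc, mapping
    return best_acc, best_map


# Toy data (made up): gold frames for six articles and the cluster ids
# assigned to the same articles by some inductive method.
gold = ["conflict", "conflict", "morality", "morality", "responsibility", "human_interest"]
clusters = [2, 2, 0, 0, 1, 0]

accuracy, mapping = best_matching_accuracy(gold, clusters)
print(round(accuracy, 3))
print(mapping[2], mapping[0], mapping[1])
```

With k = 5 there are only 120 assignments, so brute force is fine. For larger k an assignment solver would be the more practical choice.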

Multiverse analysis (an example)

Method comparison

Contributions

  • A counterfactual dataset for benchmarking frame detection methods
  • Limitations: “Counterfactual”, small n (not sustainable $$$), only one type of generic frame, Turtles all the way down
  • Preliminary methodological implications: read our paper

OSF Link

(fin.)

Increasing N
