A list of selected papers in which research team members participated.
For a full list see below or go to Google Scholar (Jisun An and Haewoon Kwak).
computational journalism
political science
network science
game analytics
AI/ML/NLP
HCI
online harm
dataset/tool
bias/fairness
user engagement
AI/ML/NLP dataset/tool bias/fairness
A conversation corpus is essential for building interactive AI applications. However, the demographic information of the participants in such corpora is largely underexplored, mainly because many corpora lack individual-level data. In this work, we analyze a Korean nationwide daily conversation corpus constructed by the National Institute of Korean Language (NIKL) to characterize the participation of different demographic (age and sex) groups in the corpus.
Haewoon Kwak, Jisun An, Kunwoo Park
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
AI/ML/NLP dataset/tool bias/fairness
Framing is a process of emphasizing a certain aspect of an issue over the others, nudging readers or listeners towards different positions on the issue even without making a biased argument. Here, we propose FrameAxis, a method for characterizing documents by identifying the most relevant semantic axes (“microframes”) that are overrepresented in the text using word embedding. Our unsupervised approach can be readily applied to large datasets because it does not require manual annotations. …
Haewoon Kwak, Jisun An, Elise Jing, Yong-Yeol Ahn
computational journalism AI/ML/NLP bias/fairness
Framing is an indispensable narrative device for news media because even the same facts may lead to conflicting understandings if deliberate framing is employed. By developing a media frame classifier that achieves state-of-the-art performance, we systematically analyze the media frames of 1.5 million New York Times articles published from 2000 to 2017.
Haewoon Kwak, Jisun An, Yong-Yeol Ahn
Proceedings of the 12th ACM Conference on Web Science (WebSci), 2020
computational journalism network science bias/fairness
In this work, we propose a graph-based semi-supervised method to measure the political bias of pages in most countries and show the political split of the alternative media, mainstream media, and public figures pages. We validate our method using the publicly available U.S. dataset and then apply it to Brazilian pages, where we found a larger number of right-wing pages in general, except for alternative news media.
Samuel S Guimarães, Julio CS Reis, Lucas Lima, Filipe N Ribeiro, Marisa Vasconcelos, Jisun An, Haewoon Kwak, Fabrício Benevenuto
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020
computational journalism AI/ML/NLP bias/fairness
Predicting the political bias and the factuality of reporting of entire news outlets are critical elements of media profiling, which is an understudied but increasingly important research direction. The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim, either manually or automatically. Thus, it has been proposed to profile entire news outlets and to look for those that are likely to publish fake or biased content. This makes it possible to detect likely “fake news” the moment it is published, by simply checking the reliability of its source. From a practical perspective, political bias and factuality of reporting have a linguistic aspect but also a social context.
Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020
computational journalism AI/ML/NLP dataset/tool bias/fairness
We empirically validate three common assumptions in building political media bias datasets, which are (i) labelers’ political leanings do not affect labeling tasks, (ii) news articles follow their source outlet’s political leaning, and (iii) political leaning of a news outlet is stable across different topics.
Soumen Ganguly, Juhi Kulshrestha, Jisun An, Haewoon Kwak
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
computational journalism AI/ML/NLP bias/fairness dataset/tool
We introduce Tanbih, a news aggregator with intelligent analysis tools to help readers understand what’s behind a news story. Our system displays news grouped into events and generates media profiles that show the general factuality of reporting, the degree of propagandistic content, hyper-partisanship, leading political ideology, general frame of reporting, and stance with respect to various claims and topics of a news outlet.
Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov (demo)
bias/fairness
Gender and racial diversity in media images shapes our perception of different demographic groups. In this work, we investigate gender and racial diversity of 85,957 advertising images shared by the 73 top international brands on Instagram and Facebook.
Jisun An, Haewoon Kwak
Proceedings of Social Informatics (SocInfo), 2019
Best Paper Award
AI/ML/NLP dataset/tool bias/fairness
We evaluate four widely used face detection tools, which are Face++, IBM Bluemix Visual Recognition, AWS Rekognition, and Microsoft Azure Face API, using multiple datasets to determine their accuracy in inferring user attributes, including gender, race, and age.
Soon-gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard Jim Jansen
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018 (short)
computational journalism bias/fairness
Published by Reporters Without Borders every year, the Press Freedom Index (PFI) reflects the fear and tension in the newsroom pushed by the government and private sectors. While the PFI is invaluable in monitoring media environments worldwide, the current survey-based method is inherently costly and slow to update. In this work, we introduce an alternative way to measure the level of press freedom using media attention diversity compiled from Unfiltered News.
Jisun An, Haewoon Kwak
Proceedings of the ICWSM Workshop on NEws and publiC Opinion (NECO), 2017
Picked as The Best of the Physics arXiv (week ending April 15, 2017) in MIT Technology Review
computational journalism AI/ML/NLP bias/fairness dataset/tool
In this work, we analyze more than two million news photos published in January 2016. We demonstrate i) which objects appear the most in news photos; ii) what the sentiments of news photos are; iii) whether the sentiment of news photos is aligned with the tone of the text; iv) how gender is treated; and v) how differently political candidates are portrayed. To the best of our knowledge, this is the first large-scale study of news photo contents using deep learning-based vision APIs.
Haewoon Kwak, Jisun An
ICWSM Workshop on NEws and publiC Opinion (NECO), 2016
Picked as The Best of the Physics arXiv (week ending March 26, 2016) in MIT Technology Review
computational journalism dataset/tool bias/fairness
In this work, we reveal the structure of global news coverage of disasters and its determinants by using a large-scale news coverage dataset collected by the GDELT (Global Data on Events, Location, and Tone) project, which monitors news media in over 100 languages from around the world. Significant variables in our hierarchical (mixed-effect) regression model, such as population, political stability, damage, and more, are well aligned with a series of previous studies. However, we find strong regionalism in news geography, highlighting the necessity of comprehensive datasets for the study of global news coverage.
Haewoon Kwak, Jisun An
Proceedings of Social Informatics, 2014
Press coverage: MIT Technology Review, ACM TechNews
Who Is Missing? Characterizing the Participation of Different Demographic Groups in a Korean Nationwide Daily Conversation Corpus
Haewoon Kwak, Jisun An, Kunwoo Park
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
FrameAxis: characterizing microframe bias and intensity with word embedding
Haewoon Kwak, Jisun An, Elise Jing, Yong-Yeol Ahn
PeerJ Computer Science 7:e644, 2021
Code repo (GitHub)
25+ papers citing this work (Google Scholar)
A Systematic Media Frame Analysis of 1.5 Million New York Times Articles from 2000 to 2017
Haewoon Kwak, Jisun An, Yong-Yeol Ahn
Proceedings of the 12th ACM Conference on Web Science (WebSci), 2020
20+ papers citing this work (Google Scholar)
Identifying and Characterizing Alternative News Media on Facebook
Samuel S Guimarães, Julio CS Reis, Lucas Lima, Filipe N Ribeiro, Marisa Vasconcelos, Jisun An, Haewoon Kwak, Fabrício Benevenuto
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context
Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020
35+ papers citing this work (Google Scholar)
Empirical Evaluation of Three Common Assumptions in Building Political Media Bias Datasets
Soumen Ganguly, Juhi Kulshrestha, Jisun An, Haewoon Kwak
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
15+ papers citing this work (Google Scholar)
Tanbih: Get To Know What You Are Reading
Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov (demo)
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
Gender and Racial Diversity in Commercial Brands’ Advertising Images on Social Media
Jisun An, Haewoon Kwak
Proceedings of Social Informatics (SocInfo), 2019
Best Paper Award
20+ papers citing this work (Google Scholar)
Assessing the Accuracy of Four Popular Face Recognition Tools for Inferring Gender, Age, and Race
Soon-gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard Jim Jansen
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018 (short)
65+ papers citing this work (Google Scholar)
Data-driven Approach to Measuring the Level of Press Freedom Using Media Attention Diversity from Unfiltered News
Jisun An, Haewoon Kwak
Proceedings of the ICWSM Workshop on NEws and publiC Opinion (NECO), 2017
Picked as The Best of the Physics arXiv (week ending April 15, 2017) in MIT Technology Review
Multidimensional Analysis of Gender and Age Differences in News Consumption
Jisun An, Haewoon Kwak
Computation+Journalism (C+J) Symposium, 2016
Revealing the Hidden Patterns of News Photos: Analysis of Millions of News Photos Using GDELT and Deep Learning-based Vision APIs
Haewoon Kwak, Jisun An
ICWSM Workshop on NEws and publiC Opinion (NECO), 2016
Picked as The Best of the Physics arXiv (week ending March 26, 2016) in MIT Technology Review
20+ papers citing this work (Google Scholar)
Understanding News Geography and Major Determinants of Global News Coverage of Disasters
Haewoon Kwak, Jisun An (extension of SocInfo’14)
Computation+Journalism (C+J) Symposium, 2014
25+ papers citing this work (Google Scholar)
A First Look at Global News Coverage of Disasters By Using the GDELT Dataset
Haewoon Kwak, Jisun An
Proceedings of Social Informatics, 2014
Press coverage: MIT Technology Review, ACM TechNews
60+ papers citing this work (Google Scholar)