A list of selected papers in which research team members participated. 
For a full list see below or go to Google Scholar (Jisun An and Haewoon Kwak).
computational journalism
political science
network science
game analytics
AI/ML/NLP
HCI
 
online harm 
dataset/tool 
bias/fairness 
user engagement
AI/ML/NLP

How are our beliefs formed and how do they influence each other? By combining large language models (LLMs) with collective online records of human belief, we created a “map” showing how thousands of beliefs relate to one another. This model allows us to predict a person’s next belief, measure potential cognitive dissonance, and ultimately understand the core principles governing human cognition.
Byunghwee Lee, Rachith Aiyappa, Yong-Yeol Ahn, Haewoon Kwak, Jisun An
Press coverage - Phys.org Featured Article
computational journalism AI/ML/NLP online harm bias/fairness

How can we spot fake news? We go beyond single articles to evaluate the credibility of entire news outlets. Discover the present and future of new technologies that can rapidly identify sources of disinformation by jointly analyzing their factuality and political bias.
Preslav Nakov, Jisun An, Haewoon Kwak, Muhammad Arslan Manzoor, Zain Muhammad Mujahid, Husrev Taha Sencar
AI/ML/NLP

How can we quickly and accurately compare the knowledge graphs that structure a sentence’s meaning? Existing metrics have limitations, often failing to capture true semantic similarity while being computationally expensive. We developed REMATCH, a new and efficient metric that addresses this by extracting core semantic elements called ‘motifs’ and comparing their collective sets. REMATCH measures semantic similarity 1–5 percentage points more accurately than previous state-of-the-art metrics, and it is five times faster. This model can play a key role in building more sophisticated natural language understanding systems and directly benefits downstream applications that rely on analyzing semantic relationships between sentences.
Zoher Kachwala, Jisun An, Haewoon Kwak, Filippo Menczer
AI/ML/NLP HCI

How can we evaluate the quality of Natural Language Explanations for decisions made by AI? Direct human evaluation is accurate, but it is a difficult, time-consuming, and expensive process. We experimented to see if ChatGPT can evaluate AI explanations for ‘informativeness’ and ‘clarity’ like expert annotators, and how its judgment aligns with humans across various scales. The results show that ChatGPT performs very similarly to humans when sorting explanations into broad categories like “good/bad,” but struggles to assign fine-grained scores from 1 to 7. Notably, in ‘pairwise comparison’ tasks, which involve judging which of two explanations is better, it demonstrated high accuracy comparable to human experts. This research shows that Large Language Models can be used as reliable and efficient tools to supplement human evaluators under specific conditions, which can accelerate the development of transparent and responsible AI systems.
Fan Huang, Haewoon Kwak, Kunwoo Park, Jisun An
AI/ML/NLP

What is the true impact of toxic trolling in the comment sections of anti-vaccine YouTube videos? While it’s widely believed that such comments spread fear and fuel vaccine hesitancy, there has been little empirical evidence to measure this effect. Our latest study tackles this question by analyzing the complex interplay between toxicity and fear across 484 anti-vaccine videos and more than 414,000 of their comments. Using machine learning to score each comment, we found a significant link between the overall toxicity of a video’s comment section and the level of fear expressed within it. More importantly, we discovered a powerful contagion effect; the toxicity of highly liked early comments was significantly associated with a rise in fear in subsequent comments. This influence was also found to be bidirectional, as highly liked fearful comments were linked to an increase in later toxicity.
Kunihiro Miyazaki, Takayuki Uchiba, Haewoon Kwak, Jisun An, Kazutoshi Sasahara
AI/ML/NLP

The emergence of generative AI has sparked substantial discussions, with the potential to have profound impacts on society in all aspects. As emerging technologies continue to advance, it is imperative to facilitate their proper integration into society, managing expectations and fear. This paper investigates users’ perceptions of generative AI using 3M posts on Twitter from January 2019 to March 2023, especially focusing on their occupation and usage. We find that people across various occupations, not just IT-related ones, show a strong interest in generative AI. The sentiment toward generative AI is generally positive, …
Kunihiro Miyazaki, Taichi Murayama, Takayuki Uchiba, Jisun An, Haewoon Kwak
Press coverage - Blockchain News
AI/ML/NLP

Traffic prediction is one of the key elements to ensure the safety and convenience of citizens. Existing traffic prediction models primarily focus on deep learning architectures to capture spatial and temporal correlation. They often overlook the underlying nature of traffic. Specifically, the sensor networks in most traffic datasets do not accurately represent the actual road network exploited by vehicles, failing to provide insights into the traffic patterns in urban activities. To overcome these limitations, we propose an improved traffic prediction method based on graph convolution deep learning algorithms. …
Sumin Han, Youngjun Park, Minji Lee, Jisun An, Dongman Lee
AI/ML/NLP

ChatGPT, the first large language model (LLM) with mass adoption, has demonstrated remarkable performance in numerous natural language tasks. Despite its evident usefulness, evaluating ChatGPT’s performance in diverse problem domains remains challenging due to the closed nature of the model and its continuous updates via Reinforcement Learning from Human Feedback (RLHF). We highlight the issue of data contamination in ChatGPT evaluations, with a case study of the task of stance detection. We discuss the challenge of preventing data contamination and ensuring fair model evaluation in the age of closed and continuously trained models.
Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-Yeol Ahn
TrustNLP (Collocated with ACL), 2023
AI/ML/NLP

People who share similar opinions towards controversial topics could form an echo chamber and may share similar political views toward other topics as well. The existence of such connections, which we call connected behavior, gives researchers a unique opportunity to predict how one would behave for a future event given their past behaviors. In this work, we propose a framework to conduct connected behavior analysis. Neural stance detection models are trained on Twitter data collected on three seemingly independent topics, i.e., wearing a mask, racial equality, and Trump, to detect people’s stance, …
Hong Zhang, Haewoon Kwak, Wei Gao, Jisun An

New leaders in democratic countries typically enjoy high approval ratings immediately after taking office. This phenomenon is called the honeymoon effect and is regarded as a significant political phenomenon; however, its mechanism remains underexplored. Therefore, this study examines how social media users respond to changes in political leadership in order to better understand the honeymoon effect in politics. In particular, we constructed a 15-year Twitter dataset of 6.6M tweets covering eight changes of Japanese prime minister and analyzed it in terms of sentiments, topics, and users. …
Kunihiro Miyazaki, Taichi Murayama, Akira Matsui, Masaru Nishikawa, Takayuki Uchiba, Haewoon Kwak, Jisun An
AI/ML/NLP online harm

Recent studies have warned that much online hate speech is implicit. Because such speech is subtle, explaining why it was detected as hateful has been a challenging problem. In this work, we examine whether ChatGPT can be used for providing natural language explanations (NLEs) for implicit hateful speech detection. We design our prompt to elicit concise ChatGPT-generated NLEs and conduct user studies to evaluate their quality by comparison with human-generated NLEs. We discuss the potential and limitations of ChatGPT in the context of implicit hateful speech research.
Fan Huang, Haewoon Kwak, Jisun An
AI/ML/NLP online harm

Recent studies have exploited advanced generative language models to generate Natural Language Explanations (NLE) for why a certain text could be hateful. We propose the Chain of Explanation (CoE) Prompting method, using the heuristic words and target group, to generate high-quality NLE for implicit hate speech. We improved the BLEU score from 44.0 to 62.3 for NLE generation by providing accurate target information. We then evaluate the quality of generated NLE using various automatic metrics and human annotations of informativeness and clarity scores.
Fan Huang, Haewoon Kwak, Jisun An
computational journalism online harm

False information spreads on social media, and fact-checking is a potential countermeasure. However, there is a severe shortage of fact-checkers; an efficient way to scale fact-checking is desperately needed, especially in pandemics like COVID-19. In this study, we focus on spontaneous debunking by social media users, which has been missed in existing research despite its indicated usefulness for fact-checking and countering false information. Specifically, we characterize the tweets with false information, or fake tweets, that tend to be debunked and Twitter users who often debunk fake tweets.
Kunihiro Miyazaki, Takayuki Uchiba, Kenji Tanaka, Jisun An, Haewoon Kwak, Kazutoshi Sasahara
game analytics HCI user engagement

Achievement systems have been actively adopted in gaming platforms to maintain players’ interests. Among them, trophies in PlayStation games are one of the most successful achievement systems. While the importance of trophy design has been casually discussed in many game developers’ forums, there has been no systematic study of the historical dataset of trophies yet. In this work, we construct a complete dataset of PlayStation games and their trophies and investigate them from both the developers’ and players’ perspectives.
Haewoon Kwak
network science

Social media is not only a place for people to communicate about daily matters but also a virtual venue to transmit and exchange various ideas. Such ideas are known as the raw voices of potential consumers, which come from a wide range of people who may not participate in consumer surveys, and therefore their opinions may contain high value to companies. However, how users share their ideas on social media is still underexplored. This study investigates a spontaneous ideation contest about a generic term for new Big Tech companies, which occurred when Facebook changed its name to Meta. We constructed a comprehensive dataset of tweets containing candidates and examined how they were suggested, spread, and exchanged by social media users. Our findings indicate that different ideas are better on different metrics. The ranking of ideas was not decided immediately after the idea contest started. The first people to post ideas have fewer followers than those who post secondarily or who only share the idea. We also confirmed that replies accumulate unique ideas, but most of them are added in the first depth in reply trees. This study would promote the use of social media as a part of open innovation and co-creation processes in the industry.
Kunihiro Miyazaki, Takayuki Uchiba, Haewoon Kwak, Jisun An
political science network science

The United States has some of the highest rates of gun violence among developed countries. Yet, there is a disagreement about the extent to which firearms should be regulated. In this study, we employ social media signals to examine the predictors of offline political activism, at both the population and individual levels. We show that it is possible to classify the stance of users on the gun issue, especially accurately when network information is available. Alongside socioeconomic variables, network information such as the relative size of the two sides of the debate is also predictive of state-level gun policy. At the individual level, we build a statistical model using network, content, and psycho-linguistic features that predicts real-life political action, and explore the most predictive linguistic features. Thus, we argue that, alongside demographics and socioeconomic indicators, social media provides useful signals in the holistic modeling of political engagement around the gun debate.
Yelena Mejova, Jisun An, Gianmarco De Francisci Morales, Haewoon Kwak
ACM Transactions on Social Computing, 2022
AI/ML/NLP political science online harm

The transfer of power stemming from the 2020 presidential election occurred during an unprecedented period in United States history. Uncertainty from the COVID-19 pandemic, ongoing societal tensions, and a fragile economy increased societal polarization, exacerbated by the outgoing president’s offline rhetoric. As a result, online groups such as QAnon engaged in extra political participation beyond the traditional platforms. This research explores the link between offline political speech and online extra-representational participation by examining Twitter within the context of the January 6 insurrection. Using a mixed-methods approach of quantitative and qualitative thematic analyses, the study combines offline speech information with Twitter data during key speech addresses leading up to the date of the insurrection, exploring the link between Trump’s offline speeches and QAnon’s hashtags across a 3-day timeframe. We find that links between online extra-representational participation and offline political speech exist. This research illuminates this phenomenon and offers policy implications for the role of online messaging as a tool of political mobilization.
Claire Seungeun Lee, Juan Merizalde, John D. Colautti, Jisun An, Haewoon Kwak
Press coverage - PsyPost
computational journalism user engagement

Using Plutchik’s wheel of emotions framework, we identify the emotional content of 133,487 social media posts and the audience’s emotional engagement expressed in 2,824,162 comments on those posts. We measure nine emotions (anger, anticipation, anxiety, disgust, joy, fear, sadness, surprise, trust) and two sentiments (positive and negative) using two extraction resources (EmoLex, LIWC) for eight major news outlets across four social media platforms (Facebook, Instagram, Twitter, and YouTube) during eight months. We then apply two approaches (Logistic Regression, Long Short-Term Memory) to predict emotional audience reactions before and after publishing the posts. …
Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen
ACM Transactions on Social Computing, 2022
AI/ML/NLP online harm

While the contagious nature of online toxicity sparked increasing interest in its early detection and prevention, most of the literature focuses on the Western world. In this work, we demonstrate that 1) it is possible to detect toxicity triggers in an Asian online community, and 2) toxicity triggers can be strikingly different between Western and Eastern contexts.
Yun Yu Chong, Haewoon Kwak
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
Press coverage - AI Ethics Brief Newsletter by Montreal AI Ethics Institute
AI/ML/NLP dataset/tool bias/fairness

A conversation corpus is essential to build interactive AI applications. However, the demographic information of the participants in such corpora is largely underexplored mainly due to the lack of individual data in many corpora. In this work, we analyze a Korean nationwide daily conversation corpus constructed by the National Institute of Korean Language (NIKL) to characterize the participation of different demographic (age and sex) groups in the corpus.
Haewoon Kwak, Jisun An, Kunwoo Park
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
computational journalism user engagement

This research characterises user engagement of approximately 3,000,000 news postings of 53 news outlets and 50,000,000 associated user comments during 8 months on 5 social media platforms (i.e. Facebook, Instagram, Twitter, YouTube, and Reddit). We investigate the effect of sentiments and topics on user engagement across four levels of user engagement expressions (i.e. views, likes, comments, cross-platform posting). We find that sentiments and topics differ by both news outlets and social media platforms, and that both also differ across the four levels of user engagement expression. …
Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen
Behaviour & Information Technology, 2022
AI/ML/NLP online harm

We investigate predictors of anti-Asian hate among Twitter users throughout COVID-19. With the rise of xenophobia and polarization that has accompanied widespread social media usage in many nations, online hate has become a major social issue, attracting many researchers. Here, we apply natural language processing techniques to characterize social media users who began to post anti-Asian hate messages during COVID-19. We compare two user groups – those who posted anti-Asian slurs and those who did not – with respect to a rich set of features measured with data prior to COVID-19 and show that it is possible to predict who later publicly posted anti-Asian slurs. …
Jisun An, Haewoon Kwak, Claire Seungeun Lee, Bogang Jun, Yong-Yeol Ahn
Findings of the Association for Computational Linguistics: EMNLP 2021
user engagement

We propose a novel precision public health campaign framework to structure and standardize the process of designing and delivering tailored health messages to target particular population segments using social media–targeted advertising tools. Our framework consists of five stages - defining a campaign goal, priority audience, and evaluation metrics; splitting the target audience into smaller segments; tailoring the message for each segment and conducting a pilot test; running the health campaign formally; and evaluating the performance of the campaigns. We have demonstrated how the framework works through 2 case studies. The precision public health campaign framework has the potential to support higher population uptake and engagement rates by encouraging a more standardized, concise, efficient, and targeted approach to public health campaign development.
Jisun An, Haewoon Kwak, Hanya M Qureshi, Ingmar Weber
JMIR Form Res 2021;5(9):e22313, 2021
AI/ML/NLP dataset/tool bias/fairness

Framing is a process of emphasizing a certain aspect of an issue over the others, nudging readers or listeners towards different positions on the issue even without making a biased argument. Here, we propose FrameAxis, a method for characterizing documents by identifying the most relevant semantic axes (“microframes”) that are overrepresented in the text using word embedding. Our unsupervised approach can be readily applied to large datasets because it does not require manual annotations. …
Haewoon Kwak, Jisun An, Elise Jing, Yong-Yeol Ahn
political science

We investigate differences along these dimensions on the online forum Reddit by comparing linguistic patterns and content of comments in two subreddits focusing on a populist, Donald Trump (/r/The_Donald), and a center-left politician, Hillary Clinton (/r/hillaryclinton), during the 2016 U.S. presidential election campaign.
Andreas Jungherr, Oliver Posegga, Jisun An
Social Science Computer Review. March 2021.
computational journalism AI/ML/NLP user engagement

We first build a parallel corpus of original news articles and their corresponding tweets that were shared by eight media outlets. Then, we explore how those media outlets edited tweets relative to the original headlines, and the effects would be …
Kunwoo Park, Haewoon Kwak, Jisun An, Sanjay Chawla
Proceedings of the 15th International AAAI Conference on Web and Social Media (ICWSM), 2021
computational journalism AI/ML/NLP bias/fairness

Framing is an indispensable narrative device for news media because even the same facts may lead to conflicting understandings if deliberate framing is employed. By developing a media frame classifier that achieves state-of-the-art performance, we systematically analyze the media frames of 1.5 million New York Times articles published from 2000 to 2017.
Haewoon Kwak, Jisun An, Yong-Yeol Ahn
Proceedings of the 12th ACM Conference on Web Science (WebSci), 2020
computational journalism network science bias/fairness

In this work, we propose a graph-based semi-supervised method to measure the political bias of pages in most countries and show the political split of the alternative media, mainstream media, and public figures pages. We validate our method using the publicly available U.S. dataset and then apply it to Brazilian pages, where we found a larger number of right-wing pages in general, except for alternative news media.
Samuel S Guimarães, Julio CS Reis, Lucas Lima, Filipe N Ribeiro, Marisa Vasconcelos, Jisun An, Haewoon Kwak, Fabrício Benevenuto
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020
computational journalism AI/ML/NLP bias/fairness

Predicting the political bias and the factuality of reporting of entire news outlets are critical elements of media profiling, which is an understudied but increasingly important research direction. The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim, either manually or automatically. Thus, it has been proposed to profile entire news outlets and to look for those that are likely to publish fake or biased content. This makes it possible to detect likely “fake news” the moment it is published, by simply checking the reliability of its source. From a practical perspective, political bias and factuality of reporting have a linguistic aspect but also a social context.
Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020
computational journalism AI/ML/NLP dataset/tool bias/fairness

We empirically validate three common assumptions in building political media bias datasets, which are (i) labelers’ political leanings do not affect labeling tasks, (ii) news articles follow their source outlet’s political leaning, and (iii) political leaning of a news outlet is stable across different topics.
Soumen Ganguly, Juhi Kulshrestha, Jisun An, Haewoon Kwak
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
AI/ML/NLP user engagement

We study two Reddit communities that adopted this scheme, whereby posts include tags identifying education status referred to as flairs, and we examine how the “transferred” social status affects the interactions among the users.
Kunwoo Park, Haewoon Kwak, Hyunho Song, Meeyoung Cha
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
AI/ML/NLP online harm

We define toxicity triggers in online discussions as non-toxic comments that lead to toxic replies. Then, we build a neural network-based prediction model for toxicity triggers.
Hind Almerekhi, Haewoon Kwak, Bernard Jim Jansen, Joni Salminen (short)
Proceedings of The Web Conference (WWW), 2020

We show that estimating homophily in a network can be viewed as a dyadic prediction problem, and that homophily estimates are unbiased when dyad-level residuals sum to zero in the network. Then, we propose a novel “ego-alter” modeling approach that outperforms standard node and dyad classification strategies.
George Berry, Antonio Sirianni, Ingmar Weber, Jisun An, Michael Macy (preprint)
arXiv preprint arXiv:2001.11171, 2020
computational journalism AI/ML/NLP bias/fairness dataset/tool

We introduce Tanbih, a news aggregator with intelligent analysis tools to help readers understand what’s behind a news story. Our system displays news grouped into events and generates media profiles that show the general factuality of reporting, the degree of propagandistic content, hyper-partisanship, leading political ideology, general frame of reporting, and stance with respect to various claims and topics of a news outlet.
Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov (demo)
bias/fairness

Gender and racial diversity in the mediated images from the media shape our perception of different demographic groups. In this work, we investigate gender and racial diversity of 85,957 advertising images shared by the 73 top international brands on Instagram and Facebook.
Jisun An, Haewoon Kwak
Proceedings of Social Informatics (SocInfo), 2019
Best Paper Award
political science

We use Reddit to explore the nature of political discussions in homogeneous and cross-cutting communication spaces. In particular, we develop an analytical template to study interaction and linguistic patterns within and between politically homogeneous and heterogeneous communication spaces. Our analyses reveal different behavioral patterns in homogeneous and cross-cutting communication spaces.
Jisun An, Haewoon Kwak, Oliver Posegga, Andreas Jungherr
Proceedings of the 13th International AAAI Conference on Web and Social Media (ICWSM), 2019
computational journalism user engagement

We evaluate the effects of the topics of social media posts on audiences across five social media platforms (i.e., Facebook, Instagram, Twitter, YouTube, and Reddit) at four levels of user engagement. We collected 3,163,373 social posts from 53 news organizations across five platforms during an 8-month period.
Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen
Proceedings of the 13th International AAAI Conference on Web and Social Media (ICWSM), 2019
computational journalism

We propose the concept of discursive power. This describes the ability of contributors to communication spaces to introduce, amplify, and maintain topics, frames, and speakers, thus shaping public discourses and controversies that unfold in interconnected communication spaces.
Andreas Jungherr, Oliver Posegga, Jisun An
The International Journal of Press/Politics, 24(4), 2019
HCI

We develop a methodology to automate creating imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. From a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology has several novel accomplishments.
Jisun An, Haewoon Kwak, Soon-gyo Jung, Joni Salminen, M. Admad, Bernard J. Jansen
ACM Transactions on the Web, 12(4), 2018
AI/ML/NLP dataset/tool bias/fairness

We evaluate four widely used face detection tools, which are Face++, IBM Bluemix Visual Recognition, AWS Rekognition, and Microsoft Azure Face API, using multiple datasets to determine their accuracy in inferring user attributes, including gender, race, and age.
Soon-gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard Jim Jansen
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018 (short)
online harm

We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to automatically detect and classify the hateful comments in the full dataset.
Joni Salminen, Hind Almerekhi, Milica Milenković, Soon-gyo Jung, Jisun An, Haewoon Kwak, Bernard J. Jansen
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018
game analytics HCI

Peter Mawhorter, Sercan Şengün, Haewoon Kwak, D. Fox Harrell
IEEE Transactions on Games, 10(2), 2018
AI/ML/NLP dataset/tool

We propose SemAxis, a simple yet powerful framework to characterize word semantics using many semantic axes in word-vector spaces beyond sentiment. We demonstrate that SemAxis can capture nuanced semantic representations in multiple online communities. We also show that, when the sentiment axis is examined, SemAxis outperforms the state-of-the-art approaches in building domain-specific sentiment lexicons.
Jisun An, Haewoon Kwak, Yong-Yeol Ahn
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), 2018
computational journalism network science

We investigate the alignment of international attention of news media organizations within 193 countries with the expressed international interests of the public within those same countries from March 7, 2016 to April 14, 2017. We collect fourteen months of longitudinal data of online news from Unfiltered News and web search volume data from Google Trends and build a multiplex network of media attention and public attention in order to study its structural and dynamic properties.
Haewoon Kwak, Jisun An, Joni Salminen, Soon-Gyo Jung, Bernard J. Jansen
Proceedings of the 2018 World Wide Web Conference (WWW), 2018
online harm

We provide, to the best of our knowledge, the first characterization of Gab. We collect and analyze 22M posts produced by 336K users between August 2016 and January 2018, finding that Gab is predominantly used for the dissemination and discussion of news and world events, and that it attracts alt-right users, conspiracy theorists, and other trolls.
Savvas Zannettou, Barry Bradlyn, Emiliano De Cristofaro, Haewoon Kwak, Michael Sirivianos, Gianluca Stringhini, Jeremy Blackburn
Companion Proceedings of The Web Conference (WWW), 2018
Press coverage - New Scientist and Vice
HCI

We investigate if and how more photos than a single headshot can heighten the level of information provided by persona profiles. We conduct eye-tracking experiments and qualitative interviews with three variations in the photos: a single headshot, a headshot and images of the persona in different contexts, and a headshot with pictures of different people representing key persona attributes.
Joni Salminen, Lene Nielsen, Soon-Gyo Jung, Jisun An, Haewoon Kwak, Bernard J. Jansen
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI), 2018
computational journalism network science

The objective of this study is to assess the longitudinal trends of media similarity and dissimilarity on the international scale. As news value has well-established political, cultural, and economic consequences, the degree to which media coverage and content is converging across countries has implications for international relations. To study this convergence, we use the daily data of the 100 topics that were overreported in each country, compared to other countries, from March 7 to October 9, 2016.
Jisun An, Hassan Aldarbesti, Haewoon Kwak
Proceedings of Social Informatics (SocInfo), 2017 (short)
computational journalism user engagement

Examining 103,133 news articles that are the most popular for different demographic groups in Daum News (the second most popular news portal in South Korea) during the whole year of 2015, we provided multi-level analyses of gender and age differences in news consumption. We measured such differences at four different levels: (1) by actual news items, (2) by section, (3) by topic, and (4) by subtopic. We characterized the news items at the four levels using two computational techniques, topic modeling and the vector representation of words and news items. We found that differences in news reading behavior across different demographic groups are the most noticeable at the subtopic level but not at the section or topic levels.
Jisun An, Haewoon Kwak
Proceedings of Social Informatics (SocInfo), 2017
computational journalism network science

We built a multiplex media attention and disregard network (MADN) among 129 countries over 212 days. By characterizing the MADN from multiple levels, we found that it is formed primarily by skewed, hierarchical, and asymmetric relationships. Also, we found strong evidence that our news world is becoming a “global village.” However, at the same time, unique attention blocks of the Middle East and North Africa (MENA) region, as well as Russia and its neighbors, still exist.
Haewoon Kwak, Jisun An
Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2017
computational journalism user engagement

The widespread adoption and dissemination of online news through social media systems have been revolutionizing many segments of our society and ultimately our daily lives. In these systems, users can play a central role as they share content with their friends. Despite that, little is known about news spreaders in social media. In this paper, we provide the first in-depth characterization of news spreaders in social media. In particular, we investigate their demographics, what kind of content they share, and the audience they reach.
Julio Reis, Haewoon Kwak, Jisun An, Johnnatan Messias, Fabrício Benevenuto
Proceedings of the 28th ACM Conference on Hypertext and Social Media (HT), 2017
computational journalism bias/fairness

Published every year by Reporters Without Borders, the Press Freedom Index (PFI) reflects the fear and tension in newsrooms caused by pressure from governments and the private sector. While the PFI is invaluable for monitoring media environments worldwide, its survey-based method is inherently costly and slow to update. In this work, we introduce an alternative way to measure the level of press freedom using media attention diversity compiled from Unfiltered News.
Jisun An, Haewoon Kwak
Proceedings of the ICWSM Workshop on NEws and publiC Opinion (NECO), 2017
Picked as The Best of the Physics arXiv (week ending April 15, 2017) in MIT Technology Review
game analytics

We use player behavior during the closed beta test of the MMORPG ArcheAge as a proxy for an extreme situation: at the end of the closed beta test, all user data is deleted, so the outcome (or penalty) of players’ in-game behavior in the last few days loses its meaning. We analyzed 270 million records of player behavior in the 4th closed beta test of ArcheAge.
Ah Reum Kang, Jeremy Blackburn, Haewoon Kwak, Huy Kang Kim
Proceedings of the 26th International Conference on World Wide Web (WWW) Companion, 2017
Press coverage - New Scientist, IFL Science, PC Gamer, Massively OK, El Confidencial, Joongang Ilbo, and so on.
game analytics user engagement

Retaining players over an extended period of time is a long-standing challenge in the game industry. Significant effort has gone into understanding what motivates players to enjoy games. While individuals may have varying reasons to play or abandon a game at different stages within the game, previous studies have looked at the retention problem from a snapshot view. By analyzing in-game logs of 51,104 distinct individuals in an online multiplayer game, this study offers a multifaceted view of the retention problem over the players’ virtual life phases.
Kunwoo Park, Meeyoung Cha, Haewoon Kwak, Kuan-Ta Chen
Proceedings of the 26th International Conference on World Wide Web (WWW) Companion, 2017

The Middle East respiratory syndrome coronavirus (MERS-CoV) was exported to Korea in 2015, resulting in a threat to neighboring nations. We evaluated the possibility of using a digital surveillance system based on web searches and social media data to monitor this MERS outbreak. We collected the number of daily laboratory-confirmed MERS cases and quarantined cases from May 11, 2015 to June 26, 2015 using the Korean government MERS portal. The daily trends observed via Google search and Twitter during the same time period were also ascertained using Google Trends and Topsy. Correlations among the data were then examined using Spearman correlation analysis.
Soo-Yong Shin, Dong-Woo Seo, Jisun An, Haewoon Kwak, Sung-Han Kim, Jin Gwack, Min-Woo Jo
Scientific Reports 6, Article number 32920 (2016)
computational journalism AI/ML/NLP bias/fairness dataset/tool

In this work, we analyze more than two million news photos published in January 2016. We demonstrate i) which objects appear the most in news photos; ii) what the sentiments of news photos are; iii) whether the sentiment of news photos is aligned with the tone of the text; iv) how gender is treated; and v) how differently political candidates are portrayed. To the best of our knowledge, this is the first large-scale study of news photo content using deep learning-based vision APIs.
Haewoon Kwak, Jisun An
ICWSM Workshop on NEws and publiC Opinion (NECO), 2016
Picked as The Best of the Physics arXiv (week ending March 26, 2016) in MIT Technology Review

We study the response to the Charlie Hebdo shootings of January 7, 2015 on Twitter across the globe. We ask whether the stances on the issue of freedom of speech can be modeled using established sociological theories, including Huntington’s culturalist Clash of Civilizations, and those taking into consideration social context, including Density and Interdependence theories. We find support for Huntington’s culturalist explanation, in that the established traditions and norms of one’s “civilization” predetermine some of one’s opinion.
Jisun An, Haewoon Kwak, Yelena Mejova, Sonia Alonso Saenz De Oger, Braulio Gomez Fortes
Proceedings of the 10th International Conference on Web and Social Media (ICWSM), 2016
game analytics HCI online harm

In this work, we explore cyberbullying and other toxic behavior in team competition online games. Using a dataset of over 10 million player reports on 1.46 million toxic players along with corresponding crowdsourced decisions, we test several hypotheses drawn from theories explaining toxic behavior. Besides providing a large-scale, empirically based understanding of toxic behavior, our work can be used as a basis for building systems to detect, prevent, and counteract toxic behavior.
Haewoon Kwak, Jeremy Blackburn, Seungyeop Han
Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI), 2015
computational journalism dataset/tool bias/fairness

In this work, we reveal the structure of global news coverage of disasters and its determinants by using a large-scale news coverage dataset collected by the GDELT (Global Data on Events, Location, and Tone) project, which monitors news media in over 100 languages from around the world. Significant variables in our hierarchical (mixed-effect) regression model, such as population, political stability, and damage, are consistent with previous research. However, we find strong regionalism in news geography, highlighting the necessity of comprehensive datasets for the study of global news coverage.
Haewoon Kwak, Jisun An
Proceedings of Social Informatics, 2014
Press coverage - MIT Technology Review, ACM TechNews
game analytics online harm

In this paper we explore the linguistic components of toxic behavior by using crowdsourced data from over 590 thousand cases of accused toxic players in a popular match-based competition game, League of Legends. We perform a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. We characterize linguistic behavior of toxic players and compare it with that of typical players in an online competition game. We also find empirical support describing how a player transitions from typical to toxic behavior. Our findings can be helpful to automatically detect and warn players who may become toxic and thus insulate potential victims from toxic playing in advance.
Haewoon Kwak, Jeremy Blackburn
SocInfo Workshop on Exploration on Games and Gamers (EGG), 2014
network science

We introduce the concept of “flow motifs” to characterize statistically significant pass sequence patterns. It extends the idea of network motifs, highly significant subgraphs that usually consist of three or four nodes. The analysis of the motifs in the pass networks allows us to compare and differentiate the styles of different teams. Although most teams tend to apply a homogeneous style, surprisingly, one unique soccer strategy stands out. Specifically, FC Barcelona’s famous tiki-taka does not consist of uncountable random passes but rather has a precise, finely constructed structure.
Laszlo Gyarmati, Haewoon Kwak, Pablo Rodriguez
KDD Workshop on Large-Scale Sports Analytics, 2014
Press coverage - BBC, MIT Technology Review, The Times, The Economist, Slate, Pacific Standard, and so on.
game analytics online harm

We propose a supervised learning approach for predicting crowdsourced decisions on toxic behavior using large-scale labeled data: over 10 million user reports on 1.46 million toxic players, along with the corresponding crowdsourced decisions. Our results show good performance in detecting overwhelming-majority cases and predicting crowdsourced decisions on them. We demonstrate good portability of our classifier across regions.
Jeremy Blackburn, Haewoon Kwak
Proceedings of the 23rd International Conference on World Wide Web (WWW), 2014
Press coverage - Nature, Scientific American, Chosun Ilbo

Kickstarter is one of the most popular crowdfunding sites. On it, creators post descriptions of their projects and advertise them on social media sites (mainly Twitter), while investors look for projects to support. We propose different ways of recommending investors found on Twitter for specific Kickstarter projects. We do so by conducting hypothesis-driven analyses of pledging behavior and translating the corresponding findings into different recommendation strategies. The best strategy achieves, on average, 84% accuracy in predicting a list of potential investors’ Twitter accounts for any given project.
Jisun An, Daniele Quercia, Jon Crowcroft
Proceedings of the 23rd International Conference on World Wide Web (WWW), 2014
Press coverage - FastCompany
network science HCI

We analyze the dynamics of the behavior known as ‘unfollow’ on Twitter. We collected daily snapshots of the online relationships of 1.2 million Korean-speaking users for 51 days, as well as all of their tweets. We found that Twitter users unfollow frequently. We then identified the major factors that affect the decision to unfollow, including the reciprocity of the relationship, the duration of the relationship, the followees’ informativeness, and the overlap of the relationships. We conducted interviews with 22 Korean respondents to supplement the quantitative results.
Haewoon Kwak, Hyunwoo Chun, Sue Moon
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), 2011.
Press coverage - Kyunghyang Shinmun
computational journalism network science

We present a preliminary but groundbreaking study of the media landscape of Twitter. We use public data on who follows whom to uncover common behaviour in media consumption, the relationship between various classes of media, and the diversity of media content which social links may bring. Our analysis shows that there is a non-negligible amount of indirect media exposure, either through friends who follow particular media sources or via retweeted messages. We show that indirect media exposure expands the political diversity of news to which users are exposed to a surprising extent, increasing the range by 60-98%. These results are valuable because they have not been readily available to traditional media, and they can help predict how we will read news and how publishers will interact with us in the future.
Jisun An, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft
Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM), 2011.
network science

We crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In our analysis of the follower-following topology, we found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. To identify influentials on Twitter, we ranked users by the number of followers and by PageRank and found the two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap between the influence inferred from the number of followers and that inferred from the popularity of one’s tweets.
Haewoon Kwak, Changhyun Lee, Hosung Park, Sue Moon
Proceedings of the 19th International Conference on World Wide Web (WWW), 2010
Press coverage - Mashable Op-Ed, ReadWrite, The Guardian, PC News, Chosun Ilbo, DongA Ilbo
A semantic embedding space based on large language models for modelling human beliefs 
  Byunghwee Lee, Rachith Aiyappa, Yong-Yeol Ahn, Haewoon Kwak, Jisun An 
Nature Human Behaviour, 2025
  Press coverage - Phys.org Featured Article
View-only without paywall (ReadCube)
A Survey on Predicting the Factuality and the Bias of News Media 
  Preslav Nakov, Jisun An, Haewoon Kwak, Muhammad Arslan Manzoor, Zain Muhammad Mujahid, Husrev Taha Sencar 
ACL Findings, 2024
  15+ papers citing this work (Google scholar)
Rematch: Robust and Efficient Matching of Local Knowledge Graphs to Improve Structural and Semantic Similarity 
  Zoher Kachwala, Jisun An, Haewoon Kwak, Filippo Menczer 
NAACL Findings, 2024
ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales? 
  Fan Huang, Haewoon Kwak, Kunwoo Park, Jisun An 
LREC-COLING, 2024
  15+ papers citing this work (Google scholar)
The Impact of Toxic Trolling Comments on Anti-vaccine YouTube Videos 
  Kunihiro Miyazaki, Takayuki Uchiba, Haewoon Kwak, Jisun An, Kazutoshi Sasahara 
Scientific Reports, 2024
Public Perception of Generative AI on Twitter: An Empirical Study Based on Occupation and Usage 
  Kunihiro Miyazaki, Taichi Murayama, Takayuki Uchiba, Jisun An, Haewoon Kwak 
EPJ Data Science, 2024
  Press coverage - Blockchain News
35+ papers citing this work (Google scholar)
Enhancing Spatiotemporal Traffic Prediction through Urban Human Activity Analysis 
  Sumin Han, Youngjun Park, Minji Lee, Jisun An, Dongman Lee 
ACM CIKM, 2023
Can We Trust the Evaluation on ChatGPT? 
  Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-Yeol Ahn 
TrustNLP (Collocated with ACL), 2023
  100+ papers citing this work (Google scholar)
Wearing Masks Implies Refuting Trump?: Towards Target-specific User Stance Prediction across Events in COVID-19 and US Election 2020 
  Hong Zhang, Haewoon Kwak, Wei Gao, Jisun An 
ACM WebSci, 2023
  5+ papers citing this work (Google scholar)
Political Honeymoon Effect on Social Media: Characterizing Social Media Reaction to the Changes of Prime Minister in Japan 
  Kunihiro Miyazaki, Taichi Murayama, Akira Matsui, Masaru Nishikawa, Takayuki Uchiba, Haewoon Kwak, Jisun An 
ACM WebSci, 2023
  5+ papers citing this work (Google scholar)
Is ChatGPT better than Human Annotators? Potential and Limitations of ChatGPT in Explaining Implicit Hate Speech 
  Fan Huang, Haewoon Kwak, Jisun An 
WWW Companion, 2023
  340+ papers citing this work (Google scholar)
Chain of Explanation: New Prompting Method to Generate Higher Quality Natural Language Explanation for Implicit Hate Speech 
  Fan Huang, Haewoon Kwak, Jisun An 
WWW Companion, 2023
  35+ papers citing this work (Google scholar)
YouNICon: YouTube’s CommuNIty of Conspiracy Videos 
  Shaoyi Liaw, Fan Huang, Fabricio Benevenuto, Haewoon Kwak, Jisun An 
ICWSM Dataset, 2023
  5+ papers citing this work (Google scholar)
‘This is Fake News’: Characterizing the Spontaneous Debunking from Twitter Users to COVID-19 False Information 
  Kunihiro Miyazaki, Takayuki Uchiba, Kenji Tanaka, Jisun An, Haewoon Kwak, Kazutoshi Sasahara 
AAAI ICWSM, 2023
  10+ papers citing this work (Google scholar)
You Have Earned a Trophy: Characterize In-Game Achievements and Their Completions 
  Haewoon Kwak 
ACM WebSci, 2022
MAANG? MANGA? Characterizing Spontaneous Ideation Contest on Social Media 
  Kunihiro Miyazaki, Takayuki Uchiba, Haewoon Kwak, Jisun An 
IEEE BigData, 2022 (short)
Modeling Political Activism around Gun Debate via Social Media 
  Yelena Mejova, Jisun An, Gianmarco De Francisci Morales, Haewoon Kwak 
ACM Transactions on Social Computing, 2022
  5+ papers citing this work (Google scholar)
Storm the Capitol: Linking Offline Political Speech and Online Twitter Extra-Representational Participation on QAnon and the January 6 Insurrection 
  Claire Seungeun Lee, Juan Merizalde, John D. Colautti, Jisun An and Haewoon Kwak 
Frontiers in Sociology, 2022
  Press coverage - PsyPost
35+ papers citing this work (Google scholar)
Measuring 9 Emotions of News Posts from 8 News Organizations across 4 Social Media Platforms for 8 Months 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen 
ACM Transactions on Social Computing, 2022
  10+ papers citing this work (Google scholar)
Understanding Toxicity Triggers on Reddit in the Context of Singapore 
  Yun Yu Chong, Haewoon Kwak 
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
  Press coverage - AI Ethics Brief Newsletter by Montreal AI Ethics Institute
20+ papers citing this work (Google scholar)
Who Is Missing? Characterizing the Participation of Different Demographic Groups in a Korean Nationwide Daily Conversation Corpus 
  Haewoon Kwak, Jisun An, Kunwoo Park 
Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM), 2022 (short)
What really matters?: characterising and predicting user engagement of news postings using multiple platforms, sentiments and topics 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen 
Behaviour & Information Technology, 2022
  10+ papers citing this work (Google scholar)
Predicting Anti-Asian Hateful Users on Twitter during COVID-19 
  Jisun An, Haewoon Kwak, Claire Seungeun Lee, Bogang Jun, Yong-Yeol Ahn 
Findings of the Association for Computational Linguistics EMNLP 2021
  Code repo (github)
30+ papers citing this work (Google scholar)
Precision Public Health Campaign: Delivering Persuasive Messages to Relevant Segments Through Targeted Advertisements on Social Media 
  Jisun An, Haewoon Kwak, Hanya M Qureshi, Ingmar Weber 
JMIR Form Res 2021;5(9):e22313, 2021
  25+ papers citing this work (Google scholar)
FrameAxis: characterizing microframe bias and intensity with word embedding 
  Haewoon Kwak, Jisun An, Elise Jing, Yong-Yeol Ahn 
PeerJ Computer Science 7:e644, 2021
  Code repo (github)
35+ papers citing this work (Google scholar)
Populist Supporters on Reddit: A Comparison of Content and Behavioral Patterns Within Publics of Supporters of Donald Trump and Hillary Clinton 
  Andreas Jungherr, Oliver Posegga, Jisun An 
Social Science Computer Review. March 2021.
  15+ papers citing this work (Google scholar)
How-to Present News on Social Media: A Causal Analysis of Editing News Headlines for Boosting User Engagement 
  Kunwoo Park, Haewoon Kwak, Jisun An, Sanjay Chawla 
Proceedings of the 15th International AAAI Conference on Web and Social Media (ICWSM), 2021
  15+ papers citing this work (Google scholar)
A Systematic Media Frame Analysis of 1.5 Million New York Times Articles from 2000 to 2017 
  Haewoon Kwak, Jisun An, Yong-Yeol Ahn 
Proceedings of the 12th ACM Conference on Web Science (WebSci), 2020
  40+ papers citing this work (Google scholar)
Identifying and Characterizing Alternative News Media on Facebook 
  Samuel S Guimarães, Julio CS Reis, Lucas Lima, Filipe N Ribeiro, Marisa Vasconcelos, Jisun An, Haewoon Kwak, Fabrício Benevenuto 
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context 
  Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov 
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) (2020)
  55+ papers citing this work (Google scholar)
Empirical Evaluation of Three Common Assumptions in Building Political Media Bias Datasets 
  Soumen Ganguly, Juhi Kulshrestha, Jisun An, Haewoon Kwak 
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
  25+ papers citing this work (Google scholar)
“Trust Me, I Have a Ph.D.”: A Propensity Score Analysis on the Halo Effect of Disclosing One’s Offline Social Status in Online Communities 
  Kunwoo Park, Haewoon Kwak, Hyunho Song, Meeyoung Cha 
Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020
  15+ papers citing this work (Google scholar)
Are These Comments Triggering? Predicting Triggers of Toxicity in Online Discussions 
  Hind Almerekhi, Haewoon Kwak, Bernard Jim Jansen, Joni Salminen (short) 
Proceedings of The Web Conference (WWW),  2020
  50+ papers citing this work (Google scholar)
Going beyond accuracy: estimating homophily in social networks using predictions 
  George Berry, Antonio Sirianni, Ingmar Weber, Jisun An, Michael Macy (preprint) 
arXiv preprint arXiv:2001.11171, 2020
Persona Perception Scale: Development and Exploratory Validation of an Instrument for Evaluating Individuals’ Perceptions of Personas 
  Joni Salminen, Joao M. Santos, Haewoon Kwak, Jisun An, Soon-gyo Jung, Bernard J. Jansen 
International Journal of Human-Computer Studies, 2020
Tanbih: Get To Know What You Are Reading  
  Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov (demo) 
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
Gender and Racial Diversity in Commercial Brands’ Advertising Images on Social Media 
  Jisun An, Haewoon Kwak 
Proceedings of Social Informatics (SocInfo), 2019
  Best Paper Award
20+ papers citing this work (Google scholar)
Stylistic Features Usage: Similarities and Differences Using Multiple Social Networks 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen 
Proceedings of Social Informatics (SocInfo), 2019
Predicting Audience Engagement Across Social Media Platforms in the News Domain 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen 
Proceedings of Social Informatics (SocInfo), 2019
Detecting Toxicity Triggers in Online Discussions 
  Hind Almerekhi, Haewoon Kwak, Bernard Jim Jansen, Joni Salminen (poster) 
Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT),  2019
  40+ papers citing this work (Google scholar)
Political Discussions in Homogeneous and Cross-Cutting Communication Spaces 
  Jisun An, Haewoon Kwak, Oliver Posegga, Andreas Jungherr 
Proceedings of the 13th International AAAI Conference on Web and Social Media (ICWSM), 2019
  65+ papers citing this work (Google scholar)
View, Like, Comment, Post: Analyzing User Engagement by Topic at 4 Levels across 5 Social Media Platforms for 53 News Organizations 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen 
Proceedings of the 13th International AAAI Conference on Web and Social Media (ICWSM), 2019
  85+ papers citing this work (Google scholar)
The Challenges of Creating Engaging Content: Results from a Focus Group Study of a Popular News Media Organization 
  Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen (Extended Abstracts) 
Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019
Social media mining for journalism 
  Arkaitz Zubiaga, Bahareh Heravi, Jisun An, Haewoon Kwak (Guest editorial) 
Online Information Review, 2019
  10+ papers citing this work (Google scholar)
Discursive Power in Contemporary Media Systems: A Comparative Framework 
  Andreas Jungherr, Oliver Posegga, Jisun An 
The International Journal of Press/Politics, 24(4), 2019
  130+ papers citing this work (Google scholar)
Reports of the Workshops Held at the 2018 International AAAI Conference on Web and Social Media 
  Jisun An, Rumi Chunara, David J. Crandall, Darian Frajberg, Megan French, Bernard J. Jansen, Juhi Kulshrestha, Yelena Mejova, Daniel M. Romero, Joni Salminen, Amit Sharma, Amit Sheth, Chenhao Tan, Samuel Hardman Taylor, Sanjaya Wijeratne 
AI Magazine, 2018
Imaginary People Representing Real Numbers: Generating Personas from Online Social Media Data 
  Jisun An, Haewoon Kwak, Soon-gyo Jung, Joni Salminen, M. Ahmad, Bernard J. Jansen 
ACM Transactions on the Web, 12(4), 2018
  110+ papers citing this work (Google scholar)
Assessing the Accuracy of Four Popular Face Recognition Tools for Inferring Gender, Age, and Race 
  Soon-gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard Jim Jansen 
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018  (short)
  65+ papers citing this work (Google scholar)
Anatomy of Online Hate: Developing a Taxonomy and Machine Learning Models for Identifying and Classifying Hate in Online News Media 
  Joni Salminen, Hind Almerekhi, Milica Milenković, Soon-gyo Jung, Jisun An, Haewoon Kwak, Bernard J. Jansen 
Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018
  140+ papers citing this work (Google scholar)
Identifying Regional Trends in Avatar Customization 
  Peter Mawhorter, Sercan Şengün, Haewoon Kwak, D. Fox Harrell 
IEEE Transactions on Games, 10(2), 2018
SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment 
  Jisun An, Haewoon Kwak, Yong-Yeol Ahn 
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), 2018
  50+ papers citing this work (Google scholar)
What We Read, What We Search: Media Attention and Public Attention among 193 Countries 
  Haewoon Kwak, Jisun An, Joni Salminen, Soon-Gyo Jung, Bernard J. Jansen 
Proceedings of the 2018 World Wide Web Conference (WWW), 2018
  20+ papers citing this work (Google scholar)
What is Gab? A Bastion of Free Speech or an Alt-Right Echo Chamber? 
  Savvas Zannettou, Barry Bradlyn, Emiliano De Cristofaro, Haewoon Kwak, Michael Sirivianos, Gianluca Stringhini, Jeremy Blackburn 
Companion Proceedings of The Web Conference (WWW), 2018
  Press coverage - New Scientist and Vice
280+ papers citing this work (Google scholar)
Fixation and Confusion: Investigating Eye-tracking Participants’ Exposure to Information in Personas 
  Joni Salminen, Bernard J. Jansen, Jisun An, Soon-Gyo Jung, Lene Nielsen, Haewoon Kwak 
Proceedings of the 2018 Conference on Human Information Interaction & Retrieval (CHIIR), 2018
“Is More Better?”: Impact of Multiple Photos on Perception of Persona Profiles 
  Joni Salminen, Lene Nielsen, Soon-Gyo Jung, Jisun An, Haewoon Kwak, Bernard J. Jansen 
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI), 2018
Reports of the Workshops Held at the 2017 International AAAI Conference on Web and Social Media 
  Jisun An, Giovanni Luca Ciampaglia, Nir Grinberg, Kenneth Joseph, Alexios Mantzarlis, Gregory Maus, Filippo Menczer, Nicholas Proferes, Brooke Foucault Welles 
AI Magazine, 2017
Convergence of Media Attention Across 129 Countries 
  Jisun An, Hassan Aldarbesti,  Haewoon Kwak 
Proceedings of Social Informatics (SocInfo), 2017 (short)
Multidimensional Analysis of the News Consumption of Different Demographic Groups on a Nationwide Scale 
  Jisun An, Haewoon Kwak 
Proceedings of Social Informatics (SocInfo), 2017
Multiplex Media Attention and Disregard Network among 129 Countries 
  Haewoon Kwak, Jisun An 
Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2017
Demographics of News Sharing in the US Twittersphere 
  Julio Reis, Haewoon Kwak, Jisun An, Johnnatan Messias, Fabrício Benevenuto 
Proceedings of the 28th ACM Conference on Hypertext and Social Media (HT), 2017
  35+ papers citing this work (Google scholar)
Data-driven Approach to Measuring the Level of Press Freedom Using Media Attention Diversity from Unfiltered News 
  Jisun An, Haewoon Kwak 
Proceedings of the ICWSM Workshop on NEws and publiC Opinion (NECO), 2017
  Picked as The Best of the Physics arXiv (week ending April 15, 2017) in MIT Technology Review
What Gets Media Attention and How Media Attention Evolves Over Time - Large-scale Empirical Evidence from 196 Countries 
  Jisun An, Haewoon Kwak (short) 
Proceedings of the 11th International AAAI Conference on Web and Social Media (ICWSM), 2017
Persona Generation from Aggregated Social Media Data 
  Soon-Gyo Jung, Jisun An, Haewoon Kwak, Moeed Ahmad, Lene Nielsen, Bernard J. Jansen (Extended Abstract) 
Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI), 2017
I Would Not Plant Apple Trees If the World Will Be Wiped: Analyzing Hundreds of Millions of Behavioral Records of Players During an MMORPG Beta Test 
  Ah Reum Kang, Jeremy Blackburn, Haewoon Kwak, Huy Kang Kim 
Proceedings of the 26th International Conference on World Wide Web (WWW) Companion, 2017
  Press coverage - New Scientist, IFL Science, PC Gamer, Massively OK, El Confidencial, Joongang Ilbo, and so on.
10+ papers citing this work (Google scholar)
Achievement and Friends: Key Factors of Player Retention Vary Across Player Levels in Online Multiplayer Games 
  Kunwoo Park, Meeyoung Cha, Haewoon Kwak, Kuan-Ta Chen 
Proceedings of the 26th International Conference on World Wide Web (WWW) Companion, 2017
  35+ papers citing this work (Google scholar)
Culturally-Grounded Analysis of Everyday Creativity in Social Media: A Case Study in Qatari Context 
  D. Fox Harrell, Sarah Vieweg, Haewoon Kwak, Chong-U Lim, Sercan Sengun, Ali Jahanian, Pablo Ortiz 
Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition (C&C), 2017
Who Are Your Users? Comparing Media Professionals’ Preconception of Users to Data-Driven Personas 
  Lene Nielsen, Soon-Gyo Jung, Jisun An, Joni Salminen, Haewoon Kwak, Bernard J. Jansen 
Proceedings of the 29th Australian Conference on Computer-Human Interaction (OZCHI), 2017
Generating Cultural Personas from Social Data: A Perspective of Middle Eastern Users 
  J. Salminen, S. Sengün, H. Kwak, B. Jansen, J. An, S. Jung, S. Vieweg, D. F. Harrell 
Proceedings of the 5th International Conference on Future Internet of Things and Cloud Workshops, 2017
Personas for Content Creators via Decomposed Aggregate Audience Statistics 
  Jisun An, Haewoon Kwak, Bernard J. Jansen (short) 
Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2017
High correlation of Middle East respiratory syndrome spread with Google search and Twitter trends in Korea 
  Soo-Yong Shin, Dong-Woo Seo, Jisun An, Haewoon Kwak, Sung-Han Kim, Jin Gwack, Min-Woo Jo 
Scientific Reports 6, Article number 32920 (2016)
  130+ papers citing this work (Google scholar)
Multidimensional Analysis of Gender and Age Differences in News Consumption 
  Jisun An, Haewoon Kwak 
Computation+Journalism (C+J) Symposium (2016)
Revealing the Hidden Patterns of News Photos: Analysis of Millions of News Photos Using GDELT and Deep Learning-based Vision APIs 
  Haewoon Kwak, Jisun An 
ICWSM Workshop on NEws and publiC Opinion (NECO), 2016
  Picked as The Best of the Physics arXiv (week ending March 26, 2016) in MIT Technology Review
20+ papers citing this work (Google scholar)
Two Tales of the World: Comparison of Widely Used World News Datasets: GDELT and EventRegistry 
  Haewoon Kwak, Jisun An 
Proceedings of the 10th International Conference on Web and Social Media (ICWSM), 2016 (short)
  30+ papers citing this work (Google scholar)
Are You Charlie or Ahmed? Cultural Pluralism in Charlie Hebdo Response on Twitter 
  Jisun An, Haewoon Kwak, Yelena Mejova, Sonia Alonso Saenz De Oger, Braulio Gomez Fortes 
Proceedings of the 10th International Conference on Web and Social Media (ICWSM), 2016
  45+ papers citing this work (Google scholar)
#greysanatomy vs. #yankees: Demographics and Hashtag Use on Twitter. 
  Jisun An, Ingmar Weber (short) 
Proceedings of the 10th International Conference on Web and Social Media (ICWSM), 2016
Whom should we sense in ‘social sensing’ - analyzing which users work best for social media now-casting
  Jisun An, Ingmar Weber 
EPJ Data Science, 4, Article number 22, 2015
Consumers and Suppliers: Attention asymmetries. A Case Study of Aljazeera’s News Coverage and Comments 
  Sofiane Abbar, Jisun An, Haewoon Kwak, Yacine Messaoui, Javier Borge-Holthoefer 
Computation+Journalism (C+J) Symposium, 2015
Breaking the News: First Impressions Matter on Online News 
  Julio Reis, Fabrício Benevenuto, Pedro Olmo, Raquel Prates, Haewoon Kwak, Jisun An
Proceedings of the 9th International Conference on Web and Social Media (ICWSM), 2015
  Picked as Other Interesting arXiv Papers (Week ending April 11, 2015) in MIT Technology Review, and O Globo
200+ papers citing this work (Google scholar)
Exploring Cyberbullying and Other Toxic Behavior in Team Competition Online Games 
  Haewoon Kwak, Jeremy Blackburn, Seungyeop Han 
Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI), 2015
  310+ papers citing this work (Google scholar)
From Cells to Streets: Estimating Mobile Paths with Cellular-Side Data 
  Ilias Leontiadis, Antonio Lima, Haewoon Kwak, Rade Stanojevic, David Wetherall, Konstantina Papagiannaki 
Proceedings of the 10th ACM International Conference on emerging Networking Experiments and Technologies (CoNEXT), 2014
Understanding News Geography and Major Determinants of Global News Coverage of Disasters 
  Haewoon Kwak, Jisun An (extension of SocInfo’14) 
Computation+Journalism (C+J) Symposium, 2014
  25+ papers citing this work (Google scholar)
A First Look at Global News Coverage of Disasters By Using the GDELT Dataset 
  Haewoon Kwak, Jisun An 
Proceedings of Social Informatics, 2014
  Press coverage - MIT Technology Review, ACM TechNews
60+ papers citing this work (Google scholar)
Linguistic Analysis of Toxic Behavior in an Online Video Game 
  Haewoon Kwak, Jeremy Blackburn 
SocInfo Workshop on Exploration on Games and Gamers (EGG), 2014
  110+ papers citing this work (Google scholar)
Searching for a Unique Style in Soccer 
  Laszlo Gyarmati, Haewoon Kwak, Pablo Rodriguez 
KDD Workshop on Large-Scale Sports Analytics, 2014
  Press coverage - BBC, MIT Technology Review, The Times, The Economist, Slate, Pacific Standard, and others
110+ papers citing this work (Google scholar)
Didn’t You See My Message? Predicting Attentiveness to Mobile Instant Messages 
  Martin Pielot, Rodrigo de Oliveira, Haewoon Kwak, Nuria Oliver 
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), 2014
  240+ papers citing this work (Google scholar)
STFU NOOB! Predicting Crowdsourced Decisions on Toxic Behavior in Online Games 
  Jeremy Blackburn, Haewoon Kwak 
Proceedings of the 23rd international conference on World wide web (WWW), 2014
  Press coverage - Nature, Scientific American, Chosun Ilbo
190+ papers citing this work (Google scholar)
Has Much Potential but Biased: Exploring the Scholarly Landscape in Twitter 
  Haewoon Kwak, Jonggun Lee (poster) 
Proceedings of the 23rd International Conference on World Wide Web Companion, 2014
Sharing political news: the balancing act of intimacy and socialization in selective exposure 
  Jisun An, Daniele Quercia, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft 
EPJ Data Science volume 3, Article number 12, 2014
Recommending investors for crowdfunding projects 
  Jisun An, Daniele Quercia, Jon Crowcroft 
Proceedings of the 23rd international conference on World wide web (WWW), 2014
  Press coverage - FastCompany
100+ papers citing this work (Google scholar)
Partisan Sharing: Facebook Evidence and Societal Consequences 
  Jisun An, Daniele Quercia, Jon Crowcroft 
Proceedings of the Second ACM Conference on Online Social Networks (COSN), 2014
  110+ papers citing this work (Google scholar)
Tower of Babel: A Crowdsourcing Game Building Sentiment Lexicons for Resource-scarce Languages 
  Yoonsung Hong, Haewoon Kwak, Youngmin Baek, Sue Moon 
WWW Workshop on Multidisciplinary Approaches to Big Social Data Analysis, 2013
  25+ papers citing this work (Google scholar)
Structures of Broken Ties: Exploring Unfollow Behavior on Twitter 
  Bo Xu, Yun Huang, Haewoon Kwak, Noshir S. Contractor 
Proceedings of the 16th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW), 2013
  70+ papers citing this work (Google scholar)
Why Individuals Seek Diverse Opinions (or Why They Don’t) 
  Jisun An, Daniele Quercia, Jon Crowcroft 
Proceedings of the 5th Annual ACM Web Science Conference (WebSci), 2013
Why Do I Retweet It? An Information Propagation Model for Microblogs 
  Fabio Pezzoni, Jisun An, Andrea Passarella, Jon Crowcroft, Marco Conti 
Proceedings of the 5th International Conference on Social Informatics (SocInfo), 2013
Traditional Media Seen from Social Media 
  Jisun An, Daniele Quercia, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft 
Proceedings of the 5th Annual ACM Web Science Conference (WebSci), 2013
Fragmented Social Media: A Look into Selective Exposure to Political News 
  Jisun An, Daniele Quercia, Jon Crowcroft (poster) 
Proceedings of the 22nd International Conference on World Wide Web (WWW) Companion, 2013
More of a Receiver than a Giver: Why Do People Unfollow in Twitter? 
  Haewoon Kwak, Sue Moon, Wonjae Lee (4 page poster) 
Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM), 2012
  50+ papers citing this work (Google scholar)
Visualizing Media Bias through Twitter 
  Jisun An, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft, Daniele Quercia 
ICWSM Workshop on the Potential of Social Media Tools and Data for Journalists, 2012
Consistent Community Identification in Complex Networks 
  Haewoon Kwak, Sue Moon, Young-Ho Eom, Yoonchan Choi, Hawoong Jeong 
Journal of the Korean Physical Society, Vol. 59, No. 5, November 2011.
Fragile Online Relationship: a First Look at Unfollow Dynamics in Twitter 
  Haewoon Kwak, Hyunwoo Chun, Sue Moon 
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), 2011.
  Press coverage - Kyunghyang Shinmun
150+ papers citing this work (Google scholar)
Media Landscape in Twitter: A World of New Conventions and Political Diversity 
  Jisun An, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft 
Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM), 2011.
  200+ papers citing this work (Google scholar)
What is Twitter, a Social Network or a News Media? 
  Haewoon Kwak, Changhyun Lee, Hosung Park, Sue Moon 
Proceedings of the 19th international conference on World wide web (WWW), 2010.
  Press coverage - Mashable Op-Ed, ReadWrite, The Guardian, PC News, Chosun Ilbo, DongA Ilbo
10k+ papers citing this work (Google scholar)
Finding Influentials based on the Temporal Order of Information Adoption in Twitter 
  Changhyun Lee, Haewoon Kwak, Hosung Park, Sue Moon (poster) 
Proceedings of the 19th international conference on World wide web (WWW), 2010.
  200+ papers citing this work (Google scholar)
Understanding Topological Mesoscale Features in Community Mining 
  Sue Moon, Jinyoung You, Haewoon Kwak, Daniel Kim, and Hawoong Jeong (invited paper) 
Proceedings of the Second International Conference on COMmunication Systems and NETworks (COMSNETS), 2010.
Analyzing the Video Popularity Characteristics of Large-Scale User Generated Content Systems 
  Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, and Sue Moon 
IEEE/ACM Transactions on Networking, Vol. 17, Issue 5, 2009
  600+ papers citing this work (Google scholar)
Mining Communities in Networks: a Solution for Consistency and Its Evaluation 
  Haewoon Kwak, Yoonchan Choi, Young-Ho Eom, Hawoong Jeong, Sue Moon 
Proceedings of the 9th ACM SIGCOMM conference on Internet measurement (IMC), 2009
  90+ papers citing this work (Google scholar)
The Wisdom of the Few: A Collaborative Filtering Approach based on Expert Opinions from the Web  
  Xavier Amatriain, Neal Lathia, Josep M. Pujol, Haewoon Kwak, Nuria Oliver
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR), 2009
  150+ papers citing this work (Google scholar)
Connecting Users with Similar Interests Across Multiple Web Services 
  Haewoon Kwak, Hwa-Yong Shin, Jong-Il Yoon, Sue Moon (poster) 
Proceedings of the 3rd International AAAI Conference on Weblogs and Social Media (ICWSM), 2009
Comparison of Online Social Relations in Volume vs Interaction: A Case Study of Cyworld 
  Hyunwoo Chun, Haewoon Kwak, Young-Ho Eom, Yong-Yeol Ahn, Sue Moon, and Hawoong Jeong 
Proceedings of the 8th ACM SIGCOMM conference on Internet measurement (IMC), 2008
  250+ papers citing this work (Google scholar)
I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System 
  Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, Sue Moon 
Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (IMC), 2007
  Best paper award, The IMC Test of Time Award
2,200+ papers citing this work (Google scholar)
Analysis of topological characteristics of huge online social networking services 
  Yong-Yeol Ahn, Seungyeop Han, Haewoon Kwak, Sue Moon, Hawoong Jeong 
Proceedings of the 16th international conference on World Wide Web (WWW), 2007
  1,400+ papers citing this work (Google scholar)