SoDA Lab

Studying social phenomena
with large-scale data and AI

Led by Haewoon Kwak and Jisun An, we focus on human behavior on online platforms: its measurement, understanding, design, and the assessment of its implications. Every day we use our devices to read the news, watch videos, search for places to eat, chat with friends, and post on social media. Those electronic footprints make it possible to study both individual and collective behavior, including what people like and dislike, how they feel about different topics, and how they engage with one another. Understanding human behavior on these platforms has become essential.

We develop new computational methods and tools for understanding, predicting, and shaping human behavior online. A central challenge is the sheer diversity and complexity of online data. We work across many kinds of large-scale data, examining existing tools, surfacing their limits, and developing new measurements, machine-learning models, and linguistic methods that let us understand online behavior and address real-world problems.

But our goal is not only to solve problems. We also want to improve the online spaces themselves. We are interested in identifying obstacles to a trustworthy public space online, developing methodologies that make those obstacles transparent, building frameworks for real-time large-scale monitoring, and ultimately helping the online public square become more credible.

We are based at the Luddy School of Informatics, Computing, and Engineering at Indiana University Bloomington, and are a member of the Center for Complex Networks and Systems Research (CNetS).

Featured work


Byunghwee Lee, Rachith Aiyappa, Yong-Yeol Ahn, Haewoon Kwak, Jisun An

Nature Human Behaviour, 2025 · 16 cites

How do beliefs interconnect, and what drives a person to adopt new ones? We fine-tune large language models on online debate data to map thousands of beliefs into a semantic space where proximity reflects coherence. Position in that space predicts which beliefs an individual is likely to adopt next, and the distance between existing and new beliefs quantifies cognitive dissonance.

Kunihiro Miyazaki, Taichi Murayama, Takayuki Uchiba, Jisun An, Haewoon Kwak

EPJ Data Science, 2024 · 80 cites

What does the public actually think of generative AI? We analyze 3M tweets from 2019 to 2023 and find broad interest across occupations, not just tech. Sentiment is generally positive and tracks exposure, with one exception: illustrators are notably negative, reflecting concerns over training-data ethics.

Preslav Nakov, Jisun An, Haewoon Kwak, Muhammad Arslan Manzoor, Zain Muhammad Mujahid, Husrev Taha Sencar

ACL Findings, 2024 · 31 cites

Fact-checking every story is impossible. Instead, this survey reviews how to profile entire news outlets, so any article can be flagged the moment it appears. We argue that factuality and political bias should be modeled jointly rather than separately, and survey the state of the art across text, social context, and beyond.

Fan Huang, Haewoon Kwak, Kunwoo Park, Jisun An

LREC-COLING, 2024 · 26 cites

As AI explains its own decisions in natural language, who should grade those explanations? We compare ChatGPT and human judgments on informativeness and clarity across binary, ternary, and 7-point scales. ChatGPT aligns well with humans on coarse-grained ratings, and paired comparison and dynamic prompting further improve alignment.