Computational social science lies at the intersection of computational and data science techniques and the social sciences. In this workshop, we present work on topics closely related to society, namely segregation, political and social movements, and misinformation (rumors), all of which we have analyzed using data science techniques.
Luca Aiello holds a PhD in Computer Science from the University of Turin, Italy. He is currently an Associate Professor at the IT University of Copenhagen. Previously, he worked for 10 years as a Research Scientist in industry: at Yahoo Labs in Barcelona, and at Bell Labs in Cambridge (UK). He conducts research in Computational Social Science, an interdisciplinary field of study that uses Social Science theories to guide the solution to Data Science problems. He is currently working on text analysis techniques that, when applied to conversations, can help understand people's social behavior and psychological well-being. His work has been covered by hundreds of news articles published by news outlets worldwide, including Wired, WSJ, and BBC.
Social relationships are a key determinant of crucial societal outcomes, including diffusion of innovation, productivity, happiness, and life expectancy. To better attain such outcomes at scale, it is therefore paramount to have technologies that can effectively capture the type of social relationships from digital data. NLP researchers have tried to do so from conversational text, but have mostly focused on sentiment or topic mining, techniques that fall short on either conciseness or exhaustiveness.
We propose a theoretical model of 10 dimensions (colors) of social relationships that is backed by decades of research in the social sciences and that captures most of the common relationship types. We trained a deep-learning model to classify text along these ten dimensions, reaching an AUC of up to 0.98. By applying this tool to large-scale conversational data, we show that the combination of the predicted dimensions suggests both the types of relationships people entertain and the types of real-world communities they shape.
We believe that the ability to capture interpretable social dimensions from language using AI will help close the gap between the oversimplified social constructs that existing social network analysis methods can measure and the multifaceted understanding of social dynamics that has been developed by decades of theoretical research.
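For readers unfamiliar with the AUC figure quoted in the abstract above, the following is a minimal sketch of how the area under the ROC curve can be computed for one binary dimension. It uses the pairwise-ranking definition (the fraction of positive/negative pairs that the classifier orders correctly). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not the authors' data or evaluation code.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a correctly ordered pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = message expresses the dimension, 0 = it does not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a value of 0.98 means that almost every positive example is scored above almost every negative one.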
Have you ever wondered whether the MeToo movement is only about sexual harassment and assault, or whether there is more to it? In this session, we will investigate the hidden facets of the MeToo movement using a Twitter dataset. We will show that color and race, distrust of victims, and places of harassment such as the workplace, the home, and public places were among the topics discussed under #MeToo.
Luca Pappalardo is a full-time researcher at the Institute of Information Science and Technologies of the National Research Council of Italy (ISTI-CNR) in Pisa (since 2017) and a member of the KDD-Lab, a joint research initiative of the University of Pisa, the Italian National Research Council (CNR), and Scuola Normale Superiore of Pisa. Luca’s research focuses on data science, AI, computational social science, and their impact on society, with a particular focus on the (privacy-preserving) analysis of human mobility, the design of mechanistic and AI models for the prediction and generation of human mobility, and the impact of AI on the urban environment. Luca is also part of SoBigData.eu, the European H2020 Research Infrastructure “Big Data Analytics and Social Mining Ecosystem”, in which he is responsible for coordinating the research that is conducted within the infrastructure. Luca has been a visiting scientist at Barabasi Lab (Center for Complex Network Research) of Northeastern University, Boston, at the Central European University (CEU) in Budapest, Hungary, at the Pontifícia Universidade do Rio de Janeiro, Brazil, at the Universidad del Desarrollo (UDD) in Santiago de Chile and at INRIA-Saclay, France. In 2014, Luca received a grant from Google and the Italian National Statistics Bureau (ISTAT) for the most innovative ideas in using big data sources to study complex economic phenomena.
An intriguing open question is whether measurements derived from big data recording human activities can yield high-fidelity proxies of well-being. For example, peace is a principal dimension of well-being, and its measurement has drawn the attention of researchers, policymakers, and peacekeepers. In recent years, novel digital data streams have drastically changed research in this field.
In this talk, we show how to exploit information extracted from a digital database called the Global Database of Events, Language, and Tone (GDELT) to capture peace as measured by the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use explainable AI techniques to identify the most important variables that drive the predictions. This analysis highlights each country’s profile and provides explanations for the predictions, particularly for the errors and the events that drive them. We believe that digital data exploited by researchers, policymakers, and peacekeepers, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peace.
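As an illustration of how the most important variables behind a prediction can be identified, here is a minimal sketch of permutation importance, one common model-agnostic explainability technique. The abstract does not say which explainable AI method is used, so this choice is an assumption, and `toy_predict`, the feature values, and the targets are hypothetical stand-ins rather than GDELT or GPI data.

```python
import random


def mean_abs_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_features, repeats=20, seed=0):
    """For each feature, shuffle its column and record how much the error grows.
    A larger average increase means the model relies more on that feature."""
    rng = random.Random(seed)
    baseline = mean_abs_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = mean_abs_error(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / repeats)
    return importances


def toy_predict(row):
    """Hypothetical stand-in for a trained model: uses feature 0 only."""
    return 1.0 + 2.0 * row[0]


# Hypothetical monthly features per row: [share of conflict news, unrelated noise].
rows = [[0.1, 5.0], [0.4, 3.0], [0.7, 9.0], [0.2, 1.0], [0.9, 4.0], [0.5, 7.0]]
targets = [toy_predict(r) for r in rows]  # toy index values the model fits exactly

importances = permutation_importance(toy_predict, rows, targets, n_features=2)
print(importances)  # feature 0 gets a positive score, feature 1 gets exactly 0.0
```

Because the stand-in model ignores the second feature, shuffling it never changes the error, whereas shuffling the first feature does. The same procedure applies to any fitted model that exposes a prediction function.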
Online social media sites like Twitter make it quick and easy for users to transmit misinformation to a large audience. Misinformation has been shown to contribute to panic, anxiety, and financial loss. Did you know that there are various types of misinformation that are harmful to society? Here, we focus on rumors, one type of misinformation (other types include fake news and hoaxes). How do rumors affect us in real life? Have you ever considered the possibility that a friend you have on social media may be a rumor spreader?
If so, how would one recognize these users? What are the ways to identify them? Are machine learning techniques enough to identify them, or do we require more advanced techniques? During my talk, I look forward to discussing the answers to these questions as well as additional findings.
Language is one of the most important aspects of any culture. It is often considered a part of self-identification, especially in multilingual countries, where the population usually speaks two or more languages. But can people easily switch from one language to another depending on circumstances? Our paper investigates this phenomenon on an individual level using user behaviour data from online social media in Ukraine during the Euromaidan revolution in 2014.
How did we identify such language patterns? Did most Ukrainians switch to another language or stick to their predominant one? Was the switch temporary? Did we find out the reasons for these changes?
I look forward to sharing answers to these questions and other results during my talk.