Research Directions

  • Mining across Sites

    Users are often active on multiple social media sites. To systematically study users, we need their information on all sites.

    To mine across social media sites, we focus on two specific problems. First, how does user behavior vary across sites (e.g., the difference between LinkedIn friends and Facebook friends)? In addition to designing new techniques, we investigate ways to scale and adapt traditional models that analyze user behavior on a single site so that they work across multiple sites. For recent results on this research question, see my papers in Information Fusion'16 and ICWSM'14, as well as this book chapter. Second, I study user behaviors that can only be observed across sites. One example is our study of user migration across sites.

  • Identifying Users across Sites

    Investigating ways to identify the same user across social media sites, which allows us to understand online users comprehensively.

    I investigate identifying the same user across multiple sites using link information (friendships) [TKDD'15] and content information [ICWSM'09, KDD'13]. User identification using link information is closely related to the graph isomorphism problem, which is not known to be either in P or NP-complete.
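
    To make the connection to graph isomorphism concrete, here is a minimal Python sketch. It is an illustration only and not the method of the cited papers: the function name find_user_mapping and the toy friendship lists are hypothetical, and the brute-force search over all permutations is feasible only for a handful of users. It assumes each site is given as a list of undirected friendship pairs and that every user has at least one friend.

```python
from itertools import permutations


def find_user_mapping(edges_a, edges_b):
    """Return a dict mapping each user on site A to a user on site B such that
    every friendship is preserved, or None if no such mapping exists.

    Brute force over all permutations: exponential time, toy inputs only.
    Users without any friendship are not represented in the edge lists.
    """
    nodes_a = sorted({u for edge in edges_a for u in edge})
    nodes_b = sorted({u for edge in edges_b for u in edge})
    source = {frozenset(edge) for edge in edges_a}
    target = {frozenset(edge) for edge in edges_b}
    # Different numbers of users or friendships rule out an isomorphism.
    if len(nodes_a) != len(nodes_b) or len(source) != len(target):
        return None
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges_a}
        if mapped == target:
            return mapping
    return None


# Hypothetical toy data: the same three-person chain observed on two sites.
site_a = [("alice", "bob"), ("bob", "carol")]
site_b = [("y", "x"), ("z", "y")]
mapping = find_user_mapping(site_a, site_b)
```

    In this toy case the user in the middle of the chain can only be matched to "y", while the two endpoints are interchangeable. Symmetric structures like this are one reason link information alone can leave identities ambiguous.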

  • Analyzing Human Behavior Using Online Traces

    Realistically model, predict, or mine human behavior and/or incentivize human activity through mechanism design.

    My research has investigated ways to realistically analyze human behavior online by exploiting the information redundancies that user behavior generates. The methodology has been used to identify sarcasm on Twitter and to identify users across sites, among other tasks. For more on the topic, see this article, this chapter, or our recent workshop on the topic. As a by-product, my research on human behavior modeling has had implications for information verification, privacy, and security.

  • Evaluation in Social Media Research

    With no face-to-face access to users on social media, how can we guarantee that the patterns we identify online represent the true intentions of online users?

    In data mining terms, ground truth is rarely available online. I recently started to investigate this problem and identified some ways to tackle it. For a succinct review of the topic, see my recent Communications of the ACM (CACM) paper on this issue.

  • Mining with Absolute Minimum Information

    What is the minimum information required to perform data mining tasks on social media?

    I have looked at how to use minimum information to identify users, detect malicious users, or recommend friends on social media sites with high accuracy. Because these methods use only minimum information, they scale easily to millions of users. Recently, I have been investigating the theoretical limits of using minimum information.

  • Theoretical and Empirical Limits of Privacy

    How much user privacy is violated by mining users' content?

    I have recently investigated the balance between privacy and mining user-generated content by connecting ideas from complexity theory (specifically Kolmogorov complexity), information theory, and statistical natural language processing. See this paper for some preliminary results.
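
    As background on the Kolmogorov complexity idea: the Kolmogorov complexity of a string is the length of the shortest program that outputs it. It is uncomputable, so in practice the compressed length of a text is often used as a computable upper-bound proxy. The sketch below shows only that generic proxy and is not the approach of the paper mentioned above; the helper names compressed_size and compression_ratio and the sample strings are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Bytes needed by zlib (maximum compression level) to encode the text.

    A computable upper-bound proxy for Kolmogorov complexity, not the true value.
    """
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_ratio(text: str) -> float:
    """Compressed size divided by the UTF-8 size; lower means more redundancy."""
    raw = len(text.encode("utf-8"))
    return compressed_size(text) / raw if raw else 0.0


# Hypothetical sample texts: one highly repetitive, one with more variety.
repetitive = "good morning everyone " * 40
varied = " ".join(str(i * i) for i in range(200))

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)
```

    Under this proxy, highly repetitive content compresses to a small fraction of its original size, while less predictable content does not, which gives a rough sense of how much information a piece of text carries.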

  • Online Crisis and Disaster Management

    How can we use online data to identify areas impacted by natural disasters and to assist the individuals affected?

    My research has focused on (1) online means to map areas impacted by natural disasters in real time [ICDM'15], (2) identifying relevant users who provide the most useful information during crises [HT'14], and (3) systematic approaches to crowdsourcing user-generated content during disasters [CMOT'12].

  • Information Propagation and Sentiment Analysis

    How do information and sentiment propagate in large-scale networks?

    Previous research has shown that a person's sentiment and mental state depend on those of friends and family. I have investigated how sentiment and information propagate in large-scale networks. See my preliminary results here.