NLP - ML - RL - Data Science

Understanding the complexities of human language with algorithms

Adversarial Generation of Dialog with contextual information

Generative models such as Seq2Seq, HRED, and VHRED have typically been able to generate dialog responses over short contexts. Recently, researchers have been investigating whether Generative Adversarial Networks can enhance these models so that they generate more appropriate responses. Our work investigates the role of context and explores whether adversarial training can help generative models handle varied contexts.

  • Topic : NLP, Social Networks, Machine Learning
  • Paper : WIP
  • Research Team : Koustuv Sinha, Prasanna Parthasarathy and Dr. Joelle Pineau

Analyzing and predicting Interactions in Literature

In literature, social networks among characters are formed on the basis of their interactions with each other. Some characters interact directly, while others are connected through implied interactions. Our research aims to devise a novel method to detect these interactions linguistically and to analyze which features of the discourse are most prominent in identifying them.

  • Topic : NLP, Social Networks, Machine Learning
  • Paper : “On the unreasonable complexity of detecting social interactions in literature” - In review
  • Research Team : Koustuv Sinha, Dr. Derek Ruths and Dr. Andrew Piper
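
As a rough illustration of the kind of character network described above, the sketch below builds a naive co-occurrence graph in which two characters are treated as interacting whenever they are named in the same sentence. This is only a baseline assumption for illustration, not the method developed in this project; the function name `cooccurrence_network`, the sample sentences, and the character list are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_network(sentences, characters):
    """Count how often each pair of characters is named in the same sentence.

    Naive baseline (an assumption for illustration): a same-sentence mention
    is treated as an interaction. Implied interactions are not captured.
    """
    edges = Counter()
    for sentence in sentences:
        # Whole-word, case-sensitive match so "Ann" does not match "Anna".
        present = sorted(
            name for name in characters
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        )
        # Every unordered pair of characters in the sentence gets one count.
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Hypothetical toy input, not taken from any corpus used in the project.
sentences = [
    "Elizabeth spoke to Darcy at the ball.",
    "Darcy wrote a letter to Elizabeth.",
    "Jane waited for news from Bingley.",
    "Elizabeth thought about Jane.",
]
characters = ["Elizabeth", "Darcy", "Jane", "Bingley"]

network = cooccurrence_network(sentences, characters)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b}: {weight}")
```

The last toy sentence hints at why such a baseline falls short: Elizabeth merely thinking about Jane is counted as an interaction, even though no direct exchange takes place.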

Gtopics - Inferring human-oriented topics from data

When training a topic-based document classification system, how do we decide which topics to train on? Traditionally, researchers have used unsupervised topic models such as LDA to generate clusters and then tagged the meaningful topics by hand. Our work, however, aims to show that meaningful topics can be inferred from social media discourse itself, which provides an easy way to train a classifier directly on the topics inferred by our pipeline.

  • Topic : Topic Modelling, Information Extraction
  • Paper : WIP
  • Research Team : Koustuv Sinha, Derek Ruths and David Jurgens

Real Time Crime News Extraction

In the Indian context, crime data from the NCRB is a major source for researchers. However, NCRB data is at least two years old, so it is hard for researchers to properly understand current crime trends. Our work is to extract real-time crime data from online sources, such as newspapers and tweets, and to create an up-to-date dataset for researchers to work on.

  • Topic : Data Mining, Information Extraction
  • Paper : “Extraction, Identification & Classification of Crime Data from Real Time News Sources”
  • Research Team : Prof Saptarsi Goswami, Debasish Banik, Urmi Saha, Subhashree Bose