Doing cool things with data

Introduction

NLP is going to be the most transformational tech of this decade, and Transformer models are fuelling its advancement. In this blog, we look at an application of Transformers to multilingual data. Many businesses are now global in their reach and collect data in many languages. NLP is also building strong capabilities to deal with data in multiple languages: Transformer models can now be trained on many languages and seamlessly ingest text in them. This is amazing!

In this blog, we use a pretrained multilingual XLM-RoBERTa model and fine-tune it on a downstream classification task. XLM-RoBERTa…


Doing cool things with data

Introduction

Hyperpartisan news is news that is extremely biased towards one political party. This could be extreme left or extreme right. Unfortunately, there is a lot of political polarization in the US right now. People only like policies and views shared by members of their group. The recent riot at the U.S. Capitol exposes this growing gap between liberal and conservative voters. This political divide has been growing over time. Check out this article from Pew Research on striking findings for 2020. Their research suggests that Biden and Trump voters disagree fundamentally on basic things like core American values…


Doing cool things with data

Introduction

Autonomous driving is set to revolutionize travel in the coming decade. Autonomous driving applications are currently being tested for a variety of use cases, ranging from passenger cars and robot taxis to automated commercial delivery trucks, smart forklifts and even automated tractors for farming.

Autonomous driving needs a computer vision perception module to understand and navigate the environment. The role of this perception module, among other things, is to:

  • Detect Lane Lines
  • Detect other objects — vehicles, humans, animals in the environment
  • Track detected objects
  • Predict their likely motion

A good perception system should be able to do this in real-time…


Doing cool things with data

Introduction

In this second blog focused on the US 2020 election, we train a GPT-2 model on Donald Trump's speeches. With minimal training, the GPT-2 model successfully replicates his style. See the sample Donald Trump speech generated by GPT-2 below. It does sound like him!

As president, I kept my promise. Nobody else did. We set records. We set records. Thank you very much. Great job. We set records. And by the way, there's two really greats, right? There's Sonny Perdue and there's Barack Hussein Obama. The two greats. Really great. One has become the most powerful…


Doing cool things with data

Introduction

The 2020 elections in the US are around the corner. Fake News published on social media is a HUGE problem around election time. While some Fake News is produced purposefully to skew election results or to make a quick buck through advertising, false information can also be shared by misinformed individuals in their social media posts. These posts can quickly go viral. Most people believe posts liked by many others must be true.

Detecting Fake News is not an easy task for a machine learning model. Many of these stories are very well written. …


Doing cool things with data!

Introduction

Question Answering is a very common task in NLP. The SQuAD data set is a popular data set for the question answering problem. Typically for question answering, the model is presented with a question and a context, with the goal of finding the answer (if it exists) in this context. For SQuAD the context is typically 1–2 paragraphs of text from Wikipedia. For many practical applications, this approach of providing a concise context can be very limiting. As an example, suppose you have a library of documents and want a particular question answered. The context here can be thousands of documents. …
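
To make the question-plus-context setup concrete, here is a minimal Python sketch of one extractive question answering example. The `QAExample` class, its field names and the sample text are assumptions made for illustration; they are not taken from the blog's code or from any particular library.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    """One extractive question answering example (hypothetical structure)."""
    question: str
    context: str
    answer_text: str   # empty string when the context holds no answer
    answer_start: int  # character offset into context, -1 when unanswerable


def extract_answer(example: QAExample) -> str:
    """Return the answer span sliced out of the context, or '' if none exists."""
    if example.answer_start < 0:
        return ""
    end = example.answer_start + len(example.answer_text)
    return example.context[example.answer_start:end]


# Illustrative context, standing in for a short Wikipedia paragraph.
context = (
    "The Eiffel Tower is a wrought-iron lattice tower in Paris. "
    "It was completed in 1889."
)
example = QAExample(
    question="When was the Eiffel Tower completed?",
    context=context,
    answer_text="1889",
    answer_start=context.index("1889"),
)
print(extract_answer(example))
```

The answer is stored as a character offset into the context, which is why a short context is convenient and why a context of thousands of documents needs a different approach.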


Doing cool things with data!

Introduction

I am amazed by the power of the T5 transformer model! T5, which stands for Text-to-Text Transfer Transformer, makes it easy to fine-tune a transformer model on any text-to-text task. Any NLP task, even if it is a classification task, can be framed as an input text to output text problem.
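
As a rough illustration of this framing, the Python sketch below turns a summarization example and a classification example into plain (input text, output text) pairs. The helper `to_text_pair`, the prefix strings and the sample sentences are assumptions for illustration only, not the blog's actual preprocessing.

```python
from typing import Tuple


def to_text_pair(task_prefix: str, source: str, target: str) -> Tuple[str, str]:
    """Frame one example as (input text, output text) for a text-to-text model."""
    return f"{task_prefix}: {source.strip()}", target.strip()


# Summarization: the target is the summary itself.
summ_input, summ_target = to_text_pair(
    "summarize",
    "The city council met on Monday and approved the new park budget.",
    "Council approves park budget.",
)

# Classification: the class label is simply written out as text.
cls_input, cls_target = to_text_pair(
    "classify sentiment",
    "I loved this movie!",
    "positive",
)

print(summ_input, "->", summ_target)
print(cls_input, "->", cls_target)
```

Because both tasks end up as string pairs, the same training loop can in principle be reused for either one.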

In this blog, I show how you can fine-tune this model on any data set you have. In particular, I demo how this can be done on summarization data sets. I have personally tested this on the CNN/Daily Mail and WikiHow data sets. …


Doing cool things with data!

Introduction

The deep learning community is abuzz with YOLOv5. This blog recently introduced YOLOv5 as "State-of-the-Art Object Detection at 140 FPS". This immediately generated significant discussion across Hacker News, Reddit and even GitHub, but not for its inference speed. Two prominent issues were: should the model be called YOLO, and are the speed benchmarking results accurate and reproducible? If you are interested in Roboflow’s response then you can find it here.

All the controversy aside, YOLOv5 looked like a promising model. So I have compared it to one of the best two-stage detectors, Faster R-CNN. To…


Doing cool things with data!

Introduction

Knowledge graphs are gaining popularity as a data structure for storing unstructured data. In this blog, we show how key elements of resumes can be stored and visualized as knowledge graphs. We then walk through the knowledge graph of resumes to answer questions. Our code is available on this GitHub link.

A Knowledge Graph is a type of graph that enables us to model knowledge of a particular domain by organizing it in an ontology through data interlinking. Machine learning can then be applied to a knowledge graph to get insights. …
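
A minimal Python sketch of the idea, assuming a knowledge graph stored as (subject, relation, object) triples. The names, relations and helper functions here are hypothetical and invented for illustration; the linked code may organize things quite differently.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical resume facts, invented for illustration only.
triples: List[Triple] = [
    ("Alice", "has_skill", "Python"),
    ("Alice", "worked_at", "Acme Corp"),
    ("Bob", "has_skill", "Python"),
    ("Bob", "has_skill", "SQL"),
]


def build_graph(facts: List[Triple]) -> Dict[str, Dict[str, Set[str]]]:
    """Index triples as subject -> relation -> set of objects."""
    graph: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in facts:
        graph[subject][relation].add(obj)
    return graph


def subjects_with(graph: Dict[str, Dict[str, Set[str]]], relation: str, obj: str) -> List[str]:
    """Answer 'which subjects have <relation> to <obj>?' by scanning the graph."""
    return sorted(s for s, rels in graph.items() if obj in rels.get(relation, set()))


graph = build_graph(triples)
print(subjects_with(graph, "has_skill", "Python"))
```

Storing facts as triples keeps the schema open: a new relation such as a degree or a certification is just another triple, with no table to alter.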


Doing cool things with data!

Introduction

Many cities in the US and Europe are reopening cautiously now. People have been instructed to follow social distancing rules as they venture out. But do people follow them? It can be important for cities to assess this and take action accordingly. If most people follow them, then more places can be opened safely. However, if there are many violations then it may be safer to close. This is exactly what happened at Miami Beach park. The park opened at the end of April but was closed within the week since too many people were…
