NLP Text Classification

Uses a Kaggle Playground dataset to implement machine learning (ML) and deep learning (DL) techniques and compare the results.

Purpose

Natural language processing has become widely popular. With large amounts of text available (in emails, web pages, and SMS messages), it is important to extract valuable information from textual data. An assortment of machine learning techniques is designed to accomplish this task. Given current advances in deep learning, we felt it would be interesting to compare traditional and deep learning techniques. We picked a Kaggle Playground dataset for text classification and implemented both types of algorithms for comparison.

Problem

In today’s world, websites have to deal with toxic and divisive content. This is especially true of major websites like Quora, which handle heavy traffic and exist to give people a platform for asking and answering questions. A key challenge is to weed out insincere questions: those founded upon false premises, or those that intend to make a statement rather than look for helpful answers.

Details

A breakdown of the code and a walkthrough of the methodology are available in the following blog.

Code:

quora-nlp-data-exploration: Contains the data exploration code and some visualizations.
quora-nlp-text-classification-v1: Contains all the steps involved, from text cleaning to feature generation and model selection/evaluation. The details are described in the following link.
quora-nlp-text-classification-DL: Contains a deep learning approach to the text classification problem. More details are available in the following link.
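
For readers who want a feel for the traditional workflow before opening the notebooks, the steps listed for quora-nlp-text-classification-v1 (text cleaning, feature generation, model evaluation) can be sketched roughly as below. This is a minimal illustration and not the notebooks' actual code: the bag-of-words features, the naive Bayes classifier, the helper names (clean_text, NaiveBayes, accuracy) and the toy questions are all assumptions chosen to keep the example dependency-free.

```python
import math
import re
from collections import Counter


def clean_text(text):
    """Text cleaning: lowercase, replace punctuation with spaces, split into tokens."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words counts with add-one smoothing.

    Hypothetical stand-in for the models compared in the notebooks.
    """

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        # Feature generation: per-class word counts (a simple bag of words).
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(clean_text(doc))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, doc):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in clean_text(doc) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / total_docs)
            for t in tokens:
                smoothed = self.word_counts[c][t] + 1
                score += math.log(smoothed / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Model evaluation: fraction of documents classified correctly."""
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy, made-up questions (1 = insincere, 0 = sincere); not taken from the dataset.
train_questions = [
    "How do I learn Python for data analysis?",
    "What is the best way to prepare for a statistics exam?",
    "How does a neural network learn from data?",
    "Why are all people from that city so stupid?",
    "Why do idiots keep asking stupid questions here?",
    "Are people who disagree with me all idiots?",
]
train_labels = [0, 0, 0, 1, 1, 1]

model = NaiveBayes().fit(train_questions, train_labels)
test_questions = ["Why are these people such idiots?", "How do I learn statistics?"]
test_labels = [1, 0]
print(accuracy(model, test_questions, test_labels))
```

On a real dataset the same three stages apply, but the features and model would typically come from a library, and a held-out validation split would replace the two hand-written test questions.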

About Us

Data science discovery is a step on the path of your data science journey. Please follow us on LinkedIn to stay updated.

About the writers:

Ujjayant Sinha: Data science enthusiast with interest in natural language problems.
Ankit Gadi: Driven by a knack and passion for data science. A strong foundation in Operations Research and Statistics has helped me embark on my data science journey.

