Natural Language Processing (NLP) Course Repository

Welcome to the NLP course (Winter 2024) repository.

Overview

This repository accompanies a Natural Language Processing (NLP) course. It includes four assignments that range from text preprocessing to model fine-tuning, along with the course slides, a presentation, and a reference textbook.

Setup

To get started with the assignments and the provided resources, follow these steps:

Clone this repository:

git clone https://github.com/AlirezaSaei1/NLP-Assignments.git
cd NLP-Assignments

You are ready to go!

Assignments

Assignment 1

Title: Research and Preprocessing Steps

Description: This assignment includes a summary of the paper "Abstractive Summarization Guided by Latent Hierarchical Document Structure." Additionally, it covers preprocessing steps for English and Farsi, followed by a spell checker implementation.

Files:

Assignment1/Research/Research.pdf: Summary of the paper.
Assignment1/Codes/Preprocessing_Fa.py: Preprocessing steps for Farsi.
Assignment1/Codes/Preprocessing_Eng.py: Preprocessing steps for English.
Assignment1/Codes/SpellChecker.py: Spell checker code.
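
As an illustration of the spell-checking idea, here is a minimal sketch based on Levenshtein edit distance. It is not taken from SpellChecker.py; the function names, the toy vocabulary, and the max_distance threshold are all assumptions made for this example, and the actual script may use a different method.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def correct(word: str, vocabulary: list[str], max_distance: int = 2) -> str:
    """Return the closest vocabulary word, or the input if nothing is close enough."""
    if not vocabulary or word in vocabulary:
        return word
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else word


# Hypothetical toy vocabulary, for illustration only.
VOCAB = ["language", "processing", "natural", "model"]
print(correct("langauge", VOCAB))
```

A real spell checker would usually also weigh candidates by word frequency; this sketch only picks the nearest word by edit distance.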

Assignment 2

Title: Autofill and POS Tagging

Description: This assignment involves creating an autofill feature using n-gram modeling on the Digikala comments dataset. It also includes a Part-of-Speech (POS) tagger.

Files:

Assignment2/Codes/AutoFiller.py: Autofill implementation using n-gram modeling.
Assignment2/Codes/POS_Tagging.py: POS tagger code.
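
To show what n-gram autofill looks like in practice, the following is a small bigram sketch. It is not the code in AutoFiller.py: the function names and the tiny English corpus are hypothetical stand-ins for the Digikala comments, and the real implementation may use a different n or add smoothing.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, how often every other word follows it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            followers[prev][nxt] += 1
    return followers


def suggest(followers, prefix, k=3):
    """Return up to k most frequent continuations of the last word in prefix."""
    tokens = prefix.lower().split()
    if not tokens or tokens[-1] not in followers:
        return []
    return [word for word, _ in followers[tokens[-1]].most_common(k)]


# Hypothetical toy corpus standing in for the Digikala comments.
corpus = [
    "the phone is great",
    "the phone is fast",
    "the phone is great value",
    "the camera is great",
]
model = train_bigrams(corpus)
print(suggest(model, "this phone is"))  # ['great', 'fast']
```

Only the last word of the prefix is used here, which is what makes this a bigram model; a trigram version would key the counts on the last two words instead.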

Assignment 3

Title: Sentiment Analysis using RNNs

Description: This assignment focuses on sentiment analysis with Recurrent Neural Networks (RNNs). Both SimpleRNN and LSTM architectures are used to analyze sentiment in the given dataset.

Files:

Assignment3/Codes/Sentiment_Analysis.py: Sentiment analysis using SimpleRNN.
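
For readers new to RNNs, the sketch below spells out the recurrence that a simple (Elman-style) RNN layer computes, h_t = tanh(W_x x_t + W_h h_(t-1) + b), followed by a sigmoid read-out. It is written in plain Python for clarity and is not the code in Sentiment_Analysis.py; the function names and the weight layout are assumptions for this example, and no training is shown.

```python
import math


def rnn_forward(inputs, w_x, w_h, b):
    """Run h_t = tanh(w_x @ x_t + w_h @ h_{t-1} + b) over a sequence.

    inputs: list of input vectors, one per time step
    w_x:    hidden_size x input_size matrix (list of rows)
    w_h:    hidden_size x hidden_size matrix (list of rows)
    b:      bias vector of length hidden_size
    Returns the final hidden state.
    """
    hidden = [0.0] * len(b)  # initial hidden state is all zeros
    for x in inputs:
        # The comprehension reads the previous hidden state throughout,
        # because the new list is only assigned once it is complete.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_x[i], x))
                + sum(w * hj for w, hj in zip(w_h[i], hidden))
                + b[i]
            )
            for i in range(len(b))
        ]
    return hidden


def sentiment_score(hidden, w_out, b_out):
    """Sigmoid read-out mapping the final hidden state to a value in (0, 1)."""
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))
```

An LSTM replaces the single tanh update with gated updates to a separate cell state, which helps it retain information over longer sequences.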

Assignment 4

Title: Fine-Tuning wav2vec 2.0

Description: This assignment involves fine-tuning the wav2vec 2.0 model (XLSR-53) on the Persian portion of the Mozilla Common Voice dataset.

Files:

Assignment4/Codes/ASR_fa_v1.py: Code for fine-tuning wav2vec 2.0 (v1).
Assignment4/Codes/ASR_fa_v2.py: Code for fine-tuning wav2vec 2.0 (v2, main).
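
wav2vec 2.0 is commonly fine-tuned for speech recognition with a CTC output layer, where the model emits one token id per audio frame and the transcript is recovered by collapsing repeats and dropping a blank token. Assuming the scripts above follow that setup (this README does not confirm it), greedy CTC decoding can be sketched as follows. The function name, the vocabulary, and the blank id are hypothetical.

```python
def ctc_greedy_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse consecutive repeated ids, then drop blanks."""
    chars = []
    prev = None
    for idx in frame_ids:
        # Emit a character only when the id changes and is not the blank.
        if idx != prev and idx != blank_id:
            chars.append(id_to_char[idx])
        prev = idx
    return "".join(chars)


# Hypothetical vocabulary; id 0 plays the role of the CTC blank.
vocab = {0: "<blank>", 1: "s", 2: "a", 3: "l", 4: "m"}
frames = [1, 1, 0, 2, 0, 3, 3, 0, 2, 2, 4]
print(ctc_greedy_decode(frames, vocab))  # salam
```

The blank between two identical ids is what lets CTC output a doubled letter: [3, 0, 3] decodes to "ll", while [3, 3] decodes to "l".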

Additional Resources

Course Slides

The course slides provide a comprehensive overview of the topics covered in the course, which is taught by Dr. Baradaran. They are available in the Course Slides directory.

Presentation

A presentation on LangChain and Retrieval-Augmented Generation (RAG), prepared with the help of DeepLearning.AI, is included in the Presentation directory. It contains the report, the PowerPoint slides, and the code.

Source Folder

The Source folder contains Jurafsky's NLP book, a valuable resource for understanding the theoretical foundations of NLP.

Feel free to explore each assignment and utilize the additional resources to enhance your learning experience. If you have any questions or need further assistance, please reach out.

Happy Learning!
