I am going to show you, step by step, how to perform text clustering with Python. For the full article, visit https://learndatascienceskill.com/index.php/2020/08/06/text-clustering-with-python/
In this tutorial, I will show you how to perform unsupervised machine learning with Python using text clustering. We will look at how to turn text into numbers using the TF-IDF Vectorizer from sklearn. We will also check the centroid of each cluster. Once we know the centroids, we can find the movies that are closest to them, which helps us understand the similarities between these movies.
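As a rough sketch of the idea (not the exact code from the article), the snippet below vectorizes a few made-up movie descriptions with `TfidfVectorizer`, clusters them with `KMeans`, and finds the description closest to each centroid. The `docs` list, the choice of two clusters, and the `random_state` value are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical mini-corpus standing in for the movie descriptions.
docs = [
    "A space crew battles aliens on a distant planet",
    "Astronauts explore a distant planet in deep space",
    "Aliens invade a space station orbiting the planet",
    "A detective investigates a murder in the city",
    "Police hunt a murder suspect across the city",
    "A detective and the police solve a city murder case",
]

# Turn each description into a TF-IDF weighted vector (one row per document).
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Cluster the vectors; n_clusters=2 is an assumption for this toy corpus.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# transform() returns the distance from every document to every centroid.
distances = kmeans.transform(X)

# For each cluster, pick the member document nearest to its centroid.
closest = {}
for cluster_id in range(kmeans.n_clusters):
    members = np.where(labels == cluster_id)[0]
    nearest = members[np.argmin(distances[members, cluster_id])]
    closest[cluster_id] = int(nearest)
    print(f"Cluster {cluster_id}: closest document -> {docs[nearest]}")
```

The document nearest to a centroid acts as a representative of its cluster, which is what makes the centroid useful for understanding what the grouped movies have in common.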
I will show you, step by step:
- How to load the data into a Google Colab notebook
- How to explore the data
- How to pre-process the data with TF-IDF Vectorizer from sklearn
- How to perform K-Means clustering using the scikit-learn library
- How to evaluate the results of the clustering
YouTube video: https://www.youtube.com/watch?v=ORpDAUQUnkU