76 | 27 June 2020
A fun mini-project performing basic NLP tasks with the requests, BeautifulSoup and nltk libraries.
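A rough sketch of what such a pipeline looks like (the URL below is a placeholder, not necessarily the page used in the notebook):

```python
import requests
from bs4 import BeautifulSoup
import nltk

nltk.download("punkt", quiet=True)  # tokenizer models

# Hypothetical page to analyze; swap in the notebook's actual source.
url = "https://en.wikipedia.org/wiki/Natural_language_processing"
html = requests.get(url).text

# Strip the markup, keeping only visible text.
text = BeautifulSoup(html, "html.parser").get_text()

# Tokenize, lowercase, and keep alphabetic tokens only.
tokens = [t.lower() for t in nltk.word_tokenize(text) if t.isalpha()]

# Ten most frequent words.
print(nltk.FreqDist(tokens).most_common(10))
```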
75 | 26 June 2020
Added more self-notes on Word2Vec; also solved some fun probability puzzles today (not committed here yet). Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so I could only give it 30 minutes today.
74 | 25 June 2020
Finished the implementation of word2vec (some more work needed). Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so I could only give it 30 minutes today.
73 | 24 June 2020
Added some more implementation to Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so I could only give it 30 minutes today.
72 | 23 June 2020
Added some more implementation to Word2Vec_from_scratch.ipynb; still far from done.
71 | 22 June 2020
Started implementing word2vec from scratch using NumPy in Word2Vec_from_scratch.ipynb. Too little progress. Perfectionism is evil :(
70 | 21 June 2020
Completed the word2vec math derivation from scratch. Lecture-1-Introduction-and-Word-Vectors.ipynb
69 | 20 June 2020
Wrote notes on word2vec from what I have learned so far. Lecture-1-Introduction-and-Word-Vectors.ipynb
68 | 19 June 2020
Skip-gram math. Lecture-1-Introduction-and-Word-Vectors.ipynb
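For reference, the objective behind that math is the standard skip-gram loss from the lecture: for a corpus $w_1, \dots, w_T$ and window size $m$,

```latex
J(\theta) = -\frac{1}{T}\sum_{t=1}^{T}\ \sum_{\substack{-m \le j \le m \\ j \neq 0}} \log p(w_{t+j} \mid w_t),
\qquad
p(o \mid c) = \frac{\exp(u_o^{\top} v_c)}{\sum_{w \in V} \exp(u_w^{\top} v_c)}
```

where $v_c$ is the center-word vector, $u_o$ the outside-word vector, and $V$ the vocabulary.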
67 | 18 June 2020
Derivation of Word2Vec, but not in Lecture-1-Introduction-and-Word-Vectors.ipynb yet. Also worked on a quick data wrangling project: [generating_keywords_for_google_ads](data_wrangling/generating_keywords_for_google_ads/notebook.ipynb)
66 | 17 June 2020
Progress in Stanford CS224N (NLP with Deep Learning) by Christopher Manning: the mathematics of Word2Vec. Lecture-1-Introduction-and-Word-Vectors.ipynb
65 | 16 June 2020
Progress in Stanford CS224N (NLP with Deep Learning) by Christopher Manning: localist vs. distributed representations, word embeddings and distributional semantics. Lecture-1-Introduction-and-Word-Vectors.ipynb
64 | 15 June 2020
Started Stanford CS224N (NLP with Deep Learning) by Christopher Manning. Lecture-1-Introduction-and-Word-Vectors.ipynb
63 | 14 June 2020
Probability and linear regression: exploring the conditions for linear models using residuals.
62 | 13 June 2020
Practical problems on the central limit theorem, and an implementation of linear regression from scratch. Probability and Random Process and [Linear Regression](algorithms_from_scratch/Linear%20Regression.ipynb)
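A minimal sketch of the from-scratch approach, using the closed-form normal equation (the notebook's implementation may differ, e.g. it might use gradient descent):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))             # one feature
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1, 100)   # y = 3x + 5 + noise

# Prepend a bias column, then solve theta = (X^T X)^-1 X^T y.
Xb = np.c_[np.ones(len(X)), X]
theta = np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y
print(theta)  # roughly [5.0, 3.0]
```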
61 | 12 June 2020
Law of large numbers and the central limit theorem, plus a lot of other exercises. Probability and Random Process.
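A quick simulation in the spirit of those exercises (my toy example, not the notebook's): sample means from a skewed distribution come out approximately normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# The exponential distribution is heavily skewed (mean 1, std 1)...
means = rng.exponential(scale=1.0, size=(10_000, 50)).mean(axis=1)

# ...yet by the CLT the sample means are roughly Normal(1, 1/sqrt(50)).
print(means.mean(), means.std())  # ~1.00 and ~0.14
```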
60 | 11 June 2020
Poisson and geometric distributions with lots of practical exercises. Probability and Random Process.
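Typical scipy calls for exercises like these (parameters are illustrative):

```python
from scipy import stats

# Poisson: P(X = 3) when events arrive at an average rate of 2 per interval.
print(stats.poisson.pmf(3, mu=2))   # ~0.180

# Poisson: P(X <= 4) via the CDF.
print(stats.poisson.cdf(4, mu=2))   # ~0.947

# Geometric: P(first success on the 5th trial) with success probability 0.3.
print(stats.geom.pmf(5, p=0.3))     # 0.3 * 0.7**4 ~ 0.072
```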
59 | 10 June 2020
Deep dive into the normal distribution with lots of practical exercises. Probability and Random Process.
58 | 9 June 2020
Poisson distribution, continuous random variables, probability density functions, cumulative distribution functions and some exercises. Probability and Random Process.
57 | 8 June 2020
Bayes' rule, and some other fun probability problems. Probability and Random Process.
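The classic worked example (my numbers, not necessarily the notebook's): a test with 99% sensitivity and 95% specificity for a condition with 1% prevalence.

```python
# Bayes' rule: P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|not D)P(not D)]
sens, spec, prior = 0.99, 0.95, 0.01
p_positive = sens * prior + (1 - spec) * (1 - prior)
posterior = sens * prior / p_positive
print(posterior)  # ~0.167: a positive result is still far from conclusive
```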
56 | 7 June 2020
Some fun practical problems on conditional probability and the law of total probability. Probability and Random Process.
55 | 6 June 2020
Conditional probability and some problems on independent events using Python and scipy. Probability and Random Process.
54 | 5 June 2020
Calculating probabilities of multiple independent events using Python and scipy. Probability and Random Process.
53 | 4 June 2020
Worked on a super interesting project: Reducing traffic mortality in the USA. It involves a lot of good deep data analysis, data wrangling, plotting, dimensionality reduction and unsupervised clustering.
52 | 3 June 2020
Some good practical problems on random distributions using Python and scipy. Probability and Random Process
51 | 2 June 2020
Back to concepts of probability. Probability and Random Process
50 | 1 June 2020
Did a fun data wrangling project on the Sharpe ratio, calculating it for Amazon and Facebook stock to figure out which one is the better investment, with the S&P 500 (an index tracking the performance of the 500 largest US stocks) as the benchmark. Tomorrow, back to probability and some exercise problems.
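In sketch form (toy prices standing in for the real Amazon/Facebook and S&P 500 data): the ratio here is the mean excess daily return over its standard deviation, annualized with √252.

```python
import numpy as np
import pandas as pd

def sharpe_ratio(prices: pd.Series, benchmark: pd.Series) -> float:
    """Annualized Sharpe ratio of returns in excess of a benchmark."""
    excess = prices.pct_change().dropna() - benchmark.pct_change().dropna()
    return np.sqrt(252) * excess.mean() / excess.std()

# Synthetic price series for illustration only.
rng = np.random.default_rng(1)
stock = pd.Series(100 * np.cumprod(1 + rng.normal(0.001, 0.02, 250)))
index = pd.Series(100 * np.cumprod(1 + rng.normal(0.0004, 0.01, 250)))
print(sharpe_ratio(stock, index))
```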
49 | 31 May 2020
Continued concepts of counting. Rabbit-holed today into the Bose-Einstein value and unordered sampling with replacement. I need to jump back to probability tomorrow. Counting
48 | 30 May 2020
Continued concepts of counting: sampling with/without replacement, with/without order, the birthday paradox and binomial coefficients. Will jump back to probability soon. Counting
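The birthday paradox is easy to verify directly: the probability that at least two of n people share a birthday already passes 50% at n = 23.

```python
def p_shared_birthday(n: int) -> float:
    """P(at least two of n people share a birthday), 365 equally likely days."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365
    return 1 - p_all_distinct

print(p_shared_birthday(23))  # ~0.507
```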
47 | 29 May 2020
Revised basic concepts of counting; will jump back to probability. Moving too slowly, but slow progress is better than no progress. Counting
46 | 28 May 2020
Revised basic concepts of probability. Probability and Random Process
45 | 27 May 2020
Continued the course "Statistical Thinking in Python": Poisson and binomial distributions, explored using NumPy. Notebook: Statistical_thinking_part_1
44 | 26 May 2020
Continued the course "Statistical Thinking in Python": random variables, discrete and continuous random variables, probability mass functions and the binomial distribution. Notebook: Statistical_thinking_part_1
43 | 25 May 2020
Continued the course "Statistical Thinking in Python": probabilistic logic and statistical inference, hacker statistics, Bernoulli trials and part of the probability distributions (binomial). Notebook: Statistical_thinking_part_1
42 | 24 May 2020
Continued the course "Statistical Thinking in Python": quantitative data exploration, summary statistics, variance, correlation, the Pearson correlation coefficient and scatter plots. Notebook: Statistical_thinking_part_1
41 | 23 May 2020
Continued the course "Statistical Thinking in Python": empirical cumulative distribution function (ECDF) plots, bee swarm plots, summary statistics, percentiles, outliers and box plots. Notebook: Statistical_thinking_part_1
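The ECDF helper from that course is short enough to reproduce (treat this as a sketch from memory):

```python
import numpy as np

def ecdf(data):
    """Return x, y coordinates for an empirical CDF plot."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Usage: plt.plot(*ecdf(samples), marker=".", linestyle="none")
```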
40 | 22 May 2020
Started the course "Statistical Thinking in Python" on DataCamp. Notebook: Statistical_thinking_part_1
39 | 21 May 2020
Mini-project: Find Movie Similarity from Plot Summaries. A fun mini-project using NLP and clustering on movie plot summaries from IMDb and Wikipedia to quantify movie similarity.
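The core of that pipeline, sketched with scikit-learn (the plot summaries and parameters below are illustrative, not the project's actual ones):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

plots = [
    "A young wizard attends a school of magic.",
    "A hobbit journeys to destroy a powerful ring.",
    "A boxer gets one shot at the heavyweight title.",
]

# Plot summaries -> TF-IDF vectors -> k clusters of similar movies.
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(plots)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # movies in the same cluster have similar plots
```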
38 | 20 May 2020
Mini-project: Predicting Credit Card Approvals. This involves EDA, data cleaning and imputation, then logistic regression with grid search to predict credit card approvals.
37 | 19 May 2020
Progress with the Linear algebra notebook: concepts like linear dependence and independence, vector spaces and span. Not great progress today, but some progress.
36 | 18 May 2020
Further progress with vector spaces, span, linear combinations and the dot product. Linear algebra notebook
35 | 17 May 2020
Geometric interpretation of the dot product of vectors, plus some reading on cosine similarity. Linear algebra notebook
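That geometry in code: since a·b = |a||b|·cos θ, cosine similarity is just the dot product of the normalized vectors (a quick NumPy check, not from the notebook):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a

# cos(theta) = (a . b) / (|a| |b|)
cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos_sim)  # 1.0: parallel vectors, maximal similarity
```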
34 | 16 May 2020
Vector algebra in the Linear algebra notebook
33 | 15 May 2020
Wrote about vectors in the Linear algebra notebook. Finished 3Blue1Brown's linear algebra series. Added the Anki cards [here](anki/Linear%20Algebra2.apkg). Happy Friday. This week was tough, too much work at work; the "Don't break the chain" motto kept me going. Hoping for a better week next week.
32 | 14 May 2020
Created 30 linear algebra Anki cards based on 3Blue1Brown's linear algebra series. Added the cards [here](anki/Linear%20Algebra2.apkg)
31 | 13 May 2020
Added some figures to Linear algebra. Also watched more episodes of 3Blue1Brown's linear algebra series.
30 | 12 May 2020
Made very little progress in Linear algebra. Added a few diagrams explaining functions. Also re-watched the first few chapters of 3Blue1Brown's linear algebra series.
29 | 11 May 2020
Made further progress in Linear algebra: relations, functions, vectors, etc.
28 | 10 May 2020
Started a new notebook for concepts of Linear algebra. This set of notebooks should cover the major topics of linear algebra used in machine learning.
27 | 9 May 2020
Finished Exercise 4 of the HOML book: built a spam classifier using SpamAssassin data. Most of the work is in exploring and preprocessing the data: parsing HTML to text, parsing email headers, removing unwanted elements (URLs, numbers), tokenization, removing punctuation and stemming, and finally converting everything to sparse vectors. After that, applied logistic regression.
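Condensed to its skeleton (the notebook's email parsing is far heavier; the strings below are toy stand-ins), the final modeling step looks roughly like:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Preprocessed plain-text emails; 1 = spam, 0 = ham.
emails = ["win a free prize now", "meeting notes attached", "free money win"]
labels = [1, 0, 1]

# Text -> sparse bag-of-words vectors -> logistic regression.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(emails, labels)
print(clf.predict(["claim your free prize"]))
```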
26 | 8 May 2020
An exploratory analysis of the Titanic dataset from Kaggle, with a few tips for getting summary statistics.
25 | 7 May 2020
Continuing Exercise 4 of the HOML book, building a spam classifier using SpamAssassin data. Lots of preprocessing steps: parsing emails down to text/plain, handling nested emails, etc.
24 | 6 May 2020
Exercise 4 of the HOML book, building a spam classifier using SpamAssassin data. Loaded, manipulated and explored the data.
23 | 5 May 2020
Worked on Kaggle's Titanic problem: data exploration, imputation and a preprocessing pipeline using Scikit-Learn. Trained an SVM and a Random Forest classifier; the accuracy is around 80%. It was fun. Let's see if I can improve it tomorrow.
22 | 4 May 2020
Working on Exercise 3 of Chapter 3 (classification) of the HOML book: Kaggle's famous Titanic survivor classification problem. Did basic data exploration. The notebook is not done yet and needs to be improved.
21 | 3 May 2020
Multilabel classification using KNN; also created data for multioutput classification, trained a classifier and got predictions. Classification done! Now waiting for the results of exercise 1. Tomorrow is going to be a fun day with exercise 2, the Titanic problem from Kaggle! FINISHED Chapter 3 (classification) of the HOML book.
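The multilabel setup from that chapter, sketched with toy data in place of MNIST (HOML's example predicts "large digit" and "odd digit" at once):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # toy features
digits = rng.integers(0, 10, size=200)   # pretend digit labels

# Two binary targets per sample -> a multilabel problem.
y_multilabel = np.c_[digits >= 7, digits % 2 == 1]

knn = KNeighborsClassifier()
knn.fit(X, y_multilabel)
print(knn.predict(X[:3]))  # e.g. [[False  True] ...]
```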
20 | 2 May 2020
Analyzed the errors of multi-class classification and improved it based on the insights. Also studied linear algebra; notes and Anki cards tomorrow. Continuing Chapter 3 (classification) of the HOML book.
19 | 1 May 2020
Improving multiclass classification using scaling; error analysis of the multiclass classifier using a confusion matrix. Trained SGDClassifier and SVC. Continuing Chapter 3 (classification) of the HOML book.
18 | 30 April 2020
Multi-class classification concepts: one-vs-rest and one-vs-one. Also used Scikit-Learn to explore OneVsRestClassifier with SVC. Continuing Chapter 3 (classification) of the HOML book.
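The pattern for forcing one-vs-rest in scikit-learn, roughly as the book shows it (iris standing in for the book's dataset):

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# SVC handles multiclass via one-vs-one by default; wrapping it in
# OneVsRestClassifier trains one binary SVC per class instead.
ovr = OneVsRestClassifier(SVC())
ovr.fit(X, y)
print(len(ovr.estimators_))  # 3: one classifier per class
```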
17 | 29 April 2020
Compared two classification models side by side: built a RandomForestClassifier and compared it with SGDClassifier on all metrics. Continuing Chapter 3 (classification) of the HOML book.
16 | 28 April 2020
Just AUC today; revised previous concepts. Continuing Chapter 3 (classification) of the HOML book. Hoping for a better tomorrow.
15 | 27 April 2020
ROC and AUC, and various considerations when choosing a threshold. Continuing Chapter 3 (classification) of the HOML book. Also used Google's Machine Learning Crash Course to understand precision, recall, ROC and AUC better. Not a very productive day, but what matters is getting something done.
14 | 26 April 2020
Thoroughly understood the precision-recall tradeoff through various examples and created an Anki deck. Also learned how to choose the right decision threshold. Continuing Chapter 3 (classification) of the HOML book. Used various scikit-learn functions to understand the precision-recall threshold, plus Google's Machine Learning Crash Course.
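The key calls, sketched with synthetic scores (the notebook uses real classifier scores): find the lowest threshold whose precision clears a target, then read off the recall it costs.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3, 0.7, 0.5])

precisions, recalls, thresholds = precision_recall_curve(y_true, scores)

# First threshold achieving >= 90% precision, and the recall it costs.
idx = np.argmax(precisions >= 0.90)
print(thresholds[idx], recalls[idx])
```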
13 | 25 April 2020
Continuing Chapter 3 (classification) of the HOML book. Used plot_confusion_matrix, got a deeper understanding of precision and recall, and learned the F1-score.
12 | 24 April 2020
Chapter 3 (classification) of the HOML book. Learned the confusion matrix again, so as never to get confused by it again. Precision and recall. StratifiedKFold and cross_val_predict from Scikit-Learn.
11 | 23 April 2020
Started Chapter 3 of the HOML book. Learned classification basics. Also learned some matplotlib and linear algebra.
10 | 22 April 2020
All exercises of Chapter 2 of HOML: RandomizedSearchCV with an SVM, a full Scikit-Learn pipeline with data preparation, top-k feature importances and predictions.
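The shape of that exercise, sketched (the distributions are illustrative; the book tunes an SVM regressor on the housing data):

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

param_dist = {
    "kernel": ["linear", "rbf"],
    "C": loguniform(1e-1, 1e3),      # sample C on a log scale
    "gamma": loguniform(1e-4, 1e0),  # only used by the rbf kernel
}
search = RandomizedSearchCV(SVR(), param_dist, n_iter=10, cv=3, random_state=0)
search.fit(X, y)
print(search.best_params_)
```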
9 | 21 April 2020
Finally finished Chapter 2 of HOML. Feature importance, test-set evaluation and an SVM regressor done today! Exercises next!
8 | 20 April 2020
Fine-tuned the models using grid search. Chapter 2 of HOML. Too much on my plate today, so this is the best I could get done. Much better than nothing!
7 | 19 April 2020
Built models using Random Forest, Decision Tree and Linear Regression. Also revisited the importance of cross-validation. All in Chapter 2 of HOML.
6 | 18 April 2020
Some good progress in Chapter 2 of HOML: data preparation, attribute selection, scikit-learn transformers, one-hot encoding, pipelines and feature scaling. A good, productive Saturday!
5 | 17 April 2020
Continuing Chapter 2 of HOML: stratified sampling, correlation coefficients and visualization. Slow but steady!
4 | 16 April 2020
Chapter 2 (End-to-End Machine Learning) of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Worked on explaining RMSE and train_test_split. Very slow progress; need to improve.
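For the record, the formula being explained there, for $m$ predictions $\hat{y}_i$ against labels $y_i$:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2}
```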
3 | 15 April 2020
Made very little progress in Chapter 2 (End-to-End Machine Learning) of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Analyzed the histograms of the housing data and created the test/train sets. In hope of a better tomorrow.
2 | 14 April 2020
Made some progress in Chapter 2 (End-to-End Machine Learning) of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Loaded the data and described it. Not much, but some progress.
1 | 13 April 2020
Currently reading Chapter 2 (End-to-End Machine Learning) of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. The chapter has interesting insights on how to approach a machine learning problem end to end.