daily_log.md

File metadata and controls

79 lines (79 loc) · 20.5 KB
Seq No Streak No Date Notes
76 76 27 June 2020 A fun mini-project in which we perform basic NLP tasks using the requests, BeautifulSoup, and nltk libraries.
75 75 26 June 2020 Added more self-notes on Word2Vec; also solved some fun probability puzzles today (not here yet). Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so could only give 30 mins today.
74 74 25 June 2020 Finished the implementation of word2vec (some more work needed). Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so could only give 30 mins today.
73 73 24 June 2020 Added some more implementation to Word2Vec_from_scratch.ipynb. Slow but steady; too much work at work, so could only give 30 mins today.
72 72 23 June 2020 Added some more implementation to Word2Vec_from_scratch.ipynb, still far from done.
71 71 22 June 2020 Started implementing word2vec from scratch using NumPy. Too little progress. Perfectionism is evil :( Word2Vec_from_scratch.ipynb
70 70 21 June 2020 Complete word2vec math derivation from scratch. Lecture-1-Introduction-and-Word-Vectors.ipynb
69 69 20 June 2020 Wrote notes on word2vec from what I have learned so far Lecture-1-Introduction-and-Word-Vectors.ipynb
68 68 19 June 2020 Skip-gram maths. Lecture-1-Introduction-and-Word-Vectors.ipynb
67 67 18 June 2020 Derivation of Word2Vec, but not in Lecture-1-Introduction-and-Word-Vectors.ipynb yet. Also worked on a quick data wrangling project: [generating_keywords_for_google_ads](data_wrangling\generating_keywords_for_google_ads\notebook.ipynb)
66 66 17 June 2020 Progress in Stanford 224N NLP with Deep Learning by Christopher Manning. Mathematics of Word2Vec Lecture-1-Introduction-and-Word-Vectors.ipynb
65 65 16 June 2020 Progress in Stanford 224N NLP with Deep Learning by Christopher Manning. Localist and Distributed Representations. Word Embeddings and Distributional Semantics Lecture-1-Introduction-and-Word-Vectors.ipynb
64 64 15 June 2020 Started Stanford 224N NLP with Deep Learning by Christopher Manning Lecture-1-Introduction-and-Word-Vectors.ipynb
63 63 14 June 2020 Probability and Linear Regression, exploring conditions for linear models using residuals
62 62 13 June 2020 Practical problems on the central limit theorem, and an implementation of Linear Regression from scratch. Probability and Random Process and [Linear Regression](algorithms_from_scratch\Linear Regression.ipynb)
61 61 12 June 2020 Law of large numbers and central limit theorem. A lot of other exercises. Probability and Random Process.
60 60 11 June 2020 Poisson and Geometric Distribution with a lot of practical exercises. Probability and Random Process.
59 59 10 June 2020 Deep dive into the Normal Distribution with a lot of practical exercises. Probability and Random Process.
58 58 9 June 2020 Poisson Distribution, Continuous Random Variable, Probability Density Function, Cumulative Distribution Functions and some exercises Probability and Random Process.
57 57 8 June 2020 Bayes Rule, and some fun other probability problems. Probability and Random Process.
56 56 7 June 2020 Some fun practical problems of Conditional Probability and Law of total probability. Probability and Random Process.
55 55 6 June 2020 Conditional probability and some problems on independent events using python and scipy. Probability and Random Process.
54 54 5 June 2020 Calculating probabilities of multiple independent events using python and scipy. Probability and Random Process.
53 53 4 June 2020 Worked on a super interesting project: Reducing traffic mortality in the USA. It has a lot of good deep data analysis, data wrangling, plotting, dimensionality reduction, and unsupervised clustering.
52 52 3 June 2020 Some good practical problems of Random Distribution using python and scipy. Probability and Random Process
51 51 2 June 2020 Back to concepts of Probability. Probability and Random Process
50 50 1 June 2020 Did a fun data wrangling project where we learn about the Sharpe Ratio by calculating it for the stocks of Amazon and Facebook, to figure out which one is the better investment. We use the S&P 500, which measures the performance of the 500 largest stocks in the US, as the benchmark. Tomorrow, back to probability and some exercise problems.
49 49 31 May 2020 Continued concepts of Counting. Rabbit-holed today into Bose-Einstein value and unordered sampling with replacement. I need to jump back to probability tomorrow. Counting
48 48 30 May 2020 Continued concepts of Counting. Sampling with/without replacement, with/without order. Birthday paradox, binomial coefficients. Will jump back to probability soon. Counting
47 47 29 May 2020 Revised basic concepts of Counting, will jump back to Probability. Moving too slow, but slow progress is better than no progress. Counting
46 46 28 May 2020 Revised basic concepts of Probability. Probability and Random Process
45 45 27 May 2020 Continued Course "Statistical thinking in Python": Poisson Distribution, Binomial Distribution, explored using NumPy. Notebook: Statistical_thinking_part_1
44 44 26 May 2020 Continued Course "Statistical thinking in Python": Random Variable, Discrete Random Variable, Continuous Random Variables, Probability Mass Functions, Binomial Distribution. Notebook: Statistical_thinking_part_1
43 43 25 May 2020 Continued Course "Statistical thinking in Python": Probabilistic Logic and Statistical Inference, Hacker statistics, Bernoulli trials, Part of probability distributions (Binomial). Notebook: Statistical_thinking_part_1
42 42 24 May 2020 Continued Course "Statistical thinking in Python": Quantitative data exploration, summary statistics, variance, correlation, Pearson correlation coefficient, and scatter plots. Notebook: Statistical_thinking_part_1
41 41 23 May 2020 Continued Course "Statistical thinking in Python": Empirical Cumulative Distribution Function Plots, Bee Swarm Plots, Summary Statistics, Percentiles, Outliers and Box Plots. Notebook: Statistical_thinking_part_1
40 40 22 May 2020 Course: Started "Statistical thinking in Python" on DataCamp. Notebook: Statistical_thinking_part_1
39 39 21 May 2020 Mini project: Find Movie Similarity From Plot Summaries. Fun mini project where we use NLP and clustering on movie plot summaries from IMDb and Wikipedia to quantify movie similarity.
38 38 20 May 2020 Mini project: Predicting Credit card Approvals. This involves EDA, data cleaning, data imputing and then using logistic regression, and then applying grid search to predict credit card approvals.
37 37 19 May 2020 Progress with Linear algebra notebook. Some concepts like linear dependence and independence, vector spaces and span. Not good progress today, but some progress.
36 36 18 May 2020 Further progress with vector space, span, linear combination, and also dot product. Linear algebra notebook
35 35 17 May 2020 Geometric interpretation of dot product of vectors. Also some read up on cosine similarity. Linear algebra notebook
34 34 16 May 2020 Vector algebra in Linear algebra notebook
33 33 15 May 2020 Wrote about vectors in the Linear algebra notebook. Finished 3Blue1Brown's linear algebra series. Added the anki cards [here](anki\Linear Algebra2.apkg). Happy Friday. This week was tough, too much work at work; the "Don't break the chain" motto kept me going. Hoping for a better week next week.
32 32 14 May 2020 Created 30 linear algebra Anki cards based on 3Blue1Brown's linear algebra series. Added the anki cards [here](anki\Linear Algebra2.apkg)
31 31 13 May 2020 Added some figures to Linear algebra. Also watched more episodes of 3Blue1Brown's linear algebra series.
30 30 12 May 2020 Made very little progress in Linear algebra. Added a few diagrams explaining functions. Also re-watched the first few chapters of 3Blue1Brown's linear algebra series.
29 29 11 May 2020 Made further progress in Linear algebra. Relations, function vectors etc.
28 28 10 May 2020 Started a new notebook for concepts of Linear algebra. This set of notebooks should cover major topics of Linear Algebra used in Machine Learning.
27 27 9 May 2020 Finished Exercise 4 of the HOML book: built a spam classifier using Spam Assassin data. Most of the work is around exploring and preprocessing the data, which included parsing HTML to text, parsing email headers, removing unwanted elements (URLs, numbers), tokenization, removing punctuation, stemming, and finally converting the emails to sparse vectors. After this, applied logistic regression.
26 26 8 May 2020 An exploratory analysis of Titanic Dataset from Kaggle, few tips to get summary statistics.
25 25 7 May 2020 Continuing Exercise 4 of the HOML book, building a spam classifier using Spam Assassin data. A lot of preprocessing steps involving parsing of emails to plain/text, including nested emails, etc.
24 24 6 May 2020 Exercise 4 of the HOML book, building a spam classifier using Spam Assassin data. Loaded, manipulated and explored data.
23 23 5 May 2020 Worked on Kaggle's Titanic problem. Did data exploration, imputation, and built a preprocessing pipeline using Scikit-Learn. Trained an SVM and a Random Forest Classifier. The accuracy is around 80%. It was fun. Let's see if I can improve it tomorrow.
22 22 4 May 2020 Working on Exercise 3 of Chapter 3 (classification) of the book HOML. It is Kaggle's famous Titanic survivor classification problem. Did basic data exploration. The notebook is not done yet, and needs to be improved.
21 21 3 May 2020 Multilabel classification using KNN; also created data for Multioutput classification, trained a classifier and got predictions. Classification done! Now waiting for results of exercise 1. Tomorrow is going to be a fun day of exercise 2, the Titanic problem from Kaggle! FINISHED Chapter 3 (classification) of the book HOML
20 20 2 May 2020 Analysis of errors of multi-class classification and improving it based on insights. Also studied linear algebra. Notes and Anki cards tomorrow. Continuing Chapter 3 (classification) of the book HOML
19 19 1 May 2020 Improving multiclass classification using scaling. Error analysis of the multiclass classification using a Confusion Matrix. Trained SGDClassifier and SVC. Continuing Chapter 3 (classification) of the book HOML
18 18 30 April 2020 Multi-Class Classification concepts. One Vs Rest, One Vs One. Also used Scikit-Learn to explore OneVsRestClassifier using SVC. Continuing Chapter 3 (classification) of the book HOML
17 17 29 April 2020 Comparing two classification models side by side. Built a RandomForestClassifier and compared it with SGDClassifier on all metrics. Continuing Chapter 3 (classification) of the book HOML
16 16 28 April 2020 Just AUC today, revised previous concepts. Continuing Chapter 3 (classification) of the book HOML. Hoping for a better tomorrow.
15 15 27 April 2020 ROC and AUC, and various considerations while choosing a threshold. Continuing Chapter 3 (classification) of the book HOML. Also used Google's Machine Learning Crash Course to understand precision, recall, ROC, and AUC better. Not a very productive day, but what matters is getting something done.
14 14 26 April 2020 Thoroughly understood the precision-recall tradeoff by using various examples. Created an Anki deck. Also learned how to choose the right decision threshold. Continuing Chapter 3 (classification) of the book HOML. Used various scikit-learn functions to understand the precision-recall threshold. Also used Google's Machine Learning Crash Course.
13 13 25 April 2020 Continuing Chapter 3 (classification) of the book HOML. Used plot_confusion_matrix, deeper understanding of precision and recall, and learned F1-score
12 12 24 April 2020 Chapter 3 (classification) of the book HOML. Learned the confusion matrix again, so as not to get confused by it ever again. Precision and Recall. StratifiedKFold and cross_val_predict of Scikit-Learn.
11 11 23 April 2020 Started chapter 3 of the book HOML. Learned classification basics. Also learned some matplotlib and linear algebra.
10 10 22 April 2020 All exercises of Chapter 2 of HOML. RandomizedSearchCV with SVM, full Scikit-Learn pipeline with data preparation, top k feature importances and predictions.
9 9 21 April 2020 Finally finished Chapter 2 of HOML. Feature Importance, Test data evaluation and SVM regressor done today! Exercises next!!
8 8 20 April 2020 Fine-tuning the models using grid search. Chapter 2 of HOML. Too much on my plate today, so this is the best I could get done. Much better than nothing!
7 7 19 April 2020 Built models using Random Forest, Decision Tree and Linear Regression. Also revised importance of cross validation. All in chapter 2 of HOML.
6 6 18 April 2020 Some good progress in chapter 2 of HOML. Data preparation, attribute selection, scikit-learn's transformations, one-hot encoding, pipelines and feature scaling. This is a good, productive Saturday!
5 5 17 April 2020 Continuing the chapter 2 of HOML. Stratified Sampling, Correlation Coefficient and visualization. Slow but steady!
4 4 16 April 2020 chapter 2 End to End Machine Learning of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Worked on explaining the RMSE and test_train_split. Very slow progress, need to improve.
3 3 15 April 2020 Made very little progress in chapter 2 End to End Machine Learning of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Analyzed the histograms of the housing data, and created test/train sets. In hope of a better tomorrow.
2 2 14 April 2020 Made some progress in chapter 2 End to End Machine Learning of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Loaded data and described it. Less but some progress.
1 1 13 April 2020 Currently reading chapter 2 End to End Machine Learning of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. The chapter has interesting insights on how to approach a machine learning problem end-to-end.
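
The 16 April entry mentions explaining RMSE and the train/test split. As a reminder of what those two ideas amount to, here is a minimal sketch in plain Python. The function names, the 20% test ratio, and the fixed seed are illustrative assumptions; this is not the code from the book or from the notebooks in this repo.

```python
import math
import random


def split_indices(n_rows, test_ratio=0.2, seed=42):
    """Shuffle row indices and split them into (train, test) index lists.

    Hypothetical helper for illustration; the seed makes the split repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    test_size = int(n_rows * test_ratio)
    return indices[test_size:], indices[:test_size]


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y_true - y_pred) ** 2))."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    train_idx, test_idx = split_indices(10)
    print(len(train_idx), len(test_idx))
    print(rmse([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```

Because squaring happens before averaging, RMSE penalizes a few large errors more heavily than many small ones, which is worth keeping in mind when comparing models on a held-out test set.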