- 1. Use scikit-learn to demonstrate K-NN classification using the Iris dataset.
- 2. Write code to add, subtract, and multiply two matrices without using external libraries.
- 3. Implement a function to calculate the transpose of a given matrix.
- 4. Implement a function to calculate the L2 norm of a vector.
- 5. Write a Python function that calculates the probability of rolling a sum of 'S' on two dice.
- 6. Implement a function that simulates a biased coin flip n times and estimates the probability of heads.
- 7. Simulate the Law of Large Numbers using Python: verify that as the number of coin tosses increases, the average of the results becomes closer to the expected value.
- 8. Write Python code to calculate mean, median, and mode from a given list of numbers.
- 9. Given a dataset, apply feature scaling and run K-Means Clustering using scikit-learn.
- 10. Create a Python script to visualize the results of K-Means Clustering on a 2D dataset.
- 11. Write a Python function to identify the nearest centroid (assigned cluster) for a new data point in an existing K-Means model.
- 12. Write a function that calculates the Gini impurity of a set of class labels, as used to evaluate splits in a Decision Tree.
- 13. Code a Support Vector Machine using scikit-learn to classify data from a toy dataset.
- 14. Create a K-NN classifier in Python and test its performance on a sample dataset.
- 15. Implement a function for feature scaling and normalization in preparation for classification.
- 16. Implement basic Gradient Descent to minimize a simple quadratic function.
- 17. Develop a Python script that uses the Adam optimizer from a library like TensorFlow or PyTorch.
- 18. Code an SVM model in scikit-learn to classify text data using TF-IDF features.
- 19. Write a script to visualize support vectors in a trained SVM model.
- 20. Write a Python script that estimates the mean and variance of a dataset and plots the corresponding Gaussian distribution.
- 21. Generate and visualize 1,000 random points from a Normal distribution in Python.
- 22. Simulate the Monty Hall problem in Python and analyze the results.
- 23. Create a Python program that estimates Pi using a Monte Carlo simulation.
- 24. Implement a Python function that calculates the Mean Squared Error between predicted and actual values.
- 25. Write a Python implementation of basic gradient descent to find the minimum of a quadratic function.
- 26. Write a Python function to check the gradients computed by a gradient descent algorithm.
- 27. Implement a basic K-Means Clustering algorithm from scratch using Python.
- 28. Write a function in Python that determines the best value of k (number of clusters) using the Elbow Method.
- 29. Script a program to compare the performance of different initialization methods for centroids.
- 30. Write code to compute the silhouette coefficient for evaluating the clustering quality.
- 31. Using Pandas and Python, clean and prepare a real-world dataset for K-Means Clustering.
- 32. Write a Python function to implement K-NN from scratch on a simple dataset.
- 33. Create a Python script to visualize the decision boundary of K-NN on a 2D dataset.
- 34. Implement a logistic regression model from scratch using Python.
- 35. Develop a Python script that visualizes the decision boundary of a given classification model.
- 36. Write code to find the determinant of a matrix using recursion.
- 37. Develop a Python function to compute the inverse of a matrix.
- 38. Write a program to verify if a given square matrix is orthogonal.
- 39. Implement simple linear regression from scratch in Python.
- 40. Write a Python function that performs the gradient descent algorithm for linear regression.
- 41. Use pandas to load a dataset and prepare it for linear regression, handling any missing values.
- 42. Plot residuals and analyze the model fit using Matplotlib or Seaborn.
- 43. Write a Python function to compute and print out model evaluation metrics (RMSE, MAE, R-squared).
- 44. Perform a polynomial regression on a sample dataset and plot the results.
- 45. Write a Python function to perform SGD on a sample dataset.
- 46. Code a simulation in Python demonstrating the effects of different learning rates on convergence.
- 47. Implement the Momentum technique in a Gradient Descent optimizer.
- 48. Create a regularization function in Python that penalizes large weights in a linear regression model.
- 49. Implement PCA on a given dataset using scikit-learn and plot the explained variance ratio.
- 50. Code a small example to demonstrate the use of Linear Discriminant Analysis (LDA) for classification.
- 51. Write a Python function to select an optimal C parameter for an SVM using cross-validation.
- 52. Develop a multi-class SVM classifier on a given dataset using the one-vs-one strategy.
- 53. Use Python to demonstrate the impact of different kernels on SVM decision boundaries with a 2D dataset.
- 54. Create a Python function for grid search optimization to find the best kernel and its parameters for an SVM.
- 55. Code a Gaussian Naive Bayes classifier from scratch using Python.
- 56. Implement a simple linear regression model from scratch in Python.
- 57. Create a Python function to perform a t-test given two sample datasets.
- 58. Write a Python script to compute and graphically display a correlation matrix for a given dataset.
- 59. Write a Python code snippet for performing a Chi-squared test of independence on a contingency table.
- 60. Write a Python code snippet to compute the Cross-Entropy loss given predicted probabilities and actual labels.
- 61. Implement a gradient descent algorithm in Python to minimize a simple quadratic cost function.
- 62. Implement batch gradient descent for linear regression from scratch using Python.
- 63. Design a Python function to compare the convergence speed of gradient descent with and without momentum.
- 64. Implement gradient descent with early stopping using Python.
- 65. Code a mini-batch gradient descent optimizer and test it on a small dataset.
- 66. Experiment with different weight initializations and observe their impact on gradient descent optimization.
- 67. Implement a mini-batch K-Means clustering using Python.
- 68. Create a multi-dimensional K-Means clustering example and visualize it using PCA for dimensionality reduction.
- 69. Implement a LazyLearningClassifier in Python that uses K-NN under the hood.
- 70. Develop a weighted K-NN algorithm in Python and test its performance against the standard K-NN.
- 71. Given a dataset with time-series data, how would you apply K-NN for forecasting?
- 72. Use a Boosting algorithm to improve the accuracy of a weak classifier on a dataset.
- 73. Write a function that showcases the difference between L1 and L2 regularization on a small dataset.
- 74. Write a Python function that performs feature selection using Recursive Feature Elimination (RFE).
- 75. Implement a basic version of an autoencoder for dimensionality reduction using TensorFlow/Keras.
- 76. Develop a Python script to compare the performance of PCA and LDA on a sample dataset.
- 77. Create a Python function that uses Factor Analysis for dimensionality reduction on multivariate data.
- 78. Write a code snippet to perform feature extraction using Non-negative Matrix Factorization (NMF).
- 79. Use the feature importance provided by a trained ensemble model to reduce the dimensionality of a dataset in Python.
- 80. Implement a basic linear SVM from scratch using Python.
- 81. Implement an SVM in Python using a stochastic gradient descent approach.
- 82. Implement the Metropolis-Hastings algorithm for a simple Bayesian inference simulation.
- 83. Develop a Python function to convert a non-stationary time series into a stationary one.
- 84. Write an R script to conduct an ANOVA test on a given dataset.
- 85. Implement PCA for dimensionality reduction on a high-dimensional dataset in Python.
- 86. Create a Python simulation that compares the convergence speed of batch and stochastic gradient descent.
- 87. Write a Python function that minimizes a cost function using simulated annealing.
- 88. Implement a basic version of the RMSprop optimization algorithm in Python.
- 89. Create a stochastic gradient descent algorithm in Python for optimizing a logistic regression model.
- 90. Simulate annealing of the learning rate in gradient descent and plot the convergence over time.
- 91. Implement and visualize the optimization path of the Adam optimizer vs. vanilla gradient descent.
- 92. Optimize a K-NN model on a large dataset using faster neighbor search, such as approximate methods like LSH or tree-based indexes like k-d trees.
- 93. Write an algorithm to perform eigenvalue and eigenvector decomposition.
- 94. Create a Python script to solve a system of linear equations using NumPy.
- 95. Implement a multiple linear regression model using NumPy or similar libraries.
- 96. Create a Python script to calculate the variance inflation factor (VIF) for each predictor in a dataset.
- 97. Code a Python function to implement ridge regression using scikit-learn.
- 98. Use scikit-learn to perform cross-validation on a linear regression model and extract the test scores.
- 99. Implement a linear regression model to predict customer lifetime value using scikit-learn.
- 100. Develop a regularized regression model to analyze and predict healthcare costs.
- 101. Perform a time-series linear regression analysis on stock market data.
- 102. Create a Python script that tunes the hyperparameters of an elastic net regression model using grid search.
- 103. Write a Python function that incorporates polynomial features into a regression model for better fit and analyzes the trade-off with model complexity.
- 104. Modify a given t-SNE implementation to work more efficiently on a large-scale dataset.
- 105. Build a Python class that implements an adaptive learning rate algorithm, like Adam or AdaGrad, from scratch.
- Problems inspired by https://devinterview.io/questions/machine-learning-and-data-science
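
As a starting point, exercises 2 to 4 (matrix addition, subtraction, multiplication, transpose, and the L2 norm) can be sketched in plain Python using nested lists. This is only one possible approach, not a reference solution: the function names (`mat_add`, `mat_sub`, `mat_mul`, `transpose`, `l2_norm`) and the helper `_check_same_shape` are hypothetical, and the sketch assumes matrices are non-empty, rectangular lists of rows.

```python
import math


def _check_same_shape(a, b):
    """Raise ValueError unless a and b have the same number of rows and columns."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")


def mat_add(a, b):
    """Element-wise sum of two matrices of identical shape."""
    _check_same_shape(a, b)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Element-wise difference of two matrices of identical shape."""
    _check_same_shape(a, b)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(m):
    """Swap rows and columns: entry (i, j) moves to (j, i)."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    """Matrix product; the column count of a must equal the row count of b."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    cols_b = transpose(b)  # iterate over the columns of b as rows
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def l2_norm(v):
    """Euclidean (L2) norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(mat_add(A, B))
    print(mat_sub(A, B))
    print(mat_mul(A, B))
    print(transpose([[1, 2, 3], [4, 5, 6]]))
    print(l2_norm([3, 4]))
```

Computing the product through `transpose(b)` keeps the inner loop a simple dot product of two lists; an index-based triple loop would work equally well.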