Commit

Deployed 7d94583 with MkDocs version: 1.5.2
imarranz committed Jul 16, 2024
1 parent b6574ee commit 436cf60
Showing 6 changed files with 89 additions and 64 deletions.
13 changes: 13 additions & 0 deletions 05_acquisition/0580_data_acquisition_and_preparation.html
@@ -229,6 +229,7 @@
<div class="section" itemprop="articleBody">

<h2 id="references">References<a class="headerlink" href="#references" title="Permanent link">#</a></h2>
<h3 id="books">Books<a class="headerlink" href="#books" title="Permanent link">#</a></h3>
<ul>
<li>
<p>Smith CA, Want EJ, O'Maille G, et al. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification." Analytical Chemistry, vol. 78, no. 3, 2006, pp. 779-787.</p>
@@ -239,6 +240,18 @@ <h2 id="references">References<a class="headerlink" href="#references" title="Pe
<li>
<p>Pluskal T, Castillo S, Villar-Briones A, Oresic M. "MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data." BMC Bioinformatics, vol. 11, no. 1, 2010, p. 395.</p>
</li>
</ul>
<h3 id="software">Software<a class="headerlink" href="#software" title="Permanent link">#</a></h3>
<ul>
<li>
<p><a href="https://pypi.org/project/ydata-profiling/">ydata-profiling</a></p>
</li>
<li>
<p><a href="https://pypi.org/project/pandas-profiling/">pandas-profiling</a></p>
</li>
<li>
<p><a href="https://dataprep.ai/">dataprep</a></p>
</li>
</ul>
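The three profiling tools listed above (ydata-profiling, its predecessor pandas-profiling, and dataprep) all automate the same first pass over a DataFrame: dtypes, missing values, cardinality, and distributions. A minimal pandas-only sketch of that summary, using an illustrative frame:

```python
# Sketch: the kind of column-level summary that ydata-profiling and
# dataprep automate. The DataFrame here is a made-up example.
import pandas as pd

df = pd.DataFrame({
    "age": [25, 32, None, 41],
    "group": ["a", "b", "a", "a"],
})

summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "unique": df.nunique(),
})
print(summary)
```

With ydata-profiling installed, the full automated report is a one-liner: `ProfileReport(df).to_file("report.html")`; dataprep exposes a similar `create_report(df)`.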

</div>
12 changes: 12 additions & 0 deletions 07_modelling/073_modeling_and_data_validation.html
@@ -245,6 +245,12 @@ <h3 id="regression_modeling">Regression Modeling<a class="headerlink" href="#reg
<li>
<p><strong>Gradient Boosting</strong>: Gradient Boosting is another ensemble technique that combines weak learners to create a strong predictive model. It sequentially fits new models to correct the errors made by previous models. Gradient Boosting algorithms like XGBoost and LightGBM are popular for their high predictive accuracy.</p>
</li>
<li>
<p><strong>Lasso Regression (Least Absolute Shrinkage and Selection Operator)</strong>: A variant of linear regression that adds an L1 penalty to the cost function, which can shrink some coefficients to exactly zero and thus performs feature selection. It is particularly useful for data with multicollinearity, or when model interpretability benefits from discarding less important variables.</p>
</li>
<li>
<p><strong>Support Vector Regression (SVR)</strong>: Based on the principles of Support Vector Machines, SVR handles both linear and non-linear relationships between the independent variables and the dependent variable. It adapts the margin-maximization idea to regression, fitting a function that keeps most training points within a tolerance band around the prediction.</p>
</li>
</ul>
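As a quick sketch, the three regressors added in this hunk can be compared with scikit-learn. The synthetic dataset and hyperparameters below are illustrative assumptions, not taken from the book:

```python
# Sketch: fit the regression models named above on synthetic linear data.
# n_informative=5 leaves five pure-noise features for the Lasso to discard.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X, y = make_regression(
    n_samples=200, n_features=10, n_informative=5, noise=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "lasso": Lasso(alpha=1.0),  # L1 penalty drives some coefficients to zero
    "svr": SVR(kernel="rbf"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 3))  # R^2 on held-out data
```

On data like this, inspecting `models["lasso"].coef_` shows the noise features zeroed out, which is the feature-selection behaviour described above.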
<h3 id="classification_modeling">Classification Modeling<a class="headerlink" href="#classification_modeling" title="Permanent link">#</a></h3>
<p>For classification problems, the objective is to predict a categorical or discrete class label. The choice of classification algorithm depends on factors such as the nature of the data, the number of classes, and the desired interpretability. Here are some commonly used classification algorithms:</p>
@@ -253,6 +259,12 @@ <h3 id="classification_modeling">Classification Modeling<a class="headerlink" hr
<p><strong>Logistic Regression</strong>: Logistic regression is a popular algorithm for binary classification. It models the probability of belonging to a certain class using a logistic function. Logistic regression can be extended to handle multi-class classification problems.</p>
</li>
<li>
<p><strong>K-Nearest Neighbors (KNN)</strong>: A simple and effective algorithm that classifies a new case based on a majority vote of its 'k' nearest neighbors. It is easy to implement and understand but can become computationally expensive as the dataset size grows.</p>
</li>
<li>
<p><strong>Linear Discriminant Analysis (LDA)</strong>: A statistical method used in pattern recognition that finds a linear combination of features that best separates two or more classes of objects or events. It is also widely used as a supervised dimensionality reduction step before classification.</p>
</li>
<li>
<p><strong>Support Vector Machines (SVM)</strong>: SVM is a powerful algorithm for both binary and multi-class classification. It finds a hyperplane that maximizes the margin between different classes. SVMs can handle complex decision boundaries and are effective with high-dimensional data.</p>
</li>
<li>
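The classifiers named in this hunk can likewise be sketched with scikit-learn. The iris dataset and the hyperparameters below are illustrative assumptions for comparison, not choices made in the book:

```python
# Sketch: fit the classification models named above on the iris dataset
# and report held-out accuracy for each.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),  # majority vote of 5 neighbours
    "lda": LinearDiscriminantAnalysis(),
    "svm": SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, round(clf.score(X_test, y_test), 3))  # held-out accuracy
```

All four are strong on a small, well-separated dataset like iris; differences in cost and interpretability (e.g. KNN's prediction-time expense, LDA's linear boundaries) show up on larger or messier data.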
2 changes: 1 addition & 1 deletion index.html
@@ -354,5 +354,5 @@ <h3>Model Implementation and Maintenance</h3>

<!--
MkDocs version : 1.5.2
Build Date UTC : 2024-06-13 09:56:32.782335+00:00
Build Date UTC : 2024-07-16 10:56:47.114442+00:00
-->
2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.
