Reorganize folders of DataScienceTutorials.jl #218

Merged
merged 10 commits into from
Apr 17, 2024
Changes from 9 commits
2 changes: 0 additions & 2 deletions Project_old.toml

This file was deleted.

155 changes: 89 additions & 66 deletions README.md
@@ -1,12 +1,12 @@
# DataScienceTutorials.jl

This repository contains the source code for a [set of tutorials](https://juliaai.github.io/DataScienceTutorials.jl/) introducing the use of Julia and Julia packages such as MLJ (but not only) to do "data science" in Julia.
This repository contains the source code for a [set of tutorials](https://juliaai.github.io/DataScienceTutorials.jl/) introducing the use of Julia and Julia packages such as MLJ, among others, to do "data science" in Julia.

## For readers
## 📖 For readers

You can read the tutorials [online](https://juliaai.github.io/DataScienceTutorials.jl/).

You can find a runnable script for each tutorial at the top of each tutorial page along with a `Project.toml` and a `Manifest.toml` you can use to re-create the exact environment that was used to run the tutorial.
You can find a runnable script for each tutorial linked at the top of each tutorial page along with a `Project.toml` and a `Manifest.toml` you can use to re-create the exact environment that was used to run the tutorial.

To do so, save both files in an appropriate folder, start Julia, `cd` to the folder and

@@ -18,106 +18,129 @@ Pkg.instantiate()

**Note**: you are strongly encouraged to [open issues](https://github.com/juliaai/DataScienceTutorials.jl/issues/new) on this repository indicating points that are unclear or could be better explained, help us have great tutorials!

## For developers
## 👩‍💻 For developers

The rest of these instructions assume that you've cloned the package and have `cd` to it.

### Structure
### 📂 Structure
The following are the folders relevant to pages on the website:
```
├── _literate
│ ├── data # has "Data Basics" tutorials
│ ├── getting-started # has "Getting Started" tutorials
│ ├── isl # has "Introduction to Statistical Learning" tutorials
│ ├── end-to-end # has "End-to-End" tutorials
│ └── advanced # has "Advanced" tutorials
├── data # This and the four folders below import content from "_literate" to the website
├── getting-started
├── isl
├── end-to-end
├── advanced
├── info # has markdown files corresponding to info pages
├── index.md # has markdown for the landing page
├── search.md # has markdown for the search page
├── routes.json # has all the navigation bar data
├── collapse-script.jl # script that adds collapsible sections to tutorials
├── deploy.jl # deployment script
└── Project.toml # For the project's environment
```
To understand the rest of the structure, which could help you change styles with CSS or add interaction with JavaScript, read the relevant page on [Franklin's documentation](https://franklinjl.org/workflow/).

All tutorials correspond to a Literate script that's in `_literate/`.
### 👨🏻‍🔧 Modifying an existing tutorial

### Fixing an existing tutorial
* Find the corresponding Julia script in `_literate` and fix it in a PR.
* Ensure it works and renders properly as explained in [this section](#👀-visualise-modifications-locally).
Member

This internal link (and another like it) do not seem to work, at least in the preview.

Collaborator Author

Well, I followed this, which claims it works on GitHub, and it does work locally for me. Let's merge, and if it doesn't work I will make a PR right away when I see it to replace it with "as in the... section below" or similar.


Find the corresponding script, fix it in a PR.

### Add a new tutorial
### Add a new tutorial

* Duplicate the folder `EX-wine`.
* Change its name:
* `EX-somename` for an "end-to-end" tutorial `somename`
* `A-somename` for a "getting started" tutorial `somename`
* `D0-somename` for a "data" tutorial `somename`
* `ISL-lab-x` for an "Introduction to Statistical Learning" tutorial
* Go to the appropriate folder inside `_literate` depending on the category of the tutorial as described above
* Duplicate one of the tutorials as a starting point.
* Remove `Manifest.toml` and `Project.toml`
* Activate that folder and add the packages that you'll need (MLJ, ...)
* Create and activate an environment in that folder and add the packages that you'll need (MLJ, ...)
* Write your tutorial following the blueprint
* Run `julia collapse-script.jl` to add necessary Franklin syntax to your tutorial to make sections in it collapsible like other tutorials
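
The "create and activate an environment" step above can be sketched as follows. This is a minimal sketch, not the project's own tooling: `setup_tutorial_env` and the example folder path are hypothetical names chosen for illustration, and only standard `Pkg` calls are used.

```julia
using Pkg

# Hypothetical helper (not part of the repository): activate a dedicated
# environment in a tutorial folder and optionally add packages to it.
function setup_tutorial_env(folder::AbstractString; packages::Vector{String}=String[])
    mkpath(folder)                           # make sure the folder exists
    Pkg.activate(folder)                     # switch the active project to this folder
    isempty(packages) || Pkg.add(packages)   # writes Project.toml and Manifest.toml
    return Base.active_project()             # path of the now-active Project.toml
end

# Example (needs network access; the folder name is an assumption):
# setup_tutorial_env("_literate/end-to-end/dinosaurs"; packages=["MLJ"])
```

Calling the helper without `packages` only switches the active project; the `.toml` files are written once a package is added.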

**Note**: your tutorial **must** "just work" otherwise it will be ignored, in other words, we should be able to just copy the folder containing your `.jl` and `.toml` files, and run it without having to do anything special.

Once all that's done, the remaining things to do are to create the HTML page and a link in the appropriate location. Let's assume you wanted to add an E2E tutorial "Dinosaurs" then in the previous step you'd have `EX-dinosaurs` and you would
> [!IMPORTANT]
> Your tutorial **must** "just work"; otherwise it will be ignored. In other words, any Julia user should be able to copy the folder containing your `.jl` and `.toml` files and run it without having to do anything special.

* create a file `dinosaurs.md` in `end-to-end/` by duplicating the `end-to-end/wine.md` and changing the reference in it to `\tutorial{EX-dinosaurs}`
* add a link pointing to that tutorial in `_libs/nav/head.js` following the template so your tutorial shows in the navigation bar
* lastly, to make sections in your tutorial collapsible like other tutorials run the `collapse-script.jl` file via `julia collapse-script.jl`
Once all that's done, the remaining steps are to create the HTML page and a link in the appropriate location. Suppose you want to add an E2E tutorial "Dinosaurs", so that `_literate/end-to-end/dinosaurs.jl` exists. You would then:

* Create a file `dinosaurs.md` in the top-level folder `end-to-end/` by duplicating `end-to-end/wine.md` and changing the reference in it to `\tutorial{end-to-end/dinosaurs}`
* Add a link pointing to that tutorial in `routes.json` following the template so your tutorial shows in the navigation bar
* Ensure your tutorial renders correctly as explained in the [next section](#👀-visualise-modifications-locally).
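
To keep the three names in the steps above consistent, they can be derived from the category and tutorial name. The helper below is a hypothetical sketch, not something the repository provides. The `href` shape is an assumption based on the entries previously hard-coded in `_libs/nav/head.js` (for example `/end-to-end/wine/`); check the existing entries in the navigation data before relying on it.

```julia
# Hypothetical helper collecting the names the steps above expect for a
# tutorial `name` in a `category` folder such as "end-to-end".
function tutorial_artifacts(category::AbstractString, name::AbstractString)
    return (
        page = joinpath(category, "$name.md"),       # markdown page to create
        reference = "\\tutorial{$category/$name}",   # line to put inside that page
        href = "/$category/$name/",                  # assumed link target for the navigation entry
    )
end

a = tutorial_artifacts("end-to-end", "dinosaurs")
println(a.reference)   # prints \tutorial{end-to-end/dinosaurs}
```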

### Publishing updates
> [!NOTE]
> For plots, we do prefer that you use `Plots.jl` with the default backend. In general, try to avoid having Python as a dependency in your tutorial.

**Assumptions**:
<details>
<summary> For more information about adding plots</summary>
<br>
Follow the pattern in existing tutorials; finish a code block defining a plot with:

* you have a PR with changes, someone has reviewed them and they got merged into the main branch
```julia
savefig(joinpath(@OUTPUT, "MyTutorial-Fig1.svg")) # hide

* Be sure the version of Julia declared near the top of `index.md`
Collaborator Author

This refers to the old index.md. The part referred to is now in how-to-run-code, hence the change.

matches the version used to generate the web-site (which should
match the version declared in each tutorial's Manifest.toml file)
# \figalt{the alt here}{MyTutorial-Fig1.svg}
```

Here "the alt here" is the text that appears if there is a problem rendering the
figure. Please do not use anything other than SVG; please also stick to
this path and start the name of the file with the name of the tutorial
(to help keep files organised).

**Once the changes are in the main branch**:
Do not forget to add the `# hide` which will ensure the line is not displayed on the website, notebook, or script.
</details>

* run `cd("path/to/DataScienceTutorials"); using Franklin` to launch Franklin
* run `serve(single=true, verb=true)` to ensure no issues generating the relevant html pages with code block evaluations, and then run `serve()` (after restarting) to serve the pages live on a local browser for viewing
* run `include("deploy.jl")` to re-generate the LUNR index and push the changes to GitHub.
### 👀 Visualise modifications locally

The second step requires you have `lunr` and `cheerio` installed, if not:
```julia
using Pkg
Pkg.activate(".")
Pkg.instantiate()

```
using NodeJS
run(`sudo $(npm_cmd()) i lunr cheerio`)
using Franklin
serve()
```

This should take ≤ 15 seconds to complete.
This makes Franklin re-evaluate some of the code based on the changes, which may take some time; progress is indicated in the REPL. Once it finishes, it opens the browser to render the updated website.

---
**Note**:
- If you change some of the code while `serve()` is running, that is fine: Franklin will detect it and trigger an update of the relevant web pages (after evaluating the new code).

# Old instructions (still valid)
- This may generate some files under `__site`. Don't push them in your PR, as they will be pushed upon deployment.

### Visualise modifications locally
- Avoid modifying the literate file, killing the Julia session, then calling `serve()`; that sequence can cause weird issues where Julia will complain about the age of the world...

```julia
cd("path/to/DataScienceTutorials")
using Franklin
serve()
```

If you have changed the *code* of some of the literate scripts, Franklin will need to re-evaluate some of the code which may take some time, progress is indicated in the REPL.
### 🚀 Publishing updates

If you decide to change some of the code while `serve()` is running, this is fine, Franklin will detect it and trigger an update of the relevant web pages (after evaluating the new code).
**Assumptions**:

**Notes**:
- avoid modifying the literate file, killing the Julia session, then calling `serve()` that sequence can cause weird issues where Julia will complain about the age of the world...
- the `serve()` command above activates the environment.
* you have a PR with changes, someone has reviewed them and they got merged into the master branch

### Plots
* Be sure the version of Julia declared [here](https://juliaai.github.io/DataScienceTutorials.jl/how-to-run-code/)
matches the version used to generate the web-site (which should
match the version declared in each tutorial's Manifest.toml file)
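
One way to check the last assumption is to read the version recorded in a tutorial's `Manifest.toml`. This is a sketch under stated assumptions: `manifest_julia_version` and `same_minor` are hypothetical helpers, the manifest is assumed to carry a top-level `julia_version` entry (older manifest formats may not have one), and comparing only major and minor versions is a choice made here for illustration, not a project rule.

```julia
using TOML

# Return the Julia version recorded in a Manifest.toml, or `nothing` when the
# manifest has no top-level `julia_version` entry.
function manifest_julia_version(path::AbstractString)
    manifest = TOML.parsefile(path)
    v = get(manifest, "julia_version", nothing)
    return v === nothing ? nothing : VersionNumber(v)
end

# Compare major and minor only, assuming patch-level differences are acceptable.
same_minor(a::VersionNumber, b::VersionNumber) = (a.major, a.minor) == (b.major, b.minor)

# Example (the path is an assumption):
# v = manifest_julia_version("Manifest.toml")
# v === nothing || same_minor(v, VERSION) || @warn "Manifest was resolved with Julia $v"
```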

For the moment, plots are done with `PyPlot.jl` (though you're not restricted to use it).
It's best not to use `Plots.jl` because the loading time would risk making full updates of the webpage annoyingly slow.

In order to display a plot, finish a code block defining a plot with
**Once the changes are in the master branch:**

* Run `cd("path/to/DataScienceTutorials"); using Franklin` to launch Franklin
* In case you don't have `lunr` and `cheerio` installed already, also do:
```julia
using NodeJS
run(`sudo $(npm_cmd()) i lunr cheerio`)
```
savefig(joinpath(@OUTPUT, "MyTutorial-Fig1.svg")) # hide
* Run `serve(single=true, verb=true)` to check that the relevant HTML pages, including their code block evaluations, generate without issues
* Run `serve()` (after restarting) to serve the pages live on a local browser for viewing
* Run `include("deploy.jl")`, which re-generates the LUNR index and automatically pushes the changes to GitHub.

# \figalt{the alt here}{MyTutorial-Fig1.svg}
```

Here "the alt here" is the text that appears if there is problem rendering the
figure. Please do not use anything else than SVG; please also stick to
this path and start the name of the file with the name of the tutorial
(to help keep files organised).
This should take ≤ 15 seconds to complete.

Do not forget to add the `# hide` which will ensure the line is not displayed on the website, notebook, or script.
### 🕵🏽 Troubleshooting

### Troubleshooting

#### Stale files

@@ -131,17 +154,17 @@ save the file, wait for Franklin to complete its update and then remove it (othe

If you get an "age of the world" error, the `reeval` steps above usually work as well.

If you want to force the reevaluation of everything once, restart a Julia session and use
If you want to force the reevaluation of all tutorials at once, restart a Julia session and use

```julia
serve(; eval_all=true)
```

note that this will take a while.

#### Merge conflicts
#### Merge conflicts or Missing Styles

If you get merge conflicts, do
If you get merge conflicts or have new website styles that seem to be missing after `serve()`, do

```julia
cleanpull()
2 changes: 1 addition & 1 deletion _layout/foot.html
@@ -5,7 +5,7 @@
<script src="/libs/collapse/collapse.js"></script>
<script src="/libs/pure/ui.min.js"></script>
<!-- head and footer-nav -->
<script src="/libs/nav/head.js"></script>
<script src="/libs/nav/head.js" type="module"></script>
<!-- landing page -->
<script src="/libs/landing/landing.js"></script>
<!-- navigation bar -->
2 changes: 1 addition & 1 deletion _libs/landing/landing.js
@@ -51,6 +51,6 @@ document.addEventListener("DOMContentLoaded", function() {
// Add click event listener to the div element
element.addEventListener('click', function(event) {
// Change the location to "/how-to-run-code"
window.location.href = (hosted) ? origin + "/DataScienceTutorials.jl" + "/how-to-run-code" : "/how-to-run-code";
window.location.href = (hosted) ? origin + "/DataScienceTutorials.jl" + "/info/how-to-run-code" : "/info/how-to-run-code";
});
});
82 changes: 3 additions & 79 deletions _libs/nav/head.js
@@ -1,80 +1,4 @@

const navItems = [
{ name: 'Home', href: '/', sections: [], sectionItemWidth: '', id: 'home' },
{
name: 'Data Basics', // Category name to be shown in navigation bar
id: 'data', // id to be manipulated with js
href: '/data', // in case it should link anywhere
sections: [ // list items to be shown in its dropdown and where they link to
{ name: 'Loading Data', href: '/data/loading/' },
{ name: 'Data Frames', href: '/data/dataframe/' },
{ name: 'Categorical Arrays', href: '/data/categorical/' },
{ name: 'Scientific Type', href: '/data/scitype/' },
{ name: 'Data processing', href: '/data/processing/' },
],
sectionItemWidth: 'short-item'
},
{
name: 'Getting Started',
id: 'getting-started',
href: '/getting-started',
sections: [
{ name: 'Choosing a model', href: '/getting-started/choosing-a-model/' },
{ name: 'Fit, predict, transform', href: '/getting-started/fit-and-predict/' },
{ name: 'Model tuning', href: '/getting-started/model-tuning/' },
{ name: 'Ensembles', href: '/getting-started/ensembles/' },
{ name: 'Ensembles (2)', href: '/getting-started/ensembles-2/' },
{ name: 'Composing models', href: '/getting-started/composing-models/' },
{ name: 'Stacking', href: '/getting-started/stacking/' },
],
sectionItemWidth: 'medium-item'
},
{
name: 'Intro to Stats Learning',
id: 'stats-learning',
href: '/isl',
sections: [
{ name: 'Lab 2', href: '/isl/lab-2/' },
{ name: 'Lab 3', href: '/isl/lab-3/' },
{ name: 'Lab 4', href: '/isl/lab-4/' },
{ name: 'Lab 5', href: '/isl/lab-5/' },
{ name: 'Lab 6b', href: '/isl/lab-6b/' },
{ name: 'Lab 8', href: '/isl/lab-8/' },
{ name: 'Lab 9', href: '/isl/lab-9/' },
{ name: 'Lab 10', href: '/isl/lab-10/' },
],
sectionItemWidth: 'long-item'
},

{
name: 'End to End',
id: 'end-to-end',
href: '/end-to-end',
sections: [
{ name: 'Telco Churn', href: '/end-to-end/telco/' },
{ name: 'AMES', href: '/end-to-end/AMES/' },
{ name: 'Wine', href: '/end-to-end/wine/' },
{ name: 'Crabs (XGB)', href: '/end-to-end/crabs-xgb/' },
{ name: 'Horse', href: '/end-to-end/horse/' },
{ name: 'King County Houses', href: '/end-to-end/HouseKingCounty/' },
{ name: 'Airfoil', href: '/end-to-end/airfoil' },
{ name: 'Boston (lgbm)', href: '/end-to-end/boston-lgbm' },
{ name: 'Using GLM.jl', href: '/end-to-end/glm/' },
{ name: 'Power Generation', href: '/end-to-end/powergen/' },
{ name: 'Boston (Flux)', href: '/end-to-end/boston-flux' },
{ name: 'Breast Cancer', href: '/end-to-end/breastcancer' },
{ name: 'Credit Fraud', href: '/end-to-end/creditfraud' },
],
sectionItemWidth: 'long-item'
},
{
name: 'Advanced',
id: 'advanced',
href: '#!',
sections: [{ name: 'Ensembles (3)', href: '/advanced/ensembles-3' }],
sectionItemWidth: 'medium-item'
},
];
import navItems from '../../routes.json' assert {type: 'json'};

// first get info on whether hosted or not
const origin = window.location.origin;
@@ -130,7 +54,7 @@ navItems.forEach((item) => {

// add a final li as searchform
let formAction = (hosted) ? origin + "/DataScienceTutorials.jl" + "/search/index.html" : "/search/index.html";
searchForm = `
let searchForm = `
<li>
<form id="lunrSearchForm" name="lunrSearchForm" style="margin-left: 1.5rem; margin-right: -2rem;">
<input class="search-input" name="q" placeholder="Search..." type="text">
@@ -192,7 +116,7 @@ generateSidebar(navItems);
// Flatten the nav items so we can easily iterate through them
function flattenNavItems(items) {
return items.reduce((acc, item) => {
const mainHrefs = ["/data", "/end-to-end", "/getting-started", "/isl", "#!"];
const mainHrefs = ["/info/data", "/info/end-to-end", "/info/getting-started", "/info/isl", "#!"];
if (!mainHrefs.includes(item.href)) {
acc.push(item);
}
File renamed without changes. (37 files)
2 changes: 1 addition & 1 deletion advanced/ensembles-3.md
@@ -3,4 +3,4 @@

# Ensemble models 3 (learning networks)

\tutorial{ADV-ensembles-3}
\tutorial{advanced/ensembles-3}
30 changes: 16 additions & 14 deletions collapse-script.jl
@@ -71,22 +71,24 @@ function read_tutorials(tutorials_dir)

# Iterate over all files named "tutorial.jl" in subdirectories
for subdir in readdir(tutorials_dir)
file_path = joinpath(tutorials_dir, subdir, "tutorial.jl")
if isfile(file_path)
try
file = open(file_path, "r")
content = read(file, String)
close(file)
file = open(file_path, "w")
modified_content = introduce_dropdowns(content)
write(file, modified_content)
close(file)
# Store content in the dictionary
catch e
throw("Error reading file '$file_path': $(e)")
for tutorial_subdir in readdir(joinpath(tutorials_dir, subdir))
file_path = joinpath(tutorials_dir, subdir, tutorial_subdir, "tutorial.jl")
if isfile(file_path)
try
file = open(file_path, "r")
content = read(file, String)
close(file)
file = open(file_path, "w")
modified_content = introduce_dropdowns(content)
write(file, modified_content)
close(file)
# Store content in the dictionary
catch e
throw("Error reading file '$file_path': $(e)")
end
end
end
end
end
end
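
The two-level loop in this hunk (category folder, then tutorial folder) can be sketched as a dry run that only lists the files it would rewrite. `find_tutorial_scripts` is a hypothetical name, and the `isdir` guard is an addition not present in the hunk: `readdir` throws when given a regular file, so a stray file directly under the tutorials folder would otherwise abort the walk.

```julia
# List every `tutorial.jl` found two levels below `tutorials_dir`
# (category folder, then tutorial folder), without modifying anything.
function find_tutorial_scripts(tutorials_dir::AbstractString)
    scripts = String[]
    for category in readdir(tutorials_dir)
        category_dir = joinpath(tutorials_dir, category)
        isdir(category_dir) || continue      # skip stray files at the top level
        for tutorial in readdir(category_dir)
            script = joinpath(category_dir, tutorial, "tutorial.jl")
            isfile(script) && push!(scripts, script)
        end
    end
    return scripts
end
```

Running it on `_literate` before the real script gives a quick view of which tutorials would be touched.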


2 changes: 1 addition & 1 deletion data/categorical.md
@@ -3,4 +3,4 @@

# Handling categorical data

\tutorial{D0-categorical}
\tutorial{data/categorical}