Merge pull request #4 from tralynca/tralynca-patch-1
Intro to galaxy text file upload
pvanheus authored Mar 7, 2024
2 parents bc28b48 + 39b9279 commit 2eae8d3
Showing 23 changed files with 449 additions and 0 deletions.
31 changes: 31 additions & 0 deletions Intro_to_Galaxy_March_2024.txt
@@ -0,0 +1,31 @@
---
title: Introduction to Galaxy
---

### Getting started with Galaxy

In our training we will be using the [Galaxy](https://galaxyproject.org) bioinformatics platform. More specifically, we will be using the Galaxy **Europe** server, [usegalaxy.eu](https://usegalaxy.eu). Signing up for usegalaxy.eu is free. Select the _Login or Register_ item at the top right to sign up.

The Galaxy Training Network (GTN) provides tutorials on bioinformatics analysis and data science using Galaxy at <https://training.galaxyproject.org/>, and we will be using some of these materials in the course.

#### Introductory materials on Galaxy

What is Galaxy? See the slides [here](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/introduction/slides.html). To find out more about Galaxy and using Galaxy, read about [Galaxy Resources for Research](https://docs.google.com/presentation/d/1dgKt1KJEazVPLmUXoXDUKgQl4hu1-Mute_AhSt183lQ/edit#slide=id.p) and see some common [Questions and Answers about Galaxy](https://www.slideshare.net/kbradnam/13-questions-you-might-have-about-galaxy).

#### First hands-on with Galaxy

Firstly, if you are part of the March 2024 training at SANBI, log into usegalaxy.eu and click [this link](https://usegalaxy.eu/join-training/training-path-gen-march-24) to register as part of the training group. This will ensure that you have priority access to computing resources during our training. **Remember** that you must first **be logged in** to usegalaxy.eu before clicking this link.

To get used to the interface, start with [A Short Introduction To Galaxy](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-short/tutorial.html). This will get you used to uploading data, running tools, doing some basic work with Galaxy Histories and also give you a first introduction to Galaxy Workflows.

After completing that tutorial, proceed to [Galaxy 101 for Everyone](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-101-everyone/tutorial.html), which uses data analysis of the well-known (in the history of statistics) _Iris flower dataset_ as a slightly longer introduction to using Galaxy.

#### A note on Sharing Histories

When you are working with Galaxy, trainers might ask you to "share your history" so that they can see what you have been working on. This is similar to how one shares documents or folders on Google Docs or Dropbox. Sharing gives the person you are sharing with access to the data in your history, so that they can check on your progress and help you with any errors that you may encounter. Read this [FAQ Entry on How to Share a History](https://training.galaxyproject.org/training-material/faqs/galaxy/histories_sharing.html).





Binary file added img/10_Kraken_single_file_Galaxy.png
Binary file added img/11_Kraken_aggregate_new_Galaxy.png
Binary file added img/12_Extract_workflow_Galaxy.png
Binary file added img/13_Create_workflow_Galaxy.png
Binary file added img/14_Edit_workflow_Galaxy.png
Binary file added img/15_Expand_workflow_Galaxy.png
Binary file added img/16_Share_workflow_Galaxy.png
Binary file added img/17_Publish_workflow_Galaxy.png
Binary file added img/18_Import_public_workflow_Galaxy.png
Binary file added img/1_checked_samples_Galaxy.png
Binary file added img/20_SANBI_public_workflow_Galaxy.png
Binary file added img/2_UNchecked_samples_Galaxy.png
Binary file added img/3_Build_List_Dataset_Pairs_Galaxy.png
Binary file added img/4_Unpair_all_Galaxy.png
Binary file added img/5_Pair_forward_reverse_reads_Galaxy.png
Binary file added img/6_Create_collection_Galaxy.png
Binary file added img/7_Uncheck_box_Galaxy.png
Binary file added img/8_Search_fastp_Galaxy.png
Binary file added img/9_Fastp_parameters_Galaxy.png
43 changes: 43 additions & 0 deletions modules/galaxy-introduction/_posts/2024-02-13-GALAXY_topic_1.md
@@ -0,0 +1,43 @@
---
title: Introduction to Galaxy
---

### Getting started with Galaxy

In our training we will be using the [Galaxy](https://galaxyproject.org) bioinformatics platform.
<br>
<br>

> [!WARNING]
> We will be using the Galaxy **Europe** server, [usegalaxy.eu](https://usegalaxy.eu).
<br>

Signing up for usegalaxy.eu is free. Select the _Login or Register_ item at the top right to sign up.
<br>
<br>

The Galaxy Training Network (GTN) provides tutorials on bioinformatics analysis and data science using Galaxy at <https://training.galaxyproject.org/>, and we will be using some of these materials in the course.

<br>

---

#### Introductory materials on Galaxy

What is Galaxy? See the slides [here](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/introduction/slides.html). To find out more about Galaxy and using Galaxy, read about [Galaxy Resources for Research](https://docs.google.com/presentation/d/1dgKt1KJEazVPLmUXoXDUKgQl4hu1-Mute_AhSt183lQ/edit#slide=id.p) and see some common [Questions and Answers about Galaxy](https://www.slideshare.net/kbradnam/13-questions-you-might-have-about-galaxy).
<br>
<br>

#### First hands-on with Galaxy

Firstly, if you are part of the March 2024 training at SANBI, log into usegalaxy.eu and click [this link](https://usegalaxy.eu/join-training/training-path-gen-march-24) to register as part of the training group. This will ensure that you have priority access to computing resources during our training. **Remember** that you must first **be logged in** to usegalaxy.eu before clicking this link.

To get used to the interface, start with [A Short Introduction To Galaxy](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-short/tutorial.html). This will get you used to uploading data, running tools, doing some basic work with Galaxy Histories and also give you a first introduction to Galaxy Workflows.

After completing that tutorial, proceed to [Galaxy 101 for Everyone](https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-101-everyone/tutorial.html), which uses data analysis of the well-known (in the history of statistics) _Iris flower dataset_ as a slightly longer introduction to using Galaxy.
<br>
<br>

#### A note on Sharing Histories

When you are working with Galaxy, trainers might ask you to "share your history" so that they can see what you have been working on. This is similar to how one shares documents or folders on Google Docs or Dropbox. Sharing gives the person you are sharing with access to the data in your history, so that they can check on your progress and help you with any errors that you may encounter. Read this [FAQ Entry on How to Share a History](https://training.galaxyproject.org/training-material/faqs/galaxy/histories_sharing.html).

@@ -0,0 +1,121 @@
---
title: COLLECTIONS AND WORKFLOWS
---
## Making Collections in Galaxy

<br>

When doing bioinformatics analyses, it is common practice to process many samples at once. This often also requires grouping the samples.
- For example, paired short reads, such as Illumina reads, are usually grouped together. But grouping may extend to any way in which we wish to classify our data.


- Similarly, we may want to compare a set of drug-resistant strains to drug-sensitive strains. We would call each of these groupings of samples a dataset. In the same way, we may want to compare a dataset containing samples from Namibia to one containing samples from Cape Verde, etc.


Galaxy allows us to do this in one step. That is, we can make sure that the forward and reverse reads belonging to the same sample are paired. We can then also group these paired reads together into one dataset collection, so that we can run analyses on all of our samples at once, rather than doing them one by one.

<br>
Let’s see this in action.
<br>

<br>

> [!NOTE]
> The samples for this lesson can be found at https://zenodo.org/records/10760705
> You may copy this URL into your browser to download the zip file containing all the files, and then upload the files to Galaxy from your local machine.
> Alternatively, you may paste the links below for each file into Galaxy.
> However, you will need to rename the files in Galaxy if you use the second method.
```
https://zenodo.org/records/10760705/files/SRR24446250_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446250_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446252_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446252_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446254_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446254_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446261_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446261_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446273_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446273_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446275_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446275_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446288_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446288_2.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446302_1.fastq.gz?download=1
https://zenodo.org/records/10760705/files/SRR24446302_2.fastq.gz?download=1
```
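
If you use the second method, it helps to know which clean file name belongs to each link. The following is a minimal Python sketch, not part of Galaxy itself, that rebuilds the link list above from the sample accessions and derives a short name for each link by dropping the path and the `?download=1` suffix. The helper names `build_urls` and `clean_name` are hypothetical and chosen only for this illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Base location of the files in the Zenodo record used in this lesson.
ZENODO_FILES = "https://zenodo.org/records/10760705/files"

# Sample accessions taken from the link list above.
SAMPLES = [
    "SRR24446250", "SRR24446252", "SRR24446254", "SRR24446261",
    "SRR24446273", "SRR24446275", "SRR24446288", "SRR24446302",
]


def build_urls(samples):
    """Return the forward (_1) and reverse (_2) download link for each sample."""
    return [
        f"{ZENODO_FILES}/{sample}_{read}.fastq.gz?download=1"
        for sample in samples
        for read in (1, 2)
    ]


def clean_name(url):
    """Return only the file name part of a link, without path or query string."""
    return PurePosixPath(urlsplit(url).path).name


urls = build_urls(SAMPLES)
for url in urls:
    # One line per file: the clean name, a tab, then the link to paste.
    print(f"{clean_name(url)}\t{url}")
```

The printed names (for example `SRR24446250_1.fastq.gz`) are the ones you would give the datasets when renaming them in your History.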




**1.** **Import** the selection of cholera samples


**2.** Rename your History to **Cholera paired collection**
<br>
![checked_samples_Galaxy](/img/1_checked_samples_Galaxy.png)
<br>
<br>

**3.** Just above your imported samples, you will see a box that is checked.
<br>
Uncheck that box so that you can select each sample individually.
<br>
![UNchecked_samples_Galaxy](/img/2_UNchecked_samples_Galaxy.png)
<br>
<br>
**4.** Click on the **Select All** button on the right.
<br>
<br>
**5.** Click the drop-down arrow in the box that says **All 10 selected**
<br>
<br>
**6.** Choose the option that says **Build List of Dataset Pairs**
<br>
![Build_List_Dataset_Pairs](/img/3_Build_List_Dataset_Pairs_Galaxy.png)
<br>
<br>
<br>
This new box allows you to pair your samples by looking for any common pattern that will allow you to select all samples belonging to one group and separate them from another.
<br>
<br>
In our case, the algorithm detected that the forward reads all have a **_1** and the reverse reads all have a **_2** in their names. So, it has already paired our forward and reverse reads for us based on this pattern.
<br>
<br>
But the algorithm may not get it right if the samples are named differently. For example, forward and reverse reads are sometimes labelled with R1 and R2 to distinguish read pairs. Let's look at the view we would get in that situation.
<br>
<br>
<br>
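
To make the idea of pairing by a naming pattern concrete, here is a small Python sketch. It is only an illustration of suffix-based pairing under simple assumptions (file names end in `.fastq` or `.fastq.gz`, and the read tag sits directly before the extension); it is not Galaxy's own implementation. The function name `pair_reads` and its parameters are hypothetical.

```python
import re
from collections import defaultdict


def pair_reads(names, forward="_1", reverse="_2"):
    """Pair file names that share a sample name and differ only in the read tag.

    Returns a dict {sample: (forward_name, reverse_name)} and a list of
    names that could not be paired.
    """
    pattern = re.compile(
        rf"^(?P<sample>.+?)(?P<tag>{re.escape(forward)}|{re.escape(reverse)})"
        r"\.fastq(\.gz)?$"
    )
    found = defaultdict(dict)
    unpaired = []
    for name in names:
        match = pattern.match(name)
        if match is None:
            # The name does not follow the expected pattern at all.
            unpaired.append(name)
            continue
        role = "forward" if match.group("tag") == forward else "reverse"
        found[match.group("sample")][role] = name

    pairs = {}
    for sample, reads in found.items():
        if "forward" in reads and "reverse" in reads:
            pairs[sample] = (reads["forward"], reads["reverse"])
        else:
            # Only one read of the pair was present.
            unpaired.extend(reads.values())
    return pairs, unpaired


example = ["SRR24446250_1.fastq.gz", "SRR24446250_2.fastq.gz", "other_R1.fastq.gz"]
pairs, unpaired = pair_reads(example)
print(pairs)
print(unpaired)
```

With the default `_1` and `_2` tags, the file labelled with `R1` is left unpaired. Calling `pair_reads(names, forward="_R1", reverse="_R2")` would handle that naming scheme, which mirrors typing **R1** and **R2** into the two filter boxes in Galaxy.
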
**7.** So let’s click on **Unpair all**, which is just above the green coloured samples
![Unpair_all](/img/4_Unpair_all_Galaxy.png)
<br>
<br>
Your box should now look like this.
<br>

- If your samples were named differently and Galaxy couldn’t pair them successfully, the samples would all appear at the top, and you would have to find a pattern in your sample names that allows the correct classification. For example, you could enter **R1** in the box on the top left and **R2** in the box on the top right.
<br>

![Pair_forward_reverse_reads](/img/5_Pair_forward_reverse_reads_Galaxy.png)
<br>
<br>
<br>
**8.** You can manually pair the set of reads by clicking the white button in the middle that says **Pair these datasets**, and it will move them all to the bottom as they were before.
<br>
<br>

**9.** **BEFORE** clicking on **Create Collection** at the bottom, give your collection a name, for example “Cholera paired reads”. Then click on **Create Collection**.
![Create_collection](/img/6_Create_collection_Galaxy.png)
<br>
<br>
<br>
**10.** You now have all your samples grouped. You can deselect the collection by unchecking the box that you originally selected in step 3.
<br>
![Uncheck_box](/img/7_Uncheck_box_Galaxy.png)
<br>
<br>

If you ever need to revisit this topic, you can follow the tutorial [Using dataset collections](https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/collections/tutorial.html).