diff --git a/CHANGELOG.md b/CHANGELOG.md index 22b2172..ca4452e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,9 @@ # Changelog +## 0.1.4 - 2023-10-11 + ++ Update - Markdown explanations within notebooks `01` and `02` + ## 0.1.3 - 2023-09-14 + Add - GitHub Actions to build Docker image and push to DockerHub diff --git a/README.md b/README.md index bc48157..7ed48e4 100644 --- a/README.md +++ b/README.md @@ -1,21 +1,24 @@ # Welcome to DataJoint tutorials! -DataJoint is an open-source library for science labs to design and build data pipelines for automated data analysis and sharing. +DataJoint is an open-source library for scientific research labs to design and build +data pipelines for automated data analysis and sharing. -This document will guide you as a new DataJoint user through interactive tutorials organized in [Jupyter notebooks](https://jupyter-notebook.readthedocs.io/en/stable/) and written in [Python](https://www.python.org/). +This document will guide you through interactive tutorials written in +[Python](https://www.python.org/) and organized in [Jupyter +notebooks](https://jupyter-notebook.readthedocs.io/en/stable/). *Please note that these hands-on DataJoint tutorials are friendly to non-expert users, and advanced programming skills are not required.* ## Table of contents -- In the [tutorials](./tutorials) folder are interactive Jupyter notebooks to learn DataJoint. The calcium imaging and electrophysiology tutorials provide examples of defining and interacting with data pipelines. In addition, some fill-in-the-blank sections are included for you to code yourself! +- The [tutorials](./tutorials) folder contains interactive Jupyter notebooks designed to teach DataJoint. The calcium imaging and electrophysiology tutorials provide examples of defining and interacting with data pipelines. In addition, some fill-in-the-blank sections are included for you to code yourself! 
- 01-DataJoint Basics - 02-Calcium Imaging Imported Tables - 03-Calcium Imaging Computed Tables - 04-Electrophysiology Imported Tables - 05-Electrophysiology Computed Tables -- In the [completed_tutorials](./completed_tutorials) folder are Jupyter notebooks with the code sections completed and solved. +- The [completed_tutorials](./completed_tutorials) folder contains Jupyter notebooks with all code sections completed and solved. - You will find the following notebooks in the [short_tutorials](./short_tutorials) folder: - DataJoint in 30min diff --git a/completed_tutorials/03-Calcium Imaging Computed Tables.ipynb b/completed_tutorials/03-Calcium Imaging Computed Tables.ipynb index 26a181a..0fd34f6 100644 --- a/completed_tutorials/03-Calcium Imaging Computed Tables.ipynb +++ b/completed_tutorials/03-Calcium Imaging Computed Tables.ipynb @@ -811,7 +811,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The outcome is different across different threholds we set. Therefore, this threshold is a parameter we could potentially tweak." + "The outcome is different across different thresholds we set. Therefore, this threshold is a parameter we could potentially tweak." 
] }, { @@ -1064,7 +1064,99 @@ "outputs": [ { "data": { - "image/svg+xml": "[SVG diagram; nodes: Mouse, Session, Scan, SegmentationParam, AverageFrame; edges: Mouse->Session, Session->Scan, Scan->AverageFrame]", + "image/svg+xml": [ + "[the same SVG diagram, re-serialized as a list with one string per line]" + ], "text/plain": [ "" ] @@ -1142,7 +1234,140 @@ "outputs": [ { "data": { - "image/svg+xml": "[SVG diagram; nodes: Mouse, Session, Scan, Segmentation, Segmentation.Roi, SegmentationParam, AverageFrame; edges: Mouse->Session, Session->Scan, Scan->AverageFrame, SegmentationParam->Segmentation, AverageFrame->Segmentation, Segmentation->Segmentation.Roi]", + "image/svg+xml": [ + "[the same SVG diagram, re-serialized as a list with one string per line]" + ], "text/plain": [ "" ] @@ -2640,7 +2865,180 @@ "outputs": [ { "data": { - "image/svg+xml": "[SVG diagram; nodes: Mouse, Session, Scan, Segmentation, Segmentation.Roi, SegmentationParam, AverageFrame, Fluorescence, Fluorescence.Trace; edges: Mouse->Session, Session->Scan, Scan->AverageFrame, SegmentationParam->Segmentation, AverageFrame->Segmentation, Segmentation->Segmentation.Roi, Segmentation->Fluorescence, Segmentation.Roi->Fluorescence.Trace, Fluorescence->Fluorescence.Trace]", + "image/svg+xml": [ + "[the same SVG diagram, re-serialized as a list with one string per line]" + ], "text/plain": [ "" ]
diff --git a/tutorials/01-DataJoint Basics.ipynb b/tutorials/01-DataJoint Basics.ipynb index aabadac..8523a9f 100644 --- a/tutorials/01-DataJoint Basics.ipynb +++ b/tutorials/01-DataJoint Basics.ipynb @@ -51,12 +51,17 @@ "If you visit the [documentation for DataJoint](https://docs.datajoint.io/introduction/Data-pipelines.html), we define a data pipeline as follows:\n", "> A data pipeline is a sequence of steps (more generally a directed acyclic graph) with integrated storage at each step. These steps may be thought of as nodes in a graph.\n", "\n", - "While this is an accurate description, it may not be the most intuitive definition. Put succinctly, a data pipeline is a listing or a \"map\" of various \"things\" that you work with in a project, with line connecting things to each other to indicate their dependencies. The \"things\" in a data pipeline tends to be the *nouns* you find when describing a project. 
The \"things\" may include anything from mouse, experimenter, equipment, to experiment session, trial, two-photon scans, electric activities, to receptive fields, neuronal spikes, to figures for a publication! A data pipeline gives you a framework to:\n", + ">* Nodes in this graph are represented as database **tables**. Examples of such tables include `Subject`, `Session`, `Implantation`, `Experimenter`, `Equipment`, but also `OptoWaveform`, `OptoStimParams`, or `NeuronalSpikes`. \n", "\n", - "1. define these \"things\" as tables in which you can store the information about them\n", - "2. define the relationships (in particular the dependencies) between the \"things\"\n", + ">* The data pipeline is formed by making these tables interdependent (as the nodes are connected in a network). A **dependency** is a situation where a step of the data pipeline is dependent on a result from a sequentially previous step before it can complete its execution. A dependency graph forms an entire cohesive data pipeline. \n", "\n", - "A data pipeline can then serve as a map that describes everything that goes on in your experiment, capturing what is collected, what is processed, and what is analyzed/computed. A well designed data pipeline not only let's you organize your data well, but can bring out logical clarity to your experiment, and may even bring about new insights by making how everything in your experiment relates together obvious.\n", + "In order to create a data pipeline, you need to know the \"things\" in your experiments\n", + "and the relationship between them. Within the pipeline you will then:\n", + "\n", + "1. define these \"things\" as tables in which you can store the information about them.\n", + "2. 
define the relationships (in particular the dependencies) between the \"things\".\n", + "\n", + "The data pipeline can then serve as a map that describes everything that goes on in your experiment, capturing what is collected, what is processed, and what is analyzed/computed. A well-designed data pipeline not only lets you organize your data well, but can also bring logical clarity to your experiment, and may even yield new insights by making it obvious how everything in your experiment relates.\n", "\n", "Let's go ahead and build together a pipeline from scratch to better understand what a data pipeline is all about." ] @@ -65,7 +70,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Building our first pipeline: " + "#### Practical examples" ] }, { @@ -129,7 +134,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Just by going though the description, we can start to identify **things** or **entities** that we might want to store and represent in our data pipeline:\n", + "Just by going through the description, we can start to identify **entities** that need to be stored and represented in our data pipeline:\n", "\n", "* mouse\n", "* experimental session\n", @@ -157,16 +162,35 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In DataJoint data pipeline, we represent these **entities** as **tables**. Different *kinds* of entities become distinct tables, and each row of the table is a single example (instance) of the category of entity. \n", - "\n", - "For example, if we have a `Mouse` table, then each row in the mouse table represents a single mouse!" + "### Schemas and tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "When constructing such table, we need to figure out what it would take to **uniquely identify** each entry. Let's take the example of the **mouse** and think about what it would take to uniquely identify a mouse." 
+ "##### Concepts" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In a data pipeline, we represent these **entities** as **tables**. Different *kinds* of entities become distinct tables, and each table row is a single example (instance) of the entity's category. \n", + "\n", + "For example, if we have a `Mouse` table, each row in the mouse table represents a single mouse. \n", + "\n", + "It is essential to think about what information will **uniquely identify** each entry. \n", + "\n", + "In this case, the information that uniquely identifies each entry in the `Mouse` table is its\n", + "**mouse ID** - a unique ID number assigned to each animal in the lab. This attribute is\n", + "called the **primary key** of the table. By convention, table attributes are lower case\n", + "and do not contain spaces.\n", + "\n", + "| `mouse_id*` (*Primary key attribute*)|\n", + "|:--------: | \n", + "| 11234 |\n", + "| 11432 |" ] }, { @@ -175,17 +199,21 @@ "source": [ "After some thought, we might conclude that each mouse can be uniquely identified by knowing its **mouse ID** - a unique ID number assigned to each mouse in the lab. The mouse ID is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. Such attribute is called the **primary key** of the table.\n", "\n", - "| mouse_id* |\n", - "|:--------:|\n", - "| 11234 |\n", - "| 11432 |" + "The mouse ID is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. \n", + "\n", + "Such an attribute is called the **primary key** of the table: the subset of table attributes uniquely identifying each entity in the table. 
A **secondary attribute** is any field in a table that is not part of the primary key.\n", + "\n", + "| `mouse_id*` (*Primary key attribute*) |\n", + "|:--------:|\n", + "| 11234 |\n", + "| 11432 |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Once we have successfully identified the primary key of the table, we can now think about what other columns, or **non-primary key attributes** that we would want to include in the table. These are additional information **about each entry in the table that we want to store**." + "Once we have successfully identified the table's primary key, we can think about the other columns, or **non-primary key attributes**: additional information **about each entry in the table that needs to be stored as well**." ] }, { @@ -199,7 +227,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "| mouse_id* | dob | sex |\n", + "| `mouse_id*` | `dob` | `sex` |\n", "|:--------:|------------|--------|\n", "| 11234 | 2017-11-17 | M |\n", "| 11432 | 2018-03-04 | F |" @@ -209,14 +237,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we have an idea on how to represent information about mouse, let's create the table using **DataJoint**!" + "Now that we have an idea of how to represent information about the mouse, let's create the table using **DataJoint**!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Practical example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Create a schema - house for your tables" + "##### Schema" ] }, { @@ -254,14 +289,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Creating your first table" + "##### Table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "In DataJoint, you define each table as a class, and provide the table definition (e.g. attribute definitions) as the `definition` static string property. 
The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)." + "In DataJoint, you define each table as a `class`, and provide the table definition (e.g. attribute definitions) as the `definition` static string property. The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)." ] }, { @@ -301,7 +336,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Insert entries with `insert1` and `insert` methods" + "### Basic relational operators" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Insert operators" ] }, { @@ -441,7 +483,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Data integrity" + "##### Data integrity" ] }, { @@ -527,7 +569,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As with mouse, we should think about **what information (i.e. attributes) is needed to uniquely identify an experimental session**. Here is the relevant section of the project description:\n", + "As with `mouse`, we should consider **what information (i.e. attributes) is needed to identify an experimental `session`** uniquely. Here is the relevant section of the project description:\n", "\n", "> * As a hard working neuroscientist, you perform experiments every day, sometimes working with **more than one mouse in a day**! However, on an any given day, **a mouse undergoes at most one recording session**.\n", "> * For each experimental session, you would like to record **what mouse you worked with** and **when you performed the experiment**. You would also like to keep track of other helpful information such as the **experimental setup** you worked on." 
@@ -537,19 +579,17 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Based on the above, it appears that you need to know:\n", + "Based on the above, it seems that you need to know the following data to uniquely identify a single experimental session:\n", "\n", "* the date of the session\n", - "* the mouse you recorded from in that session\n", - "\n", - "to uniquely identify a single experimental session." + "* the mouse you recorded from in that session" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Note that, to uniquely identify an experimental session (or simply a **session**), we need to know the mouse that the session was about. In other words, a session cannot existing without a corresponding mouse! \n", + "Note that, to uniquely identify an experimental session (or simply a `Session`), we need to know the mouse that the session was about. In other words, a session cannot exist without a corresponding mouse! \n", "\n", "With **mouse** already represented as a table in our pipeline, we say that the session **depends on** the mouse! We could graphically represent this in an **entity relationship diagram (ERD)** by drawing the line between two tables, with the one below (**session**) depending on the one above (**mouse**)." ] @@ -560,7 +600,7 @@ "source": [ "Thus we will need both **mouse** and a new attribute **session_date** to uniquely identify a single session. \n", "\n", - "Remember that a **mouse** is already uniquely identified by its primary key - **mouse_id**. In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, along side any additional attribute(s) you specify." + "Remember that a **mouse** is uniquely identified by its primary key - **mouse_id**. 
In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, alongside any additional attribute(s) you specify." ] }, { @@ -742,11 +782,17 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We will introduce four major types of queries used in DataJoint:\n", - "* restriction (`&`) and negative restriction (`-`): filter data\n", - "* join (`*`): bring fields from different tables together\n", - "* projection (`.proj()`): focus on a subset of attributes\n", - "* aggregation (`.aggr()`): simple computation of one table against another table" + "We will introduce the major types of queries used in DataJoint:\n", + "1. Restriction (`&`) and negative restriction (`-`): filter the data with certain conditions\n", + "2. Join (`*`): bring fields from different tables together\n", + "3. Projection (`.proj()`): focus on a subset of attributes\n", + "\n", + "Following the query operations, you might work with one or more of the following\n", + "data manipulation operations supported by DataJoint:\n", + " \n", + "1. Fetch (`.fetch()`): pull the data from the database\n", + "2. Deletion (`.delete()`): delete entries and their dependencies\n", + "3. Drop (`.drop()`): drop the table from the schema" ] }, { @@ -767,7 +813,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Exact match" + "#### Exact match" ] }, { @@ -838,7 +884,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Inequality" + "#### Inequality" ] }, { @@ -932,7 +978,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Result of one query can be used in another query! Let's first find **all female mice** and store the result." + "The result of one query can be used in another query! 
Let's first find **all the female mice** and store the result:" ] }, { @@ -972,7 +1018,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Restriction one table with another" + "#### Restrict one table with another" ] }, { @@ -995,7 +1041,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Combining restrictions" + "#### Combine restrictions" ] }, { @@ -1041,7 +1087,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Negative restriction - with the `-` operator" + "#### Negative restriction: with the `-` operator" ] }, { @@ -1096,10 +1142,10 @@ "source": [ "Behavior of join:\n", "\n", - "1. match the common field(s) of the primary keys in the two tables\n", - "2. do a combination of the non-matched part of the primary key\n", - "3. listing out the secondary attributes for each combination\n", - "4. if two tables have secondary attributes that share a same name, it will throw an error. To join, we need to rename that attribute for at least one of the tables." + "1. Match the common field(s) of the primary keys in the two tables.\n", + "2. Combine the non-matched parts of the primary keys.\n", + "3. List the secondary attributes for each combination.\n", + "4. If the two tables have secondary attributes that share the same name, the join will throw an error. To join, we need to rename that attribute for at least one of the tables." ] }, { @@ -1270,7 +1316,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Fetch it!" + "Fetch it:" ] }, { @@ -1517,6 +1563,81 @@ "Mouse.delete()" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Mouse()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that the `.delete()` method not only deletes the entries in a table, but also all the corresponding entries in subsequent (downstream) tables!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6. 
Drop (`.drop()`): remove the table from the schema" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Contrary to the `.delete()` method - where the table is preserved but its content is deleted - with `.drop()` we remove the whole table from the pipeline. \n", + "\n", + "Again, the `.drop()` method not only drops the table itself, but also all the subsequent (downstream) tables." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dj.Diagram(schema)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Remember, again, that after running the following method, you will be asked to confirm the drop before it is committed:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Session.drop()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dj.Diagram(schema)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "schema.drop()" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -1533,8 +1654,10 @@ "In the next session, we are going to extend our data pipeline with tables to represent **imported data** and define new tables to **compute and hold analysis results**.\n", "\n", "We will use both ephys and calcium imaging as example pipelines:\n", - "+ [02-electrophysiology](../02-Electrophysiology/02-Imported%20Tables%20-%20Interactive.ipynb)\n", - "+ [02-calcium imaging](../01-Calcium_Imaging/02-Imported%20Tables%20-%20Interactive.ipynb)" + "+ [02-Calcium Imaging Imported Tables](./02-Calcium%20Imaging%20Imported%20Tables.ipynb)\n", + "+ [03-Calcium Imaging Computed Tables](./03-Calcium%20Imaging%20Computed%20Tables.ipynb)\n", + "+ [04-Electrophysiology Imported Tables](./04-Electrophysiology%20Imported%20Tables.ipynb)\n", + "+ [05-Electrophysiology Computed 
Tables](./05-Electrophysiology%20Computed%20Tables.ipynb)" ] }, { diff --git a/tutorials/02-Calcium Imaging Imported Tables.ipynb b/tutorials/02-Calcium Imaging Imported Tables.ipynb index 279da47..d5004c1 100644 --- a/tutorials/02-Calcium Imaging Imported Tables.ipynb +++ b/tutorials/02-Calcium Imaging Imported Tables.ipynb @@ -11,12 +11,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Welcome back! In this session, we are going to continue working with the pipeline for the mouse calcium imaging example. \n", + "Welcome back! The practical example of this session is Calcium Imaging! \n", "\n", "In this session, we will learn to:\n", "\n", - "* import neuron imaging data from data files into an `Imported` table\n", - "* automatically trigger data importing and computations for all missing entries with `populate`" + "During this session you will learn:\n", + "\n", + "* To import neuron imaging data from data files into an `Imported` table\n", + "* To automatically trigger data importing and computations for all the missing entries with `populate`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Importing libraries" ] }, { @@ -57,7 +66,41 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we would like to continue working with the tables we defined in the previous notebook. To do so, we would need the classes for each table: `Mouse` and `Session`. We can either redefine it here, but for your convenience, we have included the schema and table class definitions in a package called `tutorial_pipeline.mouse_session`, from which you can import the classes as well as the schema object. We will use the schema object again to define more tables." 
+ "## Calcium imaging dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `data` folder in this repository contains a small dataset of three different calcium imaging scans: `example_scan_01.tif`, `example_scan_02.tif` and `example_scan_03.tif`.\n", + "\n", + "The raw calcium imaging scans in this dataset are stored as *.tif* files. \n", + "\n", + "*NOTE: For this tutorial you do not need to explore this dataset thoroughly. It simply\n", + "serves as an example to populate our data pipeline with example data.*" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pipeline design: `Mouse` & `Session`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To continue working with the tables we defined in the previous notebook, the classes for each\n", + "table, `Mouse` and `Session`, must be declared here. There are two ways to do this: \n", + "* Redefine them here. \n", + "* Import them from an existing file containing their table definitions.\n", + "\n", + "Here, for your convenience, we have included the schema and table\n", + "class definitions in a package called `tutorial_pipeline.mouse_session`, from which you\n", + "can import the classes as well as the schema object. We will use the schema object again\n", + "to define more tables." ] }, { @@ -139,7 +182,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This table is dependent on the table `Session`, inheriting its primary key attributes, with an additional primary key attribute `scan_idx`. One session could contain multiple scans, which is another example of **one-to-many** relationship. We could take a look at the Diagram again." + "The table `Scan` is dependent on the table `Session`, inheriting its primary key attributes. `Scan` also has its own primary key attribute, `scan_idx`. 
\n", + "\n", + "One session might contain multiple scans: this is another example of a **one-to-many** relationship. \n", + "\n", + "Take a look at the `Diagram` again:" ] }, { @@ -247,7 +294,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This tiff file contains 100 frames. Let's take the average of the images over frames and look at it." + "This example scan contains 100 frames. Let's calculate the average of the images over the frames and plot the result." ] }, { @@ -274,7 +321,7 @@ "source": [ "Now let's create a table `AverageFrame` to compute and save the average fluorescence. \n", "\n", - "For each scan, we have one average frame. Therefore, the table shares the exact same primary key as the table `Scan`" + "For each scan, we have one average frame. Therefore, the table shares the exact same primary key as the table `Scan`." ] }, { @@ -319,7 +366,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We defined `average_frame` as a `longblob` so that it can store a NumPy array. This NumPy array will be imported and computed from the file corresponding to each scan." + "We defined `average_frame` as a `longblob`, which allows us to store a NumPy array. This NumPy array will be imported and computed from the file corresponding to each scan." ] }, { @@ -342,7 +389,7 @@ "source": [ "In DataJoint, the tier of the table indicates **the nature of the data and the data source for the table**. So far we have encountered two table tiers: `Manual` and `Imported`, and we will encounter the two other major tiers in this session. \n", "\n", - "DataJoint tables in `Manual` tier, or simply **Manual tables** indicate that its contents are **manually** entered by either experimenters or a recording system, and its content **do not depend on external data files or other tables**. This is the most basic table type you will encounter, especially as the tables at the beginning of the pipeline. 
In the Diagram, `Manual` tables are depicted by green rectangles.\n", + "DataJoint tables in the `Manual` tier, or simply **Manual tables**, indicate that their contents are **manually** entered by either experimenters or a recording system, and that their contents **do not depend on external data files or other tables**. This is the most basic table type you will encounter, especially for the tables at the beginning of the pipeline. In the diagram, `Manual` tables are depicted by green rectangles.\n", "\n", "On the other hand, **Imported tables** are understood to pull data (or *import* data) from external data files, and come equipped with functionalities to perform this importing process automatically, as we will see shortly! In the Diagram, `Imported` tables are depicted by blue ellipses." ] @@ -367,7 +414,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Rather than filling out the content of the table manually using `insert1` or `insert` methods, we are going to make use of the `make` and `populate` logic that comes with `Imported` tables to automatically figure out what needs to be imported and perform the import!" + "Rather than filling out the content of the table manually using `insert1` or `insert` methods, we are going to make use of the `make` and `populate` logic that comes with `Imported` tables. `populate` figures out which entries are missing and calls `make` to import each one." ] }, { @@ -381,7 +428,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "`Imported` table comes with a special method called `populate`. Let's try calling it." + "An `Imported` table comes with a special method called `populate`. 
Let's call it for `AverageFrame`:\n", + "\n", + "*Note that the following line of code is expected to raise an error.*" ] }, { @@ -694,8 +743,13 @@ "source": [ "At this point, our pipeline contains the core elements with data populated, ready for further downstream analysis.\n", "\n", - "In the next [session](./03-Computed%20Table,%20Lookup%20Table,%20and%20Part%20Table%20-%20Interactive.ipynb), we are going to introduce the concept of `Computed` table, and `Lookup` table, as well as learning to set up a automated computation routine." + "In the next [session](./03-Calcium%20Imaging%20Computed%20Tables.ipynb), we are going to introduce the concepts of `Computed` and `Lookup` tables, and learn to set up an automated computation routine." ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] } ], "metadata": { diff --git a/tutorials/03-Calcium Imaging Computed Tables.ipynb b/tutorials/03-Calcium Imaging Computed Tables.ipynb index f90f78a..5bd952d 100644 --- a/tutorials/03-Calcium Imaging Computed Tables.ipynb +++ b/tutorials/03-Calcium Imaging Computed Tables.ipynb @@ -268,7 +268,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The outcome is different across different threholds we set. Therefore, this threshold is a parameter we could potentially tweak." + "The outcome is different across different thresholds we set. Therefore, this threshold is a parameter we could potentially tweak." ] }, {