Merge pull request #4 from pykoala/a3d/tut0

Data container tutorial
pykoala · Nov 20, 2024 · 7ac2b01 · 7ac2b01
2 parents 15c138f + fccb8bc
commit 7ac2b01
Show file tree

Hide file tree

Showing 3 changed files with 384 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -131,6 +131,7 @@ venv/
 ENV/
 env.bak/
 venv.bak/
+venv_koala/
 
 # Spyder project settings
 .spyderproject

diff --git a/tutorials/0-introduction/data_containers.ipynb b/tutorials/0-introduction/data_containers.ipynb
@@ -0,0 +1,379 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# PyKOALA Data Containers"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "PyKOALA uses data containers to organise, read, and manage input/output. `PyKOALA` data containers can be imported from `data_container` module:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pykoala.data_container "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Alternatively, users can import explicitly the requires classes. In this tutorial, we'll demonstrate examples using this approach."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## `HistoryRecord` class"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `HistoryRecord` class represents a structured record in a log. This is typically useful for logging, note-taking, or any system that keeps track of messages or events."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pykoala.data_container import HistoryRecord\n",
+    "\n",
+    "hist_record = HistoryRecord(title=\"Error log\",comments=\"This is line one.\\nThis is line two.\")\n",
+    "print('Title of history record: ', hist_record.title)\n",
+    "print('Comments in history record: ', hist_record.comments)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`HistoryRecord` can convert the record into a string with `to_str`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "hist_record.to_str()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "However, this output is hard to read. Since it is an iterable, a better way to show the comments is: "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for record in hist_record.comments:\n",
+    "    print(record)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## `DataContainerHistory` class"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `DataContainerHistory` class is designed to store and manage the history of data reduction steps by creating a log of entries, which can be used to trace the steps applied to a dataset. It can log a sequence of entries where each entry records details of a data processing step."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pykoala.data_container import DataContainerHistory\n",
+    "initial_entries = [\n",
+    "    (\"Step 1\", \"Loaded data\"),\n",
+    "    (\"Step 2\", \"Filtered outliers\"),\n",
+    "    HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n",
+    "]\n",
+    "\n",
+    "history_log = DataContainerHistory(list_of_entries=initial_entries)\n",
+    "\n",
+    "# Accessing the recorded entries\n",
+    "for record in history_log.record_entries:\n",
+    "    print(record.to_str())\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Functions "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Note:** Make sure to run the following cells in order to ensure correct execution."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`DataContainerHistory` class use several methods to process data. For example, we can start explicitly a new log from the class with `initialise_record`. Let's redo the previous cell with this method: "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `initialise_record` "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "initial_entries = [\n",
+    "    (\"Step 1\", \"Loaded data\"),\n",
+    "    (\"Step 2\", \"Filtered outliers\"),\n",
+    "    HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n",
+    "]\n",
+    "\n",
+    "history_log = DataContainerHistory()\n",
+    "history_log.initialise_record(list_of_entries=initial_entries)\n",
+    "\n",
+    "history_log.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the cell above we also used the `show()` method to present the content of the history log. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `log_record`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can add a new record in an existing history log by using the `log_record` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#We use the same history log as before\n",
+    "initial_entries = [\n",
+    "    (\"Step 1\", \"Loaded data\"),\n",
+    "    (\"Step 2\", \"Filtered outliers\"),\n",
+    "    HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n",
+    "]\n",
+    "\n",
+    "history_log = DataContainerHistory()\n",
+    "history_log.initialise_record(list_of_entries=initial_entries)\n",
+    "\n",
+    "history_log.log_record(title='Step 4', comments='Additional data added with log_record method') \n",
+    "\n",
+    "history_log.show()\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `is_record` "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can find if a certain record (title) is present in the history log with the `is_record` method. Following the previous cell:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(history_log.is_record(title='Step 1'))\n",
+    "print(history_log.is_record(title='Step 42'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the first case, the output is `True` because there is a record with the title `Step 1`. However, there is no record with the title `Step 42`, so the second line returns `False`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `find_record`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can find, save and display the content using `find_record`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search_results = history_log.find_record(title='Step 1')\n",
+    "print('Number of found items:', len(search_results))\n",
+    "\n",
+    "for result in search_results:\n",
+    "    print(result.to_str())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The output of this method is given in a list. This is useful when several entries have the same `title` record:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "history_log_repeated_titles = DataContainerHistory()\n",
+    "history_log_repeated_titles.initialise_record(list_of_entries=initial_entries)\n",
+    "\n",
+    "history_log_repeated_titles.log_record(title='Step 1', comments='Rinse and repeat') #We add a record with the same title as an existing one.\n",
+    "\n",
+    "search_results = history_log_repeated_titles.find_record(title='Step 1')\n",
+    "print('Number of found items:', len(search_results))\n",
+    "\n",
+    "for result in search_results:\n",
+    "    print(result.to_str())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `dump_to_header`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This method writes the log into a astropy.fits.Header"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from astropy.io import fits\n",
+    "history_log.dump_to_header()\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### `dump_to_text`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This method writes the log into a text file"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pathlib import Path\n",
+    "import os\n",
+    "\n",
+    "os.system('pwd')\n",
+    "path_to_file = './output/history_log.txt'\n",
+    "history_log.dump_to_text(file=path_to_file)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can confirm that the file was created in the `output` directory."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "venv_koala",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/tutorials/0-introduction/output/history_log.txt b/tutorials/0-introduction/output/history_log.txt
@@ -0,0 +1,4 @@
+Step 1: Loaded data
+Step 2: Filtered outliers
+Step 3: Normalized data
+Step 4: Additional data added with log_record method