diff --git a/.gitignore b/.gitignore index 7db40f9..154370f 100644 --- a/.gitignore +++ b/.gitignore @@ -131,6 +131,7 @@ venv/ ENV/ env.bak/ venv.bak/ +venv_koala/ # Spyder project settings .spyderproject diff --git a/tutorials/0-introduction/data_containers.ipynb b/tutorials/0-introduction/data_containers.ipynb index e69de29..b6023a9 100644 --- a/tutorials/0-introduction/data_containers.ipynb +++ b/tutorials/0-introduction/data_containers.ipynb @@ -0,0 +1,379 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# PyKOALA Data Containers" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "PyKOALA uses data containers to organise, read, and manage input/output. `PyKOALA` data containers can be imported from `data_container` module:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import pykoala.data_container " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, users can import explicitly the requires classes. In this tutorial, we'll demonstrate examples using this approach." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## `HistoryRecord` class" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `HistoryRecord` class represents a structured record in a log. This is typically useful for logging, note-taking, or any system that keeps track of messages or events." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pykoala.data_container import HistoryRecord\n", + "\n", + "hist_record = HistoryRecord(title=\"Error log\",comments=\"This is line one.\\nThis is line two.\")\n", + "print('Title of history record: ', hist_record.title)\n", + "print('Comments in history record: ', hist_record.comments)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`HistoryRecord` can convert the record into a string with `to_str`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "hist_record.to_str()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "However, this output is hard to read. Since it is an iterable, a better way to show the comments is: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for record in hist_record.comments:\n", + " print(record)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## `DataContainerHistory` class" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `DataContainerHistory` class is designed to store and manage the history of data reduction steps by creating a log of entries, which can be used to trace the steps applied to a dataset. It can log a sequence of entries where each entry records details of a data processing step." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pykoala.data_container import DataContainerHistory\n", + "initial_entries = [\n", + " (\"Step 1\", \"Loaded data\"),\n", + " (\"Step 2\", \"Filtered outliers\"),\n", + " HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", + "]\n", + "\n", + "history_log = DataContainerHistory(list_of_entries=initial_entries)\n", + "\n", + "# Accessing the recorded entries\n", + "for record in history_log.record_entries:\n", + " print(record.to_str())\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Functions " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note:** Make sure to run the following cells in order to ensure correct execution." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`DataContainerHistory` class use several methods to process data. For example, we can start explicitly a new log from the class with `initialise_record`. Let's redo the previous cell with this method: " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `initialise_record` " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "initial_entries = [\n", + " (\"Step 1\", \"Loaded data\"),\n", + " (\"Step 2\", \"Filtered outliers\"),\n", + " HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", + "]\n", + "\n", + "history_log = DataContainerHistory()\n", + "history_log.initialise_record(list_of_entries=initial_entries)\n", + "\n", + "history_log.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the cell above we also used the `show()` method to present the content of the history log. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `log_record`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can add a new record in an existing history log by using the `log_record` method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#We use the same history log as before\n", + "initial_entries = [\n", + " (\"Step 1\", \"Loaded data\"),\n", + " (\"Step 2\", \"Filtered outliers\"),\n", + " HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", + "]\n", + "\n", + "history_log = DataContainerHistory()\n", + "history_log.initialise_record(list_of_entries=initial_entries)\n", + "\n", + "history_log.log_record(title='Step 4', comments='Additional data added with log_record method') \n", + "\n", + "history_log.show()\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `is_record` " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can find if a certain record (title) is present in the history log with the `is_record` method. Following the previous cell:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(history_log.is_record(title='Step 1'))\n", + "print(history_log.is_record(title='Step 42'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the first case, the output is `True` because there is a record with the title `Step 1`. However, there is no record with the title `Step 42`, so the second line returns `False`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `find_record`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can find, save and display the content using `find_record`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "search_results = history_log.find_record(title='Step 1')\n", + "print('Number of found items:', len(search_results))\n", + "\n", + "for result in search_results:\n", + " print(result.to_str())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The output of this method is given in a list. This is useful when several entries have the same `title` record:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "history_log_repeated_titles = DataContainerHistory()\n", + "history_log_repeated_titles.initialise_record(list_of_entries=initial_entries)\n", + "\n", + "history_log_repeated_titles.log_record(title='Step 1', comments='Rinse and repeat') #We add a record with the same title as an existing one.\n", + "\n", + "search_results = history_log_repeated_titles.find_record(title='Step 1')\n", + "print('Number of found items:', len(search_results))\n", + "\n", + "for result in search_results:\n", + " print(result.to_str())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `dump_to_header`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This method writes the log into a astropy.fits.Header" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from astropy.io import fits\n", + "history_log.dump_to_header()\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### `dump_to_text`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This method writes the log into a text file" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "import os\n", + "\n", + "os.system('pwd')\n", + "path_to_file = './output/history_log.txt'\n", + "history_log.dump_to_text(file=path_to_file)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can confirm that the file was created in the `output` directory." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "venv_koala", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/tutorials/0-introduction/output/history_log.txt b/tutorials/0-introduction/output/history_log.txt new file mode 100644 index 0000000..f19d03b --- /dev/null +++ b/tutorials/0-introduction/output/history_log.txt @@ -0,0 +1,4 @@ +Step 1: Loaded data +Step 2: Filtered outliers +Step 3: Normalized data +Step 4: Additional data added with log_record method