-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #4 from pykoala/a3d/tut0
Data container tutorial
- Loading branch information
Showing
3 changed files
with
384 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -131,6 +131,7 @@ venv/ | |
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
venv_koala/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,379 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# PyKOALA Data Containers" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"PyKOALA uses data containers to organise, read, and manage input/output. `PyKOALA` data containers can be imported from `data_container` module:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import pykoala.data_container " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Alternatively, users can import explicitly the requires classes. In this tutorial, we'll demonstrate examples using this approach." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## `HistoryRecord` class" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The `HistoryRecord` class represents a structured record in a log. This is typically useful for logging, note-taking, or any system that keeps track of messages or events." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from pykoala.data_container import HistoryRecord\n", | ||
"\n", | ||
"hist_record = HistoryRecord(title=\"Error log\",comments=\"This is line one.\\nThis is line two.\")\n", | ||
"print('Title of history record: ', hist_record.title)\n", | ||
"print('Comments in history record: ', hist_record.comments)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"`HistoryRecord` can convert the record into a string with `to_str`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"hist_record.to_str()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"However, this output is hard to read. Since it is an iterable, a better way to show the comments is: " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"for record in hist_record.comments:\n", | ||
" print(record)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## `DataContainerHistory` class" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The `DataContainerHistory` class is designed to store and manage the history of data reduction steps by creating a log of entries, which can be used to trace the steps applied to a dataset. It can log a sequence of entries where each entry records details of a data processing step." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from pykoala.data_container import DataContainerHistory\n", | ||
"initial_entries = [\n", | ||
" (\"Step 1\", \"Loaded data\"),\n", | ||
" (\"Step 2\", \"Filtered outliers\"),\n", | ||
" HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", | ||
"]\n", | ||
"\n", | ||
"history_log = DataContainerHistory(list_of_entries=initial_entries)\n", | ||
"\n", | ||
"# Accessing the recorded entries\n", | ||
"for record in history_log.record_entries:\n", | ||
" print(record.to_str())\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Functions " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Note:** Make sure to run the following cells in order to ensure correct execution." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"`DataContainerHistory` class use several methods to process data. For example, we can start explicitly a new log from the class with `initialise_record`. Let's redo the previous cell with this method: " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `initialise_record` " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"initial_entries = [\n", | ||
" (\"Step 1\", \"Loaded data\"),\n", | ||
" (\"Step 2\", \"Filtered outliers\"),\n", | ||
" HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", | ||
"]\n", | ||
"\n", | ||
"history_log = DataContainerHistory()\n", | ||
"history_log.initialise_record(list_of_entries=initial_entries)\n", | ||
"\n", | ||
"history_log.show()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In the cell above we also used the `show()` method to present the content of the history log. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `log_record`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We can add a new record in an existing history log by using the `log_record` method:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"#We use the same history log as before\n", | ||
"initial_entries = [\n", | ||
" (\"Step 1\", \"Loaded data\"),\n", | ||
" (\"Step 2\", \"Filtered outliers\"),\n", | ||
" HistoryRecord(\"Step 3\", \"Normalized data\", \"preprocessing\")\n", | ||
"]\n", | ||
"\n", | ||
"history_log = DataContainerHistory()\n", | ||
"history_log.initialise_record(list_of_entries=initial_entries)\n", | ||
"\n", | ||
"history_log.log_record(title='Step 4', comments='Additional data added with log_record method') \n", | ||
"\n", | ||
"history_log.show()\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `is_record` " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We can find if a certain record (title) is present in the history log with the `is_record` method. Following the previous cell:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"print(history_log.is_record(title='Step 1'))\n", | ||
"print(history_log.is_record(title='Step 42'))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In the first case, the output is `True` because there is a record with the title `Step 1`. However, there is no record with the title `Step 42`, so the second line returns `False`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `find_record`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We can find, save and display the content using `find_record`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"search_results = history_log.find_record(title='Step 1')\n", | ||
"print('Number of found items:', len(search_results))\n", | ||
"\n", | ||
"for result in search_results:\n", | ||
" print(result.to_str())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The output of this method is given in a list. This is useful when several entries have the same `title` record:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"history_log_repeated_titles = DataContainerHistory()\n", | ||
"history_log_repeated_titles.initialise_record(list_of_entries=initial_entries)\n", | ||
"\n", | ||
"history_log_repeated_titles.log_record(title='Step 1', comments='Rinse and repeat') #We add a record with the same title as an existing one.\n", | ||
"\n", | ||
"search_results = history_log_repeated_titles.find_record(title='Step 1')\n", | ||
"print('Number of found items:', len(search_results))\n", | ||
"\n", | ||
"for result in search_results:\n", | ||
" print(result.to_str())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `dump_to_header`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This method writes the log into a astropy.fits.Header" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from astropy.io import fits\n", | ||
"history_log.dump_to_header()\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### `dump_to_text`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This method writes the log into a text file" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from pathlib import Path\n", | ||
"import os\n", | ||
"\n", | ||
"os.system('pwd')\n", | ||
"path_to_file = './output/history_log.txt'\n", | ||
"history_log.dump_to_text(file=path_to_file)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"You can confirm that the file was created in the `output` directory." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "venv_koala", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Step 1: Loaded data | ||
Step 2: Filtered outliers | ||
Step 3: Normalized data | ||
Step 4: Additional data added with log_record method |