Docs/versioning (#38)
* added examples of data versions
* added example of code snapshotting
* fixed docstrings examples formatting
jakubczakon authored Jun 30, 2019

1 parent fc03a52 commit 5779417
Showing 22 changed files with 687 additions and 437 deletions.
5 changes: 3 additions & 2 deletions docs/conf.py
@@ -19,6 +19,7 @@

# -- Imports mock -----------------------------------------------------]
autodoc_mock_imports = ['altair',
'boto3',
'fastai',
'fastai.callbacks',
'telegram',
@@ -39,9 +40,9 @@
author = 'Neptune Dev Team'

# The short X.Y version
version = '0.5'
version = '0.6'
# The full version, including alpha/beta/rc tags
release = '0.5.2'
release = '0.6.1'

# -- General configuration ---------------------------------------------------

72 changes: 72 additions & 0 deletions docs/examples/code_snapshots.ipynb
@@ -0,0 +1,72 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Snapshotting code\n",
"\n",
"Neptune keeps track of your git commit so that you know which code you ran your experiment on.\n",
"But sometimes you don't want to commit everything, and in those dirty, in-between-commit situations you may want Neptune\n",
"to snapshot your code and save it with the experiment.\n",
"\n",
"In that case, pass a list of the files you want to snapshot to the `upload_source_files` argument of the `neptune.create_experiment` method.\n",
"We wrote a helper that builds this list by collecting the names of all files with certain extensions in your folder and its subfolders."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import neptune\n",
"from neptunecontrib.api.utils import get_filepaths\n",
"\n",
"neptune.init('USER_NAME/PROJECT_NAME')\n",
"\n",
"with neptune.create_experiment(upload_source_files=get_filepaths(dirpath='.', extensions=['.py', '.yaml', '.yml'])):\n",
"    neptune.set_property('code_snapshot', 'yes!')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you can view your code snapshot in Neptune.\n",
"\n",
"Check [this example experiment](https://ui.neptune.ml/neptune-ml/credit-default-prediction/e/CRED-108/source-code?path=src%2Fmodels%2F&file=train_lgbm.py):\n",
" \n",
"![img](https://gist.githubusercontent.com/jakubczakon/f754769a39ea6b8fa9728ede49b9165c/raw/e08d47e0af278225142eaa849c86964adfa7abf0/code_snapshots.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
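The notebook above relies on `get_filepaths` to build the list passed to `upload_source_files`. As a rough illustration of the idea only, the sketch below walks a directory tree and keeps files with matching extensions. The name `collect_filepaths` is hypothetical, and returning every file when `extensions` is omitted is an assumption; the real helper in `neptunecontrib.api.utils` may behave differently.

```python
import os


def collect_filepaths(dirpath='.', extensions=None):
    """Hypothetical stand-in for get_filepaths (not the library code).

    Walks `dirpath` recursively and returns the paths of files whose
    extension is listed in `extensions`.
    """
    matches = []
    for root, dirs, files in os.walk(dirpath):
        dirs.sort()  # keep the traversal order deterministic
        for name in sorted(files):
            extension = os.path.splitext(name)[1]
            # Assumption: when no extensions are given, keep every file.
            if extensions is None or extension in extensions:
                matches.append(os.path.join(root, name))
    return matches
```

A list built this way could be handed to `upload_source_files` in the same way the notebook passes the result of `get_filepaths`.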
160 changes: 160 additions & 0 deletions docs/examples/data_versioning.ipynb
@@ -0,0 +1,160 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Version Data\n",
"\n",
"With the `log_data_version` and `log_s3_data_version` helpers you can log the data location and data hash to Neptune.\n",
"They are stored as properties and can be viewed both in the `Details` section of an experiment:\n",
"\n",
"![img](https://gist.githubusercontent.com/jakubczakon/f754769a39ea6b8fa9728ede49b9165c/raw/7b98d5a5ef9dc702e9b1cf47dd1019efffc32753/feature_versions.png)\n",
"\n",
"and in the experiment dashboard as a column.\n",
"\n",
"![img1](https://gist.githubusercontent.com/jakubczakon/f754769a39ea6b8fa9728ede49b9165c/raw/7b98d5a5ef9dc702e9b1cf47dd1019efffc32753/feature_versions_dashboard.png)\n",
"\n",
"Check [this example project](https://ui.neptune.ml/neptune-ml/credit-default-prediction/experiments) to see more.\n",
"\n",
"## Prerequisites\n",
"Initialize Neptune"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import neptune\n",
"neptune.init('USER_NAME/PROJECT_NAME')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## File data version"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from neptunecontrib.versioning.data import log_data_version\n",
"\n",
"FILEPATH = '/path/to/data/my_data.csv'\n",
"with neptune.create_experiment():\n",
"    log_data_version(FILEPATH)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Folder data version"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from neptunecontrib.versioning.data import log_data_version\n",
"\n",
"DIRPATH = '/path/to/data/folder'\n",
"with neptune.create_experiment():\n",
"    log_data_version(DIRPATH)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## S3 bucket data version \n",
"We can log the version of a particular `key`, which is similar to file versioning."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from neptunecontrib.versioning.data import log_s3_data_version\n",
"\n",
"BUCKET = 'my-bucket'\n",
"PATH = 'training_dataset.csv'\n",
"with neptune.create_experiment():\n",
"    log_s3_data_version(BUCKET, PATH)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also log a combined version of all the `keys` that start with a particular string, which is similar to versioning a directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"BUCKET = 'my-bucket'\n",
"PATH = 'train_dir/'\n",
"with neptune.create_experiment():\n",
"    log_s3_data_version(BUCKET, PATH)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prefixing\n",
"If you want to track multiple data sources, give each one its own prefix when logging so that the logged properties can be told apart.\n",
"For example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from neptunecontrib.versioning.data import log_data_version\n",
"\n",
"FILEPATH_TABLE_1 = '/path/to/data/my_table_1.csv'\n",
"FILEPATH_TABLE_2 = '/path/to/data/my_table_2.csv'\n",
"\n",
"with neptune.create_experiment():\n",
"    log_data_version(FILEPATH_TABLE_1, prefix='table_1_')\n",
"    log_data_version(FILEPATH_TABLE_2, prefix='table_2_')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
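The notebook above says the helpers log a data hash but does not show how such a hash could be computed. The sketch below is one plausible approach, assuming an MD5 digest of the file contents and, for folders, a digest over the sorted relative paths and per-file digests. The names `file_version` and `folder_version` are hypothetical, and the actual `neptunecontrib.versioning.data` helpers may use a different scheme.

```python
import hashlib
import os


def file_version(filepath, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def folder_version(dirpath):
    """Return one digest for a folder (illustrative scheme, an assumption).

    Relative paths and per-file digests are fed in sorted order, so the
    result changes when a file is renamed, added, removed or edited.
    """
    digest = hashlib.md5()
    for root, dirs, files in os.walk(dirpath):
        dirs.sort()  # keep the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            relative = os.path.relpath(path, dirpath).replace(os.sep, '/')
            digest.update(relative.encode('utf-8'))
            digest.update(file_version(path).encode('ascii'))
    return digest.hexdigest()
```

Two experiments that log the same digest for a path were run on byte-identical data under this scheme, which is what makes the value useful as a dashboard column.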
2 changes: 2 additions & 0 deletions docs/examples/examples_index.rst
@@ -1,5 +1,7 @@
.. toctree::
Data versioning <data_versioning>
Interactive experiment run comparison <interactive_compare_experiments>
Code snapshotting <code_snapshots>
Hyper parameter comparison <explore_hyperparams_skopt>
Log model diagnostics <log_model_diagnostics>
Monitor lightGBM training <monitor_lgbm>
1 change: 1 addition & 0 deletions docs/index.rst
@@ -37,6 +37,7 @@ And the best thing is you can extend it yourself or... tell us to do it for you
monitoring.skopt <user_guide/monitoring/skopt>
monitoring.utils <user_guide/monitoring/utils>
sync.with_json <user_guide/sync/with_json>
versioning.data <user_guide/versioning/data>
viz.experiments <user_guide/viz/experiments>
viz.projects <user_guide/viz/projects>

2 changes: 1 addition & 1 deletion docs/user_guide/bots/telegram_bot.rst
@@ -1,5 +1,5 @@
Telegram bot
===========
======================

.. automodule:: neptunecontrib.bots.telegram_bot
:members:
2 changes: 1 addition & 1 deletion docs/user_guide/hpo/utils.rst
@@ -1,5 +1,5 @@
Hyper parameter optimization utils
===========
============================================

.. automodule:: neptunecontrib.hpo.utils
:members:
2 changes: 1 addition & 1 deletion docs/user_guide/sync/with_json.rst
@@ -1,5 +1,5 @@
Sync experiments with Neptune via json file
===========
=======================================================

.. automodule:: neptunecontrib.sync.with_json
:members:
6 changes: 6 additions & 0 deletions docs/user_guide/versioning/data.rst
@@ -0,0 +1,6 @@
Versioning data
======================

.. automodule:: neptunecontrib.versioning.data
:members:
:show-inheritance:
54 changes: 27 additions & 27 deletions neptunecontrib/api/utils.py
@@ -38,20 +38,20 @@ def concat_experiments_on_channel(experiments, channel_name):
values concatenated from a list of experiments.
Examples:
Instantiate a session.
Instantiate a session::
>>> from neptune.sessions import Session
>>> session = Session()
from neptune.sessions import Session
session = Session()
Fetch a project and a list of experiments.
Fetch a project and a list of experiments::
>>> project = session.get_projects('neptune-ml')['neptune-ml/Salt-Detection']
>>> experiments = project.get_experiments(state=['aborted'], owner=['neyo'], min_running_time=100000)
project = session.get_projects('neptune-ml')['neptune-ml/Salt-Detection']
experiments = project.get_experiments(state=['aborted'], owner=['neyo'], min_running_time=100000)
Construct a channel value dataframe:
Construct a channel value dataframe::
>>> from neptunecontrib.api.utils import concat_experiments_on_channel
>>> compare_df = concat_experiments_on_channel(experiments,'unet_0 epoch_val iout loss')
from neptunecontrib.api.utils import concat_experiments_on_channel
compare_df = concat_experiments_on_channel(experiments,'unet_0 epoch_val iout loss')
Note:
If an experiment in the list of experiments does not contain the channel with a specified channel_name
@@ -91,22 +91,22 @@ def extract_project_progress_info(leadearboard, metric_colname, time_colname='fi
columns.
Examples:
Instantiate a session.
Instantiate a session::
>>> from neptune.sessions import Session
>>> session = Session()
from neptune.sessions import Session
session = Session()
Fetch a project and the experiment view of that project.
Fetch a project and the experiment view of that project::
>>> project = session.get_projects('neptune-ml')['neptune-ml/Salt-Detection']
>>> leaderboard = project.get_leaderboard()
project = session.get_projects('neptune-ml')['neptune-ml/Salt-Detection']
leaderboard = project.get_leaderboard()
Create a progress info dataframe.
Create a progress info dataframe::
>>> from neptunecontrib.api.utils import extract_project_progress_info
>>> progress_df = extract_project_progress_info(leadearboard,
>>> metric_colname='channel_IOUT',
>>> time_colname='finished')
from neptunecontrib.api.utils import extract_project_progress_info
progress_df = extract_project_progress_info(leaderboard,
metric_colname='channel_IOUT',
time_colname='finished')
"""
system_columns = ['id', 'owner', 'running_time', 'tags']
progress_columns = system_columns + [time_colname, metric_colname]
@@ -207,16 +207,16 @@ def get_filepaths(dirpath='.', extensions=None):
list: A list of filepaths with given extensions that are in the directory or subdirectories.
Examples:
Initialize Neptune
Initialize Neptune::
>>> import neptune
>>> from neptunecontrib.versioning.data import log_data_version
>>> neptune.init('USER_NAME/PROJECT_NAME')
import neptune
from neptunecontrib.api.utils import get_filepaths
neptune.init('USER_NAME/PROJECT_NAME')
Create experiment and track all .py files from given directory and subdirs:
Create experiment and track all .py files from given directory and subdirs::
>>> with neptune.create_experiment(upload_source_files=get_filepaths(extensions=['.py'])):
>>> neptune.send_metric('score', 0.97)
with neptune.create_experiment(upload_source_files=get_filepaths(extensions=['.py'])):
    neptune.send_metric('score', 0.97)
"""
if not extensions: