Feature/#79 rename data science sandbox to exasol ai lab (#147)
* Updated error_code_config and changes file
* Updated git reference in exasol/ds/sandbox/templates/ci_code_build.jinja.yaml
* Updated default password
* Update documentation
TBD: images not yet updated

* Renamed AMI prefix
* Updated names of VM images
* Replaced name of GitHub repository in additional places
* Updated additional documentation
* Updated links in pyproject.toml
* Updated project description in pyproject.toml
* Updated name of docker image
* Additional renamings
* Renamed poetry package
* Updated DEFAULT_CHANGE_SET_PREFIX
* Updated AWS asset tag
* Renamed AWS Bucket prefix (folder in bucket)
* Replaced short tag XAL by XAIL
* Release notes, S3: Changed title and removed column "S3 URI"
* Updated references in developer guide
* Renamed default filename for Secure Configuration Storage
* Replaced path in AWS bucket by jinja variable and renamed to ai-lab
* Removed file setup_db.ipynb
ckunki authored Jan 30, 2024
1 parent 4c6fc9b commit 372172c
Showing 37 changed files with 105 additions and 183 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
# Exasol Data Science Sandbox
# Exasol AI-Lab

## Overview

@@ -7,7 +7,7 @@ enabling users to try out data science algorithms in Jupyter notebooks connected

## Where to find the VM images

The release process will automatically store the links to the images in the [release notes](https://github.com/exasol/data-science-sandbox/releases/latest), as there will be a specific AMI per release.
The release process will automatically store the links to the images in the [release notes](https://github.com/exasol/ai-lab/releases/latest), as there will be a specific AMI per release.
Please check the user guide for details about the image.

## Links
2 changes: 1 addition & 1 deletion aws-code-build/ci/buildspec_release.yaml
@@ -3,7 +3,7 @@ version: 0.2
env:
shell: bash
variables:
DEFAULT_PASSWORD: "dss"
DEFAULT_PASSWORD: "ai-lab"
ASSET_ID: ""
AWS_USER_NAME: "release_user"
MAKE_AMI_PUBLIC_OPTION: "--no-make-ami-public"
7 changes: 4 additions & 3 deletions doc/changes/changes_0.1.0.md
@@ -1,12 +1,12 @@
# data-science-sandbox 0.1.0, released t.b.d.
# ai-lab 0.1.0, released t.b.d.

Code name: Initial release

## Summary

Initial release of the data-science-sandbox. It provides the creation of an Amazon Machine Image (AMI) and virtual machine images for a specific version of the data-science-sanbox-release project.
Initial release of the Exasol AI-Lab. It provides the creation of an Amazon Machine Image (AMI), virtual machine images, and a Docker image for a specific version of the AI-Lab project.

## Data-Science-Sandbox-Release
## AI-Lab-Release

Version: 0.1.0

@@ -31,6 +31,7 @@ Version: 0.1.0
* #137: Set Jupyter lab default URL to AI-Lab start page
* #75: Changed default port of Jupyter server to 49494
* #145: Add Docker Test Library to prepare Notebook tests
* #255: Renamed data science sandbox to exasol-ai-lab

## Bug Fixes

4 changes: 2 additions & 2 deletions doc/developer_guide/ci.md
@@ -26,7 +26,7 @@ The CodeBuild will take about 20 minutes to complete.

To run these tests locally please use

```shell
export DSS_RUN_CI_TEST=true; poetry run test/codebuild/test_ci.py
```
```shell
export DSS_RUN_CI_TEST=true; poetry run test/codebuild/test_ci.py
```

4 changes: 2 additions & 2 deletions doc/developer_guide/commands.md
@@ -1,6 +1,6 @@
# Commands

The commands offered by the DSS CLI can be organized into three groups:
The AI-Lab CLI offers commands in the following three groups:

| Group | Usage |
|----------------------|-----------------------------------------|
@@ -14,7 +14,7 @@ The following commands are used during the release AWS Codebuild job:
* `create-vm`: Create a new AMI and VM images.
* `update-release`: Update release notes of an existing Github release.
* `start-release-build`: Start the release on AWS codebuild.
* `create-docker-image`: Create a Docker image for data-science-sandbox and deploy it to hub.docker.com/exasol/data-science-sandbox.
* `create-docker-image`: Create a Docker image for ai-lab and deploy it to hub.docker.com/exasol/ai-lab.

Script `start-release-build`:
* Is usually called from github workflow `release_droid_upload_github_release_assets.yml`.
5 changes: 2 additions & 3 deletions doc/developer_guide/developer_guide.md
@@ -1,4 +1,4 @@
# Data Science Sandbox Developer Guide
# Exasol AI-Lab Developer Guide

## Overview

@@ -28,8 +28,7 @@ bash install.sh

## Design Goals

The Data Science Sandbox (DSS) uses AWS as backend, because it provides the possibility to run the whole workflow during
a ci-test.
The Exasol AI-Lab (XAIL) uses AWS as its backend because it allows running the whole workflow during a ci-test.

This project uses

6 changes: 3 additions & 3 deletions doc/developer_guide/testing.md
@@ -10,9 +10,9 @@ Creating a docker image is quite time-consuming, currently around 7 minutes. In
docker image in the tests in `integration/test_create_dss_docker_image.py`
simply add CLI option `--dss-docker-image` when calling `pytest`:

```shell
poetry run pytest --dss-docker-image exasol/data-science-sandbox:0.1.0
```
```shell
poetry run pytest --dss-docker-image exasol/ai-lab:0.1.0
```

#### Executing tests involving AWS resources

2 changes: 1 addition & 1 deletion doc/user_guide/ami_usage.md
@@ -24,7 +24,7 @@ __Important__: The AMI is currently only available in the AWS region `eu-central
5. Launch the EC2 instance:
- In the navigation bar on the left select "Instances"
- Click button "Launch instances"
- At field "Application and OS Images" select the AMI id of the sandbox (found in the [release notes](https://github.com/exasol/data-science-sandbox/releases/latest))
- At field "Application and OS Images" select the AMI id of the sandbox (found in the [release notes](https://github.com/exasol/ai-lab/releases/latest))
- Select an appropriate instance type (at least "t2.small" or similar)
- Choose your key pair
- Choose the security group which you created in step 3.
10 changes: 5 additions & 5 deletions doc/user_guide/docker/docker_usage.md
@@ -4,7 +4,7 @@ Using Exasol AI-Lab Docker Edition requires some specific prerequisites but also

## Need to Know About Docker Images and Containers

Exasol AI-Lab Docker Edition is published as a so-called _Docker Image_ on [Docker Hub](https://hub.docker.com/r/exasol/data-science-sandbox).
Exasol AI-Lab Docker Edition is published as a so-called _Docker Image_ on [Docker Hub](https://hub.docker.com/r/exasol/ai-lab).

In order to use such an image you need two components:
* Docker client
@@ -101,7 +101,7 @@ The following command will
docker run \
--volume ${VOLUME}:/root/notebooks \
--publish ${LISTEN_IP}:49494:49494 \
exasol/data-science-sandbox:${VERSION}
exasol/ai-lab:${VERSION}
```

Additional options
@@ -121,15 +121,15 @@ docker run \
--volume ${VOLUME}:/root/notebooks \
--volume /var/run/docker.sock:/var/run/docker.sock \
--publish ${LISTEN_IP}:49494:49494 \
exasol/data-science-sandbox:${VERSION}
exasol/ai-lab:${VERSION}
```

## Connecting to Jupyter Service

When starting the AI-Lab as a Docker container, the command line will display a welcome message showing connection instructions and a reminder to change the default password:

```
$ docker run --publish 0.0.0.0:$PORT:49494 exasol/data-science-sandbox:$VERSION
$ docker run --publish 0.0.0.0:$PORT:49494 exasol/ai-lab:$VERSION
Server for Jupyter has been started successfully.
You can connect with http://<host>:<port>
@@ -141,7 +141,7 @@ port to the same port then you can connect with http://localhost:49494.
│ │├─┘ ││├─┤ │ ├┤ └┬┘│ ││ │├┬┘ ││ │├─┘└┬┘ │ ├┤ ├┬┘ ├─┘├─┤└─┐└─┐││││ │├┬┘ ││ │
└─┘┴ ─┴┘┴ ┴ ┴ └─┘ ┴ └─┘└─┘┴└─ └┘└─┘┴ ┴ ┴ └─┘┴└─ ┴ ┴ ┴└─┘└─┘└┴┘└─┘┴└──┴┘ o
The default password is "dss".
The default password is "ai-lab".
To update the password as user root run
/root/jupyterenv/bin/jupyter-lab server password
```
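For illustration, a minimal run that maps the container's Jupyter port to a custom host port could look as follows; the host port `8888` and the version tag `0.1.0` are placeholder values:

```shell
# map host port 8888 to the container's Jupyter port 49494 (values are illustrative)
docker run --publish 0.0.0.0:8888:49494 exasol/ai-lab:0.1.0
```

With this mapping you would connect to http://localhost:8888 and log in with the default password shown above.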
8 changes: 4 additions & 4 deletions doc/user_guide/editions.md
@@ -9,7 +9,7 @@ Exasol AI-Lab is available in the following editions:
| Docker Edition | Docker Image |

Each of the editions is associated with an _image_ in a specific format which
* Is linked in the [release notes](https://github.com/exasol/data-science-sandbox/releases/latest) for download
* Is linked in the [release notes](https://github.com/exasol/ai-lab/releases/latest) for download
* Contains all necessary dependencies
* Provides a running Jupyterlab instance which is automatically started when booting or running the image

@@ -24,9 +24,9 @@ Recommendations

### AMI Edition

The ID of the AMI (Amazon Machine Image) is mentioned in the [release notes](https://github.com/exasol/data-science-sandbox/releases/latest) and can be used to start an EC2-instance in your AWS account.
The ID of the AMI (Amazon Machine Image) is mentioned in the [release notes](https://github.com/exasol/ai-lab/releases/latest) and can be used to start an EC2-instance in your AWS account.

The naming scheme is: "_Exasol-Data-Science-Sandbox-${VERSION}_", e.g. "_Exasol-Data-Science-Sandbox-5.0.0_"
The naming scheme is: "_Exasol-AI-Lab-${VERSION}_", e.g. "_Exasol-AI-Lab-5.0.0_"

See also [User Guide for AI-Lab AMI Edition](ami_usage.md).

@@ -51,6 +51,6 @@ See also [User Guide for AI-Lab VM Edition](vm_usage.md).

### Docker Edition

The Docker image is published to DockerHub at https://hub.docker.com/r/exasol/data-science-sandbox.
The Docker image is published to DockerHub at https://hub.docker.com/r/exasol/ai-lab.

See also [User Guide for AI-Lab Docker Edition](docker/docker_usage.md).
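As a quick check, the published image can be pulled directly from Docker Hub; the version tag below is only an example:

```shell
docker pull exasol/ai-lab:0.1.0
```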
6 changes: 3 additions & 3 deletions doc/user_guide/user_guide.md
@@ -1,4 +1,4 @@
# Data Science Sandbox User Guide
# Exasol AI-Lab User Guide

## Overview

@@ -29,7 +29,7 @@ AI-Lab can automatically launch such an instance on demand. However when using A
Username: **ubuntu**

At the first login to the sandbox (image or AMI) you will be prompted to change your password.
The default password is: **dss**
The default password is: **ai-lab**

However, we suggest using ssh-keys for the connection. When you use the AWS AMI, this works automatically. When using the VM images, you need to deploy your ssh-keys. After you have enabled ssh-keys, we recommend disabling ssh password authentication:
```shell
@@ -47,7 +47,7 @@ Root location
|---------------------|------------------------------------------|
| Virtual environment | location `/$ROOT/jupyterenv` |
| Location notebooks | location `/$ROOT/notebooks` |
| Password | `dss` |
| Password | `ai-lab` |
| HTTP Port | `49494` (or the port you forwarded it to) |

Exasol strongly recommends changing the Jupyter password as soon as possible. Details about how to do that are shown on the login screen.
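For reference, a sketch of the password change using the command from the Docker welcome message above; the path assumes the default virtual environment location (`/$ROOT/jupyterenv`) and may differ on your system:

```shell
# run as user root inside the AI-Lab image; adjust the path to your $ROOT
/root/jupyterenv/bin/jupyter-lab server password
```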
2 changes: 1 addition & 1 deletion doc/user_guide/vm_usage.md
@@ -11,7 +11,7 @@

### Step-by-step

1. Download the `VMDK` file from the [release notes](https://github.com/exasol/data-science-sandbox/releases/latest).
1. Download the `VMDK` file from the [release notes](https://github.com/exasol/ai-lab/releases/latest).
2. Open Boxes
3. Create a new VM: Click the + Button
4. Choose: "Create virtual machine from file"<br />
2 changes: 1 addition & 1 deletion error_code_config.yml
@@ -1,3 +1,3 @@
error-tags:
DSS:
XAIL:
highest-index: 0
8 changes: 4 additions & 4 deletions exasol/ds/sandbox/cli/commands/create_docker_image.py
@@ -58,10 +58,10 @@ def create_docker_image(
log_level: str,
):
"""
Create a Docker image for data-science-sandbox. If option
``--publish`` is specified then deploy the image to the Docker registry
using the specified user name and reading the password from environment
variable ``PASSWORD_ENV``.
Create a Docker image for ai-lab. If option ``--publish`` is
specified then deploy the image to the Docker registry using the specified
user name and reading the password from environment variable
``PASSWORD_ENV``.
"""
def registry_password():
if registry_user is None:
4 changes: 2 additions & 2 deletions exasol/ds/sandbox/lib/asset_id.py
@@ -1,5 +1,5 @@
class AssetId:
def __init__(self, asset_id: str, stack_prefix="EC2-DATA-SCIENCE-SANDBOX-", ami_prefix="Exasol-Data-Science-Sandbox"):
def __init__(self, asset_id: str, stack_prefix="EC2-DATA-SCIENCE-SANDBOX-", ami_prefix="Exasol-AI-Lab"):
self._asset_id = asset_id
self._stack_prefix = stack_prefix
self._ami_prefix = ami_prefix
@@ -23,4 +23,4 @@ def stack_prefix(self):
def __repr__(self):
return self._asset_id

BUCKET_PREFIX = "data_science_sandbox"
BUCKET_PREFIX = "ai_lab"
13 changes: 5 additions & 8 deletions exasol/ds/sandbox/lib/asset_printing/print_assets.py
@@ -156,41 +156,38 @@ def print_s3_objects(aws_access: AwsAccess, asset_id: Optional[AssetId], printin
else:
prefix = ""

table_printer = printing_factory.create_table_printer(title=f"S3 Objects (Bucket={vm_bucket} Prefix={prefix})")
table_printer = printing_factory.create_table_printer(title=f"VM Images, Prefix={prefix})")

table_printer.add_column("Key", no_wrap=True)
table_printer.add_column("Size", no_wrap=True)
table_printer.add_column("S3 URI", no_wrap=False)
table_printer.add_column("URL", no_wrap=False)

# How the filtering works:
# 1. The VM are stored under following location in the S3 Bucket: $BUCKET_PREFIX/$AssetId/name.$VM_FORMAT
# For example "data_science_sandbox/5.0.0/export-ami-01be860e6a6a98bf8.vhd"
# For example "ai_lab/5.0.0/export-ami-01be860e6a6a98bf8.vhd"
# 2. Because S3 list_s3_object does not support wildcards,
# we need to implement our own wildcard implementation here.
# We call list_s3_object with the standard prefix (e.g. "data_science_sandbox"),
# We call list_s3_object with the standard prefix (e.g. "ai_lab"),
# which returns ALL stored vm objects.
# 3. If no filter is given (asset_id == None), "prefix" will be empty, and we return all s3 objects
# 4. If the variable "prefix" is not empty, we need to ensure that it ends with a wildcard, so that the matching
# works correctly.
# => Assume that a filter is given "5.0.0". Variable prefix would be "data_science_sandbox/5.0.0".
# => Assume that a filter is given "5.0.0". Variable prefix would be "ai_lab/5.0.0".

s3_objects = aws_access.list_s3_objects(bucket=vm_bucket, prefix=AssetId.BUCKET_PREFIX)

if s3_objects is not None and len(prefix) > 0:
if prefix[-1] != "*":
prefix = f"{prefix}*"
s3_objects = [s3_object for s3_object in s3_objects if fnmatch.fnmatch(s3_object.key, prefix)]
s3_bucket_uri = "s3://{bucket}/{{object}}".format(bucket=vm_bucket)
https_bucket_url = "https://{url_for_bucket}/{{object}}".format(url_for_bucket=url_for_bucket)

if s3_objects is not None:
for s3_object in s3_objects:
obj_size = humanfriendly.format_size(s3_object.size)
key = s3_object.key
s3_uri = s3_bucket_uri.format(object=key)
https_url = https_bucket_url.format(object=urllib.parse.quote(key))
table_printer.add_row(key, obj_size, s3_uri, https_url)
table_printer.add_row(key, obj_size, https_url)

table_printer.finish()

2 changes: 1 addition & 1 deletion exasol/ds/sandbox/lib/aws_access/deployer.py
@@ -28,7 +28,7 @@
"ChangeSetResult", ["changeset_id", "changeset_type"])


DEFAULT_CHANGE_SET_PREFIX="dss-ci-setup-deploy-"
DEFAULT_CHANGE_SET_PREFIX="ai-lab-ci-setup-deploy-"


class Deployer(object):
9 changes: 5 additions & 4 deletions exasol/ds/sandbox/lib/dss_docker/create_image.py
@@ -18,8 +18,9 @@
from exasol.ds.sandbox.lib.setup_ec2.host_info import HostInfo
from exasol.ds.sandbox.lib.setup_ec2.run_install_dependencies import run_install_dependencies

DEFAULT_ORG_AND_REPOSITORY = "exasol/data-science-sandbox"
DSS_VERSION = version("exasol-data-science-sandbox")
DEFAULT_ORG_AND_REPOSITORY = "exasol/ai-lab"
# name of the project as specified in file pyproject.toml
DSS_VERSION = version("exasol-ai-lab")
_logger = get_status_logger(LogType.DOCKER_IMAGE)


@@ -94,7 +94,7 @@ def _ansible_run_context(self) -> AnsibleRunContext:
"docker_container": self.container_name,
}
return AnsibleRunContext(
playbook="dss_docker_playbook.yml",
playbook="ai_lab_docker_playbook.yml",
extra_vars=extra_vars,
)

@@ -144,7 +145,7 @@ def _commit_container(
container: DockerContainer,
facts: AnsibleFacts,
) -> DockerImage:
_logger.debug(f'DSS facts: {get_fact(facts)}')
_logger.debug(f'AI-Lab facts: {get_fact(facts)}')
_logger.info("Committing changes to docker container")
virtualenv = get_fact(facts, "jupyter", "virtualenv")
port = get_fact(facts, "jupyter", "port")
4 changes: 2 additions & 2 deletions exasol/ds/sandbox/lib/export_vm/rename_s3_objects.py
@@ -14,7 +14,7 @@ def build_image_source(prefix: str, export_image_task_id: str, vm_image_format:

def build_image_destination(prefix: str, asset_id: AssetId, vm_image_format: VmDiskImageFormat) -> str:
img_format = vm_image_format.value.lower()
return "{bucket_prefix}exasol-data-science-sandbox-{asset_id}.{img_format}".format(
return "{bucket_prefix}exasol-ai-lab-{asset_id}.{img_format}".format(
bucket_prefix=prefix,
asset_id=str(asset_id),
img_format=img_format)
@@ -26,7 +26,7 @@ def rename_image_in_s3(aws_access: AwsAccess, export_image_task: ExportImageTask
"""
Renames the resulting S3 object of an export-image-task.
The source objects always have the format "$export-image-task-id.$format".
The destination objects always have the format "exasol-data-science-sandbox-{asset_id}.{img_format}"
The destination objects always have the format "exasol-ai-lab-{asset_id}.{img_format}"
The bucket and prefix in bucket do not change.
:param aws_access: Access proxy to Aws
:param export_image_task: The export image task which is expected to be completed successfully.
2 changes: 1 addition & 1 deletion exasol/ds/sandbox/lib/github_release_access.py
@@ -70,5 +70,5 @@ def upload(self, archive_path: str, label: str, release_id: int, content_type: s
@property
def _get_repo(self) -> Repository:
gh = Github(self._gh_token)
gh_repo = gh.get_repo("exasol/data-science-sandbox")
gh_repo = gh.get_repo("exasol/ai-lab")
return gh_repo
2 changes: 1 addition & 1 deletion exasol/ds/sandbox/lib/logging.py
@@ -21,7 +21,7 @@ class LogType(Enum):


def get_status_logger(log_type: LogType) -> logging.Logger:
return logging.getLogger(f"edss-{log_type.value}")
return logging.getLogger(f"ai-lab-{log_type.value}")


def set_log_level(level: str):
@@ -2,6 +2,7 @@
from exasol.ds.sandbox.lib.logging import get_status_logger, LogType
from exasol.ds.sandbox.lib.render_template import render_template
from exasol.ds.sandbox.lib.vm_bucket.vm_dss_bucket import find_vm_bucket
from exasol.ds.sandbox.lib.asset_id import AssetId

RELEASE_CODE_BUILD_STACK_NAME = "DATA-SCIENCE-SANDBOX-RELEASE-CODEBUILD"

@@ -13,6 +14,7 @@ def run_setup_release_codebuild(aws_access: AwsAccess) -> None:
yml = render_template(
"release_code_build.jinja.yaml",
vm_bucket=find_vm_bucket(aws_access),
path_in_bucket=AssetId.BUCKET_PREFIX,
dockerhub_secret_arn=secret_arn,
)
aws_access.upload_cloudformation_stack(yml, RELEASE_CODE_BUILD_STACK_NAME)
2 changes: 1 addition & 1 deletion exasol/ds/sandbox/lib/tags.py
@@ -1,4 +1,4 @@
DEFAULT_TAG_KEY = "exa_dss_id"
DEFAULT_TAG_KEY = "exa_ai_lab_id"


def create_default_asset_tag(value: str) -> list:

