Bootstrapping docker container for sample datasets #803

Open
wants to merge 14 commits into
base: development
16 changes: 16 additions & 0 deletions .factory/automation.yml
@@ -5,6 +5,22 @@ config:

build:
  correctness:
    test-sample-bootstrapper:
      filter:
        owner: vaticle
        branch: development
      image: vaticle-ubuntu-21.04
      type: foreground
      command: |
        sudo apt-get update -y
        sudo apt-get install -y docker
        sudo service docker start
        docker build . -f typedb-samples-docker/Dockerfile --tag test-typedb-samples
        docker run -p 1729:1729 -e BOOTSTRAPPER_VERBOSE=true -e BOOTSTRAPPER_CONFIG=/typedb-samples/__test/config.yml -e BOOTSTRAPPER_DATASET_ROOT=/typedb-samples/__test --name test-typedb-samples-container test-typedb-samples &
        sleep 30
        docker exec test-typedb-samples-container typedb console --command="database list" && export TEST_SUCCESS=0 || export TEST_SUCCESS=1
        docker stop test-typedb-samples-container; docker rm test-typedb-samples-container
        exit $TEST_SUCCESS
Member Author

@krishnangovindraj krishnangovindraj Feb 21, 2024


How often would we want to run this test?
It shouldn't take significantly longer than testing against a local installation of TypeDB, which is the alternative.

The test has run successfully in CI: https://factory.vaticle.com/krishnangovindraj/typedb-docs/1168c68815d9f9fa01580dbc4f8a9233607d677a/build/1/correctness/1/test-sample-bootstrapper/1
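
The CI command above waits a fixed 30 seconds before probing the container. As a rough sketch of an alternative (not part of this PR), the probe could be polled until it succeeds, in the same spirit as the retry loop in `start_typedb()`. The helper name `wait_until_ready` and its parameters are hypothetical; the `run` and `sleep` arguments exist only so the loop can be exercised without Docker.

```python
import subprocess
import time


def wait_until_ready(probe_cmd, attempts=15, delay_seconds=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Return True as soon as probe_cmd exits with 0, False once all attempts fail.

    Hypothetical helper: `run` and `sleep` are injectable for testing.
    """
    for attempt in range(attempts):
        result = run(probe_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if result.returncode == 0:
            return True
        if attempt < attempts - 1:
            # Only wait when another attempt will follow.
            sleep(delay_seconds)
    return False


# Probe taken from the CI command in this PR; assumes the container name used there.
PROBE = ["docker", "exec", "test-typedb-samples-container",
         "typedb", "console", "--command=database list"]
```

Calling `wait_until_ready(PROBE)` would return as soon as `database list` succeeds inside the container, so a fast bootstrap would not pay the full 30 seconds and a slow one would get up to `attempts * delay_seconds` of extra time.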

    deploy-development:
      filter:
        owner: vaticle
18 changes: 18 additions & 0 deletions typedb-samples-docker/Dockerfile
@@ -0,0 +1,18 @@
FROM ubuntu:22.04
WORKDIR /typedb-samples
EXPOSE 1729/tcp

COPY typedb-samples-docker/vaticle.gpg /etc/apt/trusted.gpg.d/vaticle.gpg
RUN echo "deb https://repo.typedb.com/public/public-release/deb/ubuntu trusty main" > /etc/apt/sources.list.d/vaticle.list; apt update -y && apt install -y default-jre python3 python3-pip; python3 -m pip install requests==2.31.0 pyyaml==6.0.1
COPY typedb-samples-docker/bootstrapper.py /typedb-samples

# We need the 'exec' before typedb server, or the SIGTERM from `docker stop` will not be forwarded to it
CMD python3 -u bootstrapper.py && exec typedb server
Member Author


The lack of `exec` on the first command also means we can't interrupt the container while it is bootstrapping.
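
One possible way around this, sketched here under assumptions and not what the PR does: make Python the container's main process (for example via an exec-form `CMD`), install a SIGTERM handler so the bootstrap can be interrupted, and then replace the Python process with the server using `os.execvp`. The names `BootstrapInterrupted` and `bootstrap_then_exec` are hypothetical, and `exec_fn` is injectable only so the flow can be tested without actually replacing the process.

```python
import os
import signal


class BootstrapInterrupted(Exception):
    """Raised in the main thread when SIGTERM arrives during bootstrapping."""


def _raise_on_sigterm(signum, frame):
    raise BootstrapInterrupted("received signal %d" % signum)


def bootstrap_then_exec(bootstrap, server_argv, exec_fn=os.execvp):
    """Run bootstrap(), then replace this process with the server command.

    Hypothetical helper. An explicit handler is installed because a process
    running as PID 1 may not receive SIGTERM with its default disposition.
    """
    signal.signal(signal.SIGTERM, _raise_on_sigterm)
    bootstrap()
    # On success os.execvp does not return: the server takes over this PID,
    # so a later `docker stop` would signal the server directly.
    exec_fn(server_argv[0], server_argv)
```

With this shape, the last step of the script would be something like `bootstrap_then_exec(main_bootstrap, ["typedb", "server"])`, where `main_bootstrap` stands in for the existing bootstrapping logic.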



# The below copies are for testing. Uncomment if re-creating the master container.
COPY typedb-samples-docker/config.yml /typedb-samples/__test/config.yml
COPY learn-src/modules/ROOT/attachments/ /typedb-samples/__test

# To re-create the master container:
# `docker build . -f typedb-samples-docker/Dockerfile --tag vaticle/typedb-sample-datasets:<VERSION>`
119 changes: 119 additions & 0 deletions typedb-samples-docker/bootstrapper.py
@@ -0,0 +1,119 @@
import os
import requests
import subprocess
import urllib.parse
import yaml

from time import sleep

BOOTSTRAPPER_VERSION = "1.0.0" # Will fail if the config.yml is of a different version.
DEFAULT_CONFIG_YML = "https://raw.githubusercontent.com/vaticle/typedb-docs/master/typedb-samples-docker/config.yml"
BOOTSTRAPPER_PORT = 1730 # During dataset loading, TypeDB runs on a different port to remain unreachable.

VERBOSE = os.environ.get("BOOTSTRAPPER_VERBOSE", "false").lower() != "false"
# Overrides for testing with local files
CONFIG_OVERRIDE = os.environ.get("BOOTSTRAPPER_CONFIG", None)
DATASET_ROOT_OVERRIDE = os.environ.get("BOOTSTRAPPER_DATASET_ROOT", None)
Member Author


Overrides for testing with local files. This can be used to test against a local typedb instance, or in a locally built docker container as the CI test does.
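
As an illustration of how such an override could resolve to either a URL or a local path, here is a small sketch that mirrors the branching in `_download_dataset`. The function name `resolve_dataset_location` and the `example.com` URLs are hypothetical. It also shows one detail of `urllib.parse.urljoin` worth keeping in mind: the root URL needs a trailing slash, otherwise its last path segment is replaced.

```python
import os
from urllib.parse import urljoin


def is_url(s):
    return s.startswith("https://") or s.startswith("http://")


def resolve_dataset_location(root, relative_path):
    """Join a dataset root (URL or local directory) with a relative file path."""
    if is_url(root):
        # urljoin replaces the last path segment unless root ends with "/".
        return urljoin(root, relative_path)
    return os.path.join(root, relative_path)


# Hypothetical roots, for illustration only.
with_slash = resolve_dataset_location("https://example.com/attachments/", "bookstore-schema.tql")
without_slash = resolve_dataset_location("https://example.com/attachments", "bookstore-schema.tql")
local = resolve_dataset_location("/typedb-samples/__test", "bookstore-schema.tql")
```

Here `with_slash` keeps the `attachments/` directory while `without_slash` drops it, which is why the `dataset-root` in `config.yml` ends with a slash.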


class TypeDBBootstrapperException(Exception):
    pass

def _is_url(s):
    return s.startswith("https://") or s.startswith("http://")

def _http_get(url):
    resp = requests.get(url)
    if resp.status_code == 200:
        return resp.text
    else:
        raise TypeDBBootstrapperException("Could not download sample.yml from: " + url)

def _console_command(cmd, silence_errors=False):
    return _run_cmd(["typedb", "console", "--core=127.0.0.1:%d"%BOOTSTRAPPER_PORT] + cmd, silence_errors)

def _run_cmd(cmd, silence_errors=False):
    stdout_to = None if VERBOSE else subprocess.DEVNULL
    stderr_to = subprocess.DEVNULL if silence_errors and not VERBOSE else None
    result = subprocess.run(cmd, stdout=stdout_to, stderr=stderr_to)
    if result.returncode != 0:
        raise TypeDBBootstrapperException("Running command failed: " + " ".join(cmd))

def load_config(config_path):
    if CONFIG_OVERRIDE is not None:
        print("BOOTSTRAPPER_CONFIG was defined. Loading config from %s"%CONFIG_OVERRIDE)
        raw_yaml = _http_get(CONFIG_OVERRIDE) if _is_url(CONFIG_OVERRIDE) else open(CONFIG_OVERRIDE, 'r').read()
    else:
        print("Loading config from %s"%config_path)
        raw_yaml = _http_get(config_path)
    return yaml.safe_load(raw_yaml)

def install_typedb(version):
    print("Installing TypeDB: " + version)
    _run_cmd(["apt", "update", "-y"])
    _run_cmd(["apt", "install", "-y", "default-jre", "typedb=%s"%version])
    version_output = subprocess.check_output(["typedb", "server", "--version"])
    version_line = version_output.decode().strip().split("\n")[-1]
    installed_version = version_line[len("Version:"):].strip()
    assert installed_version == version
    print("Successfully installed TypeDB: " + installed_version)

def start_typedb():
    print("Starting TypeDB for bootstrap")
    output_to = None if VERBOSE else subprocess.DEVNULL
    process = subprocess.Popen(["typedb","server", "--server.address=127.0.0.1:%d"%BOOTSTRAPPER_PORT], stdout=output_to, stderr=output_to)
    for i in range(10):
        try:
            _console_command(["--command=database list"], i != 9)
            return process
        except TypeDBBootstrapperException as e:
            sleep(2)
    raise TypeDBBootstrapperException("Could not start typedb server")

def _download_dataset(from_root, from_relative, to_path):
    if DATASET_ROOT_OVERRIDE is not None:
        content = _http_get(urllib.parse.urljoin(DATASET_ROOT_OVERRIDE, from_relative)) if _is_url(DATASET_ROOT_OVERRIDE) else open(os.path.join(DATASET_ROOT_OVERRIDE, from_relative), 'r').read()
    else:
        content = _http_get(urllib.parse.urljoin(from_root, from_relative))
    with open(to_path, "w") as f:
        f.write(content)

def install_datasets(dataset_root, datasets):
    if DATASET_ROOT_OVERRIDE is not None:
        print("BOOTSTRAPPER_DATASET_ROOT detected. Datasets will be loaded from: ", DATASET_ROOT_OVERRIDE)
    else:
        print("Loading datasets from: ", dataset_root)

    for i, dataset in enumerate(datasets, start=1):
        print("%d/%d Loading %s"%(i, len(datasets), dataset))
        schema_file = "%s.schema.tql"%dataset
        data_file = "%s.data.tql"%dataset
        _download_dataset(dataset_root, datasets[dataset]["schema"], schema_file)
        _download_dataset(dataset_root, datasets[dataset]["data"], data_file)
        _console_command(["--command=database create %s" % dataset])
        _console_command(["--command=transaction %s schema write" % dataset, "--command=source %s" % schema_file, "--command=commit"])
        _console_command(["--command=transaction %s data write" % dataset, "--command=source %s" % data_file, "--command=commit"])
        print("%d/%d Completed loading dataset: %s"%(i, len(datasets), dataset))

def main():
    try:
        config = load_config(DEFAULT_CONFIG_YML)
        if config['bootstrapper-version'] != BOOTSTRAPPER_VERSION:
            raise TypeDBBootstrapperException("This bootstrapper is outdated and will not run. Please update to version: " + config['bootstrapper-version'])
        install_typedb(config['typedb-version'])
        with start_typedb() as typedb_process:
            try:
                install_datasets(config['dataset-root'], config['datasets'])
            finally:
                print("Shutting down TypeDB.")
                typedb_process.terminate()
                typedb_process.wait()
        print("Bootstrapping complete!")
    except Exception as e:
        print("Error during bootstrapping. Run with environment variable BOOTSTRAPPER_VERBOSE=True for subcommand output")
        if VERBOSE:
            raise e
        else:
            print(str(e))
            quit(1)

if __name__ == "__main__": main()
12 changes: 12 additions & 0 deletions typedb-samples-docker/config.yml
@@ -0,0 +1,12 @@
# Track version for breaking changes.
bootstrapper-version: 1.0.0
Member Author


A bootstrapper-version for if/when we have to make breaking changes.
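
To make the intent concrete, here is a minimal sketch of the version gate. The exact-match branch mirrors the check in `main()`; the `major_only` option is a hypothetical relaxation that this PR does not implement, shown only to illustrate how "breaking changes" could be tied to the major version. The function name `is_compatible` is also an assumption.

```python
def is_compatible(config_version, bootstrapper_version="1.0.0", major_only=False):
    """Decide whether a config's bootstrapper-version is accepted.

    Exact match mirrors the PR; major_only is a hypothetical alternative.
    """
    # str() guards against YAML parsing a two-part version such as 1.0 as a float.
    declared = str(config_version)
    supported = str(bootstrapper_version)
    if major_only:
        return declared.split(".")[0] == supported.split(".")[0]
    return declared == supported


exact_ok = is_compatible("1.0.0")
exact_minor_bump = is_compatible("1.1.0")
relaxed_minor_bump = is_compatible("1.1.0", major_only=True)
relaxed_major_bump = is_compatible("2.0.0", major_only=True)
```

Under the exact-match rule, any change to `bootstrapper-version` in the hosted `config.yml` stops older containers from bootstrapping; the relaxed rule would only do so on a major bump.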

# TypeDB Version needed
typedb-version: 2.26.6

# Datasets to pre-install
dataset-root: https://raw.githubusercontent.com/vaticle/typedb-docs/master/learn-src/modules/ROOT/attachments/
datasets:
  # The key will be used as database name. Schema & data file paths are relative to dataset-root
  bookstore:
    schema: bookstore-schema.tql
    data: bookstore-data.tql
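
As a rough sketch of how the `datasets` mapping translates into work for the bootstrapper, the function below builds the console argument lists that `install_datasets` passes to `_console_command` for each entry, without running anything. The name `console_invocations` is hypothetical; the local file names follow the `<name>.schema.tql` / `<name>.data.tql` pattern used in `bootstrapper.py`.

```python
def console_invocations(datasets):
    """Return the console argument lists issued per dataset, in order.

    Hypothetical helper: it only builds the lists, it does not run the console.
    """
    invocations = []
    for name in datasets:
        # The bootstrapper downloads each file to these local names first.
        schema_file = "%s.schema.tql" % name
        data_file = "%s.data.tql" % name
        invocations.append(["--command=database create %s" % name])
        invocations.append(["--command=transaction %s schema write" % name,
                            "--command=source %s" % schema_file,
                            "--command=commit"])
        invocations.append(["--command=transaction %s data write" % name,
                            "--command=source %s" % data_file,
                            "--command=commit"])
    return invocations


# The single dataset declared in config.yml above.
plan = console_invocations({
    "bookstore": {"schema": "bookstore-schema.tql", "data": "bookstore-data.tql"},
})
```

Each dataset therefore costs three console invocations: create the database, load the schema in a schema write transaction, then load the data in a data write transaction.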
Binary file added typedb-samples-docker/vaticle.gpg
Binary file not shown.