Check out new version v0.17.0 (#45)
github-actions[bot] authored Aug 25, 2023
1 parent 21d3aff commit 9633248
Showing 137 changed files with 12,457 additions and 0 deletions.
16 changes: 16 additions & 0 deletions docs/v0.17.0/_index.md
@@ -0,0 +1,16 @@
---
order:
[
'overview',
'start',
'platform',
'write',
'receive',
'process',
'ingest-and-distribute',
'deploy',
'security',
'reference',
'release-notes.md',
]
---
6 changes: 6 additions & 0 deletions docs/v0.17.0/deploy/_index.md
@@ -0,0 +1,6 @@
---
order: ["deploy-helm", "deploy-k8s", "deploy-docker", "quick-deploy-ssh"]
collapsed: false
---

Deploy
268 changes: 268 additions & 0 deletions docs/v0.17.0/deploy/deploy-docker.md
@@ -0,0 +1,268 @@
# Manual Deployment with Docker

This document describes how to run an HStreamDB cluster with Docker.

::: warning

This tutorial only shows the main process of starting an HStreamDB cluster with
Docker. The parameters are not configured with any security in mind, so please
do not use them directly in a production deployment!

:::

## Set up a ZooKeeper ensemble

`HServer` and `HStore` require ZooKeeper in order to store some metadata. We
need to set up a ZooKeeper ensemble first.

You can find tutorials online on how to build a proper ZooKeeper ensemble. As
an example, here we just quickly start a single-node ZooKeeper via Docker.

```shell
docker run --rm -d --name zookeeper --network host zookeeper
```
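
If you want a three-node ensemble for testing instead, one option with the
official `zookeeper` image looks roughly like this (a sketch; the IPs
`10.100.2.11`–`10.100.2.13` are placeholders, and the command is run on each
host with its own `ZOO_MY_ID`):

```shell
# run on host 10.100.2.11; use ZOO_MY_ID=2 and ZOO_MY_ID=3 on the other hosts
# the IPs below are placeholders — replace them with your own hosts
docker run -d --name zookeeper --network host \
  -e ZOO_MY_ID=1 \
  -e ZOO_SERVERS="server.1=10.100.2.11:2888:3888;2181 server.2=10.100.2.12:2888:3888;2181 server.3=10.100.2.13:2888:3888;2181" \
  zookeeper
```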

## Create data folders on storage nodes

Storage nodes store data in shards. Typically, each shard maps to a different
physical disk. Assume your data disk is mounted on `/mnt/data0`:

```shell
# creates the root folder for data
sudo mkdir -p /data/logdevice/

# writes the number of shards that this box will have
echo 1 | sudo tee /data/logdevice/NSHARDS

# creates symlink for shard 0
sudo ln -s /mnt/data0 /data/logdevice/shard0

# adds the user for the logdevice daemon
sudo useradd logdevice

# changes ownership for the data directory and the disk
sudo chown -R logdevice /data/logdevice/
sudo chown -R logdevice /mnt/data0/
```

- See
[Create data folders](https://logdevice.io/docs/FirstCluster.html#4-create-data-folders-on-storage-nodes)
for details
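
If the box has more than one data disk, the same pattern extends naturally.
A sketch for two disks, assuming the second one is mounted at `/mnt/data1`:

```shell
# two shards, one per disk (assumes /mnt/data1 is a second data disk)
echo 2 | sudo tee /data/logdevice/NSHARDS
sudo ln -s /mnt/data0 /data/logdevice/shard0
sudo ln -s /mnt/data1 /data/logdevice/shard1
sudo chown -R logdevice /data/logdevice/ /mnt/data0/ /mnt/data1/
```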

## Create a configuration file

Here is a minimal configuration file example. Before using it, please modify it
to suit your situation.

```json
{
"server_settings": {
"enable-nodes-configuration-manager": "true",
"use-nodes-configuration-manager-nodes-configuration": "true",
"enable-node-self-registration": "true",
"enable-cluster-maintenance-state-machine": "true"
},
"client_settings": {
"enable-nodes-configuration-manager": "true",
"use-nodes-configuration-manager-nodes-configuration": "true",
"admin-client-capabilities": "true"
},
"cluster": "logdevice",
"internal_logs": {
"config_log_deltas": {
"replicate_across": {
"node": 3
}
},
"config_log_snapshots": {
"replicate_across": {
"node": 3
}
},
"event_log_deltas": {
"replicate_across": {
"node": 3
}
},
"event_log_snapshots": {
"replicate_across": {
"node": 3
}
},
"maintenance_log_deltas": {
"replicate_across": {
"node": 3
}
},
"maintenance_log_snapshots": {
"replicate_across": {
"node": 3
}
}
},
"metadata_logs": {
"nodeset": [],
"replicate_across": {
"node": 3
}
},
"zookeeper": {
"zookeeper_uri": "ip://10.100.2.11:2181",
"timeout": "30s"
}
}
```

- If you have a multi-node ZooKeeper ensemble, use the list of ZooKeeper
ensemble nodes and ports to modify `zookeeper_uri` in the `zookeeper` section:

```json
"zookeeper": {
"zookeeper_uri": "ip://10.100.2.11:2181,10.100.2.12:2181,10.100.2.13:2181",
"timeout": "30s"
}
```

- Detailed explanations of all the attributes can be found in the
[Cluster configuration](https://logdevice.io/docs/Config.html) docs.

## Store the configuration file

You can store the configuration file in ZooKeeper, or store it on each storage
node.

### Store configuration file in ZooKeeper

Suppose you have a configuration file on one of your ZooKeeper nodes with the
path `~/logdevice.conf`. Save the configuration file to ZooKeeper by running
the following command.

```shell
docker exec zookeeper zkCli.sh create /logdevice.conf "`cat ~/logdevice.conf`"
```

You can verify the create operation by running:

```shell
docker exec zookeeper zkCli.sh get /logdevice.conf
```
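
If you need to update the stored configuration later, the same approach works
with `zkCli.sh set` (a sketch, assuming the edited file is still at
`~/logdevice.conf` on the ZooKeeper node):

```shell
# overwrite the znode with the updated configuration file
docker exec zookeeper zkCli.sh set /logdevice.conf "`cat ~/logdevice.conf`"
```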

## Set up HStore cluster

For the configuration file stored in ZooKeeper, assume that the value of the
`zookeeper_uri` field in the configuration file is `"ip://10.100.2.11:2181"` and
the path to the configuration file in ZooKeeper is `/logdevice.conf`.

For the configuration file stored on each node, assume that your file path is
`/data/logdevice/logdevice.conf`.

### Start admin server on a single node

- Configuration file stored in ZooKeeper:

```shell-vue
docker run --rm -d --name storeAdmin --network host -v /data/logdevice:/data/logdevice \
hstreamdb/hstream:{{ $version() }} /usr/local/bin/ld-admin-server \
--config-path zk:10.100.2.11:2181/logdevice.conf \
--enable-maintenance-manager \
--maintenance-log-snapshotting \
--enable-safety-check-periodic-metadata-update
```

- If you have a multi-node ZooKeeper ensemble, replace the `--config-path`
  parameter with:
  `--config-path zk:10.100.2.11:2181,10.100.2.12:2181,10.100.2.13:2181/logdevice.conf`

- Configuration file stored on each node:

  Replace the `--config-path` parameter with
  `--config-path /data/logdevice/logdevice.conf`
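
  For reference, the full admin server command with a node-local configuration
  file would then look like this (a sketch; only `--config-path` differs from
  the ZooKeeper variant above):

```shell-vue
# same flags as above, but reading the configuration from the local file
docker run --rm -d --name storeAdmin --network host -v /data/logdevice:/data/logdevice \
  hstreamdb/hstream:{{ $version() }} /usr/local/bin/ld-admin-server \
  --config-path /data/logdevice/logdevice.conf \
  --enable-maintenance-manager \
  --maintenance-log-snapshotting \
  --enable-safety-check-periodic-metadata-update
```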

### Start logdeviced on every node

- Configuration file stored in ZooKeeper:

```shell-vue
docker run --rm -d --name hstore --network host -v /data/logdevice:/data/logdevice \
hstreamdb/hstream:{{ $version() }} /usr/local/bin/logdeviced \
--config-path zk:10.100.2.11:2181/logdevice.conf \
--name store-0 \
--address 192.168.0.3 \
--local-log-store-path /data/logdevice
```

- For each node, you should update the `--name` to a **different value** and
`--address` to the host IP address of that node.

- Configuration file stored on each node:

  Replace the `--config-path` parameter with
  `--config-path /data/logdevice/logdevice.conf`
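
  For example, starting logdeviced on a second storage node with a node-local
  configuration file might look like this (a sketch; the name `store-1` and the
  address `192.168.0.4` are placeholders):

```shell-vue
# run on the second storage node; store-1 and 192.168.0.4 are placeholders
docker run --rm -d --name hstore --network host -v /data/logdevice:/data/logdevice \
  hstreamdb/hstream:{{ $version() }} /usr/local/bin/logdeviced \
  --config-path /data/logdevice/logdevice.conf \
  --name store-1 \
  --address 192.168.0.4 \
  --local-log-store-path /data/logdevice
```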

### Bootstrap the cluster

After starting the admin server and logdeviced on each storage node, we can
bootstrap the cluster.

On the admin server node, run:

```shell
docker exec storeAdmin hadmin store nodes-config bootstrap --metadata-replicate-across 'node:3'
```

And you should see something like this:

```
Successfully bootstrapped the cluster, new nodes configuration version: 7
Took 0.019s
```

You can check the cluster status by running:

```shell
docker exec storeAdmin hadmin store status
```

And the result should be:

```
+----+---------+----------+-------+-----------+---------+---------------+
| ID | NAME | PACKAGE | STATE | UPTIME | SEQ. | HEALTH STATUS |
+----+---------+----------+-------+-----------+---------+---------------+
| 0 | store-0 | 99.99.99 | ALIVE | 2 min ago | ENABLED | HEALTHY |
| 1 | store-2 | 99.99.99 | ALIVE | 2 min ago | ENABLED | HEALTHY |
| 2 | store-1 | 99.99.99 | ALIVE | 2 min ago | ENABLED | HEALTHY |
+----+---------+----------+-------+-----------+---------+---------------+
Took 7.745s
```

Now we have finished setting up the `HStore` cluster.

## Set up HServer cluster

To start a single `HServer` instance, you can modify the start command to fit
your situation:

```shell-vue
docker run -d --name hstream-server --network host \
hstreamdb/hstream:{{ $version() }} /usr/local/bin/hstream-server \
--bind-address $SERVER_HOST \
--advertised-address $SERVER_HOST \
--seed-nodes $SERVER_HOST \
--metastore-uri zk://$ZK_ADDRESS \
--store-config zk:$ZK_ADDRESS/logdevice.conf \
--store-admin-host $ADMIN_HOST \
--server-id 1
```

- `$SERVER_HOST`: the host IP address of your server node, e.g. `192.168.0.1`
- `--metastore-uri`: the address of HMeta; it currently supports `zk://$ZK_ADDRESS`
  for ZooKeeper and `rq://$RQ_ADDRESS` for rqlite (experimental)
- `$ZK_ADDRESS`: your ZooKeeper ensemble address list, e.g.
  `10.100.2.11:2181,10.100.2.12:2181,10.100.2.13:2181`
- `--store-config`: the path to your `HStore` configuration file. It should match
  the value of the `--config-path` parameter used when starting the `HStore` cluster
- `--store-admin-host`: the IP address of the `HStore Admin Server` node
- `--server-id`: you should set a **unique identifier** for each server
  instance

You can start multiple server instances on different nodes in the same way.
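
For example, a second instance on another node might be started like this
(a sketch; `$SERVER_HOST2` is a placeholder for the second node's IP, and it
assumes `--seed-nodes` takes a comma-separated list of all server hosts):

```shell-vue
# run on the second server node; $SERVER_HOST2 is a placeholder
docker run -d --name hstream-server --network host \
  hstreamdb/hstream:{{ $version() }} /usr/local/bin/hstream-server \
  --bind-address $SERVER_HOST2 \
  --advertised-address $SERVER_HOST2 \
  --seed-nodes $SERVER_HOST,$SERVER_HOST2 \
  --metastore-uri zk://$ZK_ADDRESS \
  --store-config zk:$ZK_ADDRESS/logdevice.conf \
  --store-admin-host $ADMIN_HOST \
  --server-id 2
```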
106 changes: 106 additions & 0 deletions docs/v0.17.0/deploy/deploy-helm.md
@@ -0,0 +1,106 @@
# Running on Kubernetes with Helm

This document describes how to run HStreamDB on Kubernetes using the Helm chart
that we provide. The document assumes basic prior Kubernetes knowledge. By the
end of this section, you'll have a fully running HStreamDB cluster on Kubernetes
that's ready to receive reads/writes, process data, etc.

## Building your Kubernetes Cluster

The first step is to have a running Kubernetes cluster. You can use a managed
cluster (provided by your cloud provider), a self-hosted cluster, or a local
Kubernetes cluster created with a tool like minikube. Make sure that `kubectl`
points to whatever cluster you're planning to use.

You also need a StorageClass, which you can create with `kubectl` or through
your cloud provider's web console if it offers one. minikube provides a storage
class called `standard` by default, which the Helm chart uses by default.
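
You can list the storage classes that already exist in your cluster with:

```sh
kubectl get storageclass
```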

## Starting HStreamDB

### Clone code and get helm dependencies

```sh
git clone https://github.com/hstreamdb/hstream.git
cd hstream/deploy/chart/hstream/
helm dependency build .
```

### Deploy HStreamDB by Helm

```sh
helm install my-hstream .
```

The Helm chart also provides a `values.yaml` file where you can modify your
configuration. For example, if you want to use other storage classes to deploy
the cluster, modify `logdevice.persistence.storageClass` and
`zookeeper.persistence.storageClass` in `values.yaml`, then deploy with
`helm install my-hstream -f values.yaml .`.
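
Alternatively, you can override just these two values on the command line
(a sketch; the class name `my-ssd-class` is a placeholder):

```sh
# equivalent to editing the two storageClass entries in values.yaml
helm install my-hstream . \
  --set logdevice.persistence.storageClass=my-ssd-class \
  --set zookeeper.persistence.storageClass=my-ssd-class
```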

### Check Cluster Status

The `helm install` command will deploy the ZooKeeper cluster, the LogDevice
cluster, and the HStream cluster. This can take some time; you can check the
status of the cluster with `kubectl get pods`. There may be some `Error` and
`CrashLoopBackOff` statuses during the deployment, but these will disappear
after a while, and eventually you will see something like the following.

```
NAME READY STATUS RESTARTS AGE
my-hstream-0 1/1 Running 3 (16h ago) 16h
my-hstream-1 1/1 Running 2 (16h ago) 16h
my-hstream-2 1/1 Running 0 16h
my-hstream-logdevice-0 1/1 Running 3 (16h ago) 16h
my-hstream-logdevice-1 1/1 Running 3 (16h ago) 16h
my-hstream-logdevice-2 1/1 Running 0 16h
my-hstream-logdevice-3 1/1 Running 0 16h
my-hstream-logdevice-admin-server-6867fd9494-bk5mf 1/1 Running 3 (16h ago) 16h
my-hstream-zookeeper-0 1/1 Running 0 16h
my-hstream-zookeeper-1 1/1 Running 0 16h
my-hstream-zookeeper-2 1/1 Running 0 16h
```

You can check the status of the HStreamDB cluster with the `hadmin server`
command.

```sh
kubectl exec -it my-hstream-1 -- bash -c "hadmin server status"
```
```
+---------+---------+------------------+
| node_id | state | address |
+---------+---------+------------------+
| 100 | Running | 172.17.0.4:6570 |
| 101 | Running | 172.17.0.10:6570 |
| 102 | Running | 172.17.0.12:6570 |
+---------+---------+------------------+
```

## Manage HStore Cluster

Now you can run `hadmin store` to manage the HStore cluster.
To check the state of the cluster, you can run:

```sh
kubectl exec -it my-hstream-0 -- bash -c "hadmin store --host my-hstream-logdevice-admin-server status"
```
```
+----+------------------------+----------+-------+--------------+----------+
| ID | NAME | PACKAGE | STATE | UPTIME | LOCATION |
+----+------------------------+----------+-------+--------------+----------+
| 0 | my-hstream-logdevice-0 | 99.99.99 | ALIVE | 16 hours ago | |
| 1 | my-hstream-logdevice-1 | 99.99.99 | DEAD | 16 hours ago | |
| 2 | my-hstream-logdevice-2 | 99.99.99 | DEAD | 16 hours ago | |
| 3 | my-hstream-logdevice-3 | 99.99.99 | DEAD | 16 hours ago | |
+----+------------------------+----------+-------+--------------+----------+
+---------+-------------+---------------+------------+---------------+
| SEQ. | DATA HEALTH | STORAGE STATE | SHARD OP. | HEALTH STATUS |
+---------+-------------+---------------+------------+---------------+
| ENABLED | HEALTHY(1) | READ_WRITE(1) | ENABLED(1) | HEALTHY |
| ENABLED | HEALTHY(1) | READ_WRITE(1) | ENABLED(1) | HEALTHY |
| ENABLED | HEALTHY(1) | READ_WRITE(1) | ENABLED(1) | HEALTHY |
| ENABLED | HEALTHY(1) | READ_WRITE(1) | ENABLED(1) | HEALTHY |
+---------+-------------+---------------+------------+---------------+
Took 16.727s
```