update: cluster management
chongyuanyin committed Aug 8, 2024
1 parent 9909a8e commit 96ecda3
Showing 27 changed files with 266 additions and 140 deletions.
4 changes: 2 additions & 2 deletions ecp/directory.json
@@ -138,7 +138,7 @@
},
{
"title": "EMQX 集群监控",
"path": "https://docs.emqx.com/zh/enterprise/v4.4/getting-started/dashboard-ee.html#%E7%9B%91%E6%8E%A7"
"path": "monitor/monitor_cluster"
},
{
"title": "边缘服务监控",
@@ -326,7 +326,7 @@
},
{
"title": "Monitor EMQX Clusters",
"path": "https://docs.emqx.com/en/enterprise/v4.4/getting-started/dashboard-ee.html#monitor"
"path": "monitor/monitor_cluster"
},
{
"title": "Monitor Edge Services",
Binary file modified ecp/en_US/cluster/_assets/cluster-existing-reg.png
Binary file modified ecp/en_US/cluster/_assets/cluster-list.png
87 changes: 53 additions & 34 deletions ecp/en_US/cluster/add_manage.md
@@ -1,20 +1,27 @@
# Add EMQX Clusters

ECP supports adding clusters by creating (recommended) or adding existing EMQX clusters. It is recommended to add clusters by creating with ECP, which offers more extensive functionality and allows for license and connection quota sharing.
ECP supports two ways of adding clusters: creating new clusters (recommended) or adding existing EMQX clusters:

- Creating clusters with ECP offers more extensive functionality and allows for license and connection quota sharing.
- Adding existing clusters to ECP makes them easy to manage. ECP supports managing EMQX v4 Enterprise Edition (4.4.6 and above) and EMQX v5 Enterprise Edition (5.6.0 and above).

There are functional differences between creating (**Hosted Clusters**) and managing clusters (**Managed Clusters**) on the ECP platform, as shown in the table below.

|Function|Hosted Clusters|Managed Clusters|
|:--------:|:----:|:----:|
|Start/Stop|||
|Horizontal Scaling|||
|Vertical Scaling|||
|Update Network Type|||
|Update Connection Limit|||
|Upgrade/Downgrade|||
|Cluster Transfer|||
|Delete|||
|Log|||
|Function|Hosted v4 Clusters|Managed v4 Clusters|Managed v5 Clusters|
|:--------:|:----:|:----:|:----:|
|Start/Stop||||
|Deletion||||
|Horizontal Scaling||||
|Vertical Scaling||||
|Update Network Type||||
|Update Connection Limit|||*|
|Upgrade/Downgrade||||
|Log||||
|Cluster Monitor||||
|Cluster Alarm||||
|Cluster Transfer||||

\* For managed clusters, the **Update Connection Limit** feature applies to EMQX v5.7.0 and above.
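
The minimum versions quoted above can be expressed as a small check. This is only an illustrative sketch, not part of ECP: the function names `can_be_managed` and `supports_connection_limit_update` are hypothetical, and the code assumes plain `major.minor.patch` version strings.

```python
# Illustrative sketch only; these helpers are hypothetical and not an ECP API.
# Assumes version strings of the form "major.minor.patch".

def parse_version(text):
    """Turn '5.7.0' into (5, 7, 0) so versions compare as tuples."""
    return tuple(int(part) for part in text.split("."))

# Minimum manageable versions quoted in this page, keyed by major version.
MIN_MANAGED = {4: (4, 4, 6), 5: (5, 6, 0)}
# Minimum version for Update Connection Limit on managed clusters.
MIN_UPDATE_CONNECTION_LIMIT = (5, 7, 0)

def can_be_managed(version):
    """True if an existing cluster of this version can be added to ECP."""
    parsed = parse_version(version)
    minimum = MIN_MANAGED.get(parsed[0])
    return minimum is not None and parsed >= minimum

def supports_connection_limit_update(version):
    """True if a managed cluster of this version can update its connection limit."""
    return can_be_managed(version) and parse_version(version) >= MIN_UPDATE_CONNECTION_LIMIT
```

For example, under these assumptions a 5.6.3 cluster can be managed but cannot have its connection limit updated from ECP.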

## Create a Hosted Cluster

@@ -33,6 +40,30 @@ The newly-created clusters will be listed in the **Cluster List** panel with the

<img src="./_assets/cluster-running.png" alt="cluster-running" style="zoom:50%;" />

## Status for Hosted Cluster

You can start or stop a cluster as your business requirements change.

1. Log in as system admin, organization admin, or project admin.
2. On the target cluster, click the more icon and select **Stop**/**Start**.

A hosted EMQX cluster can be in one of the following states:

| Status | Description |
| ------------------ | ------------------------------------------------------------ |
| Creating | Intermediate state during the process of new cluster creation |
| Updating | Intermediate state during cluster O&M operations, such as horizontal or vertical scaling, network type modifications, connection number modifications, cluster upgrade or downgrade |
| Starting | When starting the service |
| Running | Normal running state of the cluster |
| Stopping | When stopping the service or an intermediate state after deleting a cluster |
| Stopped | After stopping or deleting |
| Syncing Status | Intermediate state during horizontal or vertical scaling, cluster upgrade or downgrade, network type modifications, connection number modifications |
| Downgraded Running | One or more nodes of the cluster are unavailable, but the overall cluster is still usable |
| Error | The most recent task executed by the cluster failed (can auto-recover), or a cluster fault or dirty data occurred (this state rarely appears)<!--shall we remove the dirty data part?--> |
| Nonexistent | The task to create the cluster was not successfully issued |

For clusters in the Error state, click the more icon and then click **Try Fix**. If the problem is solved, the cluster returns to the Running state; otherwise, consider deleting the cluster or contacting EMQ's technical support.

## Add an Existing Cluster

ECP also provides the capability to manage existing EMQX clusters. ECP supports the management of EMQX v4 (version 4.4.6 and above) and EMQX v5 (version 5.6.0 and above).
@@ -109,30 +140,18 @@ ECP also provides the capability to manage existing EMQX clusters. ECP supports
![cluster-v5-dashboard](./_assets/cluster-v5-dashboard.png)


## Cluster Status

You can start or stop a cluster as your business requirement changes.

1. Log in as system admin, organization admin, or project admin.
2. On the target cluster, click the more icon and select **Stop**/**Start**.
## Status for Managed Cluster

A managed EMQX cluster can be in one of the following states:

| Status | Description |
| ----------- | ------------------------------------------------------------ |
| Created | Cluster with no node registered yet |
| Registering | Intermediate state during cluster node registration |
| Running | Normal running state of the cluster |
| Deleting | Intermediate state before cluster deletion completes |
| Error | Abnormal running state of the cluster, or network connection issue between agent and cluster or between agent and ECP |

EMQX cluster can be in the following states:

| Status | Description |
| ------------------ | ------------------------------------------------------------ |
| Creating | Intermediate state during the process of new cluster creation |
| Updating | Intermediate state during cluster OM operations, such as horizontal or vertical scaling, network type modifications, connection number modifications, cluster upgrade or downgrade |
| Starting | When starting the service |
| Running | Normal running state of the cluster |
| Stopping | When stopping the service or an intermediate state after deleting a cluster |
| Stopped | After stopping or deleting |
| Syncing Status | Intermediate state during horizontal or vertical scaling, cluster upgrade or downgrade, network type modifications, connection number modifications |
| Downgraded Running | One or more nodes of the cluster are unavailable, but the overall cluster is still usable |
| Error | The most recent task executed by the cluster failed (can auto-recover), or a cluster fault or dirty data occurred (this state rarely appears)<!--shall we remove the dirty data part?--> |
| Nonexistent | The task to create the cluster was not successfully issued |

For clusters in the state of Error, you can click the more icon and click **Try Fix**. If the problem is successfully solved, the cluster state will be Running; or consider deleting the cluster or reaching out to EMQ's technical support.
For clusters in the Error state, you can click the Error status icon to view the possible cause.

<!--also the English for the status should be confirmed-->
6 changes: 3 additions & 3 deletions ecp/en_US/cluster/introduction.md
@@ -4,14 +4,14 @@

In ECP, clusters refer to the EMQX clusters deployed on cloud servers, which serve as high-performance message brokers for IoT devices. Built on the MQTT protocol, EMQX offers a lightweight, reliable, and scalable solution for communication between IoT devices. It excels in real-time performance, high availability, and ease of implementation. For a comprehensive understanding of EMQX, please refer to the detailed documentation available on [EMQX Enterprise](https://docs.emqx.com/en/enterprise/v4.4/).

With ECP's cluster management features, users can efficiently handle multiple clusters, create new ones, onboard existing ones, and perform various tasks like troubleshooting, scaling, modifying network types, adjusting connections, upgrading/downgrading, transferring ownership, and deletion. The platform's user-friendly interface offers cluster information overview and log access for improved visibility.
With ECP's cluster management features, users can efficiently handle multiple clusters, create new ones, onboard existing ones, and perform various tasks like troubleshooting, scaling, modifying network types, adjusting connections, upgrading/downgrading, transferring ownership, and deletion. The platform's user-friendly interface offers a cluster information overview along with monitoring, alarm, and log access for improved visibility.

## Access Cluster Workspace

After logging in, you can find the **Workspace** option in the ribbon area. Click on it to navigate to the **Workspace - Cluster** page. This page provides an overview of the EMQX clusters hosted or managed by ECP and displays this project's current number of members.
After logging in, you can find the **Workspace** option in the ribbon area. Click on it to navigate to the **Workspace - Cluster** page. This page provides an overview of the EMQX clusters hosted or managed by ECP.

:::tip
System admin, organization admin, project admin, and regular users all can access this page, however, regular users do not have access to the administration page, and the **Workspace - Cluster** page serves as their landing page.
System admins, organization admins, project admins, and regular users can all access this page.

For the permission of each role, see [Permissions and Roles](../acl/authorize.md#roles-and-permissions).
:::
43 changes: 35 additions & 8 deletions ecp/en_US/cluster/ops.md
@@ -86,7 +86,7 @@ To view the external IP and port, click on the Cluster name (or Cluster ID), or
![LoadBalancer](./_assets/cluster-loadbalancer.png)


## Update Connect Limit (Kubernetes Deployment)
## Update Connect Limit

You can dynamically modify the number of connections in the cluster according to business needs:

@@ -95,9 +95,9 @@ You can dynamically modify the number of connections in the cluster according to

However, please note that:

1. The number of connections is limited by the total number of connections allowed by the license.
2. If you are using the LoadBalancer network type, please avoid modifying the number of connections if not necessary, otherwise, it will cause the LoadBalancer's IP address to change.
1. For hosted clusters that use the LoadBalancer network type, avoid modifying the number of connections unless necessary; otherwise, the LoadBalancer's IP address will change.
2. For managed clusters, this feature applies to EMQX v5.7.0 and above. The license quota on ECP is restored once unregistration or cluster deletion completes, and the connections assigned to the EMQX cluster are reclaimed at that point. Afterwards, reset the EMQX license via **Reset License** on the EMQX dashboard.
3. For clusters in the Error state, you can click the Error status icon to view the possible cause.


## Upgrade (Kubernetes Deployment)
@@ -128,17 +128,44 @@ For easier management, ECP provides a feature for transferring EMQX clusters acr
ECP offers a unified log feature.

1. Log in as system admin, organization admin, or project admin.

2. On the target cluster, click the more icon and select **Log**.
2. If the cluster is an existing cluster added to ECP, enable log collection and specify the log location when registering the cluster node.
- Parameter for enabling log collection: `--emqx-log-collection-enabled`
- Parameter for the cluster log directory path: `--emqx-log-collection-dir /opt/emqx/log`. If the cluster is installed via Docker, mount the log directory on the host machine into the container, and use the directory path on the host machine in the parameter.
3. On the target cluster, click the more icon and select **Log**.

You will be directed to the **Log** page, where you can view the log level, generated time, and log messages. For more information on logs, see [Logs](../log/introduction.md).

<img src="./_assets/cluster-log.png" alt="log" style="zoom:50%;" />

## Delete Clusters
## Delete Cluster

For unused clusters, it's advisable to delete them to save IT resources.

1. Log in as system admin, organization admin, or project admin.

2. On the target cluster, click the more icon and select **Delete** and confirm the action. ECP will first stop the cluster before proceeding with the deletion.

## Monitor Cluster

ECP provides a status overview for managed clusters on the **Cluster Monitor** page. For details, see [Monitor EMQX Clusters](../monitor/monitor_cluster.md).

:::tip

The cluster monitoring feature applies to EMQX v5.

:::

## View Cluster Alarms

ECP provides alarm management for cluster rules and connectors on the **Alarm** page.

To send cluster alarm notifications by email or Webhook, enable "Push EMQX Alarm" when creating a notification.

<img src="./_assets/cluster-alarm-notification.png" style="zoom: 80%;" align="middle">

For more details on alarms, see [Alarms](../monitor/alarm_rules.md).

:::tip

The cluster alarm feature applies to EMQX v5.

:::
2 changes: 1 addition & 1 deletion ecp/en_US/edge_service/edge_project_statistics.md
@@ -34,4 +34,4 @@ After finishing creating the edge service instances or adding existing edge serv

## Driver and Rule Lists

Underneath these two cards, you'll find the driver and rule lists for edge services, like list for not running drivers, list for abnormal drivers, list for not running rules. These lists provide name and type for each driver or rule, and details about instance it runs on, including edge service name, status, endpoint and version. You can filter edge services in the list by nam. Moreover, you can perform O&M actions per instance by clicking **Details** button from Action column.
Underneath these two cards, you'll find the driver and rule lists for edge services, like list for not running drivers, list for abnormal drivers, list for not running rules. These lists provide name and type for each driver or rule, and details about instance it runs on, including edge service name, status, endpoint and version. You can filter edge services in the list by name. Moreover, you can perform O&M actions per instance by clicking **Details** button from Action column.
Binary file modified ecp/en_US/monitor/_assets/alert-notification.png
Binary file modified ecp/en_US/monitor/_assets/alert-rules.png
Binary file added ecp/en_US/monitor/_assets/cluster-monitor.png
12 changes: 8 additions & 4 deletions ecp/en_US/monitor/alarm_rules.md
@@ -48,7 +48,7 @@ You can configure the notification silence duration and the objects for which th

If the silence duration applies to "Single alarm level", then ECP won't repeatedly send notifications for the same alarm within the silence duration period. Notifications will resume once the silence duration expires.

If the silence duration applies to "Edge service instance level", then any alarms generated on the same edge service within the silence duration period won't trigger repeated notifications. Notifications will resume once the silence duration expires.
If the silence duration applies to "Edge/Cluster service instance level", then any alarms generated on the same edge service or cluster within the silence duration period won't trigger repeated notifications. Notifications will resume once the silence duration expires.

Notification silence settings only affect alarm notifications sent through emails and Webhooks. All alarm events will still be displayed in the Active/History Alarms.
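
The two silence scopes can be pictured with a short sketch. It is not ECP's implementation: the class `SilenceFilter`, its fields, and the scope names `"alarm"` and `"instance"` are hypothetical, and it assumes the silence window is measured from the last notification that was actually sent.

```python
# Illustrative sketch only; SilenceFilter is hypothetical, not an ECP API.
# Assumption: the silence window starts at the last notification actually sent.
from dataclasses import dataclass, field

@dataclass
class SilenceFilter:
    duration_s: float          # silence duration in seconds
    scope: str                 # "alarm" (single alarm level) or "instance" (service/cluster level)
    _last_sent: dict = field(default_factory=dict)

    def should_notify(self, instance_id, alarm_type, now_s):
        """Return True if a notification should be sent for this alarm event."""
        # Single-alarm scope silences repeats of the same alarm on the same instance;
        # instance scope silences every alarm raised on that instance.
        key = (instance_id, alarm_type) if self.scope == "alarm" else instance_id
        last = self._last_sent.get(key)
        if last is not None and now_s - last < self.duration_s:
            return False       # still inside the silence window
        self._last_sent[key] = now_s
        return True
```

In this sketch the alarm event itself is never dropped; only the email or Webhook notification is suppressed, which matches the note that all alarm events remain visible in the Active/History Alarms.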

@@ -58,7 +58,7 @@ Log in as system admins, organization admins, or project admins, you can also se

![alert_rules](./_assets/alert-rules.png)

ECP currently supports alarm rules triggered by edge services and those triggered by ECP itself. Rules triggered by edge services include NeuronEX driver exceptions, NeuronEX rule exceptions, and NeuronEX restarted event. ECP-triggered rules include NeuronEX offline event, email sending failures, and Webhook sending failures. For more details on these rules, please refer to the [Operations Management - Alarm Rules List](../monitor/rules.md).
ECP currently supports alarm rules triggered by edge services, by EMQX clusters, and by ECP itself. Rules triggered by edge services include NeuronEX driver exceptions, NeuronEX rule exceptions, NeuronEX offline event, and NeuronEX restarted event. Rules triggered by clusters include EMQX rule exceptions and EMQX connector exceptions. ECP-triggered rules include email sending failures and Webhook sending failures. For more details on these rules, please refer to the [Operations Management - Alarm Rules List](../monitor/rules.md).

You can set both the triggering conditions and recovery conditions for each rule. The only exception is the **NeuronEX restart** alarm rule, for which neither can be set. You can set smaller triggering values if you want alarms to be more sensitive, or larger triggering values if you prefer to limit the frequency of alarms. Currently, the upper limit for triggering and recovery values is 10.

@@ -70,16 +70,20 @@ Log in as system admins, organization admins, or project admins, you can also se

![alarm-notification-config](./_assets/alarm-notification-config.png)

ECP supports configuring one or more alarm notifications. Different alarm notifications are associated with different edge services by service tags. When alarms are triggered on these associated edge services, notifications will be sent to the corresponding email and Webhooks.
ECP supports configuring one or more alarm notifications. Different alarm notifications are associated with different edge services by service tags, or with clusters if **Push EMQX Alarm** is enabled. When alarms are triggered on these associated edge services or clusters, notifications will be sent to the corresponding emails and Webhooks.

<img src="./_assets/alert-notification.png" style="zoom: 50%;" align="middle">
<img src="./_assets/alert-notification.png" style="zoom: 80%;" align="middle">

### Alarmed Edge Services

If "All" is selected, any alarms triggered on edge services within the project will be sent to the emails and Webhooks set in this configuration. Alternatively, one or more service tags can be chosen, and only alarms from edge services associated with these selected tags will be notified.

Please note: If the alarm is triggered on project level, such as email sending failure or Webhook sending failure alarms, notifications will be sent to emails and Webhooks in all notification configurations.

### Push EMQX Alarm

If "Push EMQX Alarm" is enabled, any alarm generated on clusters within the project will be sent to the emails and Webhooks set in this configuration.
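
The routing rules described in this section (service tags for edge services, the "Push EMQX Alarm" switch for clusters, and project-level alarms going to every configuration) can be summarized in a small sketch. All names here (`NotificationConfig`, `Alarm`, `matching_configs`, and the `source` values) are hypothetical and chosen for illustration; this is not how ECP is necessarily implemented.

```python
# Illustrative sketch only; these types and names are hypothetical, not an ECP API.
from dataclasses import dataclass, field

@dataclass
class NotificationConfig:
    name: str
    all_edge_services: bool = False                   # "All" selected under Alarmed Edge Services
    service_tags: set = field(default_factory=set)    # selected service tags
    push_emqx_alarm: bool = False                     # "Push EMQX Alarm" switch

@dataclass
class Alarm:
    source: str                                       # "edge", "cluster", or "project"
    tags: set = field(default_factory=set)            # tags of the edge service that raised it

def matching_configs(alarm, configs):
    """Return the names of the notification configurations that receive this alarm."""
    matched = []
    for cfg in configs:
        if alarm.source == "project":
            hit = True                                # project-level alarms go to every configuration
        elif alarm.source == "cluster":
            hit = cfg.push_emqx_alarm
        elif alarm.source == "edge":
            hit = cfg.all_edge_services or bool(cfg.service_tags & alarm.tags)
        else:
            hit = False
        if hit:
            matched.append(cfg.name)
    return matched
```

With one configuration set to "All", one bound to a tag, and one with "Push EMQX Alarm" enabled, an edge alarm carrying that tag reaches the first two, while a cluster alarm reaches only the third.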

### Email Notification

1. Click the **Email Notification** toggle switch to enable notification by email.
15 changes: 15 additions & 0 deletions ecp/en_US/monitor/monitor_cluster.md
@@ -0,0 +1,15 @@
# Monitor EMQX Cluster

ECP provides a comprehensive operating status overview on the **Cluster Monitor** page.

![cluster-monitor](./_assets/cluster-monitor.png)

## Basic Statistics

- Connection: The number of all connections and live connections of all clusters in the project.
- Rule: The number of total cluster rules, running rules and stopped rules in the project.
- Connector: The number of total cluster connectors, connected ones and disconnected ones in the project.

### Cluster Rule and Connector Lists

Underneath these cards, you'll find lists of stopped cluster rules and disconnected connectors with their details. You can filter the lists by cluster name. Moreover, you can perform O&M actions per instance by clicking the **Details** button in the Action column.