
v1.2.0

Pre-release
@github-actions github-actions released this 07 Nov 20:36
· 52 commits to main since this release
v1.2.0
e9d6d83

Release 1.2.0

What's New

  • New Router Metrics
  • Changes to identity connect status
  • HA Bootstrapping Changes
  • Connect Events
  • SDK Events
  • Bug fixes and other HA work

New Router Metrics

The following new metrics are available for edge routers:

  1. edge.connect.failures - meter tracking failed connect attempts from SDKs
    This tracks failures due to not having a valid token. Other failures which
    happen earlier in the connection process may not be tracked here.
  2. edge.connect.successes - meter tracking successful connect attempts from SDKs
  3. edge.disconnects - meter tracking disconnects of previously successfully connected SDKs
  4. edge.connections - gauge tracking count of currently connected SDKs

Identity Connect Status

Ziti tracks whether an identity is currently connected to an edge router.
This is the hasEdgeRouterConnection field on Identity.

Identity connection status used to be driven off of heartbeats from the edge router.
This feature doesn't work correctly when running with controller HA.

To address this, while also providing more operational insight, connect events were added
(see below for more details on the events themselves).

The controller can be configured to use status from heartbeats, connect events or both.
If both are used as sources, the identity will show as connected if either one reports
it as connected. This is intended for when you have a mix of routers and they
don't all support connect events yet.
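
The source selection described above can be sketched as a small function. This is an illustration only, not the controller's implementation: the function name and its boolean parameters are hypothetical, and the source names are taken from the identityStatusConfig values listed in these notes.

```python
def is_connected(source: str, heartbeat_connected: bool, event_connected: bool) -> bool:
    """Combine the two status sources (hypothetical helper, for illustration)."""
    if source == "heartbeats":
        return heartbeat_connected
    if source == "connect-events":
        return event_connected
    if source == "hybrid":
        # In hybrid mode, either source reporting "connected" is enough.
        return heartbeat_connected or event_connected
    raise ValueError(f"unknown identity status source: {source!r}")
```

With `hybrid`, an identity on an older router that only sends heartbeats still shows as connected, which is the mixed-fleet case the option is meant for.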

The controller now also aims to be more precise about identity state. There is a new
field on Identity: edgeRouterConnectionStatus. This field can have one of three
values:

  • offline
  • online
  • unknown

If the identity is reported as connected to any ER, it will be marked as online.
If the identity has been reported as connected, but the reporting ER is now
offline, the identity may still be connected to the ER. While in this state
it will be marked as 'unknown'. After a configurable interval, it will be marked
as offline.
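
The three states can be sketched as follows. This is a minimal model under stated assumptions, not the controller's code: the function name, the `reporting_routers` map, and the `unknown_since` timestamp are all hypothetical, and the 5 minute default mirrors the unknownTimeout option described in these notes.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def connection_status(
    reporting_routers: Dict[str, bool],
    unknown_since: Optional[datetime],
    now: datetime,
    unknown_timeout: timedelta = timedelta(minutes=5),
) -> str:
    """Derive an identity's status (hypothetical helper, for illustration).

    reporting_routers maps router id -> whether that router is currently
    online, for every router that last reported the identity as connected.
    unknown_since is when the last such router went offline, if any.
    """
    # Any online router reporting the identity as connected means online.
    if any(reporting_routers.values()):
        return "online"
    # All reporting routers are offline: the identity may still be connected,
    # so report unknown until the timeout has elapsed.
    if reporting_routers and unknown_since is not None:
        if now - unknown_since < unknown_timeout:
            return "unknown"
    return "offline"
```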

New controller config options:

identityStatusConfig:
  # valid values ['heartbeats', 'connect-events', 'hybrid']
  # defaults to 'hybrid' for now
  source: connect-events 

  # determines how often we scan for disconnected routers
  # defaults to 1 minute
  scanInterval: 1m

  # determines how long an identity will stay in unknown status before it's marked as offline
  # defaults to 5m
  unknownTimeout: 5m

HA Bootstrapping Changes

Previously, bootstrapping the Raft cluster and initializing the controller with a
default administrator were separate operations.
Now, the Raft cluster will be bootstrapped whenever the controller is initialized.

The controller can be initialized as follows:

  1. Using ziti agent controller init
  2. Using ziti agent controller init-from-db
  3. Specifying a db: entry in the config file. This is equivalent to using ziti agent controller init-from-db.

Additionally:

  1. minClusterSize has been removed. The cluster will always be initialized with a size of 1.
  2. bootstrapMembers has been renamed to initialMembers. If initialMembers are specified,
    the bootstrapping controller will attempt to add them after bootstrap is complete. If
    they are invalid they will be ignored. If they can't be reached (because they're not running
    yet), the controller will continue to retry until they are reached, or it is restarted.

Connect Events

These are events generated when a successful connection is made to a controller, from any of:

  1. Identity, using the REST API
  2. Router
  3. Controller (peer in an HA cluster)

They are also generated when an SDK connects to a router.

Controller Configuration

events:
  jsonLogger:
    subscriptions:
      - type: connect
    handler:
      type: file
      format: json
      path: /tmp/ziti-events.log

Router Configuration

connectEvents:
  # defaults to true. 
  # If set to false, minimal information about which identities are connected will still be 
  # sent to the controller, so the `edgeRouterConnectionStatus` field can be populated, 
  # but connect events will not be generated.
  enabled: true

  # The interval at which connect information will be batched up and sent to the controller.
  # Shorter intervals will improve data resolution on the controller. Longer intervals could
  # be more efficient.
  batchInterval: 3s

  # The router will also periodically send the full state to the controller, to ensure that
  # it's in sync. It will do this automatically if the router gets disconnected from the
  # controller, or if the router is unable to send a connect events message to the controller.
  # This controls how often the full state will be sent under ordinary conditions.
  fullSyncInterval: 5m

  # If enabled is set to true, the router will collect connect events and send them out
  # at the configured batch interval. If there are a huge number of connecting identities
  # or if the router is disconnected from the controller for a time, it may be unable to
  # send events. In order to prevent queued events from exhausting memory, a maximum 
  # queue size is configured. 
  # Default value 100,000
  maxQueuedEvents: 100000
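
To illustrate how batchInterval and maxQueuedEvents interact, here is a rough sketch of a bounded event queue that is drained once per batch interval. It is not the router's implementation. The class and method names are hypothetical, and the choice to drop new events when the queue is full is an assumption; these notes do not say which events are discarded.

```python
from collections import deque


class ConnectEventQueue:
    """Bounded queue of pending connect events (hypothetical sketch)."""

    def __init__(self, max_queued_events: int = 100_000):
        self._events = deque()
        self._max = max_queued_events
        self.dropped = 0  # number of events rejected because the queue was full

    def add(self, event: dict) -> bool:
        """Queue an event; return False if it was dropped."""
        if len(self._events) >= self._max:
            # Assumption: when full, the newest event is dropped so that
            # memory use stays bounded.
            self.dropped += 1
            return False
        self._events.append(event)
        return True

    def drain(self) -> list:
        """Return all queued events as one batch and empty the queue.

        A caller would invoke this once per batch interval.
        """
        batch = list(self._events)
        self._events.clear()
        return batch
```

If events are lost this way, the periodic full sync described above is what would bring the controller's view back in line.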
  

Example Events

{
  "namespace": "connect",
  "src_type": "identity",
  "src_id": "ji2Rt8KJ4",
  "src_addr": "127.0.0.1:59336",
  "dst_id": "ctrl_client",
  "dst_addr": "localhost:1280/edge/management/v1/edge-routers/2L7NeVuGBU",
  "timestamp": "2024-10-02T12:17:39.501821249-04:00"
}
{
  "namespace": "connect",
  "src_type": "router",
  "src_id": "2L7NeVuGBU",
  "src_addr": "127.0.0.1:42702",
  "dst_id": "ctrl_client",
  "dst_addr": "127.0.0.1:6262",
  "timestamp": "2024-10-02T12:17:40.529865849-04:00"
}
{
  "namespace": "connect",
  "src_type": "peer",
  "src_id": "ctrl2",
  "src_addr": "127.0.0.1:40056",
  "dst_id": "ctrl1",
  "dst_addr": "127.0.0.1:6262",
  "timestamp": "2024-10-02T12:37:04.490859197-04:00"
}
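
As a rough sketch of consuming these events, the following counts connect events by src_type. The helper names are hypothetical. It assumes the log content is a sequence of JSON objects separated only by whitespace (one per line or pretty-printed, as in the examples above), and uses the standard library's `json.JSONDecoder.raw_decode` to read them one at a time.

```python
import json
from collections import Counter


def iter_json_objects(text: str):
    """Yield each JSON object from a string of whitespace-separated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def count_connects_by_source(text: str) -> Counter:
    """Count connect events per src_type, ignoring other namespaces."""
    return Counter(
        event["src_type"]
        for event in iter_json_objects(text)
        if event.get("namespace") == "connect"
    )
```

For example, `count_connects_by_source(open("/tmp/ziti-events.log").read())` would summarize the file written by the controller configuration shown earlier, assuming it fits in memory.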

SDK Events

Building off of the connect events, there are events generated when an identity/sdk comes online or goes offline.

events:
  jsonLogger:
    subscriptions:
      - type: sdk
    handler:
      type: file
      format: json
      path: /tmp/ziti-events.log

{
  "namespace": "sdk",
  "event_type" : "sdk-online",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:39.501821249-04:00"
}

{
  "namespace": "sdk",
  "event_type" : "sdk-status-unknown",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:40.501821249-04:00"
}

{
  "namespace": "sdk",
  "event_type" : "sdk-offline",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:41.501821249-04:00"
}
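
One way to use these events is to keep the most recent status per identity. The sketch below is illustrative only. The function name is hypothetical, the mapping from event_type to a status string is an assumption based on the event names above, and events are assumed to arrive in chronological order, so timestamps are not parsed.

```python
# Assumed mapping from SDK event types to a status label.
STATUS_BY_EVENT = {
    "sdk-online": "online",
    "sdk-status-unknown": "unknown",
    "sdk-offline": "offline",
}


def latest_status(events) -> dict:
    """Return identity_id -> last seen status from an ordered event stream."""
    status = {}
    for event in events:
        if event.get("namespace") != "sdk":
            continue  # ignore connect events and anything else
        state = STATUS_BY_EVENT.get(event.get("event_type"))
        if state is not None:
            # Later events overwrite earlier ones for the same identity.
            status[event["identity_id"]] = state
    return status
```

Fed the three example events above in order, this would leave identity ji2Rt8KJ4 marked as offline.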

Component Updates and Bug Fixes