Previous change logs can be found at CHANGELOG-3.3.

The minimum recommended etcd versions to run in production are 3.2.28+, 3.3.18+, and 3.4.2+.

v3.4.4 (2020 TBD)

See code changes and v3.4 upgrade guide for any breaking changes.

Again, before running upgrades from any previous release, please make sure to read change logs below and v3.4 upgrade guide.

Metrics, Monitoring

See List of metrics for all metrics per release.

Note that any etcd_debugging_* metrics are experimental and subject to change.

Add etcd_debugging_mvcc_total_put_size_in_bytes Prometheus metric.
Fix bug where etcd_debugging_mvcc_db_compaction_keys_total is always 0.

Auth

Fix NoPassword check when adding user through GRPC gateway (issue#11414)

v3.4.3 (2019-10-24)

See code changes and v3.4 upgrade guide for any breaking changes.

Again, before running upgrades from any previous release, please make sure to read change logs below and v3.4 upgrade guide.

Metrics, Monitoring

See List of metrics for all metrics per release.

Note that any etcd_debugging_* metrics are experimental and subject to change.

Change etcd_cluster_version Prometheus metrics to include only major and minor version.

Go

Compile with Go 1.12.12.

v3.4.2 (2019-10-11)

See code changes and v3.4 upgrade guide for any breaking changes.

Again, before running upgrades from any previous release, please make sure to read change logs below and v3.4 upgrade guide.

Dependency

Upgrade google.golang.org/grpc from v1.23.1 to v1.24.0.

etcdctl v3

Fix etcdctl member add command to prevent potential timeout.

etcdserver

Add tracing to range, put and compact requests in etcdserver.
Fix wait purge file loop during shutdown.
- Previously, during shutdown etcd could accidentally remove needed wal files, resulting in catastrophic error etcdserver: open wal error: wal: file not found. during startup.
- Now, etcd makes sure the purge file loop exits before server signals stop of the raft node.

Go

Compile with Go 1.12.9 including Go 1.12.8 security fixes.

client v3

Fix client balancer failover against multiple endpoints.
- Fix "kube-apiserver: failover on multi-member etcd cluster fails certificate check on DNS mismatch" (kubernetes#83028).
Fix IPv6 endpoint parsing in client.
- Fix "1.16: etcd client does not parse IPv6 addresses correctly when members are joining" (kubernetes#83550).

v3.4.1 (2019-09-17)

See code changes and v3.4 upgrade guide for any breaking changes.

Again, before running upgrades from any previous release, please make sure to read change logs below and v3.4 upgrade guide.

Metrics, Monitoring

See List of metrics for all metrics per release.

Note that any etcd_debugging_* metrics are experimental and subject to change.

Add etcd_debugging_mvcc_current_revision Prometheus metric.
Add etcd_debugging_mvcc_compact_revision Prometheus metric.

etcd server

Fix secure server logging message.
Remove redundant % characters in file descriptor warning message.

Package `embed`

Add embed.Config.ZapLoggerBuilder to allow creating a custom zap logger.

Dependency

Upgrade google.golang.org/grpc from v1.23.0 to v1.23.1.

Go

Compile with Go 1.12.9 including Go 1.12.8 security fixes.

v3.4.0 (2019-08-30)

See code changes and v3.4 upgrade guide for any breaking changes.

v3.4.0 (2019-08-30), see code changes.
v3.4.0-rc.4 (2019-08-29), see code changes.
v3.4.0-rc.3 (2019-08-27), see code changes.
v3.4.0-rc.2 (2019-08-23), see code changes.
v3.4.0-rc.1 (2019-08-15), see code changes.
v3.4.0-rc.0 (2019-08-12), see code changes.

Again, before running upgrades from any previous release, please make sure to read change logs below and v3.4 upgrade guide.

Documentation

etcd now has a new website! Please visit https://etcd.io.

Improved

Add Raft learner: etcd#10725, etcd#10727, etcd#10730.
- User guide: runtime-configuration document.
- API change: API reference document.
- More details on implementation: learner design document and implementation task list.
Rewrite client balancer with new gRPC balancer interface.
- Upgrade gRPC to v1.23.0.
- Improve client balancer failover against secure endpoints.
  - Fix "kube-apiserver 1.13.x refuses to work when first etcd-server is not available" (kubernetes#72102).
- Fix gRPC panic "send on closed channel".
- The new client balancer uses an asynchronous resolver to pass endpoints to the gRPC dial function. To block until the underlying connection is up, pass grpc.WithBlock() to clientv3.Config.DialOptions.
Add backoff on watch retries on transient errors.
Add jitter to watch progress notify to prevent spikes in etcd_network_client_grpc_sent_bytes_total.
Improve read index wait timeout warning log, which indicates that local node might have slow network.
Improve slow request apply warning log.
- e.g. read-only range request "key:\"/a\" range_end:\"/b\" " with result "range_response_count:3 size:96" took too long (97.966µs) to execute.
- Redact request value field.
- Provide response size.
Improve "became inactive" warning log, which indicates message send to a peer failed.
Improve TLS setup error logging to help debug TLS-enabled cluster configuring issues.
Improve long-running concurrent read transactions under light write workloads.
- Previously, periodic commit on pending writes blocks incoming read transactions, even if there is no pending write.
- Now, periodic commit operation does not block concurrent read transactions, thus improves long-running read transaction performance.
Make backend read transactions fully concurrent.
- Previously, ongoing long-running read transactions block writes and future reads.
- With this change, write throughput is increased by 70% and P99 write latency is reduced by 90% in the presence of long-running reads.
Improve Raft Read Index timeout warning messages.
Adjust election timeout on server restart to reduce disruptive rejoining servers.
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
Add Raft Pre-Vote feature to reduce disruptive rejoining servers.
- For instance, a flaky (or rejoining) member may drop in and out, and start a campaign. This member will end up with a higher term, and ignore all incoming messages with a lower term. In this case, a new leader eventually needs to get elected, which is disruptive to cluster availability. Raft implements a Pre-Vote phase to prevent this kind of disruption. If enabled, Raft runs an additional election phase to check whether the pre-candidate can get enough votes to win an election.
Adjust periodic compaction retention window.
- e.g. etcd --auto-compaction-mode=revision --auto-compaction-retention=1000 automatically compacts on "latest revision" - 1000 every 5 minutes (when latest revision is 30000, compact on revision 29000).
- e.g. Previously, etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h automatically compacted with a 24-hour retention window every 2.4 hours. Now, compaction happens every 1 hour.
- e.g. Previously, etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m automatically compacted with a 30-minute retention window every 3 minutes. Now, compaction happens every 30 minutes.
- Periodic compactor keeps recording latest revisions for every compaction period when given period is less than 1-hour, or for every 1-hour when given compaction period is greater than 1-hour (e.g. 1-hour when etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h).
- For every compaction period or 1-hour, compactor uses the last revision that was fetched before compaction period, to discard historical data.
- The retention window of compaction period moves for every given compaction period or hour.
- For instance, when hourly writes are 100 and etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h, v3.2.x, v3.3.0, v3.3.1, and v3.3.2 compact revision 2400, 2640, and 2880 for every 2.4-hour, while v3.3.3 or later compacts revision 2400, 2500, 2600 for every 1-hour.
- Furthermore, when etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m and writes per minute are about 1000, v3.3.0, v3.3.1, and v3.3.2 compact revision 30000, 33000, and 36000, for every 3-minute, while v3.3.3 or later compacts revision 30000, 60000, and 90000, for every 30-minute.
Improve lease expire/revoke operation performance, address lease scalability issue.
Make Lease Lookup non-blocking with concurrent Grant/Revoke.
Make etcd server return raft.ErrProposalDropped on internal Raft proposal drop in v3 applier and v2 applier.
- e.g. a node is removed from cluster, or raftpb.MsgProp arrives at current leader while there is an ongoing leadership transfer.
Add snapshot package for easier snapshot workflow (see godoc.org/github.com/etcd/clientv3/snapshot for more).
Improve functional tester coverage: proxy layer to run network fault tests in CI, TLS is enabled both for server and client, liveness mode, shuffle test sequence, membership reconfiguration failure cases, disastrous quorum loss and snapshot recover from a seed member, embedded etcd.
Improve index compaction blocking by using a copy on write clone to avoid holding the lock for the traversal of the entire index.
Update JWT methods to allow for use of any supported signature method/algorithm.
Add Lease checkpointing to persist remaining TTLs to the consensus log periodically so that long lived leases progress toward expiry in the presence of leader elections and server restarts.
- Enabled by experimental flag "--experimental-enable-lease-checkpoint".
Add gRPC interceptor for debugging logs; enable etcd --debug flag to see per-request debug information.
Add consistency check in snapshot status. If consistency check on snapshot file fails, snapshot status returns "snapshot file integrity check failed..." error.
Add Verify function to perform corruption check on WAL contents.
Improve heartbeat send failure logging.
Support users with no password for reducing security risk introduced by leaked password. The users can only be authenticated with CommonName based auth.
Add etcd --experimental-peer-skip-client-san-verification to skip verification of peer client address.
Add etcd --experimental-compaction-batch-limit to set the maximum revisions deleted in each compaction batch.
Reduced default compaction batch size from 10k revisions to 1k revisions to improve p99 latency during compactions and reduced wait between compactions from 100ms to 10ms.

Breaking Changes

Rewrite client balancer with new gRPC balancer interface.
- Upgrade gRPC to v1.23.0.
- Improve client balancer failover against secure endpoints.
  - Fix "kube-apiserver 1.13.x refuses to work when first etcd-server is not available" (kubernetes#72102).
- Fix gRPC panic "send on closed channel".
- The new client balancer uses an asynchronous resolver to pass endpoints to the gRPC dial function. To block until the underlying connection is up, pass grpc.WithBlock() to clientv3.Config.DialOptions.
Require Go 1.12+.
- Compile with Go 1.12.9 including Go 1.12.8 security fixes.
Migrate dependency management tool from glide to Go module.
- <= 3.3 puts vendor directory under cmd/vendor directory to prevent conflicting transitive dependencies.
- 3.4 moves cmd/vendor directory to vendor at repository root.
- Remove recursive symlinks in cmd directory.
- Now go get/install/build on etcd packages (e.g. clientv3, tools/benchmark) enforce builds with etcd vendor directory.
Deprecated latest release container tag.
- docker pull gcr.io/etcd-development/etcd:latest would not be up-to-date.
Deprecated minor version release container tags.
- docker pull gcr.io/etcd-development/etcd:v3.3 would still work.
- docker pull gcr.io/etcd-development/etcd:v3.4 would not work.
- Use docker pull gcr.io/etcd-development/etcd:v3.4.x instead, with the exact patch version.
Deprecated ACIs from official release.
- AppC was officially suspended, as of late 2016.
- acbuild is not maintained anymore.
- *.aci files are not available from v3.4 release.
Move "github.com/coreos/etcd" to "github.com/etcd-io/etcd".
- Change import path to "go.etcd.io/etcd".
- e.g. import "go.etcd.io/etcd/raft".
Make ETCDCTL_API=3 etcdctl default.
- Now, etcdctl set foo bar must be ETCDCTL_API=2 etcdctl set foo bar.
- Now, ETCDCTL_API=3 etcdctl put foo bar could be just etcdctl put foo bar.
Make etcd --enable-v2=false default.
Make embed.DefaultEnableV2 false default.
Deprecated etcd --ca-file flag. Use etcd --trusted-ca-file instead (etcd --ca-file flag has been marked deprecated since v2.1).
Deprecated etcd --peer-ca-file flag. Use etcd --peer-trusted-ca-file instead (etcd --peer-ca-file flag has been marked deprecated since v2.1).
Deprecated pkg/transport.TLSInfo.CAFile field. Use pkg/transport.TLSInfo.TrustedCAFile instead (CAFile field has been marked deprecated since v2.1).
Exit on empty hosts in advertise URLs.
- Address advertise client URLs accepts empty hosts.
- e.g. exit with error on --advertise-client-urls=http://:2379.
- e.g. exit with error on --initial-advertise-peer-urls=http://:2380.
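The empty-host check above can be approximated with the standard library. This is a minimal sketch, assuming a hypothetical helper named checkAdvertiseURL; it is not etcd's own validation code and only shows the kind of URL that is now rejected.

```go
package main

import (
	"fmt"
	"net/url"
)

// checkAdvertiseURL rejects URLs whose host part is empty, such as
// "http://:2379". Hypothetical helper for illustration only.
func checkAdvertiseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Hostname() == "" {
		return fmt.Errorf("advertise URL %q has an empty host", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{"http://:2379", "http://10.0.0.1:2379"} {
		fmt.Printf("%s -> %v\n", raw, checkAdvertiseURL(raw))
	}
}
```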
Exit on shadowed environment variables.
- Address error on shadowed environment variables.
- e.g. exit with error on ETCD_NAME=abc etcd --name=def.
- e.g. exit with error on ETCD_INITIAL_CLUSTER_TOKEN=abc etcd --initial-cluster-token=def.
- e.g. exit with error on ETCDCTL_ENDPOINTS=abc.com ETCDCTL_API=3 etcdctl endpoint health --endpoints=def.com.
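A minimal sketch of the shadowing check, assuming flags and environment variables are available as plain maps. The helpers envName and checkShadowed are hypothetical; the sketch reports a conflict whenever a flag and its matching environment variable are both set, and does not claim to reproduce etcd's exact error text.

```go
package main

import (
	"fmt"
	"strings"
)

// envName maps a flag name to its environment variable name,
// e.g. ("ETCD", "initial-cluster-token") -> "ETCD_INITIAL_CLUSTER_TOKEN".
func envName(prefix, flagName string) string {
	return prefix + "_" + strings.ToUpper(strings.ReplaceAll(flagName, "-", "_"))
}

// checkShadowed returns an error if any explicitly set flag also has a
// matching environment variable. Hypothetical helper, not an etcd API.
func checkShadowed(prefix string, flags, env map[string]string) error {
	for name := range flags {
		key := envName(prefix, name)
		if _, ok := env[key]; ok {
			return fmt.Errorf("environment variable %s is shadowed by flag --%s", key, name)
		}
	}
	return nil
}

func main() {
	flags := map[string]string{"name": "def"}
	env := map[string]string{"ETCD_NAME": "abc"}
	fmt.Println(checkShadowed("ETCD", flags, env))
}
```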
Change etcdserverpb.AuthRoleRevokePermissionRequest/key,range_end fields type from string to bytes.
Deprecating etcd_debugging_mvcc_db_total_size_in_bytes Prometheus metric (to be removed in v3.5). Use etcd_mvcc_db_total_size_in_bytes instead.
Deprecating etcd_debugging_mvcc_put_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_put_total instead.
Deprecating etcd_debugging_mvcc_delete_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_delete_total instead.
Deprecating etcd_debugging_mvcc_range_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_range_total instead.
Deprecating etcd_debugging_mvcc_txn_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_txn_total instead.
Rename etcdserver.ServerConfig.SnapCount field to etcdserver.ServerConfig.SnapshotCount, to be consistent with the flag name etcd --snapshot-count.
Rename embed.Config.SnapCount field to embed.Config.SnapshotCount, to be consistent with the flag name etcd --snapshot-count.
Change embed.Config.CorsInfo in *cors.CORSInfo type to embed.Config.CORS in map[string]struct{} type.
Deprecated embed.Config.SetupLogging.
- Now logger is set up automatically based on embed.Config.Logger, embed.Config.LogOutputs, embed.Config.Debug fields.
Rename etcd --log-output to etcd --log-outputs to support multiple log outputs.
- etcd --log-output will be deprecated in v3.5.
Rename embed.Config.LogOutput to embed.Config.LogOutputs to support multiple log outputs.
Change embed.Config.LogOutputs type from string to []string to support multiple log outputs.
- Now that etcd --log-outputs accepts multiple writers, etcd configuration YAML file log-outputs field must be changed to []string type.
- Previously, etcd --config-file etcd.config.yaml can have log-outputs: default field, now must be log-outputs: [default].
Deprecating etcd --debug flag. Use etcd --log-level=debug flag instead.
- v3.5 will deprecate etcd --debug flag in favor of etcd --log-level=debug.
Change v3 etcdctl snapshot exit codes with snapshot package.
- Exit on error with exit code 1 (no more exit code 5 or 6 on snapshot save/restore commands).
Deprecated grpc.ErrClientConnClosing.
- clientv3 and proxy/grpcproxy no longer return grpc.ErrClientConnClosing.
- grpc.ErrClientConnClosing has been deprecated in gRPC >= 1.10.
- Use clientv3.IsConnCanceled(error) or google.golang.org/grpc/status.FromError(error) instead.
Deprecated gRPC gateway endpoint /v3beta with /v3.
- Deprecated /v3alpha.
- To deprecate /v3beta in v3.5.
- In v3.4, curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' still works as a fallback to curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}', but curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' won't work in v3.5. Use curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' instead.
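The "Zm9v" and "YmFy" strings in the curl examples above are the base64 encodings of "foo" and "bar", since the gateway's JSON carries byte fields as base64. The following sketch builds such a request body with the Go standard library; putBody is a hypothetical helper, and sending the request is left out.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// putBody builds the JSON body for a /v3/kv/put gateway request, with the
// key and value base64-encoded. Hypothetical helper for illustration.
func putBody(key, value string) (string, error) {
	body := map[string]string{
		"key":   base64.StdEncoding.EncodeToString([]byte(key)),
		"value": base64.StdEncoding.EncodeToString([]byte(value)),
	}
	b, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := putBody("foo", "bar")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(body) // {"key":"Zm9v","value":"YmFy"}
}
```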
Change wal package function signatures to support structured logger and logging to file in server-side.
- Previously, Open(dirpath string, snap walpb.Snapshot) (*WAL, error), now Open(lg *zap.Logger, dirpath string, snap walpb.Snapshot) (*WAL, error).
- Previously, OpenForRead(dirpath string, snap walpb.Snapshot) (*WAL, error), now OpenForRead(lg *zap.Logger, dirpath string, snap walpb.Snapshot) (*WAL, error).
- Previously, Repair(dirpath string) bool, now Repair(lg *zap.Logger, dirpath string) bool.
- Previously, Create(dirpath string, metadata []byte) (*WAL, error), now Create(lg *zap.Logger, dirpath string, metadata []byte) (*WAL, error).
Remove pkg/cors package.
Move internal packages to etcdserver.
- "github.com/coreos/etcd/alarm" to "go.etcd.io/etcd/etcdserver/api/v3alarm".
- "github.com/coreos/etcd/compactor" to "go.etcd.io/etcd/etcdserver/api/v3compactor".
- "github.com/coreos/etcd/discovery" to "go.etcd.io/etcd/etcdserver/api/v2discovery".
- "github.com/coreos/etcd/etcdserver/auth" to "go.etcd.io/etcd/etcdserver/api/v2auth".
- "github.com/coreos/etcd/etcdserver/membership" to "go.etcd.io/etcd/etcdserver/api/membership".
- "github.com/coreos/etcd/etcdserver/stats" to "go.etcd.io/etcd/etcdserver/api/v2stats".
- "github.com/coreos/etcd/error" to "go.etcd.io/etcd/etcdserver/api/v2error".
- "github.com/coreos/etcd/rafthttp" to "go.etcd.io/etcd/etcdserver/api/rafthttp".
- "github.com/coreos/etcd/snap" to "go.etcd.io/etcd/etcdserver/api/snap".
- "github.com/coreos/etcd/store" to "go.etcd.io/etcd/etcdserver/api/v2store".
Change snapshot file permissions: On Linux, the snapshot file changes from readable by all (mode 0644) to readable by the user only (mode 0600).
Change pkg/adt.IntervalTree from struct to interface.
- See pkg/adt README and pkg/adt godoc.
Release branch /version defines version 3.4.x-pre, instead of 3.4.y+git.
- Use 3.4.5-pre, instead of 3.4.4+git.

Dependency

Upgrade github.com/coreos/bbolt from v1.3.1-coreos.6 to go.etcd.io/bbolt v1.3.3.
Upgrade google.golang.org/grpc from v1.7.5 to v1.23.0.
Migrate github.com/ugorji/go/codec to github.com/json-iterator/go, to regenerate v2 client (See #10667 for more).
Migrate github.com/ghodss/yaml to sigs.k8s.io/yaml (See #10687 for more).
Upgrade golang.org/x/crypto from crypto@9419663f5 to crypto@0709b304e793.
Upgrade golang.org/x/net from net@66aacef3d to net@adae6a3d119a.
Upgrade golang.org/x/sys from sys@ebfc5b463 to sys@c7b8b68b1456.
Upgrade golang.org/x/text from text@b19bf474d to v0.3.0.
Upgrade golang.org/x/time from time@c06e80d93 to time@fbb02b229.
Upgrade github.com/golang/protobuf from golang/protobuf@1e59b77b5 to v1.3.2.
Upgrade gopkg.in/yaml.v2 from yaml@cd8b52f82 to yaml@5420a8b67.
Upgrade github.com/dgrijalva/jwt-go from v3.0.0 to v3.2.0.
Upgrade github.com/soheilhy/cmux from v0.1.3 to v0.1.4.
Upgrade github.com/google/btree from google/btree@925471ac9 to v1.0.0.
Upgrade github.com/spf13/cobra from spf13/cobra@1c44ec8d3 to v0.0.3.
Upgrade github.com/spf13/pflag from v1.0.0 to spf13/pflag@1ce0cc6db.
Upgrade github.com/coreos/go-systemd from v15 to v17.
Upgrade github.com/prometheus/client_golang from prometheus/client_golang@5cec1d042 to v1.0.0.
Upgrade github.com/grpc-ecosystem/go-grpc-prometheus from grpc-ecosystem/go-grpc-prometheus@0dafe0d49 to v1.2.0.
Upgrade github.com/grpc-ecosystem/grpc-gateway from v1.3.1 to v1.4.1.
Migrate github.com/kr/pty to github.com/creack/pty, as the latter has replaced the original module.
Upgrade github.com/gogo/protobuf from v1.0.0 to v1.2.1.

Metrics, Monitoring

See List of metrics for all metrics per release.

Note that any etcd_debugging_* metrics are experimental and subject to change.

Add etcd_snap_db_fsync_duration_seconds_count Prometheus metric.
Add etcd_snap_db_save_total_duration_seconds_bucket Prometheus metric.
Add etcd_network_snapshot_send_success Prometheus metric.
Add etcd_network_snapshot_send_failures Prometheus metric.
Add etcd_network_snapshot_send_total_duration_seconds Prometheus metric.
Add etcd_network_snapshot_receive_success Prometheus metric.
Add etcd_network_snapshot_receive_failures Prometheus metric.
Add etcd_network_snapshot_receive_total_duration_seconds Prometheus metric.
Add etcd_network_active_peers Prometheus metric.
- Let's say "7339c4e5e833c029" server /metrics returns etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1 and etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 1. This indicates that the local node "7339c4e5e833c029" currently has two active remote peers "729934363faa4a24" and "b548c2511513015" in a 3-node cluster. If the node "b548c2511513015" is down, the local node "7339c4e5e833c029" will show etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1 and etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 0.
Add etcd_network_disconnected_peers_total Prometheus metric.
- If a remote peer "b548c2511513015" is down, the local node "7339c4e5e833c029" server /metrics would return etcd_network_disconnected_peers_total{Local="7339c4e5e833c029",Remote="b548c2511513015"} 1, while active peer metrics will show etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1 and etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 0.
Add etcd_network_server_stream_failures_total Prometheus metric.
- e.g. etcd_network_server_stream_failures_total{API="lease-keepalive",Type="receive"} 1
- e.g. etcd_network_server_stream_failures_total{API="watch",Type="receive"} 1
Improve etcd_network_peer_round_trip_time_seconds Prometheus metric to track leader heartbeats.
- Previously, it only samples the TCP connection for snapshot messages.
Increase etcd_network_peer_round_trip_time_seconds Prometheus metric histogram upper-bound.
- Previously, highest bucket only collects requests taking 0.8192 seconds or more.
- Now, highest buckets collect 0.8192 seconds, 1.6384 seconds, and 3.2768 seconds or more.
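The bucket bounds above are consistent with a histogram whose upper bounds double at each step. The sketch below assumes a starting bound of 0.0001 seconds and a factor of 2 (both assumptions, not stated in this changelog), under which 14 buckets end at 0.8192 seconds and 16 buckets end at 3.2768 seconds. The helper exponentialBuckets is written here for illustration.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor at each step.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters: 0.0001s start, factor 2.
	before := exponentialBuckets(0.0001, 2, 14)
	after := exponentialBuckets(0.0001, 2, 16)
	fmt.Printf("previous highest bound: %.4f\n", before[len(before)-1])
	fmt.Printf("new highest bounds: %.4f\n", after[len(after)-3:])
}
```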
Add etcd_server_is_leader Prometheus metric.
Add etcd_server_id Prometheus metric.
Add etcd_cluster_version Prometheus metric.
Add etcd_server_version Prometheus metric.
- To replace Kubernetes etcd-version-monitor.
Add etcd_server_go_version Prometheus metric.
Add etcd_server_health_success Prometheus metric.
Add etcd_server_health_failures Prometheus metric.
Add etcd_server_read_indexes_failed_total Prometheus metric.
Add etcd_server_heartbeat_send_failures_total Prometheus metric.
Add etcd_server_slow_apply_total Prometheus metric.
Add etcd_server_slow_read_indexes_total Prometheus metric.
Add etcd_server_quota_backend_bytes Prometheus metric.
- Use it with etcd_mvcc_db_total_size_in_bytes and etcd_mvcc_db_total_size_in_use_in_bytes.
- etcd_server_quota_backend_bytes 2.147483648e+09 means current quota size is 2 GB.
- etcd_mvcc_db_total_size_in_bytes 20480 means current physically allocated DB size is 20 KB.
- etcd_mvcc_db_total_size_in_use_in_bytes 16384 means future DB size if defragment operation is complete.
- etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes is the number of bytes that can be saved on disk with defragment operation.
Add etcd_mvcc_db_total_size_in_use_in_bytes Prometheus metric.
- Use it with etcd_mvcc_db_total_size_in_bytes and etcd_mvcc_db_total_size_in_use_in_bytes.
- etcd_server_quota_backend_bytes 2.147483648e+09 means current quota size is 2 GB.
- etcd_mvcc_db_total_size_in_bytes 20480 means current physically allocated DB size is 20 KB.
- etcd_mvcc_db_total_size_in_use_in_bytes 16384 means future DB size if defragment operation is complete.
- etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes is the number of bytes that can be saved on disk with defragment operation.
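To make the relationship between the three metrics concrete, here is a small sketch using the sample values above. The helpers reclaimableBytes and quotaUsage are hypothetical names; in practice the same arithmetic would typically be done in a monitoring query rather than in Go.

```go
package main

import "fmt"

// reclaimableBytes estimates how many bytes a defragment operation could
// free: physically allocated size minus logically in-use size.
func reclaimableBytes(totalBytes, inUseBytes int64) int64 {
	if inUseBytes > totalBytes {
		return 0
	}
	return totalBytes - inUseBytes
}

// quotaUsage returns the fraction of the backend quota consumed by the
// physically allocated database size.
func quotaUsage(totalBytes, quotaBytes int64) float64 {
	if quotaBytes <= 0 {
		return 0
	}
	return float64(totalBytes) / float64(quotaBytes)
}

func main() {
	// Sample values from the metric descriptions above.
	const quota, total, inUse = 2147483648, 20480, 16384
	fmt.Println("reclaimable bytes:", reclaimableBytes(total, inUse))
	fmt.Printf("quota used: %.6f%%\n", 100*quotaUsage(total, quota))
}
```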
Add etcd_mvcc_db_open_read_transactions Prometheus metric.
Add etcd_snap_fsync_duration_seconds Prometheus metric.
Add etcd_disk_backend_defrag_duration_seconds Prometheus metric.
Add etcd_mvcc_hash_duration_seconds Prometheus metric.
Add etcd_mvcc_hash_rev_duration_seconds Prometheus metric.
Add etcd_debugging_disk_backend_commit_rebalance_duration_seconds Prometheus metric.
Add etcd_debugging_disk_backend_commit_spill_duration_seconds Prometheus metric.
Add etcd_debugging_disk_backend_commit_write_duration_seconds Prometheus metric.
Add etcd_debugging_lease_granted_total Prometheus metric.
Add etcd_debugging_lease_revoked_total Prometheus metric.
Add etcd_debugging_lease_renewed_total Prometheus metric.
Add etcd_debugging_lease_ttl_total Prometheus metric.
Add etcd_network_snapshot_send_inflights_total Prometheus metric.
Add etcd_network_snapshot_receive_inflights_total Prometheus metric.
Add etcd_server_snapshot_apply_in_progress_total Prometheus metric.
Add etcd_server_is_learner Prometheus metric.
Add etcd_server_learner_promote_failures Prometheus metric.
Add etcd_server_learner_promote_successes Prometheus metric.
Increase etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds Prometheus metric histogram upper-bound.
- Previously, highest bucket only collects requests taking 1.024 seconds or more.
- Now, highest buckets collect 1.024 seconds, 2.048 seconds, and 4.096 seconds or more.
Fix missing etcd_network_peer_sent_failures_total Prometheus metric count.
Fix etcd_debugging_server_lease_expired_total Prometheus metric.
Fix race conditions in v2 server stat collecting.
Change gRPC proxy to expose etcd server endpoint /metrics.
- Previously, the metrics exposed via the proxy described the proxy itself rather than the etcd server members.
Fix bug where db_compaction_total_duration_milliseconds metric incorrectly measured duration as 0.
Deprecating etcd_debugging_mvcc_db_total_size_in_bytes Prometheus metric (to be removed in v3.5). Use etcd_mvcc_db_total_size_in_bytes instead.
Deprecating etcd_debugging_mvcc_put_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_put_total instead.
Deprecating etcd_debugging_mvcc_delete_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_delete_total instead.
Deprecating etcd_debugging_mvcc_range_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_range_total instead.
Deprecating etcd_debugging_mvcc_txn_total Prometheus metric (to be removed in v3.5). Use etcd_mvcc_txn_total instead.

Security, Authentication

See security doc for more details.

Support TLS cipher suite whitelisting.
- To block weak cipher suites.
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add etcd --cipher-suites flag.
- If empty, Go auto-populates the list.
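As an illustration of whitelisting by name, the sketch below turns a comma-separated list of cipher suite names into the IDs expected by tls.Config.CipherSuites. It relies on tls.CipherSuites from the Go standard library (available in Go 1.14 and later, which is newer than the Go version this release is compiled with), and parseCipherSuites is a hypothetical helper rather than etcd's own flag parsing.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"strings"
)

// parseCipherSuites converts comma-separated cipher suite names into IDs.
// An empty list returns nil so that Go picks its default suites.
func parseCipherSuites(list string) ([]uint16, error) {
	if list == "" {
		return nil, nil
	}
	byName := make(map[string]uint16)
	for _, cs := range tls.CipherSuites() {
		byName[cs.Name] = cs.ID
	}
	var ids []uint16
	for _, name := range strings.Split(list, ",") {
		name = strings.TrimSpace(name)
		id, ok := byName[name]
		if !ok {
			return nil, fmt.Errorf("unknown or unsupported cipher suite %q", name)
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	ids, err := parseCipherSuites("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256")
	fmt.Println(ids, err)
	// The IDs would then be assigned to (*tls.Config).CipherSuites.
}
```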
Add etcd --host-whitelist flag, etcdserver.Config.HostWhitelist, and embed.Config.HostWhitelist, to prevent "DNS Rebinding" attack.
- Any website can simply create an authorized DNS name, and direct DNS to "localhost" (or any other address). Then, all HTTP endpoints of etcd server listening on "localhost" become accessible, thus vulnerable to DNS rebinding attacks (CVE-2018-5702).
- Client origin enforce policy works as follows:
  - If client connection is secure via HTTPS, allow any hostnames.
  - If client connection is not secure and "HostWhitelist" is not empty, only allow HTTP requests whose Host field is listed in whitelist.
- By default, "HostWhitelist" is "*", which means insecure server allows all client HTTP requests.
- Note that the client origin policy is enforced whether authentication is enabled or not, for tighter controls.
- When specifying hostnames, loopback addresses are not added automatically. To allow loopback interfaces, add them to whitelist manually (e.g. "localhost", "127.0.0.1", etc.).
- e.g. etcd --host-whitelist example.com, then the server will reject all HTTP requests whose Host field is not example.com (also rejects requests to "localhost").
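The origin policy described above can be sketched as a small decision function. This is an assumption-laden illustration, not etcd's implementation: allowHost is a hypothetical name, the host is assumed to have had any port already stripped, and an empty whitelist is treated as "allow all" in line with the "*" default.

```go
package main

import "fmt"

// allowHost reports whether an HTTP request with the given Host value
// should be served under the whitelist policy sketched above.
func allowHost(secure bool, whitelist map[string]struct{}, host string) bool {
	if secure {
		return true // HTTPS connections allow any hostname
	}
	if len(whitelist) == 0 {
		return true // assumption: empty whitelist means no restriction
	}
	if _, ok := whitelist["*"]; ok {
		return true // default "*" allows all client HTTP requests
	}
	_, ok := whitelist[host]
	return ok
}

func main() {
	wl := map[string]struct{}{"example.com": {}}
	fmt.Println(allowHost(false, wl, "example.com")) // listed host
	fmt.Println(allowHost(false, wl, "localhost"))   // loopback is not added automatically
	fmt.Println(allowHost(true, wl, "localhost"))    // secure connection
}
```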
Support etcd --cors in v3 HTTP requests (gRPC gateway).
Support ttl field for etcd Authentication JWT token.
- e.g. etcd --auth-token jwt,pub-key=<pub key path>,priv-key=<priv key path>,sign-method=<sign method>,ttl=5m.
Allow empty token provider in etcdserver.ServerConfig.AuthToken.
Fix TLS reload when certificate SAN field only includes IP addresses but no domain names.
- In Go, the server calls (*tls.Config).GetCertificate for TLS reload if and only if the server's (*tls.Config).Certificates field is empty, or (*tls.ClientHelloInfo).ServerName is not empty with a valid SNI from the client. Previously, etcd always populated (*tls.Config).Certificates on the initial client TLS handshake, as non-empty. Thus, the client was always expected to supply a matching SNI in order to pass the TLS verification and to trigger (*tls.Config).GetCertificate to reload TLS assets.
- However, a certificate whose SAN field does not include any domain names but only IP addresses would request *tls.ClientHelloInfo with an empty ServerName field, thus failing to trigger the TLS reload on initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
- Now, (*tls.Config).Certificates is created empty on initial TLS client handshake, first to trigger (*tls.Config).GetCertificate, and then to populate rest of the certificates on every new TLS connection, even when client SNI is empty (e.g. cert only includes IPs).

etcd server

Add rpctypes.ErrLeaderChanged.
- Now linearizable requests with read index would fail fast when there is a leadership change, instead of waiting until context timeout.
Add etcd --initial-election-tick-advance flag to configure initial election tick fast-forward.
- By default, etcd --initial-election-tick-advance=true, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, a cross datacenter deployment may require a longer election timeout of 10 seconds. If true, the local node does not need to wait up to 10 seconds. Instead, it forwards its election ticks to 8 seconds, leaving only 2 seconds before leader election.
- The major assumptions are: either the cluster has no active leader, so advancing ticks enables faster leader election; or the cluster already has an established leader, and the rejoining follower is likely to receive heartbeats from the leader after the tick advance and before the election timeout.
- However, when the network from the leader to the rejoining follower is congested, and the follower does not receive a leader heartbeat within the remaining election ticks, a disruptive election has to happen, thus affecting cluster availability.
- Now, this can be disabled by setting etcd --initial-election-tick-advance=false.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring etcd --initial-election-tick-advance at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address disruptive rejoining follower node.
Add etcd --pre-vote flag to enable an additional Raft election phase.
- For instance, a flaky (or rejoining) member may drop in and out, and start a campaign. This member will end up with a higher term, and ignore all incoming messages with a lower term. In this case, a new leader eventually needs to get elected, which is disruptive to cluster availability. Raft implements a Pre-Vote phase to prevent this kind of disruption. If enabled, Raft runs an additional election phase to check whether the pre-candidate can get enough votes to win an election.
- etcd --pre-vote=false by default.
- v3.5 will enable etcd --pre-vote=true by default.
Add etcd --experimental-compaction-batch-limit to set the maximum revisions deleted in each compaction batch.
Reduce default compaction batch size from 10k revisions to 1k revisions to improve p99 latency during compactions, and reduce wait between compactions from 100ms to 10ms.
Add etcd --discovery-srv-name flag to support custom DNS SRV name with discovery.
- If not given, etcd queries _etcd-server-ssl._tcp.[YOUR_HOST] and _etcd-server._tcp.[YOUR_HOST].
- If etcd --discovery-srv-name="foo", then query _etcd-server-ssl-foo._tcp.[YOUR_HOST] and _etcd-server-foo._tcp.[YOUR_HOST].
- Useful for operating multiple etcd clusters under the same domain.
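
The SRV record naming rule above can be sketched in a few lines of Go. This is a minimal illustration only: the function name `srvNames` and the domain `example.com` are hypothetical, and the code is not etcd's discovery implementation; only the record-name pattern is taken from the entries above.

```go
package main

import "fmt"

// srvNames returns the secure and insecure SRV record names for a domain.
// srvName corresponds to the value of --discovery-srv-name; an empty value
// yields the default record names. (Hypothetical helper, not etcd code.)
func srvNames(domain, srvName string) (secure, insecure string) {
	suffix := ""
	if srvName != "" {
		suffix = "-" + srvName
	}
	secure = fmt.Sprintf("_etcd-server-ssl%s._tcp.%s", suffix, domain)
	insecure = fmt.Sprintf("_etcd-server%s._tcp.%s", suffix, domain)
	return secure, insecure
}

func main() {
	s, i := srvNames("example.com", "")
	fmt.Println(s, i)
	s, i = srvNames("example.com", "foo")
	fmt.Println(s, i)
}
```
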
Support TLS cipher suite whitelisting.
- To block weak cipher suites.
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add etcd --cipher-suites flag.
- If empty, Go auto-populates the list.
Support etcd --cors in v3 HTTP requests (gRPC gateway).
Rename etcd --log-output to etcd --log-outputs to support multiple log outputs.
- etcd --log-output will be deprecated in v3.5.
Add etcd --logger flag to support structured logger and multiple log outputs in server-side.
- etcd --logger=capnslog will be deprecated in v3.5.
- The main motivation is to promote automated etcd monitoring, rather than looking back at server logs when it starts breaking. Future development will make etcd log as little as possible, and make etcd easier to monitor with metrics and alerts.
- etcd --logger=capnslog --log-outputs=default is the default setting and same as previous etcd server logging format.
- etcd --log-outputs=default is not supported when etcd --logger=zap.
  - Use etcd --logger=zap --log-outputs=stderr instead.
  - Or, use etcd --logger=zap --log-outputs=systemd/journal to send logs to the local systemd journal.
  - Previously, if etcd parent process ID (PPID) is 1 (e.g. run with systemd), etcd --logger=capnslog --log-outputs=default redirects server logs to local systemd journal. And if write to journald fails, it writes to os.Stderr as a fallback.
  - However, even with PPID 1, it can fail to dial systemd journal (e.g. run embedded etcd with Docker container). Then, every single log write will fail and fall back to os.Stderr, which is inefficient.
  - To avoid this problem, systemd journal logging must be configured manually.
- etcd --logger=zap --log-outputs=stderr logs server operations in JSON-encoded format and writes logs to os.Stderr. Use this to override journald log redirects.
- etcd --logger=zap --log-outputs=stdout logs server operations in JSON-encoded format and writes logs to os.Stdout. Use this to override journald log redirects.
- etcd --logger=zap --log-outputs=a.log logs server operations in JSON-encoded format and writes logs to the specified file a.log.
- etcd --logger=zap --log-outputs=a.log,b.log,c.log,stdout writes server logs to multiple files a.log, b.log and c.log at the same time and also outputs to os.Stdout, in JSON-encoded format.
- etcd --logger=zap --log-outputs=/dev/null will discard all server logs.
Add etcd --log-level flag to support log level.
- v3.5 will deprecate etcd --debug flag in favor of etcd --log-level=debug.
Add etcd --backend-batch-limit flag.
Add etcd --backend-batch-interval flag.
Fix mvcc "unsynced" watcher restore operation.
- An "unsynced" watcher is a watcher that still needs to catch up with events that have already happened.
- That is, an "unsynced" watcher is a slow watcher that was requested on an old revision.
- The "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- This could cause "unsynced" watchers to miss events.
- A node gets network-partitioned with a watcher on a future revision, falls behind, and receives a leader snapshot after the partition is removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced, since synced watchers might have become stale during the network partition, and resets the synced watcher group to restart watcher routines. Previously, there was a bug when moving from the synced watcher group to unsynced, so a client would miss events when the watcher was requested on the network-partitioned node.
Fix mvcc server panic from restore operation.
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, etcd server panicked during snapshot restore operation.
- Now, this server-side panic has been fixed.
Fix server panic on invalid Election Proclaim/Resign HTTP(S) requests.
- Previously, wrong-formatted HTTP requests to Election API could trigger panic in etcd server.
- e.g. curl -L http://localhost:2379/v3/election/proclaim -X POST -d '{"value":""}', curl -L http://localhost:2379/v3/election/resign -X POST -d '{"value":""}'.
Fix revision-based compaction retention parsing.
- Previously, etcd --auto-compaction-mode revision --auto-compaction-retention 1 was translated to revision retention 3600000000000.
- Now, etcd --auto-compaction-mode revision --auto-compaction-retention 1 is correctly parsed as revision retention 1.
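
The value 3600000000000 equals one hour expressed in nanoseconds, which is consistent with the retention value having been interpreted as a number of hours rather than a revision count. The sketch below illustrates that distinction; `parseRetention` is a hypothetical helper written for this note (it only accepts plain integers) and is not etcd's actual flag parser.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseRetention interprets an integer retention value according to the
// compaction mode. In "revision" mode the number is a revision count; in
// "periodic" mode this sketch treats it as hours and returns nanoseconds.
// (Hypothetical helper for illustration only.)
func parseRetention(mode, retention string) (int64, error) {
	n, err := strconv.ParseInt(retention, 10, 64)
	if err != nil {
		return 0, err
	}
	switch mode {
	case "revision":
		return n, nil
	case "periodic":
		return int64(time.Duration(n) * time.Hour), nil
	default:
		return 0, fmt.Errorf("unknown compaction mode %q", mode)
	}
}

func main() {
	rev, _ := parseRetention("revision", "1")
	per, _ := parseRetention("periodic", "1")
	fmt.Println("revision mode:", rev)
	fmt.Println("periodic mode (ns):", per)
}
```
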
Prevent overflow by large TTL values for Lease Grant.
- The TTL parameter of a Grant request is in units of seconds.
- Leases with too large TTL values exceeding math.MaxInt64 expire in unexpected ways.
- Server now returns rpctypes.ErrLeaseTTLTooLarge to client, when the requested TTL is larger than 9,000,000,000 seconds (which is >285 years).
- Again, etcd Lease is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
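
The arithmetic behind the limit can be checked with a short sketch. It assumes, for illustration, that a TTL in seconds is eventually converted to nanoseconds held in an int64 (as Go's time.Duration does); whether etcd's overflow arises exactly this way is an assumption. `errTTLTooLarge` is a local stand-in for rpctypes.ErrLeaseTTLTooLarge, and `checkTTL` and `fitsInDuration` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"time"
)

// maxLeaseTTL is the 9,000,000,000-second limit quoted above.
const maxLeaseTTL int64 = 9000000000

// errTTLTooLarge stands in for rpctypes.ErrLeaseTTLTooLarge in this sketch.
var errTTLTooLarge = errors.New("lease TTL too large")

// checkTTL rejects TTL values above the limit.
func checkTTL(ttl int64) error {
	if ttl > maxLeaseTTL {
		return errTTLTooLarge
	}
	return nil
}

// fitsInDuration reports whether ttl seconds can be represented as
// nanoseconds in an int64 without overflowing.
func fitsInDuration(ttl int64) bool {
	return ttl <= math.MaxInt64/int64(time.Second)
}

func main() {
	years := float64(maxLeaseTTL) / (365 * 24 * 3600)
	fmt.Printf("limit is about %.1f years\n", years)
	fmt.Println("limit fits in int64 nanoseconds:", fitsInDuration(maxLeaseTTL))
	fmt.Println("10,000,000,000 s fits:", fitsInDuration(10000000000))
}
```
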
Fix expired lease revoke.
- Fix "the key is not deleted when the bound lease expires".
Enable etcd server raft.Config.CheckQuorum when starting with ForceNewCluster.
Allow non-WAL files in etcd --wal-dir directory.
- Previously, existing files such as lost+found in the WAL directory prevented the etcd server from booting.
- Now, WAL directory that contains only lost+found or a file that's not suffixed with .wal is considered non-initialized.
Fix ETCD_CONFIG_FILE env variable parsing in etcd.
Fix race condition in rafthttp transport pause/resume.
Fix server crash from creating an empty role.
- Previously, creating a role with an empty name crashed etcd server with error code Unavailable.
- Now, creating a role with an empty name is rejected with error code InvalidArgument.

API

Add isLearner field to etcdserverpb.Member, etcdserverpb.MemberAddRequest and etcdserverpb.StatusResponse as part of raft learner implementation.
Add MemberPromote rpc to etcdserverpb.Cluster interface and the corresponding MemberPromoteRequest and MemberPromoteResponse as part of raft learner implementation.
Add snapshot package for snapshot restore/save operations (see godoc.org/github.com/etcd/clientv3/snapshot for more).
Add watch_id field to etcdserverpb.WatchCreateRequest to allow user-provided watch ID to mvcc.
- Corresponding watch_id is returned via etcdserverpb.WatchResponse, if any.
Add fragment field to etcdserverpb.WatchCreateRequest to request etcd server to split watch events when the total size of events exceeds etcd --max-request-bytes flag value plus gRPC-overhead 512 bytes.
- The default server-side request bytes limit is embed.DefaultMaxRequestBytes which is 1.5 MiB plus gRPC-overhead 512 bytes.
- If watch response events exceed this server-side request limit and watch request is created with fragment field true, the server will split watch events into a set of chunks, each of which is a subset of watch events below server-side request limit.
- Useful when the client side has limited bandwidth.
- For example, watch response contains 10 events, where each event is 1 MiB. And server etcd --max-request-bytes flag value is 1 MiB. Then, server will send 10 separate fragmented events to the client.
- For example, watch response contains 5 events, where each event is 2 MiB. And server etcd --max-request-bytes flag value is 1 MiB and clientv3.Config.MaxCallRecvMsgSize is 1 MiB. Then, server will try to send 5 separate fragmented events to the client, and the client will error with "code = ResourceExhausted desc = grpc: received message larger than max (...)".
- Client must implement fragmented watch event merge (which clientv3 does in etcd v3.4).
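
The two examples above can be reproduced with a simple greedy splitter. This is a sketch under stated assumptions, not etcd's fragmentation code: events are represented only by their sizes in bytes, `fragment` is a hypothetical function, and an event larger than the limit is placed in a chunk by itself (which is why the second example still fails on the client).

```go
package main

import "fmt"

// fragment groups event sizes into chunks so that each chunk's total stays
// at or below limit whenever possible. An event larger than limit cannot be
// split further and ends up alone in an oversized chunk.
// (Hypothetical helper for illustration only.)
func fragment(sizes []int, limit int) [][]int {
	var chunks [][]int
	var cur []int
	total := 0
	for _, s := range sizes {
		// Close the current chunk if adding this event would exceed the limit.
		if len(cur) > 0 && total+s > limit {
			chunks = append(chunks, cur)
			cur, total = nil, 0
		}
		cur = append(cur, s)
		total += s
	}
	if len(cur) > 0 {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	const MiB = 1 << 20
	limit := 1*MiB + 512 // --max-request-bytes plus gRPC overhead

	ten := make([]int, 10)
	for i := range ten {
		ten[i] = 1 * MiB
	}
	fmt.Println("10 x 1 MiB ->", len(fragment(ten, limit)), "fragments")

	five := make([]int, 5)
	for i := range five {
		five[i] = 2 * MiB
	}
	fmt.Println("5 x 2 MiB ->", len(fragment(five, limit)), "fragments (each over the limit)")
}
```
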
Add raftAppliedIndex field to etcdserverpb.StatusResponse for current Raft applied index.
Add errors field to etcdserverpb.StatusResponse for server-side error.
- e.g. "etcdserver: no leader", "NOSPACE", "CORRUPT"
Add dbSizeInUse field to etcdserverpb.StatusResponse for actual DB size after compaction.
Add WatchRequest.WatchProgressRequest.
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as WithProgressNotify that can be triggered manually.

Note: v3.5 will deprecate etcd --log-package-levels flag for capnslog; etcd --logger=zap --log-outputs=stderr will be the default. v3.5 will deprecate [CLIENT-URL]/config/local/log endpoint.

Package `embed`

Add embed.Config.CipherSuites to specify a list of supported cipher suites for TLS handshake between client/server and peers.
- If empty, Go auto-populates the list.
- Both embed.Config.ClientTLSInfo.CipherSuites and embed.Config.CipherSuites cannot be non-empty at the same time.
- If not empty, specify either embed.Config.ClientTLSInfo.CipherSuites or embed.Config.CipherSuites.
Add embed.Config.InitialElectionTickAdvance to enable/disable initial election tick fast-forward.
- embed.NewConfig() would return *embed.Config with InitialElectionTickAdvance as true by default.
Define embed.CompactorModePeriodic for compactor.ModePeriodic.
Define embed.CompactorModeRevision for compactor.ModeRevision.
Change embed.Config.CorsInfo in *cors.CORSInfo type to embed.Config.CORS in map[string]struct{} type.
Remove embed.Config.SetupLogging.
- Now logger is set up automatically based on embed.Config.Logger, embed.Config.LogOutputs, embed.Config.Debug fields.
Add embed.Config.Logger to support structured logger zap in server-side.
Add embed.Config.LogLevel.
Rename embed.Config.SnapCount field to embed.Config.SnapshotCount, to be consistent with the flag name etcd --snapshot-count.
Rename embed.Config.LogOutput to embed.Config.LogOutputs to support multiple log outputs.
Change embed.Config.LogOutputs type from string to []string to support multiple log outputs.
Add embed.Config.BackendBatchLimit field.
Add embed.Config.BackendBatchInterval field.
Make embed.DefaultEnableV2 false by default.

Package `pkg/adt`

Change pkg/adt.IntervalTree from struct to interface.
- See pkg/adt README and pkg/adt godoc.
Improve pkg/adt.IntervalTree test coverage.
- See pkg/adt README and pkg/adt godoc.
Fix Red-Black tree to maintain black-height property.
- Previously, the delete operation violated the black-height property.

Package `integration`

Add CLUSTER_DEBUG to enable test cluster logging.
- Deprecated capnslog in integration tests.

client v3

Add MemberAddAsLearner to Clientv3.Cluster interface. This API is used to add a learner member to etcd cluster.
Add MemberPromote to Clientv3.Cluster interface. This API is used to promote a learner member in etcd cluster.
Client may receive rpctypes.ErrLeaderChanged from server.
- Now linearizable requests with read index would fail fast when there is a leadership change, instead of waiting until context timeout.
Add WithFragment OpOption to support watch events fragmentation when the total size of events exceeds etcd --max-request-bytes flag value plus gRPC-overhead 512 bytes.
- Watch fragmentation is disabled by default.
- The default server-side request bytes limit is embed.DefaultMaxRequestBytes which is 1.5 MiB plus gRPC-overhead 512 bytes.
- If watch response events exceed this server-side request limit and watch request is created with fragment field true, the server will split watch events into a set of chunks, each of which is a subset of watch events below server-side request limit.
- Useful when the client side has limited bandwidth.
- For example, watch response contains 10 events, where each event is 1 MiB. And server etcd --max-request-bytes flag value is 1 MiB. Then, server will send 10 separate fragmented events to the client.
- For example, watch response contains 5 events, where each event is 2 MiB. And server etcd --max-request-bytes flag value is 1 MiB and clientv3.Config.MaxCallRecvMsgSize is 1 MiB. Then, server will try to send 5 separate fragmented events to the client, and the client will error with "code = ResourceExhausted desc = grpc: received message larger than max (...)".
Add Watcher.RequestProgress method.
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as WithProgressNotify that can be triggered manually.
Fix lease keepalive interval updates when response queue is full.
- If <-chan *clientv3.LeaseKeepAliveResponse from clientv3.Lease.KeepAlive was never consumed or the channel was full, the client was sending a keepalive request every 500ms instead of at the expected rate of one every "TTL / 3" duration.
Change snapshot file permissions: On Linux, the snapshot file changes from readable by all (mode 0644) to readable by the user only (mode 0600).
Client may choose to send keepalive pings to server using PermitWithoutStream.
- By setting PermitWithoutStream to true, client can send keepalive pings to server without any active streams(RPCs). In other words, it allows sending keepalive pings with unary or simple RPC calls.
- PermitWithoutStream is set to false by default.
Fix logic on release lock key if cancelled in clientv3/concurrency package.
Fix (*Client).Endpoints() method race condition.
Deprecated grpc.ErrClientConnClosing.
- clientv3 and proxy/grpcproxy no longer return grpc.ErrClientConnClosing.
- grpc.ErrClientConnClosing has been deprecated in gRPC >= 1.10.
- Use clientv3.IsConnCanceled(error) or google.golang.org/grpc/status.FromError(error) instead.

etcdctl v3

Make ETCDCTL_API=3 etcdctl default.
- Now, etcdctl set foo bar must be ETCDCTL_API=2 etcdctl set foo bar.
- Now, ETCDCTL_API=3 etcdctl put foo bar could be just etcdctl put foo bar.
Add etcdctl member add --learner and etcdctl member promote to add and promote raft learner member in etcd cluster.
Add etcdctl --password flag.
- To support : character in user name.
- e.g. etcdctl --user user --password password get foo
Add etcdctl user add --new-user-password flag.
Add etcdctl check datascale command.
Add etcdctl check datascale --auto-compact, --auto-defrag flags.
Add etcdctl check perf --auto-compact, --auto-defrag flags.
Add etcdctl defrag --cluster flag.
Add "raft applied index" field to endpoint status.
Add "errors" field to endpoint status.
Add etcdctl endpoint health --write-out support.
- Previously, etcdctl endpoint health --write-out json did not work.
Add missing newline in etcdctl endpoint health.
Fix etcdctl watch [key] [range_end] -- [exec-command…] parsing.
- Previously, ETCDCTL_API=3 etcdctl watch foo -- echo watch event received panicked.
Fix etcdctl move-leader command for TLS-enabled endpoints.
Add progress command to etcdctl watch --interactive.
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as WithProgressNotify that can be triggered manually.
Add timeout to etcdctl snapshot save.
- User can specify timeout of etcdctl snapshot save command using flag --command-timeout.
Fix etcdctl to strip out insecure endpoints from DNS SRV records when using discovery.

gRPC proxy

Fix etcd server panic from restore operation.
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, etcd server panicked during snapshot restore operation.
- Especially, gRPC proxy was affected, since it detects a leader loss with a key "proxy-namespace__lostleader" and a watch revision "int64(math.MaxInt64 - 2)".
- Now, this server-side panic has been fixed.
Fix memory leak in cache layer.
Change gRPC proxy to expose etcd server endpoint /metrics.
- Previously, the metrics exposed via the proxy were those of the proxy itself, not of the etcd server members.

gRPC gateway

Replace gRPC gateway endpoint /v3beta with /v3.
- Deprecated /v3alpha.
- To deprecate /v3beta in v3.5.
- In v3.4, curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' still works as a fallback to curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}', but the /v3beta form won't work in v3.5. Use the /v3 form instead.
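
The key and value in the curl payloads above are base64-encoded: "Zm9v" is "foo" and "YmFy" is "bar". A minimal sketch of building such a request body in Go follows; the `putRequest` type and `putBody` function are hypothetical helpers, not part of any etcd package.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// putRequest mirrors the JSON shape used in the curl examples.
// (Hypothetical type for illustration only.)
type putRequest struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// putBody base64-encodes key and value and returns the JSON request body.
func putBody(key, value string) (string, error) {
	req := putRequest{
		Key:   base64.StdEncoding.EncodeToString([]byte(key)),
		Value: base64.StdEncoding.EncodeToString([]byte(value)),
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := putBody("foo", "bar")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```
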
Add API endpoints /{v3beta,v3}/lease/leases, /{v3beta,v3}/lease/revoke, /{v3beta,v3}/lease/timetolive.
- To deprecate /{v3beta,v3}/kv/lease/leases, /{v3beta,v3}/kv/lease/revoke, /{v3beta,v3}/kv/lease/timetolive in v3.5.
Support etcd --cors in v3 HTTP requests (gRPC gateway).

Package `raft`

Fix deadlock during PreVote migration process.
Add raft.ErrProposalDropped.
- Now (r *raft) Step returns raft.ErrProposalDropped if a proposal has been ignored.
- e.g. a node is removed from cluster, or raftpb.MsgProp arrives at current leader while there is an ongoing leadership transfer.
Improve Raft becomeLeader and stepLeader by keeping track of latest pb.EntryConfChange index.
- Previously, it recorded a pendingConf boolean field by scanning the entire tail of the log, which could delay heartbeat sends.
Fix missing learner nodes on (n *node) ApplyConfChange.
Add raft.Config.MaxUncommittedEntriesSize to limit the total size of the uncommitted entries in bytes.
- Once exceeded, raft returns raft.ErrProposalDropped error.
- Prevent unbounded Raft log growth.
- There was a bug in PR#10167 but fixed via PR#10199.
Add raft.Ready.CommittedEntries pagination using raft.Config.MaxSizePerMsg.
- This prevents out-of-memory errors if the raft log has become very large and commits all at once.
- Fix correctness bug in CommittedEntries pagination.
Optimize message send flow control.
- Leader now sends more append entries if it has more non-empty entries to send after updating flow control information.
- Now, Raft allows multiple in-flight append messages.
Optimize memory allocation when boxing slice in maybeCommit.
- By boxing a heap-allocated slice header instead of the slice header on the stack, we can avoid an allocation when passing through the sort.Interface interface.
Avoid memory allocation in Raft entry String method.
Avoid multiple memory allocations when merging stable and unstable log.
Extract progress tracking into own component.
- Add package raft/tracker.
- Optimize string representation of Progress.
Make relationship between node and RawNode explicit.
Prevent learners from becoming leader.
Add package raft/quorum to reason about committed indexes as well as vote outcomes for both majority and joint quorums.
- Bundle Voters and Learner into raft/tracker.Config struct.
Use membership sets in progress tracking.
Implement joint quorum computation.
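
As a rough illustration of what a joint quorum computation involves, the sketch below derives the committed index for a majority configuration and for a joint configuration (the minimum over both halves). It is a simplified model written for this note, not the raft/quorum package: the function names are hypothetical, voters are represented only by their acknowledged log indexes, and treating an empty configuration as "no constraint" is a convention of this sketch.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// majorityCommitted returns the highest index acknowledged by a majority of
// voters, given each voter's acknowledged index. An empty configuration
// imposes no constraint and returns math.MaxUint64.
func majorityCommitted(match []uint64) uint64 {
	n := len(match)
	if n == 0 {
		return math.MaxUint64
	}
	sorted := append([]uint64(nil), match...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	// With a quorum of n/2+1 voters, the committed index is the
	// (n/2+1)-th largest acknowledged index.
	return sorted[n-(n/2+1)]
}

// jointCommitted returns the index committed under a joint configuration:
// an entry must be committed by majorities of both voter sets.
func jointCommitted(incoming, outgoing []uint64) uint64 {
	a, b := majorityCommitted(incoming), majorityCommitted(outgoing)
	if a < b {
		return a
	}
	return b
}

func main() {
	fmt.Println(majorityCommitted([]uint64{5, 3, 4}))
	fmt.Println(jointCommitted([]uint64{5, 3, 4}, []uint64{2, 2, 9}))
}
```
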
Refactor raft/node.go to centralize configuration change application.
Allow voter to become learner through snapshot.
Add package raft/confchange to internally support joint consensus.
Use RawNode for node's event loop.
Add RawNode.Bootstrap method.
Add raftpb.ConfChangeV2 to use joint quorums.
- raftpb.ConfChange continues to work as today: it allows carrying out a single configuration change. A pb.ConfChange proposal gets added to the Raft log as such and is thus also observed by the app during Ready handling, and fed back to ApplyConfChange.
- raftpb.ConfChangeV2 allows joint configuration changes but will continue to carry out configuration changes in "one phase" (i.e. without ever entering a joint config) when this is possible.
- raftpb.ConfChangeV2 messages initiate configuration changes. They support both the simple "one at a time" membership change protocol and full Joint Consensus allowing for arbitrary changes in membership.
Change raftpb.ConfState.Nodes to raftpb.ConfState.Voters.
Allow learners to vote, but still learners do not count in quorum.
- This is necessary in the situation in which a learner has been promoted (i.e. is now a voter) but has not learned about this yet.
Fix restoring joint consensus.
Visit Progress in stable order.
Proactively probe newly added followers.
- The general expectation in tracker.Progress.Next == c.LastIndex is that the follower has no log at all (and will thus likely need a snapshot), though the app may have applied a snapshot out of band before adding the replica (thus making the first index the better choice).
- Previously, when the leader applied a new configuration that added voters, it would not immediately probe these voters, delaying when they would be caught up.

Package `wal`

Add Verify function to perform corruption check on WAL contents.
Fix wal directory cleanup on creation failures.

Tooling

Add etcd-dump-logs --entry-type flag to support WAL log filtering by entry type.
Add etcd-dump-logs --stream-decoder flag to support custom decoder.
Add SHA256SUMS file to release assets.
- etcd maintainers are a distributed team; this change allows releases to be cut and validated without requiring a signing key.
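
A downloaded asset can be checked against a checksum entry along these lines. The sketch assumes the SHA256SUMS file uses the common sha256sum line format (hex digest, whitespace, file name); that format, the `verify` function, and the file name `empty.txt` are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verify reports whether data matches the digest on a checksum line of the
// assumed form "<hex digest>  <file name>". A leading "*" on the file name
// (binary-mode marker) is ignored.
func verify(sumsLine, name string, data []byte) bool {
	fields := strings.Fields(sumsLine)
	if len(fields) != 2 {
		return false
	}
	if strings.TrimPrefix(fields[1], "*") != name {
		return false
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) == fields[0]
}

func main() {
	// SHA-256 digest of empty input, used here as a stand-in asset.
	line := "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  empty.txt"
	fmt.Println("checksum ok:", verify(line, "empty.txt", nil))
}
```
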

Go

Require Go 1.12+.
Compile with Go 1.12.9 including Go 1.12.8 security fixes.

Dockerfile

Rebase etcd image from Alpine to Debian to improve security and maintenance effort for etcd release.
