
Large cluster has master endpoints flapping (one master is dropping in and out) #15212

@smarterclayton

The Kubernetes service endpoints (the master IPs used by clients on the cluster) are maintained via etcd leases with a 15s TTL, which are refreshed every 10s. On a very large cluster the TTL refresh appears to be failing, so the endpoints drop out, which causes endpoint churn and flapping.

The TTL is not configurable. It should either be made configurable or be tied to another config flag we can tweak.
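
To illustrate how thin the margin is, here is a rough sketch of the timing described above. It is not the actual master code: the function names and the sample timestamps are made up for illustration, and only the 15s TTL and 10s refresh interval come from this issue. It assumes a lease simply expires TTL seconds after the last successful refresh.

```python
from typing import List, Tuple

LEASE_TTL_SECONDS = 15.0         # TTL stated in this issue
REFRESH_INTERVAL_SECONDS = 10.0  # refresh period stated in this issue


def slack_seconds(ttl: float = LEASE_TTL_SECONDS,
                  interval: float = REFRESH_INTERVAL_SECONDS) -> float:
    """How late a refresh can land before the lease has already expired."""
    return ttl - interval


def expired_windows(refresh_times: List[float],
                    ttl: float = LEASE_TTL_SECONDS) -> List[Tuple[float, float]]:
    """Return (expiry, next_refresh) windows in which the endpoint is missing.

    refresh_times is a sorted list of timestamps (in seconds) at which a
    refresh actually succeeded. Hypothetical input, for illustration only.
    """
    gaps = []
    for prev, nxt in zip(refresh_times, refresh_times[1:]):
        expiry = prev + ttl
        if nxt > expiry:
            # The lease lapsed before the next refresh succeeded, so the
            # endpoint dropped out and then reappeared (one flap).
            gaps.append((expiry, nxt))
    return gaps
```

With these numbers a refresh only has 5s of slack. A master whose refresh lands 7s late (for example at t=27 instead of t=20) would be absent from the endpoints for about 2s before coming back, which matches the in-and-out behavior in the title.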
