The Kubernetes service endpoints (the master IPs used by clients inside the cluster) are maintained through leases on keys in etcd with a 15s TTL, and the leases are refreshed every 10s. On a very large cluster the TTL refresh appears to be failing, so the endpoints expire and drop out, which causes endpoint churn and flapping.
The TTL is not configurable. It should either be made configurable or be tied to another config flag that we can tweak.