Basic high availability for auto egress IPs#19578
Basic high availability for auto egress IPs#19578openshift-merge-robot merged 3 commits intoopenshift:masterfrom
Conversation
2b044d1 to
473dcfd
Compare
|
LGTM |
473dcfd to
b223227
Compare
knobunc
left a comment
There was a problem hiding this comment.
/lgtm
@openshift/networking PTAL
| } | ||
|
|
||
| const ( | ||
| defaultPollInterval = 5 * time.Second |
There was a problem hiding this comment.
Can we doc these constants please?
There was a problem hiding this comment.
updated the egressVXLANMonitor docs to clarify how the constants get used
pkg/network/node/vxlan_monitor.go
Outdated
| // The fact that we (normally) use pod-to-egress traffic to do the monitoring rather than | ||
| // actively pinging the nodes means that if an egress node falls over while no one is | ||
| // using an egress IP on it, we won't notice the problem until someone does try to use it. | ||
| // So, eg, the first pod created in a namespace might spend 5 seconds trying to talk to a |
There was a problem hiding this comment.
It's (max) 7 seconds... right? Or am I missing something.
Every pollInterval (5s) we check the traffic to each node. If there's a mismatch we go into retry mode which re-checks every 1s and call it failed after 2 retries. So we will call it bad in best case 2s, worst case 7s, and average 4.5s.
There was a problem hiding this comment.
Yes. Changed it from "5" to "several"
| nodeIP: nodeIP, | ||
| sdnIP: sdnIP, | ||
| } | ||
| if len(evm.monitorNodes) == 1 && evm.pollInterval != 0 { |
There was a problem hiding this comment.
Is it worth turning off polling in RemoveNode()? Maybe not, just curious.
There was a problem hiding this comment.
Hm... I was doing it from poll() (it returned true if evm.monitorNodes was empty, causing the PollInfinite() to exit). But that creates a race condition if you delete the last node, then add a new node before poll() runs. So I redid it with a stop channel.
b223227 to
2e101d1
Compare
2e101d1 to
940d775
Compare
If a namespace has multiple egress IPs, monitor egress traffic and switch to an alternate egress IP if the currently-selected one appears dead.
Most dump-flows calls are part of health checks and don't normally need to be logged about unless they fail.
9c76ad6 to
0439a83
Compare
|
/hold cancel |
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship, dcbw, knobunc The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
1 similar comment
|
/retest |
|
/retest |
2 similar comments
|
/retest |
|
/retest |
| for _, ip := range eg.namespaces[0].requestedIPs { | ||
| eg2 := eip.egressIPs[ip] | ||
| if eg2 != eg && len(eg2.nodes) == 1 && eg2.nodes[0] == eg.nodes[0] { | ||
| return false, fmt.Errorf("Multiple EgressIPs (%s, %s) for VNID %d on node %s", eg.ip, eg2.ip, eg.namespaces[0].vnid, eg.nodes[0].nodeIP) |
There was a problem hiding this comment.
@danwinship What is the reason to make the egressIP unavailable when all the egressIPs which are claimed by namespace are held by the same node? IMO, this condition will break the HA, but should be harmless if only let the first element work.
There was a problem hiding this comment.
@bmeng If you set multiple egress IPs on a NetNamespace, then you are expecting to get high availability. If the IPs are both on the same node though, then you aren't actually getting any HA. So, as in other cases, we break things to make the admin notice the misconfiguration.
There was a problem hiding this comment.
Ok, that makes sense. Thanks.
OK, this changes the semantics of automatic egress IPs so that:
EgressIPsfield of aNetNamespace, and if so, the first one in the list that is currently hosted on a node will be used.I tried to do this without changing the existing egressip code, but the existing code was based on the idea of processing each egress IP independently, and that doesn't work very well when changes to one egress IP can affect whether another egress IP is used. So I had to do some reorganization.
Some things to fix later (either before this gets merged, or in followup PRs, or in future patch releases):
@openshift/sig-networking PTAL