Make A/B deployment proportional to service weight #15309
openshift-merge-robot merged 1 commit into openshift:master from
Conversation
@knobunc @rajatchopra PTAL
knobunc left a comment
Generally on the right track. Let's use getServiceUnits to make it a hair simpler.
    },
    "description": "stages contains details about each stage that occurs during the build including start time, duration (in milliseconds), and the steps that occurred within each stage."
    },
    "logSnippet": {
This should not have gone away.
Fixed. (accidentally deleted)
16ac7bd to 6f9c3d8
    r.stateChanged = true
}

// numberOfEndpoints returns the number of endpoints
numberOfEndpoints returns the number of endpoints for the given service id
pkg/router/template/router.go (outdated)
    r.lock.Lock()
    defer r.lock.Unlock()

    var eps int32
eps := 0
(0 is of type int, and that is at least 32 bits)
Or if you must:
var eps int32 = 0
pkg/router/template/router.go (outdated)
    var eps int32
    eps = 0
    svc, ok := r.findMatchingServiceUnit(id)
    if ok && len(svc.EndpointTable) > int(eps) {
no need to cast if you use := 0
pkg/router/template/router.go (outdated)
    eps = 0
    svc, ok := r.findMatchingServiceUnit(id)
    if ok && len(svc.EndpointTable) > int(eps) {
        eps = int32(len(svc.EndpointTable))
and no need to cast here either...
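Pulling the suggestions above together, the helper would come out roughly as follows. This is a sketch assembled from the quoted hunks and review comments, not the final merged code; templateRouter, findMatchingServiceUnit, and EndpointTable are taken from the surrounding diff.

// numberOfEndpoints returns the number of endpoints for the given service id.
func (r *templateRouter) numberOfEndpoints(id string) int {
	r.lock.Lock()
	defer r.lock.Unlock()

	eps := 0
	svc, ok := r.findMatchingServiceUnit(id)
	if ok && len(svc.EndpointTable) > eps {
		eps = len(svc.EndpointTable)
	}
	return eps
}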
@knobunc Is anything more needed? PTAL
pkg/router/template/router.go (outdated)
    // serviceUnits[key] is the weight for each endpoint in the service
    // the sum of the weights of the endpoints is the scaled service weight

    var numEp int32
numEp := r.numberOfEndpoints(key)
if numEp < 1 {
	numEp = 1
}
@knobunc @rajatchopra
@openshift/networking PTAL
[test][testextended][extended: networking]
    }
    // scale the weights to near the maximum (256)
    // This improves precision when scaling for the endpoints
    var scaleWeight int32 = 256 / maxWeight
This normalization is a good idea. But it will not cover the case where maxWeight is > 128 and one of the services has a weight that is too low relative to its number of endpoints.
e.g. I want A/B testing to happen for a percentage of 1.5%, i.e. 1.5% of traffic should be served by serviceA and the rest by serviceB. So I put a weight of 3 on svcA and 200 on svcB.
All is good until I have 20 endpoints for svcA, at which point we will give the endpoints a weight of '0' (3 split across 20 endpoints truncates to 0 each).
Ideally, we should give 3 endpoints a weight of '1' and the rest become '0' (unfortunately).
I have another algorithm in mind (not perfect either). Need discussion.
// For each svc:
equitableWeight = trunc(svcWeight / numEp)
for i = 0; i < len(endpoints); i++ {
	endpoints[i].weight = equitableWeight
}
// distribute the remainder, one extra unit per endpoint
for i = 0; i < svcWeight - equitableWeight*numEp; i++ {
	endpoints[i].weight += 1
}
@rajatchopra Thanks for taking a look. The idea of normalizing was to bring the largest weight up to a 3-digit number. In this case 128 is OK; it just gives a little more headroom.
For your 1.5% example, the weight, if not 0, should have a min of 1. Thanks for spotting this. I'll fix it.
With a min of 1, the number of pods can be reduced if the overall percentage is still too high. The 20 pods could be reduced to maybe 3.
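A minimal sketch of the clamp being described; the name endpointWeight and its signature are illustrative, not taken from the PR:

// endpointWeight computes the per-endpoint weight for a service whose
// scaled weight is w and which has n live endpoints. A live endpoint is
// never given a weight below 1, so a service with many endpoints can
// draw more than its configured share of traffic.
func endpointWeight(w, n int32) int32 {
	if n < 1 {
		n = 1 // guard against division by zero when a service has no endpoints
	}
	ep := w / n
	if w > 0 && ep < 1 {
		ep = 1
	}
	return ep
}

With w = 3 and n = 20 the clamp gives every endpoint weight 1, so the service draws roughly 20 weight units instead of the intended 3; reducing the pod count toward 3 restores the intended proportion, which is the trade-off discussed here.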
@rajatchopra I do not like setting the weight to 0 for a live backend. We need to document clearly that you can send more traffic than the proportion would allow if there are too many backends.
[Extended Tests: networking]
Failed with:
@openshift/networking PTAL
      backendPath := specPath.Child("alternateBackends")
      if len(route.Spec.AlternateBackends) > 3 {
-         result = append(result, field.Required(backendPath, "cannot specify more than 3 additional backends"))
+         result = append(result, field.Required(backendPath, "cannot specify more than 3 alternate backends"))
Tiniest nit: Extra space after "alternate"
[test]
@pecameron testing failed with:
Evaluated for origin test up to 073c4b8
continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3457/) (Base Commit: 49df2d6) (PR Branch Commit: 073c4b8)
A route can front up to 4 services that handle the requests. The load balancing strategy governs which endpoint gets each request. When roundrobin is chosen, the portion of the requests that each service handles is governed by the weight assigned to the service. Each endpoint in the service gets a fraction of the service's requests.
bug 1470350 https://bugzilla.redhat.com/show_bug.cgi?id=1470350
Code change is in origin PR 15309 openshift/origin#15309
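To illustrate the proportions (the numbers are hypothetical): with two services under roundrobin, svcA at weight 20 and svcB at weight 10, svcA handles 20/30 = 2/3 of the requests; if svcA has 4 endpoints, each endpoint carries 20/4 = 5 weight units, i.e. 5/30 = 1/6 of all requests.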
/retest
/retest
/retest
Distribute requests among the route's services based on service weight. The portion of the total weight that each service has is distributed evenly among the service's endpoints.
bug 1470350 https://bugzilla.redhat.com/show_bug.cgi?id=1470350
@knobunc PTAL
@pecameron: The following test failed, say
Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest
@rajatchopra PTAL I need a /lgtm to move this forward. Thanks.
rajatchopra left a comment
/lgtm
Let's do a follow-on PR about the algorithm, though. In the case where the number of endpoints overruns the maximum manageable with weight '1', I would have liked us to ignore the rest of the endpoints rather than just warn that the A/B weighting will not be honoured.
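Spelled out with the earlier numbers: with a scaled service weight of 3 and 20 endpoints, ignoring 17 endpoints would leave 3 endpoints at weight 1, so the service still draws its intended 3 weight units; warning and keeping all 20 endpoints at the minimum weight of 1 overshoots to 20.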
/approve addresses issue #11881
/approve no-issue
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: eparis, pecameron, rajatchopra
Associated issue requirement bypassed by: eparis
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing
/retest
Automatic merge from submit-queue (batch tested with PRs 15533, 15414, 15584, 15529, 15309)
Distribute requests among the route's services based on service weight.
The portion of the total weight that each service has is distributed
evenly among the service's endpoints.
bug 1470350
https://bugzilla.redhat.com/show_bug.cgi?id=1470350