Support multiple CIDR addresses for the pod SDN#14558
Support multiple CIDR addresses for the pod SDN#14558openshift-merge-robot merged 2 commits intoopenshift:masterfrom
Conversation
Hm... Is there any reason you did it that way? It seems like it would be nicer to just have aka
Treat it the same as if it was defined the new way, but with only a single CIDR range
Yes, that sounds right. Make sure the nodes do the right thing on restart as well. |
|
@openshift/networking |
|
@danwinship the reason he did it that way is because we will shortly need to accommodate hot-resizing the subnets (or even supporting a way to have different size host subnets per range, perhaps with a way to label the subnets so different nodes can deliberately have different allocations). |
pecameron
left a comment
There was a problem hiding this comment.
Just a quick pass through. I am learning about this as I go so not certain if this is at all helpful.
pkg/cmd/cli/describe/printer.go
Outdated
| } | ||
| name := formatResourceName(opts.Kind, n.Name, opts.WithKind) | ||
| _, err := fmt.Fprintf(w, "%s\t%s\t%d\t%s\t%s\n", name, n.Network, n.HostSubnetLength, n.ServiceNetwork, n.PluginName) | ||
| _, err := fmt.Fprintf(w, "%s\t%s\t%d\t%s\t%s\n", name, strings.Join(cidrList, ","), n.HostSubnetLength, n.ServiceNetwork, n.PluginName) |
There was a problem hiding this comment.
doing them all on one line will mess up the report. I assume that people that like this feature will have a pretty fragments IP space.
There was a problem hiding this comment.
Yes, it will be hard to read when there are several cidrs. May be one row for each cidr?
for _, cidr := range n.ClusterDef {
, err := fmt.Fprintf(w, "%s\t%s\t%d\t%s\t%s\n", name, cidr. ...)
}
pkg/cmd/server/api/v1/types.go
Outdated
| NetworkPluginName string `json:"networkPluginName"` | ||
| // ClusterNetworkCIDR is the CIDR string to specify the global overlay network's L3 space | ||
| ClusterNetworkCIDR string `json:"clusterNetworkCIDR"` | ||
| // ClusterNetworkConfig is used to specify the global overlay network's L3 space |
There was a problem hiding this comment.
might mention that this is a list of cidrs that define the global overlay network's L3 space
pkg/cmd/server/api/v1/types.go
Outdated
| ClusterNetworkCIDR string `json:"clusterNetworkCIDR"` | ||
| // ClusterNetworkConfig is used to specify the global overlay network's L3 space | ||
| ClusterNetworkConfig []ClusterNetworkEntry `json:"clusterNetworkConfig"` | ||
| // HostSubnetLength is the number of bits to allocate to each host's subnet e.g. 8 would mean a /24 network on the host |
There was a problem hiding this comment.
So this is the minimum value that can be in the cidr list above. Might comment about this above.
pkg/sdn/api/validation/validation.go
Outdated
| defaultClusterNetwork = &cn | ||
| } | ||
|
|
||
| //intersect tests two CIDR addresses to see if the first contains the second |
There was a problem hiding this comment.
In addition to contains, should check for first contained in second (unless second can't be larger than first). Should we check overlap with other defined spaces (service, exportIP, ...)?
| // One poorly formated address and one well formatted | ||
| // Overlapping addresses | ||
| // Both Cluster.Network & clusterDef is defined | ||
| // |
There was a problem hiding this comment.
test out of range (> 255) in IP field.
test mask (/xx) > 32, or < min size (at least as big as HostSubnetLength)
test overlap with other well defined spaces (service ip, external ip,...)
I think overlap can be either contained within or contained by.
| cn: &api.ClusterNetwork{ | ||
| ObjectMeta: metav1.ObjectMeta{Name: "any"}, | ||
| Network: "10.20.0.0.0/16", | ||
| ClusterDef: []api.ClusterNetworkEntry{{ClusterNetworkCIDR: "10.20.0.0.0/16"}}, |
There was a problem hiding this comment.
bad network also 10.a.0.0/16, 10.256.0.0 and 0.0.0.0/33.
pkg/sdn/plugin/common.go
Outdated
| return fmt.Sprintf("%s (network: %q, hostSubnetBits: %d, serviceNetwork: %q, pluginName: %q)", n.Name, n.Network, n.HostSubnetLength, n.ServiceNetwork, n.PluginName) | ||
| } | ||
|
|
||
| //determine if the first argument contains any of the second |
There was a problem hiding this comment.
Isn't this contains the second or is contained in the second?
| func parseNetworkInfo(clusterNetwork []osapi.ClusterNetworkEntry, serviceNetwork string) (*NetworkInfo, error) { | ||
| var cn []*net.IPNet | ||
|
|
||
| for _, entry := range clusterNetwork { |
There was a problem hiding this comment.
I think you need to check each cidr with all other cidrs (except for itself). How many cidrs do we support?
There was a problem hiding this comment.
Yes, this is needed. Do we want to restrict number of cidrs that we support? If users specify single IPs or sparse cidrs, this could be huge.
There was a problem hiding this comment.
What do you mean check each cidr with all other cidrs?
There was a problem hiding this comment.
This is to verify there is no overlap between the given cidrs, otherwise pods may end up with same IPs.
pkg/sdn/plugin/common.go
Outdated
| } else if !ni.ClusterNetwork.Contains(subnetIP) { | ||
| errList = append(errList, fmt.Errorf("existing node subnet: %s is not part of cluster network: %s", subnet.Subnet, ni.ClusterNetwork.String())) | ||
| } else if _, contains := cidrListContains(ni.ClusterNetwork, subnetIP); !contains { | ||
| errList = append(errList, fmt.Errorf("existing node subnet: %s in not part of any cluster network CIDR", subnet.Subnet)) |
pkg/sdn/plugin/common.go
Outdated
| } | ||
|
|
||
| return parseNetworkInfo(cn.Network, cn.ServiceNetwork) | ||
| return parseNetworkInfo(cn.ClusterDef, cn.ServiceNetwork) |
There was a problem hiding this comment.
Do we care about externalIP and load balancer IP here?
danwinship
left a comment
There was a problem hiding this comment.
looks good as a first draft
pkg/cmd/server/api/types.go
Outdated
| ServiceNetworkCIDR string | ||
| NetworkPluginName string | ||
| ClusterNetworkCIDR string | ||
| ClusterNetworkConfig []ClusterNetworkEntry |
There was a problem hiding this comment.
I'd probably go with just ClusterNetworks.
It would be nice if we could get rid of the old ClusterNetworkCIDR field, but I'm not sure how backward-compat works with startup config objects. Old master-config.yaml files need to keep working though, so whatever that requires...
pkg/cmd/server/api/types.go
Outdated
| } | ||
|
|
||
| type ClusterNetworkEntry struct { | ||
| ClusterNetworkCIDR string |
There was a problem hiding this comment.
Maybe just CIDR? You've already got ClusterNetwork in the struct name and field name...
Also, if the plan is to eventually have per-network HostSubnetLength values, I think we should add that here now; if we don't support different values yet, you can just add code to make sure every ClusterNetworkEntry has the same value.
(Also, if we're tweaking HostSubnetLength handling, we might want to consider renaming it and flipping the math around; the fact that it counts bits from the end of the IP address rather than the start is confusing. (Eg, HostSubnetLength of 9 means each node gets a /23 subnet (32 - 9 = 23). I think it would be clearer for most users if the config option was "23" rather than "9".
There was a problem hiding this comment.
+1 on CIDR.
If you plan to do global/local HostSubnetLength field that I was suggesting, then.
type ClusterNetworkEntry struct {
ClusterNetworkCIDR string
HostSubnetLength uint32
}
| // we need to set some customer parameters, so create by hand | ||
| restrictedNetworks, err := serviceadmit.ParseSimpleCIDRRules([]string{options.NetworkConfig.ClusterNetworkCIDR, options.NetworkConfig.ServiceNetworkCIDR}) | ||
| // PLACEHOLDER HAVE TO FIGURE OUT OUT TO DO | ||
| restrictedNetworks, err := serviceadmit.ParseSimpleCIDRRules([]string{options.NetworkConfig.ServiceNetworkCIDR}) |
There was a problem hiding this comment.
You need to pass an array containing all of the clusternetwork cidr values and the servicenetwork cidr value
pkg/sdn/api/v1/types.go
Outdated
| // Network is a CIDR string specifying the global overlay network's L3 space | ||
| Network string `json:"network" protobuf:"bytes,2,opt,name=network"` | ||
| // PLACEHOLDER | ||
| ClusterDef []ClusterNetworkEntry `json:"clusterDef" protobuf:"bytes,6,rep,name=clusterDef"` |
There was a problem hiding this comment.
https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md has some information on changing kubernetes API objects, and in particular discusses the backward-compatibility snafus that come with trying to turn a single-element field into an array... (The "API Conventions" link near the top of that document is also filled with useful information about how kubernetes APIs work.)
The easiest way to make this work backward-compatibly would be to say that the "default"/first network is still specified by the Network and HostSubnetLength fields, but the 2nd to Nth networks are part of a new array field (which could be called AdditionalNetworks or something like that).
You need a doc comment above the field, and between the doc comment and the actual declaration should be
// +optional
to indicate that it's an optional field
pkg/sdn/api/v1/types.go
Outdated
|
|
||
| // PLACEHOLDER | ||
| type ClusterNetworkEntry struct { | ||
| ClusterNetworkCIDR string `json:"clusterNetworkCIDR" protobuf:"bytes,1,opt,name=clusterNetworkCIDR"` |
There was a problem hiding this comment.
same comments as in pkg/cmd/server/api/types.go above
pkg/sdn/api/types.go
Outdated
| metav1.ObjectMeta | ||
|
|
||
| Network string | ||
| ClusterDef []ClusterNetworkEntry |
There was a problem hiding this comment.
The "unversioned" ClusterNetwork type (the one in pkg/sdn/api/types.go, as opposed to pkg/sdn/api/v1/types.go) is an internal-only object, not part of the API, so it is not subject to backward-compatibility rules. So you could get rid of the old Network and HostSubnetLength fields here and only have the new array.
The one catch with this is that if the unversioned type and the v1 type aren't identical, then you'll need to add some hand-written conversion functions, but this isn't difficult. There are no examples in pkg/sdn/api but you can see examples in other API groups (eg, pkg/image/api/v1/conversion.go).
pkg/sdn/api/types.go
Outdated
| } | ||
|
|
||
| type ClusterNetworkEntry struct { | ||
| ClusterNetworkCIDR string |
There was a problem hiding this comment.
same comments as in pkg/cmd/server/api/types.go above
|
With multiple cluster network CIDR support, hostsubnet length don't need to be same for each CIDR. We may not add this support immediately but it will be nice if we can make the config compatible with the future change that is more likely to be added. How about this master config?
How should I deal with a config file that defined ClusterNetworkCIDR the old way? |
pkg/sdn/plugin/sdn_controller.go
Outdated
| //err = ioutil.WriteFile("/run/openshift-sdn/config.env", []byte(config), 0644) | ||
| //if err != nil { | ||
| // return false, err | ||
| //} |
pkg/cmd/server/start/master_args.go
Outdated
| } | ||
| clusterNetworkConfig = append(clusterNetworkConfig, clusterNetworkEntry) | ||
| } | ||
|
|
There was a problem hiding this comment.
My understanding is command line arguments is for simple/essential use cases and complex use cases need to be specified in configuration file. If we treat multiple cidr use case as complex then we don't need to support command line args.
Supporting comma separated list here makes it harder to specify different hostsubnetLength for each cidr later.
pkg/sdn/api/v1/types.go
Outdated
| // Network is a CIDR string specifying the global overlay network's L3 space | ||
| Network string `json:"network" protobuf:"bytes,2,opt,name=network"` | ||
| // PLACEHOLDER | ||
| ClusterDef []ClusterNetworkEntry `json:"clusterDef"` |
pkg/sdn/api/validation/validation.go
Outdated
|
|
||
| //intersect tests two CIDR addresses to see if the first contains the second | ||
| func intersect(cidr1, cidr2 string) bool { | ||
| _, net1, _ := net.ParseCIDR(cidr1) |
There was a problem hiding this comment.
Check the error (similar to cidr2 below)
pkg/sdn/plugin/master.go
Outdated
| } | ||
|
|
||
| func clusterDefEquals(obj, old []osapi.ClusterNetworkEntry) bool { | ||
| if obj == nil && old == nil { |
There was a problem hiding this comment.
Not needed, condition below covers this check.
5dfee72 to
5e73345
Compare
|
@danwinship @knobunc @pravisankar Should I change the math for hostSubnetLength field? Won't that How could we upgrade a config from the old math to the new math for hostsubnetlength? |
|
From the stand point of user using multiple cidr, with the current implementation user may not use a cidr because it conflicts with global HostSubnetLength (they don't have enough bits in the cidr to allocate for the pod). I feel cidr and hostSubnetLength go hand in hand. Why don't we work on supporting multiple cidr and hostsubnetLength per cidr? This will remove some of these limitations and will avoid unnecessary migration later on. @openshift/networking |
5d6c39a to
28e55c5
Compare
6010d34 to
6cfab3e
Compare
024dfea to
bb7d9b7
Compare
|
@danwinship @pravisankar is this moving in the right direction? I deprecated the global hostsubnetlength field and added one per cidr. I also changed it so only one could be defined on the cli and others need to be added to the config |
|
/lgtm |
|
/lgtm |
pkg/cmd/server/api/v1/types.go
Outdated
| HostSubnetLength uint32 `json:"hostSubnetLength"` | ||
| // DeprecatedClusterNetworkCIDR is the CIDR string to specify the global overlay network's L3 space. Deprecated, but maintained for backwards compatibility, use ClusterNetworks instead. | ||
| DeprecatedClusterNetworkCIDR string `json:"clusterNetworkCIDR,omitempty"` | ||
| // ClusterNetworks is a list of ClusterNetwork objects that defines the global overlay network's L3 space by specifying a set of CIDR and netmasks that the SDN can allocate addressed from. If this is specified, then DeprecatedClusterNetworkCIDR and DeprecatedHostSubnetLength may not be set. |
There was a problem hiding this comment.
Use the external field name - a user won't have any idea what "DeprecatedClusterNetworkCIDR" is without looking at the go code. In general, always use external field names in thees docs, no exceptions.
897a0e3 to
5f143b0
Compare
pkg/cmd/server/api/v1/types.go
Outdated
| HostSubnetLength uint32 `json:"hostSubnetLength"` | ||
| // ClusterNetworkCIDR is the CIDR string to specify the global overlay network's L3 space. Deprecated, but maintained for backwards compatibility, use ClusterNetworks instead. | ||
| DeprecatedClusterNetworkCIDR string `json:"clusterNetworkCIDR,omitempty"` | ||
| // ClusterNetworks is a list of ClusterNetwork objects that defines the global overlay network's L3 space by specifying a set of CIDR and netmasks that the SDN can allocate addressed from. If this is specified, then DeprecatedClusterNetworkCIDR and DeprecatedHostSubnetLength may not be set. |
There was a problem hiding this comment.
My comments got lost AGAIN. Basically you can't use external names here. Use the names as they would appear in JSON. Users only see this from the API and documentation, they never see the go code, so the names in Go are meaningless.
clusterNetworkCIDR and hostSubnetLength are the names they see
Chaning the Network Config section of the the master config to allow multiple CIDR addresses and hostsubnet Lengths for the allocation of nodes' address space
5f143b0 to
d1e46b9
Compare
|
@smarterclayton Ok, good to go? |
|
/test cmd |
|
API approved |
|
/lgtm @smarterclayton would you approve this please too? Thanks |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: eparis, knobunc, pravisankar The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
|
@JacobTanenbaum: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
flake #16414 /retest |
|
/retest |
|
Automatic merge from submit-queue (batch tested with PRs 14558, 16544). |
Work in progress PR for multiple CIDR address work. addresses can be defined in the master config file by
or by passing a comma seporated list of them on the command line using --network-cidr="10.128.0.0/28,12.128.0.0/28"
besides general review - I could use some feedback on how to do a few things
How should I deal with a config file that defined ClusterNetworkCIDR the old way?
In pkg/sdn/plugin/master.go there where checks to see if the cluster cidr was shrunk, I don't think that check is still required. It should be valid to break a large cluster cidr into it's smaller components. Is checking that all objects are allocated in defined places enough? As done by checkclusterobjects() in pkg/sdn/plugin/common.go
I need to change oc get clusternetwork to show a comma separated list
finish the unit testing
review of the work so far and input on the above questions are appreciated
@knobunc @dcbw @danwinship @rajatchopra