Feature: Adds reconfigure control plane #11871

Closed
Hector295 wants to merge 1 commit into kubernetes-sigs:master from Hector295:feature/reconfigure-control-plane

Conversation

@Hector295

What type of PR is this?
/kind feature

What this PR does / why we need it:

This PR introduces tasks and configurations that allow for the reconfiguration of the control plane (kube-apiserver, controller-manager, and scheduler) in Kubespray without requiring a full cluster reprovision.

Additionally, a new playbook, reconfigure-control-plane.yml, has been added. To reconfigure an existing cluster without performing an upgrade, run:

ansible-playbook -i <INVENTORY> reconfigure-control-plane.yml --skip-tags upgrade

Which issue(s) this PR fixes:

Fixes #11552

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Feature: Adds new tasks, templates, and a playbook (reconfigure-control-plane.yml) for control plane reconfiguration in Kubespray, allowing incremental updates without requiring a full cluster reprovision.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 8, 2025
@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Jan 8, 2025
@linux-foundation-easycla

CLA Not Signed

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Jan 8, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Hector295
Once this PR has been reviewed and has the lgtm label, please assign yankay for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Welcome @Hector295!

It looks like this is your first PR to kubernetes-sigs/kubespray 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/kubespray has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @Hector295. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 8, 2025
@VannTen
Contributor

VannTen commented Jan 9, 2025

Hum, could you explain the difference with `ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml` for instance?

I'm not completely closed to the idea, but IMO this would be best implemented with limit and tags / a more intelligent playbook (== we probably restart some things a bit eagerly).

@Hector295
Author

Hector295 commented Jan 9, 2025

Hi @VannTen , thank you for your comment and for taking the time to review this PR.

The command you mentioned (ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml) does not actually perform the necessary updates to the ConfigMap or the values that should be applied to the control plane pods (kube-apiserver, kube-controller-manager, kube-scheduler). The only change it makes is to the kubeadm-config.yaml file, but this does not propagate to the cluster configuration or the pods.

What I’ve implemented follows the official Kubernetes documentation, which outlines the correct process to ensure changes are properly reflected in the cluster.

Regarding your suggestion about limit, I agree that it could be a good option for scoping changes more precisely. However, I don’t believe that tags are strictly necessary in this context, as the main goal is to ensure the configuration changes are fully applied to the cluster.

Additionally, I tested this implementation by changing the value of kube_apiserver_node_port_range from 30000-32767 to 30001-32766. The results showed that with the command you mentioned, the changes were not fully reflected:

root@mistl-node-0:/etc/kubernetes# cat kubeadm-config.yaml | grep 3000
    value: "30001-32766"
root@mistl-node-0:/etc/kubernetes# kubectl get cm -n kube-system kubeadm-config -oyaml | grep 3000
        value: 30000-32767
root@mistl-node-0:/etc/kubernetes# kubectl describe pod -n kube-system kube-apiserver-mistl-node-0 | grep 3000
      --service-node-port-range=30000-32767

It’s also worth noting that the behavior of using the --config flag in kubeadm has changed. For additional context, you can refer to the following discussions:

* [Comment on PR #11352](https://github.com/kubernetes-sigs/kubespray/pull/11352#issuecomment-2210283864)
* [Comment on kubeadm issue #3084](https://github.com/kubernetes/kubeadm/issues/3084#issuecomment-2209300846)

@yankay
Member

yankay commented Jan 10, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 10, 2025
@k8s-ci-robot
Contributor

@Hector295: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kubespray-yamllint
Commit: c5055ea (link)
Required: true
Rerun command: /test pull-kubespray-yamllint

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@VannTen
Contributor

VannTen commented Jan 10, 2025

The command you mentioned (ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml) does not actually perform the necessary updates to the ConfigMap or the values that should be applied to the control plane pods (kube-apiserver, kube-controller-manager, kube-scheduler). The only change it makes is to the kubeadm-config.yaml file, but this does not propagate to the cluster configuration or the pods.

Is that recent? Because it absolutely should reconfigure the control-plane, and it does on older releases (like, I did this yesterday on 1.25 to add OIDC parameters).
In that case, that is a regression, which we should fix.

Regarding your suggestion about limit, I agree that it could be a good option for scoping changes more precisely. However, I don’t believe that tags are strictly necessary in this context, as the main goal is to ensure the configuration changes are fully applied to the cluster.

I agree that tags are not strictly necessary. I was working from the assumption that you meant to minimize cluster disruption by only changing the configuration, but if upgrade-cluster is no longer updating configuration, I see where you're coming from.

It’s also worth noting that the behavior of using the --config flag in kubeadm has changed. For additional context, you can refer to the following discussions:

* [Comment on PR #11352](https://github.com/kubernetes-sigs/kubespray/pull/11352#issuecomment-2210283864)

* [Comment on kubeadm issue #3084](https://github.com/kubernetes/kubeadm/issues/3084#issuecomment-2209300846)

I was distantly aware of this issue, but I haven't had the time to focus on this yet.

I don't think a new playbook is the answer though, for several reasons:

  • the existing workflow uses upgrade-cluster, so this would be a new thing to be aware of.
  • this adds a non-trivial maintenance overhead.

Is there a specific reason upgrade-cluster.yml can't be fixed instead?

@Hector295
Author

@VannTen the behavior you mentioned worked in 1.25, but it was not the correct approach according to kubeadm's design. The upgrade-cluster functionality was being used to reconfigure the control plane, even though its actual purpose is to manage cluster version upgrades.
Reconfiguration of the control plane should not be done as part of a Kubernetes upgrade, because upgrades include additional tasks that are not relevant to reconfiguration.
The official Kubernetes documentation suggests manually updating the static manifests of the control plane located in /etc/kubernetes/manifests/. This PR introduces the reconfigure-control-plane.yml playbook, ensuring that changes are correctly applied to the control plane pods.
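For reference, the procedure from the kubeadm reconfiguration documentation can be sketched roughly as follows. This is a minimal sketch, not the PR's actual tasks: the config path is an assumption, and kubespray's implementation may template the manifests differently.

```shell
# Sketch of manual control plane reconfiguration per the kubeadm docs.
# Assumes the updated ClusterConfiguration is already on disk at
# /etc/kubernetes/kubeadm-config.yaml (path is an assumption).

# 1. Upload the updated ClusterConfiguration to the kubeadm-config
#    ConfigMap in kube-system, so the cluster-wide copy matches.
kubeadm init phase upload-config kubeadm --config /etc/kubernetes/kubeadm-config.yaml

# 2. Regenerate the static pod manifests; kubelet watches
#    /etc/kubernetes/manifests and restarts each control plane pod
#    when its manifest file changes.
kubeadm init phase control-plane apiserver --config /etc/kubernetes/kubeadm-config.yaml
kubeadm init phase control-plane controller-manager --config /etc/kubernetes/kubeadm-config.yaml
kubeadm init phase control-plane scheduler --config /etc/kubernetes/kubeadm-config.yaml
```

Run on each control plane node in turn, this keeps the ConfigMap and the running pods in sync, which is exactly the gap shown in the kube_apiserver_node_port_range test above.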

chadswen added a commit to chadswen/kargo that referenced this pull request Mar 3, 2025
Adds revised support for:
- The previously removed `--config` argument for `kubeadm upgrade apply`
- Changes to `ClusterConfiguration` as part of the `upgrade-cluster.yml` playbook lifecycle (Fixes kubernetes-sigs#11552)
- kubeadm-config `v1beta4` `UpgradeConfiguration` for the `kubeadm upgrade apply` command: [UpgradeConfiguration v1beta4](https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta4/#kubeadm-k8s-io-v1beta4-UpgradeConfiguration).

Background:
PR kubernetes-sigs#11352 removed the --config flag from `kubeadm upgrade apply` to address the upgrade issues with kubeadm v1.30 identified in kubernetes-sigs#11350. Before this change, kubespray upgrades depended on `kubeadm upgrade apply --config=...` to make ClusterConfiguration changes with the upgrade. However, this reconfiguration was deprecated in `kubeadm upgrade apply` some time ago, and is no longer supported by the `kubeadm upgrade apply` command.

To ensure `ClusterConfiguration` changes are still applied during upgrades in a supportable way, the new solution in this PR reconfigures ClusterConfiguration separately after upgrade with distinct upload-config and control plane static pod rewrite tasks that run immediately after a successful upgrade. See [this comment from @VannTen](kubernetes-sigs#11871 (comment)) for more discussion on why the expectation is to fix reconfiguration as part of the upgrade lifecycle, as well as issue kubernetes-sigs#11552.

Additionally, kubeadm v1.31 added back support for `--config`, along with UpgradeConfiguration when using v1beta4. This PR adds support for the `UpgradeConfiguration` in the kubeadm-config file, which is required to fully implement upgrades with `kubeadm.k8s.io/v1beta4`. This addition was omitted from the original v1beta4 implementation in kubernetes-sigs#11674, but it is required to use `--config` correctly during kubeadm upgrades with v1beta4.
chadswen added a commit to chadswen/kargo that referenced this pull request Mar 3, 2025
Adds revised support for:
- The previously removed `--config` argument for `kubeadm upgrade apply`
- Changes to `ClusterConfiguration` as part of the `upgrade-cluster.yml` playbook lifecycle (Fixes kubernetes-sigs#11552)
- kubeadm-config `v1beta4` `UpgradeConfiguration` for the `kubeadm upgrade apply` command: [UpgradeConfiguration v1beta4](https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta4/#kubeadm-k8s-io-v1beta4-UpgradeConfiguration).

kubeadm upgrade apply --config support
PR kubernetes-sigs#11352 removed the --config flag from all usages of `kubeadm upgrade apply` to address the upgrade issues with kubeadm v1.30 identified in kubernetes-sigs#11350.

This PR enables support for the scenarios in which `--config` can and should still be used with `kubeadm upgrade apply`, with some version-specific handling that still avoids kubeadm v1.30's upgrade failures.

Control plane reconfiguration during upgrade
Before PR kubernetes-sigs#11352, kubespray upgrades depended on `kubeadm upgrade apply --config=...` to make ClusterConfiguration changes during a cluster upgrade. However, this reconfiguration was deprecated from `kubeadm upgrade apply` some time ago, and is [no longer supported](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#additional-information) by the `kubeadm upgrade apply` command.

To ensure `ClusterConfiguration` changes are still applied during upgrades in a supportable way, the new solution in this PR reconfigures `ClusterConfiguration` separately with distinct `kubeadm init phase upload-config kubeadm --config=...` and control plane static pod rewrite tasks that run immediately after a successful upgrade. See [this comment from @VannTen](kubernetes-sigs#11871 (comment)) for more discussion on why the expectation is to fix reconfiguration as part of the upgrade lifecycle, as well as issue kubernetes-sigs#11552. This approach is in line with kubeadm's [recommendations for cluster reconfiguration](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-reconfigure/).
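The two-step flow described above might look roughly like this (a sketch under assumed paths, not the commit's actual task list):

```shell
# After a successful 'kubeadm upgrade apply', re-upload the (possibly
# changed) ClusterConfiguration to the kubeadm-config ConfigMap...
kubeadm init phase upload-config kubeadm --config /etc/kubernetes/kubeadm-config.yaml

# ...then rewrite all control plane static pod manifests in one step,
# so kubelet restarts the pods with the new configuration.
kubeadm init phase control-plane all --config /etc/kubernetes/kubeadm-config.yaml
```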

kubeadm-config v1beta4 `UpgradeConfiguration` support
Additionally, kubeadm v1.31 added back support for `--config` while introducing support for `UpgradeConfiguration` in the kubeadm-config file, which is required to fully implement upgrades with `kubeadm.k8s.io/v1beta4`. UpgradeConfiguration was not added in kubespray's initial v1beta4 implementation (PR kubernetes-sigs#11674), but it is required to use `--config` correctly during kubeadm upgrades with v1beta4.

This PR uses UpgradeConfiguration for v1beta4 kubeadm upgrades, while still retaining support for v1beta3.
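For context, a v1beta4 `UpgradeConfiguration` fragment might look like the following. Field names come from the published kubeadm-config v1beta4 API; the values are illustrative, and the exact fields kubespray templates are not shown here.

```yaml
# Illustrative kubeadm-config fragment, not kubespray's generated file.
apiVersion: kubeadm.k8s.io/v1beta4
kind: UpgradeConfiguration
apply:
  # Whether kubeadm renews certificates during 'kubeadm upgrade apply'
  certificateRenewal: true
  # Whether to upgrade etcd along with the control plane
  etcdUpgrade: true
```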
chadswen added a commit to chadswen/kargo that referenced this pull request Mar 3, 2025
@chadswen
Member

chadswen commented Mar 3, 2025

Is there a specific reason upgrade-cluster.yml can't be fixed instead ?

@VannTen Over the past week I have been testing a solution that applies reconfiguration in our upgrade tasks, adds support for --config and introduces the new UpgradeConfiguration kind. I just submitted these kubeadm upgrade fixes in PR #12015.

Now that we have #12015 to fix reconfiguration during upgrades, I believe the scope of this PR does not need to satisfy the upgrade requirements.

If the community feels there is value in introducing standalone reconfiguration playbooks that can be used outside of upgrades, we could definitely still consider this PR within that scope. That said, historically reconfiguration has been managed by rerunning cluster.yml, or gracefully during upgrade with upgrade-cluster.yml.

@VannTen
Contributor

VannTen commented Mar 31, 2025

Superseded by #12015


Development

Successfully merging this pull request may close these issues.

upgrade: variables modified in kubeadm-config.yaml are not reflected in static manifests anymore

5 participants