feat: support of PodDisruptionBudgets#6864
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @SuperQ. Thanks for your PR. I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6864 +/- ##
=======================================
Coverage 37.97% 37.97%
=======================================
Files 393 393
Lines 22605 22605
=======================================
Hits 8584 8584
Misses 14021 14021
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
It seems not reasonable to add |
|
Sure, the PDB is pretty simple / standard. apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: test-tikv
namespace: test
spec:
maxUnavailable: 1
selector:
matchLabels:
app.kubernetes.io/component: tikv
app.kubernetes.io/instance: testMy understanding is that the PodDisruptionBudget acts on the Pod resource and looks for the Pod's owner. So, this example: apiVersion: v1
kind: Pod
metadata:
name: test-tikv-326vs
namespace: test
labels:
app.kubernetes.io/component: tikv
app.kubernetes.io/instance: test
...
ownerReferences:
- apiVersion: core.pingcap.com/v1alpha1
blockOwnerDeletion: true
controller: true
kind: TiKV
name: test-tikv
uid: ...So when Kubernetes tries to gather all the Pods matching the label spec it looks at the owner references and finds that TiKV is the owner, no mention of TiKVGroup. So the way things are now, the PDB can't work. |
|
https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors It seems that Anyway, this PR cannot fix the issue. |
|
Oof, that's an unfortunate restriction. So the only real option is that the TiDB Operator will need to create pods via an indirect method, for example each Pod needs to have an individual StatefulSet. |
|
If you would like, I can file a tracking issue detailing the need here. |
|
Now operator manages |
|
What we can do now is only adding the leader eviction scheduler to one pod even if there are more than one pod is deleting. |
|
What needs to be done is have an intermediate Pod controller like this: Happy to help implement this. |
It's useless. PDB needs to get replicas from pod's owner. |
|
It does work, I have filed #6901 with some details on how multiple StatefulSets can be combined into a single PDB. In this new design, the Pod's owner is the StatefulSet, which supports maxUnavailable. This is how and why we implemented this in the valkey-operator. |
|
Now that we have the details we need in the issue, I am going to close this PR. |
|
I have verified the PDB in my own env. There are still some issues of this PR:
|
38939e1 to
70720b8
Compare
|
Updated. I think we can probably remove the Scale subresource from the |
|
@SuperQ I left some comments.
The scale of group(e.g. TiKVGroup) is for HPA and VPA. It has been used. The scale of instance(e.g. TiKV) is only for compatibility of PDB |
In order for PodDisruptionBudgets to work on various groups of Pods
we need to implement the Scale subresource at the direct owner of
the Pod.
Otherwise the PDB will return errors like:
```yaml
status:
conditions:
- lastTransitionTime: "2026-04..."
message: tikvs.core.pingcap.com does not implement the scale subresource
reason: SyncFailed
status: "False"
type: DisruptionAllowed
```
Followup to pingcap#6213
Signed-off-by: SuperQ <superq@gmail.com>
70720b8 to
29edafd
Compare
| // +kubebuilder:validation:Maximum=1 | ||
| // +default:value=1 | ||
| // +optional | ||
| Replicas *int32 `json:"replicas"` |
There was a problem hiding this comment.
Add a comment about this field. "It's only designed for compatibility with PDB and cannot be changed"
| // +kubebuilder:validation:Maximum=1 | ||
| // +default:value=1 | ||
| // +optional | ||
| Replicas *int32 `json:"replicas"` |
|
Still two comments are left. Thanks @SuperQ |
|
Should we also add a kubebuilder validation like: |
|
Bah! I didn't mean to close this PR. |
|
Opened #6912 |
In order for PodDisruptionBudgets to work on various groups of Pods we need to implement the Scale subresource at the direct owner of the Pod.
Otherwise the PDB will return errors like:
Followup to #6213
Closes: #6901
Closes: #4221