Test reapply of pods for the same resource claim #140
Conversation
|
On a side note, I'm not a huge fan of declaring new devices for each test like this. Should we clean up the dummy devices in between tests? |
|
Thanks for looking into this, @michaelasp. The current test is failing at line 266 (line 266 in d372c0e), which seems to be the first pod creation.
Aren't we trying to repro failure after the delete and recreate? Is this perhaps a different failure of our e2e repro? |
|
Hmm, strange, it didn't fail at that point locally for me. Let me trigger a rerun, but it may be because I commented out the other tests for a faster run. |
Yes I completely agree. Some of this work is actually called out in #137 |
|
Still failing at 266. Could it be that using the same pod labels as the previous test has some lingering impact? |
I thought delete waited for the resources to go away, but to be safe I switched to different labels and a different deployment name in case that was the issue. We may need to add some delay between tests to let things settle, or figure out what exactly is causing these intermittent failures. |
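One way to rule out leftovers from a previous test, instead of adding a fixed delay, is to block until the old pods are really gone. This is only a minimal sketch: `wait_for_pods_gone` is a hypothetical helper name, and the selector and timeout are placeholders rather than values from this PR.

```shell
# Hypothetical helper: block until no pod matching the selector remains.
# The selector and timeout are placeholders, not values from the PR.
wait_for_pods_gone() {
  local selector="$1"
  local timeout="${2:-60s}"
  # "kubectl wait --for=delete" returns once the matching pods are deleted,
  # or fails when the timeout expires.
  kubectl wait --for=delete pod -l "$selector" --timeout="$timeout"
}
```

A test could call something like `wait_for_pods_gone "app=reapply-test"` right after deleting the deployment and before creating the next one.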
|
Now it's repro'd |
|
Thanks for the repro, @michaelasp. Checking the logs, the issue stems from the fact that we don't (and most likely cannot properly) recreate the exact dummy interface after it is assigned to the first pod and then moved back to the root namespace.
While there is definitely scope for improvement in our code, in this scenario I'd say this is a byproduct of using a dummy interface, and our tests likely need to take care of reconfiguring the address after the dummy interface is recycled from Pod 1. |
|
The reason the addresses are not brought back is that they can cause conflicts in the host namespace. I can imagine that people may add different addresses inside the pod, but this is a theory; we need more feedback on this. Usually host interfaces must be managed by systemd when they come back to the root namespace: https://gist.github.com/aojea/e5787586a08313df51234e4d0c147df1 In the meantime, I think DHCP should be opt-in (#143) and we can revisit later. Regarding the e2e test, absolutely: bats also has setup() and teardown() hooks that allow setting up the interface per test, so the tests don't each have to declare a new one. Those are things I did out of laziness that need to be fixed. |
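For illustration, the per-test hooks could look roughly like this in a bats file. It is a sketch under assumptions: the interface name and address are placeholders rather than the values used in the current tests, and the hooks need root to run the `ip` commands.

```shell
# Placeholder interface name and address; not the values used in the tests.
DUMMY_IFACE="${DUMMY_IFACE:-dummy0}"
DUMMY_ADDR="${DUMMY_ADDR:-169.254.169.13/32}"

# bats calls setup() before each test: create a fresh dummy interface,
# assign the address, and bring the link up.
setup() {
  ip link add "$DUMMY_IFACE" type dummy
  ip addr add "$DUMMY_ADDR" dev "$DUMMY_IFACE"
  ip link set "$DUMMY_IFACE" up
}

# bats calls teardown() after each test: delete the interface if it is
# in the root namespace, and ignore the error if it is not there.
teardown() {
  ip link del "$DUMMY_IFACE" 2>/dev/null || true
}
```

With hooks like these, every test starts from the same interface state and no longer needs to declare a device of its own.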
|
Ah, thanks for the RCA, @gauravkghildiyal. I think #143 helps with the long delay, but we still need to reconfigure the IP once the interface goes back into the host namespace. I'll add that for this test then. Let's discuss more on what this behavior should be. |
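A rough sketch of what that reconfiguration step might look like in the test. `restore_addr` is a hypothetical helper name, the arguments are placeholders, and it assumes the interface is already back in the root namespace when it is called.

```shell
# Hypothetical helper: re-add an address to an interface that came back to
# the root namespace without it. Both arguments are placeholders.
restore_addr() {
  local iface="$1" addr="$2"
  # "ip -o addr show" prints one line per address; do nothing if the
  # address is already configured on the interface.
  if ip -o addr show dev "$iface" | grep -qF " $addr "; then
    return 0
  fi
  ip addr add "$addr" dev "$iface"
  ip link set "$iface" up
}
```

The test would call this between deleting the first deployment and reapplying it, for example `restore_addr dummy0 169.254.169.13/32`.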
0eb61d9 to 1d9e4a9
|
Thanks! The test looks good to me. We can merge if this is no longer a WIP. |
Replicates an issue seen with reapplying the same deployment for a resource