Warn in jobby list if there are failed pods to a Kueue workload
#89
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
+ Coverage   56.06%   56.42%   +0.35%
==========================================
  Files          61       61
  Lines        3018     2997      -21
==========================================
- Hits         1692     1691       -1
+ Misses       1326     1306      -20
Summary: I list all pods for a workload; any failed pod associated with it will have […]. Technically, this is portable across job types, in the sense that any failed pod associated with a cluster resource should trigger an alarm (if, for example, the KubeRay operator were to fail, we would get a notification here as well).

This requires a working connection to the k8s API server, so the test fails in CI unless that call is mocked away. (I'm not sure why that is, but maybe you can chime in here.)
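To make the check described above concrete, here is a minimal sketch, not the code from this PR. It assumes the pods of a workload have already been listed and are available as plain dicts shaped like pod manifests. The function names `failed_pod_names` and `workload_warning` are hypothetical; the only Kubernetes fact relied on is that a failed pod reports `status.phase == "Failed"`.

```python
from __future__ import annotations

from typing import Any, Mapping, Optional, Sequence


def failed_pod_names(pods: Sequence[Mapping[str, Any]]) -> list[str]:
    """Return the names of all pods whose phase is "Failed"."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod.get("status", {}).get("phase") == "Failed"
    ]


def workload_warning(
    pods: Sequence[Mapping[str, Any]], job_failed: bool
) -> Optional[str]:
    """Build a warning string for the job list, or None if no warning applies.

    Hypothetical helper: no warning is produced when the job itself has
    already failed, since the job can then be inspected directly.
    """
    if job_failed:
        return None
    failed = failed_pod_names(pods)
    if not failed:
        return None
    return "failed pods: " + ", ".join(failed)
```

Because the helper only looks at pod phases, it does not depend on the job type, which matches the portability argument in the comment above.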
I'm not surprised it needs a mock, since the endpoint accesses the […]. In the long run, we would benefit from introducing a few test fixtures that produce mock workloads, so that we don't repeat that logic across the individual tests.
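One way such a shared helper could look is sketched below. This is an illustration only: the shape of a "workload" here (a `name` and a `pods` list) is an assumption and not the project's actual model. The pod objects mimic the attribute-style access (`pod.metadata.name`, `pod.status.phase`) of the Kubernetes Python client.

```python
from __future__ import annotations

from types import SimpleNamespace


def make_mock_pod(name: str, phase: str = "Running") -> SimpleNamespace:
    """Build a stand-in pod exposing only metadata.name and status.phase."""
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name),
        status=SimpleNamespace(phase=phase),
    )


def make_mock_workload(name: str, pod_phases: list[str]) -> SimpleNamespace:
    """Build a hypothetical workload stand-in with one pod per given phase."""
    pods = [
        make_mock_pod(f"{name}-pod-{i}", phase)
        for i, phase in enumerate(pod_phases)
    ]
    return SimpleNamespace(name=name, pods=pods)
```

In a pytest suite, a factory like this could be wrapped in a function decorated with `@pytest.fixture`, so that individual tests request a mock workload instead of rebuilding one.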
Added a few small assertions to the E2E test case, but everything looks good!
As per title. We only warn if the job hasn't already failed; if it has, you can inspect the job directly to debug.
Addresses the final point for #86, at least for Kueue jobs.