I've been dealing with a common problem at work: countless abandoned CronJobs in our Kubernetes clusters that no one dares delete for fear of breaking something in production. To address this, I built a CLI tool called Zombie Hunter. It scans a cluster for CronJobs that haven't completed successfully within a configurable number of days, and assigns each one a confidence score to help you judge whether it is truly abandoned or just runs infrequently.
**Key Features:**
- Scans across all namespaces for CronJobs
- Analyzes job execution history
- Computes a confidence score for each CronJob (ranging from 20% to 99%)
- Can export results in table, CSV, or JSON formats
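For anyone curious what the core "no success in N days" check might look like, here is a minimal sketch. It is not the actual Zombie Hunter code: the `cronJobInfo` type, the helper names, and the sample data are all hypothetical, and it assumes the last-success timestamp has already been read from the cluster (for example from the CronJob's `status.lastSuccessfulTime`). Treating never-succeeded jobs as stale is also an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// cronJobInfo is a hypothetical, trimmed-down view of a CronJob.
// LastSuccess is nil when the job has never completed successfully.
type cronJobInfo struct {
	Namespace   string
	Name        string
	LastSuccess *time.Time
}

// daysSinceSuccess returns the whole days elapsed since the last success.
// The boolean is false when the job has never succeeded.
func daysSinceSuccess(j cronJobInfo, now time.Time) (int, bool) {
	if j.LastSuccess == nil {
		return 0, false
	}
	return int(now.Sub(*j.LastSuccess).Hours() / 24), true
}

// staleJobs keeps the jobs with no success in the last thresholdDays days.
// Assumption: jobs that never succeeded are reported as well.
func staleJobs(jobs []cronJobInfo, thresholdDays int, now time.Time) []cronJobInfo {
	var out []cronJobInfo
	for _, j := range jobs {
		days, ok := daysSinceSuccess(j, now)
		if !ok || days >= thresholdDays {
			out = append(out, j)
		}
	}
	return out
}

// runExample applies a 30-day threshold to made-up sample data.
func runExample() []string {
	now := time.Date(2024, 6, 1, 0, 0, 0, 0, time.UTC)
	recent := now.AddDate(0, 0, -3)
	old := now.AddDate(0, 0, -120)
	jobs := []cronJobInfo{
		{Namespace: "billing", Name: "nightly-report", LastSuccess: &recent},
		{Namespace: "legacy", Name: "db-cleanup", LastSuccess: &old},
		{Namespace: "tools", Name: "never-ran", LastSuccess: nil},
	}
	var names []string
	for _, j := range staleJobs(jobs, 30, now) {
		names = append(names, j.Namespace+"/"+j.Name)
	}
	return names
}

func main() {
	for _, n := range runExample() {
		fmt.Println(n)
	}
}
```

With the sample data, this prints `legacy/db-cleanup` and `tools/never-ran`; the job that succeeded three days ago is left out.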
This is my first open-source project (currently in v0.1), so I would love some feedback:
- How useful do you find this tool?
- What features do you think I should implement to make it production-ready?
- Are there any bugs or specific edge cases I might have overlooked?
You can check it out on GitHub [here](https://github.com/rrdesai64/zombie-hunter). It's MIT licensed and contributions are welcome! Thanks for your time!
1 Answer
What criteria did you use for the confidence scores? It sounds intriguing!

Great question! The scoring relies on several factors:
**Primary factor: Time since last successful execution**
- Inactive for 365+ days → 99% confidence
- 180 to 364 days → 95%
- 90 to 179 days → 85%
- 60 to 89 days → 75%
- 30 to 59 days → 60%
- Under 30 days → 40%
**Secondary factors that override the time-based score:**
- Suspended jobs drop to 20% confidence (intentional pause)
- If all past jobs failed, it jumps to 95% (clearly broken)
- Jobs that have never run get 50% confidence (it's unclear if they are intentional or abandoned)
Here's the basic logic I used:
```go
// calculateConfidence returns a 0-100 score for how likely a CronJob is abandoned.
// The special cases are checked first and take precedence over the time-based tiers.
func calculateConfidence(daysSince, total, failed int, suspended bool) int {
	if suspended {
		return 20 // Intentionally paused
	}
	if total == 0 {
		return 50 // Never ran - could be new or abandoned
	}
	if failed == total {
		return 95 // All jobs failed - clearly broken
	}

	// Time-based scoring
	switch {
	case daysSince >= 365:
		return 99
	case daysSince >= 180:
		return 95
	case daysSince >= 90:
		return 85
	case daysSince >= 60:
		return 75
	case daysSince >= 30:
		return 60
	default:
		return 40 // Succeeded within the last 30 days
	}
}
```
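To make the precedence of the checks easier to see, here is a small table-driven run. It repeats the function above (same behavior, written compactly) so the example compiles by itself; the sample inputs are made up for illustration.

```go
package main

import "fmt"

// calculateConfidence mirrors the function shown above.
func calculateConfidence(daysSince, total, failed int, suspended bool) int {
	switch {
	case suspended:
		return 20 // Intentionally paused
	case total == 0:
		return 50 // Never ran
	case failed == total:
		return 95 // All jobs failed
	case daysSince >= 365:
		return 99
	case daysSince >= 180:
		return 95
	case daysSince >= 90:
		return 85
	case daysSince >= 60:
		return 75
	case daysSince >= 30:
		return 60
	default:
		return 40
	}
}

func main() {
	// Hypothetical inputs, one per branch of interest.
	cases := []struct {
		name                     string
		daysSince, total, failed int
		suspended                bool
	}{
		{"suspended, idle 400 days", 400, 10, 0, true},
		{"never ran", 0, 0, 0, false},
		{"all 5 runs failed", 10, 5, 5, false},
		{"idle 200 days", 200, 12, 1, false},
		{"idle 45 days", 45, 30, 2, false},
		{"succeeded 3 days ago", 3, 100, 0, false},
	}
	for _, c := range cases {
		score := calculateConfidence(c.daysSince, c.total, c.failed, c.suspended)
		fmt.Printf("%-26s -> %d%%\n", c.name, score)
	}
}
```

The scores come out as 20, 50, 95, 95, 60, and 40, in that order. Note that a suspended job scores 20% even after 400 idle days, because the suspended check runs before the time-based tiers.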
I'm planning to add more features in future versions, like pattern detection and resource usage analysis. Does this scoring approach make sense to you? Let me know if there are any edge cases I might've missed!