Detecting and Diagnosing Problems When z/OS Thinks It Is OK (includes PFA User Experience)
Project and Program: MVS, MVS Core Technologies
Tags: Proceedings, 2012, SHARE in Atlanta 2012
The presenters will discuss the capabilities available on z/OS to detect and diagnose soft failures:
- Describe soft failure detection
  - Built into z/OS components, such as XCF stalled member detection
  - Provided by health checks
  - Provided by z/OS Predictive Failure Analysis (PFA)
  - Provided by other vendor products
- Highlight the kinds of problems each type of soft failure detection is good at and not good at
  - Machine time scale vs human time scale
  - Location in the stack
  - Detectable by performance metrics vs non-performance metrics
- Insights from building PFA to help reduce the impact of soft failures
  - Automation of alerts is key
  - z/OS can survive or recover from most soft failures
  - Most metrics are very time sensitive
Robert Abrams, IBM Corporation; Sam Knutson, GEICO