Storage overlays, also known as storage violations, are among the toughest problems to debug. A bad mainframe storage overlay can corrupt data, wreck a CICS region or force an unscheduled initial program load.
CICS is particularly vulnerable to overlays because its structure colocates application and system memory in one address space, effectively short-circuiting z/OS’ storage protection keys. To address this problem, IBM introduced three facilities, including storage protection. In this tip, I’ll explain what storage protection is and how to use it to avoid storage violations.
Mainframe storage protection
With storage protection, CICS uses z/OS storage key protection by putting system and user data in different keys. This arrangement allows CICS system code to read and write both user and system storage. Application programs, by contrast, can read but not update CICS storage while having free rein over user storage. When storage protection is properly implemented, an application's attempt to overlay system storage results in a protection exception (0C4) instead of bringing CICS down.
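The access rules in the paragraph above can be sketched as a small model. This is only an illustration of the rule set, not how CICS or the hardware implements it; the names `Key`, `ProtectionException`, `check_access` and `can_access` are hypothetical.

```python
from enum import Enum


class Key(Enum):
    """The two storage keys the article distinguishes."""
    CICS = "CICS"
    USER = "USER"


class ProtectionException(Exception):
    """Stands in for the 0C4 that an illegal store would raise."""


def check_access(exec_key: Key, storage_key: Key, write: bool) -> None:
    """Raise ProtectionException if this model would refuse the access."""
    # Reads are allowed from either key in this model.
    if not write:
        return
    # Code running in CICS key may update both CICS-key and user-key storage.
    if exec_key is Key.CICS:
        return
    # Code running in user key may update user-key storage only.
    if storage_key is Key.CICS:
        raise ProtectionException("0C4: user-key write to CICS-key storage")


def can_access(exec_key: Key, storage_key: Key, write: bool) -> bool:
    """Boolean wrapper around check_access for quick experiments."""
    try:
        check_access(exec_key, storage_key, write)
    except ProtectionException:
        return False
    return True
```

Calling `can_access(Key.USER, Key.CICS, write=True)` is the one combination the model refuses, which mirrors the application overlay of system storage described above.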
You can enable storage protection for a region by setting the system initialization table (SIT) parameter STGPROT to YES. The parameter is not dynamic, so CICS must be restarted to switch states.
When storage protection is active, a few transaction and program attributes become important. For a transaction, the TASKDATAKEY attribute specifies which key CICS uses to get storage on behalf of the transaction, including the EXEC interface block (EIB) and the transaction work area (TWA).
The program attribute EXECKEY tells CICS which storage key to use when it enters the program.
Storage protection and the above resource attributes make things a little more complicated. For instance, a transaction with TASKDATAKEY of CICS can’t call a program with an EXECKEY of USER because the program will not be able to write to task storage. Fortunately, CICS can see this coming and will kill the transaction with ABEND code AEZD. In addition, programs that call other routines without going through CICS or Language Environment, as in a statically linked module, may pass control to the target program in the wrong key.
Note that neither TASKDATAKEY nor EXECKEY has any effect if STGPROT is set to NO.
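As a rough illustration of how these attributes interact, the sketch below encodes only the rules stated above: the attributes are ignored when STGPROT is NO, and a TASKDATAKEY of CICS combined with an EXECKEY of USER ends in AEZD. The function name `expected_abend` is hypothetical, and the real checks CICS performs are more involved than this.

```python
from typing import Optional

VALID_KEYS = {"CICS", "USER"}


def expected_abend(stgprot: bool, taskdatakey: str, execkey: str) -> Optional[str]:
    """Return "AEZD" for the incompatible combination, otherwise None."""
    if taskdatakey not in VALID_KEYS or execkey not in VALID_KEYS:
        raise ValueError("keys must be 'CICS' or 'USER'")
    # With storage protection off, neither attribute has any effect.
    if not stgprot:
        return None
    # A user-key program cannot write to task storage obtained in CICS key.
    if taskdatakey == "CICS" and execkey == "USER":
        return "AEZD"
    return None
```

A check like this could be run over an extract of transaction and program definitions before turning storage protection on, so that mismatched pairs are found on paper rather than in a dump.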
Setting the CICS parameters is the easy part. The bigger challenge is finding and fixing application programs that don’t work with mainframe storage protection. There are a few things to look for that are automatically suspect:
- Programs that modify CICS data areas will not get very far under storage protection. A good source-scanning tool and a list of CICS data areas will help you find them quickly.
Once identified, the programs can be changed to use CICS’ extensive System Programming Interface (SPI). If there isn’t an SPI command, CICS exit or alternate process available, someone at the shop needs to decide whether the functionality is still needed.
- Programs that call the operating system services for memory management, resource serialization or event synchronization pose a risk as some of these services may get or use storage in an incorrect key. While there is some system-level code that will always need operating system services, the majority of the calls should be changed to use CICS’ Application Programming Interface.
Note that these culprits are most likely to be Assembler programs. Runtime libraries for higher-level languages are usually smart enough to call CICS services when they know they’re in the online environment.
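A source scan for the second kind of suspect might look something like the sketch below. It reads Assembler source and reports statements whose operation field is a z/OS macro for memory management, serialization or event synchronization. The macro list is illustrative rather than complete, the function names are hypothetical, and the parsing is deliberately naive: it ignores continuation lines and macro-generated calls.

```python
from typing import List, Optional, Tuple

# Illustrative, not exhaustive: z/OS macros for storage management,
# resource serialization and event synchronization.
SUSPECT_MACROS = {"GETMAIN", "FREEMAIN", "STORAGE", "ENQ", "DEQ", "WAIT", "POST"}


def opcode_of(line: str) -> Optional[str]:
    """Return the operation field of an Assembler statement, if any."""
    # Skip blank lines and full-line comments (asterisk in column 1).
    if not line.strip() or line.startswith("*"):
        return None
    tokens = line.split()
    # A statement that starts in column 1 carries a label before the opcode.
    index = 0 if line[0].isspace() else 1
    if index >= len(tokens):
        return None
    return tokens[index].upper()


def find_suspect_calls(source: str) -> List[Tuple[int, str]]:
    """List (line number, macro) pairs for direct operating system calls."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        opcode = opcode_of(line)
        if opcode in SUSPECT_MACROS:
            hits.append((number, opcode))
    return hits
```

Because the check looks at the operation field only, a line such as `EXEC CICS GETMAIN` is not flagged: its opcode is `EXEC`, so the request already goes through CICS.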
However, for most other programs, the brute force method applies. This entails turning on storage protection, running an application, getting a dump, fixing the code and trying again. While this shakes out some of the more salient problems, true storage violations may not appear except during a stress test or a production workload.
Implementations and limitations
Shops may adopt several strategies for implementing storage protection. One option is to leave storage protection off in the early development and testing phases. Then, as programmers promote modules and code quality improves, testing moves into regions where storage protection is enabled to catch the remaining errors. In an ideal world, storage protection could be turned off in production on the assumption that all overlays were squelched in development. However, production will bring out errors programmers never imagined, and most systems programmers would want to afford production regions some measure of protection.
Also note that mainframe storage protection does not catch all overlays. One transaction running in user key can overlay any other transaction’s user key storage. In addition, a program can overlay itself or any other application program’s code.
To fix these shortcomings, CICS has two additional mainframe storage protection facilities:
- Re-entrant program protection loads re-entrant programs into read-only storage, similar to the z/OS link pack area. This is not as simple as it sounds, since programs that modify themselves or a static storage area will fail once they are loaded into read-only storage. Depending on program behavior, re-entrant program protection may be harder to implement than storage protection.
- Transaction isolation divides storage for tasks into “subspaces” that mimic z/OS address spaces. The setup ensures each task has its own subspace and cannot write into another task’s subspace. However, transaction isolation comes with a lot of CPU overhead, as CICS has to manage subspace switches. There is additional memory overhead as subspaces have storage boundary requirements that lead to fragmentation.
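To summarize which overlays each facility is meant to stop, here is a small sketch built only from the descriptions in this tip. It models a user-key application writing to four kinds of storage. The class and function names are hypothetical, and treating the three facilities as independent switches is a simplification made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facilities:
    """Which of the three protection facilities are switched on."""
    storage_protection: bool = False
    reentrant_program_protection: bool = False
    transaction_isolation: bool = False


def overlay_is_blocked(target: str, facilities: Facilities) -> bool:
    """Say whether a user-key program's write to the target is stopped."""
    if target == "cics_storage":
        # Storage protection keeps user-key code out of CICS-key storage.
        return facilities.storage_protection
    if target == "reentrant_program_code":
        # Re-entrant programs sit in read-only storage when protected.
        return facilities.reentrant_program_protection
    if target == "other_task_storage":
        # Each task gets its own subspace under transaction isolation.
        return facilities.transaction_isolation
    if target == "own_task_storage":
        # A task can always write to its own user storage.
        return False
    raise ValueError(f"unknown target: {target}")
```

With only `storage_protection` set, the model still lets one task overlay another task's storage or a program's code, which is the gap the two additional facilities are there to close.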
About the expert: Robert Crawford has been a systems programmer for 29 years. While specializing in CICS technical support he has also worked with VSAM, DB2, IMS and assorted other mainframe products. He has programmed in Assembler, Rexx, C, C++, PL/1 and COBOL. The latest phase in his career finds him an operations architect responsible for establishing mainframe strategy and direction for a large Insurance company. He lives and works with his family in south Texas.