In its recent announcement, IBM touted the zEC12 as the first commercially available machine to support transactional memory, which it implements through the Transactional Execution Facility (TEF).
Transactional memory glass half full
Normal mainframe serialization relies on enqueues, latches and locks. At the machine-code level, Z systems offer instructions such as compare and swap. Besides being complex, these mechanisms share a couple of undesirable side effects. A program that fails to get ownership of a resource usually waits, which can lead to elongated response times and cascading hangs. If programmers aren't careful, the traditional methods are also prone to deadly embraces that permanently block at least two processes until one is canceled.
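For comparison with what follows, the compare-and-swap style of update can be sketched as a retry loop. This is only a software model in Python, not mainframe code: the `Word` class, its lock-emulated `compare_and_swap` method, and the `hammer` driver are hypothetical names invented for illustration.

```python
import threading


class Word:
    """A storage word with an emulated atomic compare-and-swap (hypothetical helper)."""

    def __init__(self, value=0):
        self.value = value
        self._guard = threading.Lock()  # stands in for the hardware's atomicity

    def compare_and_swap(self, expected, new):
        """Store `new` only if the word still holds `expected`; report success."""
        with self._guard:
            if self.value != expected:
                return False
            self.value = new
            return True


def cs_increment(word):
    """Fetch, compute, swap, and retry if another thread got in first."""
    retries = 0
    while True:
        old = word.value
        if word.compare_and_swap(old, old + 1):
            return retries
        retries += 1


def hammer(word, threads=4, per_thread=1000):
    """Have several threads increment the same word concurrently."""

    def work():
        for _ in range(per_thread):
            cs_increment(word)

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return word.value
```

No thread ever waits for a lock it might hold for long, but every updater has to carry its own retry logic, and the technique only covers a single word at a time.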
In contrast, TEF employs an "optimistic" attitude that assumes it can proceed without conflicts. If the optimism is misplaced, the hardware rolls the transaction back and lets the program decide what to do next.
The zEC12 added two special machine instructions to mark the beginning and end of transactions. In between these instructions, a program can load and store from memory and alter registers. However, all the changes are provisional and uncommitted until the process ends the transaction without encountering a conflict. If a conflict arises -- for instance if another CPU changes a memory location referenced by the transactional process -- the hardware aborts the transaction, discards its changes to memory, and the whole thing has to start over. Note that programs using TEF must conform to a specific entry logic: when a conflict aborts the transaction, the CPU resumes at the instruction immediately following the beginning of the transaction with a non-zero condition code (CC), so that instruction has to test the CC.
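The provisional-until-commit behavior can be illustrated with a toy software model. Everything here (`Memory`, `Transaction`, `Conflict`, the per-location version counters) is an assumption made for illustration and says nothing about how the zEC12 hardware is built. In particular, the model only checks for conflicts when the transaction ends, whereas the machine aborts as soon as it detects one.

```python
class Conflict(Exception):
    """Raised when a location used by the transaction changed underneath it."""


class Memory:
    """Shared storage that keeps a version number per location."""

    def __init__(self):
        self.cells = {}      # address -> value
        self.versions = {}   # address -> number of committed stores

    def load(self, addr):
        return self.cells.get(addr, 0)

    def store(self, addr, value):
        self.cells[addr] = value
        self.versions[addr] = self.versions.get(addr, 0) + 1


class Transaction:
    """Buffers stores and remembers the version of every location it touched."""

    def __init__(self, memory):
        self.memory = memory
        self.seen = {}      # address -> version observed on first access
        self.pending = {}   # address -> provisional value

    def _track(self, addr):
        self.seen.setdefault(addr, self.memory.versions.get(addr, 0))

    def load(self, addr):
        self._track(addr)
        if addr in self.pending:
            return self.pending[addr]
        return self.memory.load(addr)

    def store(self, addr, value):
        self._track(addr)
        self.pending[addr] = value

    def end(self):
        """Commit every buffered store, or discard them all on a conflict."""
        for addr, version in self.seen.items():
            if self.memory.versions.get(addr, 0) != version:
                self.pending.clear()
                raise Conflict(addr)
        for addr, value in self.pending.items():
            self.memory.store(addr, value)
```

A transaction that loads a counter, stores the incremented value and calls `end()` leaves shared memory untouched until `end()` succeeds. If another party stores to the counter in the meantime, `end()` raises `Conflict` and the buffered store is thrown away.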
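In addition, the zEC12 supports constrained transactions. Constrained transactions are tightly restricted in what they can do (a short run of instructions touching a small amount of storage), but in exchange the CPU assures that a transaction staying within those restrictions will eventually complete.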
Non-constrained transactions can be nested up to a model-dependent depth. Ending an inner transaction only reduces the nesting depth; the processor commits the changes when the outermost transaction ends. Likewise, if a conflict arises, the abort goes all the way back to the outermost transaction.
Programmers may use the ETND instruction to get the transaction nesting depth. If the CPU isn't in transactional mode, the nesting depth is zero.
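The bookkeeping for nesting can be sketched in a few lines. This is a hypothetical model (the `NestedTx` class and its dictionary "storage" are invented for illustration); it tracks only the depth and the roll-back to the outermost level, and does not try to model when stores become visible to other CPUs.

```python
class NestedTx:
    """Tracks nesting depth and rolls back to the outermost begin on abort."""

    def __init__(self, storage):
        self.storage = storage   # dict standing in for memory
        self.depth = 0
        self._snapshot = None

    def tbegin(self):
        if self.depth == 0:
            # Only the outermost begin records a restore point.
            self._snapshot = dict(self.storage)
        self.depth += 1

    def tend(self):
        if self.depth == 0:
            raise RuntimeError("not in a transaction")
        self.depth -= 1
        if self.depth == 0:
            self._snapshot = None   # outermost end: nothing left to undo

    def tabort(self):
        """Undo every change since the outermost tbegin and leave transactional mode."""
        if self._snapshot is not None:
            self.storage.clear()
            self.storage.update(self._snapshot)
        self._snapshot = None
        self.depth = 0

    def etnd(self):
        """Report the current nesting depth; zero means not in a transaction."""
        return self.depth
```

With two nested `tbegin()` calls, `etnd()` returns 2. After one `tend()` it returns 1, and a `tabort()` at that point still undoes the changes made at both levels.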
The TEF instructions
These are the new instructions making up the TEF:
|TBEGIN|Transaction Begin -- Marks the beginning of a transaction|
|TBEGINC|Transaction Begin (constrained) -- Marks the beginning of a constrained transaction|
|TABORT|Transaction Abort -- Aborts the transaction in the same way as the machine might if a conflict arises|
|TEND|Transaction End -- Ends the transaction and, at the outermost level, commits its changes|
|NTSTG|Nontransactional Store -- Saves values to memory outside of the transaction's commit scope. In other words, changes made with the NTSTG instruction stick even if the transaction aborts.|
|ETND|Extract Transaction Nesting Depth -- Gets the current transaction nesting depth|
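The difference between an ordinary store and NTSTG is easiest to see in a small model. The sketch below is a hypothetical Python illustration (the `run_transaction` helper, its `store` and `ntstg` callbacks, and the `Abort` exception are all invented names), and it ignores the question of exactly when other CPUs see a nontransactional store.

```python
class Abort(Exception):
    """Stands in for TABORT or a hardware-detected conflict."""


def run_transaction(memory, body):
    """Run `body`; commit its transactional stores only if it finishes without aborting."""
    pending = {}

    def store(addr, value):
        # Ordinary store: provisional until the transaction commits.
        pending[addr] = value

    def ntstg(addr, value):
        # Nontransactional store: written straight through, survives an abort.
        memory[addr] = value

    try:
        body(store, ntstg)
    except Abort:
        return False          # pending stores are simply dropped
    memory.update(pending)
    return True
```

A body that stores a new counter value, leaves a breadcrumb with `ntstg` and then aborts will find the counter unchanged but the breadcrumb still in memory, which is what makes this kind of store useful for debugging aborted transactions.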
The Transaction Diagnostic Block
A programmer may optionally specify a 256-byte area for a transaction diagnostic block (TDB) to receive CPU-generated conflict data. The TDB contains a lot of information, including the general registers at the time of the conflict along with the transaction abort code.
The z/Architecture Principles of Operation (POP) details the many reasons why a transaction might be aborted. The POP also warns that conflicts can arise from "speculative" instruction examination, which is part of out-of-order instruction execution. This creates interesting, Kafkaesque situations where a transaction will be aborted for something that may or may not have happened.
A closer look at TEF
Below is a TEF code fragment:
         XR    R2,R2              Clear the loop counter
TRANSTRT TBEGIN TDB,X'FF00'       Transaction begin
         JNZ   TRANABRT           Jump if the transaction aborted
         AP    TRANCNT,=P'1'      Increment decimal counter
         TEND                     Commit changes
         J     ITWORKED           Move on to better things
TDB      DS    0D,XL256           Diagnostic area, doubleword aligned
TRANABRT DS    0H
         LA    R2,1(,R2)          Increment loop counter
         CHI   R2,30              All 30 attempts used up?
         JL    TRANSTRT           No, retry transaction
PLANB    DS    0H                 Yes, try something else
These instructions attempt to increment a packed decimal counter in common storage. The logic will try to update the counter 30 times before giving up.
The first instruction clears register two (R2), which keeps track of the number of update attempts. The next instruction, TBEGIN, puts the CPU in transactional state. TBEGIN's first operand points to the TDB. The second operand is a mask whose leading byte tells the processor which general purpose registers to restore if something aborts the transaction; X'FF00' covers registers 0 through 15. This is important because R2 contains the loop counter, which is why the counter is incremented in the recovery logic rather than inside the transaction.
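To see what a given register mask covers, it helps to decode it. The helper below is a hypothetical Python sketch, not an IBM-supplied API, and it rests on one assumption taken from my reading of the architecture: each of the eight mask bits governs an even-odd pair of general registers, with the leftmost bit covering registers 0 and 1.

```python
def registers_saved(mask):
    """Return the general registers covered by an 8-bit save mask.

    Assumes the leftmost bit covers GR0-GR1, the next GR2-GR3, and so on.
    """
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in one byte")
    regs = []
    for pair in range(8):
        if mask & (0x80 >> pair):
            regs.extend((2 * pair, 2 * pair + 1))
    return regs
```

Passing the leading byte of the example's mask, `registers_saved(0xFF00 >> 8)`, yields all sixteen registers. Under the same assumption, a mask byte of X'40' would cover only registers 2 and 3, which is the minimum needed to protect the loop counter in R2.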
This is where the transaction entry logic becomes important. The instruction after the TBEGIN tests the CC. If the CC is zero, the processor was successfully put into transactional state and control falls through to update the counter. A non-zero CC means something aborted the transaction and caused a branch to the recovery logic at label TRANABRT.
After -- hopefully -- updating the counter, the TEND instruction ends the transaction and commits the incremented counter.
The instructions following TRANABRT attempt to recover from an aborted transaction. First, they increment the value in R2. If the value is less than 30, control jumps back to restart the transaction. Otherwise it falls through to plan B.
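The same control flow, stripped of the hardware details, looks like this in Python. It is only a sketch of the retry-then-fallback pattern: `update_with_retry`, `attempt` and `fallback` are hypothetical names, with `attempt` standing in for the TBEGIN-to-TEND sequence and `fallback` for whatever lives at PLANB.

```python
def update_with_retry(attempt, fallback, max_attempts=30):
    """Call `attempt` until it succeeds or `max_attempts` tries have failed."""
    failures = 0                      # plays the role of R2
    while True:
        if attempt():                 # transaction ran to its end and committed
            return "committed", failures
        failures += 1                 # TRANABRT: count the abort
        if failures >= max_attempts:  # the compare against 30
            fallback()                # PLANB: try something else
            return "fallback", failures
```

An `attempt` that fails three times and then succeeds returns `("committed", 3)`. One that never succeeds is called exactly 30 times before `fallback` runs once.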
TEF appears to be a leap forward in processor technology, considering the complex circuitry and microcode needed to monitor memory and generate interrupts. The question is whether it is easier or more efficient than some of the other methods mentioned above.
Although the above example is contrived, it can serve as a way to think about how TEF works on a busy system. Given that the add packed (AP) instruction would execute in nanoseconds, chances are better than good that the transaction will work the first time, even while dozens of threads may want to update that counter. In this case, TEF appears to be a pretty easy solution.
Chances are TEF will find its way into lots of system-level code. However, IBM also intends customers to use it through the new Java and C++ APIs.
About the expert:
Robert Crawford has been a systems programmer for 29 years. While specializing in CICS technical support, he has also worked with VSAM, DB2, IMS and other mainframe products. He has programmed in Assembler, Rexx, C, C++, PL/1 and COBOL. In his latest career phase he is an operations architect responsible for establishing mainframe strategy and direction for a large insurance company. He works in south Texas, where he lives with his family.