tmsrv — Transaction Manager Server.
This is a special ATMI server used for distributed transaction coordination. For transactions started with tpbegin(3), tmsrv generates a new XID and passes it back to the transaction initiator. At the same time, tmsrv records the transaction as active, and a background thread checks it for time-out expiry.
In Enduro/X, XA Resource Managers are identified by numeric identifiers in the range 1..255. The Enduro/X Resource Manager ID (RMID) is the same identifier as the Group Number, or grpno, known in other ATMI systems. A maximum of 32 different resources may participate in one global transaction.
If a new resource manager becomes associated with the transaction during distributed transaction processing, the process notifies the initial transaction manager that the new association must be made.
Every transaction is logged to a separate file on disk. The file name contains the resource manager ID and the transaction XID. The statuses of all involved resource managers are logged to this machine-readable log file. Once the transaction is completed, the file is removed. If tmsrv crashes with transactions in progress, the transaction log files are read from disk at the next tmsrv startup, and the appropriate actions are performed according to the two-phase commit state machine (the transaction is committed or rolled back).
If a running transaction times out, the background thread aborts it automatically, and for the caller process tpcommit(3) fails with the TPEABORT error code.
If several resource managers are used in a single transaction, the transaction managers of the corresponding resource managers act as workers that execute prepare/abort/commit actions on the enlisted resource managers. These other transaction managers may be located on other cluster nodes, depending on the system setup. The cluster must be set up correctly, because the initiator transaction manager must have direct access (i.e. a direct tpbridge(8) link) to the transaction managers (workers) of all enlisted resource managers.
Transaction managers can be load-balanced in ndrxconfig.xml(5) with the min/max attributes. In load-balanced mode, a free transaction manager is selected at tpbegin(). From then on, the selected manager is responsible for the full life cycle of the transaction. Other transaction/resource managers enlisted in this transaction help to prepare/commit/abort the transaction branches. These other TMs are also selected in load-balanced mode.
Every instance of tmsrv will advertise the following list of services
For example for Resource Manager ID 1, Cluster Node ID 6, Enduro/X Server ID 10 services will look like:
Currently, service format 1 is used for starting a new transaction and for accepting prepare/commit/abort calls from the transaction initiator TM. Service format 3 is used by the transaction initiator for subsequent tpcommit(3)/tpabort(3) calls. Format 3 is also used by services involved in the transaction to register a new Resource Manager ID as part of the transaction.
For XA processing, resource manager drivers are loaded as dynamically loadable shared libraries. The resource manager should expose its xa_switch in the shared library. A separate Enduro/X tmsrv runs for each resource manager. An Enduro/X process is associated with the corresponding RMID via the NDRX_XA_RES_ID environment variable.
To configure different RMIDs for different processes or tmsrv instances, use the Enduro/X built-in environment variable override facility, or associate the processes with a <cctag> setting that corresponds to a [@global/<cctag>] sub-section where the XA settings can be placed. See the ex_env(5) manpage for more details.
Enduro/X supports static and dynamic XA registration.
tmsrv registers all activities of the transactions and resource managers in the machine-readable log file. The log file is used for crash recovery: the last state of the transaction is read, and the transaction is completed according to the target state set in the log file.
Each log file line contains a CRC32 checksum, which is verified during recovery. A line can be bad if, for example, the data was not fully flushed to disk. Any line that fails verification is ignored, and tmsrv acts on the last known valid state.
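The checksum handling described above can be illustrated with a minimal sketch. The line layout (payload, a colon, then eight hex digits of CRC32) and the state strings are assumptions made for this example only; they are not the actual tmsrv log format.

```python
import zlib


def format_line(payload: str) -> str:
    """Append the CRC32 of the payload as 8 hex digits (hypothetical layout)."""
    crc = zlib.crc32(payload.encode("utf-8")) & 0xFFFFFFFF
    return f"{payload}:{crc:08x}"


def parse_line(line: str):
    """Return the payload if the trailing checksum matches, otherwise None."""
    payload, sep, crc_hex = line.rstrip("\n").rpartition(":")
    if not sep or len(crc_hex) != 8:
        return None  # truncated or malformed line
    try:
        expected = int(crc_hex, 16)
    except ValueError:
        return None  # checksum field is not hexadecimal
    actual = zlib.crc32(payload.encode("utf-8")) & 0xFFFFFFFF
    return payload if actual == expected else None


def last_known_state(lines):
    """Replay lines in order, skip bad ones, return the last valid payload."""
    state = None
    for line in lines:
        payload = parse_line(line)
        if payload is not None:
            state = payload
    return state
```

With such a layout, a line that was cut short by a crash fails verification, and recovery falls back to the previous valid line.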
When the transaction is started, when a new resource manager joins the transaction, or when a commit/abort request is made, logging is mandatory: if the disk is full or a permissions error occurs, the transaction is either not started or its state is not changed.
When the transaction is finalized (committed or aborted), the transaction and resource states are logged on a best-effort basis: write errors are ignored (but logged to ULOG). If recovery is necessary at this stage, the transaction is finalized according to the last valid data logged.
If some transactions still remain uncompleted in the Resource Manager after crash recovery, the following xadmin(8) commands may be used to finish them at the level of a particular Resource Manager:
WARNING: These commands do not consult the originating transaction managers for the transaction statuses. They shall therefore be used only when the system is idle (not processing any useful workload) and it is known that some records at the resources are locked or stuck at the prepare stage.
To collect and roll back any orphaned prepared transactions, it is recommended to configure a single copy of tmrecoversv(8) at the end of the ndrxconfig.xml(5) server startup sequence, so that this server automatically collects any broken transaction branches. Alternatively, after system startup, manually invoke the tmrecovercl(5) command line tool to perform transaction collection and rollback.
By default, tmsrv writes log files to disk and uses the fflush() call to persist the data. This call only hands the data to the operating system kernel and does not guarantee that the data is written to disk. Thus, if a power outage happens, some transaction information may be lost. For critical systems it is recommended to use special flags which instruct tmsrv to perform disk synchronization when commit decisions have been taken. These flags are set in NDRX_XA_FLAGS: FSYNC or FDATASYNC, and DSYNC. FSYNC and FDATASYNC correspond to the fsync() and fdatasync() unix calls, which flush the transaction log data to disk. DSYNC ensures that the log file directory structure is flushed to disk. FDATASYNC is a bit faster than FSYNC, as it does not flush the file's last-change time and other non-essential attributes. Usage of these flags may significantly reduce transaction TPS performance. Whether DSYNC is needed depends on the operating system. It is necessary for Linux and Solaris. For other operating systems, please consult the vendor's manuals to find out whether a directory fsync is needed for new files to be persisted.
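The difference between the buffer flush and the disk synchronization calls can be sketched as follows. This is an illustration of the underlying POSIX calls only, not of the tmsrv implementation; the function name and its parameters are hypothetical, and a POSIX system is assumed (opening a directory for fsync does not work on Windows).

```python
import os


def durable_append(path: str, data: bytes,
                   use_fdatasync: bool = False, sync_dir: bool = True) -> None:
    """Append data to path and force it to stable storage (POSIX sketch)."""
    with open(path, "ab") as f:
        f.write(data)
        # Like fflush(): moves data from the process buffer to the kernel only.
        f.flush()
        if use_fdatasync and hasattr(os, "fdatasync"):
            # Counterpart of the FDATASYNC flag: data without all metadata.
            os.fdatasync(f.fileno())
        else:
            # Counterpart of the FSYNC flag: data and file metadata.
            os.fsync(f.fileno())
    if sync_dir:
        # Counterpart of the DSYNC flag: persist the directory entry, so that
        # a newly created file is still present after a power loss.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Each synchronization call waits for the storage device, which is why enabling the flags reduces throughput.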
The -R mode might not be available to the database user, i.e. the user is not allowed to see open transactions. In that case, access must be granted with the following commands for the DB user set in the XA open string:
grant select on pending_trans$ to <database_user>;
grant select on dba_2pc_pending to <database_user>;
grant select on dba_pending_transactions to <database_user>;
grant execute on dbms_system to <database_user>; (If using Oracle 10.2)
grant execute on dbms_xa to <database_user>; (If using Oracle 10.2)
If planning to use Oracle RAC, to successfully process the distributed transaction across binaries which are connected to different RAC nodes, Oracle RAC Singleton Service must be configured so that only one node actively serves the transactions, and this ensures XA affinity.
Typically, on Grid Infrastructure, this can be configured as:
$ srvctl add service -db RACDB -service XARAC -preferred RAC1 -available RAC2
For policy-based RAC cluster management, use:
$ srvctl add service -db RACDB -service XARAC -serverpool xa_pool -cardinality SINGLETON
NOTE: The -dtp option shall be left at its default, which is FALSE.
If the above is not configured and, say, two binaries work with the same XA transaction, one connected to the first RAC node and the other to the second RAC node, the transaction will not work, because the XA API does not see the transaction on any node other than the one where it was started. The following error is generated:
ORA-24798: cannot resume the distributed transaction branch on another instance
For more details, consult the Oracle documentation, as Enduro/X basically uses the plain X/Open XA API for managing transactions, and the Oracle DB is expected to provide support for the XA API.
When using dynamic registration XA switches with the RECON XA flag functionality, communications may be lost while the process executes non-XA AP code, e.g. SQL statements. To keep working in that case, the process itself must perform tpclose(3)/tpopen(3) until it succeeds, or the process shall exit so that Enduro/X restarts it. This extra logic is needed because, if communications are lost outside of the XA API, Enduro/X by itself does not see that the comms status has changed: ax_start() is executed only when the resource is modified by the application. If comms are not working in the application, the resource is not modified, and thus ax_start() is not invoked.
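The reconnect logic described above might be structured as in the following sketch. The close_fn and open_fn callables are hypothetical stand-ins for tpclose(3) and tpopen(3), assumed here to return 0 on success; the attempt limit and delay are illustrative values, not Enduro/X settings.

```python
import time


def reconnect(close_fn, open_fn, max_attempts=5, delay_sec=1.0,
              sleep=time.sleep):
    """Repeat close/open until open succeeds; return True on success.

    close_fn and open_fn are hypothetical stand-ins for tpclose(3) and
    tpopen(3), assumed to return 0 on success.
    """
    for attempt in range(1, max_attempts + 1):
        # The close result is ignored: the connection may already be down.
        close_fn()
        if open_fn() == 0:
            return True
        if attempt < max_attempts:
            sleep(delay_sec)
    return False
```

If the function returns False, the process would terminate, so that Enduro/X can restart it.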
When a process joins the transaction (either the initiator or a participating XATMI server), it first registers with tmsrv and only then performs the xa_start() API call. If the transaction expires at tmsrv while the joining process has not yet called xa_start(), an orphan transaction may be created (i.e. an active transaction is created in the resource, but it is not managed by Enduro/X, which has already rolled the transaction back). To overcome this limitation, transaction timeouts shall be planned carefully, covering both the tpbegin() setting and the resource's timeout setting for inactive transactions.
If the transaction expires at tmsrv, this does not terminate any tpcall(3) operations, except when the called service's associated resource manager is not registered with the given global transaction.
Report bugs to support@mavimax.com
If the logs directory (-l) is located on a Linux ext4 file system and the FSYNC/FDATASYNC/DSYNC flags are used, the transaction manager might perform much slower than the physical hard disk is capable of. It is recommended to use the xfs file system on Linux instead, which performs better.