Bug #123
Transaction Manager counts re-tries incorrectly.
Status: | Closed | Start date: | 04/11/2017 | |
---|---|---|---|---|
Priority: | High (Code 3) | Due date: | ||
Assignee: | - | % Done: | 100% | |
Category: | - | |||
Target version: | - |
Description
Transaction has failed (due to some unknown reason - maybe DB):
N:NDRX:5:10465:001:20170328:161119000:autils.c:0401:Serialized xid: [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] N:NDRX:5:10465:001:20170328:161119000:autils.c:0405:xid deserialization total len: 27 U:UBF :5:10465:001:20170328:161119000:autils.c:0406:XID data for deserialization 0000 61 94 f7 a1 10 05 0d 79 7a f4 89 77 42 39 9d b4 a......yz..wB9.. 0010 44 50 b8 41 1e b8 01 01 00 0e 00 DP.A....... U:UBF :5:10465:001:20170328:161119000:autils.c:0435:Original XID restored 0000 a1 f7 94 61 00 00 00 00 10 00 00 00 00 00 00 00 ...a............ 0010 05 00 00 00 00 00 00 00 0d 79 7a f4 89 77 42 39 .........yz..wB9 0020 9d b4 44 50 b8 41 1e b8 01 01 00 0e 00 00 00 00 ..DP.A.......... 0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0090 00 00 00 00 00 00 00 00 ........ N:NDRX:5:10465:001:20170328:161119000:tmi/xa.c:0379:atmi_xa_prepare_entry N:NDRX:2:10465:001:20170328:161119001:tmi/xa.c:0385:xa_prepare_entry - fail: -6 [routine invoked in an improper context] N:NDRX:3:10465:001:20170328:161119001:perror.c:0309:_TPset_error_fmt_rsn: 16 (TPERMERR) reason: -6 [xa_prepare_entry - fail: -6 [routine invoked in an improper context]]
then sort of fixed:
N:NDRX:5:10465:000:20170328:161111011:ed_buf.c:0377:Free-up auto buffer : 0xb9f520 typeid: 0 TMNODEID 1 TMRMID 1 TMSRVID 14 TMCALLERRM 1 TMCMD c TMXID YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A TMKNOWNRMS \01 TMPROCESSID srv,crdstatsv,8141,10555,1 N:NDRX:4:10453:007:20170328:161111011:atedrv.c:0148:Operation tperrno: 0, xa return code: 0 N:NDRX:4:10465:005:20170328:161111011:/tmsrv.c:0151:Got command code: [c] N:NDRX:5:10465:005:20170328:161111011:/tmapi.c:0151:tm_tpcommit() called N:NDRX:4:10453:007:20170328:161111011:atedrv.c:0211:RM 1 votes for stage: 70 N:NDRX:5:10453:007:20170328:161111011:rv/log.c:0637:tms_log_stage: new stage - 70 N:NDRX:4:10465:005:20170328:161111011:rv/log.c:0138:Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] locked for thread_id: 140048651441920 N:NDRX:4:10453:007:20170328:161111011:atedrv.c:0097:Entered in stage: COMMITTED
But then I get information that transaction is locked:
N:NDRX:4:10465:001:20170329:083859000:ground.c:0228:XID: [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] stage: [40]. Try: 1, max: 100 N:NDRX:2:10465:001:20170329:083859000:rv/log.c:0126:Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 N:NDRX:5:10465:001:20170329:083859000:ground.c:0253:Transaction locked or already processed by foreground... N:NDRX:5:10465:001:20170329:083859000:ground.c:0262:background - sleep 10
this also could be OK if counter would be counted properly. But it's not. I'm getting same counter (try 1, max 100) all the time:
N:NDRX:5:10429:001:20170329:083728000:ground.c:0262:background - sleep 10 N:NDRX:4:10465:001:20170329:083729000:ground.c:0228:XID: [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] stage: [40]. Try: 1, max: 100 N:NDRX:2:10465:001:20170329:083729000:rv/log.c:0126:Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 N:NDRX:5:10465:001:20170329:083729000:ground.c:0253:Transaction locked or already processed by foreground... N:NDRX:5:10465:001:20170329:083729000:ground.c:0262:background - sleep 10 ... N:NDRX:4:10465:001:20170329:083859000:ground.c:0228:XID: [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] stage: [40]. Try: 1, max: 100 N:NDRX:2:10465:001:20170329:083859000:rv/log.c:0126:Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 N:NDRX:5:10465:001:20170329:083859000:ground.c:0253:Transaction locked or already processed by foreground... N:NDRX:5:10465:001:20170329:083859000:ground.c:0262:background - sleep 10 ... etc.
and I see following in user log:
... 0:20170328:17521900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17522900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17523900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17524900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17525900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17530900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17531900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17532900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17533900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17534900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17535900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17540900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17541900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17542900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17543900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17544900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17545900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17550900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 0:20170328:17551900:tmsrv :tmsrv: Transaction [YZT3oRAFDXl69Il3QjmdtERQuEEeuAEBAA4A] already locked for thread_id: 140048643049216 ...
Please fix
History
#1 Updated by Madars about 7 years ago
Hmm, seems that problem is here http://www.silodev.com/lxr/source/tmsrv/background.c :
0201 public int background_loop(void) 0202 { 0203 int ret = SUCCEED; 0204 atmi_xa_log_list_t *tx_list; 0205 atmi_xa_log_list_t *el, *tmp; 0206 atmi_xa_tx_info_t xai; 0207 atmi_xa_log_t *p_tl; 0208 0209 memset(&xai, 0, sizeof(xai)); 0210 0211 while(!G_bacground_req_shutdown) 0212 { 0213 /* Check the list of transactions (iterate over...) 0214 * Seems anyway, we need a list of background ops here... 0215 */ 0216 0217 /* Lock for processing... (cose xadmin might want to do some stuff in middle 0218 * Might want to think something better (so that it does not lock all process) 0219 */ 0220 background_lock(); 0221 tx_list = tms_copy_hash2list(COPY_MODE_BACKGROUND | COPY_MODE_ACQLOCK); <<<< it does the copy of the transaction list (for concurrency access) 0222 0223 LL_FOREACH_SAFE(tx_list,el,tmp) 0224 { 0225 el->p_tl.trycount++; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< This increments the copy and NOT the original transaction object. 0226 NDRX_LOG(log_info, "XID: [%s] stage: [%hd]. Try: %d, max: %d", 0227 el->p_tl.tmxid, el->p_tl.txstage, el->p_tl.trycount, 0228 G_tmsrv_cfg.max_tries); 0229
#2 Updated by Madars almost 7 years ago
- Priority changed from Normal (Code 4) to High (Code 3)
#3 Updated by Madars almost 7 years ago
But it looks like the actual transaction was in progress for some long time. We might want to reset tries counter back to 0 when actual action (prepare for example) on DB is finished.
#4 Updated by Madars almost 7 years ago
Also I guess we need to perform county only when we got the lock, not counting all locked attempts.
Thus we could better state what does the retry attempt setting does.
#5 Updated by Madars almost 7 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Fixed.
#6 Updated by Madars almost 7 years ago
- Status changed from Resolved to Closed