Bug #465

test038_tpnotify failure of tpbridge with core

Added by Madars over 4 years ago. Updated over 4 years ago.

Status: Closed
Start date: 11/02/2019
Priority: Normal (Code 4)
Due date:
Assignee: -
% Done: 100%
Category: -
Target version: -

Description

tpbridge crashes with a core dump while running test038_tpnotify:

[user1@localhost test038_tpnotify]$ file core.25331 
core.25331: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'tpbridge -k nre38Kff1kz -i 101 -e /home/user1/endurox/atmitest/test038_tpnotify', real uid: 1001, effective uid: 1001, real gid: 1001, effective gid: 1001, execfn: '/home/user1/endurox/dist/bin/tpbridge', platform: 'x86_64'
[user1@localhost test038_tpnotify]$ gdb  /home/user1/endurox/dist/bin/tpbridge core.25331
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" 
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/user1/endurox/dist/bin/tpbridge...done.
[New LWP 25335]
[New LWP 25336]
[New LWP 25332]
[New LWP 25333]
[New LWP 25334]
[New LWP 25331]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `tpbridge -k nre38Kff1kz -i 101 -e /home/user1/endurox/atmitest/test038_tpnotify'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f10f78d9694 in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.1.x86_64 gpgme-1.3.2-5.el7.x86_64 libassuan-2.1.0-3.el7.x86_64 libgpg-error-1.12-3.el7.x86_64
(gdb) where
#0  0x00007f10f78d9694 in vfprintf () from /lib64/libc.so.6
#1  0x00007f10f84b77e1 in __ndrx_debug__ (dbg_ptr=0x7f10d8088c10, lev=lev@entry=3, file=file@entry=0x40c1b0 "/home/user1/endurox/bridge/queue.c", line=line@entry=157, func=func@entry=0x40c579 <__func__.9386> "br_add_to_q", 
    fmt=fmt@entry=0x40c250 "Message %p/%d [%s] added to in-mem queue for late delivery...") at /home/user1/endurox/libnstd/ndebug.c:1097
#2  0x0000000000405659 in br_add_to_q (destq=0x7f10f1a60f60 "/dom1,clt,reply,atmiclt38,25368,1", pack_type=3, len=608, buf=0x7f10d8091718 "\002") at /home/user1/endurox/bridge/queue.c:156
#3  br_process_error (buf=buf@entry=0x7f10d8091718 "\002", len=len@entry=608, err=err@entry=11, from_q=from_q@entry=0x0, pack_type=pack_type@entry=3, destqstr=destqstr@entry=0x7f10f1a60f60 "/dom1,clt,reply,atmiclt38,25368,1")
    at /home/user1/endurox/bridge/queue.c:233
#4  0x0000000000406115 in br_submit_reply_to_q (call=call@entry=0x7f10d8091718, len=608) at /home/user1/endurox/bridge/queue.c:389
#5  0x0000000000407468 in br_process_msg_th (ptr=0x1f91a10, p_finish_off=<optimized out>) at /home/user1/endurox/bridge/network.c:467
#6  0x000000000040b23f in poolthread_do (thread_p=0x1f7fab0) at /home/user1/endurox/libexthpool/thpool.c:370
#7  0x00007f10f7c59dc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f10f798873d in clone () from /lib64/libc.so.6
(gdb) 

History

#1 Updated by Madars over 4 years ago

exprivate int br_add_to_q(char *buf, int len, int pack_type, char *destq)
{
    int ret=EXSUCCEED;
    in_msg_t *msg;

    if (NULL==(msg=NDRX_CALLOC(1, sizeof(in_msg_t))))
    {
        NDRX_ERR_MALLOC(sizeof(in_msg_t));
        EXFAIL_OUT(ret);
    }

    NDRX_SYSBUF_MALLOC_WERR_OUT(msg->buffer, NULL, ret);

    /*fill in the details*/
    msg->pack_type = pack_type;
    msg->len = len;
    NDRX_STRCPY_SAFE(msg->destqstr, destq);
    memcpy(msg->buffer, buf, len);

    ndrx_stopwatch_reset(&msg->trytime);

    MUTEX_LOCK_V(M_in_q_lock);
    DL_APPEND(M_in_q, msg);
    MUTEX_UNLOCK_V(M_in_q_lock);

    NDRX_LOG(log_warn, "Message %p/%d [%s] added to in-mem queue " 
            "for late delivery...", msg->buffer, msg->len, msg->destqstr);

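It looks like the NDRX_LOG call placed after the append to M_in_q can read an already freed object. Once M_in_q_lock is released, the background runner may pick up the message, process it, and free it before the log statement dereferences msg->buffer, msg->len, and msg->destqstr. In short, this is a race condition, and it matches the backtrace above, where vfprintf crashes inside the NDRX_LOG call in br_add_to_q.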
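One way to avoid this kind of race is to log while the producer still owns the message exclusively, and only then publish it to the shared list. The sketch below illustrates that ordering with plain pthreads. It is not the actual Enduro/X fix: demo_msg_t, demo_add_to_q, demo_process_one, and the hand-rolled list are hypothetical stand-ins for in_msg_t, br_add_to_q, the background runner, and DL_APPEND.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the bridge's in_msg_t (not the Enduro/X type). */
typedef struct demo_msg {
    char destq[64];
    int len;
    char buffer[128];
    struct demo_msg *next;
} demo_msg_t;

static demo_msg_t *demo_q_head = NULL;
static demo_msg_t *demo_q_tail = NULL;
static pthread_mutex_t demo_q_lock = PTHREAD_MUTEX_INITIALIZER;

/* Producer: fill the message, log it, and publish it as the last step. */
int demo_add_to_q(const char *buf, int len, const char *destq)
{
    demo_msg_t *msg;

    assert(NULL != buf && NULL != destq);

    if (len < 0 || (size_t)len > sizeof(msg->buffer))
    {
        return -1;
    }

    msg = calloc(1, sizeof(*msg));
    if (NULL == msg)
    {
        return -1;
    }

    msg->len = len;
    snprintf(msg->destq, sizeof(msg->destq), "%s", destq);
    memcpy(msg->buffer, buf, (size_t)len);

    /* Log BEFORE publishing: no other thread can see msg yet. */
    fprintf(stderr, "Message %p/%d [%s] queued for late delivery\n",
            (void *)msg->buffer, msg->len, msg->destq);

    pthread_mutex_lock(&demo_q_lock);
    if (NULL == demo_q_tail)
    {
        demo_q_head = msg;
    }
    else
    {
        demo_q_tail->next = msg;
    }
    demo_q_tail = msg;
    pthread_mutex_unlock(&demo_q_lock);

    /* From here on msg may already be freed by the consumer: do not touch it. */
    return 0;
}

/* Consumer: detach the head under the lock, copy out what is needed, free it.
 * Returns the message length, or -1 if the queue is empty. */
int demo_process_one(char *destq_out, size_t outsz)
{
    demo_msg_t *msg;
    int len;

    pthread_mutex_lock(&demo_q_lock);
    msg = demo_q_head;
    if (NULL != msg)
    {
        demo_q_head = msg->next;
        if (NULL == demo_q_head)
        {
            demo_q_tail = NULL;
        }
    }
    pthread_mutex_unlock(&demo_q_lock);

    if (NULL == msg)
    {
        return -1;
    }

    len = msg->len;
    snprintf(destq_out, outsz, "%s", msg->destq);
    free(msg);
    return len;
}
```

An alternative with the same effect is to keep the log call where it is but copy the values it prints into local variables before the unlock. Either way, the rule is that the producer must not dereference the message after releasing the lock that hands ownership to the consumer.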

#2 Updated by Madars over 4 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

#3 Updated by Madars over 4 years ago

  • Status changed from Resolved to Closed
