Feature #368

xadmin start: detailed reason info for dead processes

Added by Madars Vitolins over 1 year ago. Updated 24 days ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
Start date: 12/28/2018
Due date:
% Done: 100%

Description

Report more precise diagnostics when a process dies at startup:

exec tmsrv -k 0myWI5nu -i 310 -e /tmp/TM1 -r -- -t10 -l/tmp --  :
    process id=20433 ... Died.

For example, we could capture the output of the "exec" command and pass the error back to the master copy of ndrxd, either via a return code or via a queue. With a queue, we would need some kind of temporary storage for the dead-process reasons, so that when the process is signalled as dead, we can read the exact reason.

It would be better to provide a return code, since that avoids the need for a queue and storage.
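A minimal sketch of the return-code idea, not Enduro/X code: the child folds the errno of a failed exec into a reserved exit code, and the parent decodes it after waitpid(). The exit code values and the names exec_errno_to_exit, exit_to_reason and spawn_and_wait are hypothetical, and the parent waits synchronously here only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit codes reserved for "exec failed"; not Enduro/X values. */
enum {
    EXIT_EXEC_NOENT = 120,  /* binary not found */
    EXIT_EXEC_ACCES = 121,  /* no permission to execute */
    EXIT_EXEC_OTHER = 122   /* any other exec failure */
};

/* Child side: fold the errno left by a failed exec into an exit code. */
static int exec_errno_to_exit(int err)
{
    switch (err) {
    case ENOENT: return EXIT_EXEC_NOENT;
    case EACCES: return EXIT_EXEC_ACCES;
    default:     return EXIT_EXEC_OTHER;
    }
}

/* Parent side: turn the exit code back into a printable reason. */
const char *exit_to_reason(int code)
{
    switch (code) {
    case EXIT_EXEC_NOENT: return "No such file or directory";
    case EXIT_EXEC_ACCES: return "Access denied";
    case EXIT_EXEC_OTHER: return "System failure";
    default:              return "Died";
    }
}

/* Fork and exec argv. Returns the child's exit code, or -1 on error
 * or when the child was terminated by a signal. */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid, r;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        /* Only reached when exec failed. */
        _exit(exec_errno_to_exit(errno));
    }
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    if (r < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

One caveat of this approach is that a server binary could legitimately exit with the same code, so the reserved range would have to be agreed on.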

We could store the reasons in a hash keyed by pid. The hash needs housekeeping, so that reasons are removed after some time (to keep memory use bounded).
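The pid-keyed hash with housekeeping could look roughly like the sketch below. It is an illustration under assumptions, not the ndrxd implementation: the bucket count, the maximum text length and the names reasons_put, reasons_get and reasons_expire are all hypothetical. Timestamps are passed in by the caller so that expiry is easy to reason about.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

#define REASON_BUCKETS 64   /* hypothetical table size */
#define REASON_MAXLEN  128  /* hypothetical max reason length */

typedef struct reason_entry {
    pid_t pid;
    time_t stored_at;               /* when the reason was recorded */
    char text[REASON_MAXLEN];
    struct reason_entry *next;      /* chain within one bucket */
} reason_entry_t;

static reason_entry_t *buckets[REASON_BUCKETS];

static unsigned bucket_of(pid_t pid)
{
    return (unsigned)pid % REASON_BUCKETS;
}

/* Store or overwrite the reason for pid. Returns 0 on success, -1 on OOM. */
int reasons_put(pid_t pid, const char *text, time_t now)
{
    unsigned b = bucket_of(pid);
    reason_entry_t *e;

    for (e = buckets[b]; e != NULL; e = e->next)
        if (e->pid == pid)
            break;
    if (e == NULL) {
        e = calloc(1, sizeof(*e));
        if (e == NULL)
            return -1;
        e->pid = pid;
        e->next = buckets[b];
        buckets[b] = e;
    }
    e->stored_at = now;
    strncpy(e->text, text, REASON_MAXLEN - 1);
    e->text[REASON_MAXLEN - 1] = '\0';
    return 0;
}

/* Look up the reason for pid; NULL when none is stored. */
const char *reasons_get(pid_t pid)
{
    reason_entry_t *e;

    for (e = buckets[bucket_of(pid)]; e != NULL; e = e->next)
        if (e->pid == pid)
            return e->text;
    return NULL;
}

/* Housekeeping: drop entries older than max_age seconds.
 * Returns the number of entries removed. */
int reasons_expire(time_t now, time_t max_age)
{
    int removed = 0;
    unsigned b;

    for (b = 0; b < REASON_BUCKETS; b++) {
        reason_entry_t **pp = &buckets[b];
        while (*pp != NULL) {
            reason_entry_t *e = *pp;
            if (now - e->stored_at > max_age) {
                *pp = e->next;      /* unlink, then free */
                free(e);
                removed++;
            } else {
                pp = &e->next;
            }
        }
    }
    return removed;
}
```

In a daemon, reasons_expire would typically be called from a periodic timer, and an entry could also be removed as soon as its reason has been reported.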

History

#1 Updated by Madars Vitolins over 1 year ago

Or we copy the last error status to the process model. But for the system this would require starting the queueing admin threads to deliver the status to ndrxd. Just to note.

#2 Updated by Madars Vitolins over 1 year ago

  • Priority changed from Normal to High

test028 does not boot for some reason. This would help to explain it:

* ndrxd idle instance started.
exec tmsrv -k nre38Kff1kz -i 1 -e /home/user1/endurox/atmitest/test028_tmq/tmsrv-dom1.log -r -- -t1 -l/home/user1/endurox/atmitest/test028_tmq/RM1 --  :
        process id=23941 ... Started.
exec atmisv28 -k nre38Kff1kz -i 20 -e /home/user1/endurox/atmitest/test028_tmq/atmisv28-dom1.log -r --  :
        process id=23955 ... Started.
exec tmqueue -k nre38Kff1kz -i 100 -e /home/user1/endurox/atmitest/test028_tmq/tmqueue-dom1.log -r -- -m MYSPACE -q ./q.conf -s1 --  :
        process id=23957 ... Died.
Startup finished. 2 processes started.
* Shared resources opened...

#3 Updated by Madars Vitolins over 1 year ago

ULOG.20190111:23957:20190111:02395117:tmqueue     :ERROR! Filed to read tx file: req_read=696, read=0: Is a directory
user1@ubuntu16:/tmp$ 

#4 Updated by Madars Vitolins 24 days ago

The startup status is now reported. The following new status codes are provided for binaries during xadmin start:
static char *nosuchfile = "No such file or directory";
static char *eaccess = "Access denied";
static char *ebadfile = "Bad executable";
static char *elimits = "Limits exceeded";
static char *stillstarting = "Still starting";
static char *eargslim = "CLI args on env params too long";
static char *eenv = "Environment setup failure";
static char *esys = "System failure";

When exec fails after the fork, the status is passed back via shared memory.

Available from 7.1+.
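The shared-memory hand-off can be illustrated with a small sketch. This is not the Enduro/X source: it assumes an anonymous shared mapping created before fork(), and the errno-to-text mapping in exec_reason as well as the names exec_reason and run_and_get_exec_errno are hypothetical. The reason strings are the ones listed above.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical mapping from the errno of a failed exec to a reason text. */
const char *exec_reason(int err)
{
    switch (err) {
    case ENOENT:  return "No such file or directory";
    case EACCES:  return "Access denied";
    case ENOEXEC: return "Bad executable";
    case E2BIG:   return "CLI args on env params too long";
    case EMFILE:
    case ENFILE:
    case ENOMEM:  return "Limits exceeded";
    default:      return "System failure";
    }
}

/* Fork and exec argv. On return *exec_err holds the errno of a failed
 * exec, or 0 when exec succeeded. Returns 0 on success, -1 when the
 * mapping, fork or wait itself failed. */
int run_and_get_exec_errno(char *const argv[], int *exec_err)
{
    int status;
    pid_t pid, r;
    int *slot;

    assert(argv != NULL && argv[0] != NULL && exec_err != NULL);

    /* Shared between parent and child because it is mapped before fork. */
    slot = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return -1;
    *slot = 0;

    pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        *slot = errno;      /* only reached when exec failed */
        _exit(127);
    }
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    *exec_err = *slot;
    munmap(slot, sizeof(int));
    return r < 0 ? -1 : 0;
}
```

Compared with encoding the reason in the exit code, the shared slot leaves the child's exit status untouched and can carry the full errno value.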

#5 Updated by Madars Vitolins 24 days ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
