User Tools

Site Tools


Sidebar

Table Of Contents

endurox:v7.5.x:manuals:tpbridge.8

tpbridge

Name

tpbridge — Enduro/X Bridge Server.

Synopsis

tpbridge [OPTIONS]

DESCRIPTION

This is special ATMI server which is used to connect local ATMI instances over the network. The result is network joined instances which makes Enduro/X cluster.

Bridge process is used to exchange service lists between two nodes, calculate monotonic clock diff (so that later for messages time can be adjusted) between nodes.

To establish network connection, on one machine it must be in passive mode and on other machine it must be in active mode. Active tpbridge periodically tries to connect to the other machine. To one passive bridge only one connection can be made.

If connection is dropped, active node will re-try to connect.

All data messages are prefixed with 4 byte message length indicator. Meaning that the logical message can be split over the multiple packets or within one packed can be carried multiple logical messages - tpbridge will solve that.

When connection is established, clock diff and service lists are exchanged, then bridge is used to serve ATMI actions over the machines. I.e. tpcall(), tpforward(), conversations, etc. At connection established clocks are send in asynchronous fashion and time spent during the message in network is ignored, as connection is fresh and no data flows. But it might happen that over the time the monotonic clocks of connected hosts drift away, the difference measured at startup is not actual, and the messages receive or sent from one node might look like expired on other node. To solve this issue tpbridge periodically sends dynamic clock exchange between nodes, in synchronous fashion. The round trip time is measured (just like a ping time) and if it is within acceptable boundaries, the time from other host is accepted and time correction value is updated. The max rountrip time is set by -k flag (default 200ms), and interval is set by -K flag (600 sec). To monitor the clock status, TM_MIB class T_BRCON can be used for this purpose, e.g. "$ xadmin -c T_BRCON" call.

When connection is stopped. This is reported to ndrxd daemon which removes services from shared memory accordingly.

tpbridge supports two network message formats. First format is native format which sends over the network directly internal (C lang) structures. This format will work faster, but cannot be used between different type of computers. I.e. in this case it is not possible to mix for example x86_64 with x86. Or x86 with RISC/ARM 32bit. If mixing is necessary, then use Enduro/X Network Protocol option, activated by flag -f on both nodes. In this case standard common TLV data format is used for data exchange between nodes. This might be slower than native format.

When using host name (-h) for resolving binding host or connection address, tpbridge will resolve IP addresses. Multiple IP addresses for host name are supported. The logic for using them is following:

In case of binding (server):

  1. Query host name
  2. Select first IP found in list
  3. Try to bind to selected IP
  4. If bind failed with EADDRINUSE or EADDRNOTAVAIL, select next IP, continue with 3.
  5. If list of IP addresses are exhausted, start with 1.

In case of connecting to address (client):

  1. Query host name
  2. Select first IP found in list
  3. Perform connect() to selected IP
  4. If connect() asynchronously failed (after EINPROGRESS) with any error, select next IP, continue with 3.
  5. If list of IP addresses are exhausted, start with 1.

tpbridge to-network and from-network streams are separated by different threads. Thus reading from XATMI queues is doing server main thread. And reading from network is done by other thread.

BLOCKING CONDITION SOLVING

In case if IPC queues are full:

In case when messages are received from network, but local queues are blocked (full), e.g services are slow to process such amount of incoming requests, tpbridge will try to solve situation in following way:

  1. In case if sending to XATMI IPC queue bridge gets EAGAIN error, message is added to internal temporary queue. Queue size is set by -Q cli flag (or read from NDRX_MSGMAX or defaulted to 100, see TEMPORARY_QUEUE_SIZE bellow. When message is queued, special queue sender thread is being activated.
  2. In case if queued message count is greater than TEMPORARY_QUEUE_SIZE, then incoming net-in thread is blocked until queue sender thread will send some messages and queued message count will be less than TEMPORARY_QUEUE_SIZE.
  3. In case if queue action (-A) is set to 2 and new message arrives for single XATMI IPC queue (e.g. service queue, reply queue, etc.) which is full, the message is discarded.
  4. In case if queue action (-A) is set to 3 and new message arrives for single XATMI IPC queue which is full or message arrives for other queue but overall limit is exceeded of temporary enqueued messages, then message is discarded.
  5. In case if bridge was solving conditions with blocking and is blocked, once every second net-in thread is unblocked, to process any incoming messages from socket.

Queue sender Logic:

  1. Enqueued messages are grouped by single XATMI IPC queue names.
  2. Each enqueued message get scheduled next attempt run. Where minimum wait time is set by -m parameter.
  3. On every attempt unsuccessful send attempt, next attempt time is multiplied by 2.
  4. If single XATMI IPC queue attempt was successful, the next message of the same queue is tried to send immediately, if it fails, the next attempt for the same queue is scheduled and queue runner switches to next different single XATMI IPC queue (if any).
  5. If attempts are reached or TTL expired (if configured), message is discarded.
  6. In case if message by it self is timed (i.e. tpcall() without TPNOTIME flag) and it expire, the message is discarded.
  7. In case if queue destination queue is broken (attempt did no end with EAGAIN or EINTER), the message is discarded.

In case if network socket is full:

  1. In case if outgoing socket is being blocked (full), tpbridge stop process outgoing traffic is processed - e.g. "@TPBRIDGEXXX" queue will not be read.
  2. To respond to ndrxd(5) pings, bridge lets to process one outgoing message per second (i.e. read Enduro/X XATMI server queues).

Message discard strategy

If message is service call and client is waiting for answer, server error TPESVCERR is returned to caller.

All discarded messages are logged with error level 3 (warning) to the bridge logs.

ULOG contains entry "Discarding message" for every discarded message.

OPTIONS

-n NODE_ID
Other Enduro/X instance’s Node ID. Numerical 1..32.
[-r]
Send Refresh messages to other node. If not set, other node will not see our’s node’s services. OPTIONAL flag.
-t MODE
MODE can be P for passive/TCPIP server mode, any other (e.g. A) will be client mode.
-i IP_ADDRESS
In Active mode it is IP address to connect to. In passive mode it is binding/listen address.
-h HOST_NAME
Binding/connection IP Address may be resolved from host name set in -h parameter. Host name is resolved by OS, DNS queries, etc. tpbridge shall be started with -i or with -h, if both flags will be set, error will be generated.
-6
If set, then IPv6 addresses will be used. By default tpbridge operates with IPv4 addresses.
-p PORT_NUMBER
In active mode PORT_NUMBER is port to connect to. In passive mode it is port on which to listen for connection.
-T TIME_OUT_SEC
Parameter indicates time-out value for packet receive in seconds. This is socket option. Receive is initiate when it either there is poll even on socket or incomplete logical message is received and then next recv() is called. If the message part is not received in time, then socket is closed and connection is restarted. This parameter also is used in case if target socket to which msg is being sent is full for this given time period. If msg is not fully sent and time out is reached, the connection is restarted, outgoing msg is being dropped.
[-b BACKLOG_NR]
Number of backlog entries. This is server’s (passive mode) connection queue, before server accepts connection. OPTIONAL parameter. Default value is 100. But could be set to something like 5.
[-c CONNECTION_CHECK_SEC]
Connection check interval in seconds. OPTIONAL parameter. Default value 5.
[-z PERIODIC_ZERO_SEND_SEC]
Interval in seconds between which zero length message is wrote to socket. This is useful to keep the connection option over the firewalls, etc. OPTIONAL parameter. Default value 0 (Do not send).
[-a INCOMING_RECV_ACTIVITY_SEC]
If set, then this is maximum time into which some packet from network must be received. If no receive activity on socket is done, the connection is reset. The 0 value disables this functionality. The default value is -z multiplied by 2. Note that checks are performed with -c interval. intervals. Usually this is used with -z, so that it is guaranteed that during that there will be any traffic.
[-f]
Use Enduro/X Standard Network TLV Protocol instead of native data structures for sending data over the network. This also ensure some backwards compatibility between Enduro/X versions. But cases for backwards compatibility must be checked individually.
[-P THREAD_POOL_SIZE]
This is number of worker threads for sending and receiving messages for/to network. 50% of the threads are used for upload and other 50% are used for network download. Thus number is divided by 2 and two thread pools are created. If divided value is less than 1, then default is used. The default size is 4.
[-R QUEUE_RETRIES]
Number of attempts to send message to local queue, if on pervious attempt queue was full. The first attempt is done in real time, any further (if this flag allows) are performed with calculated frequency of: nr_messages_failed_to_send - nr_messages_sent in milliseconds. Default value is 999999. To disable temporary queue, set value to 0.
[-A TEMPORARY_QUEUE_ACTION]
This value indicates the action how tpbridge shall process the cases when temporary queue space for unsent / blocking messages are full. Values are following: value 1 - if global temp queue is full (-Q param) - block the bridge / stop incoming traffic, if single XATMI IPC queue is full (-q) - ignore the condition (i.e. let to fill till the -Q limit). Value 2 - if global temp queue full - block, if single XATMI IPC queue is full - discarded the message. Value 3 - if global temp queue is full - discarded the message, if single XATMI IPC queue is full - discarded the message. Default is 1.
[-Q TEMPORARY_QUEUE_SIZE]
This is number of messages that tpbridge can accumulate in case if message is received from network and destination queue is full (e.g. service call queue, reply queue, etc). If this parameter is not set, then value uses NDRX_MSGMAX environment variable setting. If env variable is not available, then value is defaulted to 100. The value of temporary queue size is preferred (and not string) as due to parallel processing conditions, the number of messages in queue might go over this number until the bridge is locked.
[-q TEMPORARY_QUEUE_SIZE_DEST]
This is max number of messages tpbridge can accumulate for single XATMI IPC queue which is full/blocking. This parameter is used in case if queue action (-A) is configure to drop messages, if single XATMI IPC queue temporary space is full.
[-L TEMPORARY_QUEUE_TTL]
This is number of milliseconds for messages to live in temporary queue. Default value is NDRX_TOUT env setting converted to millisecond.
[-m TEMPORARY_QUEUE_MINSLEEP]
This is minimum number milliseconds to wait after which schedule next attempt of message sending to the XATMI IPC queue. The default is 40.
[-M TEMPORARY_QUEUE_MAXSLEEP]
This is maximum number milliseconds to wait after which schedule next attempt of message sending to the XATMI IPC queue. The default is 150.
[-B THREADPOOL_BUFFER_SIZE]
This is number of messages that either net-out or net-in threads can accumulate to corresponding thread job queue. Higher the number, will mean tpbridge will start to collect some unprocessed messages, but better would be the pipeline for incoming/outgoing main threads and the thread pool workers. The default value is half of the THREAD_POOL_SIZE.
[-k CLOCKSYNC_ROUNDTRIP]
Maximum periodic clock sync message rountrip from local host to remote host nad back in milliseconds for accepting the remote hosts monontonic clock value for time adjustments. If roundtrip time for clock request is greater than this value, the response with remote hosts monotonic clock value is ignored. Default is 200.
[-K CLOCKSYNC_PERIOD]
Number of seconds to send the request from clock synchronization. Value 0 disables this functionality. Default value is 600. Checking is performed with the granularity of the CONNECTION_CHECK_SEC.

EXIT STATUS

0
Success
1
Failure

BUGS

Report bugs to support@mavimax.com

COPYING

© Mavimax, Ltd