Name
tpbridge — Enduro/X Bridge Server.
Synopsis
tpbridge [OPTIONS]
DESCRIPTION
This is special ATMI server which is used to connect local ATMI instances
over the network. The result is network joined instances which makes
Enduro/X cluster.
Bridge process is used to exchange service lists between two nodes,
calculate monotonic clock diff (so that later for messages time can
be adjusted) between nodes and send XATMI IPC traffic between the machines.
To establish network connection, on one machine bridge must be in
passive mode (socket server) and on other machine it must be configured
in active mode (socket client). Active tpbridge periodically tries to connect
to the passive Enduro/X instance. If connection is dropped, active node
will re-try to connect. Single tpbridge process accepts only single
TCP connection. Between two Enduro/X instances only one link can be defined
where on each Enduro/X node there is tpbridge process configured accordingly.
Enduro/X node may have several tpbridge process definitions defined,
but each of these processes must define links for different Enduro/X
cluster nodes.
All data messages are prefixed with 4 byte message length indicator.
Meaning that the logical message can be split over the multiple packets or
within one packed multiple logical messages can be carried.
When connection is established, sequence of actions happens: 1)
clock difference between nodes and advertised service lists are exchanged. 2)
After initial data exchange (clock & tables) tpbridge is used to send
XATMI IPC over the network. I.e. tpcall(), tpforward(),
conversations, etc.
When connection is stopped. This is reported to ndrxd daemon which
removes services from shared memory accordingly.
tpbridge supports two network message formats. First format is native format
which sends over the network directly internal (C lang) structures. This format
will work faster, but cannot be used between different type of computers.
I.e. in this case it is not possible to mix for example x86_64 with x86. Or
x86 with RISC/ARM 32bit. If mixing is necessary, then use Enduro/X Network Protocol option,
activated by flag -f on both nodes. In this case standard common TLV data format is used
for data exchange between nodes. This might be slower than native format.
When using host name (-h) for resolving binding host or connection address,
tpbridge will resolve IP addresses. Multiple IP addresses for host name are
supported. The logic for using them is following:
In case of binding (server):
-
Query host name
-
Select first IP found in list
-
Try to bind to selected IP
-
If bind failed with EADDRINUSE or EADDRNOTAVAIL, select next IP, continue with 3.
-
If list of IP addresses are exhausted, start with 1.
In case of connecting to address (client):
-
Query host name
-
Select first IP found in list
-
Perform connect() to selected IP
-
If connect() asynchronously failed (after EINPROGRESS) with any error,
select next IP, continue with 3.
-
If list of IP addresses are exhausted, start with 1.
tpbridge to-network and from-network streams are separated by different threads.
Thus reading from XATMI queues is doing server main thread. And reading from
network is done by other thread.
BLOCKING CONDITION SOLVING
In case if IPC queues are full:
In case when messages are received from network, but local queues are blocked (full), e.g
services are slow to process such amount of incoming requests, tpbridge will try
to solve situation in following way:
-
In case if sending to XATMI IPC queue bridge gets EAGAIN error, message is
added to internal temporary queue. Queue size is set by -Q cli flag (or read
from NDRX_MSGMAX or defaulted to 100, see TEMPORARY_QUEUE_SIZE bellow.
When message is queued, special queue sender thread is being activated.
-
In case if queued message count is greater than
TEMPORARY_QUEUE_SIZE, then incoming net-in thread is blocked until
queue sender thread will send some messages and queued message count will be less than
TEMPORARY_QUEUE_SIZE.
-
In case if queue action (-A) is set to 2 and new message arrives for
single XATMI IPC queue (e.g. service queue, reply queue, etc.) which is full,
the message is discarded.
-
In case if queue action (-A) is set to 3 and new message arrives for
single XATMI IPC queue which is full or message arrives for other queue
but overall limit is exceeded of temporary enqueued messages,
then message is discarded.
-
In case if bridge was solving conditions with blocking and is blocked,
once every second net-in thread is unblocked, to process
any incoming messages from socket.
Queue sender Logic:
-
Enqueued messages are grouped by single XATMI IPC queue names.
-
Each enqueued message get scheduled next attempt run. Where minimum wait
time is set by -m parameter.
-
On every attempt unsuccessful send attempt, next attempt time is multiplied by 2.
-
If single XATMI IPC queue attempt was successful, the next message of the
same queue is tried to send immediately, if it fails, the next attempt for the
same queue is scheduled and queue runner switches to next different single XATMI
IPC queue (if any).
-
If attempts are reached or TTL expired (if configured), message is discarded.
-
In case if message by it self is timed (i.e. tpcall() without TPNOTIME flag)
and it expire, the message is discarded.
-
In case if queue destination queue is broken (attempt did no end with EAGAIN or
EINTER), the message is discarded.
In case if network socket is full:
-
In case if outgoing socket is being blocked (full), tpbridge stop process
outgoing traffic is processed - e.g. "@TPBRIDGEXXX" queue will not be read.
-
To respond to ndrxd(5) pings, bridge lets to process
one outgoing message per second (i.e. read Enduro/X XATMI server queues).
Message discard strategy
If message is service call and client is waiting for answer, server
error TPESVCERR is returned to caller.
All discarded messages are logged with error level 3 (warning) to the bridge
logs.
ULOG contains entry "Discarding message" for every discarded message.
CLOCK SYNC
As Enduro/X mostly all time elements (timeouts, etc) accounts in local Monotonic
time, the time correction (adjustment from remote Monotonic clock to local) is
required when XATMI IPC message is received from remote node. Bridge process
uses special messages to exchange the clock information between the nodes.
When connection is established, each node sends to other node it’s
local Monotonic time. This information is used for time correction between the nodes.
However over the time the Monotonic clocks of connected hosts may drift away from
the difference measured at connection startup. The messages
received or sent from one node might look like expired on other node.
To solve this issue, tpbridge periodically sends dynamic clock exchange messages
between nodes, in synchronous fashion. The round trip time is measured (just like a ping time)
and if it is within acceptable boundaries, the time from other host is accepted
and time correction value is updated. The max rountrip time is set by -k
flag (default 200ms), and interval is set by -K flag (600 sec).
To monitor the clock status, TM_MIB class T_BRCON can be used
for this purpose, e.g. "$ xadmin -c T_BRCON" call.
OPTIONS
-
-n NODE_ID
-
Other Enduro/X instance’s Node ID. Numerical 1..32.
-
[-r]
-
Send Refresh messages to other node. If not set, other node will
not see our’s node’s services. OPTIONAL flag.
-
-t MODE
-
MODE can be P for passive/TCPIP server mode, any other (e.g. A)
will be client mode.
-
-i IP_ADDRESS
-
In Active mode it is IP address to connect to. In passive mode it is
binding/listen address.
-
-h HOST_NAME
-
Binding/connection IP Address may be resolved from host name set in -h parameter.
Host name is resolved by OS, DNS queries, etc. tpbridge shall be started with
-i or with -h, if both flags will be set, error will be generated.
-
-6
-
If set, then IPv6 addresses will be used. By default tpbridge operates with
IPv4 addresses.
-
-p PORT_NUMBER
-
In active mode PORT_NUMBER is port to connect to. In passive mode it is
port on which to listen for connection.
-
-T TIME_OUT_SEC
-
Parameter indicates time-out value for packet receive in seconds. This is
socket option. Receive is initiate when it either there is poll even on socket
or incomplete logical message is received and then next recv() is called.
If the message part is not received in time, then socket is closed and connection
is restarted. This parameter also is used in case if target socket to which msg
is being sent is full for this given time period. If msg is not fully sent
and time out is reached, the connection is restarted, outgoing msg is being dropped.
-
[-b BACKLOG_NR]
-
Number of backlog entries. This is server’s (passive mode) connection queue, before
server accepts connection. OPTIONAL parameter. Default value is 100. But
could be set to something like 5.
-
[-c CONNECTION_CHECK_SEC]
-
Connection check interval in seconds. OPTIONAL parameter. Default value 5.
-
[-z PERIODIC_ZERO_SEND_SEC]
-
Interval in seconds between which zero length message is wrote to socket.
This is useful to keep the connection option over the firewalls, etc.
OPTIONAL parameter. Default value 0 (Do not send).
-
[-a INCOMING_RECV_ACTIVITY_SEC]
-
If set, then this is maximum time into which some packet from network must be
received. If no receive activity on socket is done, the connection is reset.
The 0 value disables this functionality. The default value is -z
multiplied by 2. Note that checks are performed with -c interval.
intervals. Usually this is used with -z, so that it is guaranteed that during
that there will be any traffic.
-
[-f]
-
Use Enduro/X Standard Network TLV Protocol instead of native data structures
for sending data over the network. This also ensure some backwards compatibility
between Enduro/X versions. But cases for backwards compatibility must be checked
individually.
-
[-P THREAD_POOL_SIZE]
-
This is number of worker threads for sending and receiving messages
for/to network. 50% of the threads are used for upload and other 50% are
used for network download. Thus number is divided by 2 and two thread pools
are created. If divided value is less than 1, then default is used.
The default size is 4.
-
[-R QUEUE_RETRIES]
-
Number of attempts to send message to local queue, if on pervious attempt queue
was full. The first attempt is done in real time, any further (if this flag allows)
are performed with calculated frequency of: nr_messages_failed_to_send - nr_messages_sent
in milliseconds. Default value is 999999. To disable temporary queue, set value
to 0.
-
[-A TEMPORARY_QUEUE_ACTION]
-
This value indicates the action how tpbridge shall process the cases when temporary
queue space for unsent / blocking messages are full. Values are following:
value 1 - if global temp queue is full (-Q param) - block the
bridge / stop incoming traffic, if single XATMI IPC queue is full (-q) -
ignore the condition (i.e. let to fill till the -Q limit). Value 2 - if
global temp queue full - block, if single XATMI IPC queue is full -
discarded the message. Value 3 - if global temp queue is full - discarded the message,
if single XATMI IPC queue is full - discarded the message. Default is 1.
-
[-Q TEMPORARY_QUEUE_SIZE]
-
This is number of messages that tpbridge can accumulate in case if message is
received from network and destination queue is full (e.g. service call queue, reply queue, etc).
If this parameter is not set, then value uses NDRX_MSGMAX environment variable setting.
If env variable is not available, then value is defaulted to 100. The value
of temporary queue size is preferred (and not string) as due to parallel processing
conditions, the number of messages in queue might go over this number until
the bridge is locked.
-
[-q TEMPORARY_QUEUE_SIZE_DEST]
-
This is max number of messages tpbridge can accumulate
for single XATMI IPC queue which is full/blocking.
This parameter is used in case if queue action (-A) is configure to drop messages,
if single XATMI IPC queue temporary space is full.
-
[-L TEMPORARY_QUEUE_TTL]
-
This is number of milliseconds for messages to live in temporary queue. Default
value is NDRX_TOUT env setting converted to millisecond.
-
[-m TEMPORARY_QUEUE_MINSLEEP]
-
This is minimum number milliseconds to wait after which schedule next attempt
of message sending to the XATMI IPC queue. The default is 40.
-
[-M TEMPORARY_QUEUE_MAXSLEEP]
-
This is maximum number milliseconds to wait after which schedule next attempt
of message sending to the XATMI IPC queue. The default is 150.
-
[-B THREADPOOL_BUFFER_SIZE]
-
This is number of messages that either net-out or net-in threads can accumulate
to corresponding thread job queue. Higher the number, will mean tpbridge will
start to collect some unprocessed messages, but better would be the pipeline
for incoming/outgoing main threads and the thread pool workers. The default
value is half of the THREAD_POOL_SIZE.
-
[-k CLOCKSYNC_ROUNDTRIP]
-
Maximum periodic clock sync message rountrip from local host to remote host nad
back in milliseconds for accepting the remote hosts monontonic clock value
for time adjustments. If roundtrip time for clock request is greater than this
value, the response with remote hosts monotonic clock value is ignored.
Default is 200.
-
[-K CLOCKSYNC_PERIOD]
-
Number of seconds to send the request from clock synchronization. Value 0 disables
this functionality. Default value is 600. Checking is performed with
the granularity of the CONNECTION_CHECK_SEC.