Friday, August 9, 2013

ADEN

update: This article was featured on java.net home page (you can look for it by date [08/15/2013] on https://home.java.net/archive/spotlight). 

ADEN project provides a high-performance, UDP-based alternative to conventional TCP sockets for :-

1- Low data-volume applications.

2- Bulk transfer over a LAN.

An example of low data-volume applications is client-server, request-reply-style applications where messages are small. For low data-volume applications, ADEN offers an internet-friendly transmission mechanism but with much less complexity/overhead than TCP. 
For bulk transfer over a LAN, ADEN provides a less congestion-sensitive transmission mechanism that allows to transfer bulk data in high speed over a LAN (see performance test results).

The following article provides everything you need to know about ADEN. It also demonstrates an algorithm for secure transmission that can be implemented as a 'plugin' to ADEN. Note that some details are omitted for clarity. Complete description of ADEN protocol can be found here. Note also that ADEN's source code is shipped with complete examples. This article does not provide any code snippets.

ADEN is currently implemented over Java's DatagramChannels.

1.Overview

Exchanging small messages between pairs of computers over TCP streams is not highly efficient as -for example- transferring large files. TCP is really useful when you want your application to send as much as possible data in as short a time as possible. to do that TCP uses sophisticated congestion control mechanism to prevent your application from congesting the network or causing a congested network to become worst. This mechanism ensures optimal send rate at all time.
It is clear that TCP congestion control has little benefit on the network when peers exchange only a few segments (packets) per round-trip time. Programmers has another choice, to use UDP,  a connection-less unreliable protocol that does not provide any congestion avoidance mechanism. Programmer has the freedom to implement a custom transmission mechanism. Fortunately, guidelines are available for designing efficient congestion control in low data-volume applications (see RFC5405).
However, working with UDP directly via UDP sockets means programmer has to deal with all UDP limitations: unreliability and the lack of 'negotiated' connection are among the main issues. Because of that programmers may prefer to use TCP sockets to save time and effort, at the expense of the overall efficiency of the system.

ADEN's project basic goal is:
- To implement 'negotiated' connection/disconnection and reliable in-order message delivery
on top of Java's DatagramChannels.
Note that I said 'message' not 'packet'. A message can be much larger than a single UDP packet.

2.ADEN connection

ADEN connection is established using a two-way handshake and closed the same way. So a total of 4 IP packets are exchanged instead of 7 packets as with TCP.
To establish an ADEN connection, client creates an ADEN socket, bounds it to a local address/port, initiates it with the server address/port and sends a special message called BOF. After sending BOF the ADEN socket is now connected and can exchange application messages with the server. To be able to accept connection requests (i.e. receive BOFs), server creates a special ADEN socket, bounds it to a local address/port. And by blocking on 'accept' method, the server gets a an ADEN socket that is used to receive the BOF message and exchange further messages with the client. So a pair of ADEN sockets now represents an ADEN connection. To close ADEN connection the client sends a special message called EOF.

The general criteria for BOF & EOF messages is:

 1-BOF and EOF are the first and the last message the client send on an ADEN socket respectively.
2- BOF should be small enough to be put in a single UDP datagram -the same for EOF (this way, ADEN protocol will transfer each message by passing only a single datagram on each direction).
3- BOF can be used to pass useful data. however data in a BOF message should never be used to invoke a non-idempotent operation on the receiving end(3). Normally BOF would be used to pass application metadata (version...etc ).
4-EOF should not hold any data (besides the flag that indicates this is an EOF !). If attempt to send EOF failed the client can continue normal execution(4).

The server should not send to the client unless it receives at least one more message after BOF  and that message is not EOF (from logical perspective- that make sense because the server normally waits for client's request). This way, we will avoid invalid connections created by receiving old BOFs.

 All in bold above is MANDATORY .
 After sending EOF the client should close the ADEN socket (Yes, ADEN socket can not be reused). Note that the client should be made to do 'its best' to send the EOF at the end of the session e.g. calling the EOF sending command in a 'finally' clause.

How connections are closed by the server?

-the server closes  ADEN connection by closing the ADEN socket only i.e server does not have to send EOF to the client.
The normal scenario is that the server will close the connection in response to one of these events:
-server receives an EOF.
-server does not receive any message for long time on that connection.For how long server should wait before disposing an idle connection is application-specific.

2.1 Connection lifetime

connection lifetime can only be controlled by application-level means:
-server uses read timeouts to determine when to abandon connections.
-client uses read timeouts to determine when to 'give up' and rise a 'server is not responding!'-like error message.
-client may need to 'ping' the server  periodically so the server application does not close the connection (keep-alive signals).
-client may need to 'ping' the server periodically to prevent connections through NATs from getting 'expired' .

2.2 Connection termination again

as we mentioned earlier, normal termination of an ADEN connection involves one-way EOF message passing before local resources associated with that connection are released on both sides i.e. closing the sockets. However, it's not always possible to perform normal termination because server or client may just crash!. ADEN does not provide special handling for such possible scenarios, and it is let to fix it self. If the server crashed the client will detect that the connection is lost when it tries to write a message, ADEN on the client side will fail due not receiving acknowledgments (or due to receiving a RESET signal from the server if the server restarted). If the client crashed server will dispose the socket normally due to read timeout (unless you use 0 timeout causing your server to block indefinitely!) or when the server tries to write a message causing ADEN to fail due not receiving acknowledgments. If the client recovers and try to re-connect then if the server has already closed the old socket a new connection/socket will be created normally, otherwise the connection request will cause the old socket to be closed automatically, if the server thread was reading or writing on that socket it will be interrupted with an appropriate error.

In short, the following should be kept in mind when using ADEN:
-Always use positive non-zero timeout for read operation on both client and server.
-Server should detect and close idle connections. (the same rules applies when using TCP sockets!)

2.3 Connection uniqueness

Without going into details of ADEN protocol (that is left to a separate article), ADEN  sockets can connect different hosts from the same local address/port. Incoming datagrams are always delivered to the correct socket (i.e thread) within a JVM. Datagrams that arrive after closing the associated socket are discarded, unless that datagram is a BOF and the local host is listening for connection requests on the local address/port on which the datagram was received. In that case a new socket will be created and passed to the application. It is possible to receive an old BOF retransmitted by the network, however, we pretend that never happens, to avoid further complexity in the internal structure of ADEN. Recall the rule that says server must not send over a new socket unless it receives further message (see 2. above), server will eventually close a socket when not receiving anything from it (except a BOF!).


3. Sending messages over ADEN

local application can send a message to the remote application by invoking write() method of ADEN socket. ADEN supports message segmentation, An application level message passed to ADEN is either transmitted in a single datagram  or fragmented into a number of datagrams sent independently , However that depends on the size of the message (message size is not restricted by ADEN ) and the maximum payload for UDP datagram that ADEN is configured to use.
ADEN performs reliable transmission;  each transmitted datagram has to be acknowledged by another datagram, that is the normal case but ADEN can be configured to send multiple datagrams  and require receiver to acknowledge them by a single datagram, and consequently reduce the total number of synchronized acknowledgments exchanged. This feature is intended to provide high throughput on a LAN, it should not be used at all if you are using ADEN on the internet (more information about ADEN's protocol).
ADEN provides synchronous output operations, i.e. each send operation will cause the sending thread to block until the message is received on the other end by -at least- the ADEN protocol . However, asynchronous output can be built over ADEN if needed.

Send operation in ADEN have two modes:

1- asynchronous to remote application (default)

In this mode write() call returns when the message is received by the ADEN protocol on the other end but read() is not necessarily invoked by the remote application.

2-synchronous mode

This mode allows write() call at one end to synchronize with read() call on the other end. When write() is called read() should be called on the other end too, otherwise send will fail. This mode should be used to write the control messages (BOF and EOF message)  in order to synchronize connection initiation/termination (see ADEN connections in 2). Normally, application messages are sent in the asynchronous mode. Sending messages faster than they could be received at application-level will cause overflow at ADEN protocol level. Overflow is described in the following paragraph:

ADEN's receiver protocol allows incoming messages to be queued and received later by the application, if the maximum queue size is reached new messages will be ignored. The protocol will continue to receive new messages when the room become available (i.e. when the application continue to receive messages). Ignoring messages to prevent overflow will cause ADEN's sender protocol on the other side to perform re-transmissions. Note that overflow on one side can cause write() failure and unplanned connection termination on the other side.

If your application involves transmitting a file as a stream of messages then you need to ensure sender and receiver codes are well-orchestrated i.e. the overflow situation described above is unlikely to happen under normal execution. A  clever test would be running both sending and receiving modules on the same machine(5). When running the tests  overflow can be detected by observing ADEN's retransmissions log(6).

Alternatively you can use synchronous sending only and in this case overflow would not be a concern.

4. Receiving messages over ADEN

local application can receive a message from the remote application by invoking read() method of ADEN socket. Messages are received from the network at the ADEN protocol level, messages are then buffered until they are received by the application. The maximum number of  messages that can be queued per connection is configurable, ADEN deals with overflow by ignoring any new message, this will cause ADEN protocol -at the sending end- to initiate a retransmission after some time (it is forced to  assume datagrams are lost on the network).  At worst case the sender application will lose the connection with an error (the other end is not receiving!). Note that overflow is not possible when sending messages in the synchronous mode only.
Read is a blocking method that allows timeouts: if no messages are yet available in the underlying queue the caller thread will block  until a message is available or timeout occur whichever happen first.  Note that forgetting to use non-zero positive timeouts will put your application at the risk of blocking indefinitely.


5. Probing ADEN connection

ADEN provides ping() method that can be called any time after establishing the connection. It helps the local application to verify that the remote application did not dispose the connection; that is done by sending a special message received and acknowledged by ADEN protocol like an ordinary message and then dropped.

Usage:
Although it can be used by both client and server, ping is intended to be used by clients only. Client may call ping periodically while waiting for server response (mainly when the responses requires lengthy computation by the server). Note that pings can not cause overflow because they are not messages receivable by the application.
Another use of ping() is to prevent UDP sessions across NATs from expiring when the communication is idle.  However, ping cannot prevent the server application from closing the connection due to read timeout, that can only be done via application-level 'pings' .

6. Exactly-once message delivery

Implementing exactly-once message delivery model over ADEN messages can be described in the light of two applications:

1-messages are passed in one direction
In this application sender needs confirmation that the message is received, but does not expect useful data to be returned. What we need here is to send messages in synchronous mode(see 3).  No need for application-level acknowledgements .

2-messages are exchanged in request-reply manner
here the client sends the message (request) and receives another message in return (reply)  which normally contains useful data needed by the client. No need to assign an id for the request if there is only one conversation a time . otherwise each request-reply should be uniquely identified. That can be done by assigning a sequence (0,1,2,3,....)  to the request, the server then uses the sequence provided with the request as an id for the corresponding reply. Note that this sequence is not used to tell the server about requests order. It is only used to map requests to their replies.

Constraints for implementing exactly-once model:

-Client should not be programmed to resend requests:
Client must not resend requests. After sending a request if no reply is received for long time the client should consider the request lost. Client may also rise an error indicating that the server is not responding. Because the problem could be that the connection is dead client can be programmed to ping the connection as a final step before throwing an error. Note that if the connection is not dead i.e. ping was successful then you probably have a bug in your server code causing valid requests to be thrown away. Client behavior when a request is lost is application-specific and -normally- reflects the model essence. For example a client sends a request to transfer money from one bank account to another and the power went-off before a reply is received confirming the transaction was successful. Later when power is back the client may first query the source account to ensure that the transaction was not  carried out at the first time, if that is true then request the transaction again.

-Server should not be programmed to drop requests to prevent  overflow:
for example a service thread that reads messages from ADEN socket and then put them in a message queue to be handled by another thread, in order to avoid dropping messages when the queue is full the service thread should not read from ADEN socket unless the queue has room for more messages. However, server can still be made to ignore invalid requests silently i.e. without replying an error message to the client.

7. Implementing secure communication over ADEN

Secure transmission can be implemented on top of ADEN. No need to modify ADEN protocol or even understand how it works. In the following I describe a technique by using asymmetric cryptography.
First it is required that each end of an ADEN connection to have his own pair of cryptographic keys known as public key and private key. Each end should also have the public key of the other end. How public keys are exchanged is out of our scope.
Before we go into details there is something we should know first:
Each ADEN's connection produces a 64-bit unique identifier on each side of the connection. They are used to identify the  ADEN connection (details about ADEN protocol will be discussed in a separate article ). These IDs are exchanged during connection setup . They are accessible by the application and can be used to implement the security mechanism described here.

Each message should be composed of two parts:
header and payload.  Header contains two fields the first is session_id which is a 64-bit value (Long)and the second is message_sequence which is a 32-bit value (Integer) . The payload is the application-level message passed to the security layer to be protected and then passed to the next layer for transmission.

 session_id: for message sender it must be the remote id of the ADEN socket over which the message is to be sent, and for receiver it must be the local id of the ADEN socket over which the message was received.

message_sequence: a counter that both ends keep during the session. initially 0 when ADEN connection is established and incremented by 1 for each message sent/received. Receiver should accept a message only if this field equals the current value of the counter.

The Algorithm:

Sender:

1- Receive a message from upper layer, generate header and append it to the message.

2-Using own private key sign the message, append the signature to the message and encrypt the result using public key of the receiver then send the encrypted result over the ADEN socket.

Receiver:

1-Receive the message from ADEN socket, decrypt the message using own private key, strip the signature part from the decrypted message and verify signature using sender public key. If the signature is correct continue to the next step otherwise ignore the message.

2- Check the message header, If it has correct session_id (equals the local id of the ADEN socket the message was received on) AND correct message_sequence strip the header and forward the message to the upper layer, otherwise ignore the message.

Note that the key we are using to encrypt the message is different form the key used to decrypt the message. Alternatively, we can use a single key or a shared secret key to encrypt and decrypt the message as follows:
Receiver should first generate a secret key (using a symmetric-key algorithm such as AES) then send it in a message to the sender (or vise-versa) using the algorithm described above. For further messages; instead of using public key to encrypt the message and private key to decrypt it; we use the shared secret key to encrypt/decrypt the message, the rest remains the same.

Note that using this security layer we will ensure that:
-messages are not readable except by the receiver (confidentiality provided by encryption) .
-messages cannot be faked (authentication provided by signature).
-An attacker can not fool receiver with duplicates (thanks to the message header).

8. Building concurrent servers

similar to Java's classical I/O  ADEN's  I/O are blocking operations.
 the following technique is usually used with  classical Java sockets :
A single main thread is used to accept connections and each connection is then assigned to a separate thread. This technique can be used to build concurrent servers using ADEN sockets. Probably on of the best ways to implement this mechanism is to use thread pools so threads can be reused instead of creating a new thread every time we have a new connection. Another good strategy is to ensure the server application uses a 'maximum limit' for the number of threads it may use.  so instead of allowing the number of threads to grow blindly the server application should use a threshold that is when reached  the server simply closes new connections if all available threads are busy.

A full example of a concurrent sever is shipped with ADEN source code.

log:
(1) i.e an application-level message is either fits in a single UDP datagram or needs to be fragmented into a few datagrams only.
 
(2)ADEN follows specifications and recommendations provided by IETF regarding congestion control and retransmissions scheduling.
(3)It is possible to re-receive a BOF on a new socket  if the socket created by the server is closed before the connection is finished with the client (i.e. before BOF is acknowledged). Also if an old BOF is re-transmitted by the network.

(4)There are two reasons - a logical reason : failing to send EOF has no effect on the application.  And  technical reason:  EOF could be received and acknowledged but the acknowledgement could be lost on the network and the server could have already disposed the connection when EOF is retransmitted  by ADEN protocol. 

(5) messages are unlikely to be lost over network in this case.

(6)log should be enabled first.