Wednesday, April 9, 2014

ADEN (2)




PREFACE

Passing small messages between distributed software modules can be far more efficient over UDP than over TCP (see the first part of this article). However, having to develop one's own network protocol makes UDP networking less attractive. As part of my self-study, I designed a network protocol on top of Java's UDP sockets and made it usable through a free, open-source class library. Its API is similar in many ways to TCP sockets and hides UDP's limitations, so request/reply conversations can be programmed in a simple, straightforward manner. In this article I continue the explanation of that work, which I named ADEN.
 
What is ADEN?
Depending on what you need it for, ADEN can be seen as either a lightweight alternative to TCP optimized for small messages or a reliable alternative to UDP.

A formal definition of ADEN is:
ADEN is a Java API that, using its own network protocol, allows Java applications running on separate hosts to exchange messages reliably over UDP. ADEN provides these TCP-like features: 'negotiated' connection/disconnection and reliable in-order delivery with no limit on message size.
Communication over ADEN is carried out through a request/reply mechanism in the client/server paradigm. The user can invoke one of two transmission methods: one is used when the request sender expects useful data from the request receiver; the other behaves like an ordinary request/reply transaction in which the reply is a dummy. The latter method is used when messages are sent in one direction only, and it prevents the application from sending messages faster than they are received.
You can use ADEN's protocol in any Java application; you only need to add ADEN's class library to it. The API is very similar to TCP sockets.

ADEN vs. TCP sockets:

-        ADEN's protocol is message-based, not stream-based.
-        ADEN's protocol is far more efficient on the network for applications where messages are small.
-        ADEN's protocol provides an exactly-once delivery model.
-        ADEN's protocol allows the application to test ('ping') the connection.
-        An ADEN connection is created with a two-way handshake and closed with a two-way handshake, so a total of 4 packets are exchanged. A TCP connection is established with a 3-way handshake and closed with a 4-way handshake, so a total of 7 packets are exchanged.
-        There is no TIME-WAIT, or any similar technique, in ADEN.

ADEN vs. UDP sockets:

ADEN's protocol provides:

-'Real' connections.
-A built-in reliable delivery system.
-Message fragmentation support.
-A built-in port multiplexing system that allows messages from different remote peers arriving at the same local address/port to be safely received by separate threads.

ADEN's protocol

ADEN's protocol can be described, with minimal detail, as follows:

Sender protocol

step1 - Send N packets.

step2 - Wait until the Nth packet is acknowledged or RTO seconds have passed, then go to step3.

step3 - If the Nth packet is acknowledged then halt, otherwise go to step4.

step4 - If there is an acknowledged packet then schedule the next packet in the sequence for retransmission, otherwise schedule the Nth packet for retransmission, then go to step5.

step5 - If the packet scheduled for retransmission has been retransmitted fewer than M times then retransmit it and go to step2, otherwise fail.

The above algorithm describes how a single 'burst' of N packets is transmitted reliably. An application-level message is transmitted by executing this protocol one or more times. Note that details are omitted for the sake of demonstration.
The default value of N (the burst size) is 1. If you modify N, remember that it should remain small (3 or 4).
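
The sender steps can be sketched in Java roughly as follows. This is only an illustration of the five steps, not ADEN's actual code: the class BurstSender, the Link interface and its two methods are hypothetical names invented here, and the real library certainly handles more details (timers, feedback parsing, sequence numbers).

```java
/** Hypothetical sketch of the sender steps; these are not ADEN's real classes. */
final class BurstSender {

    /** Hypothetical transport abstraction, assumed for this sketch only. */
    interface Link {
        /** Transmits the packet with the given 0-based index within the burst. */
        void send(int packetIndex);

        /**
         * Blocks for at most rtoMillis and returns the highest packet index
         * acknowledged so far, or -1 if nothing has been acknowledged.
         */
        int awaitHighestAck(long rtoMillis);
    }

    /** Returns true if the whole burst was acknowledged, false on failure. */
    static boolean sendBurst(Link link, int n, int maxRetransmits, long rtoMillis) {
        for (int i = 0; i < n; i++) {
            link.send(i);                                   // step1: send N packets
        }
        int[] retransmits = new int[n];                     // per-packet retransmission counters
        while (true) {
            int acked = link.awaitHighestAck(rtoMillis);    // step2: wait for ACK or RTO
            if (acked >= n - 1) {
                return true;                                // step3: Nth packet acknowledged
            }
            // step4: next packet after the last acknowledged one, else the Nth packet
            int target = (acked >= 0) ? acked + 1 : n - 1;
            if (retransmits[target] >= maxRetransmits) {
                return false;                               // step5: already retransmitted M times
            }
            retransmits[target]++;
            link.send(target);                              // step5: retransmit, back to step2
        }
    }
}
```

With N = 3 and M = 2, a link that never answers makes this sketch give up after 5 transmissions (3 initial packets plus 2 retransmissions of the Nth packet).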

Congestion Control

-Roughly speaking, congestion is controlled by varying the time between successive retransmissions. Note that retransmission scheduling is TCP-inspired (look in ADEN's source code for the class RoundTripTimer, which was borrowed from Esmond Pitt's book Fundamental Networking in Java. Although the technique explained in the book is totally different, it helped a lot in designing the technique used in ADEN).
  

Receiver Protocol

step1 - Wait until a packet is received.

step2 - If the received packet is not the expected packet go to step5, otherwise copy its contents to input_buffer.

step3 - If reorder_buffer contains a packet that was received out of order and is now the next expected packet, copy that packet's contents to input_buffer and repeat step3. Otherwise go to step4.

step4 - If the message is complete, put the contents of input_buffer into message_queue, then notify the user thread to receive the message. If the packet is a retransmission put a NACK in feedback_buffer, otherwise if an ACK is required by the received packet put an ACK in feedback_buffer. Go to step8.

step5 - If the sequence is less than expected go to step6, otherwise go to step7.

step6 - If the packet is a retransmission or an ACK is required then put an ACK in feedback_buffer. Go to step8.

step7 - If the sequence is invalid ignore the packet and go to step1, otherwise put the packet in reorder_buffer at its correct position; then, if the received packet is a retransmission or an ACK is required, put a NACK in feedback_buffer.

step8 - If feedback_buffer is not empty send it, then go to step1.

An ACK is a feedback packet sent by the Receiver to tell the Sender the Receiver's current position. The same is true of a NACK, which is used only when the Receiver suspects that a packet might be lost.
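
A rough Java sketch of the receiver steps is shown below. It is a simplification under several assumptions that are not taken from ADEN's source: the class and method names are hypothetical, every message has one fixed length given up front, the sequence number is a byte offset, and an 'invalid' sequence is taken to mean one that lies more than a configurable window ahead of the expected one.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.TreeMap;

/** Hypothetical sketch of the receiver steps; these are not ADEN's real classes. */
final class ReceiverSketch {
    enum Feedback { NONE, ACK, NACK }

    private final TreeMap<Long, byte[]> reorderBuffer = new TreeMap<>();
    private final ByteArrayOutputStream inputBuffer = new ByteArrayOutputStream();
    final Queue<byte[]> messageQueue = new ArrayDeque<>();
    private final int messageLength; // assumption: fixed length, known in advance
    private final long window;       // assumption: sequences beyond expected + window are invalid
    private long expected = 0;       // sequence of the next expected chunk

    ReceiverSketch(int messageLength, long window) {
        this.messageLength = messageLength;
        this.window = window;
    }

    long expected() { return expected; }

    /** Processes one received packet and returns the feedback to send (step8). */
    Feedback onPacket(long seq, byte[] payload, boolean retransmission, boolean ackRequired) {
        if (seq == expected) {                                   // step2
            append(payload);
            byte[] next;
            while ((next = reorderBuffer.remove(expected)) != null) {
                append(next);                                    // step3: drain reorder buffer
            }
            if (inputBuffer.size() >= messageLength) {           // step4: message complete
                messageQueue.add(inputBuffer.toByteArray());
                inputBuffer.reset();
            }
            if (retransmission) return Feedback.NACK;
            return ackRequired ? Feedback.ACK : Feedback.NONE;
        }
        if (seq < expected) {                                    // step5 -> step6: duplicate
            return (retransmission || ackRequired) ? Feedback.ACK : Feedback.NONE;
        }
        if (seq - expected > window) {                           // step7: invalid, ignore
            return Feedback.NONE;
        }
        reorderBuffer.putIfAbsent(seq, payload);                 // step7: keep for later
        return (retransmission || ackRequired) ? Feedback.NACK : Feedback.NONE;
    }

    private void append(byte[] payload) {
        inputBuffer.write(payload, 0, payload.length);
        expected += payload.length;  // sequence advances by the payload length
    }
}
```

In this sketch the value returned by expected() is what an ACK or NACK would carry back to the sender.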

The packet structure

The protocol data unit is called a 'packet' and is enclosed in a single UDP datagram.
A packet can be a data packet (referred to simply as a packet) or a feedback packet (referred to as feedback).
A packet may hold either one complete application-level message or one fragment of it.
N (see the sender protocol) is user-defined; the default is 1 (recommended). The protocol can be made to send 'bursts' of packets instead of one packet at a time by setting N>1 (not recommended).
The packet payload size is predefined by the user (the default is 467, which is 512 minus the IP, UDP and ADEN headers).
The packet (and feedback) header is 17 bytes. However, when ADEN transfers a message, the first transmitted packet also contains the message size, so this packet's payload is 4 bytes shorter than the configured payload length.
Note that the single data unit that the application can transmit or receive is called a 'message', not a packet; i.e. although message transmission consists of exchanging a number of datagrams over an unreliable service, a message cannot be partially sent or partially received.
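
The payload arithmetic above can be checked with a few lines of Java. The class and constant names are hypothetical, and the 20-byte IP header assumes IPv4 without options; the packet count simply follows from the first packet carrying 4 bytes less than the others.

```java
/** Hypothetical helper illustrating the payload arithmetic; not part of ADEN. */
final class PacketMath {
    static final int DATAGRAM_BUDGET = 512;
    static final int IPV4_HEADER = 20;  // assumption: IPv4 header without options
    static final int UDP_HEADER = 8;
    static final int ADEN_HEADER = 17;  // 1 (flag) + 8 (connection id) + 8 (sequence)
    static final int LENGTH_FIELD = 4;  // extra field in the first packet of a message

    /** 512 - 20 - 8 - 17 = 467 */
    static final int DEFAULT_PAYLOAD =
            DATAGRAM_BUDGET - IPV4_HEADER - UDP_HEADER - ADEN_HEADER;

    /** Number of packets needed for a message of the given length. */
    static int packetsFor(long messageLength, int payload) {
        long first = payload - LENGTH_FIELD;  // the first packet carries 4 bytes less
        if (messageLength <= first) {
            return 1;
        }
        long rest = messageLength - first;
        return (int) (1 + (rest + payload - 1) / payload);  // ceiling division for the rest
    }
}
```

For example, with the default payload a message of up to 463 bytes fits in one packet, while a 464-byte message needs two.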
The protocol header is composed of three fields, which are, in order:
-an 8-bit control flag used for various purposes (for example, indicating the content type of the datagram payload: packet or feedback).
-a 64-bit connection id, which is a unique identifier for the connection over which the message is passed and is obtained during connection setup (explained later).
-a 64-bit sequence_number.
A packet's sequence_number refers to the position in the sequence of the chunk of bytes in the packet payload.
A feedback always holds the sequence of the next expected chunk.

The packet that contains the first fragment of a message holds, besides the three fields mentioned above, an extra four-byte field in its header that holds the total length of that message.
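
The header layout described above could be encoded and decoded with java.nio.ByteBuffer as in the sketch below. The class and method names are hypothetical, the byte order (ByteBuffer's default, big-endian) is an assumption, and the meaning of the individual flag bits is not specified here, so the flag is passed through as an opaque byte.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the 17-byte header layout; not ADEN's real code. */
final class HeaderSketch {
    static final int HEADER_BYTES = 1 + 8 + 8;  // flag + connection id + sequence

    /** Header of an ordinary packet or feedback: flag, connection id, sequence. */
    static byte[] encode(byte flags, long connectionId, long sequence) {
        return ByteBuffer.allocate(HEADER_BYTES)
                .put(flags)
                .putLong(connectionId)
                .putLong(sequence)
                .array();
    }

    /** Header of the first fragment of a message: adds the 4-byte total length. */
    static byte[] encodeFirst(byte flags, long connectionId, long sequence, int messageLength) {
        return ByteBuffer.allocate(HEADER_BYTES + 4)
                .put(flags)
                .putLong(connectionId)
                .putLong(sequence)
                .putInt(messageLength)
                .array();
    }

    static byte flags(byte[] datagram)        { return datagram[0]; }
    static long connectionId(byte[] datagram) { return ByteBuffer.wrap(datagram).getLong(1); }
    static long sequence(byte[] datagram)     { return ByteBuffer.wrap(datagram).getLong(9); }

    /** Only meaningful for the first fragment of a message. */
    static int messageLength(byte[] datagram) { return ByteBuffer.wrap(datagram).getInt(17); }
}
```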

Note about sequence field of the packet/feedback:

The sequence is incremented by the payload length of the transmitted/received packet rather than by one. This way the sender can always retransmit bytes that the receiver's platform discarded from a packet payload (this is possible in Java, probably to prevent running out of memory when the memory available to the JVM is low).
 
Connection

The connection id in the header of an outgoing packet is obtained from the other end during connection setup, while the connection id in a feedback header is the one found in the header of the packet that triggered the feedback.
Connection setup can be explained in client/server terms:
The client generates a unique 64-bit id and sends it in a special message (the connection message) to the server. The server likewise generates a unique 64-bit id and sends it to the client in a special feedback. After this exchange each end has two connection ids for this connection, one generated locally and one generated remotely.
Note that the uniqueness of a generated connection id has to be maintained within the local host only.

Note that the connection message can contain application data and should not be larger than one packet.
The server should not send until it has received at least one message (after the connection message). This avoids incorrect connections resulting from an unsuccessful handshake.

Passing data in both directions

The sending end uses the remote id as the connection id in the packet header,
and the receiving end accepts a packet only if the connection id in its header equals the local id.
Note that the sequence number of the connection message is zero, and so is that of the first message from the server to the client (and no, in case you ask, the first message from the server cannot be mistaken for a connection message).
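
The id exchange and the acceptance rule can be illustrated with a small Java sketch. Everything in it is hypothetical (class name, method names, and the use of a per-process counter to generate ids); it only shows that each end stamps outgoing packets with the id it learned from the other end and accepts only packets that carry its own locally generated id.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the two connection ids held by each end; not ADEN's real code. */
final class ConnectionIds {
    // Uniqueness only has to hold within the local host, so a per-process
    // counter is enough for this sketch (an assumption, not ADEN's scheme).
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    final long localId = NEXT_ID.getAndIncrement();  // generated locally
    private long remoteId;                           // learned during connection setup
    private boolean established;

    /** Called when the other end's id arrives (connection message or special feedback). */
    void learnRemoteId(long id) {
        remoteId = id;
        established = true;
    }

    /** The connection id to put in the header of an outgoing packet. */
    long idForOutgoingPacket() {
        if (!established) {
            throw new IllegalStateException("connection setup not finished");
        }
        return remoteId;
    }

    /** A received packet is accepted only if its header carries our local id. */
    boolean accepts(long connectionIdInHeader) {
        return connectionIdInHeader == localId;
    }
}
```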

Connection termination

Connection termination is handled entirely in the application domain, i.e. the protocol does not provide a mechanism to negotiate connection termination. The application should synchronize connection termination as follows:
A 'clean' connection termination can be done by passing a special message in one direction at the end of communication (call it an EOF message if you want).
Who sends that message is application-specific, but normally it is the client (the connection initiator). After passing this message both ends can dispose of the connection object.

Note that the EOF message should not contain application data; the sender should normally tolerate an unsuccessful send of this message and continue normally.

Note that it cannot be guaranteed that the connection is always terminated this way; using a 'read' timeout at both ends of an ADEN connection is the only way to avoid blocking indefinitely on a dead connection.
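
The idea of a read timeout can be shown with a plain java.net.DatagramSocket. This sketch does not use ADEN's API (the helper class and method are hypothetical); it only demonstrates the general pattern of bounding a blocking receive so that a silent peer cannot block the thread forever.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketTimeoutException;

/** Hypothetical helper showing a bounded blocking read on a plain UDP socket. */
final class ReadTimeoutSketch {

    /** Returns true if a datagram arrived, false if nothing arrived within timeoutMillis. */
    static boolean receiveOrGiveUp(DatagramSocket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);  // receive() now blocks for at most this long
        DatagramPacket packet = new DatagramPacket(new byte[512], 512);
        try {
            socket.receive(packet);
            return true;
        } catch (SocketTimeoutException peerIsSilent) {
            return false;                    // the caller can treat the connection as dead
        }
    }
}
```

An application would typically dispose of its connection object when such a read gives up.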

Remarks:
You may have noticed that ADEN provides TCP functionality but with less overhead. This greatly improves the overall efficiency of the system when only small messages are exchanged. The same can be said when running on a LAN, because round-trip times are relatively small and packets are less likely to be lost there. Therefore ADEN can also be used to transmit large files within a LAN.