Update: This article was featured on the java.net home page (you can look for it by date [08/15/2013] on https://home.java.net/archive/spotlight).
The ADEN project provides a high-performance, UDP-based alternative to conventional TCP sockets for:
1- Low data-volume applications.
2- Bulk transfer over a LAN.
An example of a low data-volume application is a client-server, request-reply-style application where messages are small(1). For low data-volume applications, ADEN offers an internet-friendly transmission mechanism(2) with much less complexity and overhead than TCP.
For bulk transfer over a LAN, ADEN provides a less congestion-sensitive transmission mechanism that allows bulk data to be transferred at high speed over a LAN (see performance test results).
The following article provides everything you need to know about ADEN. It also demonstrates an algorithm for secure transmission that can be implemented as a 'plugin' to ADEN. Note that some details are omitted for clarity. A complete description of the ADEN protocol can be found here. Note also that ADEN's source code ships with complete examples; this article does not reproduce any of that code.
ADEN is currently implemented over Java's DatagramChannels.
1. Overview
Exchanging small messages between pairs of computers over TCP streams is not as efficient as, for example, transferring large files. TCP is really useful when you want your application to send as much data as possible in as short a time as possible. To do that, TCP uses a sophisticated congestion control mechanism to prevent your application from congesting the network or making a congested network worse. This mechanism aims to keep the send rate optimal at all times.
TCP congestion control clearly brings little benefit to the network when peers exchange only a few segments (packets) per round-trip time. Programmers have another choice: UDP, a connectionless, unreliable protocol that does not provide any congestion avoidance mechanism. The programmer is free to implement a custom transmission mechanism. Fortunately, guidelines are available for designing efficient congestion control in low data-volume applications (see RFC 5405).
However, working with UDP directly via UDP sockets means the programmer has to deal with all of UDP's limitations: unreliability and the lack of a 'negotiated' connection are among the main issues. Because of that, programmers may prefer to use TCP sockets to save time and effort, at the expense of the overall efficiency of the system.
The ADEN project's basic goal is:
- To implement 'negotiated' connection/disconnection and reliable in-order message delivery on top of Java's DatagramChannels.
Note that I said 'message', not 'packet'. A message can be much larger than a single UDP packet.
2. ADEN connection
An ADEN connection is established using a two-way handshake and closed the same way, so a total of 4 IP packets are exchanged instead of the 7 packets needed with TCP.
To establish an ADEN connection, the client creates an ADEN socket, binds it to a local address/port, initializes it with the server address/port and sends a special message called BOF. After sending BOF, the ADEN socket is connected and can exchange application messages with the server. To be able to accept connection requests (i.e. receive BOFs), the server creates a special ADEN socket and binds it to a local address/port. By blocking on the 'accept' method, the server gets an ADEN socket that is used to receive the BOF message and exchange further messages with the client. A pair of ADEN sockets thus represents an ADEN connection. To close an ADEN connection, the client sends a special message called EOF.
The general criteria for BOF and EOF messages are:
1- BOF and EOF are the first and the last messages the client sends on an ADEN socket, respectively.
2- BOF should be small enough to fit in a single UDP datagram, and the same goes for EOF (this way, the ADEN protocol transfers each of these messages by passing only a single datagram in each direction).
3- BOF can be used to pass useful data. However, data in a BOF message should never be used to invoke a non-idempotent operation on the receiving end(3). Normally BOF would be used to pass application metadata (version, etc.).
4- EOF should not hold any data (besides the flag that indicates this is an EOF!). If an attempt to send EOF fails, the client can continue normal execution(4).
The server should not send to the client until it has received at least one more message after BOF, and that message is not EOF (from a logical perspective this makes sense, because the server normally waits for the client's request). This way, we avoid invalid connections created by receiving old BOFs.
All in bold above is MANDATORY.
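The server-side rule above can be pictured as a small state machine. The following is only a sketch under assumptions: the class, enum and method names are hypothetical and are not taken from ADEN's source code. It shows that the server may send only after a non-EOF message has followed the BOF.

```java
// Hypothetical model of the server-side rule; names are illustrative, not ADEN's API.
final class ServerConnectionState {
    enum MessageKind { BOF, DATA, EOF }
    enum State { AWAITING_BOF, BOF_RECEIVED, ESTABLISHED, CLOSED }

    private State state = State.AWAITING_BOF;

    State state() { return state; }

    /** The server may send only after a non-EOF message has followed the BOF. */
    boolean maySend() { return state == State.ESTABLISHED; }

    /** Advances the state for each message received from the client. */
    void onMessage(MessageKind kind) {
        switch (state) {
            case AWAITING_BOF:
                if (kind != MessageKind.BOF) {
                    throw new IllegalStateException("first message must be BOF");
                }
                state = State.BOF_RECEIVED;
                break;
            case BOF_RECEIVED:
            case ESTABLISHED:
                if (kind == MessageKind.EOF) {
                    state = State.CLOSED;          // EOF is always the last message
                } else if (kind == MessageKind.DATA) {
                    state = State.ESTABLISHED;     // a real request arrived: replies are allowed
                } else {
                    throw new IllegalStateException("BOF may only be the first message");
                }
                break;
            case CLOSED:
                throw new IllegalStateException("connection is closed");
        }
    }
}
```

A socket created by an old, retransmitted BOF never leaves the BOF_RECEIVED state in this model, so nothing is ever sent on it before it is closed as idle.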
After sending EOF, the client should close the ADEN socket (yes, an ADEN socket cannot be reused). Note that the client should do its best to send the EOF at the end of the session, e.g. by calling the EOF sending command in a 'finally' clause.
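The 'finally' pattern might look like the sketch below. ClientSocket is a hypothetical stand-in for an ADEN socket, and sendEof() and close() are assumed names; the real class and methods in ADEN's source code may differ.

```java
import java.io.IOException;

// Hypothetical stand-in for an ADEN socket; the real class and method names may differ.
interface ClientSocket {
    void sendEof() throws IOException;
    void close();
}

final class ClientSession {
    /** The application's work on an open connection. */
    interface Body {
        void run(ClientSocket socket) throws Exception;
    }

    /** Runs the session body, then always tries to send EOF and always closes the socket. */
    static void run(ClientSocket socket, Body body) throws Exception {
        try {
            body.run(socket);
        } finally {
            try {
                socket.sendEof();   // best effort only
            } catch (IOException ignored) {
                // rule 4: if sending EOF fails, the client continues normally
            } finally {
                socket.close();     // the socket is not reused after EOF
            }
        }
    }
}
```

Even when the session body throws, EOF is attempted and the socket is closed; a failed EOF does not mask the original exception.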
How are connections closed by the server?
- The server closes an ADEN connection simply by closing the ADEN socket, i.e. the server does not have to send EOF to the client.
The normal scenario is that the server closes the connection in response to one of these events:
- The server receives an EOF.
- The server does not receive any message on that connection for a long time. How long the server should wait before disposing of an idle connection is application-specific.
2.1 Connection lifetime
Connection lifetime can only be controlled by application-level means:
- The server uses read timeouts to determine when to abandon connections.
- The client uses read timeouts to determine when to give up and raise a 'server is not responding!'-style error message.
- The client may need to 'ping' the server periodically so that the server application does not close the connection (keep-alive signals).
- The client may need to 'ping' the server periodically to prevent connections through NATs from expiring.
2.2 Connection termination again
As mentioned earlier, normal termination of an ADEN connection involves one-way EOF message passing before the local resources associated with that connection are released on both sides, i.e. before the sockets are closed. However, it is not always possible to perform normal termination, because the server or the client may simply crash. ADEN does not provide special handling for such scenarios; the situation is left to resolve itself. If the server crashes, the client detects that the connection is lost when it tries to write a message: ADEN on the client side fails because it receives no acknowledgments (or because it receives a RESET signal from the server, if the server has restarted). If the client crashes, the server disposes of the socket normally, either because of a read timeout (unless you use a 0 timeout, causing your server to block indefinitely!) or when the server tries to write a message and ADEN fails because it receives no acknowledgments. If the client recovers and tries to reconnect, then, if the server has already closed the old socket, a new connection/socket is created normally. Otherwise the connection request causes the old socket to be closed automatically; if a server thread was reading or writing on that socket, it is interrupted with an appropriate error.
In short, keep the following in mind when using ADEN:
- Always use a positive, non-zero timeout for read operations on both client and server.
- The server should detect and close idle connections (the same rules apply when using TCP sockets!).
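With one thread per connection, a read timeout is enough to detect an idle peer. Where a server prefers central bookkeeping, idle detection could be sketched as below. This is an assumption-laden illustration: the class, the string connection IDs and the caller-supplied clock are all hypothetical and not part of ADEN.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping for idle detection; IDs and clock values are supplied by the caller.
final class IdleConnectionTracker {
    private final long idleTimeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    IdleConnectionTracker(long idleTimeoutMillis) {
        if (idleTimeoutMillis <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Call whenever an application message (including an application-level ping) arrives. */
    void touch(String connectionId, long nowMillis) {
        lastSeen.put(connectionId, nowMillis);
    }

    /** Removes and returns the connections that have been silent for longer than the timeout. */
    List<String> reapIdle(long nowMillis) {
        List<String> idle = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > idleTimeoutMillis) {
                idle.add(entry.getKey());
                it.remove();
            }
        }
        return idle;
    }
}
```

The server would close the socket of every connection returned by reapIdle(); the timeout value itself remains application-specific.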
2.3 Connection uniqueness
Without going into the details of the ADEN protocol (that is left to a separate article), ADEN sockets can connect to different hosts from the same local address/port. Incoming datagrams are always delivered to the correct socket (i.e. thread) within a JVM. Datagrams that arrive after the associated socket is closed are discarded, unless the datagram is a BOF and the local host is listening for connection requests on the local address/port on which the datagram was received. In that case a new socket is created and passed to the application. It is possible to receive an old BOF retransmitted by the network; however, we pretend that never happens, to avoid further complexity in the internal structure of ADEN. Recall the rule that the server must not send over a new socket until it receives a further message (see 2 above): the server will eventually close a socket on which it receives nothing (except a BOF!).
3. Sending messages over ADEN
A local application can send a message to the remote application by invoking the write() method of the ADEN socket. ADEN supports message segmentation: an application-level message passed to ADEN is either transmitted in a single datagram or fragmented into a number of datagrams that are sent independently. Which of the two happens depends on the size of the message (message size is not restricted by ADEN) and on the maximum UDP datagram payload that ADEN is configured to use.
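Segmentation as described above can be illustrated with a minimal sketch. MessageFragmenter and the maxPayload parameter are hypothetical names; ADEN's real datagram format (sequence numbers, flags and so on) is not shown here, and the chunks are assumed to arrive in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ADEN: splits a message into datagram-sized chunks.
final class MessageFragmenter {
    private MessageFragmenter() {}

    /** Splits the message into chunks of at most maxPayload bytes each. */
    static List<byte[]> fragment(byte[] message, int maxPayload) {
        if (maxPayload <= 0) {
            throw new IllegalArgumentException("maxPayload must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < message.length; offset += maxPayload) {
            int length = Math.min(maxPayload, message.length - offset);
            byte[] chunk = new byte[length];
            System.arraycopy(message, offset, chunk, 0, length);
            chunks.add(chunk);
        }
        if (chunks.isEmpty()) {
            chunks.add(new byte[0]); // an empty message still occupies one datagram
        }
        return chunks;
    }

    /** Concatenates chunks that are assumed to be complete and already in order. */
    static byte[] reassemble(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] message = new byte[total];
        int offset = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, message, offset, chunk.length);
            offset += chunk.length;
        }
        return message;
    }
}
```

For example, a 2500-byte message with a 1000-byte maximum payload yields three chunks of 1000, 1000 and 500 bytes, while a 10-byte message travels in a single datagram.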
ADEN performs reliable transmission: each transmitted datagram has to be acknowledged by another datagram. That is the normal case, but ADEN can be configured to send multiple datagrams and require the receiver to acknowledge them with a single datagram, which reduces the total number of synchronized acknowledgments exchanged. This feature is intended to provide high throughput on a LAN; it should not be used at all if you are using ADEN on the internet (more information about ADEN's protocol).
ADEN provides synchronous output operations, i.e. each send operation causes the sending thread to block until the message is received on the other end by, at least, the ADEN protocol. However, asynchronous output can be built over ADEN if needed.
Send operations in ADEN have two modes:
1- Asynchronous to the remote application (default)
In this mode the write() call returns when the message is received by the ADEN protocol on the other end, but read() has not necessarily been invoked by the remote application.
2- Synchronous mode
This mode allows a write() call at one end to synchronize with a read() call on the other end. When write() is called, read() should be called on the other end too; otherwise the send fails. This mode should be used to write the control messages (BOF and EOF) in order to synchronize connection initiation/termination (see ADEN connection in 2). Normally, application messages are sent in the asynchronous mode. Sending messages faster than they can be received at the application level causes overflow at the ADEN protocol level. Overflow is described in the following paragraph:
ADEN's receiver protocol allows incoming messages to be queued and received later by the application. If the maximum queue size is reached, new messages are ignored. The protocol resumes receiving new messages when room becomes available (i.e. when the application continues to receive messages). Ignoring messages to prevent overflow causes ADEN's sender protocol on the other side to perform retransmissions. Note that overflow on one side can cause a write() failure and unplanned connection termination on the other side.
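The overflow behaviour can be modelled in a few lines. The sketch below is a simplified, single-threaded model under assumptions: OverflowModel, the queue size and the retry limit are hypothetical, and real retransmissions are spaced out in time rather than issued back to back.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model of the overflow behaviour; queue size and retry limit are illustrative values.
final class OverflowModel {
    private final BlockingQueue<String> receiveQueue;

    OverflowModel(int maxQueuedMessages) {
        this.receiveQueue = new ArrayBlockingQueue<>(maxQueuedMessages);
    }

    /** Receiver side: true means queued and acknowledged, false means silently ignored. */
    boolean onIncoming(String message) {
        return receiveQueue.offer(message);
    }

    /** Application side: takes the next queued message, or null if none is queued. */
    String read() {
        return receiveQueue.poll();
    }

    /** Sender side: retransmits until acknowledged; returns attempts used, or -1 on failure. */
    int sendWithRetries(String message, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (onIncoming(message)) {
                return attempt;
            }
        }
        return -1; // corresponds to write() failing on the sending side
    }
}
```

With a queue of size 2 and an application that never reads, the third message exhausts its attempts and the send fails; once the application reads one message, the same send succeeds on the first attempt.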
If your application involves transmitting a file as a stream of messages, you need to ensure that the sender and receiver code are well orchestrated, i.e. that the overflow situation described above is unlikely to happen under normal execution. A good test is to run both the sending and receiving modules on the same machine(5). When running the tests, overflow can be detected by observing ADEN's retransmissions log(6).
Alternatively, you can use synchronous sending only, in which case overflow is not a concern.
4. Receiving messages over ADEN
A local application can receive a message from the remote application by invoking the read() method of the ADEN socket. Messages are received from the network at the ADEN protocol level and then buffered until they are received by the application. The maximum number of messages that can be queued per connection is configurable. ADEN deals with overflow by ignoring any new message; this causes the ADEN protocol at the sending end to initiate a retransmission after some time (it is forced to assume that datagrams were lost on the network). In the worst case, the sender application loses the connection with an error (the other end is not receiving!). Note that overflow is not possible when messages are sent in the synchronous mode only.
Read is a blocking method that allows timeouts: if no message is available yet in the underlying queue, the calling thread blocks until a message is available or the timeout occurs, whichever happens first. Note that forgetting to use a positive, non-zero timeout puts your application at risk of blocking indefinitely.
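The blocking-read-with-timeout pattern can be sketched with a standard BlockingQueue. TimedInbox is a hypothetical class and not ADEN's actual read() implementation; in particular, rejecting a zero timeout is a choice of this sketch, whereas the text above suggests that ADEN treats 0 as "block indefinitely".

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical inbox showing a blocking read with a timeout; not ADEN's actual read().
final class TimedInbox {
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

    /** Called by the (hypothetical) protocol layer when a complete message has arrived. */
    void deliver(byte[] message) {
        queue.add(message);
    }

    /** Blocks until a message is available or the timeout elapses, whichever comes first. */
    byte[] read(long timeoutMillis) throws InterruptedException, TimeoutException {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("use a positive, non-zero timeout");
        }
        byte[] message = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (message == null) {
            throw new TimeoutException("no message within " + timeoutMillis + " ms");
        }
        return message;
    }
}
```

The caller decides what a timeout means: a server would typically treat it as an idle connection and close the socket, while a client would report that the server is not responding.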
5. Probing ADEN connection
ADEN provides a ping() method that can be called any time after the connection is established. It helps the local application verify that the remote application has not disposed of the connection. This is done by sending a special message that is received and acknowledged by the ADEN protocol like an ordinary message and then dropped.
Usage:
Although it can be used by both client and server, ping is intended to be used by clients only. A client may call ping periodically while waiting for the server's response (mainly when the response requires lengthy computation by the server). Note that pings cannot cause overflow, because they are not messages receivable by the application.
Another use of ping() is to prevent UDP sessions across NATs from expiring when the communication is idle. However, ping cannot prevent the server application from closing the connection due to a read timeout; that can only be done via application-level 'pings'.
6. Exactly-once message delivery
Implementing an exactly-once message delivery model over ADEN messages can be described in the light of two applications:
1- Messages are passed in one direction
In this application the sender needs confirmation that the message is received, but does not expect useful data to be returned. All we need here is to send messages in synchronous mode (see 3). There is no need for application-level acknowledgements.
2- Messages are exchanged in a request-reply manner
Here the client sends a message (request) and receives another message in return (reply), which normally contains useful data needed by the client. There is no need to assign an id to the request if there is only one conversation at a time. Otherwise each request-reply pair should be uniquely identified. That can be done by assigning a sequence number (0, 1, 2, 3, ...) to the request; the server then uses the sequence number provided with the request as the id of the corresponding reply. Note that this sequence is not used to tell the server about the order of requests. It is only used to map requests to their replies.
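Mapping replies back to requests by sequence number could be sketched as follows. ReplyCorrelator is a hypothetical client-side helper; how the sequence number is carried inside the application message is left to the application and is not shown.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side bookkeeping; the sequence only maps replies to requests.
final class ReplyCorrelator {
    private final AtomicInteger nextSequence = new AtomicInteger(0);
    private final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Allocates the sequence for a new request and registers the future awaiting its reply. */
    int register(CompletableFuture<String> replyFuture) {
        int sequence = nextSequence.getAndIncrement();
        pending.put(sequence, replyFuture);
        return sequence;
    }

    /** Completes the matching request; returns false for unknown or already answered sequences. */
    boolean onReply(int sequence, String reply) {
        CompletableFuture<String> future = pending.remove(sequence);
        if (future == null) {
            return false;
        }
        future.complete(reply);
        return true;
    }
}
```

Replies may arrive in any order: a reply carrying sequence 1 completes the second request even while the first is still outstanding, and a duplicate reply for the same sequence is simply ignored.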
Constraints for implementing the exactly-once model:
- The client should not be programmed to resend requests:
The client must not resend requests. If no reply is received for a long time after sending a request, the client should consider the request lost. The client may also raise an error indicating that the server is not responding. Because the problem could be that the connection is dead, the client can be programmed to ping the connection as a final step before throwing an error. Note that if the connection is not dead, i.e. the ping was successful, then you probably have a bug in your server code that causes valid requests to be thrown away. Client behavior when a request is lost is application-specific and normally reflects the essence of the model. For example, a client sends a request to transfer money from one bank account to another, and the power goes off before a reply confirming the transaction is received. Later, when power is back, the client may first query the source account to check whether the transaction was carried out the first time, and request the transaction again only if it was not.
- The server should not be programmed to drop requests to prevent overflow:
For example, consider a service thread that reads messages from an ADEN socket and then puts them in a message queue to be handled by another thread. To avoid dropping messages when the queue is full, the service thread should not read from the ADEN socket unless the queue has room for more messages. However, the server can still be made to ignore invalid requests silently, i.e. without replying to the client with an error message.
7. Implementing secure communication over ADEN
Secure transmission can be implemented on top of ADEN. There is no need to modify the ADEN protocol or even understand how it works. In the following I describe a technique that uses asymmetric cryptography.
First, each end of an ADEN connection is required to have its own pair of cryptographic keys, known as the public key and the private key. Each end should also have the public key of the other end. How public keys are exchanged is out of our scope.
Before we go into details, there is something we should know first:
Each ADEN connection produces a unique 64-bit identifier on each side of the connection. These identifiers are used to identify the ADEN connection (details of the ADEN protocol will be discussed in a separate article). The IDs are exchanged during connection setup. They are accessible to the application and can be used to implement the security mechanism described here.
Each message should be composed of two parts: a header and a payload. The header contains two fields: the first is session_id, which is a 64-bit value (Long), and the second is message_sequence, which is a 32-bit value (Integer). The payload is the application-level message passed to the security layer to be protected and then passed to the next layer for transmission.
session_id: for the message sender it must be the remote id of the ADEN socket over which the message is to be sent, and for the receiver it must be the local id of the ADEN socket over which the message was received.
message_sequence: a counter that both ends keep during the session. It is initially 0 when the ADEN connection is established and is incremented by 1 for each message sent/received. The receiver should accept a message only if this field equals the current value of the counter.
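The header described above could be encoded as in the sketch below. The byte layout (header placed before the payload, big-endian fields) and the SecureHeader class are assumptions made for illustration; the article does not prescribe a wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical wire layout: 8-byte session_id, 4-byte message_sequence, then the payload.
final class SecureHeader {
    static final int HEADER_BYTES = Long.BYTES + Integer.BYTES;

    private SecureHeader() {}

    /** Sender side: prefixes the payload with session_id and message_sequence. */
    static byte[] wrap(long sessionId, int messageSequence, byte[] payload) {
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putLong(sessionId)
                .putInt(messageSequence)
                .put(payload)
                .array();
    }

    /** Receiver side: returns the payload if the header matches, or null if the message must be ignored. */
    static byte[] unwrap(byte[] message, long expectedSessionId, int expectedSequence) {
        if (message.length < HEADER_BYTES) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.wrap(message);
        if (buffer.getLong() != expectedSessionId) {
            return null;
        }
        if (buffer.getInt() != expectedSequence) {
            return null;
        }
        byte[] payload = new byte[buffer.remaining()];
        buffer.get(payload);
        return payload;
    }
}
```

The receiver passes its socket's local id and its current counter value as the expected fields, and increments the counter only after a message has been accepted, so a replayed message no longer matches.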
The algorithm:
Sender:
1- Receive a message from the upper layer, generate the header and append it to the message.
2- Using its own private key, sign the message, append the signature to the message, encrypt the result using the public key of the receiver, then send the encrypted result over the ADEN socket.
Receiver:
1- Receive the message from the ADEN socket, decrypt it using its own private key, strip the signature part from the decrypted message and verify the signature using the sender's public key. If the signature is correct, continue to the next step; otherwise ignore the message.
2- Check the message header. If it has the correct session_id (equal to the local id of the ADEN socket the message was received on) AND the correct message_sequence, strip the header and forward the message to the upper layer; otherwise ignore the message.
Note that the key used to encrypt the message is different from the key used to decrypt it. Alternatively, we can use a single shared secret key to both encrypt and decrypt the message, as follows:
The receiver first generates a secret key (using a symmetric-key algorithm such as AES), then sends it in a message to the sender (or vice versa) using the algorithm described above. For further messages, instead of using the public key to encrypt the message and the private key to decrypt it, we use the shared secret key to encrypt/decrypt the message; the rest remains the same.
Note that this security layer ensures that:
- Messages are not readable except by the receiver (confidentiality provided by encryption).
- Messages cannot be faked (authentication provided by the signature).
- An attacker cannot fool the receiver with duplicates (thanks to the message header).
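The sign-then-encrypt steps of the shared-key variant could be sketched with the standard Java cryptography API as below. The algorithm choices (SHA256withRSA signatures, AES in GCM mode), the plaintext layout and the SecureEnvelope class are assumptions of this sketch, not something the article or ADEN prescribes. The shared-key variant is used because RSA encryption with standard padding only accepts inputs smaller than the key size, which a message plus its signature would exceed. The input is expected to already carry the session_id/message_sequence header.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical security layer: RSA signature plus AES-GCM with a shared key (assumed choices).
final class SecureEnvelope {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;
    private static final SecureRandom RANDOM = new SecureRandom();

    private SecureEnvelope() {}

    /** Sender steps: sign the message, append the signature, encrypt the result. */
    static byte[] seal(byte[] headerAndPayload, PrivateKey senderKey, SecretKey shared)
            throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(senderKey);
        signer.update(headerAndPayload);
        byte[] signature = signer.sign();

        // Plaintext layout (an assumption): [4-byte body length][body][signature]
        byte[] plain = ByteBuffer.allocate(Integer.BYTES + headerAndPayload.length + signature.length)
                .putInt(headerAndPayload.length).put(headerAndPayload).put(signature).array();

        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv); // fresh random IV for every message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, shared, new GCMParameterSpec(TAG_BITS, iv));
        byte[] encrypted = cipher.doFinal(plain);
        return ByteBuffer.allocate(IV_BYTES + encrypted.length).put(iv).put(encrypted).array();
    }

    /** Receiver steps: decrypt, split off the signature, verify it; null means "ignore the message". */
    static byte[] open(byte[] sealed, PublicKey senderKey, SecretKey shared) {
        try {
            ByteBuffer in = ByteBuffer.wrap(sealed);
            byte[] iv = new byte[IV_BYTES];
            in.get(iv);
            byte[] encrypted = new byte[in.remaining()];
            in.get(encrypted);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, shared, new GCMParameterSpec(TAG_BITS, iv));
            ByteBuffer plain = ByteBuffer.wrap(cipher.doFinal(encrypted));

            byte[] body = new byte[plain.getInt()];
            plain.get(body);
            byte[] signature = new byte[plain.remaining()];
            plain.get(signature);

            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(senderKey);
            verifier.update(body);
            return verifier.verify(signature) ? body : null;
        } catch (GeneralSecurityException | RuntimeException e) {
            return null; // tampered, truncated or wrongly keyed messages are ignored
        }
    }
}
```

After open() returns a non-null body, the receiver would still check session_id and message_sequence before forwarding the payload to the upper layer.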
8. Building concurrent servers
Similar to Java's classical I/O, ADEN's I/O operations are blocking. The following technique is usually used with classical Java sockets:
A single main thread is used to accept connections, and each connection is then assigned to a separate thread.
This technique can be used to build concurrent servers using ADEN sockets. Probably one of the best ways to implement this mechanism is to use thread pools, so that threads can be reused instead of creating a new thread every time we have a new connection. Another good strategy is to ensure the server application uses a 'maximum limit' for the number of threads it may use. Instead of allowing the number of threads to grow blindly, the server application should use a threshold; once it is reached, the server simply closes new connections if all available threads are busy.
A full example of a concurrent server is shipped with the ADEN source code.
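Independently of the example shipped with ADEN, the bounded thread pool idea can be sketched with the standard java.util.concurrent classes. BoundedConnectionDispatcher is a hypothetical name, and the connection handler is represented by a plain Runnable because ADEN's socket API is not shown in this article.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: hands each accepted connection to a pooled thread, up to a hard limit.
final class BoundedConnectionDispatcher {
    private final ThreadPoolExecutor pool;

    BoundedConnectionDispatcher(int maxThreads) {
        // A SynchronousQueue holds no tasks, so execute() is rejected once all maxThreads are busy.
        this.pool = new ThreadPoolExecutor(
                0, maxThreads, 30, TimeUnit.SECONDS, new SynchronousQueue<>());
    }

    /** Returns true if a thread took the connection; false if the caller should close it. */
    boolean dispatch(Runnable connectionHandler) {
        try {
            pool.execute(connectionHandler);
            return true;
        } catch (RejectedExecutionException allThreadsBusy) {
            return false;
        }
    }

    /** Interrupts running handlers and stops the pool. */
    void shutdown() {
        pool.shutdownNow();
    }
}
```

The accepting thread would call dispatch() for every socket returned by 'accept' and close the socket immediately when dispatch() returns false, which implements the 'maximum limit' strategy described above.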
log:
(1) i.e. an application-level message either fits in a single UDP datagram or needs to be fragmented into a few datagrams only.
(2) ADEN follows the specifications and recommendations provided by the IETF regarding congestion control and retransmission scheduling.
(3) It is possible to re-receive a BOF on a new socket if the socket created by the server is closed before the connection is finished with the client (i.e. before the BOF is acknowledged), or if an old BOF is retransmitted by the network.
(4) There are two reasons. A logical reason: failing to send EOF has no effect on the application. And a technical reason: EOF could be received and acknowledged, but the acknowledgement could be lost on the network, and the server could have already disposed of the connection by the time EOF is retransmitted by the ADEN protocol.
(5) Messages are unlikely to be lost over the network in this case.
(6) Logging should be enabled first.