Next: 9 Conclusion Up: Unix Communication Facilities Previous: 7 Performance

8 Making Unreliable Communication Reliable

This chapter
describes how to implement reliable send()/receive() functions based on unreliable UDP datagrams
shows how these functions have to be designed in terms of
lost messages
corrupt messages
duplicate messages
messages out of order

The UDP performance measurements in the previous chapter have demonstrated that UDP is not reliable even on a local area network. How can communication then made reliable? This chapter gives an overview about what is necessary in principle to archive reliable delivered messages if an unreliable message protocol like UDP is used. This knowledge is then used to implement ``safe'' functions. For reliable messages ideally the following conditions have to be met:

messages are delivered in the order they are sent
every message sent is delivered exactly once
an acknowledgment is returned for each delivered message

The assumed failure model is:

communication failures: functioning processes are only temporarily unable to communicate (due to networking problems like lost messages or corruption of messages)
node crashes: A node either works or crashes (stops working); a crashed node eventually recovers

The second and third goals above are hard to achieve in the presence of crashes and unreliable networks. In particular, it is not possible to achieve them without making changes to stable storage that survives crashes. Details about this can be found in [Mull93, page 251f] or other introduction books on distributed systems.

For the following it is assumed that node crashes do not happen, as otherwise the system gets really complicated. An example for a system that covers all goals is the Arjuna system, developed by the University of Newcastle upon Tyne.

The following problems can arise while two processes are communicating through an unreliable network: Messages from one computer to another are:

lost
corrupted
delivered more than once
delivered out of order

It is assumed that normally the receiver of a data message sends a reply message that it got a message. So a normal ``save send'' looks like this:

sender: sends a data message and waits for the acknowledgment message
receiver: receives a data message and sends acknowledgment to sender

8.1 Lost Messages

If the underlying message transport system is unreliable, messages can get lost. Without some caution, this could lead to the sender of a message waiting forever for two reasons:

the data message from the sender to the destination (receiver) is lost
the acknowledgment message from the receiver to the sender is lost

The response to lost messages is that the sender of a message sets a timer (after sending the message); so the sender never blocks for ever. If a timeout occurs, the sender does not know why it did not get a reply. The sender can either

give up
try again (send the message again)

If the sender gives up, it has no guarantee that the receiver hasn't got the message (and processed it). If the sender tries again, it just re-sends the message to the destination.

8.2 Corrupt Messages

Every message should contain a checksum, e.g. the popular CRC (Cyclic Redundancy Checksum), which detects with a very high probability whether a message was altered during transmission. Most protocol suites and applications just discard all corrupt messages. It is assumed that the other side timeouts and retransmits the message. For the sender and receiver a corrupt message is handled like a lost message.

8.3 Duplicate Messages

Either through the underlying network, or timeouts from the sender (through lost acknowledgment messages) the receiver of a message can get a message more than once. The normal approach to detect this is to introduce sequence numbers. If the communication is synchronous, one sequence number suffices, otherwise two different sequence numbers are useful. Every new message gets a new message number. The sender and receiver have to agree on the initial sequence numbers at the start of communication. Both client and server increment their sequence number if they send data, and wait for an acknowledgment of their sequence number. If a message is delivered more than once, this can easily be recognized, as the sequence number was already seen by the receiver, so it can discard the message. The problem with introducing sequence numbers is that this state information has to be maintained. This is not always desirable. In particular if a connectionless protocol is used, and the exchange between client and server just consists of one request and one reply, it is expensive to maintain the result of all operations. It is normally only possible to maintain a fixed number of results.

A detailed study how to select sequence numbers can be found in [Tom75].

8.4 Messages out of Order

The sequence number can also be used to detect messages that arrive out of order. If every message is acknowledged before a new one is sent, this can not happen.

8.5 Congestion Control

In connection-oriented communication or multiple messages to one destination together with message switching congestion control is a serious issue. Some aspects are investigated and solutions suggested in [Jac88].

8.6 Fundamental Problem of uncertain Communication

Even if all of the problems above can be solved, a certain amount of uncertainty is unavoidable. One side can never be sure that the last acknowledgment was delivered correctly. This is caused by the fact that the receiver of an acknowledgment would have to acknowledge the acknowledgment, which in turn the original sender would have to acknowledge, and so on. This problem is known as the ``two army problem'' and described in detail in [Tan88, page 397f].

8.7 Implementation

The function safesendto() and saferecvfrom() in the file distSOCKET/safesend.c take all of the mentioned points into account. They can be used as plug-in replacement for the system calls sendto() and recvfrom(), which are used for connectionless protocols. The function safesendto() sends a message and waits for a reply. If no reply is returned, the message is sent again after a certain timeout period. The function saferecvfrom() acknowledges all incoming messages. Both keep track of a fixed number of sends/receives.

8.8 Summary

The problem of reliable communication using an unreliable communication protocol is discussed. A reliable send()/receive() function using sockets is developed. It was found out that the overhead of getting reliable communication using an unreliable service is considerable, especially in terms of the increased number of exchanged messages (at least two messages instead of one), increased delay till an acknowledgment message is received, and efficient implementation. Therefore the use of connection-oriented protocols like TCP should be considered seriously if the data exchange between two processes is substantial.

Next: 9 Conclusion Up: Unix Communication Facilities Previous: 7 Performance

Gerhard Müller