Sunday, December 7, 2014

UDP binding and port reuse in Linux

A recent technical challenge required me to dig deeply into how UDP ports are "bound" - that is, reserved or allocated - in the Linux TCP/IP implementation. It ended up being one of those cases where I had an intuition how things worked, then I found some evidence suggesting that my intuition was wrong, and in the end I discovered it was correct after all but in a different way than I'd expected. Along the way, I wrote and nearly published a different post that would have perpetuated some misperceptions about UDP port binding in Linux. Therefore, I am writing this post instead in an attempt to promulgate correct information.

If you're in a hurry to get to the punchline, it's this: you CAN bind more than one UDP socket to the same address and port in Linux; skip to the section titled Testing the solution to see how.

Background

In my last post, I mentioned working on a secondary controller, or "payload," for small autonomous aircraft. Our research team uses ROS to connect a variety of interrelated software components that run on this payload. The component I discussed before is the "bridge" between ROS and the aircraft autopilot; another component is the bridge between ROS and the wireless network that connects that aircraft to all other flying aircraft as well as to ground control stations.

We currently use 802.11n wireless devices in ad hoc mode, and optionally a mesh routing protocol (such as B.A.T.M.A.N. Advanced) that enables aircraft to act as relays, repeating messages on behalf of other aircraft that are out of range of direct transmission. Our command and status-reporting protocol is built on top of UDP, and we use either IP unicast or IP broadcast depending on the type of message being sent. Command messages from ground control stations to aircraft may be either unicast or broadcast; reports from each aircraft are always broadcast because other aircraft need to know its position for formation flight and collision avoidance.

To test software before we fly it, we run one or multiple Simulation-In-The-Loop (SITL) instances on one or more computers; each SITL instance includes the autopilot software, a simulated flight dynamics model, and the payload software. Because each instance of the payload software needs to communicate via UDP unicast and broadcast, both with other SITLs on the same computer and SITLs on other computers, we need a way to open multiple UDP sockets that can send and receive broadcasts to each other on the same port at the same time. Whether or not this is supported turns out to be a matter of great confusion.

The 60-second-or-so guide to UDP broadcasts (in Python)

As I noted in my previous post, most of the team develops in Python. Sending and receiving UDP broadcasts in Python is quite easy; to set up a socket and send a datagram to a broadcast IP address and arbitrary UDP port is all of four lines of code (not counting error-handling):