Friday, September 4, 2015

Fifty autonomous planes simultaneously in the air


I've been pretty quiet on this blog for a while; some of my more recent posts have alluded to an autonomous aircraft project that's been keeping me busy. I'm pleased to announce that we've reached a major milestone for this project: on Thursday, August 27th, our team successfully flew fifty autonomous planes simultaneously, executing cooperative behaviors in the air. This is the culmination of two years of work for me, and more than three years for some of my teammates.

I wrote a summary of the flight over at DIY Drones; check it out!

Sunday, December 7, 2014

UDP binding and port reuse in Linux

A recent technical challenge required me to dig deeply into how UDP ports are "bound" - that is, reserved or allocated - in the Linux TCP/IP implementation. It ended up being one of those cases where I had an intuition about how things worked, then found some evidence suggesting that my intuition was wrong, and in the end discovered it was correct after all, though in a different way than I'd expected. Along the way, I wrote and nearly published a different post that would have perpetuated some misperceptions about UDP port binding in Linux. Therefore, I am writing this post instead in an attempt to promulgate correct information.

If you're in a hurry to get to the punchline, it's this: you CAN bind more than one UDP socket to the same address and port in Linux; skip to the section titled Testing the solution to see how.
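As a quick preview (a sketch of my own, with an arbitrary port and payload; the walkthrough below has the full details), the key is that every socket sets SO_REUSEADDR before binding; broadcast datagrams are then delivered to every socket bound to the port:

```python
import socket

PORT = 47123  # arbitrary port for this sketch

def make_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # SO_REUSEADDR must be set on EVERY socket before bind(), or the
    # second bind() fails with "Address already in use"
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.bind(("", PORT))
    return s

a, b = make_socket(), make_socket()   # both bound to the same port, no error

# A broadcast from one socket is delivered to every socket bound to the
# port; 127.255.255.255 is the loopback broadcast address, handy for tests
a.sendto(b"position report", ("127.255.255.255", PORT))
```

Unicast datagrams, by contrast, are delivered to only one of the bound sockets, which is why this trick matters most for the broadcast-heavy traffic described below.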

Background

In my last post, I mentioned working on a secondary controller, or "payload," for small autonomous aircraft. Our research team uses ROS to connect a variety of interrelated software components that run on this payload. The component I discussed before is the "bridge" between ROS and the aircraft autopilot; another component is the bridge between ROS and the wireless network that connects that aircraft to all other flying aircraft as well as to ground control stations.

We currently use 802.11n wireless devices in ad hoc mode, and optionally a mesh routing protocol (such as B.A.T.M.A.N. Advanced) that enables aircraft to act as relays, repeating messages on behalf of other aircraft that are out of range of direct transmission. Our command and status-reporting protocol is built on top of UDP, and we use either IP unicast or IP broadcast depending on the type of message being sent. Command messages from ground control stations to aircraft may be either unicast or broadcast; reports from each aircraft are always broadcast because other aircraft need to know its position for formation flight and collision avoidance.

To test software before we fly it, we run one or more Simulation-In-The-Loop (SITL) instances on one or more computers; each SITL instance includes the autopilot software, a simulated flight dynamics model, and the payload software. Because each instance of the payload software needs to communicate via UDP unicast and broadcast, both with other SITLs on the same computer and with SITLs on other computers, we need a way to open multiple UDP sockets that can send and receive broadcasts to each other on the same port at the same time. Whether this is supported turns out to be a matter of great confusion.

The 60-second-or-so guide to UDP broadcasts (in Python)

As I noted in my previous post, most of the team develops in Python. Sending and receiving UDP broadcasts in Python is quite easy; to set up a socket and send a datagram to a broadcast IP address and arbitrary UDP port is all of four lines of code (not counting error-handling):
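Those four lines look something like this (the payload and port are placeholders of mine; I use the loopback broadcast address here so the sketch works without a network, where real code would target 255.255.255.255 or a subnet broadcast address):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)      # a UDP socket
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)   # permit broadcast sends
sock.sendto(b"hello out there", ("127.255.255.255", 5554))   # one broadcast datagram
```

Receiving is nearly as short: bind a similarly configured socket to ("", 5554) and call recvfrom().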

Wednesday, July 9, 2014

A Python ROS bridge to MAVLink-based autopilots

My main project over the past few months has been developing a secondary controller (some call this a "companion computer"; we usually call it the "payload") for a small, fixed-wing autonomous aircraft running a commercial off-the-shelf hobbyist autopilot that speaks the MAVLink protocol.


The purpose of the payload is to provide higher-level mission and path planning, inter-aircraft coordination, and an interface to ground control stations; this allows the autopilot to remain dedicated to immediate guidance, navigation, and control tasks and to maintaining flight safety in general. As a research group, our aim is to develop the capability for up to fifty of these aircraft to operate cooperatively and to execute a complex mission with only high-level commands from the ground.

We chose to build the payload software on top of the Robot Operating System (ROS). ROS offers two important features: first, an inter-process communication middleware that includes a publish-subscribe model and a service request-response model; second, a vast open-source library of software components, or "nodes," ranging from extended Kalman filters to laser rangefinder drivers to coordinate frame transforms. Using ROS, we can create (and leverage) a collection of nodes onboard the payload to communicate across a wireless network with other aircraft and the ground, perform path planning, process sensor data, and so on.

One of these nodes, necessarily, must be a driver (or "bridge") between the autopilot and the other ROS nodes. In our case, it must map MAVLink messages from the autopilot into ROS messages, and ROS subscriber commands and service queries into MAVLink messages to the autopilot. Of course, this mapping isn't one-to-one in most cases. MAVLink specifies transaction protocols for retrieving and updating things like the list of waypoints; a single ROS service can request the list of waypoints, wait while the transaction takes place, and return a single response with the entire list. Other operations can (and should) be one-to-one and even idempotent: changing the guidance mode of the autopilot, or (dis)arming the throttle.

As it turns out, there are a handful of bridges between MAVLink and ROS out there already, including mavros and roscopter.
Of these, mavros is the most full-featured. It offers a healthy set of publishers conveying various aspects of the autopilot state, and subscribers to handle commands to the autopilot. It also offers a modular plugin architecture so that other developers can add on new ROS publishers, subscribers, and services. It is written in C++ (ROS officially supports both C++ and Python, and Matlab support has recently been introduced), which can be a boon for efficiency and a bane for complexity. We ideally would like the flexibility of its modular architecture, but our team has greater access to Python programmers than to C++ programmers. Further, we anticipated some unique interfacing needs that might merit a custom approach.

Thus, I took the unreasonable course of action and decided to roll my own autopilot bridge, creatively named autopilot_bridge. It is loosely inspired by roscopter but adopts its own modular architecture in the spirit of the popular MAVLink-based ground control station software MAVProxy.

You can find autopilot_bridge, and more documentation on installing and running it, on GitHub:

https://github.com/mikeclement/autopilot_bridge

Someday, when some stray free time turns up, I would love to share some of the lessons I learned about dynamically loading Python code and interfacing with protocol state machines. In the meantime, it's all in the code :)

Saturday, February 22, 2014

Multiple AR.Drones from a single computer using the ardrone_autonomy ROS package

I'm currently working with a small graduate student team on centrally controlling a group of Parrot AR.Drone quadcopters or "quads" using the Robot Operating System (ROS), a software middleware for connecting robotic system components. There is a ROS driver called ardrone_autonomy for the AR.Drone, which wraps the AR.Drone SDK and exposes its controls and feedback to ROS' publish-subscribe model. This works well for controlling a single quad from a single computer. However, we ran into issues when trying to control multiple quads from a single computer, seemingly because the underlying SDK was designed for single-computer to single-quad use.

A lot of other groups have posted interesting solutions for controlling multiple quads. The AR.Drones use 802.11 wireless networks for control, feedback, and video; some solutions we examined focus on reconfiguring all quads to use a common network and then use multiple computers to control them. Others go a step further and allow control of multiple quads from a single computer. However, no solution that we've encountered allows both control and feedback/video of multiple quads from a single computer (at least, not using stock AR.Drone firmware).

One of the students, Brenton Campbell, and I set out to fly, crash, and break multiple quads at once (er, hack some code) and develop a solution for full bi-directional communications. This post documents and provides all code necessary for multi-quad control, feedback (navdata), and video from a single computer. It uses a modified version of the ardrone_autonomy ROS package and its included version of the AR.Drone SDK (v2.0.1, internally termed ARDroneLib), combined with some network hackery. Further, our solution requires no manual or permanent changes to the quad; it is entirely coordinated from the computer and can "automatically" reconfigure and utilize out-of-the-box quads. It has only been tested on v1.0 quads, but as far as I've read, it should apply equally to v2.0 quads.

I should note that this solution draws on insights gleaned from the above-referenced posts, and also:

The solution described herein might void warranties, violate license agreements, and (though exceedingly unlikely) render your AR.Drone unusable. My description assumes a fair amount of familiarity with Linux, comfort using the command line, and a bit of C and Python programming skill. Use this information and code at your own risk!

The solution can be broken into three steps, each of which I discuss below:
  1. Reconfiguring network settings on individual AR.Drones
  2. Remapping UDP ports to unique ground-side port numbers
  3. Modifying the AR.Drone SDK and ardrone_autonomy package to customize UDP ports
Update 3/26/2014 - Following a discussion thread on the ardrone_autonomy GitHub project, another contributor, Kenneth Bogert, discovered an alternative to steps 2 and 3 above. It turns out that when the computer sends UDP probe datagrams to a quad's video and navdata ports, the quad captures the source ports and uses them as its destination ports. By modifying the SDK to use ephemeral client-side ports, both video and navdata work and no port remapping is necessary. Kenneth posted a pull request with his modifications here.

Update 11/20/2014 - An updated version of Kenneth's pull request was merged into ardrone_autonomy on the 9th.

Monday, December 16, 2013

Weekend Diversion - an HTTP interface to mplayer using Python and Flask

Author's note: I'm currently collaborating with a friend over at AutisTech.org on finding or creating a video playback solution with very simple remote control for his daughter. This post is based on some early exploration I did toward this. Ultimately, we'd like to have a simplified mobile device interface to XBMC or something similar. If you are interested in contributing your expertise, please check out the projects page over at his website!

Update 1/25/2014: More details on the video player concept have been posted here.

Once again I was looking to do some coding for fun over the weekend. A friend and I had recently discussed ways to control media playback on a Linux-based system from a remote device. Looking for an excuse to practice writing Python code (I come from a Perl background) and to learn Flask (a lightweight in-application webserver for Python), I decided to roll my own simple HTTP interface to control mplayer, a popular command-line media player for Linux.

Read on for the details of my approach and some mildly-questionable Python code. (If you're looking for more refined implementations of this idea with varying feature sets, check out here and here.)

Wednesday, November 20, 2013

Forcing a higher MSS to improve TCP performance on links with asymmetric MTUs

An interesting networking problem was recently posed to me. As context, suppose two computers, call them “A” and “B”, are networked together such that the physical link from A to B is separate from the physical link from B to A. The A-B link has a link-layer Maximum Transmission Unit (MTU) of, say, 1500 bytes (normal for Ethernet). However, the B-A link has a much smaller MTU, say 500 bytes:

-----              -----
|   |-- 1500 MTU ->|   |
| A |              | B |
|   |<-- 500 MTU --|   |
-----              -----

Here’s the problem: when transferring large volumes of data from A to B over a TCP connection, the transfer rate is much slower than expected given a 100 Mbps link speed from A to B. The question is why, and what can be done about it? Read on for the discussion, and a proposed and tested solution.
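The discussion and fix aren't reproduced here, but a quick way to see what MSS a connection actually settled on (a diagnostic sketch of mine, not the post's solution) is Linux's TCP_MAXSEG socket option on a connected socket; on B's side of the link above, it would report a value derived from the small 500-byte MTU:

```python
import socket

# A loopback TCP connection, purely to have a live socket to inspect
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

# On Linux, TCP_MAXSEG on a connected socket reports the effective MSS
mss = client.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
print("effective MSS:", mss)
```

As I understand it, setsockopt() with TCP_MAXSEG can only lower the MSS, so forcing a larger value in a situation like this one is typically done elsewhere, e.g., via a route's advmss metric.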

Sunday, November 17, 2013

Weekend Diversion - Conway's Game of Life using ncurses

Seeking a bit of for-fun programming for the weekend and an excuse to refresh myself on the basics of ncurses, I decided to write my own small implementation of Conway's Game of Life. If you've never had a chance to become acquainted with this "game," I highly recommend taking the time to read about it or to play with one of the many implementations (including mine below if you'd like). It is a great study in how a small number of simple rules can lead to some amazing emergent phenomena when applied at a larger scale.

In a nutshell, the "game" (I use quotes because it is not a game in the traditional, competitive sense, although some people have created variants that allow two players to "compete") is staged in a 2-D world that one can think of as a grid of "cells." The life of each cell depends upon its neighbors; cells like some company, but not too much! For each "step" in the game, each cell's life is evaluated based on the number of living neighbors it has (above and below, left and right, and the four diagonals). The life of the cell following the step is determined by three rules:
  1. If the number of living neighbors is exactly 2, the cell remains alive or dead (whichever it was previously).
  2. If the number of living neighbors is exactly 3, the cell is "born" (or remains alive).
  3. Otherwise, the cell "dies" (or remains dead).
Pretty simple, right? So simple that an entire implementation with an ncurses interface takes less than 200 lines of code (as sloccount reports; minus whitespace and comments). So, let's get to it!
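For a sense of how little code the rules themselves need, here is a grid-free sketch of a single step (my own Python rendition, not the ncurses implementation from the post) that represents the world as a set of live cells:

```python
from collections import Counter

def life_step(alive):
    """Advance one step: `alive` is a set of (row, col) live cells."""
    # For every cell adjacent to a live cell, count its live neighbors
    counts = Counter((r + dr, c + dc)
                     for (r, c) in alive
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    # Rule 2: exactly 3 neighbors -> born or stays alive.
    # Rule 1: exactly 2 -> unchanged, so only already-live cells stay.
    # Rule 3: everything else dies, simply by not being included.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

# A "blinker": a vertical bar of three cells flips to horizontal and back
blinker = {(0, 1), (1, 1), (2, 1)}
```

Wrap that in a loop with a draw routine and you have the whole game.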

Saturday, September 14, 2013

Modifying software for fun and music, part 2

Author's Note: Whew! I've been on a long hiatus from posting. This post in particular is a long-overdue follow-up to an experiment where I documented my foray into deciphering and modifying a particular piece of open source software as I went along. Unlike that post, the following is compiled from notes made during the process. Enjoy!

Last time we left off with having added the ability to specify multiple songs from the command line. However, the last song in the list was getting truncated several seconds early. Also, m4a (AAC) files were not playing at all. Finally, after further testing it came to light that mono MP3 files were not playing correctly. Read on for my notes on addressing each of these issues, and finally for the diff between the original code and my changes, if you're so inclined to use them!

Wednesday, May 16, 2012

Using ld to manually build an executable

Author's note: further work on raop_play has been sporadic due to other happenings, though I have some new results addressing song truncation, playing other file types, and playback of mono files. I will (hopefully) get this posted shortly!

About three days ago, I finally bit the bullet and decided to give Kubuntu a try. While I have been using Debian happily for the past few years, I am discovering an increasing desire for simplicity and to have things work out of the box, rather than spend large amounts of time configuring everything from scratch. One of my side-goals was to try out the Pulse audio system to see if I could successfully integrate my Linux sound with my Airport Express. As it turns out, this does not work as smoothly as I'd anticipated. While getting it up and running was mostly straightforward, there are some playback issues that appear to be unresolved within the community. Perhaps a project for another day, but with a freshly-installed OS I wanted to play some tunes NOW!

So instead I go back to my previous work on raop_play and simply recompile with the new libraries on this installation... but I run into a problem. For some reason, the OpenSSL library wasn't being found by the linker, giving me the following error:

...
gcc -o raop_play raop_play.o raop_client.o rtsp_client.o aexcl_lib.o base64.o aes.o m4a_stream.o audio_stream.o wav_stream.o mp3_stream.o flac_stream.o ogg_stream.o aac_stream.o pls_stream.o pcm_stream.o -lssl -lsamplerate -lid3tag
raop_client.o: In function `rsa_encrypt':
raop_client.c:(.text+0x131): undefined reference to `RSA_new'
raop_client.c:(.text+0x175): undefined reference to `BN_bin2bn'
raop_client.c:(.text+0x1b3): undefined reference to `BN_bin2bn'
...

... and so on for another dozen lines. Actually, it was an even worse problem at first, until I discovered that gcc is sensitive to the ordering of files and library references, and fixed the Makefile to move all "-l" options to the end of the command, as shown above.

I racked my brain, wondering if I had an incompatible version of the OpenSSL library, or if the system's library path was broken, but every workaround I tried yielded the same result.

Wednesday, March 21, 2012

"It's all just bit-flipping and timing"

This was stated to me by a coworker at my first job. It was at a formative moment in my life: getting ready to start college, working at a then-startup doing data entry and computer repair, and learning to program in C++ in my spare time. My coworker said this while I was shoulder-surfing his efforts to program an embedded computer to display text on a small serial-driven LCD screen. He was effectively instructing the computer to set particular bits on the display (bit-flipping) at specific times (timing). It's all just bit-flipping and timing...

Four years later I was taking CSSE 380, Organization of Programming Languages, learning Scheme and writing continuation-passing interpreters with garbage collection and static type checking. Which is to say, bending my mind on a nightly basis. (I recall discussing with my roommates - also in the class - starting a band called Dr. Scheme and writing heavy metal songs that always ended in "cond, lambda, define!" But I digress...)

Looking back on it now, that course wasn't really about learning Scheme, nor was it about coding garbage collection or type checking. It was about arriving at the fundamental realization that under the hood of the language compilers and interpreters we use on a daily basis lives an amazing paradox. Each one is a transformation on some language, turning it into yet another language (e.g., assembler) or into a sequence of operations performed immediately within that very transformation, producing a final result. And what is amazing is that these transformations are often built using the very language they transform! One can craft a Scheme interpreter in Scheme, using Scheme lists to represent Scheme code. Likewise (with a bit of parsing), one can write a C compiler in C. It was recognizing the duality of a program represented as code (text) and textual strings that can be manipulated in a programmatic way that opened my eyes to a part of what really goes on inside a computer.

Suddenly, the magic of a programming language, hand-crafted by wizards in underground dwellings, was transmuted into a very real, if not easy, undertaking that even us wee burgeoning Computer Scientists could approach. All of the abstractions that we took for granted when hitting the Compile button boiled down into the "bit-flipping and timing" of compilation and interpretation. I would call this a transformative moment in my Computer Science education, where my understanding of the discipline underwent a fundamental shift. And over time, I have come to recognize other moments in a CS education that bear this same mark.

Wednesday, February 22, 2012

A quick word on Bufferbloat

I've recently been working on a conference paper and some coding in support of my dissertation research, hence the lack of updates. In the meantime, I found this very interesting and important to share...

Jim Gettys (one of the primary people behind the X Window System most users of Unix and Linux rely upon for their graphical environment) has been busy investigating "Bufferbloat." Apparently this issue is cropping up more and more on the Internet, and causing problems for services such as streaming video (think YouTube or Netflix) and telephony (think Skype or Vonage). Here is his very good introduction to the problem; read on for my two-paragraph synopsis and links, followed by a few thoughts of my own.

In short, network routers (and in many cases switches) use buffers to queue packets when more are arriving than can be sent out a particular network interface. The purpose of buffering is to avoid dropping packets when they can't be sent out fast enough. This almost always happens when one interface of a router operates at a much faster rate than another interface. One place this is common is at the border where your ISP's fast network connection meets your home's (relatively) slower network connection.

Buffers are generally good then, right? They keep data from being dropped. Well, when used in excess, buffers end up defeating one of the principal mechanisms built into TCP to avoid congestion on networks. TCP actually needs a few packets to be dropped (don't worry, it's good about resending them as soon as it realizes they didn't make it through) in order to determine how much traffic it can safely place on a network. Without this, TCP keeps pushing more and more data, assuming there is room to spare. This can lead to very high latency and jitter, two primary enemies of all streaming media (video and audio, for example).

You can read a great discussion between Gettys, Vint Cerf, Van Jacobson, and Nick Weaver here. Gettys has also created a website and project around understanding and addressing Bufferbloat.

An addendum - personal perspective

To put the importance of latency and jitter in perspective, I once was engineering a network that had two paths out to the Internet, a cellular connection (lower bandwidth, low to moderate latency) and a satellite link (higher bandwidth, very high latency). One person was using the cellular path to download imagery data. The apparent performance was less than stellar, and so I was asked to re-route their traffic over the satellite link. After making this change, the apparent performance became much worse, yet the person did not understand why.

When I asked about the nature of the downloaded data, I was told each transfer was a small amount of imagery, but transfers were made often and needed to be completed quickly. It turned out to be map tiles like those used by Google Earth. While the satellite link offered greater bandwidth (number of bits it could transfer per unit time), it took a long time to get the first bits all the way across the link. For small transfers, this time dominated over the time to move the remainder of the data after the first bits arrived, so the apparent performance was worse than a lower-bandwidth, lower-latency connection.
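A little arithmetic makes the effect concrete (the tile size, rates, and delays below are illustrative assumptions of mine, not measurements from that network): the time to complete a small transfer is roughly the one-way latency plus the size divided by the bandwidth.

```python
def transfer_time(size_bits, bandwidth_bps, latency_s):
    """Time for the first bit to cross the link, plus time to clock out the rest."""
    return latency_s + size_bits / bandwidth_bps

tile = 10 * 1024 * 8                          # a 10 KB map tile, in bits
cellular = transfer_time(tile, 1e6, 0.1)      # 1 Mbps, 100 ms latency
satellite = transfer_time(tile, 5e6, 0.6)     # 5 Mbps, 600 ms latency
print(f"cellular: {cellular:.3f} s, satellite: {satellite:.3f} s")
```

With these numbers the satellite link takes over three times as long despite five times the bandwidth, because latency dominates small transfers.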

Most interactive applications such as voice over IP, video teleconferencing, and network gaming are more sensitive to latency and jitter than they are to occasional packet loss. The problem of bloated buffers is not only a result of past trends in device manufacture and configuration but also of changes in how people use the Internet. What was an acceptable, perhaps appropriate solution when the main kinds of transactions were time-insensitive file and webpage transfers, is no longer appropriate in the age of time-sensitive multimedia streaming.

Part of the solution is out there, and part is yet to be developed. At this stage, raising everyone's awareness (not just device manufacturers and ISPs but also end-users and application developers) is the best action we can take toward understanding and ultimately addressing the problem.

Wednesday, January 18, 2012

Modifying software for fun and music, part 1

Author's Note: This is the first post in an experiment wherein I document my foray into deciphering and modifying a particular piece of open source software as I do it. My interest lies in whether the resulting posts a) are digestible, and b) provide additional insight into the "how" of the process. As such, these will undergo only cursory editing before being posted. Expect typos!

Update 9/14/2013: The second part of this post is finally available here!

A few months ago, I purchased one of these newfangled Internet-enabled televisions so I could stream movies from Netflix without having to plug my laptop into the TV every time. Since I didn't spring for a model with built-in wireless, I subsequently bought a nifty device from some big-name manufacturer, which lets me plug in an Ethernet device and acts as a wireless client on its behalf. This device happens to also let me stream music from said manufacturer's music application to my stereo via an 1/8" audio plug on the device. Pretty nifty stuff.

My main home computer is a desktop running Linux, and I don't want to boot my laptop every time I want to play some music (the whole point of the TV upgrade, right?). So I want an easy way to stream music from Linux to said device. Well, if you're familiar with audio under Linux, there are something like six different subsystems you can run: OSS, ESD, ALSA, Pulse, et cetera. Someone made a nice module for the Pulse audio subsystem that lets these devices act like virtual sound cards, which is great if you're running Pulse. But after an entire afternoon spent breaking and fixing my sound in an effort to shift from ALSA to Pulse, I decided this wasn't the solution for me.

Fortunately, someone else had the same idea and created a utility called raop_play a while back. This is a command line client that takes the IP address of the device we want to stream to and the filename of the audio file (e.g., MP3) to play. After a quick download and compile (okay, a moderately quick compile after installing a few dependencies and working around some build errors), it worked right out of the box. But it lacked a couple of things I wanted:
  1. The command line only takes a single filename, even though there is an interactive mode with support for playing additional files. I'd like to specify an entire album up front.
  2. Although the documentation claimed support for M4A files (which I happen to have a lot of by virtue of using said manufacturer's music store), I only got errors trying to play them. Playback of MP3 files also seems a bit buggy (playback sometimes stops prematurely). I'm thinking of incorporating a different decoding engine.
For today's post, I will focus just on the first item: playing multiple files. Armed with nothing but a compiler and an innate desire to make this software do what I want, I dove in; this post is my log of trying to get it to work.

Wednesday, December 14, 2011

Signals and sockets for querying a process

Hello again, dear reader!

Another part of the research I'm doing entails capturing, processing, and storing network packet attributes. This is done in a nifty application that invol... oh, but that is a post on its own! What I'd like to share today is an interesting little way of sharing data between the packet capture process and another running process.

So here's the skinny: my application uses libpcap to do packet capture. Pcap has a couple of ways to process the packets it grabs off the wire, both of which are blocking. My code also has to answer queries from a single other process on the same machine. But even if my while loop (if using pcap_next()) or callback (if using pcap_loop() or pcap_dispatch()) somehow checks for pending queries, the querying process has to wait until the pcap process gets another packet for that check to occur. The question arises: how can this application respond immediately to a query, regardless of whether packets are currently being captured?

Shared memory and multithreading is an option, as is pushing data to a separate database. But we want simple (my entire application is under 300 lines of code, counting the solution I describe here), and the machines I want to run this code on may not be able to support a database server. Besides, what's the fun in doing this if there isn't an opportunity for a bit of hackery?

It turns out that a combination of sockets and signals does just the trick. We're going to give the pcap process a listening Unix socket and a function to handle signals, and let the OS do the rest of the work for us.

Before we jump into the code, let's make life simpler and take all this packet capture business out of the picture - that's complicated enough on its own, and may be the subject of another post in the future. Instead, let's say we have a table (2-D array) of students and the classes they must take. Each spot in the table is a struct with the quarter in which they took the class and the grade they received. That way we get a struct for the query (student and class) and another for the response (quarter and grade). And to keep things easy on ourselves, we'll make everything a number except for the grade, which will be a single character ('A', 'B', 'C', and so on).

Let's look at the code that processes a query (all error-checking has been removed for simplicity):

  void handle_query(int sig) {
    char buffer[BUF_SIZE];

    /* Accept the pending connection on the global listening socket */
    int sd = accept(sock, NULL, NULL);

    /* Receive the query and typecast the raw bytes into our struct */
    int len = recv(sd, buffer, BUF_SIZE, 0);
    struct query *q = (struct query *)buffer;

    /* Look up the requested record in the global table */
    struct record *r = &records[q->student][q->class];

    /* Send the record back as a byte array, then hang up */
    send(sd, (char *)r, sizeof(struct record), 0);
    close(sd);
  }

Wow, that was easy! Looks a lot like the standard TCP server from a network programming 101 class, doesn't it? Accept a connection from a listening socket, receive a query, typecast it into a struct, do a lookup, send the result typecast as a byte array, and close the connection. If you haven't seen something similar before, check this out or do a quick Google search for "Linux TCP server in C". I'll provide the definitions of struct query and struct record at the end; for now, just know that sock and records are global variables.

So what's with this funky-looking function declaration? It's a void; that's okay, but what's this int sig that never gets used in the function body? Well, this function isn't actually called by any code in the program per se; it's a signal handler. "A signal what?" you ask...
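Before digging further into the C details, the same register-a-handler pattern is easy to poke at from Python; the handler and the choice of SIGUSR1 below are mine, not the C program's:

```python
import os
import signal

handled = []

def handle_query(signum, frame):
    # In the C version this is where accept()/recv()/send() happen;
    # here we just record that the handler ran
    handled.append(signum)

signal.signal(signal.SIGUSR1, handle_query)   # register the handler
os.kill(os.getpid(), signal.SIGUSR1)          # another process would do this
print(handled)
```

The querying process, after connecting to the Unix socket, sends the signal to wake the capture process out of its blocking pcap call so the handler can service the query.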

Sunday, December 4, 2011

Sending raw Ethernet frames in 6 easy steps

Part of my work entails building a protocol stack in C that lives alongside TCP/IP and which is constructed entirely in user space, save for a few standard system calls. Hence, I need the ability to craft raw Ethernet frames and send them using only the facilities the operating system provides.

Fortunately, I have two things going for me. First, the implementation is all being done under Linux (in particular, a 2.6.32 kernel, though I believe any 2.6.x kernel will do). Second, a few other authors have gone before me in deciphering the man pages and system calls and have put together some great example code. Most notably, Andreas Schaufler, whose writeup inspired my implementation as well as the code in this post.

So why write more on the subject? Because it takes time to understand all the moving parts and pieces, and most of us are operating under deadlines. What I hope to contribute is a one-stop, soup-to-nuts explanation with example code to save time for the next person. I'm going to assume you are somewhat comfortable with the C programming language and have seen or done some network programming before. However, my goal is to make this as painless as possible.

Starting point

First, let's try to make things easy for whoever is using this code. If they want to send a raw Ethernet frame, what things can we reasonably ask them to provide?
  • The destination MAC address would be nice, for sure. If they only know the IP, you can either have them look up the MAC with the arp command or code that lookup into your software. Let's assume you've done this if needed.
  • Since the frame needs to go out some particular Ethernet interface, we'll assume they know this already and can specify it by its "friendly" name, e.g., "eth0". The source MAC is needed as well, but we can look that up from the interface, assuming they're not trying to fake it. If they are, I will point out where to do this.
  • An Ethernet protocol number would also be good to know. This is the value that tells the receiving system which protocol is contained inside the frame. IP uses 0x0800, but perhaps you, like me, want to craft your own protocol that isn't processed by the TCP/IP stack.
  • Finally, the bytes you want to put into the message. This part is entirely up to you. Maybe you want to inject the contents of a captured packet, or some payload of your own creation. All we care about is that the bytes are stored somewhere (we'll say a C byte or "char" array) and you know the number of bytes to be sent. (We'll also assume you're staying within the number of bytes that fit in a frame, a.k.a. the Maximum Transmission Unit or MTU.)
Given these four pieces of information, let's walk through the code to send a frame. It will take six steps, listed below. Depending on your starting point, you may omit some of these steps.
  1. Defining our input and output data
  2. Building a raw "packet" socket
  3. Looking up the interface index and MAC address
  4. Filling in the packet contents
  5. Filling in the link layer socket address structure
  6. Sending the packet
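The walkthrough below is in C, but as a compact preview of the frame layout from step 4, here is a Python sketch (the addresses, EtherType, and payload are placeholders of mine):

```python
import socket
import struct

def build_frame(dst_mac, src_mac, ethertype, payload):
    """Assemble a raw Ethernet frame: 6-byte destination MAC,
    6-byte source MAC, 2-byte EtherType, then the payload."""
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

# 0x88B5 is an EtherType reserved for local experimental use
frame = build_frame(b"\xff" * 6,                    # broadcast destination
                    b"\x02\x00\x00\x00\x00\x01",    # locally administered source
                    0x88B5,
                    b"hello, wire")

# Actually sending needs CAP_NET_RAW (typically root) and a real interface:
# sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
# sock.bind(("eth0", 0))
# sock.send(frame)
```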

Greetings!

Hello, dear readers! To kick off this blog, I thought I would share some of the technical bits from the past several months of working on my dissertation research.

In my field (computer science), underpinning most academic papers is a mountain of prototyping work that often does not see the light of day. It's a pity that, after giving up on man pages as a complete source of documentation and digging through forum posts and kernel code to uncover the key pieces of information needed to make software work, that knowledge should never find a home.

But, lucky you! You have managed to stumble across this blog, and willingly or not, you have lent me your eyes and mind for the next few minutes, so I intend to make the most of it! (After all, how often does a grad student get a captive audience when talking about the nitty gritty of his or her work?)

Okay, okay, enough intro, let's get on with the good stuff!