Broken Pipe

I’m writing a simple utility called wires. It’s a clone of the GNU utility strings, but it’s written in Rust so I wanted the name to be more metallic. I have a working build now (though it doesn’t quite have all the flags and features of the original), and I thought everything was fine, but then I piped the output to head -10. And I immediately got this:

thread 'main' panicked at 'Failed to write to supplied stream: 
    Error { repr: Os { code: 32, message: "Broken pipe" } }', src/libcore/result.rs:860:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Hmm, that’s not very friendly. Unix-y command line tools should be able to redirect output to each other without crashing.

Wires

I like retro games, so I’m trying to write a Zork machine in Rust. That said, I’m brand new to Rust, so I’ve decided to take smaller bites of the language until I’m a little bit more comfortable with how it works. This post is the story of one of these smaller projects.

Enter wires, a (much simplified) tool in Rust similar to the GNU utility strings. Wires consumes a file, one byte at a time. If it finds a series of bytes at least 3 bytes long, where every byte is a printable ASCII character, it writes the series to stdout. If it can’t find or open the file, it exits 1, and if writing to stdout fails, it panics. Not super complicated, but it’s finished and does just that.

Troubleshooting a basic docker issue: Getting a script into the container

Sometimes when I’m working with Docker, I’ll have a frustrating experience. I want to do something simple, like copy a shell script into the container and execute it as the container’s entry point, but I miss a silly detail and get stuck for a while. The other day I was trying to run a dockerized Redis instance, and I kept running into a silly permissions issue with the file system that I just couldn’t figure out. In retrospect, as is so often the case in software, I was testing elaborate hypotheses and not checking something obvious, but in case this post saves someone else the trouble, I wanted to go into a little detail. Here’s how to make a Redis container that starts with a shell script instead of just starting up the Redis server right away.

Querying and Enumeration

Querying a collection is interesting in that, basically, all the operations are built on enumeration. Filtering an array? Enumerate it, and reject elements that don’t meet the predicate. Transforming an array? Enumerate it, and apply the transformation function to each element. Reducing an array (e.g., summing up its elements)? Enumerate it, and keep a running total as you add up each element. That means that, if the language and library designers were clever, just having enumeration on a class is enough to make it queryable with the full filter/map/reduce set of array operations. In many languages, then, it’s possible to write custom collections that gain a lot of functionality just by implementing enumeration on the class.
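To make that concrete, here is a minimal Ruby sketch of filter, map, and reduce written using nothing but each. The Countdown class and the my_filter, my_map, and my_reduce helpers are hypothetical names invented for this illustration; real code would lean on the language’s built-in query methods instead.

```ruby
# A tiny collection that exposes nothing but #each (hypothetical example class).
class Countdown
  def initialize(start)
    @start = start
  end

  # Yield start, start - 1, ..., 1.
  def each
    n = @start
    while n > 0
      yield n
      n -= 1
    end
  end
end

# Filtering: enumerate, and keep the elements that satisfy the predicate.
def my_filter(collection)
  kept = []
  collection.each { |x| kept << x if yield(x) }
  kept
end

# Transforming: enumerate, and apply the block to each element.
def my_map(collection)
  out = []
  collection.each { |x| out << yield(x) }
  out
end

# Reducing: enumerate, and fold each element into a running value.
def my_reduce(collection, initial)
  acc = initial
  collection.each { |x| acc = yield(acc, x) }
  acc
end

countdown = Countdown.new(5)
evens   = my_filter(countdown) { |n| n.even? }
doubled = my_map(countdown) { |n| n * 2 }
total   = my_reduce(countdown, 0) { |sum, n| sum + n }
```

Each helper only ever calls each on the collection, which is why a class that can enumerate itself gets the rest for free.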

Today we’re going to write two classes, Menu and MenuItem, in both C# and Ruby, and look at how we can tell the code to treat Menu as a collection of MenuItems and then query it. (Also, this post was inspired by another)

In C#, we’re going to use LINQ (the strangely named Language-Integrated Query library that gives C# map/reduce-like functionality), and we’ll use the Ruby module Enumerable. Both programs will do the same thing: Look at a list of menu items and filter them based on allergen information. Let’s start with C#.

To get a class in C# to work with LINQ, we have to implement the IEnumerable<T> interface. The C# interface IEnumerable<T> just exposes one public method: GetEnumerator(). There’s one historical accident that, in my estimation, mars an otherwise very clean implementation: To implement IEnumerable<T>, we have to implement IEnumerable. In other words, because the non-type-safe IEnumerable was around before the type-safe IEnumerable<T>, we need to implement both. Here’s the implementation:

On line 8, we declare that we want this class to be enumerable. Then on line 17, we implement a very simple enumerator. In this case, we just return the enumerator of the underlying list. Line 22 is the blemish I alluded to earlier: because C# 1.0 didn’t have generics, parts of the language rely on the IEnumerable interface, rather than on the IEnumerable<T> interface, so we have to implement both. In this case, we can just return the same enumerator in response to both methods. (The difference is that clients calling the second method will see a collection of Object, not a collection of MenuItem, and will have to cast at runtime.)

That lets us write this happy little snippet:

var okToEat = menu
              .Where(item => !item.Allergens.Contains("shellfish"))
              .ToList();

Now let’s look at the same thing in Ruby:

This is essentially the same class. The important line is line 2: include Enumerable. This mixes Ruby’s Enumerable module into the class, which keeps us from getting NoMethodErrors if we call query methods like filter or reject (each is the one method we define ourselves). Consuming this Ruby class looks like this:

ok_to_eat = menu.reject { |item| item.allergens.include? 'shellfish' }
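For readers who want something to run, here is a minimal sketch of a Menu class along these lines. It is an illustration, not the original listing: the MenuItem struct, its name and allergens attributes, and the sample dishes are all assumptions based on the snippet above.

```ruby
class Menu
  include Enumerable # mixes in select, reject, map, reduce, and friends

  def initialize(items)
    @items = items
  end

  # The one method Enumerable needs from us: yield each element in turn.
  def each(&block)
    return enum_for(:each) unless block_given?
    @items.each(&block)
    self
  end
end

# Hypothetical item type; the attribute names are assumptions.
MenuItem = Struct.new(:name, :allergens)

menu = Menu.new([
  MenuItem.new('Shrimp scampi', ['shellfish', 'dairy']),
  MenuItem.new('Garden salad', []),
  MenuItem.new('Pad thai', ['peanuts', 'shellfish'])
])

# reject comes from Enumerable and is built on the #each defined above.
ok_to_eat = menu.reject { |item| item.allergens.include? 'shellfish' }
ok_names = ok_to_eat.map(&:name)
```

Menu itself never defines reject or map; both arrive through the mixin, and both work only because each is there to drive them.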

It’s also worth noting that both Ruby and C# will now let you use a Menu in a regular for each loop. I think this can be a handy, if little-used, feature in both languages. Here are some reasons why you would do this:

  1. You need to represent an object in your business domain that logically is a collection of some kind.
  2. You want to implement a data structure that didn’t come with the base class libraries, and you want to be able to enumerate that data structure.
  3. You want to make a data structure that clients can query but cannot modify.
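The third reason can be sketched in a few lines of Ruby. ReadOnlyMenu is a hypothetical class made up for this example, and plain strings stand in for menu items; the idea is simply to expose each and nothing that mutates.

```ruby
class ReadOnlyMenu
  include Enumerable

  def initialize(items)
    # Copy and freeze the list so it cannot change after construction.
    # (The elements themselves are not frozen in this sketch.)
    @items = items.dup.freeze
  end

  def each(&block)
    return enum_for(:each) unless block_given?
    @items.each(&block)
    self
  end
end

menu = ReadOnlyMenu.new(['soup', 'salad', 'bread'])

# Querying works through Enumerable...
short_names = menu.select { |name| name.length <= 4 }

# ...and so does a plain for loop, which Ruby builds on #each.
seen = []
for name in menu
  seen << name
end

# But the class offers no way to add or remove items from outside.
can_append = menu.respond_to?(:<<)
```

Query methods such as select and to_a hand back fresh arrays, so clients can do what they like with the results without touching the menu itself.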

I also want to point out some interesting similarities: First, even though Ruby and C# are very different languages, they’ve both constructed enumeration and querying collections in such a way that it can be achieved in user-defined classes just by implementing one method: GetEnumerator() in C# and #each in Ruby. They have also both implemented their native for each loops in such a way that user-defined collections can be used in them. So thanks, language designers, for this cool thing.

Till next time, happy learning!

-Will

Acknowledgments: A few things gave me the idea for this post: First, in an episode of .NET Rocks, though I can’t find the specific episode, Carl said simply, “If you can enumerate a collection you can query it.” Second, I went to a Meetup talk in DC where the speaker ran through a number of different programs that can be written in JavaScript using only map, filter, and reduce. Also, I should thank Jon Skeet for writing the excellent C# in Depth, without which I would not understand C# well enough to write posts like this.

Bones and Documents

I’ve been thinking about tests and legacy code. The topic came up when I was listening to an episode of the Legacy Code Rocks podcast with the famous Michael Feathers, author of Working Effectively with Legacy Code. Andrea Goulet, one of the podcast’s hosts and co-founder of Corgibytes, a company that specializes in legacy code, made the observation that when working in legacy code, tests are like bones: They are very important to archeologists, and they stick around.

The analogy of legacy code work to archeology is a good one, but I’m not sure about the analogy between tests and bones. On one hand, bones are very important to archeology. They last a long time (or, as academics would say, they “persist in the archeological record”), and they can be tied to a particular species. But I think tests have a more important aspect than durability. The important thing about tests is that they were written down by their authors. Reading tests is more like finding documents.

When I was studying ancient literature and history in college, there was one glorious semester when I was taking a class in Roman Comedy, and a class in archeology of Roman domestic life. Once, we were having a debate in the archeology class about how private different parts of the house were, asking questions like: “guests are allowed in the atrium, but would it be awkward if some guest wandered into the kitchen?”

This type of question is difficult to answer from an archeological perspective. People have done studies that, for example, count the number of doors a person has to pass through from the street to get to a particular room, and use that as a proxy for how “private” the room was considered, but that’s a bit of a guess. Then, when I was reading Roman Comedy, I found a passage where a man starts complaining about how unwanted visitors are trampling all over his privacy because they won’t stay out of the kitchen. In my mind, that answers the question definitively: The way we know Romans felt like the kitchen was more private than the atrium is that one of them makes a joke about unwelcome visitors who aren’t polite enough to stay out of the kitchen. Mystery solved.

Going back to software for a moment: Tests, even bad or very old tests, tell us about what the authors of a system believed at the time that the system actually did. I think legacy code can be approached from two perspectives, much like the Roman house: We can do archeology or we can look at documents. Archeology is reading the source code itself, doing little refactorings, running static analyzers, figuring out when and whether a given method is called. Document-finding is about tests and documentation. If there’s a test that says test "the autoload module works like normal autoload" do (from Rails), we may not love the descriptor, but we know that the previous developers thought autoloading was a normal thing that people do, and that the code in the test represented normal autoloading. That’s a valuable piece of information. I think in terms of archeology, it’s definitely document-type evidence, not artifact-type evidence.

Let me try to tie the analogy all back together: Archeology and legacy code investigation are similar in that they both involve trying to discover more about some artifact that previous people left. In both cases, we generally can’t ask the original creators of the artifact what they meant, either because they don’t work at that company any more, or because they’ve been dead for thousands of years. Legacy code investigation and (recent) archeology both have two primary modes of investigation: inspecting the artifacts themselves, and inspecting documents that were produced by the same people who produced the artifacts. Both modes have strengths and weaknesses, but they work well together in concert.

So next time you’re trying to understand some old piece of code that ancient peoples (or you 2 months ago) wrote, remember to examine the artifacts (code itself) and the documents (tests and commit messages) together. And check out the next episode of Legacy Code Rocks!

Till next time, happy learning!

-Will