TDD and Algorithms Class

As I mentioned in a previous post, I recently started taking a course on Coursera called “Discrete Optimization.” It’s hard, and I don’t have quite enough time in my schedule to finish everything, but I am learning a great deal.

My main concern with taking this course is to learn Ruby, and learn a little bit of test driven development. If I end up learning some cool algorithms along the way, I’ll take it.

This post will be about test driven development and algorithms. Also, I have to apologize for not posting exact source code or very many exact test cases; the Coursera class has an honor code, and I want my blog post to help you learn, but not help you cheat.

The first problem we work on in the class is the Knapsack Problem. Basically, given a set of items which are discrete (i.e., we can’t take half an item), each of which item has a value and weight, what’s the most valuable knapsack that can be packed under a given weight. Another way to put the question is: “Given this list of items, each with a value and a weight, is there a subset of items with a total value greater than $100 but a total weight less than 40 lbs” for example. You can read other people’s better explanation of this problem elsewhere; this post is about how I’m approaching it.

So, I’m going to write tests first. Sandi Metz mentions in her excellent book Practical Object-Oriented Design in Ruby that beginning test writers often couple their tests too closely with their code. I’ve often, when trying to design a test, been tempted to basically write a method twice, once in the the code and once in the test, and then assert that they return the same thing. This doesn’t really help.

An algorithm for a well-defined problem gives me a great chance to separate my tests from my code: I can find behaviors that any correct algorithm will certainly follow. For example, if I am only allowed to take 0 pounds of items, the value of the items will be 0. If the allowed capacity is so high that I am able to take all of the items, then I will take all of the items. (The problem assumes that there are no items of negative value.)

These behaviors provide a perfect opportunity to write tests, because they are very definite and easy to code, and also because any correct algorithm for solving the knapsack problem will follow these behaviors. This later property is particularly helpful in an algorithms class, which will require me to write different algorithms that have different tradeoffs; one algorithm for the knapsack problem is fairly fast, and guaranteed to be optimal, but uses a prohibitively large amount of memory for knapsacks with large capacity. The nice thing about this approach to testing is that I can write tests that exercise any valid knapsack solver, and all the implementations I come up with should be able to rely on the same set of tests.

The idea for this sort of test came from an old episode of .NET Rocks where the guest mentioned behavior driven testing. (Unfortunately, I can’t find a link to the specific episode right now. If anyone can help me out, please leave a comment.) His example was testing an algorithm that reverses lists by asserting that that reversing a list twice yields the original list. This example shares the same properties: it need know nothing about how the list reversal algorithm works, and it will be true for any correct algorithm. At the time I listened to the episode, I thought “that’s a really cool idea, but I can’t remember the last time was working on a problem that had such clear behaviors.”

But I realize now I wasn’t thinking of the problems correctly. I think line of business apps can use this sort of test in many different places. For example, writing a test that asserts that an order with no products has a total of zero, or a test that adding sales tax to an order never decreases the total amount, or something.

In summary, I’ve started to get excited when I can write a test that exhibits these two properties:

  1. The behavior tested is a necessary property of the problem domain; there is no correct implementatio of the solution that doesn’t exhibit the behavior.
  2. The behavior can be checked from only the input and output of the implementation; the test is completely blind to the internal details.

It might be optimistic to think that most tests can be written that way, but I think it’s really worth taking the time to find a few.

Till next time, happy learning!

-Will

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s