Build one to throw away

Most software projects inadvertently plan to build a throwaway product for delivery to customers, instead of heeding Fred Brooks:

In most projects, the first system built is barely usable…

Plan one to throw away; you will, anyhow.

Luckily, when building Text Collector, only my own management expectations burdened me, so I had an opportunity to build a real pilot and chuck it.

I wrote the pilot in Java and it included all the major features I knew I would need. Thanks to Steve Yegg popping up for the first time in a few years, I heard about Kotlin, so I switched to Kotlin for the real program.

Now that it’s in alpha, I can reasonably compare the size of the two programs:

Java Kotlin
Lines  1.9k  3.5k
Words 6k  13k
Bytes 71k  146k

By many standards, this is small, and smaller yet when you consider that I’ve counted using simply the “wc” command, so the numbers include whitespace and comments.

The pilot included all major functionality, but ignored most edge cases. It featured mms and sms collection with:

  • Date filtering
  • Message preview (small collections only)
  • Pdf creation
  • Pdf sharing
  • Inline image rendering (no scaling)
  • Zooming (without panning or clamping)

For the real program, I kept all the pilot features except contact filtering, added handling for many edge cases, and added features:

  • Organization by conversation
  • Zip creation
  • Reporting usage statistics and errors
  • Failure diagnostics and debug features
  • Option to cancel collections in progress
  • Pre-calculated layout (allows preview for large collection)

So, the Kotlin app has roughly twice the features, with roughly twice the amount of code. At first glance, it doesn’t seem like that significant a difference, but breaking it down this way lets me count up the lines associated with new features only: 1.3 thousand. Thus, the part roughly equivalent to the pilot weighs in at 2.2k lines of Kotlin to Java’s 1.9.

In other words, Kotlin, handling edge cases, is only 15% larger than the equivalent Java that ignores edge cases with wild abandon.

My commit log shows that the Java version took about one month, while the Kotlin version took three. That seems pretty dismal: two thousand lines per month in Java and closer to one thousand in Kotlin, but…

The Java pilot had no tests, the real, Kotlin version adds 6.6 thousand lines worth of test code.

Exception Rules IV: The Voyage Stack

Something I wrote seven years ago and I’m publishing now to see if learned anything since then.

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Clean up with finally
Truth: high
Importance: high

Joshua Bloch explains the reasoning behind this in Effective Java as “strive for failure atomicity.” Whatever happens, clean up after yourself. Checked exceptions give us one of their rare benefits by just maybe reminding us to write finally blocks.

One way not to write a finally block, however, is like this:

try {
	connection = DriverManager.getConnection("stooges");
	// Snip other database code
} catch (SQLException e) {
	throw new RuntimeException(e);
} finally {
	try {
		// Bad. Do not do this.
		connection.close();
	} catch (SQLException e) {
		throw new RuntimeException(e);
	}
}

If opening the connection throws SQLException, closing the connection throws NullPointer and obscures the original cause. Usually, people suggest wrapping a conditional around the connection.close() call to check for null, but that gets ugly fast and is easy to forget. Instead, follow the next rule.

Make try blocks as small as possible
Truth: high
Importance: medium

Consider this code that obscures which file could not open:

try {
	// Smelly. Don't do this.
	curly = new FileReader("Curly");
	shemp = new FileReader("Shemp");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Whoopwoopwoopwoop", e);
} finally {
	// The close method translates IOException
	// to RuntimeException
	if (curly != null) {
		close(curly);
	}
	if (shemp != null) {
		close(shemp);
	}
}

We programmers wrap multiple lines in the same handler because we are lazy, but like the hare napping during the race, that laziness hurts us in the end; it forces us to reason much more carefully about the application state during cleanup.

Instead, handle the exception as close to its cause as possible:

try {
	curly = new FileReader("Curly");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Missing: Curly", e);
}
try {
	shemp = new FileReader("Shemp");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Missing: Shemp", e);
} finally {
	close(curly);
}
close(shemp);

The benefits of the shrunken try block may not be obvious from this tiny example, but consider how you can now factor out the file opening blocks to a method that simply throws an unchecked exception for missing files. Once done, this code simplifies down to:

curly = open("Curly");
try {
	shemp = open("Shemp");
} finally {
	close(curly);
}
close(shemp);

Note that all these examples lose the original exception when the close method throws an exception. Most applications can afford that minimal risk, but if you believe it likely that your cleanup code will throw further exceptions, a log and re-throw might be appropriate.

Do not rely on getCause()
Truth: high
Importance: high

Peeking at an exception’s cause is equivalent to using something’s privates or parsing the string representation to find a field value. It makes code brittle; moreover, you can only test it by forcing errors in your collaborators.

If you find that some library forces you to do this, consider avoiding that function completely; if you still have no way around it, adorn your code liberally with “XXX” comments, and test as best you can.

Do not catch top-level exceptions
Truth: low
Importance: low

Top-level exception classes like Exception live close to the root of the exception hierarchy. The argument against catching these says that you should avoid it because the specific lower type, such as FileNotFound, traditionally conveys information necessary to handling the exception, so by catching the top-level exception, your handler is dealing with an unknown error, which it probably knows little about.

Actually, the advice should say “do not try to recover from top-level exceptions.” Catching top-level exceptions is not fundamentally wrong or even bad, but because you do not really know how severe the exception was, you should usually do no more than report and organize a crash.

Exception Rules III: The Search for Cause

Something I wrote seven years ago. Did I learn anything in that time?

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Do not use empty catch blocks
Truth: high
Importance: high

This is the most obvious of the category of exception handling rules that address how to avoid losing error information. The usual example for when you might justifiably ignore an exception goes like so:

static void closeQuietly(Closeable closeable) {
  // Smelly. Do not do this.
  try {
    closeable.close();
  } catch (IOException e) {
    // I've already done what I needed, so I don't care
  }
}

Yes, the comment makes this better than the completely empty catch block, but that is like saying that heroin is fine because you wear long sleeves.

Very rarely, suppressing an exception actually is the right thing to do, but never unless you absolutely know why it happened. Do you know when close() throws IOException? I thought not.

Do not catch and return null
Truth: medium
Importance: medium

Catching and returning null is a minor variation on the exception-suppression theme. Consider the Stooges class, which contains this method:

public String poke(String stooge)
              throws StoogeNotFoundException {
  if (stooges.contains(stooge)) {
    return "Woopwoopwoopwoop";
  } else {
    throw new StoogeNotFoundException("Wise guy, eh");
  }
}

Suppose you want to write another method that checks a Stooge’s reaction to a poke, but Stooges gives you no isStooge method. Instead, it forces you to write this:

static String getStoogeReaction(Stooges stooges, String name) {
  try {
    return stooges.poke(name);
  } catch (StoogeNotFoundException e) {
    return null;
  }
}

If you have to use an API that uses exceptions for flow control, something like this might be your best option, but never write an API that makes your clients do it.

Log only once
Truth: medium
Importance: low

You can also state this rule as “Log or re-throw, not both.” Redundant logging is certainly impolite to those maintaining your application, but hardly the worst you could do. You might log and re-throw for legitimate reasons:

  • Your framework swallows exceptions you throw at it
  • Your application logs in a specific location, different from your container
  • This is part of a larger application, and you worry that your clients might ignore the exception

Do not log unless you need to, but if in doubt, log it.

Always keep the cause when chaining exceptions
Truth: high
Importance: high

Only the very naive intentionally do this, but it is easy to do accidentally, and a very easy way to lose the information about what went wrong.

In 2016, I still think exception chaining is a very important feature and I’ve been surprised by how many mainstream languages lack exception chaining.

Exception Rules II: The Wrath of Checked

Something I wrote seven years ago; I’m publishing now to see if I’ve learned anything.

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Use checked exceptions when the client might recover
Truth:
low
Importance: medium

The checked exception experiment tested a compelling ideal. Stated in Sun’s tutorial:

Any Exception that can be thrown by a method is part of the method’s public programming interface. Those who call a method must know about the exceptions that a method can throw so that they can decide what to do about them. These exceptions are as much a part of that method’s programming interface as its parameters and return value.

The more Java I write, the more convincing I find Bruce Eckel’s argument that the experiment proved its hypothesis false.

The tutorial writers tell us to use checked exceptions whenever our clients can do something useful to recover. They fail to mention the abstraction-destroying effects of checked exceptions.

To be fair, checked exceptions only destroy abstractions the way alcohol destroys families; if daddy stopped using so much we would be fine. But programmers are human, and humans are lazy. Especially programmers.

Laziness makes programmers suppress errors, but they hide exceptions for good reason. In the typical example, when trying to abstract away the database connection, avoid subjecting your client to SQLException. If SQLException were unchecked, abstractions that neglected its handling it would leak on error, but their programmers would not add the leakiness to their signatures.

Yes, checking the error code and determining whether to retry after waiting or email an administrator or call Ghostbusters is ideal, but only a small fraction of programs actually need that depth of error tolerance. For the majority, wrapping in RuntimeException is often best, but Sun’s tutorial will make you feel guilty about that:

Do not throw a RuntimeException or create a subclass of RuntimeException simply because you don’t want to be bothered with specifying the exceptions your methods can throw.

Ignore it. This is the sort of thinking that encourages silly specifications like FileNotFound, which indicates that the file does not exist. Or is read-only. Or locked. Or a directory. Or for some other reason inaccessible.

All languages I know other than Java work perfectly well without checked exceptions, implying that you can legitimately throw unchecked exceptions only and stop wasting brain cycles on whether you should make the exception checked or not. If, however, you still want to use checked exceptions, follow this simple guideline:

Use checked exceptions only when client code could not have anticipated the error.

FileInputStream, for example, makes itself more irritating by ignoring this advice. Its constructors should not throw FileNotFound because client code should have checked for the file’s existence before trying to open it. [Retraction: I don’t recommend check-then-act style so much any more. Better to ask forgiveness than get permission, thanks again, Python.]

I consider this guideline true even for multi-threaded use because errors of improper synchronization still land in the bucket of exceptions the client should have anticipated.

Exception Rules

I drafted this long ago, then quit my job where I was writing Java and never looked back… till now, since I’m writing for Android. I thought it would be fun to see if I’ve learned anything in the seven years since I wrote this.

The not-very-secret secret to simplifying code is really very simple: just remove error handling. One hobbled dialect of Java burdened with bulky XML syntax built its success on removing the constraints of compile-time type checking and exceptions [I meant Spring].

Fortunately, those who handle errors formed an elite group of code writers. They stand between us mere mortals and chasm of infinite code failure. I know this because Bjarne Stroustrup appeared to me in a dream and directed me to where I found the Silicon tablets that contained this group’s Java wisdom.

That is, at least, what I wish happened. Actually, no one really knows the best way to handle exceptions. I suspect this somehow relates to them being exceptional; guidelines scattered around the internet are usually incomplete and often contradictory. To make things worse, my own ideas on the matters of error handling best practices vary with the situation.

Nevertheless, I add my noise about how you should use Java’s exceptions to the rest. In this series, I summarize and categorize many of the Java error handling best practices I have heard.

I categorize each rule based on arbitrary axes of “Truth” and “Importance,” which roughly match how religiously you should follow the guideline and how severe the consequences if you do not.

Truth indicates how often you should follow the rule.

  • Low truth: Ignore the rule
  • Medium truth: Follow the rule sometimes
  • High truth: Always follow the rule

Importance indicates what happens when you do not follow the truth. That is, the damage code suffers by either following a low-truth rule or breaking a high-truth rule.

  • Low importance: No serious risks
  • Medium importance: Sometimes very dangerous
  • High importance: Always risks horrible consequences

I begin with a simple one:

Do not specify  “throws Exception”
Truth
: high
Importance: low

Throwing “Exception” pesters client coders without providing any useful information. It ranks high on truth because only sloppy laziness and stupidity cause people to violate it. Still, I rank it as a low importance because annoyance is worst consequence of violation unless coupled with breaking another rule, like using an empty catch block to suppress the Exception.

Do not use exceptions for flow control
Truth:
high
Importance: medium

Although this is the rare rule where everyone agrees, some people still break it.

Author’s note, seven years later: the linked api throws an exception to indicate login failure. I still consider that a poor design, but at the time I didn’t know about Python’s StopIteration, which actually makes sense.

Despite the universal agreement on this principle, most writers fail to give any better reason than blustering about the expense of generating stack traces. While not really premature optimization, that argument smells like it because the two are barely related. Improving performance by avoiding exception-based flow control is like improving your sex life by brushing your teeth.

The real danger of exception-based flow control lies in exception suppression. For example, the poke method lets you jab a stooge in the eye.

public String poke(String stooge)
              throws StoogeNotFoundException {
  if (stooges.contains(stooge)) {
    return "Woopwoopwoopwoop";
  } else {
    throw new StoogeNotFoundException("Wise guy, eh");
  }
}

You can write a hideous, dangerous, unforgivable isStooge method like so.

public boolean isStooge(String name) {
  // Evil. Never do this.
  try {
    poke(name);
    return true;
  } catch (Exception e) {
    return false;
  }
}

This isStooge hides and forgets any exception poke throws, not just StoogeNotFound. On average, however, exception-based flow control is not this dangerous, though still unbearably ugly.

If the snippet specified “catch (StoogeNotFoundException),” the try-catch would just be a structured replacement for goto. When done correctly, using exceptions for flow control is merely poor style used by those who long for the good old days of goto. Stay away from it for the same reason you stay away from top hats and morning coats.

Programming Android: first impressions

I suspended work on Comefrom0x10 for a little while to start my first attempt at a serious Android app. It is tentatively called “Text Collector” and essentially just makes a pdf of your text messages.

So, how is Android as a platform?

Well, first, it’s Java. This means that half my code is type declarations, the other half is keywords; we all saw that coming, move along…

The Android core api is unpleasant to use, but it could have been worse. Its main problem is severe under-documentation, apparently thanks to a bad case of “source code is the documentation” syndrome.

Though technically Java, for better or worse, it feels like an api designed by people who would rather write C. Integer constants and bitmasks are everywhere; there is even the occasional “out” parameter. On the bright side, there is a refreshing lack of abstract factory singletons. There is no xml standing in for “dependency injection” code.

There is plenty of xml for defining layouts, though. Layout xml is attribute-heavy, which means less verbose than it could have been, but also that you can’t put comments in many places where they ought to go:


<Frobnicator
  android:foo="bar" <!-- could use a comment here, but that's illegal -->
...

Thankfully, layouts and resource definitions appear to be the only places you have to use xml. In principle, you could define layouts entirely in Java, but frying pan, meet fire.

As far as I can tell, the entire Java standard library is available, but I’ve used only a few small parts of it. There are bizarro-world Android replacements of some parts. Methods that expect uris take android.net.Uri instead of java.net.URI. Bundle of Parcelable looks like it probably could just have been Map<String,Serializable>. I haven’t spent enough time with Android code to judge whether there are good reasons for this seeming duplication.

Like many apis, the core library is a mix of surprisingly easy juxtaposed with surprisingly difficult. There are some nice included layouts and widgets, like a date picker, but try hooking up a date picker to a TextView with inputType=date, and you are in for nasty surprises. Writing and displaying pdf is almost trivial, but if you want zoom and two-dimensional scrolling while you display it, expect pain.

When All You Have Is a Class

I care little for static typing, but when a language gives you static types, use the type system.

Say we have a domain class like

public class User {
  public User(String firstName,
              String lastName,
              String email,
              long balanceInDollars) {
  ...
}

I’m in agreement with Stephan that using String and primitive types here is wimping out. That is weak typing. Weak! Chuck Norris does not approve!

The post goes on to describe a more typesafe and relatively elegant implementation. While obnoxiously verbose, its type system is not Java’s real problem. The actual deficit that makes Java less expressive than many others is that it only supports one modeling paradigm, the ubiquitous class.

Take an interview question as an example:

Given an input string, write a program that lists the top n words. So for the sentence, “Do or do not,” the top 2 tokens would be:

do = 2
not = 1

Sort first by frequency, then lexically by token.

A reasonable solution takes four conceptual steps:

  1. Tokenize
  2. Count and map tokens to frequency
  3. Sort
  4. Display the first n

In Java, sorting causes the most code. The most obvious way to maintain type safety demands an entirely new class; it contains two properties, token and frequency; it implements Comparable, equals and hashcode. The implementation then sticks instances of this class into a sorted data structure.

An alternative implementation could eschew the new class, instead jamming data into two-element Object arrays then sort using a Comparator. This would likely be shorter but lose type checking along the way.

The first feels unnatural because the data and the sorting behavior have no intuitive connection. Word frequency means nothing outside the sentence while sort order means nothing outside the problem. The two should live apart but the language foists that artificial marriage on the program. To make the error in this design more clear, consider if the problem specified that sort order should be interchangeable. For example, sort same-frequency tokens by length instead.

The second approach, though more conceptually pure, loses type safety. Using a value class containing token and frequency improves it but once again that class lacks conceptual justification. The token itself is unrelated to its frequency when apart from the sentence.

Like many problems, this one only involves scraps of data and a way to sort them but both Java programs require class implementations in a problem with no natural need for modeling anything as a class.

Not only does Java make you think in classes even when inappropriate, it simply refuses to acknowledge that any other problem model exists. You want collection literals to group data? Forget type safety. First class functions? Better define an interface. Mixins? Sorry, not even multiple inheritance. Declarative domain specific languages? Good luck. When all you have is a class, everything looks like an object.

Android List

Some time ago, I wrote a shopping list app as an exercise to learn Android programming. I do not plan to maintain this or enhance it, but the code is now available.

It is just a couple simple list views. One view adds items to the list. All items ever added remain in that view, the inventory. The other view is the actual shopping list. Tapping an item in the inventory moves it to the shopping list; tapping it in the shopping list moves it back to inventory.

A right or left swipe switches between the views, which brings up an annoying factoid: the Android api does not have built-in swipe detection. Creating your own gesture detector is easy; 30 seconds on Google finds as many slightly different implementations as you could care to see. Nevertheless, swiping is a simple, common need. A side-to-side swipe detector and a zoom detector would probably cover almost every application.

This illustrates another reason to like Python:

Fans of Python use the phrase “batteries included” to describe the standard library, which covers everything from asynchronous processing to zip files.

Stop Twiddling My Bits

Googling for how to compute checksums with Java might return insanity. Quick. What does this function do?

// Java
static String twiddleDee(byte[] data) {
  StringBuffer buf = new StringBuffer();
  for (int i = 0; i < data.length; i++) {
    int halfbyte = (data[i] >>> 4) & 0x0F;
    int two_halfs = 0;
    do {
      if ((0 <= halfbyte) && (halfbyte <= 9))
        buf.append((char) ('0' + halfbyte));
      else
        buf.append((char) ('a' + (halfbyte - 10)));
      halfbyte = data[i] & 0x0F;
    } while (two_halfs++ < 1);
  }
  return buf.toString();
}

Compute a SHA-1 and output a hex string. This is my first public service code donation:

// Java
public static String sha1(byte[] itsAllBitsAfterAll) {
  MessageDigest digester = newSha1Digester();
  digester.update(itsAllBitsAfterAll);
  return bytesAsHex(digester.digest());
}

// This might make a good future post about senseless
// factories
private static MessageDigest newSha1Digester() {
  try {
    return MessageDigest.getInstance("SHA-1");
  } catch (NoSuchAlgorithmException e) {
    throw new RuntimeException(
        "How many times must exceptions be thrown?", e);
  }
}

static String bytesAsHex(byte[] bytes) {
  Formatter result = new Formatter();
  for (byte nextByte : bytes) {
    result.format("%02x", nextByte);
  }
  return result.toString();
}

Like the countless optical illusions where the lines turn out to really be straight or the colors are actually the same, the first snippet matches the “bytesAsHex” method. The first is twenty times faster, but the second is twenty times clearer.

Always write the second. If you really need to squeeze those few extra milliseconds out of your code, use a library. If you think you can improve the library, use something open source, write it better, benchmark, and contribute.

Update, seven years later: In the intervening time, I’ve become somewhat more comfortable with bitwise operations and much more wary of dependencies. Today, I would not include a library only to optimize a little bit of string formatting.