Painless Android releases

Android apps require not one, but two version numbers:

  • Version code: an integer that Android uses to check whether one version is more recent than another
  • Version name: a friendly version to display to the user, conventionally something like 1.2.3

This means that when you want to build a new release of your app, you have two things to manually update, and that is two things too many. You will make mistakes.

Luckily, it’s not too hard to automate this away in your Gradle build script.

Gradle inherited much of its design from Apache Maven. Maven defined a standard release feature that automatically handles typical pitfalls and mindless details of making a release: tagging in source control and incrementing your version number. For Gradle, there is a nice third-party implementation, the gradle-release plugin. So long as you don’t fight Maven-style version conventions, it can make cutting releases almost entirely automatic, modulo prompting you to confirm that it guessed the correct version numbers.

If your project only has one version number, you just apply the release plugin and you’re done, but Android’s two-version-number system takes some customization.

I only discuss version numbers here, but the release plugin also does several other useful sanity checks.

First, move the versions out of your app/build.gradle into app/gradle.properties. They should look like so:

app/gradle.properties

version=1.0-SNAPSHOT
versionCode=1

app/build.gradle

android {
    // ...
    defaultConfig {
        versionCode project.versionCode.toInteger()
        versionName project.version
        // ...

“SNAPSHOT” is Maven’s convention for “between releases”. Version 1.0-SNAPSHOT means the code leading up to version 1.0. This convention is how the release plugin guesses what version number you are releasing: it just lops off the suffix.

When you run ./gradlew release, the release plugin updates the version thus:

  1. Edits gradle.properties, removing the “snapshot” part
    1.0-SNAPSHOT becomes 1.0
  2. Commits the change and tags this as version 1.0 in source control
  3. Builds the release
  4. Edits gradle.properties again, to next dev version
    1.0 becomes 1.1-SNAPSHOT
  5. Commits so you can immediately start working on version 1.1
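
The two version edits in that sequence amount to simple string manipulation. Here is a minimal Java sketch of the idea, not the plugin’s actual code: the class and method names are made up for illustration, and the real plugin also prompts you to confirm each guess.

```java
final class VersionNames {
    private static final String SUFFIX = "-SNAPSHOT";

    /** "1.0-SNAPSHOT" becomes "1.0"; a version without the suffix is returned unchanged. */
    static String unSnapshot(String version) {
        return version.endsWith(SUFFIX)
                ? version.substring(0, version.length() - SUFFIX.length())
                : version;
    }

    /** "1.0" becomes "1.1-SNAPSHOT": bump the last numeric component, then re-add the suffix. */
    static String nextSnapshot(String released) {
        int lastDot = released.lastIndexOf('.');
        String head = released.substring(0, lastDot + 1);
        int last = Integer.parseInt(released.substring(lastDot + 1));
        return head + (last + 1) + SUFFIX;
    }
}
```

This sketch assumes the last component is a plain integer; anything fancier (say, 1.0-rc1) is exactly the kind of convention-fighting that makes the guess need correcting.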

Thus, out of the box, this handles the user-friendly version number, but not the “version code.”

Updating the version code

When Android installs an update to an app, it knows by version code whether the update is newer than what it currently has installed. 3 is newer than 2 and so on.

Thus, the obvious strategy for updating your version code is to add one on every release. If using the release plugin, you might do this as a manual step after it finishes a release. If you forget, you’ll accidentally build your next release with the same version code as you just used. If you have other branches, you need to remember to update them as well. Ouch.

There is a better way. Version codes need not be sequential, so instead of incrementing 1,2,3…, we can derive it from the date. A format like [2-digit year][month][day][0-9] works nicely. A release today gets version code 1704080, tomorrow, 1704090.

This format will cover you for 82 years at up to ten releases a day. If that’s not enough for you, use a four-digit year and a two-digit suffix, but watch out for integer overflow in 130 years or so.
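
To make the arithmetic concrete, here is a rough Java sketch of the date-based scheme. VersionCodes and next are hypothetical names, and the fallback of adding one for a second release on the same day is my reading of the trailing [0-9] digit.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class VersionCodes {
    // Lowercase "yy" is the calendar year; uppercase "YY" would be the week-based year.
    private static final DateTimeFormatter YYMMDD = DateTimeFormatter.ofPattern("yyMMdd");

    /** Returns the version code to use for a release on releaseDay, given the current code. */
    static int next(int current, LocalDate releaseDay) {
        // First release of the day ends in 0, e.g. 2017-04-08 gives 1704080.
        int firstOfDay = Integer.parseInt(releaseDay.format(YYMMDD)) * 10;
        // A later release on the same day just takes the next integer.
        return firstOfDay > current ? firstOfDay : current + 1;
    }
}
```

Because the result only ever grows, an eleventh release in one day would spill into the next day’s range rather than collide with an older code.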

The date-based strategy, however, means that you have to set your “version code” immediately before you release, instead of after. To do this, add a Gradle task right before updating version name.

app/build.gradle

task setVersionCode { doLast {
    // Add a task that updates version code
    def current = project.versionCode.toInteger()
    def releaseAs = new Date().format('yyMMdd0', TimeZone.getTimeZone('UTC'))
    if (releaseAs.toInteger() <= current) {
        // More than one release today
        releaseAs = current + 1
    }
    def releaser = project.plugins[net.researchgate.release.ReleasePlugin]
    def propsFile = releaser.findPropertiesFile()
    def props = new Properties()
    propsFile.withInputStream { props.load(it) }
    props.versionCode = releaseAs.toString()
    propsFile.withOutputStream { props.store(it, null) }
}}
// Execute our task before unSnapshotVersion, provided by the release plugin:
unSnapshotVersion.dependsOn setVersionCode

With this simple build script change (plus applying the release plugin), a single command updates both version numbers:

./gradlew release

The release plugin also runs the “build” task at the point of release, so this single command leaves you with both a release .apk and your working directory updated to the tip (snapshot) code ready to start work on the next release. There’s still a problem though: if you haven’t configured your build script to sign the build, you won’t be able to publish the release .apk.

Signing the build

To make Gradle sign a build, you need to add a “signingConfig”:

android {
    // ...
    signingConfigs {
        release {
            storeFile file('/home/myname/.javakeys/mykeys.jks')
            keyAlias 'myappsigningkey'
            // These two lines make gradle believe that the signingConfigs
            // section is complete. Without them, tasks like installRelease
            // will not be available! (see http://stackoverflow.com/a/19350401)
            storePassword "notYourRealPassword"
            keyPassword "notYourRealPassword"
        }
    }
    buildTypes {
        release {
            signingConfig signingConfigs.release
            // ...

This fails, so you put your real password in the “password” config place and get pwned. Your wife leaves you, and your dog dies. You didn’t do that, right?

So where should you put your password? The top-voted answer on Stack Overflow says ~/.gradle/gradle.properties, presumably protected by 600 permissions. I don’t see the point. If you’re relying on file system permissions to keep the password secure, why have the password at all? You could just protect the keystore with file system permissions.

What you need is a prompt for the password.

Thanks to bug 1251, Gradle running in daemon mode (the default) doesn’t let you use System.console().readPassword("Password:"). You can disable daemon mode, but then you run afoul of (orphaned?) bug 2357 because Android Studio generates a default gradle.properties that includes jvmargs. Once you remove that configuration, you find that prompts don’t display when you build without the daemon (bug 869). That’s a pain because you can’t see the version number confirmation prompts.

As a result of this epic adventure, you’ll eventually find that the only reliable way to prompt for password is via Swing. No, I’m not joking. It’s not as gruesome as it sounds, thanks to Groovy’s Swing builder, so pop over to where Tim Roes documented how to do it.

Inheritance: is-a has-a

Lots of things we learn in school turn out to be naive simplifications of how the real world works, and sometimes we later learn, to our chagrin, that the way we thought about the world really isn’t true at all. Take that familiar organization of life into a giant tree: kingdom, phylum, genus, species. It seems neat enough, but in the grown-up world, people can spend lifetimes arguing about where things fit in this classification.

A related simplification that I learned in school was the rule of when to use inheritance versus composition. It went like so: in this assignment, you simulate a world full of monsters. Zombie is a type of monster, so zombie should inherit from monster. On the other hand, vampires have a coffin, so vampire should have a field that refers to coffin. Now make a UML diagram.

This makes sense as far as it goes, but there’s a major problem: it’s not usually a useful way to think about inheritance when building real programs.

The is-a versus has-a perspective makes most sense when thinking about type systems. If a function takes an argument of type monster, it can also take any type of monster, either vampire or zombie. The trouble starts when you use the same reasoning to design a program and it comes back to our taxonomy problem.

You start designing a system by figuring out what your different things are: zombies, vampires, ghosts, coffins and so on. It’s easy enough: three types of monsters, each a class that inherits from monster, and coffin, its own thing. Naturally, you also need people; people need places to live and ghosts need places to haunt, so you have houses. But wait, people aren’t monsters, but they have a lot in common, so they need a base class, say living things. But that’s not quite right; the monsters aren’t technically alive, so maybe they are dead things. Also, houses and coffins seem to be of a non-living type, so that’s another base class. Should it be dead things? If the coffin is made of wood, it used to be alive, so maybe that makes sense.

Most real-world characteristics of things are completely irrelevant to most programs. In our simulation, perhaps the only thing ghosts do is haunt, whereas vampires and zombies bite people but don’t haunt. It’s confusing and wasteful to worry about how they are all types of monsters, who are types of dead things and so forth.
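
As a rough illustration, here is what the simulation might look like in Java if you model only what the program does. Every name here (MonsterSim, Haunter, Biter, biteEveryone) is invented for this sketch; there is no Monster or DeadThing base class because nothing in the program needs one.

```java
import java.util.ArrayList;
import java.util.List;

final class MonsterSim {
    // Two small capabilities, named after what the simulation actually needs.
    interface Haunter { String haunt(String house); }
    interface Biter { String bite(String victim); }

    static final class Ghost implements Haunter {
        public String haunt(String house) { return "Ghost haunts " + house; }
    }

    static final class Vampire implements Biter {
        public String bite(String victim) { return "Vampire bites " + victim; }
    }

    static final class Zombie implements Biter {
        public String bite(String victim) { return "Zombie bites " + victim; }
    }

    /** Lets every biter have a go at the victim and returns what happened, in order. */
    static List<String> biteEveryone(List<? extends Biter> biters, String victim) {
        List<String> events = new ArrayList<>();
        for (Biter biter : biters) {
            events.add(biter.bite(victim));
        }
        return events;
    }
}
```

If vampires later need a coffin, that can be a field on Vampire; no taxonomy has to be renegotiated.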

Now, occasionally, it does make sense to think of inheritance as an is-a relationship. The cf0x10 parse tree, for example, is a pile of subclasses. When this type of design makes sense, however, it will be obvious; no need to shoehorn everything into it.

What about other metaphors? It’s common, for example, to say that instances of classes are receivers while method calls are messages to that receiver. That’s a useful perspective for language design and it’s useful to have a name for that bit before the dot – receiver.message() – but, again it’s not so helpful a metaphor when designing a program.

In real programs, metaphors like these just tend to cause trouble. Software isn’t made of physical things. A class, in reality, is just a way to group related bits of a program. I prefer not to start by creating any design for a class hierarchy; instead I write code that does the things I need it to do. A class hierarchy, if any, usually emerges from unifying the bits that make sense to put together.

Exception Rules IV: The Voyage Stack

Something I wrote seven years ago and I’m publishing now to see if I learned anything since then.

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Clean up with finally
Truth: high
Importance: high

Joshua Bloch explains the reasoning behind this in Effective Java as “strive for failure atomicity.” Whatever happens, clean up after yourself. Checked exceptions give us one of their rare benefits by just maybe reminding us to write finally blocks.

One way not to write a finally block, however, is like this:

try {
	connection = DriverManager.getConnection("stooges");
	// Snip other database code
} catch (SQLException e) {
	throw new RuntimeException(e);
} finally {
	try {
		// Bad. Do not do this.
		connection.close();
	} catch (SQLException e) {
		throw new RuntimeException(e);
	}
}

If opening the connection throws SQLException, closing the connection throws NullPointer and obscures the original cause. Usually, people suggest wrapping a conditional around the connection.close() call to check for null, but that gets ugly fast and is easy to forget. Instead, follow the next rule.
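
The masking is easy to reproduce without a database. In this small sketch, openBroken is a hypothetical stand-in for a failed DriverManager.getConnection call, and IOException stands in for SQLException; the shape of the try block is the same as above.

```java
import java.io.Closeable;
import java.io.IOException;

final class MaskedFailureDemo {
    /** Hypothetical stand-in for a connection attempt that always fails. */
    static Closeable openBroken() throws IOException {
        throw new IOException("cannot connect to stooges");
    }

    static void badCleanup() {
        Closeable connection = null;
        try {
            connection = openBroken();
        } catch (IOException e) {
            // This exception carries the real cause...
            throw new RuntimeException(e);
        } finally {
            try {
                // ...but connection is still null, so this line throws
                // NullPointerException, which replaces the exception above.
                connection.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}
```

Whoever calls badCleanup sees only the NullPointerException; the message about the failed connection never reaches them.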

Make try blocks as small as possible
Truth: high
Importance: medium

Consider this code that obscures which file could not open:

try {
	// Smelly. Don't do this.
	curly = new FileReader("Curly");
	shemp = new FileReader("Shemp");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Whoopwoopwoopwoop", e);
} finally {
	// The close method translates IOException
	// to RuntimeException
	if (curly != null) {
		close(curly);
	}
	if (shemp != null) {
		close(shemp);
	}
}

We programmers wrap multiple lines in the same handler because we are lazy, but like the hare napping during the race, that laziness hurts us in the end; it forces us to reason much more carefully about the application state during cleanup.

Instead, handle the exception as close to its cause as possible:

try {
	curly = new FileReader("Curly");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Missing: Curly", e);
}
try {
	shemp = new FileReader("Shemp");
} catch (FileNotFoundException e) {
	throw new RuntimeException("Missing: Shemp", e);
} finally {
	close(curly);
}
close(shemp);

The benefits of the shrunken try block may not be obvious from this tiny example, but consider how you can now factor out the file opening blocks to a method that simply throws an unchecked exception for missing files. Once done, this code simplifies down to:

curly = open("Curly");
try {
	shemp = open("Shemp");
} finally {
	close(curly);
}
close(shemp);
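
The open and close helpers are not shown in this post, so here is one plausible way they could look. Treat it as a sketch under assumptions: the class name ReaderHelpers and the exact messages are mine.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

final class ReaderHelpers {
    /** Opens a file for reading, turning a missing file into an unchecked exception. */
    static Reader open(String path) {
        try {
            return new FileReader(path);
        } catch (FileNotFoundException e) {
            // Name the file and keep the cause, so nothing is lost.
            throw new RuntimeException("Missing: " + path, e);
        }
    }

    /** Closes a reader, translating IOException to RuntimeException. */
    static void close(Reader reader) {
        try {
            reader.close();
        } catch (IOException e) {
            throw new RuntimeException("Could not close reader", e);
        }
    }
}
```

Because each helper wraps exactly one call, the error message can say which file was the problem without any extra bookkeeping.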

Note that all these examples lose the original exception when the close method throws an exception. Most applications can afford that minimal risk, but if you believe it likely that your cleanup code will throw further exceptions, a log and re-throw might be appropriate.

Do not rely on getCause()
Truth: high
Importance: high

Peeking at an exception’s cause is equivalent to using something’s privates or parsing the string representation to find a field value. It makes code brittle; moreover, you can only test it by forcing errors in your collaborators.

If you find that some library forces you to do this, consider avoiding that function completely; if you still have no way around it, adorn your code liberally with “XXX” comments, and test as best you can.

Do not catch top-level exceptions
Truth: low
Importance: low

Top-level exception classes like Exception live close to the root of the exception hierarchy. The argument against catching these says that you should avoid it because the specific lower type, such as FileNotFound, traditionally conveys information necessary to handling the exception, so by catching the top-level exception, your handler is dealing with an unknown error, which it probably knows little about.

Actually, the advice should say “do not try to recover from top-level exceptions.” Catching top-level exceptions is not fundamentally wrong or even bad, but because you do not really know how severe the exception was, you should usually do no more than report and organize a crash.
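
A minimal sketch of “report and organize a crash” might look like the following. CrashReporter and runGuarded are hypothetical names, and printing to standard error is a placeholder for whatever reporting your application really uses.

```java
final class CrashReporter {
    /** Runs the application and returns a process exit code: 0 on success, 1 on any exception. */
    static int runGuarded(Runnable application) {
        try {
            application.run();
            return 0;
        } catch (Exception e) {
            // We cannot know how severe this is, so report it and stop;
            // do not attempt to recover and carry on.
            System.err.println("Fatal error, shutting down: " + e);
            e.printStackTrace();
            return 1;
        }
    }
}
```

A main method could then end with System.exit(CrashReporter.runGuarded(...)), keeping the one catch-all handler at the very top of the program.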

Exception Rules III: The Search for Cause

Something I wrote seven years ago. Did I learn anything in that time?

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Do not use empty catch blocks
Truth: high
Importance: high

This is the most obvious of the category of exception handling rules that address how to avoid losing error information. The usual example for when you might justifiably ignore an exception goes like so:

static void closeQuietly(Closeable closeable) {
  // Smelly. Do not do this.
  try {
    closeable.close();
  } catch (IOException e) {
    // I've already done what I needed, so I don't care
  }
}

Yes, the comment makes this better than the completely empty catch block, but that is like saying that heroin is fine because you wear long sleeves.

Very rarely, suppressing an exception actually is the right thing to do, but never unless you absolutely know why it happened. Do you know when close() throws IOException? I thought not.

Do not catch and return null
Truth: medium
Importance: medium

Catching and returning null is a minor variation on the exception-suppression theme. Consider the Stooges class, which contains this method:

public String poke(String stooge)
              throws StoogeNotFoundException {
  if (stooges.contains(stooge)) {
    return "Woopwoopwoopwoop";
  } else {
    throw new StoogeNotFoundException("Wise guy, eh");
  }
}

Suppose you want to write another method that checks a Stooge’s reaction to a poke, but Stooges gives you no isStooge method. Instead, it forces you to write this:

static String getStoogeReaction(Stooges stooges, String name) {
  try {
    return stooges.poke(name);
  } catch (StoogeNotFoundException e) {
    return null;
  }
}

If you have to use an API that uses exceptions for flow control, something like this might be your best option, but never write an API that makes your clients do it.
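
For contrast, here is a sketch of a Stooges-like class that does not push its clients into this corner. The class name, the stooge list, and the switch to an unchecked IllegalArgumentException are all my assumptions, not part of the original Stooges class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class FriendlierStooges {
    private final Set<String> stooges =
            new HashSet<>(Arrays.asList("Larry", "Curly", "Moe"));

    /** The query clients were missing: ask first, no exception needed. */
    boolean isStooge(String name) {
        return stooges.contains(name);
    }

    /** Poking a non-stooge is now a caller bug, reported with an unchecked exception. */
    String poke(String stooge) {
        if (!isStooge(stooge)) {
            throw new IllegalArgumentException("Wise guy, eh: " + stooge);
        }
        return "Woopwoopwoopwoop";
    }
}
```

With isStooge available, getStoogeReaction no longer needs a try block to answer an ordinary yes-or-no question.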

Log only once
Truth: medium
Importance: low

You can also state this rule as “Log or re-throw, not both.” Redundant logging is certainly impolite to those maintaining your application, but hardly the worst you could do. You might log and re-throw for legitimate reasons:

  • Your framework swallows exceptions you throw at it
  • Your application logs in a specific location, different from your container
  • This is part of a larger application, and you worry that your clients might ignore the exception

Do not log unless you need to, but if in doubt, log it.

Always keep the cause when chaining exceptions
Truth: high
Importance: high

Only the very naive intentionally do this, but it is easy to do accidentally, and a very easy way to lose the information about what went wrong.
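
The accidental version usually looks innocent. In this sketch (ChainingDemo and both method names are made up), the first method keeps only the message, while the second passes the original exception along as the cause.

```java
import java.io.IOException;

final class ChainingDemo {
    /** Easy to write by accident: the original stack trace and exception type are gone. */
    static RuntimeException withoutCause(IOException e) {
        return new RuntimeException("Could not load stooges: " + e.getMessage());
    }

    /** Same wrapping, but the original exception travels along as the cause. */
    static RuntimeException withCause(IOException e) {
        return new RuntimeException("Could not load stooges", e);
    }
}
```

Only the second form prints a “Caused by:” section in the stack trace, which is usually where the useful information is.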

In 2016, I still think exception chaining is a very important feature and I’ve been surprised by how many mainstream languages lack exception chaining.

Exception Rules II: The Wrath of Checked

Something I wrote seven years ago; I’m publishing now to see if I’ve learned anything.

This is part of a series where I review common wisdom about Java error handling. The series is in no particular order, but the first installment explains my categorization method.

Use checked exceptions when the client might recover
Truth: low
Importance: medium

The checked exception experiment tested a compelling ideal. Stated in Sun’s tutorial:

Any Exception that can be thrown by a method is part of the method’s public programming interface. Those who call a method must know about the exceptions that a method can throw so that they can decide what to do about them. These exceptions are as much a part of that method’s programming interface as its parameters and return value.

The more Java I write, the more convincing I find Bruce Eckel’s argument that the experiment proved its hypothesis false.

The tutorial writers tell us to use checked exceptions whenever our clients can do something useful to recover. They fail to mention the abstraction-destroying effects of checked exceptions.

To be fair, checked exceptions only destroy abstractions the way alcohol destroys families; if daddy stopped using so much we would be fine. But programmers are human, and humans are lazy. Especially programmers.

Laziness makes programmers suppress errors, but they hide exceptions for good reason. In the typical example, when trying to abstract away the database connection, avoid subjecting your client to SQLException. If SQLException were unchecked, abstractions that neglected to handle it would leak on error, but their programmers would not add the leakiness to their signatures.

Yes, checking the error code and determining whether to retry after waiting or email an administrator or call Ghostbusters is ideal, but only a small fraction of programs actually need that depth of error tolerance. For the majority, wrapping in RuntimeException is often best, but Sun’s tutorial will make you feel guilty about that:

Do not throw a RuntimeException or create a subclass of RuntimeException simply because you don’t want to be bothered with specifying the exceptions your methods can throw.

Ignore it. This is the sort of thinking that encourages silly specifications like FileNotFound, which indicates that the file does not exist. Or is read-only. Or locked. Or a directory. Or for some other reason inaccessible.

All languages I know other than Java work perfectly well without checked exceptions, implying that you can legitimately throw unchecked exceptions only and stop wasting brain cycles on whether you should make the exception checked or not. If, however, you still want to use checked exceptions, follow this simple guideline:

Use checked exceptions only when client code could not have anticipated the error.

FileInputStream, for example, makes itself more irritating by ignoring this advice. Its constructors should not throw FileNotFound because client code should have checked for the file’s existence before trying to open it. [Retraction: I don’t recommend check-then-act style so much any more. Better to ask forgiveness than get permission, thanks again, Python.]

I consider this guideline true even for multi-threaded use because errors of improper synchronization still land in the bucket of exceptions the client should have anticipated.

Exception Rules

I drafted this long ago, then quit my job where I was writing Java and never looked back… till now, since I’m writing for Android. I thought it would be fun to see if I’ve learned anything in the seven years since I wrote this.

The not-very-secret secret to simplifying code is really very simple: just remove error handling. One hobbled dialect of Java burdened with bulky XML syntax built its success on removing the constraints of compile-time type checking and exceptions [I meant Spring].

Fortunately, those who handle errors formed an elite group of code writers. They stand between us mere mortals and the chasm of infinite code failure. I know this because Bjarne Stroustrup appeared to me in a dream and directed me to where I found the silicon tablets that contained this group’s Java wisdom.

That is, at least, what I wish happened. Actually, no one really knows the best way to handle exceptions. I suspect this somehow relates to them being exceptional; guidelines scattered around the internet are usually incomplete and often contradictory. To make things worse, my own ideas on the matters of error handling best practices vary with the situation.

Nevertheless, I add my noise about how you should use Java’s exceptions to the rest. In this series, I summarize and categorize many of the Java error handling best practices I have heard.

I categorize each rule based on arbitrary axes of “Truth” and “Importance,” which roughly match how religiously you should follow the guideline and how severe the consequences if you do not.

Truth indicates how often you should follow the rule.

  • Low truth: Ignore the rule
  • Medium truth: Follow the rule sometimes
  • High truth: Always follow the rule

Importance indicates what happens when you do not follow the truth. That is, the damage code suffers by either following a low-truth rule or breaking a high-truth rule.

  • Low importance: No serious risks
  • Medium importance: Sometimes very dangerous
  • High importance: Always risks horrible consequences

I begin with a simple one:

Do not specify “throws Exception”
Truth: high
Importance: low

Throwing “Exception” pesters client coders without providing any useful information. It ranks high on truth because only sloppy laziness and stupidity cause people to violate it. Still, I rank it as low importance because annoyance is the worst consequence of violation unless coupled with breaking another rule, like using an empty catch block to suppress the Exception.

Do not use exceptions for flow control
Truth: high
Importance: medium

Although this is the rare rule where everyone agrees, some people still break it.

Author’s note, seven years later: the linked api throws an exception to indicate login failure. I still consider that a poor design, but at the time I didn’t know about Python’s StopIteration, which actually makes sense.

Despite the universal agreement on this principle, most writers fail to give any better reason than blustering about the expense of generating stack traces. While not really premature optimization, that argument smells like it because the two are barely related. Improving performance by avoiding exception-based flow control is like improving your sex life by brushing your teeth.

The real danger of exception-based flow control lies in exception suppression. For example, the poke method lets you jab a stooge in the eye.

public String poke(String stooge)
              throws StoogeNotFoundException {
  if (stooges.contains(stooge)) {
    return "Woopwoopwoopwoop";
  } else {
    throw new StoogeNotFoundException("Wise guy, eh");
  }
}

You can write a hideous, dangerous, unforgivable isStooge method like so.

public boolean isStooge(String name) {
  // Evil. Never do this.
  try {
    poke(name);
    return true;
  } catch (Exception e) {
    return false;
  }
}

This isStooge hides and forgets any exception poke throws, not just StoogeNotFound. On average, however, exception-based flow control is not this dangerous, though still unbearably ugly.

If the snippet specified “catch (StoogeNotFoundException),” the try-catch would just be a structured replacement for goto. When done correctly, using exceptions for flow control is merely poor style used by those who long for the good old days of goto. Stay away from it for the same reason you stay away from top hats and morning coats.

Programming Android: first impressions

I suspended work on Comefrom0x10 for a little while to start my first attempt at a serious Android app. It is tentatively called “Text Collector” and essentially just makes a pdf of your text messages.

So, how is Android as a platform?

Well, first, it’s Java. This means that half my code is type declarations, the other half is keywords; we all saw that coming, move along…

The Android core api is unpleasant to use, but it could have been worse. Its main problem is severe under-documentation, apparently thanks to a bad case of “source code is the documentation” syndrome.

Though technically Java, for better or worse, it feels like an api designed by people who would rather write C. Integer constants and bitmasks are everywhere; there is even the occasional “out” parameter. On the bright side, there is a refreshing lack of abstract factory singletons. There is no xml standing in for “dependency injection” code.

There is plenty of xml for defining layouts, though. Layout xml is attribute-heavy, which means less verbose than it could have been, but also that you can’t put comments in many places where they ought to go:


<Frobnicator
  android:foo="bar" <!-- could use a comment here, but that's illegal -->
...

Thankfully, layouts and resource definitions appear to be the only places you have to use xml. In principle, you could define layouts entirely in Java, but frying pan, meet fire.

As far as I can tell, the entire Java standard library is available, but I’ve used only a few small parts of it. There are bizarro-world Android replacements of some parts. Methods that expect uris take android.net.Uri instead of java.net.URI. Bundle of Parcelable looks like it probably could just have been Map<String,Serializable>. I haven’t spent enough time with Android code to judge whether there are good reasons for this seeming duplication.

Like many apis, the core library is a mix of surprisingly easy juxtaposed with surprisingly difficult. There are some nice included layouts and widgets, like a date picker, but try hooking up a date picker to a TextView with inputType=date, and you are in for nasty surprises. Writing and displaying pdf is almost trivial, but if you want zoom and two-dimensional scrolling while you display it, expect pain.

Use the string literals

I think that some of my college classes took points off your grade for using literal strings instead of #define. Likewise, linters and coding standards typically want to prevent programmers from using literals.

Many indoctrinated programmers, therefore, insist that the correct way of writing a select statement is something like this:

"select " + COLUMN_NAME + ", " + COLUMN_EMAIL
+ " from " + TABLE_PEOPLE
+ " where " + COLUMN_ID + " = ? "

This jihad gains its supposed righteousness from the idea that literals make your code less maintainable. But, aside from thorough tests, what does make code maintainable? In order of importance:

  1. Readability
  2. Fewer dependencies
  3. All else equal, shorter is better

So, assume for a moment that the most maintainable code is the most readable code, and compare:

"select name, email from people where id = ?"

I suggest that the second statement is far more readable, and therefore more maintainable. It is also shorter, even without including the constant declarations, and it removes a dependency on constants defined somewhere else.

Ha! You’re ignoring the dependency on the database structure. If I want to change a column or table name, the first code is dryer.

Perhaps, but structural database changes are complex and should not be undertaken with the flippant attitude that you can do them simply by changing the value of a constant somewhere. Consider, just for starters, that if you change column name “email” to, say, “primary_email”, you will probably want to change the name of the constant to match and you will still have to search for any places the string literal might have been used, just in case.

Ok, but at least I will spend less time tracking down bugs related to spelling errors.

Sure, but seriously, how much time do you spend on spelling errors? They are typically among the easiest bugs to fix. The cumulative effort of fixing typos is less than the effort involved in managing your constants.

Life without literals is burdensome. Where should the constant declaration go? Does a definition for the same value already exist in scope? If it does, and has the same name as what you want to use, does it refer to the same thing? You add a “department” table that also has a “name” column; should you (a) use the existing constant, (b) make a new constant with a non-conflicting name, (c) rename the existing constant or, (d) both b and c?

But we have a coding standard that says what to do, so I can just follow that.

That’s just one more thing you need to remember. Do you remember precisely what your coding standard says to do? Does it even cover every case?

Does your coding standard say that uppercase identifiers must never be built from external data? Even if it does, unless you can see the constant definition in the same screenful of text as where it’s used, your code is not obviously safe from injection.

Instead of taking on this travail, just funnel your energy into picking good names for your tables, urls, or whatever in the first place. You’ll need fewer structural changes later.

Good points, but constants can add meaning, instead of being “magic numbers” sprinkled about.

True, adding meaning (see “readability”) is the one good reason for a constant instead of a literal, but usually this applies only to numbers, not strings.

What about localization?

Well… there are some good arguments for externalizing strings that appear to users, but that’s another article.

Beware string concatenation

The OWASP top 10 webapp vulnerability list is due to update soon; I wager its top three won’t change from the 2013 list:

  1. Injection [mainly sql injection]
  2. Broken Authentication and Session Management
  3. Cross-Site Scripting (XSS)

What do two of the top three vulnerabilities have in common? Cross-site scripting is a form of injection attack that is so popular it deserves its own category; both cross-site scripting and “injection” often result from unsafe string concatenation:

"select name from students where " + ...
html = "<div>" + ...

I bet one simple rule would stop most webapp compromises:

Avoid string concatenation.

Production code should very rarely concatenate strings; when it must be done, it should be written to make it obvious that the concatenation is safe.

Assuming most programmers do not intend to write vulnerable code, it is safe to say that they consider the vulnerable code they are writing to be obviously safe. That is just a special case of “programmers don’t think about the bugs they are writing.” By extension, they don’t think about the vulnerabilities they are writing.

So, I dislike rigid coding standards, but if I were to implement one, it would be to say that any place string concatenation is used must be accompanied by comments to answer:

  • Where is the data coming from?
  • What could happen if a malicious actor provided the data?
  • What makes this safe?
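
Here is a sketch of what a commented concatenation could look like in Java. The Html class and its two methods are hypothetical, and the hand-rolled escape function is only for illustration; a real application would more likely lean on a templating library that escapes by default.

```java
final class Html {
    /** Replaces the five characters that can break out of HTML text or a quoted attribute. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    static String greeting(String displayName) {
        // Where is the data coming from? displayName is typed in by the user.
        // What could a malicious actor do? Submit markup such as a script tag.
        // What makes this safe? escape() rewrites &, <, >, " and ', and the
        // value lands only in element content, never in a script or URL.
        return "<div>Hello, " + escape(displayName) + "</div>";
    }
}
```

The three comments cost a minute to write, and they force the author to notice when the third answer is “nothing.”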

If you have so much string concatenation in your webapp that this seems onerous, I guarantee your site is full of security holes.

Stop Twiddling My Bits

Googling for how to compute checksums with Java might return insanity. Quick. What does this function do?

// Java
static String twiddleDee(byte[] data) {
  StringBuffer buf = new StringBuffer();
  for (int i = 0; i < data.length; i++) {
    int halfbyte = (data[i] >>> 4) & 0x0F;
    int two_halfs = 0;
    do {
      if ((0 <= halfbyte) && (halfbyte <= 9))
        buf.append((char) ('0' + halfbyte));
      else
        buf.append((char) ('a' + (halfbyte - 10)));
      halfbyte = data[i] & 0x0F;
    } while (two_halfs++ < 1);
  }
  return buf.toString();
}

Here is how to compute a SHA-1 and output a hex string. This is my first public service code donation:

// Java
public static String sha1(byte[] itsAllBitsAfterAll) {
  MessageDigest digester = newSha1Digester();
  digester.update(itsAllBitsAfterAll);
  return bytesAsHex(digester.digest());
}

// This might make a good future post about senseless
// factories
private static MessageDigest newSha1Digester() {
  try {
    return MessageDigest.getInstance("SHA-1");
  } catch (NoSuchAlgorithmException e) {
    throw new RuntimeException(
        "How many times must exceptions be thrown?", e);
  }
}

static String bytesAsHex(byte[] bytes) {
  Formatter result = new Formatter();
  for (byte nextByte : bytes) {
    result.format("%02x", nextByte);
  }
  return result.toString();
}

Like the countless optical illusions where the lines turn out to really be straight or the colors are actually the same, the first snippet matches the “bytesAsHex” method. The first is twenty times faster, but the second is twenty times clearer.

Always write the second. If you really need to squeeze those few extra milliseconds out of your code, use a library. If you think you can improve the library, use something open source, write it better, benchmark, and contribute.

Update, seven years later: In the intervening time, I’ve become somewhat more comfortable with bitwise operations and much more wary of dependencies. Today, I would not include a library only to optimize a little bit of string formatting.