
Wednesday, April 17, 2013

String.format

Most of the time when you are dealing with strings, you want to construct them dynamically from some variables, whether they are integers, objects, or strings themselves. You might end up with code that looks something like this:
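A minimal sketch of that kind of code, with a made-up method and made-up variable names (user, done, total) purely for illustration:

```java
public class ConcatExample {
    // Hypothetical helper: builds a status line by gluing literals and
    // variables together with '+'. Note how the quotes, spaces, and
    // parentheses are scattered between the variables.
    static String describe(String user, int done, int total) {
        return "User " + user + " has completed " + done + " of " + total
                + " tasks (" + (100 * done / total) + "%).";
    }

    public static void main(String[] args) {
        // prints: User alice has completed 3 of 12 tasks (25%).
        System.out.println(describe("alice", 3, 12));
    }
}
```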

To most people, this is not the nicest-looking code: it can be hard to parse while reading and easy to mess up while writing. Java offers an alternative that behaves like good old printf, namely the String.format method. Instead of inlining the variables as in the above example, you put format specifiers in their place and pass the variables as arguments after the format string. This has the additional benefit of letting you control how each value is rendered (e.g. with leading zeros). For example:
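Here is a small sketch of the String.format style. The method names and variables (describe, invoiceId, user, done, total) are invented for illustration; the specifiers themselves (%s, %d, %05d, %%) are standard java.util.Formatter syntax:

```java
public class FormatExample {
    // Hypothetical helper: the whole sentence is visible in one literal,
    // and the variables are listed afterwards in the same order as the
    // specifiers. "%%" produces a literal percent sign.
    static String describe(String user, int done, int total) {
        return String.format("User %s has completed %d of %d tasks (%d%%).",
                user, done, total, 100 * done / total);
    }

    // Hypothetical helper: "%05d" pads the number with leading zeros
    // to a width of five digits.
    static String invoiceId(int number) {
        return String.format("INV-%05d", number);
    }

    public static void main(String[] args) {
        // prints: User alice has completed 3 of 12 tasks (25%).
        System.out.println(describe("alice", 3, 12));
        // prints: INV-00042
        System.out.println(invoiceId(42));
    }
}
```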

In many cases, this is preferable because it is easier to see what the final string should look like. There are some disadvantages, such as the increased difficulty of telling which variable goes with which format specifier, but good naming conventions go a long way toward overcoming that.

Now that I've introduced (or reintroduced) you to this handy feature of Java, let me tell you why you should be wary of using it. The reason is, of course, performance. Let's take a look at a quick benchmark program:
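What follows is a rough sketch of such a benchmark, not the original program. It assumes 100,000 iterations per measurement, uses StringBuilder for the appending side, and relies on plain System.nanoTime timing, so treat any numbers it prints as indicative only. All class and method names are made up.

```java
public class FormatBenchmark {
    static final int ITERATIONS = 100_000; // assumed; tune for your machine
    static volatile int sink;              // keeps the JIT from discarding results

    // Format string with n "%d" specifiers, e.g. "x %d %d" for n = 2.
    static String buildFormat(int n) {
        StringBuilder sb = new StringBuilder("x");
        for (int i = 0; i < n; i++) sb.append(" %d");
        return sb.toString();
    }

    // Arguments 0, 1, ..., n-1, boxed so they can be passed as varargs.
    static Object[] buildArgs(int n) {
        Object[] args = new Object[n];
        for (int i = 0; i < n; i++) args[i] = i;
        return args;
    }

    static String viaFormat(String format, Object[] args) {
        return String.format(format, args);
    }

    // Produces the same text as viaFormat, but by appending.
    static String viaAppend(Object[] args) {
        StringBuilder sb = new StringBuilder("x");
        for (Object arg : args) sb.append(' ').append(arg);
        return sb.toString();
    }

    public static void main(String[] unused) {
        // Two passes; only the second is printed, to reduce warm-up effects.
        for (int pass = 0; pass < 2; pass++) {
            for (int n = 0; n <= 10; n++) {
                String format = buildFormat(n);
                Object[] args = buildArgs(n);

                long start = System.nanoTime();
                for (int i = 0; i < ITERATIONS; i++) sink += viaFormat(format, args).length();
                long formatMs = (System.nanoTime() - start) / 1_000_000;

                start = System.nanoTime();
                for (int i = 0; i < ITERATIONS; i++) sink += viaAppend(args).length();
                long appendMs = (System.nanoTime() - start) / 1_000_000;

                // Columns: specifiers, String.format time (ms), append time (ms)
                if (pass == 1) System.out.println(n + "\t" + formatMs + "\t" + appendMs);
            }
        }
    }
}
```

For measurements you intend to rely on, a harness such as JMH is a safer choice than hand-rolled timing loops like this one.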

This benchmark compares String.format to appending when there are between 0 and 10 format specifiers (all "%d" for simplicity). The output of the second run (to avoid the impact of JVM warm-up) is as follows, with the columns representing number of format specifiers, time taken by String.format, and time taken by appending, respectively:

Needless to say, appending blows String.format out of the water, especially with a small number of format specifiers. As it turns out, String.format (at least in the JDK versions available as of this writing) calls down into Java's regular expression library to parse the format string, and that carries a lot of overhead, especially in simple cases.

That's not to say String.format doesn't have its uses. The whole point of this discussion is that it can make your code more readable, which is a great thing and sometimes more important than performance. But if you ever need to build strings in the innermost loop of your application, please just stick with appending.
