Analytics

Sunday, May 26, 2013

Refactoring

Refactoring is a common technique in the day-to-day work of a software engineer. It is crucial for producing readable, extensible, and maintainable code in large projects. If you use Eclipse, you are probably familiar with its various refactoring capabilities, which let you safely make sweeping changes to a codebase. Because Java is so prevalent, Eclipse is one of the most commonly used integrated development environments (IDEs), and as such it produces a lot of data about how people refactor. A study from 2009 analyzes data collected by Eclipse and other sources to reveal some interesting facts about refactoring and the tools for doing it. Here are some of the highlights:
  • The developers building refactoring tools don't refactor in the same way as "normal" users. As expected, those building the tools use the complex refactorings more frequently. For the average person, "rename" constitutes the majority of their usage (myself probably included).
  • People typically refactor in "batches." That is, they often perform multiple refactorings of the same kind in quick succession. Interestingly, this does not appear to hold for the "move" tool, where the average batch contains just one refactoring, perhaps because you typically move what is already a cohesive group all at once.
  • Programmers do not configure refactoring tools, i.e., they tend to leave all of the default options in place. Unfortunately, the study did not have enough information to determine why. In my mind there are three potential reasons: the defaults could be correct for most cases, programmers may prefer to make minor modifications manually after the refactoring, or status quo bias could be at work.
  • Refactoring is often left out of commit messages and done in conjunction with other work. While this is not ideal, I will admit that I am often guilty of it. While implementing some new functionality, it is very common to come across existing code that you can refactor to make the new code easier to write. Especially if the refactoring is minor, breaking your flow to split it out into its own commit can be too much overhead.
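To make the idea of a "batch" concrete, here is a minimal sketch in Python of how refactoring events could be grouped. The 60-second window, the RefactoringEvent record, the function names, and the sample events are all my own assumptions for illustration; they are not the study's definitions or Eclipse's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefactoringEvent:
    """Hypothetical record of one tool-assisted refactoring."""
    kind: str         # e.g. "rename", "move"
    timestamp: float  # seconds since the start of the session


def group_into_batches(events, max_gap_seconds=60.0):
    """Group events of the same kind that occur in quick succession.

    An event joins the most recent batch of its kind if it happened within
    max_gap_seconds of that batch's last event; otherwise it starts a new
    batch. The 60-second default is an assumed threshold.
    """
    batches = []
    latest_batch_by_kind = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        batch = latest_batch_by_kind.get(event.kind)
        if batch is not None and event.timestamp - batch[-1].timestamp <= max_gap_seconds:
            batch.append(event)
        else:
            batch = [event]
            batches.append(batch)
            latest_batch_by_kind[event.kind] = batch
    return batches


def average_batch_size(batches, kind):
    """Average number of refactorings per batch for one kind (0.0 if none)."""
    sizes = [len(batch) for batch in batches if batch[0].kind == kind]
    return sum(sizes) / len(sizes) if sizes else 0.0


# Made-up sample session: three quick renames, one move, one later rename.
sample_events = [
    RefactoringEvent("rename", 0.0),
    RefactoringEvent("rename", 20.0),
    RefactoringEvent("rename", 45.0),
    RefactoringEvent("move", 50.0),
    RefactoringEvent("rename", 300.0),
]
sample_batches = group_into_batches(sample_events)
```

With this sample data, the three early renames form one batch, while the move and the late rename each form a batch of their own. Calling average_batch_size(sample_batches, "move") is the kind of per-tool average the "move" observation above refers to.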
Improving the efficiency of software engineers is a very interesting challenge. In some sense, it is a psychology problem: figuring out how programmers understand code and what kinds of tools complement that understanding. As we continue reducing the cost of translating conceptual models into code, development will become more efficient and accessible. Studies like this one matter because they bring to light how we actually program.
