In the middle of a major, hairy refactoring recently, I codified a tactic for refactoring that I’d been using, unconsciously, for years. The tactic is to break down refactoring into two modes, and deliberately switch between them:
- Clean-up mode: Clean up and reorganize the code, without changing the code’s external functionality.
- Change mode: Make the smallest, simplest self-contained change, making sure the code is functional again when it’s completed.
You start in clean-up mode, and switch back and forth as needed. This tactic is probably unsurprising to experienced programmers, but for those of you not yet as hard-core as Donald Knuth, let me explain why it’s a good idea.
Separating clean-up and changes into discrete modes of working gives you less to think about at any one time. A refactor can be very invasive—code that was closely tied together is now completely decoupled, or vice-versa; functions, objects and modules become obsolete or nonsensical, or split apart or merge in unexpected ways; identifiers become ambiguous or collide. If you’re trying to reorganize everything at the same time you’re juggling old and new code in your head, it’s easy (for anyone) to get lost in the maze.
When you’re consciously in clean-up mode, you can focus just on tasks like making sure variables have unambiguous, correct names, object / module / function / &c. organization and boundaries are sensible, and so on. You’re not changing any functionality, so your tests (you do have tests, right?) still pass, and the application still behaves as it always has (you did test it manually, right?). And you can be liberal in your clean-up; if you end up improving code that isn’t ultimately affected by the refactor, there’s no harm in that.
Once you feel like the code is clean and ready for the changes in functionality, commit your clean-up (you are using version control, right?). I usually use a commit message like “clean-up in preparation for….”
Now switch to change mode, and start making the changes. Quite often, you’ll discover something else that needs to be cleaned up. But you’re in change mode, not clean-up mode. Since you’re only making the smallest, simplest self-contained change, this new clean-up can probably wait until later. But if it can’t wait until later, then roll back your changes (or save a patch, or store your work in the attic, or the stash, or shelve it, or whatever the kids are calling it these days). Complete the new clean-up, commit it, and then go back to change mode and your half-completed changes.
This tactic has some additional benefits that might not be immediately obvious, too:
- Clean-up mode forces you to read over all of the code that’s going to be changed and ask, “How is this code going to be affected by the changes I’m planning?” Sometimes you discover unanticipated edge-cases or bugs in your planned changes. Sometimes you realize the whole plan is flawed, and have to go back to the drawing board. And sometimes, after a good, hearty clean-up session, you realize that you can make the required changes in a less invasive, simpler way. I find that most refactors are more clean-up than anything else.
- If the codebase has a “master” or “main” branch, and you use a “feature” or “working” branch for your refactor, you can put the clean-up commits into the master branch (they don’t change any functionality, right?) and only commit the functional changes to the feature branch. What’s the point of that? Well, everyone else gets to benefit from your code clean-up right away, and when you do end up merging your functionality changes into the master branch, because it’s a smaller delta, there’s a smaller chance of conflicts.
The next time you’re staring down the barrel of a nasty refactor, and cursing the person who didn’t fully think think through the business requirements, try this. It won’t make every refactor a piece of cake, nor will it become the hot new programming methodology acronym down in the valley (2SR, anyone? 2SRT?), but I guarantee you’ll be glad you tried it.