Recently I was working on some code. The code was trying to tell me many things, but I was not sure if I was understanding what it was trying to communicate. It just felt irrelevant or noise at that moment. Somehow the right level of abstraction was missing.
As you can see, my first reaction after looking at this code was that there is too much going on, most of which is duplicate. So cleaned it up a bit and made it more expressive by
By introducing a new class called Context and moving all the mocking code into that, my test looked lot more clear. I was also able to create an abstraction that could communicate intent much more easily.
Next I reduced the clutter further by creating another level of abstraction as follows
But at this point, even though the code ended up being very dense, it was very difficult to understand what was going on and why so. In a desperate search for simplicity and better communication, I ended up with
What is interesting about this is that I made some simple assumption saying:
every name is not a celebrity name unless specified
every user name is a valid (non-restricted) user name unless specified
every domain name is a valid (non-restricted) domain name unless specified
every identity is available unless specified
All these assumptions are now capture in my Context object and rest of my tests can happily focus on what really matters. I really liked the way this reduced the clutter in my tests without compromising on communication.
Last week I posted a small code snippet for refactoring under the heading Refactoring Teaser.
In this post I’ll try to show step by step how I would try to refactor this mud ball.
First and foremost cleaned up the tests to communicate the intent. Also notice I’ve changed the test class name to ContentTest instead of StringUtilTest, which means anything and everything.
Next, I created a class called Content, instead of StringUtil. Content is a first-class domain object. Also notice, no more side-effect intense statics.
publicclass Content {privatestaticfinalString BLANK_OUTPUT ="";privatestaticfinalString SPACE =" ";privatestaticfinalString DELIMITER ="', '";privatestaticfinalString SINGLE_QUOTE ="'";privatestaticfinalint MIN_NO_WORDS =2;privatestaticfinal Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");privateList phrases =newArrayList();public Content(finalString content){String[] tokens = ON_WHITESPACES.split(content);if(tokens.length> MIN_NO_WORDS){
buildAllPhrasesUptoThreeWordsFrom(tokens);}}
@Override
publicString toString(){return toPhrases(Integer.MAX_VALUE);}publicString toPhrases(finalint userRequestedSize){if(phrases.isEmpty()){return BLANK_OUTPUT;}List requiredPhrases = phrases.subList(0, numberOfPhrasesRequired(userRequestedSize));return withInQuotes(join(requiredPhrases, DELIMITER));}privateString withInQuotes(finalString phrases){return SINGLE_QUOTE + phrases + SINGLE_QUOTE;}privateint numberOfPhrasesRequired(finalint userRequestedSize){return userRequestedSize > phrases.size()? phrases.size(): userRequestedSize;}privatevoid buildAllPhrasesUptoThreeWordsFrom(finalString[] words){
buildSingleWordPhrases(words);
buildDoubleWordPhrases(words);
buildTripleWordPhrases(words);}privatevoid buildSingleWordPhrases(finalString[] words){for(int i =0; i < words.length;++i){
phrases.add(words[i]);}}privatevoid buildDoubleWordPhrases(finalString[] words){for(int i =0; i < words.length-1;++i){
phrases.add(words[i]+ SPACE + words[i +1]);}}privatevoid buildTripleWordPhrases(finalString[] words){for(int i =0; i < words.length-2;++i){
phrases.add(words[i]+ SPACE + words[i +1]+ SPACE + words[i +2]);}}}
This was a big step forward, but not good enough. Next I focused on the following code:
privatevoid buildAllPhrasesUptoThreeWordsFrom(finalString[] words){
buildSingleWordPhrases(words);
buildDoubleWordPhrases(words);
buildTripleWordPhrases(words);}privatevoid buildSingleWordPhrases(finalString[] words){for(int i =0; i < words.length;++i){
phrases.add(words[i]);}}privatevoid buildDoubleWordPhrases(finalString[] words){for(int i =0; i < words.length-1;++i){
phrases.add(words[i]+ SPACE + words[i +1]);}}privatevoid buildTripleWordPhrases(finalString[] words){for(int i =0; i < words.length-2;++i){
phrases.add(words[i]+ SPACE + words[i +1]+ SPACE + words[i +2]);}}
The above code violates the Open-Closed Principle (pdf). It also smells of duplication. Created a somewhat generic method to kill the duplication.
Now I had a feeling that my Content class was doing too much and also suffered from the primitive obsession code smell. Looked like a concept/abstraction (class) was dying to be called out. So created a Words class as an inner class.
privateclass Words {privateString[] tokens;privatestaticfinalString SPACE =" ";
Words(finalString content){
tokens = ON_WHITESPACES.split(content);}boolean has(finalint minNoWords){return tokens.length> minNoWords;}List phrasesOf(finalint length){List phrases =newArrayList();for(int i =0; i <= tokens.length- length;++i){String phrase = phraseAt(i, length);
phrases.add(phrase);}return phrases;}privateString phraseAt(finalint index, finalint length){
StringBuilder phrase =new StringBuilder(tokens[index]);for(int i =1; i < length; i++){
phrase.append(SPACE + tokens[index + i]);}return phrase.toString();}}
In the constructor of the Content class we instantiate a Words class as follows:
public Content(finalString content){
Words words =new Words(content);if(words.has(MIN_NO_WORDS)){
phrases.addAll(words.phrasesOf(ONE_WORD));
phrases.addAll(words.phrasesOf(TWO_WORDS));
phrases.addAll(words.phrasesOf(THREE_WORDS));}}
Even though this code communicates well, there is duplication and noise that can be removed without compromising on the communication.
When I read through this code, checking StringBuffer’s length and then appending a separator does not communicate the intent very well.
4
5
6
if(sb.length()!=0){
sb.append(separator);}
Here we are relying on the side-effect (StringBuffer’s length) instead of asking the iterator. So the code does not flow smoothly. Also, now if you want to add a prefix to the joined String this logic might break.
I know you must be wondering why in the first place someone is writing tests for equals and hashCode. I agree. Its not required. But lets say someone really needs to.
This is how I would refactor the code to make it more communicative.
With TDD, I’m a lot more confident about my solutions. If I spot a design improvement, I can quickly jump in and fix it with confidence. When given feedback, I’m able to respond to it quickly. I feel I’m in control.
It really helps me keep my design and code minimalistic and clean. No bells and whistles. No buy one get one free combo offers. <Perfection in design is achieved not when there is nothing more to add, but rather when there is nothing more to take away>
Turns out that my code is lot more easier to test. Because its designed to be testable. Lots of people argue that they will write tests after writing the code and it would be more efficient. The biggest problem I find with this approach is that the code ends up being something not readily testable. Then either I’ve to spend time making it testable or I skip testing saying its not possible or not required.
It helps me build a safety net of executable, living, up-to-date specification for my classes and features. Its a great starting point for new team members to understand my software. Tests are a great reference point for someone who wants to use my code.
TDD is a great teacher. It helps me listen to my code and learn from it. It forces me to pay close attention to what is happening. Its easy to spot bad things faster. Its all about feedback and visibility.
TDD forces me to slow down and think. It encourages me to take baby-steps. Sometimes I find people are in such a hurry that they spill mess all over the place. It takes soo much more time and effort to clean up the mess they leave behind.
My tests tend to communicate my design choices much longer after I’m gone.
I massively reduce the amount of time I spend in the debugger or trying to manually test (monkey test) from the UI. When something breaks, I no longer need to crawl through the logs to figure out where things are going wrong. I get pin-pointed feedback.
TDD helps me maintain focus on measurable outcome (producing software that accomplishes a concrete objective). I’m not longer drifting down ratholes.
TDD also helps me reduce the hand-overs between developers and tests and hence the wastage introduced because of all that overhead and context switching.
And so on…
Having said that, TDD alone is not sufficient to achieve the above. At times you need to spike/prototype things. One needs to have (or at least start focusing on building) a good knowledge of Design Principles and Patterns. Its easy to get lost, having a pair can really help.
Again I don’t want to sound dogmatic. I don’t think TDD is the only way to build great software. There are lots of great developers out there, building amazing software without TDD. However I think TDD is a much easier approach to achieve the same.
As you can see there is a lot of duplication (Assert.IsTrue is basically noise). It’s also not very clear what the intent of those assert is.
Since Response is a Test Helper class. We thought moving the asserts on the response makes sense. But we also want to make sure the person reading this test understands that we are verifying a bunch of things on the response object.
Since we are using C#, we could do the following using a Delegate.