When you smell complexity and a lack of clarity in the air, look around: you’ll find your code swimming in a (smelly) soup of primitives (low-level data types, functions and language components). Unable to bear the stink, your code is screaming and screeching, asking you to rescue it.
This, my friend, is Primitive Obsession, the stinkiest code smell. You can rescue your code (yes we can) by creating higher-level abstractions (functions, data types, objects) and bringing some sense to this anarchy.
Primitive Obsession is about a lack of abstractions. In the OO world, methods, objects and packages/namespaces are ways of creating abstractions. Similarly, functions, procedures, modules, etc. are also valid ways of creating abstractions.
Adding more objects does not always lead to better abstraction. Sometimes removing objects is more useful.
There are many different refactorings that can be used as remedies:
Even before the authenticate() method gets called, the calling class does basic validations on the password parameter. Stuff like
the password cannot be null,
it needs to be more than 6 characters long and less than 30 characters long,
it should contain at least one special character or upper-case letter,
it should contain at least one letter,
and so on …
Some of these checks happen to reside as separate methods on a PasswordUtil class and some on the StringUtil class.
Once the authenticate method is called, we retrieve the respective User from the database, fetch the password stored in the database and match the newly entered password against it. Wait a sec, we don’t store plain-text passwords in the DB any more; we hash them before we store ‘em. And as you might already know, we use a one-way hash, which means that given a hash, we cannot get back the original string. So we hash the newly entered password, for which we use a HashUtil class. Then we compare the 2 hashes.
The second method is:
public User create(final UserDTO userDTO){
...
}
Before the create() method is called, we validate all the fields inside the UserDTO. During this validation, we do the exact same validations on the password as we do before the authenticate method. If all the fields are valid, then inside the create method, we make sure no one else has the same userid. Then we take the raw text password and hash it, so that we can store it in our DB. Once we save the user data in the DB, we send out an activation email and off we go.
Sorry, that was long. What is the point? Exactly my point. What is the point? Why do I need to know all this stuff? I can’t really explain to you the pain I go through when I see:
All these hops & jumps around these large meaningless classes (UserDTO, PasswordUtil, StringUtil, HashUtil)
Conceptual and Data duplication in multiple places
Difficulty in knowing where I can find some logic (password logic seems to be sprayed all over the place)
And so on …
This is an example of Primitive Obsession.
A huge amount of complexity can be reduced,
clarity can be increased and
duplication can be avoided in this code
if we create a Password class. Come to think of it, Password is really an entity, like User, in this domain.
The Password class’s constructor can do the validations for you.
You can give it another password and ask if they match. This will hide all the hashing and rehashing logic from you.
You can kill all 3 of those Util classes (PasswordUtil, StringUtil, HashUtil) & move the logic into the Password class, where it belongs.
So once we are done, we have the following method signatures:
public User userWithMatching(UserId id, Password userEnteredPwd){
...
}
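As a rough sketch of what such a Password class could look like: the class name and the validation rules come from the text above, while the matches method, the exception type and the use of unsalted SHA-256 are assumptions made purely for illustration (a production system would more likely use a salted, deliberately slow password hash).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: only the name "Password" comes from the post.
final class Password {
    private static final int MIN_LENGTH = 7;  // "more than 6 characters"
    private static final int MAX_LENGTH = 29; // "less than 30 characters"

    // Only the hash is kept; the raw text is discarded after construction.
    private final byte[] hash;

    Password(final String rawText) {
        validate(rawText);
        this.hash = hashOf(rawText);
    }

    // Callers ask whether two passwords match and never see any hashing.
    boolean matches(final Password other) {
        return other != null && MessageDigest.isEqual(hash, other.hash);
    }

    private static void validate(final String rawText) {
        if (rawText == null) {
            throw new IllegalArgumentException("password cannot be null");
        }
        if (rawText.length() < MIN_LENGTH || rawText.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("password must be 7 to 29 characters long");
        }
        boolean hasLetter = false;
        boolean hasSpecialOrUpperCase = false;
        for (char c : rawText.toCharArray()) {
            if (Character.isLetter(c)) {
                hasLetter = true;
            }
            if (Character.isUpperCase(c) || !Character.isLetterOrDigit(c)) {
                hasSpecialOrUpperCase = true;
            }
        }
        if (!hasLetter || !hasSpecialOrUpperCase) {
            throw new IllegalArgumentException(
                "password needs a letter and a special or upper-case character");
        }
    }

    // Assumption: plain SHA-256 stands in for whatever HashUtil did.
    private static byte[] hashOf(final String rawText) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(rawText.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should always be available", e);
        }
    }
}
```

With something like this in place, the caller of userWithMatching only constructs a Password and asks whether it matches; validation and hashing live in one spot.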
At SDTConf 2009, Corey Haines & I hosted a session called Biggest Stinkers. During this session we were trying to answer the following two (different) questions:
As an experienced developer, looking back, what do you think is the stinkiest code smell that has hurt you the most? In other words, which is the single code smell that, if you went after eradicating it, would solve *most* of the design problems in your code?
There are so many different principles and guidelines to help you achieve a good design. Where do new developers start? Which is the one code smell or principle that we can teach new developers that will help them the most as far as good design goes (other than years of experience)?
Even though the 2 questions look similar, I think the second question is broader than the first and quite different.
Anyway, this was probably the most crowded session. We had some great contenders for Smelliest Code Smell (big stinker):
I suggested Primitive Obsession (dealing with low-level data structures/data types when higher-order abstractions can reduce complexity n-fold. This is not specific to OO. It’s about a lack of abstractions at the right level).
We all agreed that Don’t write code (write new code only when everything else fails) is the single most important lesson every developer needs to learn. The amount of duplicate, crappy code (across projects) that exists today is overwhelming. In a lot of cases developers don’t even bother to look around. They just want to write code. This is what measuring productivity & performance based on Lines of Code (LoC) has done to us. IMHO good developers are 20x faster than average developers because they think of reuse at a whole different level. Some people confuse this guideline with “Not Invented Here Syndrome”. Personally I think NIHS is very important for advancement in our field. It’s important to bring innovation. NIHS is at the design & approach level. Joel has an interesting blog post called In Defense of Not-Invented-Here Syndrome.
Anyway, if we agree that we really need to write code, then what is the one thing you will watch out for? SRP and Connascence are pretty much helping you achieve high cohesion. If one does not have high cohesion, it might be easy to spot duplication (at least conceptual duplication), or you’ll find that pulling out the right abstraction can solve the problem. So it really leaves Duplicate Code and Primitive Obsession in the race.
Based on my experience, I would argue that I’ve seen code which does not have much duplication but where it’s very difficult to understand what’s going on. Hence I claim, “if only the code had better abstractions, it would be a lot easier to understand and evolve the code”. Also, when you try to eliminate duplicate code, at one level there is no literal code duplication, but there is conceptual duplication, and creating a higher-order abstraction is an effective way to solve the problem. Hence I conclude that, looking back, Primitive Obsession is at the crux of poor design, a.k.a. the Biggest Stinker.
Then we look at who is constructing this class, and it turns out that we have this deadly SuggestionsUtil class (love the name). This class suffers from various code smells:
So far, most of the refactoring teasers we’ve looked at have suffered from a lack of modularity and from primitive obsession. This refactoring teaser is quite the opposite. Overall the code base is decent-sized, so instead of trying to solve the whole problem in one go, let’s take it one step at a time.
Following is the crux of the Sender Edge Server IP Extraction Algo:
public IPAddressExtractor(final List receivedHeaders, final Domain recepientDomain) {
    Domain recepientMXRecord = recepientDomain.retrieveFirstMXRecord();
    for (MatchingCriterion critierion : asList(MATCHING_DOMAIN, MATCHING_IP, MATCHING_SECOND_LEVEL_DOMAIN)) {
        Match result = foreach(receivedHeaders).and(recepientMXRecord).match(critierion);
        if (result.success()) {
            storeSenderIPWithDistance(result);
            break;
        }
    }
}
To make the whole fluent interface work (the foreach(...).and(...).match(...) chain), I had to create a private method:
Other than the switch statement smell and conditional complexity, the original code was riddled with the Primitive Obsession code smell. To fix this issue, the first thing I had to do was create first-class citizens (objects). So I ended up creating
Also notice that for testing purposes we don’t want to hit the network, so I created a FakeNetwork class which stubs out all Network calls. Network is injected into all Domain classes through the DomainFactory. (I’m not very happy with this design; it feels like a bit of a hack to inject Network this way.)
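To illustrate the kind of wiring described here, the following is a minimal sketch, not the actual code from the teaser. The names Network, FakeNetwork, Domain, DomainFactory and retrieveFirstMXRecord appear in the post; every method signature, the withMXRecord helper and the use of plain strings for host names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shape of the Network abstraction: one lookup is enough for the sketch.
interface Network {
    String firstMXRecordFor(String domainName);
}

// Test double that answers from canned data instead of doing a DNS lookup.
final class FakeNetwork implements Network {
    private final Map<String, String> cannedMXRecords = new HashMap<>();

    // Hypothetical helper for setting up test data.
    FakeNetwork withMXRecord(final String domainName, final String mxHost) {
        cannedMXRecords.put(domainName, mxHost);
        return this;
    }

    @Override
    public String firstMXRecordFor(final String domainName) {
        String mxHost = cannedMXRecords.get(domainName);
        if (mxHost == null) {
            throw new IllegalStateException("no canned MX record for " + domainName);
        }
        return mxHost;
    }
}

// A Domain only ever talks to the Network it was handed at construction time.
final class Domain {
    private final String name;
    private final Network network;

    Domain(final String name, final Network network) {
        this.name = name;
        this.network = network;
    }

    Domain retrieveFirstMXRecord() {
        return new Domain(network.firstMXRecordFor(name), network);
    }

    String name() {
        return name;
    }
}

// The factory is the single place that decides which Network every Domain gets.
final class DomainFactory {
    private final Network network;

    DomainFactory(final Network network) {
        this.network = network;
    }

    Domain domainFor(final String name) {
        return new Domain(name, network);
    }
}
```

In a test, the factory would be built with a FakeNetwork; production code would hand it an implementation that really queries DNS.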
Last week I posted a small code snippet for refactoring under the heading Refactoring Teaser.
In this post I’ll try to show step by step how I would try to refactor this mud ball.
First and foremost, I cleaned up the tests to communicate the intent. Also notice I’ve changed the test class name to ContentTest instead of StringUtilTest, which could mean anything and everything.
Next, I created a class called Content, instead of StringUtil. Content is a first-class domain object. Also notice, no more side-effect-heavy statics.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
// join(...) is assumed to be a statically imported helper, such as StringUtils.join from Apache Commons Lang.
public class Content {
    private static final String BLANK_OUTPUT = "";
    private static final String SPACE = " ";
    private static final String DELIMITER = "', '";
    private static final String SINGLE_QUOTE = "'";
    private static final int MIN_NO_WORDS = 2;
    // Splits on Unicode separators (\p{Z}) and punctuation (\p{P}).
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private List<String> phrases = new ArrayList<String>();
    public Content(final String content) {
        String[] tokens = ON_WHITESPACES.split(content);
        if (tokens.length > MIN_NO_WORDS) {
            buildAllPhrasesUptoThreeWordsFrom(tokens);
        }
    }
    @Override
    public String toString() {
        return toPhrases(Integer.MAX_VALUE);
    }
    public String toPhrases(final int userRequestedSize) {
        if (phrases.isEmpty()) {
            return BLANK_OUTPUT;
        }
        List<String> requiredPhrases = phrases.subList(0, numberOfPhrasesRequired(userRequestedSize));
        return withInQuotes(join(requiredPhrases, DELIMITER));
    }
    private String withInQuotes(final String phrases) {
        return SINGLE_QUOTE + phrases + SINGLE_QUOTE;
    }
    // Never ask for more phrases than were built.
    private int numberOfPhrasesRequired(final int userRequestedSize) {
        return userRequestedSize > phrases.size() ? phrases.size() : userRequestedSize;
    }
    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }
    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }
    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }
    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }
}
This was a big step forward, but not good enough. Next I focused on the following code:
private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
    buildSingleWordPhrases(words);
    buildDoubleWordPhrases(words);
    buildTripleWordPhrases(words);
}
private void buildSingleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length; ++i) {
        phrases.add(words[i]);
    }
}
private void buildDoubleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length - 1; ++i) {
        phrases.add(words[i] + SPACE + words[i + 1]);
    }
}
private void buildTripleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length - 2; ++i) {
        phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
    }
}
The above code violates the Open-Closed Principle: supporting four-word phrases would mean adding yet another near-identical method. It also smells of duplication, so I created a somewhat generic method to kill the duplication.
Now I had a feeling that my Content class was doing too much and also suffered from the primitive obsession code smell. It looked like a concept/abstraction (class) was dying to be called out. So I created a Words class as an inner class.
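The generic method itself is not reproduced here. One way such a method could look, as a sketch under assumptions (the class name PhraseSketch and the method name phrasesOf are made up for this example), is a single loop parameterised by the phrase length:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: one method replaces the single/double/triple variants.
final class PhraseSketch {
    private static final String SPACE = " ";

    // Returns every run of `length` consecutive words, joined by a space.
    static List<String> phrasesOf(final String[] words, final int length) {
        List<String> phrases = new ArrayList<>();
        for (int i = 0; i + length <= words.length; ++i) {
            phrases.add(String.join(SPACE, Arrays.copyOfRange(words, i, i + length)));
        }
        return phrases;
    }
}
```

Calling it with lengths 1, 2 and 3 yields the same phrases as the three hand-written methods, and a longer phrase length needs no new code.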
private class Words {
    private String[] tokens;
    private static final String SPACE = " ";
    Words(final String content) {
        tokens = ON_WHITESPACES.split(content);
    }
    boolean has(final int minNoWords) {
        return tokens.length > minNoWords;
    }
    List<String> phrasesOf(final int length) {
        List<String> phrases = new ArrayList<String>();
        for (int i = 0; i <= tokens.length - length; ++i) {
            String phrase = phraseAt(i, length);
            phrases.add(phrase);
        }
        return phrases;
    }
    private String phraseAt(final int index, final int length) {
        StringBuilder phrase = new StringBuilder(tokens[index]);
        for (int i = 1; i < length; i++) {
            phrase.append(SPACE + tokens[index + i]);
        }
        return phrase.toString();
    }
}
In the constructor of the Content class we instantiate a Words class as follows:
public Content(final String content) {
    Words words = new Words(content);
    if (words.has(MIN_NO_WORDS)) {
        phrases.addAll(words.phrasesOf(ONE_WORD));
        phrases.addAll(words.phrasesOf(TWO_WORDS));
        phrases.addAll(words.phrasesOf(THREE_WORDS));
    }
}
Even though this code communicates well, there is duplication and noise that can be removed without compromising on the communication.
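One possible way to remove that repetition, offered only as a sketch and not necessarily the step the refactoring took next, is to loop over the phrase lengths. ContentSketch is a trimmed-down, hypothetical stand-in for Content, and MAX_PHRASE_LENGTH and phrases() are assumed names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Content class, reduced to its constructor logic.
final class ContentSketch {
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private static final int MIN_NO_WORDS = 2;
    private static final int MAX_PHRASE_LENGTH = 3; // assumed replacement for ONE_WORD..THREE_WORDS

    private final List<String> phrases = new ArrayList<>();

    ContentSketch(final String content) {
        String[] tokens = ON_WHITESPACES.split(content);
        if (tokens.length > MIN_NO_WORDS) {
            // One loop instead of three near-identical addAll calls.
            for (int length = 1; length <= MAX_PHRASE_LENGTH; length++) {
                phrases.addAll(phrasesOf(tokens, length));
            }
        }
    }

    // Defensive copy so callers cannot modify the internal list.
    List<String> phrases() {
        return new ArrayList<>(phrases);
    }

    private static List<String> phrasesOf(final String[] tokens, final int length) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + length <= tokens.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + length)));
        }
        return result;
    }
}
```

The trade-off is that the named constants ONE_WORD, TWO_WORDS and THREE_WORDS disappear, so the loop bound has to carry the intent on its own.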
Recently Brett Schuchert from Object Mentor started posting code snippets on his blog and inviting people to refactor them, similar to the Daily Refactoring Teaser that I’m conducting at Directi. (I’m planning to make it public soon.)