A lot of people think, if they can write some code, they qualify as a software developer.
IMHO don’t call yourself a developer if you don’t take ownership and responsibility for solving the overall, real business/user problem.
A good developer
understands the overall problem and its context.
has good problem solving skills (we are in the business of creative problem solving)
has empathy for the users and is a user and business advocate.
takes ownership by being a part-of the team and having a sense of belonging.
makes investment into continuous learning & improvement
[Agile and Software Craftsmanship movements has made great strives in this direction. However some Agile folks don't get it. We can't draw a line and say this side is business and that side is development. Its ONE team working towards a common goal. Unfortunately, I've seen a lot of teams who end up creating artificial boundaries between people wanting the software and people building it.]
Anyway, having the ability to just writing some code does not qualify you to be a developer.
In retrospect, I think Object Orientation has tremendously helped me become a better programmer. But at the same time, its also made me vulnerable to including extra complexity (or at least thinking in terms of more complex solutions) in my code.
One of the important lessons I learned a few years ago was, not to try and model my software on real world (my perception of reality). This leads to my software solution ending up as complex and easy to misunderstood as the real world. Soon I started embracing “There is no Spoon” philosophy and really focusing on abstractions.
Last year, I was again caught red handed, trying to sneak in too many objects (and hence complexity) into my code. This time I was pairing with another developer new to TDD and we were building a Snakes and Ladders game using TDD. The focus was really demonstrate TDD in a very different context.
30 mins into the pairing, we had the following classes with wonderful tests for almost each class:
Game
Board
Player
Dice
Snake
Ladder
Just then Sandeep Shetty was passing by and he looked at the beautiful piece of mess we had created. He was surprised how royally we were wasting our time. The following 15 min discussion helped all of us realize how we were so caught up in TDD and coming up with all those (useless) abstractions when simply we could just have
one class called Game (place holder, the class is not really required) with
one method called move(int number_on_the_dice)
a list to hold the current position of each player on the board (there can be more than 2 players)
a hashmap with starting and ending points on the board (represents both the snakes and ladders, how does it matter whether its a snake or a ladder, they are just starting and ending points on the board)
a counter to calculate player’s turn
and … and what? …that’s it
Can you beat the simplicity of these 15 odd lines of code? Its not really about the number of lines of code, its about the conciseness and simplicity of it.
When you smell complexity and lack of clarity in the air, look around, you’ll find your code swimming in a (smelly) soup of primitives (low level data-types, functions and language components). Unable to bare the stink, your code is screaming and screeching, asking you to rescue it.
This is my friend, primitive obsession, the stinkiest code smell. You can rescue your code (yes we can) by creating higher level abstractions (functions, data types, objects) and giving some sense to this anarchy.
Primitive Obsession is about lack of abstractions. In the OO world, Methods, Objects, Packages/Namespaces are ways of creating abstraction. Similarly functions, procedures, modules, etc are also valid ways of creating abstractions.
Adding more objects does not always lead to better abstraction. Sometimes removing objects is more useful.
There are many different refactorings that can be used as a remedies:
Even before the authenticate() method gets called, the calling class does basic validations on the password parameter. Stuff like
the password cannot be null,
it needs to be more than 6 char long and less than 30 char long
should contain at least one special char or upper case letter
should contain at least one letter
and so on …
Some of these checks happen to reside as separate methods on a PasswordUtil class and some on the StringUtil class.
Once the authenticate method is called, we retrieve the respective User from the database, fetch the password stored in the database and match the new password against it. Wait a sec, we don’t store plain password in the DB any more, we hash them before we store ‘em. And as you might already know, we use one-way hash; which means given a hash, we cannot get back the original string. So we hash the newly entered password. For which we use a HashUtil class. Then we compare the 2 hashes.
The second method is:
public User create(final UserDTO userDTO){
...
}
Before the create() method is called, we validate all the fields inside the UserDTO. During this validation, we do the exact same validations on password as we do before the authenticate method. If all the fields are valid, then inside the create method, we make sure no one else has the same userid. Then we take the raw text password and hash it, so that we can store it in our DB. Once we save the user data in DB, we send out an activation email and off we are.
Sorry that was long. What is the point? Exactly my point. What is the point. Why do I need to know all this stuff? I can’t really explain you the pain I go through when I see:
All these hops & jumps around these large meaningless classes (UserDTO, PasswordUtil, StringUtil, HashUtil)
Conceptual and Data duplication in multiple places
Difficulty in knowing where I can find some logic (password logic seems to be sprayed all over the place)
And so on …
This is an example of Primitive Obsession.
A huge amount of complexity can be reduced,
clarity can be increased and
duplication can be avoided in this code
If we can create a Password class. To think about it, Password is really an entity like User in this domain.
Password class’ constructor can do the validations for you.
You can give it another password and ask if they match. This will hide all the hashing and rehashing logic from you
You can kill all those 3 Utils classes (PasswordUtil, StringUtil, HashUtil) & move the logic in the Password class where it belong
So once we are done, we have the following method signatures:
public User userWithMatching(UserId id, Password userEnteredPwd){
...
}
At the SDTConf 2009,Corey Haines & I hosted a session called Biggest Stinkers. During this session we were trying to answer the following two (different) questions:
As an experienced developer, looking back, what do you think is the stinkiest code smell that has hurt you the most? In other words, which is the single code smell if you go after eradicating, *most* of the design problems in your code would be solved?
There are so many different principles and guidelines to help you achieve a good design. For new developers where do they start? Which is the one code smell or principle that we can teach new developers that will help them the most as far as good design goes (other than years of experience)?
Even though the 2 questions look similar, I think the second question is more broader than the first and quite different.
Anyway, this was probably the most crowded session. We had some great contenders for Smelliest Code Smell (big stinker):
I suggested, Primitive Obsession (Dealing with low level data structures/data types when higher order abstractions can reduce complexity n-fold. This is not specific to OO. Its about lack of abstractions at the right level)
We all agreed that Don’t write code (write new code only when everything else fails) is the single most important lesson every developer needs to learn. The amount of duplicate, crappy code (across projects) that exists today is overwhelming. In a lot of cases developers don’t even bother to look around. They just want to write code. This is what measuring productivity & performance based on Lines of Code (LoC) has done to us. IMHO good developers are 20x faster than average developers coz they think of reuse at a whole different level. Some people confuse this guideline with “Not Invented Here Syndrome“. Personally I think NIHS is very important for advancement in our field. Its important to bring innovation. NIHS is at the design & approach level. Joel has an interesting blog post called In Defense of Not-Invented-Here Syndrome.
Anyway, if we agree that we really need to write code, then what is the one thing you will watch out for? SRP and Connascence are pretty much helping you achieve high Cohesion. If one does not have high cohesion, it might be easy to spot duplication (at least conceptual duplication) or you’ll find that pulling out a right abstraction can solve the problem. So it really leaves Duplicate Code and Primitive Obsession in the race.
Based on my experience, I would argue that I’ve seen code which does not have much duplication but its very difficult to understand what’s going on. Hence I claim, “only if the code had better abstractions it would be a lot easier to understand and evolve the code”. Also when you try to eliminate duplicate code, at one level, there is no literal code duplication, but there is conceptual duplication and creating a high order abstraction is an effective way to solve the problem. Hence I conclude that looking back, Primitive Obsession is at the crux of poor design. a.k.a Biggest Stinker.
In your house, do you put garbage, food and jewelery in the same box and keep it in the same place? Why not?
Coz they don’t belong together.
But why don’t they belong together?
Coz they have very different purpose & you don’t intent to use them at the same time in the same context. We even go further and separate each item. For example, we put gold jewelery in one box and silver in another. We even separate the jewelery based on the occasion we plan to wear them and so on.
When you apply the same thinking in your code, you achieve high cohesion.
Its also worth highlighting that as time goes by, as you accumulate more things, every time you want to add something new, we go back, reorganize and rearrange those items in our boxes. Sometimes we even rethinking our whole organizing strategy. Sometimes we go buy new boxes to help us organize things better so its easy to find them. We can’t think of it all upfront and get it right the first time. Same applies in software design.
Lately, I’ve sensed some confusion in developers around the Single Responsibility Principle (SRP). Its easy to say every object or abstraction should have a single responsibility. But it’s important to understand what do we mean by responsibility. (Its like telling someone who really lives to eat, that they should control their diet. Of course they control their diet, just that they have a very different definition of diet.)
So when we talk about SRP, what does single responsibility really mean?
Some developers suggest each public method on a object is exposing a responsibility. If we go by this definition of responsibility we’ll end up with every object having exactly one method. Which does not feel right.
Some developers suggest that each object should do exactly one thing. What do they mean by one thing? When I look at a class or a method, how can I tell if its doing one thing or more than one thing? This definition of SRP is even more vague than the original definition.
For example, if we have a class Foo, do you think, adding the toString() method violates SRP? Do you think toString() is the responsibility of the Foo class? Most developers would agree that toString() really belongs on the Foo class and it does not violate SRP. toString() method usually needs access to the internal instance variables and hence this method really belongs on the respective class.
Lets say, we agree that toString() method should belong on the Foo class. Now going by the same argument, can we add toXML() method on the same class? If no, why not? If yes, when will you stop?
Basically this is a slipper slope. Trying to understand SRP by defining Responsibility does not work really well for me. Instead I look at it from the “Single Reason for Change” lens. If I look at a class Foo with a method bar() on it; if the class Foo and the method bar() change for 2 different reasons then I think it violates SRP. Irrespective of how many methods it has and how many things it does. (It could be violating other design principles).
Going back to our example, I think adding a toString() method does not violate the SRP coz I don’t see any reason why only the toString() method needs to change and the rest of the class remains unchanged. If we add or remove instance variables, its most likely that both the class and the toString() method will change.
Now what about toXML() method? I think the toXML() method does not belong on the object, coz I see 2 reasons for which the toXML() method can change.
When we add/remove any instance variables
When the XML representation/format changes.
However there is a trade-off here. If you don’t put any method on the class, then how do other objects get the data out (other than breaking encapsulation. Even doing so will increase the coupling.)? On the other hand, if you add such methods, then any changes in the format/representation will effect this object as well. Which does not seem right. So how do you design something which does not violate SRP, does not break encapsulation and also adheres to Open Closed Principle?
One way to solve this puzzle is to add a to() method which takes a Formatter object, so we end up with a to(Formatter collector) method. Depending on what format (XML, YAML, etc) one needs to pass the appropriate Formatter type. For example if you want the XML representation of this Class, you pass a XMLFormatter object to the to() method. The class which has the to() method is responsible to adding what ever data it needs to expose and the Formatter object is responsible for the representation and formatting. This way if the XML schema changes, there is only one places to go make the change. If the instance variables change there is only one place to make the change. (Again this is not 100% true, but this solution is a step forward from the first solution as far as SRP goes).
So look for Reasons to Change, as a way to understand SRP.
There should never be more than one reason for a class to change.
Frankly I find this a little too restrictive. I think there can be more than 1 reasons for the class to change. As far as all those reasons affect the whole class (all or most of the methods) the same way, I think we’ve achieved high cohesion and we can be happy for now.
Also its worth highlighting that SRP applies at multiple levels, right on top at a project level all the way down to your individual lines of code.