Its been a while since the Fourth Refactoring Teaser was posted. So far, I think this is one of the trickiest refactorings I’ve tried. Refactored half of the solution and rewrote the rest of it.
Particularly thrilled about shrinkage in the code base. Getting rid of all those convoluted Strategies and Child Strategies with 2 main classes was real fun (and difficult as well). Even though the solution is not up to the mark, its come a long long way from where it was.
Ended up renaming IdentityGenerator to EmailSuggester. Renamed the PartialAcceptanceTest to EmailSuggesterTest. Also really like how that test looks now:
I’m not happy with this method. This is the roughest part of this code. All the
if(seed != lastName){
seems dodgy. But at least all of it is in one place instead of being scattered around 10 different classes with tons of duplicate code.
For each potential email data, we try to create an email address, if its available, we add it, else we move to the next potential email data, till we exhaust the list.
Given two tokens (user name and domain name), the Email class tries to creates an email address without Restricted Words and Celebrity Names in it.
30
31
32
33
34
35
privateString buildIdWithoutRestrictedWordsAndCelebrityNames(){
Email current =this;if(isCelebrityName())
current = trimLastCharacter();return buildIdWithoutRestrictedWordsAndCelebrityNames(current, 1);}
37
38
39
40
41
42
43
44
45
46
privateString buildIdWithoutRestrictedWordsAndCelebrityNames(final Email last, finalint count){if(count == MAX_ATTEMPTS)thrownewIllegalStateException("Exceeded the Max number of tries");String userName = findClosestNonRestrictiveWord(last.userName, RestrictedUserNames, 0);String domainName = findClosestNonRestrictiveWord(last.domainName, RestrictedDomainNames, 0);
Email id =new Email(userName, domainName, dns);if(!id.isCelebrityName())return id.asString();return buildIdWithoutRestrictedWordsAndCelebrityNames(id.trimLastCharacter(), count +1);}
Influenced by Functional Programming, I’ve tried to use Tail recursion and Immutable objects here.
Also to get rid of massive duplication in code, I had to introduce a new Interface and 2 anonymous inner classes.
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
publicinterface RestrictedWords {
RestrictedWords RestrictedUserNames =new RestrictedWords(){
@Override
publicboolean contains(finalString word, final DomainNameService dns){return dns.isRestrictedUserName(word);}};
RestrictedWords RestrictedDomainNames =new RestrictedWords(){
@Override
publicboolean contains(finalString word, final DomainNameService dns){return dns.isRestrictedDomainName(word);}};boolean contains(finalString word, DomainNameService dns);}
* Heap size remains the same assuming Garbage collection takes places regularly.
Here, when I say Immutable, I mean the class in which the method (using iterator or recursive) is contained is Immutable. .i.e. the method is trying to update the instance variables, but they are declared final, hence they cannot be modified. Only way to modify them is to create a new object with the updated instance variables.
//instance variableprivatefinal List<String> restrictedWords = retrieveRestrictedWordsFromDB();publicString findClosestNonRestrictiveWord(String word){for(int i =0; i < MAX_ATTEMPTS && word.length()>1; i++){if(!restrictedWords.contains(word))return word;
word = word.substring(0, word.length()-1);}thrownewIllegalArgumentException(word +" is a restricted word");}
One would say this code is performant and uses memory optimally (time and space complexity is low) but is not the most expressive (intent revealing) code.
publicString findClosestNonRestrictiveWord(finalString word, finalint count){if(count == MAX_ATTEMPTS || word.length()<=1)thrownewIllegalArgumentException(word +" is a restricted word");if(!restrictedWords.contains(word))return word;return findClosestNonRestrictiveWord(trimLastChar(word), count +1);}privateString trimLastChar(finalString word){return word.substring(0, word.length()-1);}
I would argue that this code is much easier to understand and has very similar space and time complexity as the iterator based approach in most languages. (I know some JVMs don’t do Tail Call Optimization.)
In this specific case, there is no big win either ways because the MAX_ATTEMPTS is 5. Lets take a slightly more complete example where we have two mutable (non-final) instance variables (note this is a dumbed down version of a complex class.):
publicclass EmailAddress {privatestaticfinalint MAX_ATTEMPTS =5;privateString userName;privateString domainName;public EmailAddress(finalString userName, finalString domainName){if(isEmpty(userName)|| isEmpty(domainName))thrownewRuntimeException("User Name or Domain Name is empty or null");this.userName= userName.toLowerCase();this.domainName= domainName.toLowerCase();}publicString asString(){for(int loopCounter =0; loopCounter < MAX_ATTEMPTS;++loopCounter)if(isEmpty(userName)|| isEmpty(domainName))thrownewRuntimeException();elseif(!EmailService.isAvailable(userName, domainName)){
userName = userName.substring(0, userName.length()-1);
domainName = domainName.substring(0, domainName.length()-1);}elsereturn userName +"@"+ domainName;thrownewRuntimeException();}privateboolean isEmpty(finalString token){return token ==null|| token.length()==0;}}
Now let’s imagine you had 2 threads calling the asString method. [Blast], this code would blow apart.
Well, not a problem, we can simply make the method synchronized. But wait a second, what does this mean? I can only execute this whole method by one thread. Which means for a single CPU, single threaded application this code performs well. But the moment you have a multi-core concurrent (multi-threaded) application, this code will show no performance improvement. That sucks!
So what might look performant (esp. after compromising on expressiveness) is not really scalable. So what option do we have?
We could create a new instance of EmailAddress for every thread. But what if that’s not under your control?
Another thing we can try is reduce the scope of the two instance variables to method parameter or local copies, but what if you cannot do that trivially?
Notice the two instance variables are final. Which means the EmailAddress object is immutable. Also notice that instead of trimming the original instance variables, we create a new EmailAddress with the trimmed values, which means we don’t manipulate any object level state and two threads are not sharing any data. Hence we avoid the whole synchronization issue.
Please note that in languages that do not support Tail Call Optimization, using tail recursion has similar space and time complexity as recursion. (For long recursive calls you run the risk of stack_over_flow). Also your Stack and Heap memory will grow linearly with every recursion. More about Iteration v/s Recursion and Mutable v/s Immutable.