I have a treat for crappy code scavengers. Here is some code which has a Cyclomatic Complexity of 68 and an NPath Complexity of 34,632 (and this method is ONLY 189 lines long, or 154 NCSS).
/*
 * Main reading method
 */
public void read(final ByteBuffer byteBuffer) throws Exception {
    invalidateBuffer();
    // Check that the buffer is not bigger than 1 Megabyte. For security reasons
    // we will abort parsing when 1 Mega of queued chars was found.
    if (buffer.length() > maxBufferSize)
        throw new Exception("Stopped parsing never ending stanza");
    CharBuffer charBuffer = encoder.decode(byteBuffer);
    char[] buf = charBuffer.array();
    int readByte = charBuffer.remaining();

    // Just return if nothing was read
    if (readByte == 0)
        return;

    // Verify if the last received byte is an incomplete double byte character
    char lastChar = buf[readByte - 1];
    if (lastChar >= 0xfff0) {
        // Rewind the position one place so the last byte stays in the buffer
        // The missing byte should arrive in the next iteration. Once we have both
        // of bytes we will have the correct character
        byteBuffer.position(byteBuffer.position() - 1);
        // Decrease the number of bytes read by one
        readByte--;
        // Just return if nothing was read
        if (readByte == 0)
            return;
    }

    buffer.append(buf, 0, readByte);
    // Do nothing if the buffer only contains white spaces
    if (buffer.charAt(0) <= ' ' && buffer.charAt(buffer.length() - 1) <= ' ')
        if ("".equals(buffer.toString().trim())) {
            // Empty the buffer so there is no memory leak
            buffer.delete(0, buffer.length());
            return;
        }
    // Robot.
    char ch;
    boolean isHighSurrogate = false;
    for (int i = 0; i < readByte; i++) {
        ch = buf[i];
        if (ch < 0x20 && ch != 0x9 && ch != 0xA && ch != 0xD && ch != 0x0)
            // Unicode characters in the range 0x0000-0x001F other than 9, A, and D are not allowed in XML
            // We need to allow the NULL character, however, for Flash XMLSocket clients to work.
            throw new Exception("Disallowed character");
        if (isHighSurrogate) {
            if (Character.isLowSurrogate(ch))
                // Everything is fine. Clean up traces for surrogates
                isHighSurrogate = false;
            else
                // Trigger error. Found high surrogate not followed by low surrogate
                throw new Exception("Found high surrogate not followed by low surrogate");
        } else if (Character.isHighSurrogate(ch))
            isHighSurrogate = true;
        else if (Character.isLowSurrogate(ch))
            // Trigger error. Found low surrogate char without a preceding high surrogate
            throw new Exception("Found low surrogate char without a preceding high surrogate");
        if (status == XMLLightweightParser.TAIL) {
            // Looking for the close tag
            if (depth < 1 && ch == head.charAt(tailCount)) {
                tailCount++;
                if (tailCount == head.length()) {
                    // Close stanza found!
                    // Calculate the correct start,end position of the message into the buffer
                    int end = buffer.length() - readByte + i + 1;
                    String msg = buffer.substring(startLastMsg, end);
                    // Add message to the list
                    foundMsg(msg);
                    startLastMsg = end;
                }
            } else {
                tailCount = 0;
                status = XMLLightweightParser.INSIDE;
            }
        } else if (status == XMLLightweightParser.PRETAIL) {
            if (ch == XMLLightweightParser.CDATA_START[cdataOffset]) {
                cdataOffset++;
                if (cdataOffset == XMLLightweightParser.CDATA_START.length) {
                    status = XMLLightweightParser.INSIDE_CDATA;
                    cdataOffset = 0;
                    continue;
                }
            } else {
                cdataOffset = 0;
                status = XMLLightweightParser.INSIDE;
            }
            if (ch == '/') {
                status = XMLLightweightParser.TAIL;
                depth--;
            } else if (ch == '!')
                // This is a <! (comment) so ignore it
                status = XMLLightweightParser.INSIDE;
            else
                depth++;
        } else if (status == XMLLightweightParser.VERIFY_CLOSE_TAG) {
            if (ch == '>') {
                depth--;
                status = XMLLightweightParser.OUTSIDE;
                if (depth < 1) {
                    // Found a tag in the form <tag />
                    int end = buffer.length() - readByte + i + 1;
                    String msg = buffer.substring(startLastMsg, end);
                    // Add message to the list
                    foundMsg(msg);
                    startLastMsg = end;
                }
            } else if (ch == '<') {
                status = XMLLightweightParser.PRETAIL;
                insideChildrenTag = true;
            } else
                status = XMLLightweightParser.INSIDE;
        } else if (status == XMLLightweightParser.INSIDE_PARAM_VALUE) {
            if (ch == '"')
                status = XMLLightweightParser.INSIDE;
        } else if (status == XMLLightweightParser.INSIDE_CDATA) {
            if (ch == XMLLightweightParser.CDATA_END[cdataOffset]) {
                cdataOffset++;
                if (cdataOffset == XMLLightweightParser.CDATA_END.length) {
                    status = XMLLightweightParser.OUTSIDE;
                    cdataOffset = 0;
                }
            } else
                cdataOffset = 0;
        } else if (status == XMLLightweightParser.INSIDE) {
            if (ch == XMLLightweightParser.CDATA_START[cdataOffset]) {
                cdataOffset++;
                if (cdataOffset == XMLLightweightParser.CDATA_START.length) {
                    status = XMLLightweightParser.INSIDE_CDATA;
                    cdataOffset = 0;
                    continue;
                }
            } else {
                cdataOffset = 0;
                status = XMLLightweightParser.INSIDE;
            }
            if (ch == '"')
                status = XMLLightweightParser.INSIDE_PARAM_VALUE;
            else if (ch == '>') {
                status = XMLLightweightParser.OUTSIDE;
                if (insideRootTag
                        && ("stream:stream>".equals(head.toString())
                                || "?xml>".equals(head.toString())
                                || "flash:stream>".equals(head.toString()))) {
                    // Found closing stream:stream
                    int end = buffer.length() - readByte + i + 1;
                    // Skip LF, CR and other "weird" characters that could appear
                    while (startLastMsg < end && '<' != buffer.charAt(startLastMsg))
                        startLastMsg++;
                    String msg = buffer.substring(startLastMsg, end);
                    foundMsg(msg);
                    startLastMsg = end;
                }
                insideRootTag = false;
            } else if (ch == '/')
                status = XMLLightweightParser.VERIFY_CLOSE_TAG;
        } else if (status == XMLLightweightParser.HEAD) {
            if (ch == ' ' || ch == '>') {
                // Append > to head to allow searching </tag>
                head.append(">");
                if (ch == '>')
                    status = XMLLightweightParser.OUTSIDE;
                else
                    status = XMLLightweightParser.INSIDE;
                insideRootTag = true;
                insideChildrenTag = false;
                continue;
            } else if (ch == '/' && head.length() > 0) {
                status = XMLLightweightParser.VERIFY_CLOSE_TAG;
                depth--;
            }
            head.append(ch);
        } else if (status == XMLLightweightParser.INIT) {
            if (ch == '<') {
                status = XMLLightweightParser.HEAD;
                depth = 1;
            } else
                startLastMsg++;
        } else if (status == XMLLightweightParser.OUTSIDE)
            if (ch == '<') {
                status = XMLLightweightParser.PRETAIL;
                cdataOffset = 1;
                insideChildrenTag = true;
            }
    }
    if (head.length() > 0
            && ("/stream:stream>".equals(head.toString()) || "/flash:stream>".equals(head.toString())))
        // Found closing stream:stream
        foundMsg("</stream:stream>");
}
What does this code actually do?
This method lives inside a LightWeightXMLParser. It reads data from a socket channel (Java NIO) and accumulates it as it arrives on the channel. Once a message is complete (fully formed XML), you can retrieve the collected messages by invoking getMsgs(), and you can call areThereMsgs() to check whether at least one message is present.
/*
 * @return an array with all messages found
 */
public String[] getMsgs() {
    String[] res = new String[msgs.size()];
    for (int i = 0; i < res.length; i++)
        res[i] = msgs.get(i);
    msgs.clear();
    invalidateBuffer();
    return res;
}
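To make the read / areThereMsgs / getMsgs contract concrete, here is a minimal sketch of the same accumulate-then-retrieve idea. StanzaCollector is a hypothetical stand-in, not the parser shown above: it assumes String chunks instead of a ByteBuffer, counts tag depth naively, and ignores CDATA, comments, attribute values containing '>', and the never-closed stream header.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for the parser's public contract.
class StanzaCollector {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> msgs = new ArrayList<>();

    // Append a chunk and move every complete top-level element into msgs.
    void read(String chunk) {
        buffer.append(chunk);
        int depth = 0;
        int start = 0; // start of the element currently being collected
        int i = 0;
        while (i < buffer.length()) {
            if (buffer.charAt(i) != '<') {
                i++;
                continue;
            }
            int close = buffer.indexOf(">", i);
            if (close < 0) {
                break; // the tag itself is incomplete; wait for more data
            }
            boolean closingTag = buffer.charAt(i + 1) == '/';
            boolean selfClosing = buffer.charAt(close - 1) == '/';
            if (closingTag) {
                depth--;
            } else if (!selfClosing) {
                depth++;
            }
            if (depth == 0) {
                // Back at the top level: everything since 'start' is one message.
                msgs.add(buffer.substring(start, close + 1));
                start = close + 1;
            }
            i = close + 1;
        }
        // Keep only the unfinished tail for the next call.
        buffer.delete(0, start);
    }

    boolean areThereMsgs() {
        return !msgs.isEmpty();
    }

    // Returns the collected messages and clears the list, like getMsgs() above.
    String[] getMsgs() {
        String[] res = msgs.toArray(new String[0]);
        msgs.clear();
        return res;
    }

    public static void main(String[] args) {
        StanzaCollector collector = new StanzaCollector();
        collector.read("<message><body>hi");
        System.out.println(collector.areThereMsgs());
        collector.read("</body></message><presence/>");
        for (String msg : collector.getMsgs()) {
            System.out.println(msg);
        }
    }
}
```

The point of the sketch is only the calling pattern: feed chunks as they arrive, poll areThereMsgs(), then drain with getMsgs().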
The following tests might help you understand the code slightly better:
@Override
protected void setUp() throws Exception {
    super.setUp();
    // Create parser
    parser = new LightWeightXMLParser(CHARSET);
    // Create byte buffer and append text
    in = ByteBuffer.allocate(4096);
}

public void testHeader() throws Exception {
    String msg1 = "<stream:stream to=\"localhost\" xmlns=\"jabber:client\" xmlns:stream=\"http://etherx.jabber.org/streams\" version=\"1.0\">";
    in.put(msg1.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    assertEquals("Wrong stanza was parsed", msg1, parser.getMsgs()[0]);
}
public void testHeaderWithXMLVersion() throws Exception {
    String msg1 = "<?xml version=\"1.0\"?>";
    String msg2 = "<stream:stream to=\"localhost\" xmlns=\"jabber:client\" xmlns:stream=\"http://etherx.jabber.org/streams\" version=\"1.0\">";
    in.put((msg1 + msg2).getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    String[] values = parser.getMsgs();
    assertEquals("Wrong number of parsed stanzas", 2, values.length);
    assertEquals("Wrong stanza was parsed", msg1, values[0]);
    assertEquals("Wrong stanza was parsed", msg2, values[1]);
}

public void testCompleteStanzas() throws Exception {
    String msg1 = "<stream:stream to=\"localhost\" xmlns=\"jabber:client\" xmlns:stream=\"http://etherx.jabber.org/streams\" version=\"1.0\">";
    String msg2 = "<starttls xmlns=\"urn:ietf:params:xml:ns:xmpp-tls\"/>";
    String msg3 = "<stream:stream to=\"localhost\" xmlns=\"jabber:client\" xmlns:stream=\"http://etherx.jabber.org/streams\" version=\"1.0\">";
    String msg4 = "<iq id=\"428qP-0\" to=\"localhost\" type=\"get\"><query xmlns=\"jabber:iq:register\"></query></iq>";
    String msg5 = "<stream:stream to=\"localhost\" xmlns=\"jabber:client\" xmlns:stream=\"http://etherx.jabber.org/streams\" version=\"1.0\">";
    String msg6 = "<presence id=\"428qP-5\"></presence>";
    String msg7 = "</stream:stream>";
    in.put(msg1.getBytes());
    in.put(msg2.getBytes());
    in.put(msg3.getBytes());
    in.put(msg4.getBytes());
    in.put(msg5.getBytes());
    in.put(msg6.getBytes());
    in.put(msg7.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    String[] values = parser.getMsgs();
    assertEquals("Wrong number of parsed stanzas", 7, values.length);
    assertEquals("Wrong stanza was parsed", msg1, values[0]);
    assertEquals("Wrong stanza was parsed", msg2, values[1]);
    assertEquals("Wrong stanza was parsed", msg3, values[2]);
    assertEquals("Wrong stanza was parsed", msg4, values[3]);
    assertEquals("Wrong stanza was parsed", msg5, values[4]);
    assertEquals("Wrong stanza was parsed", msg6, values[5]);
    assertEquals("Wrong stanza was parsed", msg7, values[6]);
}
public void testIQ() throws Exception {
    String iq = "<iq type=\"set\" to=\"lachesis\" from=\"0sups/Connection Worker - 1\" id=\"360-22348\"><session xmlns=\"http://jabber.org/protocol/connectionmanager\" id=\"0sups87b1694\"><close/></session></iq>";
    in.put(iq.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    String parsedIQ = parser.getMsgs()[0];
    assertEquals("Wrong stanza was parsed", iq, parsedIQ);
}
public void testNestedElements() throws Exception {
    String msg1 = "<message><message xmlns=\"e\">1</message></message>";
    in.put(msg1.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    String[] values = parser.getMsgs();
    assertEquals("Wrong number of parsed stanzas", 1, values.length);
    assertEquals("Wrong stanza was parsed", msg1, values[0]);
}
public void testIncompleteStanza() throws Exception {
    String msg1 = "<message><something xmlns=\"http://idetalk.com/namespace\">12";
    in.put(msg1.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertFalse("Found messages in incomplete stanza", parser.areThereMsgs());
}

public void testCompletedStanza() throws Exception {
    String msg1 = "<message><something xmlns=\"http://idetalk.com/namespace\">12";
    in.put(msg1.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertFalse("Found messages in incomplete stanza", parser.areThereMsgs());
    String msg2 = "</something></message>";
    ByteBuffer in2 = ByteBuffer.allocate(4096);
    in2.put(msg2.getBytes());
    in2.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in2);
    in2.clear();
    assertTrue("Stream header is not being correctly parsed", parser.areThereMsgs());
    String[] values = parser.getMsgs();
    assertEquals("Wrong number of parsed stanzas", 1, values.length);
    assertEquals("Wrong stanza was parsed", msg1 + msg2, values[0]);
}
public void testStanzaWithComments() throws Exception {
    String msg1 = "<iq from=\"lg@jabber.org/spark\"><query xmlns=\"jabber:iq:privacy\"><!-- silly comment --></query></iq>";
    in.put(msg1.getBytes());
    in.flip();
    // Fill parser with byte buffer content and parse it
    parser.read(in);
    // Make verifications
    assertTrue("No messages were found in stanza", parser.areThereMsgs());
    String[] values = parser.getMsgs();
    assertEquals("Wrong number of parsed stanzas", 1, values.length);
    assertEquals("Wrong stanza was parsed", msg1, values[0]);
}
It’s been a while since the Fourth Refactoring Teaser was posted. So far, I think this is one of the trickiest refactorings I’ve tried. I refactored half of the solution and rewrote the rest of it.
I’m particularly thrilled about the shrinkage in the code base. Replacing all those convoluted Strategies and Child Strategies with 2 main classes was real fun (and difficult as well). Even though the solution is not yet up to the mark, it’s come a long, long way from where it was.
I ended up renaming IdentityGenerator to EmailSuggester and PartialAcceptanceTest to EmailSuggesterTest. I also really like how that test looks now:
I’m not happy with this method. This is the roughest part of this code. All the if (seed != lastName) checks seem dodgy. But at least all of it is in one place instead of being scattered around 10 different classes with tons of duplicate code.
For each potential email data item, we try to create an email address. If it’s available, we add it; otherwise we move on to the next item, until we exhaust the list.
Given two tokens (user name and domain name), the Email class tries to create an email address without Restricted Words and Celebrity Names in it.
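The loop described above can be sketched roughly as follows. This is an illustration only: SuggestionLoopSketch, Candidate, and the Set standing in for the availability check are all hypothetical names, not taken from the actual code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the original code base.
class SuggestionLoopSketch {

    // One potential piece of email data: a user name paired with a domain name.
    static final class Candidate {
        final String userName;
        final String domainName;

        Candidate(String userName, String domainName) {
            this.userName = userName;
            this.domainName = domainName;
        }

        String asEmail() {
            return userName + "@" + domainName;
        }
    }

    // Walk the candidates in order and keep every address that is still free.
    // The set of taken addresses is an assumed stand-in for a real availability lookup.
    static List<String> suggest(List<Candidate> candidates, Set<String> takenAddresses) {
        List<String> suggestions = new ArrayList<>();
        for (Candidate candidate : candidates) {
            String email = candidate.asEmail();
            if (!takenAddresses.contains(email)) {
                suggestions.add(email);
            }
            // Otherwise fall through to the next candidate.
        }
        return suggestions;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("john", "example.com"),
                new Candidate("john.smith", "example.com"),
                new Candidate("jsmith", "example.com"));
        Set<String> taken = new HashSet<>(Arrays.asList("john@example.com"));
        System.out.println(suggest(candidates, taken));
    }
}
```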
private String buildIdWithoutRestrictedWordsAndCelebrityNames() {
    Email current = this;
    if (isCelebrityName())
        current = trimLastCharacter();
    return buildIdWithoutRestrictedWordsAndCelebrityNames(current, 1);
}
private String buildIdWithoutRestrictedWordsAndCelebrityNames(final Email last, final int count) {
    if (count == MAX_ATTEMPTS)
        throw new IllegalStateException("Exceeded the Max number of tries");
    String userName = findClosestNonRestrictiveWord(last.userName, RestrictedUserNames, 0);
    String domainName = findClosestNonRestrictiveWord(last.domainName, RestrictedDomainNames, 0);
    Email id = new Email(userName, domainName, dns);
    if (!id.isCelebrityName())
        return id.asString();
    return buildIdWithoutRestrictedWordsAndCelebrityNames(id.trimLastCharacter(), count + 1);
}
Influenced by Functional Programming, I’ve tried to use Tail recursion and Immutable objects here.
Also to get rid of massive duplication in code, I had to introduce a new Interface and 2 anonymous inner classes.
public interface RestrictedWords {
    RestrictedWords RestrictedUserNames = new RestrictedWords() {
        @Override
        public boolean contains(final String word, final DomainNameService dns) {
            return dns.isRestrictedUserName(word);
        }
    };

    RestrictedWords RestrictedDomainNames = new RestrictedWords() {
        @Override
        public boolean contains(final String word, final DomainNameService dns) {
            return dns.isRestrictedDomainName(word);
        }
    };

    boolean contains(final String word, DomainNameService dns);
}
Then we look at who is constructing this class, and it turns out that we have this deadly SuggestionsUtil class (love the name). This class suffers from various code smells:
So far, most of the refactoring teasers we’ve looked at have suffered from a lack of modularity and from primitive obsession. This refactoring teaser is quite the opposite. Overall, the code base is decent-sized. So instead of trying to solve the whole problem in one go, let’s take it one step at a time.
Following is the crux of the Sender Edge Server IP Extraction Algo:
public IPAddressExtractor(final List receivedHeaders, final Domain recepientDomain) {
    Domain recepientMXRecord = recepientDomain.retrieveFirstMXRecord();
    for (MatchingCriterion critierion : asList(MATCHING_DOMAIN, MATCHING_IP, MATCHING_SECOND_LEVEL_DOMAIN)) {
        Match result = foreach(receivedHeaders).and(recepientMXRecord).match(critierion);
        if (result.success()) {
            storeSenderIPWithDistance(result);
            break;
        }
    }
}
To support the fluent interface call foreach(receivedHeaders).and(recepientMXRecord).match(critierion), I had to create a private method:
Other than the switch statement smell and conditional complexity, the original code suffered badly from the Primitive Obsession code smell. To fix this issue, the first thing I had to do was create first-class citizens (objects). So I ended up creating
Also notice that for testing purposes we don’t want to hit the network, so I created a FakeNetwork class which stubs out all Network calls. Network is injected into all Domain classes through the DomainFactory. (I’m not very happy with this design; it feels like a bit of a hack to inject Network this way.)
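As a rough illustration of that injection, here is a minimal sketch. The type names Network, FakeNetwork, Domain and DomainFactory and the method retrieveFirstMXRecord come from the post; every other method and signature below (firstMxRecordFor, withMxRecord, build, name) is an assumption made up for the example and is not the actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shape of the Network abstraction: one lookup the Domain needs.
interface Network {
    String firstMxRecordFor(String domainName);
}

// Test double that answers from an in-memory map instead of doing DNS lookups.
class FakeNetwork implements Network {
    private final Map<String, String> mxRecords = new HashMap<>();

    FakeNetwork withMxRecord(String domainName, String mxHost) {
        mxRecords.put(domainName, mxHost);
        return this;
    }

    @Override
    public String firstMxRecordFor(String domainName) {
        String mxHost = mxRecords.get(domainName);
        if (mxHost == null) {
            throw new IllegalArgumentException("No MX record stubbed for " + domainName);
        }
        return mxHost;
    }
}

// Domain never creates its own Network; it only uses the one it was given.
class Domain {
    private final String name;
    private final Network network;

    Domain(String name, Network network) {
        this.name = name;
        this.network = network;
    }

    Domain retrieveFirstMXRecord() {
        return new Domain(network.firstMxRecordFor(name), network);
    }

    String name() {
        return name;
    }
}

// The factory is the single place where the Network gets handed to Domains.
class DomainFactory {
    private final Network network;

    DomainFactory(Network network) {
        this.network = network;
    }

    Domain build(String name) {
        return new Domain(name, network);
    }
}
```

In a test, the factory would be built with a FakeNetwork; production code would pass in an implementation that really performs the lookups.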
Last week I posted a small code snippet for refactoring under the heading Refactoring Teaser.
In this post I’ll try to show step by step how I would try to refactor this mud ball.
First and foremost, I cleaned up the tests to communicate the intent. Also notice I’ve changed the test class name to ContentTest instead of StringUtilTest, a name that could mean anything and everything.
Next, I created a class called Content, instead of StringUtil. Content is a first-class domain object. Also notice, no more side-effect intense statics.
public class Content {
    private static final String BLANK_OUTPUT = "";
    private static final String SPACE = " ";
    private static final String DELIMITER = "', '";
    private static final String SINGLE_QUOTE = "'";
    private static final int MIN_NO_WORDS = 2;
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private List<String> phrases = new ArrayList<String>();

    public Content(final String content) {
        String[] tokens = ON_WHITESPACES.split(content);
        if (tokens.length > MIN_NO_WORDS) {
            buildAllPhrasesUptoThreeWordsFrom(tokens);
        }
    }

    @Override
    public String toString() {
        return toPhrases(Integer.MAX_VALUE);
    }

    public String toPhrases(final int userRequestedSize) {
        if (phrases.isEmpty()) {
            return BLANK_OUTPUT;
        }
        List<String> requiredPhrases = phrases.subList(0, numberOfPhrasesRequired(userRequestedSize));
        return withInQuotes(join(requiredPhrases, DELIMITER));
    }

    private String withInQuotes(final String phrases) {
        return SINGLE_QUOTE + phrases + SINGLE_QUOTE;
    }

    private int numberOfPhrasesRequired(final int userRequestedSize) {
        return userRequestedSize > phrases.size() ? phrases.size() : userRequestedSize;
    }

    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }

    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }

    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }

    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }
}
This was a big step forward, but not good enough. Next I focused on the following code:
private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
    buildSingleWordPhrases(words);
    buildDoubleWordPhrases(words);
    buildTripleWordPhrases(words);
}

private void buildSingleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length; ++i) {
        phrases.add(words[i]);
    }
}

private void buildDoubleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length - 1; ++i) {
        phrases.add(words[i] + SPACE + words[i + 1]);
    }
}

private void buildTripleWordPhrases(final String[] words) {
    for (int i = 0; i < words.length - 2; ++i) {
        phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
    }
}
The above code violates the Open-Closed Principle. It also smells of duplication. I created a somewhat generic method to kill the duplication.
Now I had a feeling that my Content class was doing too much and also suffered from the primitive obsession code smell. It looked like a concept/abstraction (class) was dying to be called out. So I created a Words class as an inner class.
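One way such a generic method could look is sketched below. This is not necessarily the method from the original refactoring; PhraseSketch and phrasesOf are names assumed for illustration. The idea is that the phrase length becomes a parameter, so the single, double, and triple word loops collapse into one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a length-parameterised phrase builder.
class PhraseSketch {

    // Returns every run of 'length' consecutive words, joined by single spaces.
    static List<String> phrasesOf(String[] words, int length) {
        List<String> phrases = new ArrayList<>();
        for (int i = 0; i + length <= words.length; i++) {
            phrases.add(String.join(" ", Arrays.copyOfRange(words, i, i + length)));
        }
        return phrases;
    }

    public static void main(String[] args) {
        String[] words = {"refactoring", "is", "fun"};
        // Building phrases of up to three words is now a loop over the length.
        for (int length = 1; length <= 3; length++) {
            System.out.println(phrasesOf(words, length));
        }
    }
}
```

Supporting four-word phrases would then mean changing a loop bound rather than adding another near-identical method, which is what the Open-Closed Principle complaint was about.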
private class Words {
    private String[] tokens;
    private static final String SPACE = " ";

    Words(final String content) {
        tokens = ON_WHITESPACES.split(content);
    }

    boolean has(final int minNoWords) {
        return tokens.length > minNoWords;
    }

    List<String> phrasesOf(final int length) {
        List<String> phrases = new ArrayList<String>();
        for (int i = 0; i <= tokens.length - length; ++i) {
            String phrase = phraseAt(i, length);
            phrases.add(phrase);
        }
        return phrases;
    }

    private String phraseAt(final int index, final int length) {
        StringBuilder phrase = new StringBuilder(tokens[index]);
        for (int i = 1; i < length; i++) {
            phrase.append(SPACE + tokens[index + i]);
        }
        return phrase.toString();
    }
}
In the constructor of the Content class we instantiate a Words class as follows:
public Content(final String content) {
    Words words = new Words(content);
    if (words.has(MIN_NO_WORDS)) {
        phrases.addAll(words.phrasesOf(ONE_WORD));
        phrases.addAll(words.phrasesOf(TWO_WORDS));
        phrases.addAll(words.phrasesOf(THREE_WORDS));
    }
}
Even though this code communicates well, there is duplication and noise that can be removed without compromising on the communication.