Many people condemn "copy-and-paste programming", know the DRY principal (don't repeat yourself) and advocate removal of duplication, but I still see lots of code that contains plenty of duplication - sometimes by the very same people.
Based on the sorts of duplication I see most often, I think this might be that if the amount of code involved isn't very large, then people don't consider it worth refactoring to remove the duplication. However, removing duplication of even small amounts of code can make the code shorter, easier to read and clarify its meaning. The benefits can be even greater if the amount of duplication is greater - so fix them - but don't ignore the easy to fix, smaller bits of duplication either.
Consider the following code:
public void foo(Map configuration){
String name = (String) (configuration.get("name") == null ?
"defaultName" :
configuration.get("name"));
int value = (Integer)(configuration.get("value") == null ?
0 :
configuration.get("value"));
//...
There is obvious duplication - but I see code with this level of duplication quite often - even by people whose code is otherwise good. There are several ways to eliminate this duplication and make the code clearer.
The obvious candidate for removing the duplication is to extract (at least one) method. I'd probably go for something like:
public void foo(Map configuration){
String name = (String) getValueOrDefault(configuration, "name", "defaultName");
int value = (Integer) getValueOrDefault(configuration, "value", 0);
//...
}
private Object getValueOrDefault(Map configuration, Object key, Object defaultValue) {
Object value = configuration.get(key);
return value == null ? defaultValue : value;
}
that is, both extracting a method for the duplicated code, and extracting a variable for the duplicated expression. The purpose of the code is now clearer - it was to get the configuration value or a default value if it wasn't defined. The name of the extracted method tells you what the purpose of the code is.
What if I've got similar code to that which has been refactored into the getValueOrDefault method, in some other class, that currently isn't related by inheritance? You could introduce some common superclass and put it there - but that's often a bad idea (that I don't really want to go into here, but have written about, to some extent, in another article).
One way to share this code is to put it in a "Utils" class as a static method, for example:
public class Utils {
public static Object getValueOrDefault(Map configuration, Object key, Object defaultValue) {
Object value = configuration.get(key);
return value == null ? defaultValue : value;
}
}
which, if you use Java 5.0 static imports, doesn't look any different to the caller than having the method defined in the same class as the caller.
The problem with the method as defined so far is that I have to pass the Map into it, but what I really want is to for the method to be on Map; however in Java I can't (yet) do that. A way to make the code read more like I want would be to define a decorator that adds the method that I want. This can be defined as:
public class MapWithDefaultableValues implements Map {
private Map undefaultedValues;
public MapWithDefaultableValues(Map undefaultedValues) {
this.undefaultedValues = undefaultedValues;
}
public Object getValueOrDefault(String key, Object defaultValue) {
Object value = undefaultedValues.get(key);
return value == null ? defaultValue : value;
}
public void clear() {
undefaultedValues.clear();
}
//... etc for all other methods of Map
The eclipse IDE has a handy feature that generates the code for delegating all the methods of an interface to a field - see "Source > Generate Delegate Methods...".
Then, the calling code can become something like:
public void foo(Map configurationAsMap){
MapWithDefaultableValues configuration = new MapWithDefaultableValues(configurationAsMap);
String name = (String) configuration.getValueOrDefault("name", "defaultName");
int value = (Integer) configuration.getValueOrDefault("value", 0);
//...
You might even find that there are other things that you'd like to be able to do with your configuration objects that also removes duplicated code, and have a class called something like Configuration that doesn't necessarily implement Map but instead provides the interface that is simpler for client code.
Consider the following code:
public void something(Thing thing){
//...
thing.foo(thing.bar());
thing.baz();
//...
The duplication here is of thing - you could call it repetition rather than duplication if you like - it's not much like the earlier example - what is the repetition of thing telling you? One possibility is that the code might be better with this code on the class Thing instead of in the method where it is currently. If there are other instances of this code, or maybe just of part of it (like thing.foo(thing.bar())) then that might tell you even more strongly that the code is in the wrong place.
You might want to add a method on Thing:
public void something() {
foo(bar());
baz();
}
and change the call to:
public void something(Thing thing){
//...
thing.something();
//...
There are lots of other examples. Keep a look out for any type of duplication at all in both the code base and in what you are doing (e.g. if you are frequently typing something that seems very similar to something you've already done). In some cases, it's a sign that you need to do some refactoring. In other cases it's a sign that you need a better tool (for example, I wrote MockMaker [now obsolete, unless you're on pre 1.3 Java] as the result of noticing that I was constantly writing similar code for hand written mock objects - back in the days before anything like JMock).
I'd like developers to take more notice of duplicated code and it's removal. Not only can it make your code smaller and easier to read, sometimes better design and abstractions materialise as a result of simply removing duplication (essentially the hypothesis behind my PhD thesis).
Following on from the previous article in the series, another "programming in the small" thing that I look out for is formatting. Good formatting helps make the code easier to read - bad formatting does the opposite.
Consider the following code:
public void foo(){
if(someCondition())
doSomething();
doSomethingElse();
}
At first glance, it looks like doSomethingElse() is only executed if someCondition() is true. However,
it is executed no matter what the value of the condition is. Possibly the code was originally:
public void foo(){
if(someCondition())
doSomething();
}
and someone had to add doSomethingElse() to be executed after doSomething(). This is why I prefer to always use the {}s even when syntactically redundant.
In Python, the indentation of code is significant - personally, I like this - some people hate it. The reason I like it is because I think code should be formatted with indentation that matches the logical structure of the code, and by making indentation significant, Python enforces this, while at the same time removing the necessity for syntax to delimit code blocks (e.g. {}s or end or equivalent keywords.
I prefer formatting that doesn't use too much space - in particular, vertical space. Even large screens don't allow for loads of lines vertically. Generally, methods/functions shouldn't be that long anyway, but by using vertical space wisely, you can see more code (e.g. several methods) in one screen without compromising readability.
In particular, multiple blank lines (instead of just one), or blank lines that do nothing for the readability of the code, drive me a bit nuts. I also prefer not to have symmetrical {}s - i.e. not having { on a new line, because I don't think it adds anything to readability (if indentation is correct) but just takes more vertical space. (I just found an article which names different bracing styles - apparently, the style I favour is called "K&R").
When talking about formatting with Romilly Cocking he mentioned about the fact that our vision requires things that we read to be within quite a small region of our field of view - (see macula) - and for that reason it's best to try to keep code reasonably compact. (I hope I've represented what Romilly said - hopefully he'll blog about it.).
Obviously sloppily formatted code shows a lack of care of code quality - which can be distracting. On the other hand, minor formatting inconsistencies - that is, ones that you have to look for (e.g. using an automated tool) - are probably not worth looking for. In general, if in doubt about formatting, I recommend that the team finds an automated formatter (typically whatever is built in to your IDE) that you can run on your code if you want. However, I wouldn't have code always auto-formatted because that doesn't always do what you want, and I think sometimes you should "get over" minor inconsistencies, but take more important formatting issues seriously.
I'd like developers to take care over the appearance of code, as well as the contents. It's not the most important thing, but it is important.