March 31, 2006

Programming in the small - structured programming

Following on from the previous article in the series, another programming habit I see frequently is following the structured programming rule of having only one exit point from a function/method.

Cargo Cults

I was prompted to write this article by Dave Thomas's talk at SPA 2006. (I had considered writing it before, but thought it might be seen as a matter of taste rather than anything more definitive.) He explained the phenomenon of Cargo Cults - which I won't fully describe here - but in a nutshell, it's doing something because it used to work, without considering whether it's still appropriate.

Dave mentioned structured programming - in particular the GOTO statement - as a Cargo Cult. (I couldn't find much of a write-up of his talk - I might write more myself some time ...).

I think that the "single exit point" rule of structured programming is a Cargo Cult (but I don't think that the absence of GOTO statements in languages is).

Structured programming considered harmful

Consider the following function (and pretend the ternary operator isn't available):

	int foo(boolean condition) {
		int returnValue;
		if(condition){
			returnValue = 5;
		}else{
			returnValue = 6;
		}
		return returnValue;
	}

I've seen lots of code like this over the years. I'm sure it's because people have been taught at some point (and I think it's at University) that a function should only have one exit point, and so that's what they do.

The improved version - multiple exit points

Compare the code above with this version:

	int foo(boolean condition) {
		if(condition){
			return 5;
		}else{
			return 6;
		}
	}

It's slightly shorter and, I think, easier to understand.

In the "single exit point" version, the declaration of "returnValue" is unnecessary fluff that I have to read, that is less direct and clear than just doing the returns when you need them. Unless I read the whole function (in this case mercifully short) I don't know whether "returnValue" gets used for other things during the function.

Longer functions that use the "single exit point" rule can get really convoluted and difficult to understand. Having a function return as soon as it can improves the clarity because it improves the "locality" of the code. It's much the same philosophy as behind a previous article in the series - declaring variables as locally as possible.
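To make the point about longer functions concrete, here is a minimal sketch. The class, the method names and the discount rules are all invented for illustration; they are not taken from any real code base. Both methods compute the same result, but the second returns as soon as each case is decided.

```java
// Hypothetical example: names and discount rules are illustrative only.
final class Discounts {
	// Single exit point: "result" is threaded through nested branches.
	static int discountSingleExit(int age, boolean member) {
		int result;
		if (age < 0) {
			result = -1; // invalid input
		} else {
			if (member) {
				if (age >= 65) {
					result = 30;
				} else {
					result = 10;
				}
			} else {
				result = 0;
			}
		}
		return result;
	}

	// Multiple exit points: each case returns as soon as it is decided.
	static int discountEarlyReturn(int age, boolean member) {
		if (age < 0) {
			return -1; // invalid input
		}
		if (!member) {
			return 0;
		}
		if (age >= 65) {
			return 30;
		}
		return 10;
	}
}
```

In the second version, once you have read past a return you can forget that case entirely; in the first you have to follow the nesting to the end to be sure "result" isn't touched again.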

At a return statement you know what the code is going to do next; you don't need to read through the rest of the function to work out that none of it is relevant for that path (well, there is an exception to that - a finally clause, but that's another article).
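The finally exception mentioned above can be seen in a small sketch. The class name and the log field are hypothetical, introduced only to record the order in which things happen.

```java
// Hypothetical example: FinallyDemo and its log exist only to show ordering.
final class FinallyDemo {
	static final StringBuilder log = new StringBuilder();

	static int valueWithCleanup() {
		try {
			log.append("return;");
			return 5;
		} finally {
			// Runs after the return value has been computed,
			// but before control gets back to the caller.
			log.append("finally;");
		}
	}
}
```

So even at a return statement inside a try block, the code in the finally clause is still relevant for that path.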

I've seen code that has become really convoluted as a result of trying to follow the "single exit point" rule and I just can't see any benefit.

I'll tell you what I want, what I really, really want.

What I'd like is for people to think about whether having a "single exit point" is a good thing rather than just believing the accepted wisdom and taking it as true. It's become such a strong "Cargo Cult" that it'll take many years until it disappears. To me it seems as unchallengeable as the idea that water goes down a plug hole clockwise in one hemisphere and anti-clockwise in the other.

Posted by ivan at 7:51 AM Copyright (c) 2004-2007 Ivan Moore | Comments (10)

March 20, 2006

Putting the tea into team

Here's the proof - many thanks to Kingsley Hendrickse for the photo.

Posted by ivan at 5:44 PM Copyright (c) 2004-2007 Ivan Moore | Comments (3)

March 12, 2006

BDD follows from TDD

BDD follows naturally from TDD. That is, Bladder Driven Development inevitably follows from Tea Driven Development.

Posted by ivan at 2:07 PM Copyright (c) 2004-2007 Ivan Moore | Comments (3)

March 4, 2006

Programming in the small - access levels.

Following on from the previous article, another programming habit I see frequently is using access levels (e.g. in Java, private, package, protected and public) that are less restrictive than possible. For example, declaring a method public that could be private.

In this article, I'll describe access levels that are less restrictive as "more public" or "less private", and those that are more restrictive as "more private" or "less public".

Declaring class members "too public".

Probably the most frequent example of this habit is that of declaring class members as protected when they could be private. I believe that people do this because of past experiences where they wanted to be able to override some method and found they couldn't - I'll address that later.

Something that I've noticed in my travels around many client sites is a surprisingly large number of Java developers who think protected means something different than it actually does (in fact, even that private means something different than it actually does - it's class based not instance based). I won't describe what these access levels actually mean - it's well documented.
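As a quick illustration of private being class based rather than instance based, here is a minimal sketch. The Account class is hypothetical and exists only for this example.

```java
// Hypothetical example: Account is invented to illustrate class-based privacy.
final class Account {
	private final int balance;

	Account(int balance) {
		this.balance = balance;
	}

	// Legal Java: code inside Account can read the private field of
	// *another* Account instance, not just of "this".
	boolean hasMoreThan(Account other) {
		return this.balance > other.balance;
	}
}
```

The compiler checks which class the accessing code is written in, not which object it is running on.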

The consequences

Class members with "less public" access levels can be referenced in fewer classes than members with "more public" access levels - that's the point of access levels. That means that when reading some code, if you see that a member is declared "more public" you know instantly that there are more classes that might reference it (hence more dependencies) than if it's "more private".

You might guess that, for example, if you change a public method, it'll probably have an effect on some other code (in a different class) because the chances are that there is some other code (in a different class) calling the method, because it's declared public. Therefore, before changing a public method, you might want to look at references to the method and check that callers of the method are OK with the changes you are making.

If a member is declared private, then you know that only methods of that class can be using the member, so you are free to refactor private methods by only looking at code in that class.

Therefore, if a class member is declared to be "more public" than it could be, you might be put off refactoring it because of possible effects on code in other classes, or you might waste time looking for references when, if it were private, you would immediately see the only references.

You could, of course, look for all references to the member, but because it's not immediate you won't do that all the time.

What about my experience where I couldn't override something because it was declared private?

This experience is often a symptom of something else:

Lack of collective code ownership

If the code that you are dealing with is all in-house, then you should be able to make a member more public when you need to. That is, make everything as private as it can be for your current needs - making it more public only if you find you need to. This fails if there is rigid partitioning of a code base. Collective code ownership is needed for this to work well in practice.

Closed source libraries

If you are writing or using a closed source library, there may be times when members have to be declared more public in order to allow for future unforeseen uses of the code.

Use of inheritance instead of composition for customisation

One of the reasons libraries may need members to be declared more public than would otherwise be necessary is that the library is designed for users to extend classes using inheritance. There are many cases where using composition would be a better design.
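A minimal sketch of what customisation by composition might look like. The Greeter class is hypothetical, not taken from any real library; the only real API used is java.util.function.UnaryOperator from the standard library. The caller passes the customised behaviour in, so the method that would otherwise have to be protected for overriding can stay private.

```java
// Hypothetical library class: customised by passing behaviour in,
// rather than by subclassing and overriding a protected method.
final class Greeter {
	private final java.util.function.UnaryOperator<String> formatter;

	Greeter(java.util.function.UnaryOperator<String> formatter) {
		this.formatter = formatter;
	}

	String greet(String name) {
		return "Hello, " + format(name);
	}

	// Can stay private: nobody needs to override it.
	private String format(String name) {
		return formatter.apply(name);
	}
}
```

For example, new Greeter(String::toUpperCase).greet("ivan") gives "Hello, IVAN" without Greeter exposing anything beyond its constructor and greet.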

I'll tell you what I want, what I really, really want.

It would be great if Eclipse (or other IDEs) spotted that members could be more private than they have been declared and offer to change them for you. I'm sure something like this must exist - please post here telling me. Note that I want a tool to tell me immediately, not one that has to be run separately, because any lack of immediacy will mean it won't be as useful.

A few years ago I thought of writing a tool to do this that I was going to call "Thatcher" - it would privatize as much as possible. I haven't got around to it...

Posted by ivan at 2:49 PM Copyright (c) 2004-2007 Ivan Moore | Comments (2)