Learning about estimation when you don’t care about estimates

This post follows on from the previous one, linked here – Help! How do we start estimating?

The problem with running a training course on estimation is the danger of making poor assumptions about the things being estimated – i.e. you could end up “assuming the problem away”. If that happens, the exercise becomes too sanitised and relatively meaningless other than in a theoretical and abstract sense. And that’s often hard to relate to.

Using analogies (e.g. the “throwing the cat game” [http://tastycupcakes.org/2016/05/throw-the-cat-and-other-objects/]) can help. They show you the dangers of assumptions etc., but in all the cases I’ve seen, the variability is low-ish. You’re doing the same thing to all items.

But what if you’re doing potentially fundamentally different things on different tickets? For example, if writing a validation method with some clever logic is a “5”, what is “patching a Docker template”? With the industry using the term “DevOps” like there’s no tomorrow, the variability of work a single team will perform will inevitably rise.

We, as humans, are generally far better at comparison-based estimation than at comparing to an absolute measure (especially an abstract one like distance or time). We’re also very good at spotting patterns. Or, if real patterns don’t exist, at inventing them (one of the reasons why the saying “correlation does not imply causation” exists – we need to be reminded!).

That innate ability to see patterns, regardless of whether or not they exist, can lead people to try and find a relationship or common attribute across entirely disparate work items. Some teams make a virtue of this, using terms like “complexity” or “relative effort”, which make reasonable sense – irrespective of work, it can be categorised as complex/simple or quick/time consuming. That allows a single team to use a single “scale” to estimate everything that they could do.

That single scale is one of the reasons that things can get a bit confusing. With radically different types of work, natural clusters may form. Infrastructure management tasks may hover around the 1-5 mark, development work might be 2-8, something architecturally significant might be 5-13 and spikes might be as high as a 20(*).

If that’s happening, one of a few things could be the cause:

  • It could be true
  • There could be the hidden view that “development stuff” is harder than “infrastructure management” (or something along those lines) and that bias is gaming the numbers
  • Anything else I’ve not thought of right now – depends on the room, dynamics etc.

Relative estimation works well within a logical domain – there are sufficient overlaps and related attributes for a meaningful comparison. A comparison across fundamentally different domains makes far less sense. There’s even an old simile on the subject – “as different as chalk and cheese”.

To combat this, I usually recommend having a lot more than just a reference story. I suggest having a catalogue of items, from as many of the affected domains as possible, making sure we’ve got examples of a few of the magic numbers in each domain. When attempting to size an item, the first question to answer is which catalogue item is the closest match, and then go up or down however many sizes as is appropriate from there. It can take an awful lot of stress and confusion out of estimation as you’re no longer trying to shoehorn a square peg into a round hole.
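As a rough illustration, the catalogue approach might be sketched in Python as below. Treat it as a sketch under assumptions rather than a recipe: the domain names, the catalogue entries, most of the sizes and the helper name `size_from_reference` are all made up for the example. Only the validation method being a “5” comes from earlier in this post, and the scale is the illustrative one from the footnote.

```python
# Illustrative scale, as in the footnote of this post.
SCALE = [0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]

# Hypothetical catalogue: a few reference items per domain, each with a size
# the team has already agreed on. Entries and sizes are invented.
CATALOGUE = {
    "development": {
        "validation method with clever logic": 5,
        "rename a field on a form": 1,
    },
    "infrastructure": {
        "patch a Docker template": 2,
        "stand up a new environment": 5,
    },
}


def size_from_reference(domain, reference, steps=0):
    """Start from the closest catalogue item, then move up or down the scale.

    steps > 0 means "bigger than the reference", steps < 0 means "smaller".
    The result is clamped to the ends of the scale.
    """
    base = CATALOGUE[domain][reference]
    index = SCALE.index(base) + steps
    index = max(0, min(index, len(SCALE) - 1))
    return SCALE[index]


# "This is like the validation method, but one size bigger."
estimate = size_from_reference(
    "development", "validation method with clever logic", steps=1
)
```

The point of the design is that the comparison always happens inside one domain: a new infrastructure ticket is sized against an infrastructure reference, never against a development one.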

There is a price to pay for this additional freedom – your velocity figures become less relevant, as you can’t compare sprint against sprint as simply as before if the “mix of work” is changing. Your burnup charts may still look like they work, but scope changes are harder to visualise: some of your “scope changes” are likely to be technical debt that you’ve discovered, since you could be making platform changes with no change in business vision or scope. It also takes a lot longer to create a useful catalogue than it does to “just pick an average-looking story and call it a 5”.
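One way a team might keep an eye on the “mix of work” is to break velocity down by domain before comparing sprints. The sketch below is only an assumption about how that could look: the function names, the domain labels and the sample numbers are invented for illustration, not taken from a real team.

```python
from collections import defaultdict


def velocity_by_domain(completed):
    """Sum completed points per sprint and per domain.

    `completed` is an iterable of (sprint, domain, points) tuples.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for sprint, domain, points in completed:
        totals[sprint][domain] += points
    return {sprint: dict(domains) for sprint, domains in totals.items()}


def mix_of_work(sprint_totals):
    """Return each domain's share of a single sprint's total points."""
    total = sum(sprint_totals.values())
    if total == 0:
        return {}
    return {domain: points / total for domain, points in sprint_totals.items()}


# Invented sample data: two sprints with the same total, but a different mix.
sample = [
    (1, "development", 13),
    (1, "infrastructure", 3),
    (2, "development", 5),
    (2, "infrastructure", 11),
]
per_sprint = velocity_by_domain(sample)
mix_sprint_1 = mix_of_work(per_sprint[1])
mix_sprint_2 = mix_of_work(per_sprint[2])
```

In this made-up data both sprints total 16 points, so a plain velocity chart would look flat, while the per-domain split shows the work shifting from development to infrastructure.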

Teams that go through a learning process like this usually end up realising that there isn’t a simple textbook answer, and that their only viable option is to stay alert and keep an open mind.

 

(*) All using the scale 0,0.5,1,2,3,5,8,13,20,40,100 for illustrative purposes only
