November 2015
How random variation fucks about with your presence of mind

This is a table I knocked up in Excel (other spreadsheet programs are available) to show how even a little bit of random variation can make spotting underlying trends really difficult, particularly if we only look at selected little bits of the data.

The data could represent a whole range of narratives: tumour size in a cancer patient (e.g. in testimonials of the kind produced by the chap who instigated this story), body weight during a dieting programme, global temperature (on a rather different timescale!), or just about anything to do with sports or the stock market (as XKCD would attest).

Column 1 is time. I’ve called it “Days”. Frankly, depending on context, it could be anything: hours, months, years, decades.

Column 2 is the obvious trend with no random variation added. Our outcome (whatever it is) starts at 5 and goes up by 0.1 each day.

Column 3 is exactly the same but with some random noise added. Each data point is adjusted at random by anything up to 0.5 in either direction.

Column 4 is an edited version of Column 3 with some, er, rather optimistic comments added. They’re not unreasonable conclusions, based on those little data snippets.
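
If you'd rather not fire up a spreadsheet, here is a rough Python sketch of how Columns 1 to 3 could be built. It is not the original Excel workbook: the function name make_table, the 30-day length, numbering the days from 1, the fixed seed and the choice of uniformly distributed noise are all my assumptions for illustration. Only the start value of 5, the step of 0.1 and the maximum wobble of 0.5 come from the description above.

```python
import random


def make_table(n_days=30, start=5.0, step=0.1, max_noise=0.5, seed=0):
    """Build rows of (day, trend, noisy), loosely mirroring Columns 1 to 3.

    n_days, seed and the uniform noise distribution are assumptions;
    start, step and max_noise follow the description in the post.
    """
    rng = random.Random(seed)  # fixed seed so the table is repeatable
    rows = []
    for day in range(1, n_days + 1):
        # Column 2: starts at `start` on day 1 and rises by `step` per day
        trend = start + step * (day - 1)
        # Column 3: same value nudged by up to max_noise in either direction
        noisy = trend + rng.uniform(-max_noise, max_noise)
        rows.append((day, trend, noisy))
    return rows


if __name__ == "__main__":
    print("Day  Trend  Noisy")
    for day, trend, noisy in make_table():
        print(f"{day:>3}  {trend:5.2f}  {noisy:5.2f}")
```

Change the seed and you get a different wobble around exactly the same underlying line, which is rather the point.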

Moral: if we really want to find a particular pattern in noisy data, we’ll find it. It doesn’t mean we’ve found the signal rather than the noise, though.
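
To put a rough number on that, here is a small Python sketch that goes hunting for short stretches that appear to go down, even though the underlying trend only ever goes up. The helper falling_windows, the four-day window, the 30-day series, the seed and the uniform noise are all hypothetical choices of mine, not anything taken from the original table.

```python
import random


def falling_windows(values, width=4):
    """Return the start indices of every `width`-point window that ends lower than it began."""
    return [
        i
        for i in range(len(values) - width + 1)
        if values[i + width - 1] < values[i]
    ]


if __name__ == "__main__":
    rng = random.Random(1)  # assumed seed, just for repeatability
    # Same recipe as described above: start at 5, add 0.1 per day
    trend = [5.0 + 0.1 * day for day in range(30)]
    # Assumed uniform noise of up to 0.5 in either direction
    noisy = [value + rng.uniform(-0.5, 0.5) for value in trend]

    print("Falling 4-day windows in the pure trend:", len(falling_windows(trend)))
    print("Falling 4-day windows in the noisy data:", len(falling_windows(noisy)))
```

The pure trend can never produce a falling window. The noisy series usually offers a few, and anyone who quotes only those has a "decline" to point at.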

Update: Tables are a bit visually crap, so here’s the same data in a graph:

[Graph: Positive mental attitudes in graph form]


