Millions and billions

Spotted in this morning’s Metro:

"Nasa has contributed £250million to SpaceX's £760billion program."

This is misleading, because it looks like Nasa contributed this much (in blue):

What it looks like

where in fact they contributed this much (again, in blue):

What actually happened
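
For anyone who wants to check the gap between those two pictures, here is a quick sketch of the arithmetic. It takes the two figures exactly as printed in the quote; the variable names are mine, and the "misread" line simply assumes a reader treats both numbers as being in the same unit.

```python
# Figures exactly as printed in the Metro quote (in pounds)
nasa_contribution = 250_000_000        # £250million
program_total = 760_000_000_000        # £760billion

# Share of the program actually covered by the contribution
actual_share = nasa_contribution / program_total

# Share a skim-reader might assume if "million" and "billion" blur together
misread_share = 250 / 760

print(f"Actual share:  {actual_share:.3%}")
print(f"Misread share: {misread_share:.1%}")
print(f"The misreading overstates the share by a factor of {misread_share / actual_share:.0f}")
```

The factor of a thousand is the whole story: the digits look comparable, the quantities are not.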

I’m not the first one to get frustrated by this.


I can hazard cheezburger?

(alternate title: What’s a hazard ratio?)

Today I was faced with the not-entirely-straightforward task of explaining what a hazard ratio is. It’s a measure that pops up quite a lot in epidemiological statistics, when you’re conducting a study into survival times. It’s not immediately obvious, however, what it is, and when faced with the inevitable “so, what’s this hazard ratio all about then?” question, I struggled to find a good answer.

Wikipedia’s explanation is, frankly, rubbish (paraphrasing: “a hazard ratio is the ratio of the hazards”). Other attempts to explain it that I found online were similarly confusing. I found a very nice explanation online in the Pharmaceutical Statistics journal, which sadly falls down on two counts:

  • it’s still too long for “30-second elevator speech” purposes, and
  • it’s paywalled, so it’s useless to non-subscribers.

So, armed with this paper, I came up with the following even shorter explanation, which I now share with t’interwebz. I wrote it in the context of a “time-to-death” analysis, though survival analysis can be used to analyse times to almost any event you choose, not just mortality.

First of all, it would probably help to begin by explaining what a hazard is.

The hazard can be thought of as the instantaneous risk of dying at a given time point. This may vary over time, though how the hazard rises and falls over time is usually of secondary interest.

A survival analysis compares hazards between different groups of subjects in the study. One of the assumptions I made in this analysis is proportional hazards (you might encounter the phrase “assuming proportional hazards” in the epidemiological literature quite a bit). This means that we assume that although hazards may be different in different groups, and might for all we care rise/fall willy-nilly over time, when the hazard in one group changes over time, the hazard in all the other groups changes by the same proportion.

In other words, the ratio between the hazards remains constant over the course of the entire study, even though the hazards themselves change over time. It’s this ratio that is the hazard ratio, and it summarizes mortality differences along the whole length of time being analysed.
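
If it helps to see that in numbers, here is a minimal sketch. Everything in it is made up for illustration: the shape of the baseline hazard, the hazard ratio of 1.5, and the function names are my assumptions, not figures from any real study.

```python
import math

HAZARD_RATIO = 1.5  # assumed value, purely for illustration


def baseline_hazard(t):
    """Hypothetical hazard in the reference group: wobbles up and down over time."""
    return 0.02 + 0.01 * math.sin(t)


def exposed_hazard(t):
    """Hazard in the comparison group under the proportional hazards assumption."""
    return HAZARD_RATIO * baseline_hazard(t)


times = [0.0, 1.0, 2.5, 4.0, 6.0]
ratios = [exposed_hazard(t) / baseline_hazard(t) for t in times]

for t, ratio in zip(times, ratios):
    print(
        f"t={t:>4}: baseline={baseline_hazard(t):.4f}, "
        f"exposed={exposed_hazard(t):.4f}, ratio={ratio:.2f}"
    )
```

Both hazards rise and fall, but the ratio column never moves. In a real analysis the ratio is estimated from the data rather than fixed in advance.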

I think that gives you the gist of it anyway. Feel free to rehearse it, tweak it, and drop it into your talks and lectures.


Crisis of confidence

So Crisis, a UK-based charity supporting the homeless, brought out a report highlighting differences in mortality between the homeless and the UK population in general. The headlines that accompanied it made shocking reading. “Homeless people in the UK revealed to have life expectancy of just 47” from the Guardian is a typical example, highlighting the 30 year difference between that and the equivalent figure for the UK population. More on the media later.

Let me come clean and state my conclusions right from the off. There will be swearing.


It’s not the life expectancy of the homeless that’s 30 years lower than the UK average. It’s the average age at death. Now I know that that sounds like I’m splitting hairs over a trivial difference. Trust me, I’m not, as I will now explain.


How random variation fucks about with your presence of mind

we've all done it


Margins for error—let’s see more of them please!

Margins for error are an immensely important part of data analysis, yet are frequently ignored or misunderstood. When we make a guess, it’s impossible to be completely certain about it, so a guess is usually a “ballpark figure”. But similarly, making a guess isn’t usually an admission that we haven’t got the faintest idea and that we’re plucking a number at random. The question is not only “what’s the ballpark figure?” but also “how big is the ballpark?”.

Errors come in all forms. Possibly the simplest is rounding error. It’s tempting to see a figure of, say, £7.8million rounded to one decimal place and think “a-ha! this project cost exactly £7,800,000” but that isn’t quite the case. To write that the project cost £7.8million actually means that it cost somewhere between £7,750,000 and £7,849,999.99. That’s a whole range of £100,000 in which the exact cost could lie. Often rounding errors don’t lead to misleading figures by themselves, but add up a lot of them and the margins for error can soon mount up.
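
Here is a small sketch of that range. The function name is my own invention, and it glosses over what happens to a figure sitting exactly on the boundary, which depends on the rounding convention in use.

```python
from decimal import Decimal


def rounding_interval(reported, places):
    """Return (low, high) such that low <= true value < high rounds to `reported`.

    Ignores tie-breaking at the exact boundaries, which varies by convention.
    """
    half_step = Decimal(1).scaleb(-places) / 2   # 0.05 when places == 1
    value = Decimal(reported)
    return value - half_step, value + half_step


low, high = rounding_interval("7.8", 1)
width_in_pounds = int((high - low) * 1_000_000)

print(f"£{low}million <= true cost < £{high}million")
print(f"Width of the range: £{width_in_pounds:,}")
```

I’ve used `Decimal` rather than ordinary floats so that the boundaries come out as exactly 7.75 and 7.85.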

I say often they don’t mislead on their own, but last week there was a headline that did exactly that. “The UK economy shrank by more than previously thought during the last three months of 2010” reported the BBC, with similar stories in Reuters and other places. It turns out that this was due to a revision in GDP growth figures from -0.5% to -0.6% for the fourth quarter of 2010. But presented like that, we have very little idea what the actual extent of the revision was. It could have been from -0.54999% to -0.55001%, and would still have come up as a revision from -0.5% to -0.6%. In other words, there could have been hardly any movement at all. To be fair, by the same token it could also have been a revision of nearly 0.2 percentage points (from -0.45001% to -0.64999%, say), but the point still stands: a difference of 0.1 between two figures that are rounded to one decimal place could still mean (to all intents and purposes) no difference at all.
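
To see how little the rounded figures pin down, here is a rough sketch comparing a tiny and a large underlying revision. Both would be reported as a move from -0.5% to -0.6%. The unrounded values are hypothetical, chosen only to sit near the rounding boundaries. They are not the actual GDP estimates.

```python
def reported(value, places=1):
    """Round an underlying figure the way a headline would quote it."""
    return round(value, places)


# Hypothetical (before, after) pairs of unrounded growth figures, in percent
scenarios = {
    "tiny revision": (-0.54999, -0.55001),
    "large revision": (-0.45001, -0.64999),
}

for label, (before, after) in scenarios.items():
    actual_change = round(before - after, 5)
    print(
        f"{label}: reported {reported(before)}% -> {reported(after)}%, "
        f"actual change {actual_change} percentage points"
    )
```

The two scenarios produce identical headlines even though the underlying change differs by a factor of several thousand.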

Other sorts of error exist. Another one that applies to the GDP revision figures is standard error: given that we’re guessing the value of a figure based on a sample, what sort of margin for error do we expect from that? Was GDP shrinkage of 0.6% within the margin for error of the first guess? Without any information on this, it’s impossible to tell.

Another example of where knowing about a margin for error would be terribly useful but is frequently omitted as though it doesn’t matter is in the use of technology in sports officiating. A prime example arose in England’s World Cup cricket match with India earlier this week. England batsman Ian Bell was given not out despite the prediction software predicting that the ball which struck his pad would have gone on to hit the stumps. The reason for this was that the ball struck him more than 2.5 metres away from the stumps, and therefore was deemed “too far away” for the accuracy of the system to be trusted.

This rule, to a non-statistician, looks utterly ridiculous. In fact, to me, as both a statistician and an occasional sports referee, this also looks ridiculous but for different reasons. It’s all to do with margins for error.

The Hawkeye system (and similar technologies) works on a basic statistical principle. I couldn’t comment on the amazing technological wizardry they use to collect the data, but essentially, they collect lots of data on where the ball was at a lot of moments in time, judge when the ball hit an obstruction (in this case, Ian Bell’s leg-pad) and use that data to predict where the ball would have gone had it not hit that obstruction. There are an awful lot of moments where errors can creep into the analysis—and they will creep in. That part isn’t in question. The question is how big or small the cumulative effect of all those errors actually is. Are we talking micrometres or centimetres?

Firstly, errors can creep in during the data collection process. This process takes place between the point at which the ball lands on the pitch and the point at which it hits the pad. This is the data that is subsequently used to predict the path of the ball. There will be some sort of margin for error for each time the tracking device detects where the ball is; the more times the tracker is able to do this, the more these errors will be smoothed out. In fact, there is a second rule governing when the results from the software may be called into question: they are not trusted if the distance between where the ball hits the pitch and where it collides with the pad is less than 40cm.

Secondly, the software has to know when to stop collecting data and start predicting. In other words, it has to be able to determine accurately when the ball collided with the batsman’s pad.

Finally, the software then has to use the data it has collected to churn out more data as to what the flight of the ball would have looked like had someone’s leg not got in the way. This means that the software would have had to apply some kind of function (I’m not a physicist, so I have no idea what that would be) to the data collected in the first stage, in order to get the predicted flight of the ball, and decide whether it would have gone on to hit the stumps. Errors may creep in here, as the function used will only be an approximation of what would have happened. Furthermore, any errors that crept in during the first two stages will be exacerbated: if the margin for error was 1mm based on the data alone then this margin will have crept up to several millimetres by the time the ball’s predicted flight gets to the stumps. This may not seem much, but given that the ball is only about 70mm wide, that’s a reasonable amount of doubt.
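
To get a feel for how a small tracking error grows, here is a deliberately crude sketch. It is emphatically not how Hawkeye works: I’m assuming the ball travels in a straight line, that its direction is judged from just two tracked positions, and that those two positions are wrong in opposite directions (the worst case). The function name and all the numbers are made up for illustration.

```python
def worst_case_error_mm(position_error_mm, tracked_mm, extrapolated_mm):
    """Worst-case sideways error after extrapolating a straight line.

    Toy model only: two tracked points `tracked_mm` apart, each off by up to
    `position_error_mm` in opposite directions, projected a further
    `extrapolated_mm` beyond the second point.
    """
    slope_error = 2 * position_error_mm / tracked_mm
    return position_error_mm + slope_error * extrapolated_mm


BALL_WIDTH_MM = 70  # approximate figure quoted above

# (tracked distance, extrapolated distance) in millimetres
cases = [(1000, 1000), (1000, 2500), (400, 2500)]

for tracked, extrapolated in cases:
    error = worst_case_error_mm(1.0, tracked, extrapolated)
    print(
        f"tracked {tracked}mm, predicted {extrapolated}mm ahead: "
        f"up to {error:.1f}mm ({error / BALL_WIDTH_MM:.0%} of a ball width)"
    )
```

Even in this toy version the pattern is clear: the less of the ball’s flight you track and the further ahead you predict, the bigger the margin. A real system tracking many positions would smooth much of this out, which is precisely why we need to be told the actual margin rather than guess at it.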

So it seems as though the rules-makers have brought in these “40cm” and “2.5m” limits to try and account for margins-for-error. This is a case of right idea, wrong way of achieving it. In Ian Bell’s case, the predicted flight of the ball would have hit almost the dead centre of the wicket. Are we to assume that there is more doubt in this case than if the ball had struck him 2.4m away from the wicket and been predicted to simply graze the edge of the wicket?

The trouble is, without actually knowing the extent of the margin-for-error, there’s very little the rules-makers can sensibly do to account for it.

So anyway, back to journalism. Statistics, particularly when they’re based on guesses, need to have some kind of margin for error associated with them. It doesn’t even need to be that technical: just some indication of how far a figure can be trusted. As it stands, single figures might be complete guesses, subject to very rough rounding, or actually completely robust, and we as readers are left wondering which are which.


Academic bloat

As I’m coming up to the end of my PhD and the time where I finally submit that tome I’ve been writing (hopefully only three months to go!), I’m currently having to look around the job market for things to do afterwards. Yes, I’m looking for a job.

Generally this isn’t a good time to be looking for jobs, I’m told, but fortunately as a statistician looking for a research post in a university, things aren’t too bad. Except the motivation to work in a university gets utterly destroyed by today’s little gem I discovered on the academic jobs website, jobs.ac.uk.


Arguing about religion is difficult.

Stupidly, I got myself involved in a Guardian Comment is Free thread, on an article about how the existence of beetles should give creationists pause for thought. Articles like this bring about in me a very mixed reaction. On the one hand, the author is quite right: the evolution of the beetle should give creationists pause for thought.

On the other hand though, whether it should or not is immaterial, given that it almost certainly won’t do so.


A battle of red herrings: let’s move on from over-population

Tomorrow, at the Upper Gulbenkian Gallery at the Royal College of Art, the 2010 “Battle of Ideas” debate will be taking place, on the subject of “overpopulation”. The theme of the debate will be “The great population debate: too many carbon footprints?” and the speakers will be Roger Martin of the Optimum Population Trust and Brendan O’Neill, editor of Spiked magazine. The debate promises to be one where one speaker presents a doom and gloom scenario (à la Thomas Malthus) about how dreadfully over-populated the planet is, based on some back-of-an-envelope ecological footprint calculations. The other speaker will then claim that the first is scaremongering: we’ve heard these predictions of disaster before, they’ve never come to pass, so we shouldn’t be worried, and population control is coercive and unethical.



“More research is needed.” No it isn’t.

A couple of months ago, at around 11:30pm one evening, I had an idea for a journal article. The next morning, I started writing it, and by 11:30pm that evening, working non-stop, I’d finished. The article was in the form of a systematic review: a type of study that usually takes months of painstaking study of a multitude of medical databases. This one was somewhat less grand in scale: it took a mere 24 hours from inception to first draft. I then told the delightfully helpful Adam Jacobs from Dianthus Medical Ltd about it on Twitter, and he agreed to take a look at it. Less than a week later, following his suggestions, I had a manuscript ready to be submitted for publication in a scientific journal.

The reason for this particular review taking so little time to complete was that instead of painstakingly poring through large numbers of scientific articles in the analysis, my analysis took place on a grand total of zero articles. The reason for this was that in my analysis, I searched for literature on an utterly implausible intervention for treating a completely fictional disease, whose mechanism for working was based on an utterly ridiculous and made-up theory with absolutely no basis in reality.

My guess is that this is also the reason why my manuscript was rejected, which, I guess, is fair enough. But oddly enough, systematic reviews of this type do exist, and do get published in medical journals. Mostly, to be fair, the ailments involved in these studies are real. But the interventions and the underlying theories that describe how they are supposed to work are, well, real in the sense that some people have been known to try them, but based on little more than an odd combination of folklore and belligerence. I am, of course, talking about systematic reviews of complementary and alternative medical products and services.

Edzard Ernst, from the Peninsula School at Exeter University, is involved in the writing of many of these. In fact it was through a message on his Twitter account drawing attention to one such review that I got the idea to write this article. In conducting these reviews, he and his colleagues are doing a fantastic job in synthesising the scientific literature on a whole range of products and ailments. Occasionally, for very particular pairings of intervention and ailment (such as osteopathy and lower back pain, or St. John’s Wort and depression), they find that the evidence supports the claims. More often than not, however, the evidence does not back up the claims made—even if individual studies, when cherry-picked out of the body of literature, appear to be supportive.

Overall though, there is one criticism I have of systematic reviews of CAM treatments (and this is what I aimed to highlight with this article): it’s that they tend to say that “more research is needed” when faced with negative evidence, when in fact, the reverse is true. This problem is usually highlighted in the introductory paragraphs which describe the disease, the treatment, and how it works in theory. This last bit is the crucial part, since very little attention is paid to it in the writing of the review. This strikes me as odd: anyone could just make a theory up out of thin air, perform a systematic review, find no clinical trials investigating this made up theory, and conclude that more clinical trials are needed. This is what I’ve done in this article.

I argue that this is also exactly what has been done in some of these systematic reviews, and that this approach risks conferring a sense of legitimacy (via a “the jury is still out” message) onto products and services that really don’t deserve it.

Enough of the introductions, though. Here is my article, in self-published form. Thoughts and comments welcome.


Speed cameras don’t cause road casualties. Tell your friends.

Updated 23 July 2010.

Last week, the Taxpayers’ Alliance and the Drivers’ Alliance brought out a statistical report claiming that the introduction of speed cameras had failed to reduce the number of road accidents. The press releases for the report came with a number of choice quotes about how speed cameras were nothing more than money-making conspiracies and how this analysis proved that speed was not the main cause of road accidents.

The report was again featured on the BBC One O’clock News this lunchtime on a story about Swindon council’s decision to axe speed cameras in the town, in the context of budget cuts in local government. Quite how this made it into today’s news is anyone’s guess, given that Swindon council made that decision two years ago. [Update – the news item is now up on the BBC website.]

The TPA's statistical report on the BBC

Anyway, what leapt out at me was a shot of a local resident [update: actually, on seeing the report again, it turns out to be the reporter] being shown a graph that claimed to show that the introduction of speed cameras had actually slowed the rate of decrease in the number of traffic accidents in the UK.
