Power Point

Guernsey McPearson

Panic at Pannostrum. It seems that Sir Lancelot Pastit's much-vaunted pipeline is but a pipedream. We have products failing left right and centre and the fact that we have had to pull Redybrex from the market has sent the share price into freefall. Pannostrum Investor Supporter Section (a group in our marketing department) together with Public Outreach Operations and Relations (another such group; the two always seem to go together) have been working overtime trying to play down the damage and talk up our prospects. Personally, I think the most positive contribution we could get these people to make to the company would be to have them resign and save not only their salaries but also the mayhem they cause. What we need to do is find new drugs that work, not find new things to say about drugs that don't.

However, imagine my surprise, when summoned to a meeting with the marketing groups in question to discuss what to do with our latest failure (the CLOT trial of Thrombgon), to find that the talk was of statistics. And I don't mean statistics of sales we might make if only the latest hoped-for blockbuster would work. I mean statistics of clinical trials and not just simple stuff like means and medians but, of all things, power.

Now I occasionally get called upon to give in-house courses in which I am supposed to explain statistics to the numerically challenged, which is just about everybody who works for Pannostrum with the exception of some members of the statistics department. We have a go at explaining P-values, confidence limits, that sort of thing, but the one they have most difficulty with is power. This is hardly surprising. Some of the stats department seem to have some problems with it too. There was a time for example, when every single report we got from our office in Medicine Springs would include, for any non-significant result, a retrospective power calculation informing us what the probability was that we would have failed to find a difference between treatments if the difference was exactly that which caused us to fail to find it. This produced an extraordinary sequence of reports regarding failed trials in which it turned out that the power had always been less (often much less) than 50%. In fact, of course, 50% or indeed any percent is a gross overstatement. Fed up with this idiocy, I actually proposed the following law, which I modestly entitled McPearson's Law of Power. The probability of success for any trial for which a retrospective power calculation has been calculated is zero. Think about it and you will see that it is true.

So anyway, to return to the meeting. This was the usual weekly marketing strategy powwow, with Thrombgon item number 3 on the agenda. I was not required for the whole meeting but required to be on call to turn up when they were ready. As luck would have it the Redybrex fiasco, which was item 2, dragged on and on. I was scheduled for 15:00. By 16:00 I thought that I was pretty safe and that they had abandoned the Thrombgon issue. I had wandered down the corridor for a cup of coffee and was having a very pleasant chat with the charming young lady who has just been appointed to our programming support section when our departmental secretary appeared, looking rather flustered, bearing the news that my presence was requested urgently.

I pride myself on my punctuality, so it put me very much on the back foot to find a room full of suits waiting with some irritation for my arrival. Rod 'Blast' Furnace (rumour has it that the sobriquet has as much to do with his consumption of 'smokeless fuel' as it does with his surname), a member of Public Outreach Operations and Relations, seemed to be in the middle of a presentation, since he was standing by the screen.

It also didn't help my sang-froid that as I arrived Dr Angina Cutter (see SPIN passim), the project leader, was sitting rather cosily next to Clive Viper, a member of Pannostrum Investor Supporter Section, and that they were talking not only about P-values, about which that group, in my opinion, is naturally qualified to talk, but also about power, on the basis of a slide that Furnace was projecting.

At least, I thought it was a slide when I first saw it; such is the power of prior prejudice. I should have been warned, however, by the fact that there were no bullet points, no graphs divided using a vertical and a horizontal line into four regions, no bullshit bingo phrases (pushing the envelope, thinking outside the box etc.) and no graphs with misleading axes, just numbers. Imagine my horror when I realised that I was looking at a live projection of a calculation using N-Power, a nice piece of software but, unfortunately, so easy to use that any idiot can calculate a power with it and frequently does. On this occasion, the idiot in question was Furnace.

On seeing me enter, Angina gave me the benefit of one of her sweetest smiles. "Ah, Guernsey," she trilled, "so glad you could join us. Rod here has been giving us the most fascinating analysis of the CLOT results."

An aggressive and unpleasant voice cut in. "Yes. Your power is too low, McPearson. It's only just over 20%." This was Viper speaking, a real snake in the grass if ever there was one, although with the sort of lifestyle he appeared to aspire to it wouldn't surprise me if there was often a lot of grass in this snake. I turned to the screen, which was projecting a table that looked something like this.

Significance level 0.05
1 or 2 sided test 2
Control proportion 0.08
Test proportion 0.07
Power ( % ) 22
n per group 2000

"And where exactly did you get these from?" I said. "Oh," said Angina, "this is a wonderful idea of Clive's," and she turned to gaze fondly at him. "And mine," added Furnace, with some irritation. "Oh yes, of course. They came up with it together. It's awfully clever. I can't think why we've never used it. They just put the figures from CLOT into the power software to see what the power is. You see, it's too low. The reason the trial failed is that the power is too low."

"Most interesting", I said. "You used the sample size and the observed proportions and then calculated the power. These are the data," I consulted my notes, "that gave us a P-value of 0.25. There were 140 DVTs in the Thrombgon group and 160 in the placebo group. Am I right?"
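For readers who want to check the arithmetic on the screen, here is a minimal sketch in Python. It is not how N-Power or P-Precise work internally (the column does not say); it assumes a normal approximation for the power and a continuity-corrected z-test as a stand-in for the exact test, and the function names are invented for illustration. Under those assumptions it lands close to the 22% and 0.25 quoted above.

```python
from math import erf, sqrt

Z_CRIT = 1.959964  # two-sided critical value for a 5% significance level


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def approx_power(p_control, p_test, n_per_group):
    """Two-sided power for comparing two proportions (normal approximation).

    Assumption: equal group sizes, pooled variance under the null.
    """
    diff = abs(p_control - p_test)
    p_bar = (p_control + p_test) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p_control * (1 - p_control) + p_test * (1 - p_test)) / n_per_group)
    upper_tail = normal_cdf((diff - Z_CRIT * se_null) / se_alt)
    lower_tail = normal_cdf((-diff - Z_CRIT * se_null) / se_alt)
    return upper_tail + lower_tail


def approx_p_value(events_control, events_test, n_per_group):
    """Two-sided P-value from a continuity-corrected z-test.

    Assumption: used here only as a rough stand-in for an exact test.
    """
    diff = abs(events_control - events_test) / n_per_group
    p_bar = (events_control + events_test) / (2 * n_per_group)
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    z = max(diff - 1 / n_per_group, 0.0) / se_null  # Yates-style correction
    return 2 * (1 - normal_cdf(z))


# Observed CLOT figures: 160 DVTs on placebo, 140 on Thrombgon, 2000 per group.
retrospective_power = approx_power(160 / 2000, 140 / 2000, 2000)
clot_p_value = approx_p_value(160, 140, 2000)
print(f"retrospective power: {retrospective_power:.1%}")
print(f"approximate P-value: {clot_p_value:.2f}")
```

Note that the "power" here is computed by plugging the observed proportions back in as if they were the true ones, which is exactly the step the rest of the meeting turns on.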

"Absolutely right," said Viper. "You screwed up McPearson. The power is too low. That's the reason the trial failed."

"Let me understand this," I replied, "you would accept the negative result if only the power were higher?"

"Yes, but the power's too low. The trial is useless."

"So your position is that the higher the power, the more inclined you are to believe the negative result."

"That's right", said Furnace. "Indubitably," said Viper. "But surely that's reasonable," said Cutter.

Speaking of power, I had powered up my laptop by now and had been playing around with some figures. "Well let's see," I said, "what happens to your nice little calculation if we keep all the parameters the same except the Thrombgon proportion and make that 0.064."

"Easy peasy," said Furnace, and produced the following table, which he projected on the screen.

Significance level 0.05
1 or 2 sided test 2
Control proportion 0.08
Test proportion 0.064
Power ( % ) 49
n per group 2000

"Very interesting," I said, "let me calculate what the P-value would be with the corresponding figures of 160 DVTs under placebo and 128 under Thrombgon using this handy software I have here." I was referring to that well-known program for calculating significance for exact tests, P-Precise. "Well, fancy that," I said, "it seems that the P-value is 0.058. I believe that this is what Dr Cutter would describe as 'a trend towards significance'."

"So?" said Viper. "Your point is?" said Furnace. "I do hope that you're not being negative," said Angina. "I think that these power calculations are very helpful."

"Well," I said, "let me summarise. The smaller the P-value, the more credence we give to the possibility that the treatments are not equal. This habit is certainly not without its critics but I can't ever recall any of the wonderful medical scientists we have working for Pannostrum, nor any of the inventive market," here I paused, at somewhat of a loss as to what to say next, "ears," I added, "having claimed the contrary. On the other hand, I am led to believe that if a trial is negative, you are more inclined to believe the result if the retrospective power is high. However, there seems to be a contradiction if higher power means equivalence, since the case with the lower P-value has the higher power." (Note from the Editor: McPearson's argument here is strangely reminiscent of a fine paper in The American Statistician (reference 1).)
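The contradiction can be seen by running the three Thrombgon scenarios from the meeting side by side. The sketch below is an illustration only: it assumes a normal approximation for power and a continuity-corrected z-test in place of the exact test, so its figures sit near, not exactly on, those quoted in the story, and the helper name is made up for the purpose.

```python
from math import erf, sqrt

Z_CRIT = 1.959964  # two-sided critical value for a 5% significance level
N_PER_GROUP = 2000
PLACEBO_EVENTS = 160


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def p_value_and_observed_power(events_control, events_test, n_per_group):
    """Return (approximate two-sided P-value, retrospective power).

    Assumptions: equal group sizes, normal approximation, and the observed
    proportions treated as if they were the true ones for the power step.
    """
    p_c = events_control / n_per_group
    p_t = events_test / n_per_group
    diff = abs(p_c - p_t)
    p_bar = (p_c + p_t) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p_c * (1 - p_c) + p_t * (1 - p_t)) / n_per_group)
    z = max(diff - 1 / n_per_group, 0.0) / se_null  # continuity-corrected
    p_value = 2 * (1 - normal_cdf(z))
    power = (normal_cdf((diff - Z_CRIT * se_null) / se_alt)
             + normal_cdf((-diff - Z_CRIT * se_null) / se_alt))
    return p_value, power


# Thrombgon event counts matching proportions 0.07, 0.064 and 0.057.
thrombgon_events = (140, 128, 114)
results = [p_value_and_observed_power(PLACEBO_EVENTS, e, N_PER_GROUP)
           for e in thrombgon_events]

for events, (p_value, power) in zip(thrombgon_events, results):
    print(f"{events} DVTs: P = {p_value:.3f}, retrospective power = {power:.0%}")
```

As the Thrombgon count falls, the P-value shrinks and the retrospective power rises in lockstep: the second number is just a relabelling of the first, so it cannot supply independent support for "no difference".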

"So what?" said Viper. "It's non-significant, see. It's amongst the trials that are not significant that you have to compare power."

"So what sort of retrospective power do you find acceptable?"

"Fancy asking us that!" crowed Furnace. "Yes," added Viper, "aren't you always banging on about how we need 80% power?"

"So would you please type in a value of 0.057 for the Thrombgon group?"

"Where did that come from?" said Furnace, typing in the figure and obtaining the answer 80%. "Yes where?" added Viper. "Gosh that sounds awfully familiar," said Angina.

"It should," I added. "It's the value you had me write in the protocol after lengthy discussion with the Marketing Department. It just so happens that the placebo rate is just as we anticipated but unfortunately the Thrombgon rate is not. However, if the observed proportion in the Thrombgon group had been equal to 0.057 the P-value would actually have been 0.005. In fact, you can't have a retrospective one-sided power of greater than 50% if the result is not significant."
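The 50% ceiling follows in two lines under a normal approximation. This is a sketch with notation introduced here, not taken from the column: z_obs is the observed test statistic, z_{1-alpha} the one-sided critical value, Phi the standard normal distribution function, and the same standard error is assumed for the test and for the power calculation.

```latex
% Retrospective power: the observed effect is treated as the true effect.
\[
\text{retrospective power} \approx \Phi\left(z_{\text{obs}} - z_{1-\alpha}\right)
\]
% A non-significant result means the statistic fell short of the critical value.
\[
z_{\text{obs}} < z_{1-\alpha}
\;\Longrightarrow\;
\Phi\left(z_{\text{obs}} - z_{1-\alpha}\right) < \Phi(0) = 0.5
\]
```

So a non-significant trial always reports a retrospective power below one half, whatever the sample size, which is why the reports from Medicine Springs never found anything else.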

"So what's your explanation for the CLOT trial?" said Viper, "Yes, what?" added Furnace. "It is rather baffling," said Angina.

"No," I replied, "it's actually very simple and not particularly surprising. The drug is quite possibly acronymical."

"Acronymical???"

"Yes," I replied, "acronymical in the sense that it is clearly perfectly suited to the departments of Pannostrum Investor Supporter Section and Public Outreach Operations and Relations. Or, to put it another way. Why is Thrombgon like Pannostrum Marketing?"

"Why?" they all said together.

I summoned up every last drop of scorn I could muster from my not inconsiderable reserves and said, "because it hardly works at all."

Reference

1. Hoenig JM, Heisey DM. The abuse of power: The pervasive fallacy of power calculations for data analysis. American Statistician 2001;55(1):19-24.
