Data for various papers written by Stephen Senn


These data are provided free. No guarantee is provided for their correctness and no liability is accepted for their use.

For the moment I am too lazy to add much by way of description. You will have to read the papers referenced.

Notes regarding sources for some of the datasets

The datasets Barbacki, Cushny and Wiebe are available elsewhere in the literature. See
Barbacki, S, Fisher, RA. A test of the supposed precision of systematic arrangements, Annals of Eugenics 1936; 7: 189-193.
Cushny, AR, Peebles, AR. The action of optical isomers. II. Hyoscines, Journal of Physiology 1905; 32: 501-510.
Wiebe, GA. Variation and Correlation in Grain Yield among 1,500 Wheat Nursery Plots, Journal of Agricultural Research 1935; 50: 331-357.
The other data are probably obtainable only from my papers.

Acknowledging the data

If you use these data, please acknowledge their source by citing the original publication. Thank you.

The data

Name | Paper | File
Accidents LORASS | Senn, SJ, Collie, G (1988) Accident blackspots and the bivariate negative binomial, Road Traffic Engineering and Control 29: 168-169. | Accident
Atkins and Fisher | Senn, SJ (2006) An early "Atkins' Diet": RA Fisher analyses a medical "experiment", Biometrical Journal 48: 193-204. | Atkins
Barbacki and Fisher | Senn, SJ (2004) Added Values: Controversies concerning randomization and additivity in clinical trials, Statistics in Medicine 23: 3729-3753. | Barbacki
Birth data from a GP's visiting diary 1916 | Senn, SJ (1996) A general practitioner's obstetric diary, Student 1: 205-210 and also Senn, SJ (1979) A sixty year old medical record, Medical Record and Health Care Information Journal 20: 528-531. | Birth
Cushny and Peebles | Senn, SJ, Richardson, W (1994) The first t-test, Statistics in Medicine 13: 785-803. | Cushny
GS20 | Senn, SJ (2002) Cross-over Trials in Clinical Research (Second ed.), Chichester: Wiley. | GS20
Network meta-analysis in diabetes | Senn, SJ, Gavini, F, Magrez, D, Scheen, A (2013) Issues in performing a network meta-analysis, Statistical Methods in Medical Research 22: 169-189. | Network
The Shumaker and Metzler Phenytoin data | Senn, SJ (2001) Individual Therapy: New Dawn or False Dawn, Drug Information Journal 35: 1479-1494. | Phenytoin
Russia | Senn, SJ (2002) Cross-over Trials in Clinical Research (Second ed.), Chichester: Wiley. | Russia
Incomplete blocks cross-over | Senn, SJ, Lillienthal, J, Patalano, F, Till, MD (1997) An incomplete blocks cross-over in asthma: a case study in collaboration, in Cross-over Clinical Trials, Vollmar, J, and Hothorn, LA, Eds., Fischer, Stuttgart, pp. 3-26. | Selipati
Tunbridge Wells Life Tables | Senn, SJ (2003) Dicing with Death, Cambridge University Press: Cambridge. | T_Wells
Wiebe | Senn, SJ (2004) Added Values: Controversies concerning randomization and additivity in clinical trials, Statistics in Medicine 23: 3729-3753. | Wiebe
18 Trials | Bakhshi, A, Senn, S, Phillips, A (2013) Some issues in predicting patient recruitment in multi-centre clinical trials, Statistics in Medicine 32(30): 5458-5468. | 18 Trials
Sunscreen | Senn, S, Schmitz, S, Schritz, A, Salah, S (2018) Random main effect of treatment: a case study with a network meta-analysis, Biometrical Journal (submitted). | Sunscreen data

Notes on the data sets

Accident Bivariate frequency. Road sites in Lothian Region are cross-classified by the number of accidents in each of two two-year periods: 1979/1980 and 1981/1982. The rows refer to 1979/1980 and the columns to 1981/1982, and the entries in the table are the number of sites with the given numbers of accidents in the two periods. Thus the cell corresponding to row 2, column 4 has an entry of 19, meaning that there were 19 sites which suffered 2 accidents in 1979/1980 and 4 in 1981/1982. Overall the data are for 3112 sites, which had 3036 accidents in 1979/1980 and 3001 accidents in 1981/1982.
Alternatively, the second sheet in the workbook has the data in terms of three columns of length 3112. The first column is an (essentially irrelevant and arbitrary) site identifier running from 1 to 3112. The second column gives the number of accidents at that site in 1979/1980 and the third gives the number of accidents at that site in 1981/1982.
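
The two layouts carry the same information. As a minimal sketch, assuming the three-column sheet has already been read into a list of (site, accidents in 1979/1980, accidents in 1981/1982) tuples, the cross-classified table can be rebuilt by counting pairs. The function name and the four toy records are made up for illustration and are not taken from the workbook.

```python
from collections import Counter

def cross_classify(records):
    """Count sites for each (first-period, second-period) accident pair."""
    counts = Counter((first, second) for _site, first, second in records)
    n_rows = max(first for first, _ in counts) + 1
    n_cols = max(second for _, second in counts) + 1
    # table[i][j] = number of sites with i accidents in period 1 and j in period 2
    return [[counts.get((i, j), 0) for j in range(n_cols)] for i in range(n_rows)]

# Made-up records: (site id, accidents 1979/1980, accidents 1981/1982)
toy_records = [(1, 0, 0), (2, 0, 1), (3, 2, 1), (4, 0, 0)]
toy_table = cross_classify(toy_records)
```

Rows and columns here are indexed by accident count starting from zero; the site identifier is ignored, in line with its being arbitrary.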

Atkins Data on time to vitamin C saturation for 7 groups of men, reproduced from letter 3, Atkins to Fisher, 3 December 1943. There are seven companies: J, K and L in one location and M, N, P and Q in another. P and Q are the groups A and B from Atkins and Fisher (Atkins & Fisher, 1944), and M and N together form the infantry unit otherwise referred to. A copy of the page of the letter giving the data can be found here: Atkins letter

Barbacki These are the data that Barbacki and Fisher created based on the Wiebe data. The figures highlighted in bold are the ones where I disagree with Barbacki and Fisher's calculations. See my paper 'Added Values' for an explanation.

Birth Extracts from a general practitioner's visiting diary for 1916. Variables are date engaged, date birth expected, date attended, sex of child, order of birth and fee charged in shillings. Based on street names, the GP was probably based in Harrow. A copy of a page can be found here: GP Diary

Cushny and Peebles data Data for 11 inmates at the insane asylum at Kalamazoo. Three treatments (two optical isomers of the same drug, and another drug) are compared to no treatment. The data give the average number of hours of sleep as well as the number of nights on which the means are based.

GS20 The data are peak expiratory flow (PEF) measured in litres per minute over 12 hours for a number of children on two occasions in a cross-over comparing salbutamol to formoterol. The data appear in my book Cross-over Trials in Clinical Research but the results of the trial were presented in Graff-Lonnevig, V., and Browaldh, L. (1990), "Twelve Hours Bronchodilating Effect of Inhaled Formoterol in Children with Asthma: A Double-Blind Cross-over Study Versus Salbutamol," Clinical and Experimental Allergy, 20, 429-432.

The data are presented as an Excel workbook with a number of sheets. The sheet PEF gives the data in the form of repeated measures. PEF1 1 to PEF1 16 are the values recorded on 16 occasions on day 1 and PEF2 1 to PEF2 16 are the corresponding values for day 2. Sequence gives the order in which the treatments formoterol and salbutamol were given. The sheet Dates gives information on the dates patients were seen and the sheet Times gives the planned elapsed times after the start of each day at which measurements were taken. Other sheets are derived values.
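
For analysis it is often handier to have one record per measurement. A rough sketch, assuming one row of the PEF sheet has been read into a dictionary keyed by column names spelled exactly as above ("PEF1 1", ..., "PEF2 16"); the actual spelling in the workbook may differ, and the values in the toy row are invented.

```python
def to_long(row, n_occasions=16):
    """Turn one wide row into (day, occasion, value) records."""
    records = []
    for day in (1, 2):
        for occasion in range(1, n_occasions + 1):
            records.append((day, occasion, row[f"PEF{day} {occasion}"]))
    return records

# Invented values for a single hypothetical patient
toy_row = {f"PEF{d} {k}": 200 + 10 * d + k for d in (1, 2) for k in range(1, 17)}
toy_long = to_long(toy_row)
```

Which treatment each day corresponds to still has to be taken from the Sequence column.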

Network These data are for a series of trials in diabetes. "The objective of the network meta-analysis in Diabetes considered as example was to estimate the relative effects on HbA1c change, of adding different oral glucose-lowering agents to a baseline sulfonylurea therapy in patients with type 2 diabetes." Each row of the data corresponds to an arm of a trial. There were 25 two-armed studies and one three-armed study, thus there are 53 arms in total. The data were also incorporated in contrast form in the R package netmeta: Rücker, G, Schwarzer, G, Krahn, U, König, I. netmeta: Network Meta-Analysis using Frequentist Methods. R package. 2017. Accessed 11.11.2017. The data are also presented with the treatment definitions expanded to include dose. Thus four different sheets, corresponding to the combinations of original treatments/expanded treatments and means per arm/contrasts, are provided. There is also a brief "Read Me" statement.
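
To illustrate the relation between the arm-level and contrast forms, here is a minimal sketch. It assumes each arm is summarised by a mean, a standard deviation and a sample size; the field names "mean", "sd" and "n" and the toy numbers are hypothetical, and the sheets may instead store standard errors directly.

```python
import math

# Arm count implied by the description: 25 two-armed studies plus one three-armed study
n_arms = 25 * 2 + 1 * 3

def contrast(arm_a, arm_b):
    """Difference in means (a minus b) and its standard error for two arms of one study."""
    diff = arm_a["mean"] - arm_b["mean"]
    se = math.sqrt(arm_a["sd"] ** 2 / arm_a["n"] + arm_b["sd"] ** 2 / arm_b["n"])
    return diff, se

# Invented arm summaries, not taken from the data
toy_active = {"mean": -1.0, "sd": 1.0, "n": 100}
toy_control = {"mean": -0.5, "sd": 1.0, "n": 100}
toy_diff, toy_se = contrast(toy_active, toy_control)
```

For the three-armed study the pairwise contrasts share an arm and are therefore correlated, which this two-arm sketch does not handle.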

Phenytoin Data are area under the curve (AUC) and concentration maximum (Cmax) values for two formulations of Phenytoin compared in four periods. These data appeared originally in Shumaker, R. C., and Metzler, C. M. (1998), "The Phenytoin Trial Is a Case Study of 'Individual Bioequivalence'," Drug Information Journal, 32, 1063-1072.

Russia Data are values of forced expiratory volume in one second taken after an exercise test for a three period cross-over trial comparing formoterol, placebo and salbutamol. The results of the trial were presented in Tsoy, A. N., Cheltzov, O. V., Zaseyeva, V., Shilinish, L. A., and Yashina, L. A. (1990) European Respiratory Journal, 3, 235s but the data are given in Cross-over Trials in Clinical Research.

Tunbridge Wells Life table by five-year age groups for each sex for 1971. This was an exercise I undertook for my first job. I was Medical Information Officer for the Tunbridge Wells Health District 1975-1978. The life table is constructed from deaths in 1970, 1971 and 1972, weighted to produce a midpoint corresponding to the census of 25/26 April 1971. Some of these data have been used as illustrative material in my book Dicing with Death.

SELIPATI: Incomplete blocks cross-over Data are log area under the curve for forced expiratory volume in one second (FEV1) collected over 12 hours for an incomplete blocks cross-over comparing three doses of two formulations of formoterol to each other and placebo (seven treatments in all). Patients were given five of the seven treatments. Variables are Patient = patient number, Period (1, 2, 3, 4 or 5), Treatment (ISF or MTA formulation at 6, 12 or 24 µg, or placebo), Baseline = log(FEV1) and AUC = log(AUC(FEV1)). See this PowerPoint presentation for an introduction.

Wiebe The data are from a uniformity trial published originally by Gustav Wiebe and re-analysed by Barbacki and Fisher. The data presented here are the original figures by Wiebe and give yield in grams of Federation wheat. The field was sown in 125 rows arranged in 12 series. These are shown as 12 columns and 125 rows in the Excel spreadsheet. For a detailed description please see my paper in Statistics in Medicine.

18 Trials Data showing patients recruited for 18 trials. Each row corresponds to a centre. The first column gives the trial the centre was involved in, the second column gives the number of patients recruited and the third column gives the number of days over which recruitment took place. There were 1680 centres in total in the 18 trials, so there are 1680 rows in the data set.
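
One simple summary of data in this layout is a pooled recruitment rate per trial. The following is a sketch only, assuming the rows have been read as (trial, patients, days) tuples; the function name and the three toy rows are made up and do not come from the data set.

```python
from collections import defaultdict

def recruitment_rates(rows):
    """Return patients recruited per centre-day for each trial, pooled over its centres."""
    patients = defaultdict(int)
    days = defaultdict(int)
    for trial, n_patients, n_days in rows:
        patients[trial] += n_patients
        days[trial] += n_days
    # Assumes every trial has a positive total number of recruitment days
    return {trial: patients[trial] / days[trial] for trial in patients}

# Invented rows: (trial, patients recruited, days of recruitment)
toy_rows = [(1, 4, 100), (1, 6, 100), (2, 3, 60)]
toy_rates = recruitment_rates(toy_rows)
```

Pooling in this way weights centres by their recruitment time; averaging centre-level rates instead would give a different summary.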

Sunscreen Quoted from the paper:
Begin citation The studies were designed to evaluate the anti-pigmentation activity of some products on ultra-violet (UV) induced pigmentation. All studies were within subject but were unusual compared to common within-subject studies in medicine in that the treatments were not given to the same subject on different occasions but applied to different delimited areas of the skin. For a given trial, all the products being compared were applied to each subject. The subject's back was divided into 4x4cm non-adjacent areas to which the treatments being compared could be applied, allocation being at random. Within each area a 2x2cm sub-area was exposed to UV-daylight with a 0.75 Minimal Erythema Dose, definition of the MED is subject specific. Thus the basic design was that of a randomised block trial.
There were 10 studies involving 44 treatments in total. Each trial compared between 4 and 8 different treatments tested on 14 to 34 subjects (Table 1). Product application started from day 1 to the end of the trial and UV exposure from day 8 to day 11. Colorimetric measurements were made at baseline and from day 5 to the end of the trial. The analysis presented here is of the results at day 15. End citation

The data in the sheet give least squares means, standard errors and number of patients treated by trial and treatment. The treatment indicator value will have to be used to identify by position which treatment a given least squares mean and standard error correspond to.


Any comments? Please contact Stephen Senn



This page last updated 16 June 2024