In 1949, 126 coalminers, chosen to cover the whole radiological range of coalworkers' pneumoconiosis, were subjected to a battery of physiological tests of lung function, clinical investigations, and anthropometry. During the ensuing 12 years 40 of these men died. An analysis of the original measurements was carried out in an attempt to identify the particular features of the disturbance of lung function associated with pneumoconiosis which were most related to the rate of dying.
The men were grouped into seven classes: the survivors, and those dying within each of the six two-year intervals of the 12-year follow-up period. The variation of the means of these classes, for each of the measurements made, then has six degrees of freedom, which implies that six uncorrelated measurements would suffice to describe it completely. Eighteen measurements were in fact available for each man, all of them highly intercorrelated and so in part providing redundant information. The method of analysis adopted was to find the linear combination of the 18 measurements which discriminated best between the seven classes, then a second combination, uncorrelated with the first, which was the second best discriminator, and so on. Six such combinations, the canonical variates, would then contain all the information provided by the variation between the seven classes, and the relative magnitude of the contribution made by each original test to each canonical variate would indicate the relevance of this aspect of the men's health to their survival time.
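The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' computation: it treats the canonical variates as solutions of the generalized eigenproblem Bv = λWv, where B and W are the between-class and within-class sums of squares and products, and it runs on synthetic random data with seven equal classes of 18 men (an arbitrary choice; the study's class sizes are not given here). The function name `canonical_variates` is hypothetical.

```python
import numpy as np

def canonical_variates(X, labels):
    """Return eigenvalues and coefficient vectors of B v = lambda W v."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    B = np.zeros((p, p))  # between-class sums of squares and products
    W = np.zeros((p, p))  # within-class sums of squares and products
    for g in np.unique(labels):
        Xg = X[labels == g]
        mean_g = Xg.mean(axis=0)
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
        C = Xg - mean_g
        W += C.T @ C
    # Reduce to an ordinary symmetric eigenproblem via W = L L^T.
    L = np.linalg.cholesky(W)
    L_inv = np.linalg.inv(L)
    eigvals, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(eigvals)[::-1]  # best discriminator first
    return eigvals[order], L_inv.T @ U[:, order]

# Synthetic stand-in data: 7 classes x 18 men, 18 measurements each.
rng = np.random.default_rng(0)
n_classes, per_class, n_tests = 7, 18, 18
labels = np.repeat(np.arange(n_classes), per_class)
class_means = rng.normal(size=(n_classes, n_tests))
X = class_means[labels] + rng.normal(size=(labels.size, n_tests))

eigvals, V = canonical_variates(X, labels)

# Seven class means leave only six non-trivial variates.
n_variates = n_classes - 1
share = eigvals[:n_variates] / eigvals[:n_variates].sum()
print(np.round(share, 3))  # share of between-class variation per variate
```

In this sketch the columns of `V` hold the weights given to each of the 18 measurements, so `X @ V[:, 0]` is each man's score on the first variate. The remaining 12 eigenvalues are zero up to rounding error, which is the six-degrees-of-freedom argument in numerical form.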
One canonical variate absorbed 75% of the variation between the seven classes. This variate was, moreover, virtually the same as the variate which best discriminated between those who died and those who survived. Its main ingredients were the amount of progressive massive fibrosis and the size of the residual volume. A second variate, absorbing a further 12% of the variation, was of uncertain significance but could be interpreted as being related to the presence or absence of pulmonary emphysema.
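The two-class comparison mentioned above (those who died against those who survived) corresponds, in the usual formulation, to Fisher's linear discriminant, whose coefficients are proportional to W⁻¹(m₁ − m₂). The following is a minimal sketch under that assumption. The data are synthetic, with only the counts of 126 men, 40 deaths and 18 measurements taken from the text; the mean shift of 0.5 and the function name `fisher_discriminant` are illustrative inventions.

```python
import numpy as np

def fisher_discriminant(X, died):
    """Coefficients of the two-group linear discriminant, W^{-1}(m1 - m2)."""
    X = np.asarray(X, dtype=float)
    died = np.asarray(died, dtype=bool)
    Xa, Xb = X[died], X[~died]
    diff = Xa.mean(axis=0) - Xb.mean(axis=0)
    Ca = Xa - Xa.mean(axis=0)
    Cb = Xb - Xb.mean(axis=0)
    W = Ca.T @ Ca + Cb.T @ Cb  # pooled within-group sums of squares
    return np.linalg.solve(W, diff)

# Synthetic stand-in data: 126 men, 40 of whom died, 18 measurements.
rng = np.random.default_rng(1)
died = np.zeros(126, dtype=bool)
died[:40] = True
X = rng.normal(size=(126, 18))
X[died] += 0.5  # arbitrary shift so the groups differ

w = fisher_discriminant(X, died)
scores = X @ w  # one discriminant score per man
print(scores[died].mean(), scores[~died].mean())
```

One way to judge whether two variates are "virtually the same" would be to correlate the men's scores on each, since coefficient vectors are only defined up to scale.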
Among those who died, discrimination between survival times was most markedly related to two clinical indices, the E.S.R. and the systolic blood pressure; none of the indices of respiratory performance contributed much. Those who died within two years of the initial study were well separated by the canonical variate from those who died between two and four years, and from those who survived over four years, but there was no separation of those who died during the final eight years of the study.
A complete re-analysis of these data using logarithmic transformations of each characteristic produced very few changes, which suggests that the results are not very dependent on the distributional assumptions involved.