The NIHR RDS for the East Midlands / Yorkshire & the Humber 2009
Sampling
2. The representative sample
Most studies in health care which ‘count’ something (quantitative studies) aim,
explicitly or implicitly, to offer conclusions that are generalisable. This means
that the findings of a study apply to situations other than those of the cases in
the study. To give a hypothetical example, Smith and Jones’ (1997) study of
consultation rates in primary care, based on data from five practices in differing
geographic settings (urban, suburban, rural), found higher rates in the urban
environment. When they wrote it up for publication, Smith and Jones used
statistics to claim that their findings could be generalised: the differences
applied not just to these five practices, but to all practices in the country.
For such a claim to be legitimate (technically, for the study to possess ‘external
validity’), the authors must persuade us that their sample was not biased: that it
was representative. Although other criteria must also be met (for instance, that
the design was both appropriate and carried out correctly - the study’s ‘internal
validity’ and ‘reliability’), it is the representativeness of a sample which allows the
researcher to generalise the findings to the wider population. If a study has an
unrepresentative or biased sample, then it may still have internal validity and
reliability, but it will not be generalisable (will not possess external validity).
Consequently the results of the study will be applicable only to the group under
study.
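The effect of an unrepresentative sample can be illustrated with a small
simulation. The sketch below is not drawn from any real study: the population of
1,000 practices, the 30% urban share and the consultation rates are all invented
for illustration, and the variable names are assumptions. It compares the
estimate from a simple random sample with the estimate from a sample that
approaches urban practices only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 practices, roughly 30% of them urban.
# Consultation rates (per patient per year) are invented for illustration only.
population = []
for _ in range(1000):
    setting = "urban" if random.random() < 0.3 else "non-urban"
    mean_rate = 6.0 if setting == "urban" else 4.0
    population.append((setting, random.gauss(mean_rate, 0.5)))

true_mean = statistics.mean(rate for _, rate in population)

# Representative sample: every practice has the same chance of selection.
random_sample = random.sample(population, 50)
random_mean = statistics.mean(rate for _, rate in random_sample)

# Biased sample: only urban practices are approached.
urban_only = [p for p in population if p[0] == "urban"]
biased_sample = random.sample(urban_only, 50)
biased_mean = statistics.mean(rate for _, rate in biased_sample)

print(f"population mean:    {true_mean:.2f}")
print(f"random sample mean: {random_mean:.2f}")
print(f"urban-only mean:    {biased_mean:.2f}")
```

Under these assumptions the urban-only estimate may describe urban practices
well enough, but it should sit noticeably further from the population mean than
the random-sample estimate does, which is the sense in which its findings are
not generalisable.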
It is essential to a study’s design (assuming that the study aims to generalise
and is not simply descriptive of one setting) that sampling is taken seriously.
The first part of this pack looks at how to gather a ‘representative’ sample which
gives a study external validity and permits valid generalisation.
However, there is a second issue which must be addressed in relation to
sampling, and this is predominantly a question of sample size. Generalisations
from sample data to the wider population depend upon inferential statistics,
which test hypotheses about that population. For instance, the t-test can be used
to test a hypothesis that there is a difference in means between two populations,
based on a sample from each. To give an example, we select 100 males and 100
females and measure their body mass index. We find a difference in our samples,
and wish to argue that the difference found is not an accident (due to chance),
but reflects an actual difference in the wider populations from which the samples
were drawn. We use a t-test to see if we can make this claim legitimately.
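The body mass index example can be sketched in code. What follows is a minimal
illustration, not a prescribed method: the BMI values are simulated (the means
of 26.5 and 25.0 and the standard deviation of 4.0 are invented), the function
name welch_t is an assumption, and the test shown is Welch's variant of the
two-sample t-test, which does not assume equal variances. The p-value uses a
normal approximation, which is reasonable only because each group has 100
people; a statistics package would use the t distribution itself.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(42)  # fixed seed so the sketch is repeatable

# Simulated BMI values; the means and SD are invented for illustration.
males = [random.gauss(26.5, 4.0) for _ in range(100)]
females = [random.gauss(25.0, 4.0) for _ in range(100)]


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error, group a
    vb = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t, df = welch_t(males, females)

# Normal approximation to the two-sided p-value (an assumption that is
# acceptable here because df is large).
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))

print(f"t = {t:.2f}, df = {df:.1f}, approximate two-sided p = {p_approx:.4f}")
```

A small p-value would suggest that a difference as large as the one observed is
unlikely to arise by chance alone if the two populations had the same mean.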
Most people know that the larger the sample, the more confident we can be that a
difference such as this is not due to chance, but reflects a real difference
between men and women. Many quantitative studies undertaken and published in
medical journals, however, do not have a sample large enough to test adequately
the hypothesis which the study was designed to explore. Such studies are, by
themselves, of little use, and (for example in the case of drug trials) could be
dangerous if their findings were generalised.
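The link between sample size and the ability to detect a real difference can
also be explored by simulation. The sketch below assumes, purely for
illustration, a true difference of 1 BMI unit between two populations with a
standard deviation of 4, and repeats a hypothetical study many times at three
sample sizes. The function names, the group sizes and the critical value of
1.96 (a normal approximation, slightly too lenient for very small samples) are
all assumptions rather than recommendations.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable


def detects_difference(n, true_diff=1.0, sd=4.0, critical=1.96):
    """Simulate one study with n people per group.

    Returns True if the absolute t statistic exceeds the critical value,
    i.e. the study would report a statistically significant difference.
    """
    a = [random.gauss(25.0 + true_diff, sd) for _ in range(n)]
    b = [random.gauss(25.0, sd) for _ in range(n)]
    se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
    return abs((statistics.mean(a) - statistics.mean(b)) / se) > critical


def detection_rate(n, studies=500):
    """Proportion of simulated studies that detect the true difference."""
    return sum(detects_difference(n) for _ in range(studies)) / studies


rates = {n: detection_rate(n) for n in (10, 50, 200)}
for n, rate in rates.items():
    print(f"n = {n:3d} per group: difference detected in {rate:.0%} of studies")
```

Under these assumptions, studies with small groups miss a difference that
really exists far more often than studies with large groups, which is why an
undersized study is, by itself, of little use.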
We will consider these issues of sample size, and how to calculate an adequate
size for a study sample, in the second half of this pack. Before that, let us
think in greater detail about what a sample is.