A user's guide
Alejandro R Jadad
3 Bias in RCTs: beyond the sequence generation
The main appeal of the randomised controlled trial (RCT) in health care derives from its potential for reducing selection bias. As discussed in Chapter 1, researchers expect that randomisation, if done properly, can keep study groups as similar as possible at the outset, thus enabling the investigators to isolate and quantify the effect of the interventions they are studying and control for other factors. No other study design allows researchers to balance unknown prognostic factors at baseline. Random allocation does not, however, protect RCTs against other types of bias.
During the past 10 years, important, albeit isolated, research efforts have used RCTs as the subject rather than the tool of research. These studies are usually designed to generate empirical evidence to improve the design, reporting, dissemination, and use of RCTs in health care.1 These studies have shown that RCTs are vulnerable to multiple types of bias at all stages of their lifespan. Random allocation of the participants to different study groups only increases the potential of a study to be free of bias. Therefore, as a reader, you must be aware that most RCTs are at risk of bias and that bias can arise from many sources, including yourself. It is against this background that I would like to devote this chapter to the concept of bias, highlighting its main sources and sharing with you some strategies that could help you identify it and minimise its impact on your decisions.
In the lay literature, bias has been defined as opinion or feeling that strongly favours one side in an argument or one item in a group or series; predisposition; prejudice.2 In health care research, however, bias is defined as any factor or process that tends to deviate the results or conclusions of a trial systematically away from the truth.3-6 This deviation from the truth can result in underestimation or exaggeration of the effects of an intervention. As there is usually more interest in showing that a new intervention works than in showing that it does not work or that it is harmful, bias in clinical trials usually leads to exaggeration of the magnitude or importance of the effects of new interventions. Unlike the lay meaning of bias, bias in health research should not be associated immediately with a malicious attempt by investigators, funders, or readers to bend the results of a trial. Although bias can be introduced into a trial deliberately, it is probably introduced unintentionally in most cases.
Bias can occur in a trial during the planning stages, the selection of participants, the administration of interventions, the measurement of outcomes, the analysis of data, the interpretation and reporting of results, and the publication of reports.3 Bias can also occur when a person is reading the report of a trial.4
The main reason to anticipate, detect, quantify, and control bias is that the true effects of any health care intervention are unknown. In fact, the whole purpose of RCTs, as well as any other study or research enquiry, is to produce results from a sample of participants that could be generalised to the target population at large. It is impossible ever to know for sure whether the results of a study are biased, simply because it is impossible to establish whether such results depart systematically from a truth that is unknown. Despite this major limitation, many possible sources of bias have been recognised over the years. The existence of most of these biases is supported mainly by common sense. Few studies have been designed specifically to generate empirical data to support the existence of different types of bias and to quantify them. This remains at the same time one of the most exciting and important areas of methodological research, and one of the most neglected.3
In this chapter I focus on biases that relate directly to RCTs and bring to your attention any empirical methodological studies that support their existence.
Traditionally, discussions on bias focus on biases that can occur at any point during the course of a trial, from the allocation of participants to study groups, through the delivery of interventions and the measurement of outcomes, to the interpretation and reporting of results. Other types of bias that tend to receive less attention can also, however, have a profound influence on the way in which the results of RCTs are interpreted and used. These biases can occur during the dissemination of a trial from the investigators to potential users, or during the uptake of trial information by potential users of the trial.
To illustrate how bias can affect the results of an RCT, I would like to invite you to focus on the following hypothetical scenario:
Imagine a new drug for the treatment of multiple sclerosis, which has shown promising results in animal studies and in phase I trials. These results, which suggest that the drug can delay the onset of severe motor compromise, have been widely publicised by the media during the past three months. In response, patient advocacy groups are putting pressure on the Government to make the new drug available as soon as possible. As multiple sclerosis is a debilitating disease that affects millions of people world wide and for which there is no known cure, the investigators (all clinicians who have dealt with multiple sclerosis patients for years), the company producing the new drug (which has invested millions in developing the drug), the media (interested in confirming the results that they so widely publicised), and the potential participants (patients with multiple sclerosis who have been waiting for an effective treatment to be discovered) are all interested in finding the new compound effective. After many intense sessions debating the course of action, a multidisciplinary task force created by the Government, including consumer representatives, agrees that the next step should be a randomised clinical trial. A research protocol is produced by another multidisciplinary panel of investigators and consumers, and a well known research group at a large health care centre is selected to conduct the study.
Elements of this hypothetical scenario will be discussed in the sections that follow.
A perfectly randomised method to allocate participants to the study groups does not, however, protect an RCT from selection bias. Selection bias can be introduced if some potentially eligible individuals are selectively excluded from the study because of prior knowledge of the group to which they would be allocated if they participated in the study. Imagine that the investigator in charge of recruiting patients for the multiple sclerosis trial thinks that depressed patients are less likely to respond to the new drug. The trial is not designed to detect depression in the participants and he is the only person with access to the allocation sequence (which has been generated by computer and is locked in his desk). This investigator could introduce bias into the trial, knowingly or unknowingly, just by making it more difficult for depressed patients to receive the new drug. He can achieve this in at least two ways: first, he can make depressed patients allocated to receive the new drug fit exclusion criteria more easily and more frequently than if they had been allocated to the placebo group; second, he can present information on the trial to depressed patients allocated to receive the new drug in such a way that they would be discouraged from giving consent to participate in the trial. At the end of the trial, if the investigator was right, and depressed patients were less likely to respond to the new drug, the trial will show an exaggerated effect of the new drug in the treatment of multiple sclerosis, as a result of the disproportionate number of depressed patients in the placebo group.
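The mechanism in this scenario can be illustrated with a small simulation. This is a toy model with made-up numbers (40% of eligible patients depressed, invented improvement probabilities), not data from any real trial; it shows only how selective exclusion of expected non-responders from one arm inflates the apparent treatment effect:

```python
import random

random.seed(1)

def simulate_trial(subverted, n=20000):
    """Toy model of the hypothetical multiple sclerosis trial.
    Assumed (invented) response rates: non-depressed patients improve with
    probability 0.6 on the new drug; everyone else improves with
    probability 0.3, whichever group they are in."""
    drug, placebo = [], []
    for _ in range(n):
        depressed = random.random() < 0.4
        arm = random.choice(["drug", "placebo"])
        # A recruiter who has seen the allocation sequence steers depressed
        # patients away from the drug arm (stricter exclusions, discouraging
        # consent), so they never enter that group.
        if subverted and depressed and arm == "drug":
            continue
        improved = random.random() < (0.6 if arm == "drug" and not depressed else 0.3)
        (drug if arm == "drug" else placebo).append(improved)
    # Apparent treatment effect: difference in improvement rates
    return sum(drug) / len(drug) - sum(placebo) / len(placebo)

print(f"concealed allocation: effect = {simulate_trial(False):.2f}")  # ~0.18
print(f"subverted allocation: effect = {simulate_trial(True):.2f}")   # ~0.30
```

Under these assumptions the subverted trial reports an effect roughly two thirds larger than the honestly randomised one, even though the drug itself is identical in both runs.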
There is empirical evidence confirming that the effects of new interventions can be exaggerated if the randomisation sequence is not concealed from the investigators at the time of obtaining consent from prospective trial participants.7,8 One study showed that trials with inadequate allocation concealment can exaggerate the effects of interventions by as much as 40% on average.8 The irony is that allocation concealment is a very simple manoeuvre which can be incorporated into the design of any trial and which can always be implemented.
Despite its simplicity as a manoeuvre and its importance in reducing bias, allocation concealment is rarely reported and perhaps rarely implemented in RCTs. A recent study showed that allocation concealment was reported in less than 10% of articles describing RCTs published in prominent journals in five different languages.9 This does not necessarily mean that allocation is not concealed in 90% of RCTs. In some cases allocation may have been concealed, but the authors, peer-reviewers, and journal editors were not aware of how important it was to mention it (it takes about a line in the report of an RCT, so space limitation is not a good excuse!). If, however, in most cases in which allocation concealment has not been reported, it has not been done, then the majority of RCTs are at risk of exaggerating the effects of the interventions they were designed to evaluate.
Even if the report of an RCT states that efforts were made to conceal the allocation sequence, there are many ways in which randomisation can be subverted by investigators who want to break the allocation code before they obtain consent from prospective trial participants.10 Even when the allocation codes are kept in sealed opaque envelopes, investigators, for instance, can look through the envelopes using powerful lights or even open the envelope using steam and reseal it without others noticing.10
The corollary is that it is very easy to introduce selection bias in RCTs and that readers should not get a false sense of security from a description of a study as randomised. Perhaps the only way to help RCTs achieve what they have set out to achieve is through intensive educational campaigns to increase awareness among investigators, peer-reviewers, journal editors, and users about the importance of adequate randomisation, adequate allocation concealment, and adequate reporting of both.
The best way to protect a trial against ascertainment bias is by keeping the people involved in the trial unaware of the identity of the interventions for as long as possible. This is also called blinding or masking. The strategies that can be used to reduce ascertainment bias can be applied during at least two periods of a trial. The first period includes the time during which data are collected actively, from the administration of the interventions to the gathering of outcome data. The second period occurs after data have been collected, from data analysis to the reporting of results.
It is important that you recognise the difference between biases that are the result of lack of allocation concealment and biases that arise from lack of blinding. Allocation concealment helps to prevent selection bias, protects the randomisation sequence before and until the interventions are given to study participants, and can always be implemented.8 Blinding helps prevent ascertainment bias, protects the randomisation sequence after allocation, and cannot always be implemented.8
How can ascertainment bias be reduced during data collection?
In ideal circumstances, ascertainment bias should be reduced by blinding the individuals who administer the interventions, the participants who receive the interventions, and the individuals in charge of assessing and recording the outcomes. In most cases, the interventions are either administered and assessed by the same group of investigators, or self administered by the study participants. Therefore, the degree of blinding required to reduce ascertainment bias during data collection is usually achieved by double-blind trials (see Chapter 2).
The importance of blinding has been confirmed in empirical studies. It has been shown, for instance, that open studies are more likely to favour experimental interventions over the controls11 and that studies that are not double-blinded can exaggerate effect estimates by 17%.8 Despite the empirical evidence available, and common sense, it has been shown recently that only about half of the trials that could be double-blinded actually achieved double-blinding.12 Even when the trials are described as double-blind, most reports do not provide adequate information on how blinding was achieved or statements on the perceived success (or failure) of double-blinding efforts.12-14
The best strategy to achieve blinding during data collection is with the use of placebos. Placebos are inert substances that are intended to be indistinguishable from the active interventions. To be successful, placebos should be identical to the active interventions in all aspects, except for the components of the active intervention that have specific and predictable mechanisms of action. Placebos do not apply only to trials evaluating pharmacological interventions. They also apply to non-drug interventions such as psychological, physical, and surgical interventions. Placebos are certainly easier to develop and implement successfully in drug trials. In these cases they should resemble the taste, smell, and colour of the active drug, and should be given using an identical procedure.
Placebos are more difficult to develop and implement successfully in non-drug trials. For example, it is difficult to develop and implement placebo counselling, physiotherapy, acupuncture, or electrical stimulation. In some cases it is impossible, unfeasible, or simply unethical to use placebos. It would be impossible, for example, to use a placebo intervention in a trial evaluating the effect on mothers and newborns of early versus late discharge from hospital after childbirth. On the other hand, it would be unfeasible or unethical to use a placebo in trials evaluating new or existing surgical interventions (although there are examples of trials in which placebo surgery has been used successfully to challenge the perceived effectiveness of established surgical interventions). In addition, there is controversy as to whether placebo-controlled studies are ethical to study a new or existing intervention when there is an effective intervention available. Even in cases where the use of placebos is impossible, unfeasible, or unethical, trials can be at least single-blind. In a surgical or acupuncture trial, for instance, single-blinding can be achieved by keeping the investigators in charge of assessing the outcomes unaware of which participants receive which interventions.
How can ascertainment bias be reduced after data collection?
This source of bias can be controlled by keeping the data analysts and the people in charge of reporting the trial results unaware of the identity of the study groups. In a study with two groups, for instance, the outcome data could be given to analysts coded as A and B and, once they complete the analysis, the results could be given to the person in charge of writing the report using the same codes. The codes would not be broken until after the data analysis and reporting phases were completed. These strategies are rarely used.
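A sketch of how such coding might work in practice (hypothetical data and labels; in a real trial the key would be generated and held by someone independent of the analysis):

```python
import random

# Hypothetical outcome data (1 = improved, 0 = not improved), keyed by the
# true group identities.
outcomes = {
    "new_drug": [1, 0, 1, 1, 0, 1],
    "placebo":  [0, 0, 1, 0, 1, 0],
}

# An independent coordinator, not the analyst, assigns neutral codes and
# keeps the key locked away until analysis and reporting are complete.
random.seed(42)
true_labels = list(outcomes)
random.shuffle(true_labels)
key = dict(zip(["A", "B"], true_labels))   # e.g. {"A": "placebo", ...}
coded_data = {code: outcomes[group] for code, group in key.items()}

# The analyst works only with groups "A" and "B".
for code in sorted(coded_data):
    values = coded_data[code]
    print(f"group {code}: improvement rate = {sum(values) / len(values):.2f}")

# The key is broken only after the report has been drafted.
```

The analysis and the draft report can then describe "group A" and "group B" throughout, so neither the analyst nor the writer can slant decisions (outlier handling, subgroup choices, emphasis) towards the new drug.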
The frequency and magnitude of ascertainment bias introduced after data collection have not been studied at all.
Other biases that can be introduced during the course of a trial
On occasion, however, it is impossible to know the status of participants at the times when the missing information should have been collected. This could happen, for example, if participants move to different areas during the study or fail to contact the investigators for an unknown reason. Excluding these participants or specific outcome measurements from the final analysis can also lead to bias.
The only strategy that can confidently be assumed to eliminate bias in these circumstances includes two components. The first is called intention to treat analysis, and means that all the study participants are included in the analyses as part of the groups to which they were randomised regardless of whether they completed the study or not. The second component includes a worst case scenario sensitivity analysis. This is performed by assigning the worst possible outcomes to the missing patients or time-points in the group that shows the best results, and the best possible outcomes to the missing patients or time-points in the group with the worst results, and evaluating whether the new analysis contradicts or supports the results of the initial analysis which does not take into account the missing data.
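The two components above can be shown in a small worked example with invented outcome data, where 1 = improved, 0 = not improved, and None = lost to follow-up; every randomised participant stays in the group to which they were allocated:

```python
# Hypothetical intention-to-treat data for the two randomised groups.
drug    = [1, 1, 1, 0, 1, None, None]   # two participants lost to follow-up
placebo = [0, 1, 0, 0, None, 0, 0]      # one participant lost to follow-up

def rate(group, impute=None):
    """Improvement rate for a group. With impute=None, missing outcomes are
    simply ignored (the initial analysis); otherwise each missing outcome
    is replaced by the imputed value (0 = worst, 1 = best)."""
    if impute is None:
        observed = [x for x in group if x is not None]
        return sum(observed) / len(observed)
    filled = [impute if x is None else x for x in group]
    return sum(filled) / len(filled)

# Initial analysis, ignoring the missing data: drug appears clearly better.
print(f"drug {rate(drug):.2f} vs placebo {rate(placebo):.2f}")

# Worst case scenario: the drug group did best, so its missing participants
# are counted as failures (0) and the placebo group's as successes (1).
print(f"drug {rate(drug, impute=0):.2f} vs placebo {rate(placebo, impute=1):.2f}")
```

Here the drug still outperforms placebo even under the worst case, so the sensitivity analysis supports the initial result; had the ordering reversed, the initial result could not be trusted.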
Bias introduced by inappropriate use of cross-over design
The main sources of bias during dissemination of trials are publication bias, language bias, country of publication bias, time lag bias, and potential breakthrough bias.
What is publication bias?
The only way to eliminate publication bias is through compulsory registration of trials at inception and publication of the results of all trials. Although highly desirable, compulsory registration of trials and publication of the results of all trials are the focus of intense debate and controversy fuelled by strong ethical and economic interests, and are unlikely to happen in the near future. Until they happen, readers must be aware that, by relying on published studies to guide their decisions, they are always at risk of overestimating the effect of interventions.
What is language bias?
What is country of publication bias?
What is time lag bias?
What is potential breakthrough bias?
More than 15 years ago, different types of reader biases were described.4 At the time they were described, the existence of these biases was supported only by common sense and experience. Recently, there have been empirical studies that support the existence of reader bias, showing that there are systematic differences in the way readers assess the quality of RCTs, depending on whether the assessments are conducted under masked or open conditions.13,24 These studies do not, however, focus on any specific type of reader bias. More research is needed to establish the individual contribution of each type of reader bias.
The following is a description of the reader biases described by Owen,4 together with a few more that I have added.
Rivalry bias Underrating the strengths or exaggerating the weaknesses of studies published by a rival.
I owe him one bias This is a variation of the previous bias and occurs when a reader (particularly a peer-reviewer) accepts flawed results from a study by someone who did the same for the reader.
Personal habit bias Overrating or underrating a study depending on the habits of the reader (for example, a reader who enjoys eating animal fat overrating a study that challenges the adverse effects of animal fat on health).
Moral bias Overrating or underrating a study depending on how much it agrees or disagrees with the reader's morals (for example, a reader who regards abortion as immoral overrating a study showing a relationship between abortion and breast cancer).
Clinical practice bias Overrating or underrating a study according to whether the study supports or challenges the reader's current or past clinical practice (that is, a clinician who gives lidocaine to patients with acute myocardial infarction, underrating a study which suggests that lidocaine may increase mortality in these patients).
Territory bias This is related to the previous bias and can occur when readers overrate studies that support their own specialty or profession (for example, a surgeon favouring a study which suggests that surgery is more effective than medical treatment, or obstetricians underrating a study which suggests that midwives can provide adequate care during uncomplicated pregnancies and deliveries).
Complementary medicine bias I have added this to Owen's list. It refers to the systematic overrating or underrating of studies that describe complementary medicine interventions, particularly when the results suggest that the interventions are effective.
Do something bias Overrating a study which suggests that an intervention is effective, particularly when there is no effective intervention available. This is a bias that may be common among clinicians and patients (for example, a patient with AIDS overrating a study describing a cure for AIDS).
Do nothing bias This bias is related to the previous one. It occurs when readers underrate a study that discourages the use of an intervention in conditions for which no effective treatment exists. This bias may be common among researchers and academics (that is, a researcher underrating a study which shows that membrane stabilisers do not provide analgesia in patients with painful diabetic neuropathy unresponsive to any other treatment).
Favoured design bias Overrating a study that uses a design supported, publicly or privately, by the reader (for example, a consumer advocate overrating an RCT that takes into account patient preferences).
Disfavoured design bias The converse of favoured design bias. It occurs when a study is underrated because it uses a design that is not favoured by the reader (for example, a reader underrating a crossover trial, even when it meets all the criteria described in Chapter 2).
Resource allocation bias Overrating or underrating a study according to the reader's preference for resource allocation. This bias may be one of the most frequently found in health care, because it can emanate from consumers, clinicians, policy makers, researchers, and fund holders.
Prestigious journal bias This occurs when the results of studies published in prestigious journals are overrated.
Non-prestigious journal bias The converse of prestigious journal bias. It occurs when the results of studies published in non-prestigious journals are underrated.
Printed word bias This occurs when a study is overrated because of undue confidence in published data.
Lack-of-peer-review bias I have added this one to Owen's list. It occurs when a reader underrates a study, published or unpublished, because it has not been peer-reviewed.
Prominent author bias This occurs when the results of studies published by prominent authors are overrated.
Unknown or non-prominent author bias Owen called this the "who is he?" bias. It occurs when the results of studies published by unknown or non-prominent authors are underrated.
Famous institution bias This occurs when the results of studies emanating from famous institutions are overrated.
Unrecognised or non-prestigious institution bias Related to the previous bias. It occurs when the results of studies emanating from unrecognised or non-prestigious institutions are systematically underrated.
Large trial bias I have added this to Owen's list. It occurs when the results of large trials are overrated.
Multicentre trial bias I have added this to Owen's list. It occurs when the results of multicentre collaborative trials are overrated. These trials do not necessarily have large sample sizes.
Small trial bias I have also added this to Owen's list. It occurs when the results of trials with small sample size are underrated, particularly when they contradict the opinion of the reader (that is, attributing to chance any statistically or clinically significant effect found by a small trial, or any lack of significant effects to low power).
Flashy title bias It occurs when the results of studies with attractive titles are overrated (particularly by patients or journalists) or underrated (particularly by academics, if they regard them as sensationalist!).
Substituted question bias It occurs when a reader substitutes a question for the question that the study is designed to answer, and regards the results of the study as invalid if they do not answer the substituted question.
Credential or professional background bias Overrating or underrating the results of a study according to the qualifications of the authors (for example, physicians underrating research done by nurses or vice versa; basic scientists underrating research done by clinicians or vice versa; PhDs underrating studies published by MDs and vice versa; readers overrating research by authors with many letters after their names and vice versa).
Esteemed author bias Includes Owen's esteemed professor bias and the friendship bias. This bias occurs when the reader overrates results obtained by a close friend or mentor.
Geography bias This occurs when studies are overrated or underrated according to the country or region where they were conducted.
Language bias of publication Overrating or underrating a study depending on the language in which it is reported (that is, the belief that studies published in languages other than English are of inferior quality to those published in English).
Omission bias This occurs when a study is overrated or underrated because the reader did not read a key section.
Tradition bias Overrating or underrating the results of a study depending on whether it supports or challenges traditional procedures (that is, underrating a study that challenges episiotomy during normal vaginal deliveries).
Bankbook bias Overrating or underrating a study depending on the impact of its results on the income of the reader (for example, a surgeon underrating a study that questions the need for surgery to relieve back pain in patients with spinal stenosis, or a pharmaceutical company overrating the results of a study that supports the use of one of its products).
Belligerence bias Underrating studies systematically just for the sake of being difficult.
Technology bias Overrating (Owen's pro-technology bias) or underrating (Owen's anti-technology bias) a study according to the reader's attraction to or aversion for technology in health care.
Empiricism bias Overrating or underrating a study because it challenges the clinical experience of the reader.
I am an epidemiologist bias This occurs when the reader repudiates a study that contains any flaw, albeit minor, in its design, analysis, or interpretation.
8. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408-12.
9. Moher D, Fortin P, Jadad AR, Juni P, Klassen T, Le Lorier J, Liberati A, Linde K, Penna A. Completeness of reporting of trials published in languages other than English: implications for conduct and reporting of systematic reviews. Lancet 1996;347:363-6.
12. Schulz KF, Grimes DA, Altman DG, Hayes RJ. Blinding and exclusions after allocation in randomised controlled trials: survey of published parallel group trials in obstetrics and gynaecology. BMJ 1996;312:742-4.
13. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds JM, Gavaghan DJ, McQuay DM. Assessing the quality of reports on randomized clinical trials: Is blinding necessary? Controlled Clin Trials 1996;17:1-12.
© BMJ Books 1998. BMJ Books is an imprint of the BMJ Publishing Group. First published in 1998 by BMJ Books, BMA House, Tavistock Square, London WC1H 9JR. A catalogue record for this book is available from the British Library. ISBN 0-7279-1208-9