When the researcher does not know the identity of the experimental and placebo groups?
So far, you have learnt to ask a RQ, identify different ways of obtaining data, and design the study.
In this chapter, you will learn how to ensure that the conclusions we can make are logical and sound in experimental studies. You will learn to:
Introduction

The conclusions drawn from a study are only as good as the data that the conclusions are based on, and the data are only as good as the study design from which the data emerge. A good study requires high internal validity: when studying the relationship between the response and explanatory variables, we would like to be able to rule out (as much as possible) any other reason for the changes in the values of the response variable, so any remaining changes we see can be attributed to just the explanatory variable of interest. That is, we should design studies to have high internal validity to reduce bias.

Remember the goal of study design is to maximise internal validity: to design a study to isolate the relationship of interest, by eliminating, as well as possible, all other possible explanations.

Many aspects of the design must be considered to achieve this goal, some of which are discussed in this chapter.

Data collection is often tedious, time consuming and expensive. You usually get one chance to collect your data, but you can analyse your data as many times as you like. Since you usually get one chance to collect your data, design the study properly the first time!

Example 7.1 (Importance of internal validity) A group of researchers150 describe an experiment where free fertilizer was provided to a sample of female farmers in Mali (at the recommended amount per hectare; or at half the recommended amount per hectare). Since all the farmers knew they were being provided with fertilizer (that is, they were not blinded), the farmers changed their farm management: they employed more hired labour and used more herbicide. Consequently, the yields for all farmers changed. However, it is difficult to know whether this change in yield was due to the amount of fertilizer applied, the change in labour, the change in herbicides, or a combination of these. That is, the study had poor internal validity.
Specific design strategies that we consider for maximising internal validity are:
Not all of these will be relevant to every study. Some relevant design issues are discussed in this chapter for experimental studies. The next chapter considers design issues for observational studies.

TABLE 7.1: Different design biases studied in this book related to the researchers and the individuals
In general, making the individuals unaware (blinding) that they are in a study, or unaware of what explanatory variable values apply to them, reduces or eliminates bias. Likewise, making the researchers unaware (blinding) of what explanatory variable values apply to the individuals reduces or eliminates bias.

In this chapter, we will work with this RQ (based on Anthony R. Bird et al.151):
For the Himalaya 292 study:
Example 7.2 (Exclusion criteria) In the Himalaya study,152 the exclusion criteria were:
To answer this RQ, a study must be designed to collect the data. However, careful thought must be given to how the study is designed.

Managing confounding

Confounding has the potential to compromise the internal validity of the study and hence the interpretation of the results, so managing the impact of confounding is important. Suppose, for example, that the researchers created two groups:
The researchers then gave Himalaya 292 to Group A, and the refined cereal to Group B. If a difference in faecal weight was found between the two groups, the difference may be because:
If a difference is found between the Himalaya 292 and refined cereal groups, it may not be because of the cereal (Table 7.2). That is, the study has very poor internal validity due to poor study design. For example, the age of the subject may be related to faecal weight (as older people tend to eat less, and eat differently, than younger people), and the study design means that older people are more likely to consume the refined cereal. This is an extreme case of confounding; usually, confounding is more subtle (and hence more difficult to detect) than in this example.

The key point is that the groups being compared should be as similar as possible, apart from the difference being studied (in the Himalaya 292 example, the diet that they are given).

Example 7.3 (Comparing groups) An experiment to study the effect of using ginkgo to enhance memory154 compared two groups: one using ginkgo (\(n=111\)), and one using a pretend, non-active supplement (\(n=108\)). The authors randomly allocated participants to each group, but also compared the two groups to ensure that no obvious differences initially existed between the two groups that might explain any differences in the response variable (Table 7.3). The table shows that the two groups are very similar in terms of age, education and gender distribution. Hence, any difference between the groups cannot be attributed to existing differences in age, the percentage of men, or the years of education in the two groups.

TABLE 7.3: Comparing the two groups in the ginkgo-memory study
Potentially, many extraneous variables exist. To demonstrate, we will consider just one: age. How can we make sure that the age of the participants does not cause confounding? Confounding can be managed by:
The first two approaches (restricting; blocking) are useful if one or two variables are known, or thought likely, to cause confounding. The third approach (analysing) requires recording all the variables suspected of being confounders. The fourth approach (randomly allocating) is superior if it is possible, because it reduces the chance of confounding even for variables not suspected of being confounding variables.

Notice that a common theme is measuring, observing, assessing or recording any variables of potential concern, to ensure no lurking variables exist to compromise the results. Of course, more than one of these approaches can be used, such as randomly allocating individuals to groups, but also measuring, observing, assessing or recording many other variables that can be managed through analysis (Example 7.3).

Restrictions

Sometimes the impact of confounding is managed by restricting the study to some groups, based on potential confounding variables, or keeping some variables constant. These variables are called control variables. If possible, a reason for this restriction should be given.

Example 7.4 (Restricting) In the Himalaya study,156 the study might be restricted to subjects aged under 30. The control variable is 'age'.

Blocking

Sometimes blocking is used to minimise the impacts of confounding. Blocking refers to separating the units of analysis into a small number of groups that are similar to one another, then studying those groups separately. The Himalaya study might be blocked on age (Fig. 7.1).

Definition 7.1 (Blocking) Blocking is when units of analysis are arranged in groups (called blocks) that are similar to one another.
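Blocking on age can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the age cut-offs, the participant IDs and ages, and the function names `age_block` and `allocate_within_blocks` are all hypothetical, and the sketch combines blocking with random allocation inside each block, which is one common way to use blocks rather than the procedure reported for the Himalaya study.

```python
import random
from collections import defaultdict


def age_block(age):
    """Place a participant in a hypothetical age block (cut-offs are assumptions)."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30 to 49"
    return "50 and over"


def allocate_within_blocks(participants, treatments, seed=None):
    """Group participants into age blocks, then allocate treatments at random
    within each block so every block contains each treatment.

    participants: list of (participant_id, age) pairs
    Returns a dict mapping participant_id -> (block, treatment).
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for pid, age in participants:
        blocks[age_block(age)].append(pid)

    allocation = {}
    for block, ids in blocks.items():
        rng.shuffle(ids)  # random order within the block
        for i, pid in enumerate(ids):
            # Alternate treatments down the shuffled list to keep the block balanced
            allocation[pid] = (block, treatments[i % len(treatments)])
    return allocation


# Hypothetical participants: (ID, age in years)
participants = [(1, 22), (2, 27), (3, 35), (4, 41), (5, 58), (6, 63), (7, 25), (8, 47)]
allocation = allocate_within_blocks(
    participants, ["Himalaya 292", "refined cereal"], seed=1
)
for pid, (block, diet) in sorted(allocation.items()):
    print(pid, block, diet)
```

Because the diets are compared within each age block, a difference between diets inside a block cannot be explained by large age differences between the groups.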
FIGURE 7.1: Blocking in the Himalaya study, based on age

Analysis

Confounding variables can be accommodated in the analysis (using analysis methodology beyond what is in this book), provided those variables have been measured, observed, assessed or recorded. Because of this, measuring, observing, assessing or recording all the information likely to be important for understanding the data is important.

Measure, observe, assess or record all the information that is likely to be important for understanding the data. This may include information about
For this reason, most studies involving people record the participants' age and sex, as these two variables are common confounders. Once a sample is obtained, recording this extra information usually requires little extra effort.

Example 7.5 (Analysis) In the Himalaya 292 study, the sex, age, pre-study weight and pre-study BMI were also recorded for each individual.

Example 7.6 (Analysis) An experimental study157 compared nitrogen (N) and phosphorus (P) concentrations in maize, for evenly-injected liquid manure and band-injected liquid manure. As potential confounding variables, the researchers also recorded the average temperature and the precipitation (between May 1 and September 30) at each site.

Random allocation

One way to minimise confounding is to randomly allocate individuals in the study to the treatment groups. (Remember that the word "random" has a special meaning.) The advantage of random allocation is that it should approximately evenly distribute potential confounding variables that have been identified (such as age) but also those variables that may not have even been considered as confounders, or are hard to measure or observe (such as genetic conditions).

In the Himalaya study, the units of analysis (the people in the sample) could be allocated to a group at random, and then the groups allocated a diet through a toss of a coin (Fig. 7.2).

Example 7.7 (Random allocation) In the Himalaya 292 study, the article reports that 'Subjects were allocated randomly to [...] dietary treatments...' (Bird et al.158, p. 1033).
FIGURE 7.2: Random allocation can occur in two places for the Himalaya study

Random allocation may occur when randomly allocating individuals to groups (true experiment), and/or when randomly allocating treatments to groups (true or quasi-experiment). Random allocation can be shown, in general, as in Fig. 7.3.
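The two places where random allocation can occur may be sketched in Python as follows. This is a minimal illustration, not the procedure used by the study authors: the subject labels, the group sizes, the seed and the function name `allocate_to_groups` are hypothetical.

```python
import random


def allocate_to_groups(individuals, treatments, seed=None):
    """Randomly allocate individuals to two groups, then randomly
    allocate a treatment to each group (the 'coin toss').

    Returns a dict mapping treatment -> list of individuals.
    """
    rng = random.Random(seed)

    # Place 1: randomly allocate individuals to groups
    shuffled = list(individuals)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    group_a, group_b = shuffled[:half], shuffled[half:]

    # Place 2: randomly allocate treatments to groups
    order = list(treatments)
    rng.shuffle(order)
    return {order[0]: group_a, order[1]: group_b}


# Twelve hypothetical subjects, labelled S01 to S12
subjects = [f"S{i:02d}" for i in range(1, 13)]
groups = allocate_to_groups(subjects, ["Himalaya 292", "refined cereal"], seed=2024)
for diet, members in groups.items():
    print(diet, sorted(members))
```

Neither the researchers nor the subjects choose who receives which diet, so variables such as age should be spread approximately evenly between the two groups, whether or not those variables were recorded.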
FIGURE 7.3: Random allocation in general

Random allocation vs random sampling

Random sampling and random allocation are two different concepts (Fig. 7.4) that serve two different purposes, but are often confused:
FIGURE 7.4: Comparing random allocation and random sampling

Carry-over effect and washout periods

In the Himalaya study, what if patients spent two weeks on the Himalaya 292 diet, then the next two weeks on the refined cereal diet? Potentially, the influence of the first diet could still be impacting the subjects' faecal weight for a little while after stopping the first diet. This is called the carry-over effect, and it could compromise the internal validity of the study.

Definition 7.2 (Carry-over effect) The carry-over effect is when the influence of past experience(s) of the individuals carries over to influence future experience(s) of the individuals. In the context of experiments, this may mean that the influence of one treatment carries over into the influence of the next treatment.

Sometimes, researchers can randomly allocate the order in which the treatments (i.e., the diets) are used. That is, some participants start by spending four weeks on the Himalaya 292 diet, then (after a washout period) four weeks on the refined cereal diet; meanwhile, other participants start by spending four weeks on the refined cereal diet, then (after a washout period) four weeks on the Himalaya 292 diet.

Example 7.8 (Washout periods) A study of paramedics159 required paramedics to conduct eight different tasks (such as electrical defibrillation and intravenous cannulation). The order in which each of the 16 paramedics performed the eight tasks was arranged so that not every paramedic started with Task 1, followed by Task 2, etc. to "control for possible effects of practice" (p. 255); that is, to mitigate the carry-over effect.

The impact of the carry-over effect may be minimized by using a washout period or similar; for example, after finishing one diet, the participants spend four weeks on their usual (before study) diet, and then move to the second diet being used. In some studies, a washout is used.
For example, after tasting a food sample, participants may rinse their mouth with water before tasting another food sample. Example 7.9 (Carry-over effect) In the Himalaya 292 study, the authors report:
That is, subjects were randomly allocated to a diet: some subjects began the study on the Himalaya 292 diet while others started on the refined cereal diet. No washout period was used; however, since the response variable was recorded after four weeks on the diets, no washout period was considered necessary.

Example 7.10 (Washout) An engineering study161 examined the effect of drivers' exposure to a lane-keeping system on their driving performance. Subjects were exposed to a driving simulation that used a lane-keeping system, and then to a driving simulation without using a lane-keeping system. The researchers found that there was a carry-over effect when drivers moved from a simulation with a lane-keeping system to one without a lane-keeping system.
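Randomly allocating the order of two treatments, with a washout period in between, can be sketched in Python. This is a hypothetical illustration only: the participant labels, the four-week washout on the usual diet, and the function name `crossover_schedule` are assumptions, and (as noted above) the Himalaya 292 study itself did not use a washout period.

```python
import random


def crossover_schedule(participants, treatments, weeks_per_treatment=4,
                       washout_weeks=4, seed=None):
    """Randomly allocate the order of two treatments to each participant,
    separated by a washout period, with half starting on each treatment.

    Returns a dict mapping participant -> list of (period, weeks) pairs.
    """
    rng = random.Random(seed)
    first_order = tuple(treatments)
    second_order = tuple(reversed(treatments))

    ids = list(participants)
    rng.shuffle(ids)  # random order decides who gets which sequence

    schedule = {}
    for i, pid in enumerate(ids):
        first, second = first_order if i % 2 == 0 else second_order
        schedule[pid] = [
            (first, weeks_per_treatment),
            ("washout (usual diet)", washout_weeks),
            (second, weeks_per_treatment),
        ]
    return schedule


# Ten hypothetical participants, labelled P1 to P10
people = [f"P{i}" for i in range(1, 11)]
schedule = crossover_schedule(people, ["Himalaya 292", "refined cereal"], seed=7)
for pid in people:
    print(pid, schedule[pid])
```

Because each treatment comes first for half of the participants, any lingering influence of the first treatment is spread evenly across both treatments rather than favouring one of them.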
FIGURE 7.5: Using a 'washout' period to minimize the carry-over effect

Hawthorne effect and blinding individuals

What if the patients in the Himalaya 292 study were being watched (or waited for) while defecating? Could this lead to a misleading conclusion? People often behave differently (either positively or negatively) if they know (or think) they are in a study or are being watched. This is called the Hawthorne effect.162 This could compromise the internal validity of the study.

Definition 7.3 (Hawthorne effect) The Hawthorne effect is the tendency of individuals to change their behaviour if they know (or think) they are being observed.

Example 7.11 (Hawthorne effect) People are more health-conscious if they know they will be followed up on a regular basis. For example, a study aiming to increase fruit and vegetable intake in young adults163 noted that
The impact of the Hawthorne effect can be minimized by blinding the individuals in the experiment so that they do not know:
For example, if the individuals do not know which treatment they are receiving, they cannot behave differently according to the treatment they know they are receiving. Blinding people to knowing they are involved in a study is often difficult, as ethics usually requires individuals' informed consent. Example 7.12 (Hawthorne effect) In the Himalaya 292 study, the authors report:
That is, the subjects knew they were in a study. As is usual, this was an ethics requirement (in this case, from the Ethics Committee of the CSIRO). The Hawthorne effect may influence the results. However, the subjects did not know which diet they were on:
Example 7.13 (Hawthorne effect) In an experimental study167 to compare the efficacy of a new type of toothpaste, participants were given two types of toothpaste to use (a new type, and an existing type), and evaluations of plaque remaining on the teeth were taken. The authors state that:
That is, since all participants knew they were being assessed after brushing their teeth, there may have been a tendency to brush their teeth better than usual. The authors then state:
Placebo effect and using controls

What if people thought they were on the wholegrain diet, but they weren't? Could this lead to a misleading conclusion? Perhaps surprisingly, individuals in a study may report effects of a treatment (either positive or negative), even if they have not received an active treatment. This is called the placebo effect, and it could compromise the internal validity of the study.

Definition 7.4 (Placebo effect) The placebo effect is when individuals report perceived or actual effects without having received the treatment.

Managing the placebo effect is difficult! However, the impact of the placebo effect can be minimized using a control group: units of analysis without the treatment applied, but as similar as possible in every other way to those units of analysis receiving the treatment. This allows the effect of the treatment to be assessed, over and above the placebo effect.

Definition 7.5 (Control) A control is a unit of analysis without the treatment applied (but as similar as possible in every other way to other units of analysis).

Sometimes the control group receives a placebo. A placebo is a non-effective treatment. Those who receive the placebo should be selected through random allocation when possible. Sometimes, using a placebo is unethical.

The Wikipedia entry about placebos is intriguing.

Definition 7.6 (Placebo) A placebo is a treatment with no intended effect or active ingredient.

Example 7.14 (Placebo effect) In the Himalaya 292 study, the authors report
That is, the subjects were blinded to the diet they were exposed to. However, some may think they are on the refined cereal or Himalaya diet, and respond accordingly (perhaps unconsciously).

To test the effectiveness of a new drug, patients are to report to a GP to receive injections of a new drug. We wish to compare them with people who do not get the injection. What is the control?

The controls are not just people who don't get the injections. Ideally, controls would be people who, like the treatment group, report to a GP and receive an injection; however, they receive an injection that will do nothing.

Example 7.15 (Placebo effect) Three active analgesics (pain relievers) were compared to a placebo.171 Four different coloured placebos were used. The most pain relief was experienced by those taking red placebos (Fig. 7.6), who experienced even more pain relief than those given true pain relievers.
FIGURE 7.6: Pain relief, for various pain relief medicines

Example 7.16 (Placebo effect) A study of placebos172 gave half the subjects a placebo, but told them that the pill was an expensive (implying 'very effective') pain killer ($2.50 per tablet). The other half were also given a placebo, but were told that the pill was a discount (implying 'less effective') pain killer ($0.10 per tablet). About 85% of participants in the first group reported a pain reduction, yet only 61% in the second group reported a pain reduction. Remember that both groups actually received a placebo!

Observer effect and blinding researchers

What if the researchers assessing the outcomes knew the diet allocated to each patient, and were hoping that the new diet performed better than the refined cereal diet? Could this lead to a misleading conclusion? Perhaps surprisingly, the researchers' expectations or hopes for how the new diet will perform may (unconsciously) influence how the researchers interact with the individuals, and perhaps (unconsciously) influence the behaviour of the individuals in the study. This is called the observer effect. (In experiments, it is sometimes called the experimenter effect.) This could compromise the internal validity of the study.

Definition 7.7 (Observer effect) The observer effect occurs when the researchers (unconsciously) change their behaviour to conform to expectations because they know what values of the explanatory variable apply to the individuals. This may cause the individuals to change their behaviour or reporting also.

The impact of the observer effect can be minimized by blinding the researchers so that they do not know which treatments the individuals are receiving. That is, the people giving the treatment and the people evaluating the treatment do not know what treatment has been given. Instead, a third party can be used. For example, the researchers may give an assistant two drugs labelled A and B.
The assistant then administers the drug and evaluates the participants' response to the treatments. Later, the assistant tells the researchers whether Drug A or Drug B performed better, but only the researchers know what drugs the labels A and B refer to. Example 7.17 (Observer effect) In an experimental study173 that examined the impact of an injection to alleviate post-operative umbilical pain, the authors stated:
The observer effect does not just apply to situations where people are used as participants.

Example 7.18 (Observer effect) 'Clever Hans' was a horse that seemed to be able to perform simple mental arithmetic. After much study, Carl Stumpf realised that the horse was responding to involuntary (and unconscious) cues from the trainer. This was discovered, in part, by using an experiment where the people interacting with the horse were blinded. The same effect has been observed in narcotic sniffer dogs,175 who may respond to their handlers' unconscious cues.

The observer effect is about the observer unconsciously influencing the individuals; that is, the researchers are not aware that it is occurring. If the researchers are intentionally influencing the individuals, this is called fraud.

Describing blinding

Blinding is when those involved in the study do not know information about the study. Those involved in the study may not know:
When participants are blinded to as much as possible, the internal validity of the study is increased. However, when people are the individuals, ethics requirements often mean that they need to know they are in a study, and the purpose of the study. Different individuals involved in the study can be blinded:
When as many participants are blinded as possible, the internal validity of the study is increased. If only the participants are blinded, the study is called single blind. If both the researchers and participants are blinded, the study is called double blind. If the researchers, participants and the analyst are blinded, the study is called triple blind. For clarity, we strongly recommend explicitly stating who or what is blinded. Blinding should be considered in all studies, when possible (and it is not always possible). Blinding of participants does not just apply to people; it is also relevant with animals (Example 7.18 about Clever Hans). Why might it be necessary to blind the analyst to the treatments being used? Example 7.19 (Blinding) In a study comparing chest compressions with dominant and non-dominant hands of student paramedics,176 the article states that:
Participants could not, however, be blinded to which group they were in (dominant hand on chest; non-dominant hand on chest). In this case, participants were only partially blinded. Later, the article reports that:
This means that the analyst was blinded to the treatments. Example 7.20 (Double-blinding) In a cropping study comparing yields from modern and traditional cowpea crops in Tanzania, the researchers wanted to use a double-blind study. To do so:
Design issues: Overview

In summary, issues to consider when designing a study, when possible, include:
Ways to minimize the impact of these issues have been discussed (Fig. 7.7), but doing so is not always possible. These effects are important to understand, so studies can be designed to manage or minimise their influence (to maximise internal validity). This ensures that the results and conclusions from our studies are correctly interpreted (that is, noting, for example, how the Hawthorne effect may have influenced the conclusions).

Often, however, some (or all) of these issues cannot be well managed. For instance, individuals often know they are involved in an experimental study (Hawthorne effect). In these cases, the impacts should be minimized as far as possible, and then the likely impact that these issues have on our conclusions discussed. The impact of these issues is often reported as limitations in a journal article (Chap. 9), perhaps as part of the Discussion section.

Example 7.21 (Study limitations) A study of alcohol use in college females reported these limitations of their study:
FIGURE 7.7: Design considerations. Note: Lurking variables become confounding variables when measured, observed, assessed or recorded in the study, and then they can be managed. The arrows mean that the design issue can be partially managed by the indicated means.

Example 7.22 (Study design) In a study of student paramedics comparing chest compressions with dominant and non-dominant hands,181 as discussed in Example 7.19, the participants were partially blinded: they were blinded to the purpose of the study, but not to which group they were allocated. The analyst was also blinded to the group allocations. Later, the article reports that:
This study used a number of good design features.

Summary

Designing effective experimental studies requires researchers to manage or minimise confounding where possible, by:
Well-designed experimental studies also try to manage:
Quick review questions

A study on the bruising of apples183 aimed to determine the relationship between the recorded surface temperature of the apple, and the depth of bruising. The researchers purposefully hit apples with three different forces (200, 700 and 1200 mJ) to inflict bruises. The researchers then recorded the depth of the bruising, and recorded the surface temperature at each bruise location. The study was conducted separately for three different regions of the apple (lower; middle; upper), and each apple was only used once.
Exercises

Selected answers are available in Sect. D.7.

Exercise 7.1 A scientist is comparing the effects of two types of fertiliser on the yield of tomatoes (based on Mariel Gullian Klanian et al.184). He plants tomato seedlings, and fertilises with Fertiliser I, and later measures the yield of tomatoes. He then immediately plants more tomato seedlings in the same field, and fertilises with Fertiliser II, and measures the yield of tomatoes. What potential problems can you identify with the study design?

Exercise 7.2 A scientist is expecting that tap water will taste the same as bottled water in a taste test (based on Eric Teillet et al.185). The scientist provides people with a plastic cup of either bottled or tap water, and she asks them to give a rating of the taste on a scale of 1 (terrible) to 5 (fantastic). What potential problems can you identify with the study design?

Exercise 7.3 Consider this RQ (based on Teillet et al.186):
This RQ needs some clarification, but you decide to answer this question using an experiment. How would you manage:
Exercise 7.4 In a study of time spent applying sunscreen187 the Aim was to 'determine whether time spent on sunscreen application is related to the amount of sunscreen used' (Heerfordt et al.188, p. 117). The authors state this about the study design:
When subjects do not know whether they are receiving the experimental treatment, are they said to be blind?
Yes: the subjects are said to be blinded, and a study in which only the participants are blinded is called single blind. A double-blind study is one in which neither the participants nor the experimenters know who is receiving a particular treatment. This procedure is utilized to prevent bias in research results. Double-blind studies are particularly useful for preventing bias due to demand characteristics or the placebo effect.
What is it called when neither the participants nor the researchers know who is in the experimental group and who is in the control group?
This is called a double-blind study: a type of clinical trial in which neither the participants nor the researcher knows which treatment or intervention participants are receiving until the clinical trial is over. This makes results of the study less likely to be biased.
How important is it for the researcher to identify the type of variables used in the study?
Variables are important to understand because they are the basic units of the information studied and interpreted in research studies. Researchers carefully analyze and interpret the value(s) of each variable to make sense of how things relate to each other in a descriptive study or what has happened in an experiment.
What is internal validity of research?
Internal validity is defined as the extent to which the observed results represent the truth in the population we are studying and, thus, are not due to methodological errors.