Random Assignment in Psychology: Definition & Examples

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently pursuing a Master's Degree in Counseling for Mental Health and Wellness, which she began in September 2023. Julia's research has been published in peer-reviewed journals.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

In psychology, random assignment refers to the practice of allocating participants to different experimental groups in a study in a completely unbiased way, ensuring each participant has an equal chance of being assigned to any group.

In experimental research, random assignment, or random placement, organizes participants from your sample into different groups using randomization. 

Random assignment uses chance procedures to ensure that each participant has an equal opportunity of being assigned to either a control or experimental group.

The control group does not receive the treatment in question, whereas the experimental group does receive the treatment.

When using random assignment, neither the researcher nor the participant can choose the group to which the participant is assigned. This ensures that any differences between and within the groups are not systematic at the onset of the study. 

In a study to test the success of a weight-loss program, investigators randomly assigned a pool of participants to one of two groups.

Group A participants participated in the weight-loss program for 10 weeks and took a class where they learned about the benefits of healthy eating and exercise.

Group B participants read a 200-page book that explains the benefits of weight loss.

The researchers found that those who participated in the program and took the class were more likely to lose weight than those in the other group that received only the book.

Importance 

Random assignment ensures that each group in the experiment is identical before applying the independent variable.

In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable while controlling for other variables. Random assignment increases the likelihood that the treatment groups are the same at the onset of a study.

Thus, any changes that result from the independent variable can be assumed to be a result of the treatment of interest. This is particularly important for eliminating sources of bias and strengthening the internal validity of an experiment.

Random assignment is the best method for inferring a causal relationship between a treatment and an outcome.

Random Selection vs. Random Assignment 

Random selection (also called probability sampling or random sampling) is a way of randomly selecting members of a population to be included in your study.

On the other hand, random assignment is a way of sorting the sample participants into control and treatment groups. 

Random selection ensures that everyone in the population has an equal chance of being selected for the study. Once the pool of participants has been chosen, experimenters use random assignment to assign participants into groups. 

Random assignment is only used in between-subjects experimental designs, while random selection can be used in a variety of study designs.
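To make the distinction concrete, here is a minimal Python sketch (using a hypothetical numbered population) that performs random selection first and random assignment second:

```python
import random

random.seed(42)  # for reproducibility of this sketch

# Hypothetical population of 1,000 people, identified by number
population = list(range(1000))

# Random selection: every member of the population has an equal
# chance of ending up in the sample
sample = random.sample(population, 20)

# Random assignment: shuffle the selected sample, then split it
# into a control group and an experimental group
random.shuffle(sample)
control, experimental = sample[:10], sample[10:]
```

The two steps are independent: a study could recruit a convenience sample (no random selection) and still randomly assign it, or randomly select a sample for a survey that involves no assignment at all.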

Random Assignment vs Random Sampling

Random sampling refers to selecting participants from a population so that each individual has an equal chance of being chosen. This method enhances the representativeness of the sample.

Random assignment, on the other hand, is used in experimental designs once participants are selected. It involves allocating these participants to different experimental groups or conditions randomly.

This helps ensure that any differences in results across groups are due to manipulating the independent variable, not preexisting differences among participants.

When to Use Random Assignment

Random assignment is used in experiments with a between-groups or independent measures design.

In these research designs, researchers will manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables.

There is usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable at the onset of the study.

How to Use Random Assignment

There are a variety of ways to assign participants into study groups randomly. Here are a handful of popular methods: 

  • Random Number Generator: Give each member of the sample a unique number; use a computer program to randomly generate a number from the list for each group.
  • Lottery: Give each member of the sample a unique number. Place all numbers in a hat or bucket and draw numbers at random for each group.
  • Flipping a Coin: Flip a coin for each participant to decide if they will be in the control group or experimental group (this method can only be used when you have just two groups).
  • Roll a Die: For each person on the list, roll a die to decide which group they will be in. For example, rolling 1, 2, or 3 places them in the control group, and rolling 4, 5, or 6 places them in the experimental group.
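Two of the methods above can be sketched in a few lines of Python; the participant IDs are hypothetical:

```python
import random

random.seed(1)
participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical IDs

# Random number generator / lottery in software: shuffle and split,
# which guarantees two equal-sized groups
shuffled = participants[:]
random.shuffle(shuffled)
half = len(shuffled) // 2
control, experimental = shuffled[:half], shuffled[half:]

# Coin flip: each participant is assigned independently, so group
# sizes can differ by chance
coin_assignment = {
    p: "experimental" if random.random() < 0.5 else "control"
    for p in participants
}
```

Note the trade-off: shuffling guarantees balanced group sizes, while per-participant coin flips do not.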

When is Random Assignment not used?

  • When it is not ethically permissible: Randomization is only ethical if the researcher has no evidence that one treatment is superior to the other or that one treatment might have harmful side effects. 
  • When answering non-causal questions: If the researcher is just interested in predicting the probability of an event, the causal relationship between the variables is not important and observational designs would be more suitable than random assignment. 
  • When studying the effect of variables that cannot be manipulated: Some risk factors cannot be manipulated and so it would not make any sense to study them in a randomized trial. For example, we cannot randomly assign participants into categories based on age, gender, or genetic factors.

Drawbacks of Random Assignment

While randomization ensures an unbiased assignment of participants to groups, it does not guarantee that the groups will be equal. Extraneous variables may still differ between groups, and group differences can arise purely by chance.

Thus, researchers cannot produce perfectly equal groups for any specific study. Differences between the treatment group and control group might still exist, and the results of a randomized trial may sometimes be wrong, but this is an accepted limitation of any single study.

Scientific evidence is a long and continuous process, and the groups will tend to be equal in the long run when data is aggregated in a meta-analysis.

Additionally, external validity (i.e., the extent to which the researcher can use the results of the study to generalize to the larger population) is compromised with random assignment.

Random assignment is challenging to implement outside of controlled laboratory conditions and might not represent what would happen in the real world at the population level. 

Random assignment can also be more costly than simple observational studies, where an investigator is just observing events without intervening with the population.

Randomization also can be time-consuming and challenging, especially when participants refuse to receive the assigned treatment or do not adhere to recommendations. 

What is the difference between random sampling and random assignment?

Random sampling refers to randomly selecting a sample of participants from a population. Random assignment refers to randomly assigning participants to treatment groups from the selected sample.

Does random assignment increase internal validity?

Yes, random assignment ensures that there are no systematic differences between the participants in each group, enhancing the study's internal validity.

Does random assignment reduce sampling error?

Yes, with random assignment, participants have an equal chance of being assigned to either a control group or an experimental group, resulting in a sample that is, in theory, representative of the population.

Random assignment does not completely eliminate sampling error because a sample only approximates the population from which it is drawn. However, random sampling is a way to minimize sampling errors. 

When is random assignment not possible?

Random assignment is not possible when the experimenters cannot control the treatment or independent variable.

For example, if you want to compare how men and women perform on a test, you cannot randomly assign subjects to these groups.

Participants are not randomly assigned to different groups in this study, but instead assigned based on their characteristics.

Does random assignment eliminate confounding variables?

In expectation, yes. Random assignment distributes confounding variables at random among the study groups, breaking any systematic relationship between a confounding variable and the treatment. It does not guarantee perfect balance in any single study, but it removes systematic confounding on average.

Why is random assignment of participants to treatment conditions in an experiment used?

Random assignment is used to ensure that all groups are comparable at the start of a study. This allows researchers to conclude that the outcomes of the study can be attributed to the intervention at hand and to rule out alternative explanations for study results.

Further Reading

  • Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295-328.
  • Krause, M. S., & Howard, K. I. (2003). What random assignment does and does not do. Journal of Clinical Psychology, 59(7), 751-766.




Statistics LibreTexts

1.3: Threats to Internal Validity and Different Control Techniques


  • Yang Lydia Yang
  • Kansas State University


Internal validity is often the focus from a research design perspective. To understand the pros and cons of various designs and to be able to better judge specific designs, we identify specific threats to internal validity. Before we do so, it is important to note that the primary challenge to establishing internal validity in social sciences is the fact that most of the phenomena we care about have multiple causes and are often a result of some complex set of interactions. For example, X may be only a partial cause of Y, or X may cause Y, but only when Z is present. Multiple causation and interactive effects make it very difficult to demonstrate causality. Turning now to more specific threats, Figure 1.3.1 below identifies common threats to internal validity.

Figure \(\PageIndex{1}\): Common Threats to Internal Validity

| Threat | Description |
|---|---|
| History | Any event that occurs while the experiment is in progress might be an alternative explanation; using a control group mitigates this concern. |
| Maturation | Normal changes over time (e.g., fatigue or aging) might affect the dependent variable; using a control group mitigates this concern. |
| Selection Bias | If randomization is not used to assign participants, the groups may not be equivalent. |
| Experimental Mortality | If groups lose participants (e.g., due to dropping out of the experiment), they may not be equivalent. |
| Testing | A pre-test may confound the influence of the experimental treatment; using a control group mitigates this concern. |
| Instrumentation | Changes or differences in the measurement process might alternatively account for differences. |
| Statistical Regression | The natural tendency for extreme scores to regress, or move toward, the mean. |

Different Control Techniques

All of the common threats mentioned above can introduce extraneous variables into your research design, which can confound your research findings. In other words, we won't be able to tell whether it is the independent variable (i.e., the treatment we give participants) or the extraneous variable that causes the changes in the dependent variable. Controlling for extraneous variables reduces their threat to the research design and gives us a better chance to claim that the independent variable causes the changes in the dependent variable, i.e., internal validity. There are several techniques we can use to control for extraneous variables.

Random assignment

Random assignment is the single most powerful control technique we can use to minimize the potential threats of confounding variables in research design. As we saw in Dunn and her colleagues' study earlier, participants are not allowed to self-select into either condition (spend $20 on self or spend on others). Instead, they are randomly assigned to a group by the researcher(s). By doing so, the two groups are likely to be similar on all factors except the independent variable itself. One confounding variable mentioned earlier is whether individuals had a happy childhood to begin with. With random assignment, those who had a happy childhood are likely to end up in each condition group, and likewise for those who did not. As a consequence, we can expect the two condition groups to be very similar on this confounding variable. Applying the same logic, random assignment minimizes all potential confounding variables (assuming your sample size is large enough!). When the only difference between the two groups is the condition participants are assigned to, i.e., the independent variable, we can confidently infer that the independent variable causes the differences in the dependent variable.

It is critical to emphasize that random assignment is the only control technique that controls for both known and unknown confounding variables. With all of the other control techniques mentioned below, we must first know what the confounding variable is before we can control for it. Random assignment does not require this. With the simple act of randomly assigning participants to different conditions, we take care of both the confounding variables we know of and the ones we don't even know about that could threaten the internal validity of our studies. As the saying goes, "what you don't know will hurt you." Random assignment takes care of it.

Matching is another technique we can use to control for extraneous variables. We must first identify the extraneous variable that can potentially confound the research design. Then we rank-order the participants on this extraneous variable, listing them in ascending or descending order. Participants who are similar on the extraneous variable are then placed into different treatment groups; in other words, they are "matched" on the extraneous variable. Then we carry out the intervention/treatment as usual. If the treatment groups do show differences on the dependent variable, we know it is not due to the extraneous variable, because participants are "matched," or equivalent, on it. Rather, it is more likely due to the independent variable (i.e., the treatments).

Consider the example above (self-spending vs. other-spending on happiness) with the same extraneous variable: whether individuals had a happy childhood to begin with. Once we identify this extraneous variable, we need to collect data from the participants measuring how happy their childhood was. Sometimes, data on the extraneous variable may already be available (for example, if you want to examine the effect of different types of tutoring on students' performance in a Calculus I course and plan to match them on college entrance test scores, which the Admissions Office has already collected). In either case, obtaining data on the identified extraneous variable is a typical step before matching. Returning to childhood happiness: once we have the data, we sort it in a particular order, for example, from the highest score (participants reporting the happiest childhood) to the lowest score (participants reporting the least happy childhood). We then match the participants with the highest levels of childhood happiness and place them into different treatment groups. We move down the scale, matching participants with relatively high levels of childhood happiness and placing them into different treatment groups, and repeat in descending order until we match the participants with the lowest levels of childhood happiness and place them into different treatment groups. By the end, each treatment group contains participants with a full range of levels of childhood happiness (which is a strength, considering the variation and the representativeness of the sample). The two treatment groups will be similar, or equivalent, on this extraneous variable. If the treatments, self-spending vs. other-spending, eventually show differences in individual happiness, we know it is not due to how happy participants' childhoods were, and we can be more confident it is due to the independent variable.
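The matching procedure described above can be sketched in Python; the happiness scores and participant IDs are hypothetical stand-ins for data you would collect before the study:

```python
import random

random.seed(7)

# Hypothetical childhood-happiness scores (1-100) collected
# from 20 participants before the study
scores = {f"P{i:02d}": random.randint(1, 100) for i in range(1, 21)}

# Rank-order participants from happiest to least happy childhood
ranked = sorted(scores, key=scores.get, reverse=True)

# Walk down the ranking two at a time; randomly split each matched
# pair between the two treatment groups
group_a, group_b = [], []
for i in range(0, len(ranked), 2):
    pair = ranked[i:i + 2]
    random.shuffle(pair)
    group_a.append(pair[0])
    group_b.append(pair[1])
```

Because each adjacent pair is split between the groups, both groups end up spanning the full range of childhood-happiness scores.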

You may be thinking: but wait, we have only taken care of one extraneous variable; what about the others? Good thinking; that's exactly correct. We mentioned several extraneous variables but have matched participants on only one. This is the main limitation of matching. You can match participants on more than one extraneous variable, but it becomes cumbersome, if not impossible, to match them on 10 or 20. More importantly, the more variables we try to match participants on, the less likely we are to find close matches. In other words, it may be easy to match participants on one particular extraneous variable (a similar level of childhood happiness), but it is much harder to find participants who are similar on 10 different extraneous variables at once.

Holding Extraneous Variable Constant

The technique of holding the extraneous variable constant is self-explanatory: we use participants at only one level of the extraneous variable, in other words, holding it constant. Using the same example, suppose we only want to study participants with low levels of childhood happiness. We go through the same steps as in matching: identifying the extraneous variable that can potentially confound the research design and collecting data on it. Once we have the childhood happiness scores, we include only participants on the lower end of the scale, then place them into different treatment groups and carry out the study as before. If the condition groups, self-spending vs. other-spending, eventually show differences in individual happiness, we know it is not due to how happy participants' childhoods were (since we included only those on the lower end of childhood happiness). We can be more confident it is due to the independent variable.

As with matching, we have to do this one extraneous variable at a time, and the more extraneous variables we try to hold constant, the more difficult it gets. The other limitation is that by holding an extraneous variable constant, we exclude a big chunk of participants; in this case, anyone who is not low on childhood happiness. This is a major weakness: by reducing the variability across the spectrum of childhood happiness levels, we decrease the representativeness of the sample, and generalizability suffers.
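As a sketch, holding an extraneous variable constant amounts to filtering the sample before assignment; the scores, IDs, and the cutoff of 40 below are all hypothetical:

```python
# Hypothetical childhood-happiness scores (1-100) measured beforehand
scores = {"P01": 12, "P02": 85, "P03": 20, "P04": 90,
          "P05": 15, "P06": 30, "P07": 75, "P08": 18}

# Hold the extraneous variable constant: keep only participants on
# the low end of the scale (below a hypothetical cutoff of 40)
eligible = [p for p, s in scores.items() if s < 40]

# Only these participants go on to random assignment; everyone
# above the cutoff is excluded, which is the technique's main cost
```

The filtering step makes the trade-off visible: three of the eight hypothetical participants are discarded before the study even begins.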

Building Extraneous Variables into Design

The last control technique, building extraneous variables into the research design, is widely used. As the name suggests, we identify the extraneous variable that can potentially confound the research design and include it in the design by treating it as an additional independent variable. This technique addresses the limitation of the previous one, holding the extraneous variable constant: we no longer need to exclude participants based on where they stand on the extraneous variable(s), and can instead include participants with a wide range of levels on them. You can also include multiple extraneous variables in the design at once. However, the more variables you include, the larger the sample size required for the statistical analyses, which may be difficult to obtain due to limitations of time, staff, cost, access, etc.
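One common way to build an extraneous variable into the design is to treat it as a factor and randomize within each of its levels; the following Python sketch (hypothetical IDs, levels, and condition names) illustrates the idea:

```python
import random

random.seed(3)

# Hypothetical participants, each measured on the extraneous
# variable (childhood happiness: "low" or "high")
participants = {f"P{i:02d}": random.choice(["low", "high"])
                for i in range(1, 17)}

# Treat the extraneous variable as a factor in the design and
# randomly assign to spending condition *within* each level
design = {}
for level in ("low", "high"):
    block = [p for p, lvl in participants.items() if lvl == level]
    random.shuffle(block)
    half = len(block) // 2
    for p in block[:half]:
        design[p] = (level, "self-spending")
    for p in block[half:]:
        design[p] = (level, "other-spending")
```

Each participant ends up in one cell of a 2 x 2 layout (happiness level x spending condition), so the extraneous variable's effect can be analyzed rather than merely excluded.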


Internal Validity – Threats, Examples and Guide


Internal Validity


Definition:

Internal validity refers to the extent to which a research study accurately establishes a cause-and-effect relationship between the independent variable(s) and the dependent variable(s) being investigated. It assesses whether the observed changes in the dependent variable(s) are actually caused by the manipulation of the independent variable(s) rather than by other extraneous factors.

How to Increase Internal Validity

To enhance internal validity, researchers need to carefully design and conduct their studies. Here are some considerations for improving internal validity:

  • Random Assignment: Use random assignment to allocate participants to different groups in experimental studies. Random assignment helps ensure that the groups are comparable, minimizing the influence of individual differences on the results.
  • Control Group: Include a control group in experimental studies. This group should be similar to the experimental group but not exposed to the treatment or intervention being tested. The control group helps establish a baseline against which the effects of the treatment can be compared.
  • Control Extraneous Variables: Identify and control for extraneous variables that could potentially influence the relationship being studied. This can be achieved through techniques like matching participants, using homogeneous samples, or statistically controlling for the variables.
  • Standardized Procedures: Use standardized procedures and protocols across all participants and conditions. This helps ensure consistency in the administration of the study, reducing the potential for systematic biases.
  • Counterbalancing: In studies with multiple conditions or treatment sequences, employ counterbalancing techniques. This involves systematically varying the order of conditions or treatments across participants to eliminate any potential order effects.
  • Minimize Experimenter Bias: Take steps to minimize experimenter bias or expectancy effects. These biases can inadvertently influence the behavior of participants or the interpretation of results. Using blind or double-blind procedures, where the experimenter is unaware of the conditions or group assignments, can help mitigate these biases.
  • Use Reliable and Valid Measures: Ensure that the measures used in the study are reliable and valid. Reliable measures yield consistent results, while valid measures accurately assess the construct being measured.
  • Pilot Testing: Conduct pilot testing before the main study to refine the study design and procedures. Pilot testing helps identify potential issues, such as unclear instructions or unforeseen confounds, and allows for necessary adjustments to enhance internal validity.
  • Sample Size: Increase the sample size to improve statistical power and reduce the likelihood of random variation influencing the results. Adequate sample sizes increase the generalizability and reliability of the findings.
  • Researcher Bias: Researchers need to be aware of their own biases and take steps to minimize their impact on the study. This can be done through careful experimental design, blind data collection and analysis, and the use of standardized protocols.

Threats To Internal Validity

Several threats can undermine internal validity and compromise the validity of research findings. Here are some common threats to internal validity:

History

Events or circumstances that occur during the course of a study and affect the outcome, making it difficult to attribute the results solely to the treatment or intervention being studied.

Maturation

Changes that naturally occur in participants over time, such as physical or psychological development, which can influence the results independently of the treatment or intervention.

Testing Effects

The act of being tested or measured on a particular variable in an initial assessment may influence participants’ subsequent responses. This effect can arise due to familiarity with the test or increased sensitization to the topic being studied.

Instrumentation

Changes or inconsistencies in the measurement tools or procedures used across different stages or conditions of the study. If the measurement methods are not standardized or if there are variations in the administration of tests, it can lead to measurement errors and threaten internal validity.

Selection Bias

When there are systematic differences between the characteristics of individuals selected for different groups or conditions in a study. If participants are not randomly assigned to groups or conditions, the results may be influenced by pre-existing differences rather than the treatment itself.

Attrition or Dropout

The loss of participants from a study over time can introduce bias if those who drop out differ systematically from those who remain. The characteristics of participants who drop out may affect the outcomes and compromise internal validity.

Regression to The Mean

The tendency for extreme scores on a variable to move closer to the average on subsequent measurements. If participants are selected based on extreme scores, their scores are likely to regress toward the mean in subsequent measurements, leading to erroneous conclusions about the effectiveness of a treatment.

Diffusion of Treatment

When participants in one group of a study receive knowledge or benefits from participants in another group, it can dilute the treatment effect and compromise internal validity. This can occur through communication or sharing of information among participants.

Demand Characteristics

Cues or expectations within a study that may influence participants to respond in a certain way or guess the purpose of the research. Participants may modify their behavior to align with perceived expectations, leading to biased results.

Experimenter Bias

Biases or expectations on the part of the researchers that may unintentionally influence the study’s outcomes. Researchers’ behavior, interactions, or inadvertent cues can impact participants’ responses, introducing bias and threatening internal validity.

Types of Internal Validity

There are several types of internal validity that researchers consider when designing and conducting studies. Here are some common types of internal validity:

Construct validity

Refers to the extent to which the operational definitions of the variables used in the study accurately represent the theoretical concepts they are intended to measure. It ensures that the measurements or manipulations used in the study accurately reflect the intended constructs.

Statistical Conclusion Validity

Relates to the degree to which the statistical analysis accurately reflects the relationships between variables. It involves ensuring that the appropriate statistical tests are used, the data is analyzed correctly, and the reported findings are reliable.

Internal Validity of Causal Inferences

Focuses on establishing a cause-and-effect relationship between the independent variable (treatment or intervention) and the dependent variable (outcome or response variable). It involves eliminating alternative explanations or confounding factors that could account for the observed relationship.

Temporal Precedence

Ensures that the cause (independent variable) precedes the effect (dependent variable) in time. It establishes the temporal sequence necessary for making causal claims.

Covariation

Refers to the presence of a relationship or association between the independent variable and the dependent variable. It ensures that changes in the independent variable are accompanied by corresponding changes in the dependent variable.

Elimination of Confounding Variables

Involves controlling for and minimizing the influence of extraneous variables that could affect the relationship between the independent and dependent variables. It helps isolate the true effect of the independent variable on the dependent variable.

Selection Bias Control

Ensures that the process of assigning participants to different groups or conditions (randomization) is unbiased. Random assignment helps create equivalent groups, reducing the influence of participant characteristics on the dependent variable.

Controlling for Testing Effects

Involves minimizing the impact of repeated testing or measurement on participants’ responses. Counterbalancing, using control groups, or employing appropriate time intervals between assessments can help control for testing effects.

Controlling for Experimenter Effects

Aims to minimize the influence of the experimenter on participants’ responses. Blinding, using standardized protocols, or automating data collection processes can reduce the potential for experimenter bias.

Replication

Conducting the study multiple times with different samples or settings to verify the consistency and generalizability of the findings. Replication enhances internal validity by ensuring that the observed effects are not due to chance or specific characteristics of the study sample.

Internal Validity Examples

Here are some real-world examples that illustrate internal validity:

Drug Trial: A pharmaceutical company conducts a clinical trial to test the effectiveness of a new medication for treating a specific disease. The study uses a randomized controlled design, where participants are randomly assigned to receive either the medication or a placebo. The internal validity is high because the random assignment helps ensure that any observed differences between the groups can be attributed to the medication rather than other factors.

Education Intervention: A researcher investigates the impact of a new teaching method on student performance in mathematics. The researcher selects two comparable groups of students from the same school and randomly assigns one group to receive the new teaching method while the other group continues with the traditional method. By controlling for factors such as the school environment and student characteristics, the study enhances internal validity by isolating the effects of the teaching method.

Psychological Experiment: A psychologist conducts an experiment to examine the relationship between sleep deprivation and cognitive performance. Participants are randomly assigned to either a sleep-deprived group or a control group. The internal validity is strengthened by manipulating the independent variable (amount of sleep) and controlling for other variables that could influence cognitive performance, such as age, gender, and prior sleep habits.

Quasi-Experimental Study: A researcher investigates the impact of a new traffic law on accident rates in a specific city. Since random assignment is not feasible, the researcher selects two similar neighborhoods: one where the law is implemented and another where it is not. By comparing accident rates before and after the law’s implementation in both areas, the study attempts to establish a causal relationship while acknowledging potential confounding variables, such as driver behavior or road conditions.

Workplace Training Program: An organization introduces a new training program aimed at improving employee productivity. To assess the effectiveness of the program, the company implements a pre-post design where performance metrics are measured before and after the training. By tracking changes in productivity within the same group of employees, the study attempts to attribute any improvements to the training program while controlling for individual differences.

Applications of Internal Validity

Internal validity is a crucial concept in research design and is applicable across various fields of study. Here are some applications of internal validity:

Experimental Research

Internal validity is particularly important in experimental research, where researchers manipulate independent variables to determine their effects on dependent variables. By ensuring strong internal validity, researchers can confidently attribute any observed changes in the dependent variable to the manipulation of the independent variable, establishing a cause-and-effect relationship.

Quasi-experimental Research

Quasi-experimental studies aim to establish causal relationships but lack random assignment to groups. Internal validity becomes crucial in such designs to minimize alternative explanations for the observed effects. Careful selection and control of potential confounding variables help strengthen internal validity in quasi-experimental research.

Observational Studies

While observational studies may not involve experimental manipulation, internal validity is still relevant. Researchers need to identify and control for confounding variables to establish a relationship between variables of interest and rule out alternative explanations for observed associations.

Program Evaluation

Internal validity is essential in evaluating the effectiveness of interventions, programs, or policies. By designing rigorous evaluation studies with strong internal validity, researchers can determine whether the observed outcomes can be attributed to the specific intervention or program being evaluated.

Clinical Trials

Internal validity is critical in clinical trials to determine the effectiveness of new treatments or therapies. Well-designed randomized controlled trials (RCTs) with strong internal validity can provide reliable evidence on the efficacy of interventions and guide clinical decision-making.

Longitudinal Studies

Longitudinal studies track participants over an extended period to examine changes and establish causal relationships. Maintaining internal validity throughout the study helps ensure that observed changes in the dependent variable(s) are indeed caused by the independent variable(s) under investigation and not other factors.

Psychology and Social Sciences

Internal validity is pertinent in psychological and social science research. Researchers aim to understand human behavior and social phenomena, and establishing strong internal validity allows them to draw accurate conclusions about the causal relationships between variables.

Advantages of Internal Validity

Internal validity is essential in research for several reasons. Here are some of the advantages of having high internal validity in a study:

  • Causal Inference: Internal validity allows researchers to make valid causal inferences. When a study has high internal validity, it establishes a cause-and-effect relationship between the independent variable (treatment or intervention) and the dependent variable (outcome). This provides confidence that changes in the dependent variable are genuinely due to the manipulation of the independent variable.
  • Elimination of Confounding Factors: High internal validity helps eliminate or control confounding factors that could influence the relationship being studied. By systematically accounting for potential confounds, researchers can attribute the observed effects to the intended independent variable rather than extraneous variables.
  • Accuracy of Measurements: Internal validity ensures accurate and reliable measurements. Researchers employ rigorous methods to measure variables, reducing measurement errors and increasing the validity and precision of the data collected.
  • Replicability and Generalizability: Studies with high internal validity are more likely to yield consistent results when replicated by other researchers. This is important for the advancement of scientific knowledge, as replication strengthens the validity of findings and allows for the generalizability of results across different populations and settings.
  • Intervention Effectiveness: High internal validity helps determine the effectiveness of interventions or treatments. By controlling for confounding factors and utilizing robust research designs, researchers can accurately assess whether an intervention produces the desired outcomes or effects.
  • Enhanced Decision-making: Studies with high internal validity provide a solid basis for decision-making. Policymakers, practitioners, and professionals can rely on research with high internal validity to make informed decisions about the implementation of interventions or treatments in real-world settings.
  • Validity of Theory Development: Internal validity contributes to the development and refinement of theories. By establishing strong cause-and-effect relationships, researchers can build and test theories, enhancing our understanding of underlying mechanisms and contributing to theoretical advancements.
  • Scientific Credibility: Research with high internal validity enhances the overall credibility of the scientific field. Studies that prioritize internal validity uphold the rigorous standards of scientific inquiry and contribute to the accumulation of reliable knowledge.

Limitations of Internal Validity

While internal validity is crucial for research, it is important to recognize its limitations. Here are some limitations or considerations associated with internal validity:

  • Artificial Experimental Settings: Research studies with high internal validity often take place in controlled laboratory settings. While this allows for rigorous control over variables, it may limit the generalizability of the findings to real-world settings. The controlled environment may not fully capture the complexity and variability of natural settings, potentially affecting the external validity of the study.
  • Demand Characteristics and Experimenter Effects: Participants in a study may behave differently due to demand characteristics or their awareness of being in a research setting. They might alter their behavior to align with their perceptions of the expected or desired responses, which can introduce bias and compromise internal validity. Similarly, experimenter effects, such as unintentional cues or biases conveyed by the researcher, can influence participant responses and affect internal validity.
  • Selection Bias: The process of selecting participants for a study may introduce biases and limit the generalizability of the findings. For example, if participants are not randomly selected or if they self-select into the study, the sample may not represent the larger population, impacting both internal and external validity.
  • Reactive or Interactive Effects: Participants’ awareness of being observed or their exposure to the experimental manipulation may elicit reactive or interactive effects. These effects can influence their behavior, leading to artificial responses that may not be representative of their natural behavior in real-world situations.
  • Limited Sample Characteristics: The characteristics of the sample used in a study can affect internal validity. If the sample is not diverse or representative of the population of interest, it can limit the generalizability of the findings. Additionally, small sample sizes may reduce statistical power and increase the likelihood of chance findings.
  • Time-related Factors: Internal validity can be influenced by factors related to the timing of the study. For example, the immediate effects observed in a short-term study may not reflect the long-term effects of an intervention. Additionally, history or maturation effects occurring during the course of the study may confound the relationship being studied.
  • Exclusion of Complex Variables: To establish internal validity, researchers often simplify the research design by focusing on a limited number of variables. While this allows for controlled experimentation, it may neglect the complex interactions and multiple factors that exist in real-world situations. This limitation can impact the ecological validity and external validity of the findings.
  • Publication Bias: Publication bias occurs when studies with significant or positive results are more likely to be published, while studies with null or negative results remain unpublished or overlooked. This bias can distort the body of evidence and compromise the overall internal validity of the research field.

Also see: Validity

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


Statistical Thinking: A Simulation Approach to Modeling Uncertainty

Internal validity evidence and random assignment.

Medical researchers may be interested in showing that a drug helps improve people’s health (the cause of improvement is the drug), while educational researchers may be interested in showing a curricular innovation improves students’ learning (the curricular innovation causes improved learning). To attribute a causal relationship, there are three criteria a researcher needs to establish:

  • Temporal Precedence: The cause needs to happen BEFORE the effect.
  • Covariation of the Cause and Effect: There needs to be a correlational relationship between the cause and effect.
  • No Plausible Alternative Explanations: ALL other possible explanations for the effect need to be ruled out.

Because of this third criterion, attributing a cause-and-effect relationship is very difficult. (You can read more about each of these criteria at the Web Center for Social Research Methods.)

Experimental studies have their strength in meeting this third criterion. To rule out ALL other possible explanations for the effect, the control group and the treatment group need to be “identical” with respect to every possible characteristic (aside from the treatment) that could explain differences. This way the only characteristic that differs is that the treatment group gets the treatment and the control group doesn’t. If there are differences in the outcome, then they must be attributable to the treatment, because the other possible explanations have been ruled out.

So, the key is to make the control and treatment groups “identical” when you are forming them. One thing that makes this task (slightly) easier is that they don’t have to be exactly identical, only probabilistically equivalent. This means, for example, that if you were matching groups on age, you wouldn’t need the two groups to have identical age distributions; they would only need to have roughly the same AVERAGE age. Here roughly means “the average ages should be the same within what we expect because of sampling error.”

Now we just need to create the groups so that they have, on average, the same characteristics … for EVERY POSSIBLE CHARACTERISTIC that could explain differences in the outcome. Zoinks!¹³

It turns out that creating probabilistically equivalent groups is a really difficult problem. One method that works pretty well for doing this is to randomly assign participants to the groups. This works best when you have large sample sizes, but even with small sample sizes random assignment has the advantage of at least removing the systematic bias between the two groups (any differences are due to chance and will probably even out between the groups). As Wikipedia’s page on random assignment points out,

Random assignment of participants helps to ensure that any differences between and within the groups are not systematic at the outset of the experiment. Thus, any differences between groups recorded at the end of the experiment can be more confidently attributed to the experimental procedures or treatment. … Random assignment does not guarantee that the groups are matched or equivalent. The groups may still differ on some preexisting attribute due to chance. The use of random assignment cannot eliminate this possibility, but it greatly reduces it.
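The idea that random assignment makes groups probabilistically equivalent can be illustrated with a short simulation. This is a toy sketch: the participant pool, the age range, and the sample size are all invented for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical participant pool: 200 adults with ages drawn at random.
ages = [random.randint(18, 65) for _ in range(200)]

# Random assignment: shuffle the pool, then split it in half.
shuffled = ages[:]
random.shuffle(shuffled)
treatment, control = shuffled[:100], shuffled[100:]

# No individual matching was done, yet the group means should agree
# within ordinary sampling error -- probabilistic equivalence.
diff = abs(statistics.mean(treatment) - statistics.mean(control))
print(f"treatment mean: {statistics.mean(treatment):.1f}")
print(f"control mean:   {statistics.mean(control):.1f}")
print(f"difference:     {diff:.1f}")
```

Running this repeatedly with different seeds shows the average ages landing close together almost every time, even though any single characteristic can still differ somewhat by chance, exactly as the quoted passage warns.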

Internal validity is the degree to which cause-and-effect inferences are accurate and meaningful. Causal attribution is the goal for many researchers. Thus, by using random assignment we have a pretty high degree of evidence for internal validity; we have a much higher belief in causal inferences. Much like evidence used in a court of law, it is useful to think about validity evidence on a continuum. We will visualize this continuum as a barometer. For example, a barometer visualizing the internal validity evidence for a study that employed random assignment in the design might be:

[Figure: a barometer visualizing internal validity evidence, with the needle in the upper third of the scale for a study that used random assignment.]

The degree of internal validity evidence is high (in the upper-third). How high depends on other factors such as sample size.

To learn more about random assignment, you can read the following:

  • The research report, Random Assignment Evaluation Studies: A Guide for Out-of-School Time Program Practitioners

According to Wiktionary, the earliest usage of the word “zoinks” was by Norville “Shaggy” Rogers on the show Scooby-Doo. ↩︎



The conflict between random assignment and treatment preference: implications for internal validity

Affiliation.

  • 1 Center for Psychiatric Rehabilitation, University of Chicago, 7230 Arbor Drive, Tinley Park, IL 60477, USA.
  • PMID: 24011479
  • DOI: 10.1016/S0149-7189(03)00014-4

The gold standard for most clinical and services outcome studies is random assignment to treatment condition because this kind of design diminishes many threats to internal validity. Although we agree with the power of randomized clinical trials, we argue in this paper that random assignment raises other, unanticipated threats to internal validity as a result of failing to consider treatment preference in research participant behavior. Treatment preference arises from an individual's knowledge and appraisal of treatment options. Treatment preferences impact: (1) the recruitment phase because people consider whether they want to participate in a study that involves the possibility of receiving an undesirable treatment or waiting for treatment, (2) degree of engagement in the intervention condition, and (3) attrition from the study. The benefits and limitations of research strategies that augment randomization while respecting treatment preference are reviewed including: approaches that enhance enrollment and engagement; pilot testing assumptions about randomization; and partially randomized clinical trials.


Difference between Random Selection and Random Assignment

Random selection and random assignment are commonly confused or used interchangeably, though the terms refer to entirely different processes.  Random selection refers to how sample members (study participants) are selected from the population for inclusion in the study.  Random assignment is an aspect of experimental design in which study participants are assigned to the treatment or control group using a random procedure.

Random selection requires the use of some form of random sampling (such as stratified random sampling, in which the population is sorted into groups from which sample members are chosen randomly).  Random sampling is a probability sampling method, meaning that it relies on the laws of probability to select a sample that can be used to make inference to the population; this is the basis of statistical tests of significance.


Random assignment takes place following the selection of participants for the study.  In a true experiment, all study participants are randomly assigned either to receive the treatment (also known as the stimulus or intervention) or to act as a control in the study (meaning they do not receive the treatment).  Although random assignment is a simple procedure (it can be accomplished by the flip of a coin), it can be challenging to implement outside of controlled laboratory conditions.

A study can use both, only one, or neither.  Here are some examples to illustrate each situation:

A researcher gets a list of all students enrolled at a particular school (the population).  Using a random number generator, the researcher selects 100 students from the school to participate in the study (the random sample).  All students’ names are placed in a hat and 50 are chosen to receive the intervention (the treatment group), while the remaining 50 students serve as the control group.  This design uses both random selection and random assignment.
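The design in this example can be sketched in a few lines of Python. The roster size and the student names are hypothetical; `random.sample` plays the role of the random number generator, and shuffling stands in for drawing names from a hat.

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

# Population: every student enrolled at the school (hypothetical roster).
population = [f"student_{i:04d}" for i in range(1200)]

# Random SELECTION: draw a simple random sample of 100 students.
sample = random.sample(population, 100)

# Random ASSIGNMENT: shuffle the sample ("names in a hat") and split it
# into a treatment group of 50 and a control group of 50.
hat = sample[:]
random.shuffle(hat)
treatment_group, control_group = hat[:50], hat[50:]

print(len(treatment_group), len(control_group))
```

The two steps are independent: the first determines who the study can generalize to, the second determines whether group differences can be attributed to the intervention.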

A study using only random assignment could ask the principal of the school to select the students she believes are most likely to enjoy participating in the study, and the researcher could then randomly assign this sample of students to the treatment and control groups.  In such a design the researcher could draw conclusions about the effect of the intervention but couldn’t make any inference about whether the effect would be likely to be found in the population.

A study using only random selection could randomly select students from the overall population of the school, but then assign students in one grade to the intervention and students in another grade to the control group.  While any data collected from this sample could be used to make inference to the population of the school, the lack of random assignment to be in the treatment or control group would make it impossible to conclude whether the intervention had any effect.

Random selection is thus essential to external validity, or the extent to which the researcher can use the results of the study to generalize to the larger population.  Random assignment is central to internal validity, which allows the researcher to make causal claims about the effect of the treatment.  Nonrandom assignment often leads to non-equivalent groups, meaning that any effect of the treatment might be a result of the groups being different at the outset rather than different at the end as a result of the treatment.  The consequences of random selection and random assignment are clearly very different, and a strong research design will employ both whenever possible to ensure both internal and external validity.


Random Assignment in Experiments | Introduction & Examples

Published on 6 May 2022 by Pritha Bhandari. Revised on 13 February 2023.

In experimental research, random assignment is a way of placing participants from your sample into different treatment groups using randomisation.

With simple random assignment, every member of the sample has a known or equal chance of being placed in a control group or an experimental group. Studies that use simple random assignment are also called completely randomised designs.

Random assignment is a key part of experimental design . It helps you ensure that all groups are comparable at the start of a study: any differences between them are due to random factors.

Table of contents

  • Why does random assignment matter?
  • Random sampling vs random assignment
  • How do you use random assignment?
  • When is random assignment not used?
  • Frequently asked questions about random assignment

Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment.

In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. To do so, they often use different levels of an independent variable for different groups of participants.

This is called a between-groups or independent measures design.

For example, in a study testing different dosages of a medication, you might use three groups of participants that are each given a different level of the independent variable:

  • A control group that’s given a placebo (no dosage)
  • An experimental group that’s given a low dosage
  • A second experimental group that’s given a high dosage

Random assignment helps you make sure that the treatment groups don’t differ in systematic or biased ways at the start of the experiment.

If you don’t use random assignment, you may not be able to rule out alternative explanations for your results. Suppose, for example, that participants are instead grouped by where they were recruited:

  • Participants recruited from pubs are placed in the control group
  • Participants recruited from local community centres are placed in the low-dosage experimental group
  • Participants recruited from gyms are placed in the high-dosage group

With this type of assignment, it’s hard to tell whether the participant characteristics are the same across all groups at the start of the study. Gym users may tend to engage in more healthy behaviours than people who frequent pubs or community centres, and this would introduce a healthy user bias in your study.

Although random assignment helps even out baseline differences between groups, it doesn’t always make them completely equivalent. There may still be extraneous variables that differ between groups, and there will always be some group differences that arise from chance.

Most of the time, the random variation between groups is low, and, therefore, it’s acceptable for further analysis. This is especially true when you have a large sample. In general, you should always use random assignment in experiments when it is ethically possible and makes sense for your study topic.


Random sampling and random assignment are both important concepts in research, but it’s important to understand the difference between them.

Random sampling (also called probability sampling or random selection) is a way of selecting members of a population to be included in your study. In contrast, random assignment is a way of sorting the sample participants into control and experimental groups.

While random sampling is used in many types of studies, random assignment is only used in between-subjects experimental designs.

Some studies use both random sampling and random assignment, while others use only one or the other.

Random sample vs random assignment

Random sampling enhances the external validity or generalisability of your results, because it helps to ensure that your sample is unbiased and representative of the whole population. This allows you to make stronger statistical inferences.

For example, suppose your population is all 8,000 employees of a large company, and you use a simple random sample to collect data. Because you have access to the whole population, you can assign all 8,000 employees a number and use a random number generator to select 300 employees. These 300 employees are your full sample.

Random assignment enhances the internal validity of the study, because it ensures that there are no systematic differences between the participants in each group. This helps you conclude that the outcomes can be attributed to the independent variable .

  • A control group that receives no intervention
  • An experimental group that has a remote team-building intervention every week for a month

You use random assignment to place participants into the control or experimental group. To do so, you take your list of participants and assign each participant a number. Again, you use a random number generator to place each participant in one of the two groups.

To use simple random assignment, you start by giving every member of the sample a unique number. Then, you can use computer programs or manual methods to randomly assign each participant to a group.

  • Random number generator: Use a computer program to generate random numbers from the list for each group.
  • Lottery method: Place all numbers individually into a hat or a bucket, and draw numbers at random for each group.
  • Flip a coin: When you only have two groups, for each number on the list, flip a coin to decide if they’ll be in the control or the experimental group.
  • Roll a die: When you have three groups, for each number on the list, roll a die to decide which group they will be in. For example, rolling 1 or 2 could place them in the first group; 3 or 4 in the second; and 5 or 6 in the third.
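
The shuffle-and-split version of simple random assignment can be sketched as follows, assuming a hypothetical sample of 300 numbered participants:

```python
import random

# Simple random assignment: shuffle the numbered participants,
# then split the shuffled list into two equal groups.
participants = list(range(1, 301))  # each sample member gets a unique number

random.seed(7)  # fixed seed only so the example is reproducible
random.shuffle(participants)  # chance alone determines the ordering

control = participants[:150]       # first half of the shuffled list
experimental = participants[150:]  # second half
```

Shuffling and splitting guarantees equal group sizes; flipping a coin per participant is also random assignment, but the group sizes then vary by chance.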

This type of random assignment is the most powerful method of placing participants in conditions, because each individual has an equal chance of being placed in any one of your treatment groups.

Random assignment in block designs

In more complicated experimental designs, random assignment is only used after participants are grouped into blocks based on some characteristic (e.g., test score or demographic variable). These groupings mean that you need a larger sample to achieve high statistical power.

For example, a randomised block design involves placing participants into blocks based on a shared characteristic (e.g., college students vs graduates), and then using random assignment within each block to assign participants to every treatment condition. This helps you assess whether the characteristic affects the outcomes of your treatment.

In an experimental matched design, you use blocking and then match up individual participants from each block based on specific characteristics. Within each matched pair or group, you randomly assign each participant to one of the conditions in the experiment and compare their outcomes.
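
A randomised block design of this kind might be sketched as follows; the participant labels and the student/graduate characteristic are hypothetical:

```python
import random
from collections import defaultdict

# Randomised block design: group participants into blocks by a shared
# characteristic, then randomly assign within each block.
participants = [(f"p{i:02d}", "student" if i % 2 else "graduate")
                for i in range(1, 21)]  # hypothetical 20-person sample
conditions = ["control", "treatment"]

random.seed(1)  # fixed seed only so the example is reproducible

# 1. Block participants by the shared characteristic
blocks = defaultdict(list)
for name, status in participants:
    blocks[status].append(name)

# 2. Randomly assign within each block, splitting it evenly
assignment = {}
for members in blocks.values():
    random.shuffle(members)  # the random assignment step
    for idx, name in enumerate(members):
        assignment[name] = conditions[idx % len(conditions)]
```

Because every block is split evenly across the conditions, the blocking characteristic cannot become confounded with the treatment.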

Sometimes, it’s not relevant or ethical to use simple random assignment, so groups are assigned in a different way.

When comparing different groups

Sometimes, differences between participants are the main focus of a study, for example, when comparing children and adults or people with and without health conditions. Participants are not randomly assigned to different groups, but instead assigned based on their characteristics.

In this type of study, the characteristic of interest (e.g., gender) is an independent variable, and the groups differ based on the different levels (e.g., men, women). All participants are tested the same way, and then their group-level outcomes are compared.

When it’s not ethically permissible

When studying unhealthy or dangerous behaviours, it’s not possible to use random assignment. For example, if you’re studying heavy drinkers and social drinkers, it’s unethical to randomly assign participants to one of the two groups and ask them to drink large amounts of alcohol for your experiment.

When you can’t assign participants to groups, you can also conduct a quasi-experimental study. In a quasi-experiment, you study the outcomes of pre-existing groups who receive treatments that you may not have any control over (e.g., heavy drinkers and social drinkers).

These groups aren’t randomly assigned, but may be considered comparable when some other variables (e.g., age or socioeconomic status) are controlled for.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomisation. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Random selection, or random sampling, is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalisability of your results, while random assignment improves the internal validity of your study.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment, assign a unique number to every member of your study’s sample.

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.


Bhandari, P. (2023, February 13). Random Assignment in Experiments | Introduction & Examples. Scribbr. Retrieved 18 June 2024, from https://www.scribbr.co.uk/research-methods/random-assignment-experiments/



Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased.

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research. It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity. Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity.

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.


Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method. Unlike probability sampling (which involves some form of random selection), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.

This means that you cannot use inferential statistics and make generalizations—often the goal of quantitative research. As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research.

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias.

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup (probability sampling). In quota sampling you select a predetermined number or proportion of units, in a non-random manner (non-probability sampling).
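
The random step that distinguishes stratified sampling can be sketched like this, with hypothetical strata and allocations; quota sampling would fill the same counts non-randomly (e.g., whoever is available first):

```python
import random

# Stratified sampling: draw a random sample from each subgroup.
# Strata, unit labels, and allocation below are all hypothetical.
strata = {
    "under_30": [f"u{i}" for i in range(60)],
    "30_plus":  [f"o{i}" for i in range(40)],
}
allocation = {"under_30": 6, "30_plus": 4}  # proportional to stratum size

random.seed(3)  # fixed seed only so the example is reproducible
stratified_sample = {
    name: random.sample(units, allocation[name])  # random within each stratum
    for name, units in strata.items()
}
```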

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves recruiting whoever happens to be available, which means that not everyone has an equal chance of being selected, depending on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population. It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous, so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous, as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population.

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference with or manipulation of the research subjects, and there are no control or treatment groups.

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods, the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity.
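
As an illustration of the correlation approach, here is a toy sketch with invented scores (all numbers hypothetical): a new measure should correlate strongly with an established measure of the same construct (convergent validity) and weakly with a measure of an unrelated construct (discriminant validity).

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical participant scores
new_measure = [10, 14, 9, 18, 12, 16, 11, 15]   # the test being validated
established = [11, 15, 10, 17, 13, 18, 12, 14]  # same construct
unrelated   = [3, 9, 4, 5, 8, 2, 7, 6]          # distinct construct

r_convergent = pearson(new_measure, established)  # expect strongly positive
r_discriminant = pearson(new_measure, unrelated)  # expect near zero
```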

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity; the others are face validity, content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity, and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control, ethical considerations, and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions, which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. Structured interviews are often quantitative in nature. They are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, so you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews, unstructured interviews, and focus groups.

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys, but is most common in semi-structured interviews, unstructured interviews, and focus groups.

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews, but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews: The questions are predetermined in both topic and order.
  • Semi-structured interviews: A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews: None of the questions are predetermined.
  • Focus group interviews: The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method. It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning, where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization: You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research, you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation:

  • Data triangulation: Using data from different times, spaces, and people
  • Investigator triangulation: Involving multiple researchers in collecting or analyzing data
  • Theory triangulation: Using varying theoretical perspectives in your research
  • Methodological triangulation: Using different methodologies to approach the same topic

Many academic fields use peer review, largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to the stringent process they go through before publication.

In general, the peer review process follows these steps:

  • First, the author submits the manuscript to the editor.
  • The editor then screens the manuscript and decides whether to reject it and send it back to the author, or send it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design, inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
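As a minimal sketch of this screening-and-removal step, the Python snippet below (with invented survey records and a hypothetical plausible-weight range) drops missing values, duplicates, and out-of-range outliers:

```python
import statistics

# Hypothetical raw survey records: weight in kg; None marks a missing value
raw = [
    {"id": 1, "weight": 70.5},
    {"id": 2, "weight": None},   # missing value -> removed
    {"id": 3, "weight": 68.0},
    {"id": 3, "weight": 68.0},   # duplicate of id 3 -> removed
    {"id": 4, "weight": 980.0},  # implausible outlier -> removed
    {"id": 5, "weight": 72.3},
]

def clean(records, low=30.0, high=250.0):
    """Screen and remove: dedupe, drop missing values, drop out-of-range outliers."""
    seen, cleaned = set(), []
    for rec in records:
        if rec["id"] in seen:                   # duplicate record
            continue
        if rec["weight"] is None:               # missing value
            continue
        if not (low <= rec["weight"] <= high):  # outside the plausible range
            continue
        seen.add(rec["id"])
        cleaned.append(rec)
    return cleaned

clean_data = clean(raw)
print([r["id"] for r in clean_data])  # [1, 3, 5]
print(round(statistics.mean(r["weight"] for r in clean_data), 1))  # 70.3
```

Real projects usually diagnose each issue before deciding whether to standardize, correct, or remove a value; outright removal is only one option.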

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can have important practical consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.
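The hierarchical narrowing described above can be sketched in Python; the states, cities, and households below are invented for illustration:

```python
import random

random.seed(0)  # seeded only to make the illustration reproducible

# Hypothetical hierarchy: state -> cities -> households
population = {
    "State A": {"City A1": ["hh1", "hh2", "hh3", "hh4"],
                "City A2": ["hh5", "hh6", "hh7"]},
    "State B": {"City B1": ["hh8", "hh9"],
                "City B2": ["hh10", "hh11", "hh12"]},
}

# Stage 1: randomly select a state
state = random.choice(list(population))
# Stage 2: randomly select a city within that state
city = random.choice(list(population[state]))
# Stage 3: randomly sample two households within that city
sample = random.sample(population[state][city], k=2)
print(state, city, sample)
```

In a real national survey, each stage would select several units (and often weight them by size) rather than a single state or city.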

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data is from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data from credible sources, and that you use the right kind of analysis to answer your questions. This allows you to draw valid, trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your   research question . It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy .

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.
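As an illustration, Pearson’s r can be computed directly from its definition (the covariance of the two variables divided by the product of their standard deviations); the study-hours and test-score values below are invented:

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of x and y divided by the product of their SDs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 70, 72]  # invented scores that rise with study hours
print(round(pearson_r(hours, score), 2))  # 0.98 — a strong positive correlation
```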

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
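A quick simulation illustrates the difference: zero-mean random noise averages out over many measurements, while a constant calibration bias does not. The true weight and error sizes below are arbitrary:

```python
import random
from statistics import mean

random.seed(42)  # seeded only to make the illustration reproducible
true_weight = 70.0

# Random error: zero-mean noise around the true value (e.g., misreadings)
random_readings = [true_weight + random.gauss(0, 0.5) for _ in range(10_000)]

# Systematic error: a miscalibrated scale that always adds 2 kg
biased_readings = [w + 2.0 for w in random_readings]

print(round(mean(random_readings), 1))  # ~70.0 — random errors cancel out
print(round(mean(biased_readings), 1))  # ~72.0 — the bias does not average away
```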

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If both of your variables are quantitative, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
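A minimal sketch of this procedure in Python, shuffling a numbered sample and splitting it into two equal groups (the participant labels are invented):

```python
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 numbered participants

random.seed(7)  # seeded only to make the illustration reproducible
random.shuffle(participants)

# Split the shuffled list in half: first half control, second half experimental
half = len(participants) // 2
control, experimental = participants[:half], participants[half:]
print(len(control), len(experimental))  # 10 10
```

Shuffling then splitting guarantees equal group sizes, whereas flipping a coin per participant only balances the groups in expectation.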

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable .
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is weaker than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not ordered in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k , by dividing your population size by your target sample size.
  • Choose every k th member of the population as your sample.
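The three steps above can be sketched as follows, for an invented population of 90 people and a target sample of 6:

```python
# Step 1: a listed population (not in any cyclical or periodic order)
population = [f"person_{i}" for i in range(1, 91)]  # 90 people

# Step 2: calculate the interval k = population size / target sample size
sample_size = 6
k = len(population) // sample_size  # k = 90 // 6 = 15

# Step 3: choose every kth member, starting from the first listed member
sample = population[::k][:sample_size]
print(sample)
# ['person_1', 'person_16', 'person_31', 'person_46', 'person_61', 'person_76']
```

In practice, the starting point is usually a random number between 1 and k rather than always the first listed member.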

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the number of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
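A minimal Python sketch of these two steps, using an invented “education” stratum and a simple random sample of three subjects per stratum:

```python
import random
from collections import defaultdict

random.seed(1)  # seeded only to make the illustration reproducible

# Hypothetical subjects, each with an 'education' characteristic
subjects = [
    {"name": f"S{i}", "education": level}
    for i, level in enumerate(["high school", "bachelor", "master"] * 10)
]

# Step 1: divide subjects into strata based on the shared characteristic
strata = defaultdict(list)
for s in subjects:
    strata[s["education"]].append(s)

# Step 2: take a simple random sample within each stratum (here, 3 per stratum)
sample = [s for group in strata.values() for s in random.sample(group, k=3)]
print(len(sample))  # 9 — three subjects from each of the three strata
```

Sampling equal numbers per stratum is just one option; proportionate stratified sampling instead samples each stratum in proportion to its share of the population.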

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.
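The single- and double-stage variants can be sketched as follows; the schools and students are invented:

```python
import random

random.seed(3)  # seeded only to make the illustration reproducible

# Hypothetical clusters: schools, each containing students
schools = {
    "School A": ["a1", "a2", "a3", "a4"],
    "School B": ["b1", "b2", "b3"],
    "School C": ["c1", "c2", "c3", "c4", "c5"],
    "School D": ["d1", "d2"],
}

# First (both variants): randomly select clusters
chosen = random.sample(list(schools), k=2)

# Single-stage: collect data from every unit in the selected clusters
single_stage = [s for school in chosen for s in schools[school]]

# Double-stage: randomly sample units from within the selected clusters
double_stage = [s for school in chosen for s in random.sample(schools[school], k=2)]

print(len(double_stage))  # 4 — two students from each of the two chosen clusters
```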

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey  is an example of simple random sampling . In order to collect detailed data on the population of the US, the Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
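A minimal sketch in Python, assuming you have a complete sampling frame (the household labels are invented):

```python
import random

# Complete sampling frame: every member of a small hypothetical population
frame = [f"household_{i}" for i in range(1, 501)]

random.seed(11)  # seeded only to make the illustration reproducible
sample = random.sample(frame, k=50)  # every household has an equal chance

print(len(sample), len(set(sample)))  # 50 50 — sampling without replacement
```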

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements and a continuum of possible responses, usually 5 or 7, to capture their degree of agreement.
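As an illustration of combining response scores, the sketch below sums an invented 5-item scale, reverse-scoring a hypothetical negatively worded item:

```python
# Hypothetical 5-item Likert scale (1 = strongly disagree ... 5 = strongly agree)
# Item q3 is negatively worded, so it is reverse-scored before summing.
responses = {"q1": 4, "q2": 5, "q3": 2, "q4": 4, "q5": 3}
reverse_scored = {"q3"}

def scale_score(resp, reverse, points=5):
    """Sum item scores into one scale score, reversing negatively worded items."""
    return sum((points + 1 - v) if q in reverse else v
               for q, v in resp.items())

print(scale_score(responses, reverse_scored))  # 4 + 5 + 4 + 4 + 3 = 20
```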

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomization , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
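As a concrete sketch, random assignment can be done by shuffling a participant list and splitting it in half; the participant IDs below are made up for illustration:

```python
import random

# Hypothetical list of participant IDs; in practice these would come
# from your recruited sample.
participants = [f"P{i:03d}" for i in range(1, 101)]

random.seed(42)  # fixed seed so the allocation is reproducible

# Shuffle, then split in half: each participant has an equal chance
# of landing in either group, so confounders balance out on average.
shuffled = random.sample(participants, k=len(participants))
treatment = shuffled[:50]
control = shuffled[50:]
```

Because the split depends only on the shuffle, no characteristic of a participant can systematically influence which group they end up in.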

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, a study can include multiple independent and dependent variables, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .
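A minimal sketch of three of these methods, assuming a made-up population of 1,000 numbered members (the even/odd strata are invented purely for illustration):

```python
import random

random.seed(0)

# Hypothetical population of 1,000 numbered members.
population = list(range(1000))

# Simple random sampling: every member has an equal chance of selection.
simple = random.sample(population, k=100)

# Systematic sampling: every k-th member after a random start.
k = len(population) // 100          # k = 10
start = random.randrange(k)
systematic = population[start::k]   # exactly 100 members

# Stratified sampling: sample within each stratum so every subgroup
# is represented (here, two made-up strata split by even/odd ID).
strata = {"even": [m for m in population if m % 2 == 0],
          "odd":  [m for m in population if m % 2 == 1]}
stratified = [m for group in strata.values()
              for m in random.sample(group, k=50)]
```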

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .
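The distinction can be illustrated with a small simulation (the population values here are synthetic):

```python
import random
import statistics

random.seed(3)

# Synthetic population with a known parameter (its true mean).
population = [random.gauss(50, 5) for _ in range(10_000)]
parameter = statistics.mean(population)

# A sample statistic estimates the parameter; the gap between the two
# is the sampling error.
sample = random.sample(population, k=200)
statistic = statistics.mean(sample)
sampling_error = statistic - parameter
```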

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment interaction, and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

  • Longitudinal study: repeated observations of the same group over time, following changes in participants.
  • Cross-sectional study: observations of different groups (a “cross-section” of the population) at a single point in time, providing a snapshot of society at that point.

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables .

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.



Statistics By Jim

Making statistics intuitive

Internal and External Validity

By Jim Frost

Internal and external validity relate to the findings of studies and experiments.


External validity assesses the applicability or generalizability of the findings to the real world. So, your study had significant findings in a controlled environment. But will you get the same results outside of the lab?

In this post, learn more about internal and external validity, how to increase both of them in a study, threats that can reduce them, and why studies high in one type tend to be low in the other.

Learn more about Experimental Design: Definition, Types, and Examples .

If you’re interested in the validity of test scores and measurements rather than experiments, read my post Validity .

Related post : Reliability vs Validity

Internal Validity

Internal validity is the degree of confidence that a causal relationship exists between the treatment and the difference in outcomes. In other words, how well did the researchers perform the study? How likely is it that your treatment caused the differences in results that you observe? Are the researcher’s conclusions correct? Or can changes in the outcome be attributed to other causes?

Establishing internal validity involves assessing data collection procedures, the reliability and validity of the data, the experimental design, and even things such as the setting and duration of the experiment. It could involve understanding events and natural processes that occur outside of the investigation. In other words, it’s the whole thing. Does the entirety of the experiment allow you to conclude that the treatment causes the differences in outcomes?

Studies that have a high degree of internal validity provide strong evidence of causality . On the other hand, studies with low internal validity provide weak evidence of causality.

Learn more about Correlation vs. Causation: Understanding the Differences .

How to Increase Internal Validity

Typically, highly controlled experiments improve internal validity. Experiments with the following features tend to have the highest internal validity:

  • They occur in a lab setting to reduce variability from sources other than the treatment.
  • Use random sampling to obtain a sample that represents the population .
  • Use random assignment to create control and treatment groups that are equivalent at the beginning.
  • Include a control group to understand treatment effects.
  • Use blinding and other protocols that reduce the influence of extraneous factors , such as knowledge about the treatment and experimenter bias.

Removing these properties (moving from the lab to the real world, losing the ability to randomize, or dropping the control group) reduces internal validity.

Internal validity relates to causality for a single study. For the study in question, did the treatment cause changes in the outcomes? Internal validity does not address generalizability to other settings, subjects, or populations. It only assesses causality for one study. We’ll get into the other issues when we talk about external validity.

Learn more about Randomized Controlled Trials , which tend to have high internal validity.

Threats to Internal Validity

Threats to internal validity are types of confounding variables because they provide alternative explanations for changes in outcomes. They are threats because they make us doubt causality. The real reason for apparent treatment effects might be these potential threats.

For example, imagine a weight loss program where the researchers measure the subjects’ weights at the beginning, conduct the program, and then measure weights at the end. If the intervention causes weight loss, you’d expect to see decreases between the pretest and posttest.

However, there are various threats to attributing a causal connection between the weight loss program and the changes in weights. The following items are threats to internal validity.

History

An outside event that occurs between the pretest and posttest and affects the outcomes can reduce internal validity. Perhaps a fitness program became popular in town, and many subjects participated. It might be the fitness program that caused the weight loss rather than the weight loss program we're studying.

Maturation

The change between pretest and posttest scores might represent a process that occurs naturally over time, which raises questions about internal validity. Imagine that instead of a weight loss program we are studying an educational program. If the posttest scores are higher at the end, we might be observing regular knowledge acquisition rather than the program causing the increase. If it's a natural process, we would have seen the same change even if the subjects had not participated in the experiment.

Testing

The pretest itself influences outcomes by increasing awareness or sensitivity among test takers. Suppose that the mere act of weighing the subjects makes them more weight conscious and increases their motivation to lose weight.

Instrumentation

The change between tests is an artifact of a difference between the pretest and posttest assessment instruments rather than an actual change in outcomes. This threat to internal validity can involve a change in the instrument, different instructions for administering the test, or researchers using different procedures to take measurements. If the scale stops working correctly at some point after the pretest and displays lower weights in the posttest, the subjects’ weights appear to decrease.

Mortality (Attrition)

Mortality refers to an experiment's attrition rate among its subjects, not necessarily actual deaths! It becomes a problem when subjects with specific characteristics drop out of the study more frequently than other subjects. If these characteristics are associated with changes in the outcome variable, the systematic loss of subjects with those characteristics can bias the posttest results.

For example, in an experiment for an educational program, the more dedicated learners might have more extracurricular activities and so be more likely to drop out of the study. Losing a disproportionate number of dedicated learners can deceptively reduce the apparent effectiveness of an educational program. This threat to internal validity is higher for studies with relatively high attrition rates.

Regression to the Mean

If a group has an unusual average in the pretest, it will tend to regress toward the mean in the posttest. Suppose we're assessing an education program and the pretest produces unusually low means. Regression to the mean will tend to push the posttest scores higher even if the intervention causes no real increase.
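A small simulation makes this concrete. Assuming a simple model in which each observed score is a stable ability plus random noise, and with no intervention at all, the subjects with the lowest pretest scores still show a higher posttest average:

```python
import random

random.seed(1)

# Hypothetical model: each observed score = stable ability + random noise.
# There is no intervention of any kind between "pretest" and "posttest".
abilities = [random.gauss(100, 10) for _ in range(10_000)]
pretest  = [a + random.gauss(0, 10) for a in abilities]
posttest = [a + random.gauss(0, 10) for a in abilities]

# Select the 500 subjects with the lowest pretest scores (an "unusually
# low" group) and compare their averages on the two tests.
lowest = sorted(range(len(pretest)), key=lambda i: pretest[i])[:500]
pre_mean  = sum(pretest[i]  for i in lowest) / len(lowest)
post_mean = sum(posttest[i] for i in lowest) / len(lowest)

# The posttest mean drifts back toward 100 even though nothing changed:
# the group was partly selected for unlucky noise, which does not repeat.
```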

External Validity

External validity relates to the ability to generalize the results of the experiment to other people, places, or times. Scientific studies generally do not want findings that apply only to the relatively few subjects who participated in the study. Instead, studies want to be able to use the experimental results and apply them to a larger population. This is a key goal of inferential statistics .

For example, if you’re assessing a new medication or a new educational program, you don’t want to know that it’s effective for a handful of people. You want to apply those results beyond just the experimental setting and the particular individuals that participated. That’s generalizability—and the heart of the matter for external validity.

Unlike internal validity, external validity doesn’t assess causality and ruling out confounders.

There are two broad types of external validity—population and ecological.

Population Validity

Population validity relates to how well the experimental sample represents a population. Sampling methodology addresses this issue. If you use a random sampling technique to obtain a representative sample , it greatly helps you generalize from the sample to the population because they are similar. Population validity requires a sample that reflects the target population.

On the other hand, if the sample does not represent the population, it reduces external validity and you might not be able to generalize from the sample to the population.

Ecological Validity

Ecological validity relates to the degree of similarity between the experimental setting and the setting to which you want to generalize. The greater the similarity of key characteristics between settings, the more confident you can be that the results will generalize to that other setting. In this context, “key characteristics” are factors that can influence the outcome variable. Generalizability requires that the methods, materials, and environment in the experiment approximate the relevant real-world setting to which you want to generalize.

Threats to external validity are differences between experimental conditions and the real-world setting. Threats indicate that you might not be able to generalize the experimental results beyond the experiment. You performed your research in a particular context, at a particular time, and with specific people. As you move to different conditions, you lose the ability to generalize. The ability to generalize the results is never guaranteed. This issue is one that you really need to think about. If another researcher conducted a similar study in a different setting, would that study obtain the same results?

The following practices can help increase external validity:

  • Use random sampling to obtain a representative sample from the population you are studying.
  • Understand how your experiment is similar to and different from the setting(s) to which you want to generalize the results. Identify the factors that are particularly relevant to the research question and minimize the difference between experimental conditions and the real-world setting.
  • Replicate your study. If you or other researchers replicate your experiment at different times, in various settings, and with different people, you can be more confident about generalizability.

Learn more in depth about Ecological Validity: Definition & Why It Matters .

Internal vs. External Validity: The Relationship Between Them

There tends to be a negative correlation between internal and external validity in experiments. Experiments that have high internal validity tend to have lower external validity. And, vice versa.

Why does this happen?

To understand the reason, you must think about the experimental conditions that produce high degrees of internal and external validity. They’re diametrically opposed!

To produce high internal validity, you need a highly controlled environment that minimizes variability in extraneous variables. By controlling the environmental conditions, implementing strict measurement methodologies, using random assignment, and using a standardized treatment, you can effectively rule out alternative explanations for differences in outcomes. That produces a high degree of confidence in causality, which is high internal validity.

However, that artificial lab environment is a far cry from any real-world setting! To have high external validity, you want the experimental conditions to match the real-world setting. Observational studies are much more realistic than a lab setting. You experience the full impact of real-world variability! That creates high external validity because the experimental conditions are virtually the real-world setting. However, as I explain in my article about observational studies, that type of study opens the door to confounding variables and alternative explanations for differences in outcomes—in other words, lower internal validity!

So, what’s the answer?

Replication! Researchers can conduct multiple experiments in different places and use different methodologies—some true experiments in a lab and other observational studies in the field. This point reiterates the importance of replicating studies because no single study is ever enough.

As you can see, planning an experiment so you can draw valid conclusions and apply them to other settings requires a thorough assessment. Failure to do the appropriate planning for both internal and external validity can cause your experiment or study to produce results that you cannot trust!

Various types of bias can reduce both internal and external validity. These include Selection Bias , Sampling Bias , and Cognitive Bias . Each type has its own solutions.

Internal and external validity , San Jose State University

Glenn H. Bracht and Gene V. Glass, External Validity of Experiments , American Educational Research Journal, Vol. 5, No. 4 (Nov., 1968), pp. 437-474.


Statology

Statistics Made Easy

Random Selection vs. Random Assignment

Random selection and random assignment  are two techniques in statistics that are commonly used, but are commonly confused.

Random selection  refers to the process of randomly selecting individuals from a population to be involved in a study.

Random assignment  refers to the process of randomly  assigning  the individuals in a study to either a treatment group or a control group.

You can think of random selection as the process you use to “get” the individuals in a study and you can think of random assignment as what you “do” with those individuals once they’re selected to be part of the study.
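A brief sketch of the two steps, using a hypothetical database of names:

```python
import random

random.seed(7)

# Hypothetical sampling frame: a database of 1,000 community members.
database = [f"person_{i}" for i in range(1000)]

# Random SELECTION: how we "get" the individuals for the study.
sample = random.sample(database, k=100)

# Random ASSIGNMENT: what we "do" with them once they're selected.
shuffled = random.sample(sample, k=len(sample))
control, treatment = shuffled[:50], shuffled[50:]
```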

The Importance of Random Selection and Random Assignment

When a study uses  random selection , it selects individuals from a population using some random process. For example, if some population has 1,000 individuals then we might use a computer to randomly select 100 of those individuals from a database. This means that each individual is equally likely to be selected to be part of the study, which increases the chances that we will obtain a representative sample – a sample that has similar characteristics to the overall population.

By using a representative sample in our study, we’re able to generalize the findings of our study to the population. In statistical terms, this is referred to as having  external validity – it’s valid to externalize our findings to the overall population.

When a study uses  random assignment , it randomly assigns individuals to either a treatment group or a control group. For example, if we have 100 individuals in a study then we might use a random number generator to randomly assign 50 individuals to a control group and 50 individuals to a treatment group.

By using random assignment, we increase the chances that the two groups will have roughly similar characteristics, which means that any difference we observe between the two groups can be attributed to the treatment. This means the study has  internal validity  – it’s valid to attribute any differences between the groups to the treatment itself as opposed to differences between the individuals in the groups.

Examples of Random Selection and Random Assignment

It’s possible for a study to use both random selection and random assignment, or just one of these techniques, or neither technique. A strong study is one that uses both techniques.

The following examples show how a study could use both, one, or neither of these techniques, along with the effects of doing so.

Example 1: Using both Random Selection and Random Assignment

Study:  Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 individuals to be in the study by using a computer to randomly select 100 names from a database. Once they have the 100 individuals, they once again use a computer to randomly assign 50 of the individuals to a control group (e.g. stick with their standard diet) and 50 individuals to a treatment group (e.g. follow the new diet). They record the total weight loss of each individual after one month.


Results:  The researchers used random selection to obtain their sample and random assignment when putting individuals in either a treatment or control group. By doing so, they’re able to generalize the findings from the study to the overall population  and  they’re able to attribute any differences in average weight loss between the two groups to the new diet.

Example 2: Using only Random Selection

Study:  Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 individuals to be in the study by using a computer to randomly select 100 names from a database. However, they decide to assign individuals to groups based solely on gender. Females are assigned to the control group and males are assigned to the treatment group. They record the total weight loss of each individual after one month.


Results:  The researchers used random selection to obtain their sample, but they did not use random assignment when putting individuals in either a treatment or control group. Instead, they used a specific factor – gender – to decide which group to assign individuals to. By doing this, they’re able to generalize the findings from the study to the overall population but they are  not  able to attribute any differences in average weight loss between the two groups to the new diet. The internal validity of the study has been compromised because the difference in weight loss could actually just be due to gender, rather than the new diet.

Example 3: Using only Random Assignment

Study:  Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 100 male athletes to be in the study. Then, they use a computer program to randomly assign 50 of the male athletes to a control group and 50 to the treatment group. They record the total weight loss of each individual after one month.


Results:  The researchers did not use random selection to obtain their sample since they specifically chose 100 male athletes. Because of this, their sample is not representative of the overall population so their external validity is compromised – they will not be able to generalize the findings from the study to the overall population. However, they did use random assignment, which means they can attribute any difference in weight loss to the new diet.

Example 4: Using Neither Technique

Study:  Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They recruit 50 male athletes and 50 female athletes to be in the study. Then, they assign all of the female athletes to the control group and all of the male athletes to the treatment group. They record the total weight loss of each individual after one month.


Results:  The researchers did not use random selection to obtain their sample since they specifically chose 100 athletes. Because of this, their sample is not representative of the overall population so their external validity is compromised – they will not be able to generalize the findings from the study to the overall population. Also, they split individuals into groups based on gender rather than using random assignment, which means their internal validity is also compromised – differences in weight loss might be due to gender rather than the diet.


Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.


Statistical Thinking: A Simulation Approach to Modeling Uncertainty (UM STAT 216 edition)

3.6 Causation and Random Assignment

Medical researchers may be interested in showing that a drug helps improve people’s health (the cause of improvement is the drug), while educational researchers may be interested in showing a curricular innovation improves students’ learning (the curricular innovation causes improved learning).

To attribute a causal relationship, there are three criteria a researcher needs to establish:

  • Association of the Cause and Effect: There needs to be an association between the cause and effect.
  • Timing: The cause needs to happen BEFORE the effect.
  • No Plausible Alternative Explanations: ALL other possible explanations for the effect need to be ruled out.

Please read more about each of these criteria at the Web Center for Social Research Methods.

The third criterion can be quite difficult to meet. To rule out ALL other possible explanations for the effect, we want to compare the world with the cause applied to the world without the cause. In practice, we do this by comparing two different groups: a “treatment” group that gets the cause applied to them, and a “control” group that does not. To rule out alternative explanations, the groups need to be “identical” with respect to every possible characteristic (aside from the treatment) that could explain differences. This way the only characteristic that will be different is that the treatment group gets the treatment and the control group doesn’t. If there are differences in the outcome, then it must be attributable to the treatment, because the other possible explanations are ruled out.

So, the key is to make the control and treatment groups “identical” when you are forming them. One thing that makes this task (slightly) easier is that they don’t have to be exactly identical, only probabilistically equivalent. This means, for example, that if you were matching groups on age, you wouldn’t need the two groups to have identical age distributions; they would only need to have roughly the same AVERAGE age. Here roughly means “the average ages should be the same within what we expect because of sampling error.”
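A quick simulation illustrates probabilistic equivalence. Here random assignment is applied to a hypothetical pool of 1,000 participants with made-up ages; the two group means come out roughly, but not exactly, equal:

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical ages for 1,000 participants, drawn uniformly from 18 to 65
ages = [rng.randint(18, 65) for _ in range(1000)]

# Randomly assign each participant to the treatment or control group
rng.shuffle(ages)
treatment_ages, control_ages = ages[:500], ages[500:]

# The group means only need to agree within sampling error, not exactly
diff = abs(statistics.mean(treatment_ages) - statistics.mean(control_ages))
```

With groups this large, `diff` is small relative to the 18 to 65 age range, even though no one deliberately matched the groups on age.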

Now we just need to create the groups so that they have, on average, the same characteristics … for EVERY POSSIBLE CHARACTERISTIC that could explain differences in the outcome.

It turns out that creating probabilistically equivalent groups is a really difficult problem. One method that works pretty well for doing this is to randomly assign participants to the groups. This works best when you have large sample sizes, but even with small sample sizes random assignment has the advantage of at least removing the systematic bias between the two groups (any differences are due to chance and will probably even out between the groups). As Wikipedia’s page on random assignment points out,

Random assignment of participants helps to ensure that any differences between and within the groups are not systematic at the outset of the experiment. Thus, any differences between groups recorded at the end of the experiment can be more confidently attributed to the experimental procedures or treatment. … Random assignment does not guarantee that the groups are matched or equivalent. The groups may still differ on some preexisting attribute due to chance. The use of random assignment cannot eliminate this possibility, but it greatly reduces it.

We use the term internal validity to describe the degree to which cause-and-effect inferences are accurate and meaningful. Causal attribution is the goal for many researchers. Thus, by using random assignment we have a pretty high degree of evidence for internal validity; we have a much higher belief in causal inferences. Much like evidence used in a court of law, it is useful to think about validity evidence on a continuum. For example, a visualization of the internal validity evidence for a study that employed random assignment in the design might be:


The degree of internal validity evidence is high (in the upper-third). How high depends on other factors such as sample size.

To learn more about random assignment, you can read the following:

  • The research report, Random Assignment Evaluation Studies: A Guide for Out-of-School Time Program Practitioners

3.6.1 Example: Does sleep deprivation cause a decrease in performance?

Let’s consider the criteria with respect to the sleep deprivation study we explored in class.

3.6.1.1 Association of cause and effect

First, we ask, Is there an association between the cause and the effect? In the sleep deprivation study, we would ask, “Is sleep deprivation associated with a decrease in performance?”

This is what a hypothesis test helps us answer! If the result is statistically significant, then we have an association between the cause and the effect. If the result is not statistically significant, then there is not sufficient evidence for an association between cause and effect.

In the case of the sleep deprivation experiment, the result was statistically significant, so we can say that sleep deprivation is associated with a decrease in performance.
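In the spirit of this book's simulation approach, the association criterion can be checked with a randomization (permutation) test: reshuffle the group labels many times and see how often chance alone produces a difference at least as large as the one observed. The scores below are made up for illustration and are not the actual class data:

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical performance scores (higher = better); values are illustrative
deprived = [6, 8, 9, 10, 11, 12, 13, 14]
rested = [12, 14, 15, 16, 17, 18, 19, 21]

observed = statistics.mean(rested) - statistics.mean(deprived)

# Permutation test: randomly relabel the 16 scores many times and count
# how often chance produces a difference at least as large as observed
combined = deprived + rested
extreme = 0
n_reps = 10_000
for _ in range(n_reps):
    rng.shuffle(combined)
    diff = statistics.mean(combined[8:]) - statistics.mean(combined[:8])
    if diff >= observed:
        extreme += 1

p_value = extreme / n_reps
```

A small `p_value` means a difference this large rarely arises from random relabeling alone, which is the evidence of association the first criterion asks for.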

3.6.1.2 Timing

Second, we ask, Did the cause come before the effect? In the sleep deprivation study, the answer is yes. The participants were sleep deprived before their performance was tested. It may seem like this is a silly question to ask, but as the link above describes, it is not always so clear to establish the timing. Thus, it is important to consider this question any time we are interested in establishing causality.

3.6.1.3 No plausible alternative explanations

Finally, we ask, Are there any plausible alternative explanations for the observed effect? In the sleep deprivation study, we would ask, “Are there plausible alternative explanations for the observed difference between the groups, other than sleep deprivation?” Because this is a question about plausibility, human judgment comes into play. Researchers must make an argument about why there are no plausible alternatives. As described above, a strong study design can help to strengthen the argument.

At first, it may seem like there are a lot of plausible alternative explanations for the difference in performance. There are a lot of things that might affect someone’s performance on a visual task! Sleep deprivation is just one of them! For example, artists may be more adept at visual discrimination than other people. This is an example of a potential confounding variable. A confounding variable is a variable that might affect the results, other than the causal variable that we are interested in.

Here’s the thing though. We are not interested in figuring out why any particular person got the score that they did. Instead, we are interested in determining why one group was different from another group. In the sleep deprivation study, the participants were randomly assigned. This means that there is no systematic difference between the groups with respect to any confounding variables. Yes—artistic experience is a possible confounding variable, and it may be the reason why two people score differently. BUT: There is no systematic difference between the groups with respect to artistic experience, and so artistic experience is not a plausible explanation as to why the groups would be different. The same can be said for any possible confounding variable. Because the groups were randomly assigned, it is not plausible to say that the groups are different with respect to any confounding variable. Random assignment helps us rule out plausible alternatives.

3.6.1.4 Making a causal claim

Now, let’s see about making a causal claim for the sleep deprivation study:

  • Association: There is a statistically significant result, so the cause is associated with the effect
  • Timing: The participants were sleep deprived before their performance was measured, so the cause came before the effect
  • Plausible alternative explanations: The participants were randomly assigned, so the groups are not systematically different on any confounding variable. The only systematic difference between the groups was sleep deprivation. Thus, there are no plausible alternative explanations for the difference between the groups, other than sleep deprivation

Thus, the internal validity evidence for this study is high, and we can make a causal claim. For the participants in this study, we can say that sleep deprivation caused a decrease in performance.

Key points: Causation and internal validity

To make a cause-and-effect inference, you need to consider three criteria:

  • Association of the Cause and Effect: There needs to be an association between the cause and effect. This can be established by a hypothesis test.

Random assignment removes any systematic differences between the groups (other than the treatment), and thus helps to rule out plausible alternative explanations.

Internal validity describes the degree to which cause-and-effect inferences are accurate and meaningful.

Confounding variables are variables that might affect the results, other than the causal variable that we are interested in.

Probabilistic equivalence means that there is not a systematic difference between groups. The groups are the same on average.

How can we make "equivalent" experimental groups?


What tradeoffs are there between internal and external validity?

I was reading through an article about research design. The authors gave an example on the relation between internal and external validity which I thought might be an important area of consideration when designing research proposals.

"A restricted study population (and exclusion criteria) may limit bias and increase the internal validity of the study; however, this approach will limit external validity of the study and, thus, the generalizability of the findings to the practical clinical setting. Conversely, a broadly defined study population and inclusion criteria may be representative of practical clinical practice but may increase bias and reduce the internal validity of the study."

Broadly speaking, what questions can be asked to help decide how one should strike a balance between internal and external validity in our research proposal designs? Is it necessary, then, to carry out a pilot study first to establish some level of internal validity before moving on to establish external validity in a much bigger and more formal follow-up study?

Crosspost: Quora

  • experiment-design
  • methodology

Derrick Mah

2 Answers

I've been following your post since it was posted in CogSci and have never had the time to give a full answer: even this will only be a quick push in the (hopefully) right direction.

First, as I always enjoy seminal works, check out Donald Campbell's work defining several designs relationships to internal and external validity (in the references). This has a great quote on the "balance between internal and external validity":

If one is in a situation where either internal validity or representativeness [external validity] must be sacrificed, which should it be? The answer is clear. Internal validity is the prior and indispensable consideration. (Campbell, 1957, p. 310)

Thinking about why: I tend to teach internal validity as "Is my study telling me what I think it is?", while external validity is more along the lines of "Will this apply to other populations?" Using these working definitions, one can't really have "applications to other people" if one doesn't have a study that is telling you what you think it is.

Being a bit more pragmatic: consider your research goals. Is your goal to apply your research in a wide variety of situations, or just one? Is there any evidence that the method is effective in any circumstance (or that there is a relationship between variables, etc.)? If so, do you have reason to believe it will transfer to a new population (aka generalize)? The answer to these can guide what your first focus should be, but these questions are strongly grounded in theory: without a theoretical backing, go back to the default of establishing internal validity first.

I think that the above answers your second question, but I can't emphasize enough the importance of theory (including methodological theory). In the cited article you'll see that Campbell assigns the strength of internal and external validity to the methodology , not necessarily the data. Not all methods are created equally. This doesn't really have to do with a "pilot test": if you use control groups, have random assignment, do proper sampling schemes, etc., your study will have better internal and external validity. Again, this is by the nature of the method, not because it has been piloted.

It may help to see the two very broad methods (in social research) that we use to increase internal and external validity:

  • To increase internal validity, we tend to use random assignment to treatment and control groups (discussed in Campbell's paper). The idea is that random assignment randomly distributes all sorts of confounding variables that you haven't accounted for among both groups, so they should be equally balanced (or, if they're not, it was by random chance).
  • To increase external validity, we tend to use random sampling to collect our participants. The idea is that random sampling grabs a representative sample of the population you are sampling from, and that if the treatment "works" for the representative sample, it should work for the rest of the population.

Note that these can both occur at the same time, and that the practice of one does not preclude the other. Also note that research can still have internal and external validity without doing these, though you'd have to make a strong argument as to why.

As a final note, let me add this: you don't ever "have" internal or external validity, you merely have evidence supporting internal/external validity. Most (if not all) types of validity are just a body of evidence in favor of the concept: for internal validity, a body of evidence that only your proposed treatment influenced your outcome variable; for external validity, a body of evidence that your proposed treatment would influence the outcome variable for other samples/populations. Any ideas that you can think of to show these are contributions to the evidence of the internal/external validity of your study.

Campbell, D. T. (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54(4), 297–312.

mflo-ByeSE

I appreciate articles in which both the limited set of clean data and the broader set of available data are analysed. Those conclusions that emerge from both sets are extremely valid, of course.

The reason to do the research (the reason why someone pays you) determines the required external validity. If you work for a local hospital, for instance, then there is no problem limiting the study to the most frequent ethnic group. If you work for an international organisation and only study one ethnic group, you MUST try to be compatible with other studies in other regions, so that your results can be included in a review article.

Dirk Horsten




The Definition of Random Assignment According to Psychology

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


Emily is a board-certified science editor who has worked with top digital publishing brands like Voices for Biodiversity, Study.com, GoodTherapy, Vox, and Verywell.


Random assignment refers to the use of chance procedures in psychology experiments to ensure that each participant has the same opportunity to be assigned to any given group in a study to eliminate any potential bias in the experiment at the outset. Participants are randomly assigned to different groups, such as the treatment group versus the control group. In clinical research, randomized clinical trials are known as the gold standard for meaningful results.

Simple random assignment techniques might involve tactics such as flipping a coin, drawing names out of a hat, rolling dice, or assigning random numbers to a list of participants. It is important to note that random assignment differs from random selection.

While random selection refers to how participants are randomly chosen from a target population as representatives of that population, random assignment refers to how those chosen participants are then assigned to experimental groups.
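The distinction can be made concrete with a short Python sketch; the population size and names below are hypothetical:

```python
import random

rng = random.Random(7)

# Hypothetical target population of 10,000 people
population = [f"person_{i}" for i in range(10_000)]

# Random SELECTION: draw a representative sample FROM the population
sample = rng.sample(population, 100)

# Random ASSIGNMENT: split the chosen participants INTO experimental groups
rng.shuffle(sample)
control_group, experimental_group = sample[:50], sample[50:]
```

Selection decides who enters the study (supporting external validity); assignment decides which group each chosen participant lands in (supporting internal validity).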

Random Assignment In Research

To determine if changes in one variable will cause changes in another variable, psychologists must perform an experiment. Random assignment is a critical part of the experimental design that helps ensure the reliability of the study outcomes.

Researchers often begin by forming a testable hypothesis predicting that one variable of interest will have some predictable impact on another variable.

The variable that the experimenters will manipulate in the experiment is known as the independent variable , while the variable that they will then measure for different outcomes is known as the dependent variable. While there are different ways to look at relationships between variables, an experiment is the best way to get a clear idea if there is a cause-and-effect relationship between two or more variables.

Once researchers have formulated a hypothesis, conducted background research, and chosen an experimental design, it is time to find participants for their experiment. How exactly do researchers decide who will be part of an experiment? As mentioned previously, this is often accomplished through something known as random selection.

Random Selection

In order to generalize the results of an experiment to a larger group, it is important to choose a sample that is representative of the qualities found in that population. For example, if the total population is 60% female and 40% male, then the sample should reflect those same percentages.

Choosing a representative sample is often accomplished by randomly picking people from the population to be participants in a study. Random selection means that everyone in the group stands an equal chance of being chosen to minimize any bias. Once a pool of participants has been selected, it is time to assign them to groups.

By randomly assigning the participants into groups, the experimenters can be fairly sure that each group will have the same characteristics before the independent variable is applied.

Participants might be randomly assigned to the control group , which does not receive the treatment in question. The control group may receive a placebo or receive the standard treatment. Participants may also be randomly assigned to the experimental group , which receives the treatment of interest. In larger studies, there can be multiple treatment groups for comparison.

There are simple methods of random assignment, like rolling a die. However, there are more complex techniques that involve random number generators to remove any human error.

There can also be random assignment to groups with pre-established rules or parameters. For example, if you want to have an equal number of men and women in each of your study groups, you might separate your sample into two groups (by sex) before randomly assigning each of those groups into the treatment group and control group.

Random assignment is essential because it increases the likelihood that the groups are the same at the outset. With all characteristics being equal between groups, other than the application of the independent variable, any differences found between group outcomes can be more confidently attributed to the effect of the intervention.

Example of Random Assignment

Imagine that a researcher is interested in learning whether or not drinking caffeinated beverages prior to an exam will improve test performance. After randomly selecting a pool of participants, each person is randomly assigned to either the control group or the experimental group.

The participants in the control group consume a placebo drink prior to the exam that does not contain any caffeine. Those in the experimental group, on the other hand, consume a caffeinated beverage before taking the test.

Participants in both groups then take the test, and the researcher compares the results to determine if the caffeinated beverage had any impact on test performance.

A Word From Verywell

Random assignment plays an important role in the psychology research process. Not only does this process help eliminate possible sources of bias, but it also makes it easier to generalize the results of a tested sample of participants to a larger population.

Random assignment helps ensure that members of each group in the experiment are the same, which means that the groups are also likely more representative of what is present in the larger population of interest. Through the use of this technique, psychology researchers are able to study complex phenomena and contribute to our understanding of the human mind and behavior.




COMMENTS

  1. Random Assignment in Experiments

    Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment and avoid biases. In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables.

  2. Random Assignment in Psychology: Definition & Examples

    Thus, any changes that result from the independent variable can be assumed to be a result of the treatment of interest. This is particularly important for eliminating sources of bias and strengthening the internal validity of an experiment. Random assignment is the best method for inferring a causal relationship between a treatment and an outcome.

  3. 1.3: Threats to Internal Validity and Different Control Techniques

    Random assignment. Matching; Holding Extraneous Variable Constant; Building Extraneous Variables into Design; Internal validity is often the focus from a research design perspective. To understand the pros and cons of various designs and to be able to better judge specific designs, we identify specific threats to internal validity. Before we do ...

  4. Internal Validity in Research

    Altering the experimental design can counter several threats to internal validity in multi-group studies. Random assignment of participants to groups counters selection bias and regression to the mean by making groups comparable at the start of the study. Blinding participants to the aim of the study counters the effects of social interaction.

  5. Internal Validity vs. External Validity in Research

    The essential difference between internal validity and external validity is that internal validity refers to the structure of a study (and its variables) while external validity refers to the universality of the results. But there are further differences between the two as well. For instance, internal validity focuses on showing a difference ...

  6. Internal Validity

    Internal validity makes the conclusions of a causal relationship credible and trustworthy. Without high internal validity, an experiment cannot demonstrate a causal link between two variables. ... Random assignment of participants to groups counters selection bias and regression to the mean by making groups comparable at the start of the study.

  7. 3.7 Internal Validity Evidence and Random Assignment

    The use of random assignment cannot eliminate this possibility, but it greatly reduces it. Internal validity is the degree to which cause-and-effect inferences are accurate and meaningful. Causal attribution is the goal for many researchers. Thus, by using random assignment we have a pretty high degree of evidence for internal validity; we have ...

  8. PDF Internal Validity page 1

    Internal validity is the confidence you can have that the independent variable is responsible (caused) changes in the dependent variable. Random assignment increases internal validity by reducing the risk of systematic pre-existing differences between the levels of the independent variable. Studies that use random assignment are called experiments.

  9. Internal validity

    When considering only Internal Validity, highly controlled true experimental designs (i.e. with random selection, random assignment to either the control or experimental groups, reliable instruments, reliable manipulation processes, and safeguards against confounding factors) may be the "gold standard" of scientific research.

  10. Random Assignment in Experiments

    Because random assignment helps ensure that the groups are comparable when the experiment begins, you can be more confident that the treatments caused the post-study differences. Random assignment helps increase the internal validity of your study. Comparing the Vitamin Study With and Without Random Assignment

  11. Internal Validity

    The internal validity is high because the random assignment helps ensure that any observed differences between the groups can be attributed to the medication rather than other factors. Example 2: Education Intervention: A researcher investigates the impact of a new teaching method on student performance in mathematics.

  12. Random Assignment

    M.M. Mark and C.S. Reichardt, in International Encyclopedia of the Social & Behavioral Sciences, 2001. 5.1 Misconception #1: Successful Random Assignment Guarantees Internal Validity. Although random assignment of participants (or other units) to treatment condition can greatly enhance the likelihood of internal validity, problems can still occur.

  13. Internal Validity Evidence and Random Assignment

    Internal validity is the degree to which cause-and-effect inferences are accurate and meaningful. Causal attribution is the goal for many researchers. Thus, by using random assignment we have a pretty high degree of evidence for internal validity; we have a much higher belief in causal inferences. Much like evidence used in a court of law, it ...

  14. The conflict between random assignment and treatment ...

    Abstract. The gold standard for most clinical and services outcome studies is random assignment to treatment condition because this kind of design diminishes many threats to internal validity. Although we agree with the power of randomized clinical trials, we argue in this paper that random assignment raises other, unanticipated threats to ...

  15. Difference between Random Selection and Random Assignment

    Random selection is thus essential to external validity, or the extent to which the researcher can use the results of the study to generalize to the larger population. Random assignment is central to internal validity, which allows the researcher to make causal claims about the effect of the treatment. Nonrandom assignment often leads to non ...

  16. Random Assignment in Experiments

    Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment. In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables.

  17. What's the difference between random assignment and random ...

    Random selection, or random sampling, is a way of selecting members of a population for your study's sample. In contrast, random assignment is a way of sorting the sample into control and experimental groups. Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal ...
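    The distinction above — random sampling draws the study's participants from the population, while random assignment then sorts that sample into control and experimental groups — can be sketched in a few lines of Python. This is an illustrative sketch, not code from any of the cited sources; the function names, the population labels, and the 50/50 split are all assumptions made for the example.

    ```python
    import random

    def random_selection(population, sample_size, seed=None):
        """Randomly sample participants from the population.
        Supports external validity (generalizability)."""
        rng = random.Random(seed)
        return rng.sample(population, sample_size)

    def random_assignment(sample, seed=None):
        """Shuffle the sample and split it into control and experimental
        groups, giving every participant an equal chance of either group.
        Supports internal validity (causal claims)."""
        rng = random.Random(seed)
        shuffled = list(sample)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        return {"control": shuffled[:half], "experimental": shuffled[half:]}

    # Hypothetical community of 10,000 people, labeled P0000..P9999
    population = [f"P{i:04d}" for i in range(10_000)]
    sample = random_selection(population, 100, seed=42)
    groups = random_assignment(sample, seed=42)
    print(len(groups["control"]), len(groups["experimental"]))  # 50 50
    ```

    Note that neither the researcher nor the participant chooses the group: the shuffle alone determines placement, which is what makes the allocation unbiased.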

  18. Internal and External Validity

    Internal validity is the degree of confidence that a causal relationship exists between the treatment and the difference in outcomes. In other words, how well did the researchers perform the study? ... Use random assignment to create control and treatment groups that are equivalent at the beginning. Include a control group to understand ...

  19. Random Selection vs. Random Assignment

    The internal validity of the study has been compromised because the difference in weight loss could actually just be due to gender, rather than the new diet. Example 3: Using only Random Assignment. Study: Researchers want to know whether a new diet leads to more weight loss than a standard diet in a certain community of 10,000 people. They ...

  20. 3.6 Causation and Random Assignment

    Random assignment does not guarantee that the groups are matched or equivalent. The groups may still differ on some preexisting attribute due to chance. The use of random assignment cannot eliminate this possibility, but it greatly reduces it. We use the term internal validity to describe the degree to which cause-and-effect inferences are ...

  21. What Is Internal Validity?

    Internal validity refers to the extent to which a research design minimizes the likelihood of alternative explanations for the observed effect. ... Random assignment (randomization). By randomly assigning participants to groups, you counter the effects of selection bias and regression to the mean ...

  22. What tradeoffs are there between internal and external validity?

    To increase internal validity, we tend to use random assignment to treatment and control groups (discussed in Campbell's paper). The idea is that random assignment randomly distributes all sorts of confounding variables that you haven't accounted for among both groups, so they should be equally balanced (or, if they're not, it was by random ...
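    The claim above — that random assignment distributes unaccounted-for confounds roughly evenly between groups — can be checked with a small simulation. This is a minimal sketch, assuming a hypothetical unmeasured confound (here, baseline weight drawn from a normal distribution); the numbers are illustrative, not taken from any cited study.

    ```python
    import random
    import statistics

    rng = random.Random(0)

    # 200 participants, each with an unmeasured pre-existing attribute
    # (e.g. baseline weight) that the researcher has not controlled for.
    participants = [rng.gauss(80, 10) for _ in range(200)]

    def assign(sample):
        """Randomly split the sample into two equal groups."""
        shuffled = list(sample)
        rng.shuffle(shuffled)
        return shuffled[:100], shuffled[100:]

    # Across many randomizations, the mean group difference on the
    # unmeasured confound hovers around zero: chance balances what
    # the researcher did not control.
    diffs = []
    for _ in range(1000):
        control, treatment = assign(participants)
        diffs.append(statistics.mean(treatment) - statistics.mean(control))

    print(round(statistics.mean(diffs), 2))  # typically a value near 0.0
    ```

    Any single randomization can still produce unbalanced groups by chance, which is why random assignment reduces, but does not eliminate, threats to internal validity.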

  23. The Definition of Random Assignment In Psychology

    Random assignment refers to the use of chance procedures in psychology experiments to ensure that each participant has the same opportunity to be assigned to any given group in a study to eliminate any potential bias in the experiment at the outset. Participants are randomly assigned to different groups, such as the treatment group versus the ...

  24. Ch. 10 Quiz Flashcards

    Study with Quizlet and memorize flashcards containing terms like: "Random selection enhances _____ validity, and random assignment enhances _____ validity" (options: internal; internal / external; external / external; internal / internal; external), and "Which of the following phrases describes a manipulated variable?" ("Participants were placed in the high tempo music condition, the low tempo music condition, or the ...