Types of Variables in Research | Definitions & Examples

Published on 19 September 2022 by Rebecca Bevans. Revised on 28 November 2022.

In statistical research, a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design.

You need to know which types of variables you are working with in order to choose appropriate statistical tests and interpret the results of your study.

You can usually identify the type of variable by asking two questions:

  • What type of data does the variable contain?
  • What part of the experiment does the variable represent?

Table of contents

  • Types of data: quantitative vs categorical variables
  • Parts of the experiment: independent vs dependent variables
  • Other common types of variables
  • Frequently asked questions about variables

Data is a specific measurement of a variable – it is the value you record in your data sheet. Data is generally divided into two categories:

  • Quantitative data represents amounts.
  • Categorical data represents groupings.

A variable that contains quantitative data is a quantitative variable; a variable that contains categorical data is a categorical variable. Each of these types of variable can be broken down into further types.

Quantitative variables

When you collect quantitative data, the numbers you record represent real amounts that can be added, subtracted, divided, etc. There are two types of quantitative variables: discrete and continuous.

Discrete vs continuous variables

| Type of variable | What does the data represent? |
|---|---|
| Discrete variables (aka integer variables) | Counts of individual items or values. |
| Continuous variables (aka ratio variables) | Measurements of continuous or non-finite values. |

Categorical variables

Categorical variables represent groupings of some kind. They are sometimes recorded as numbers, but the numbers represent categories rather than actual amounts of things.

There are three types of categorical variables: binary, nominal, and ordinal variables.

Binary vs nominal vs ordinal variables

| Type of variable | What does the data represent? |
|---|---|
| Binary variables (aka dichotomous variables) | Yes/no outcomes. |
| Nominal variables | Groups with no rank or order between them. |
| Ordinal variables | Groups that are ranked in a specific order. |

Note that a variable can sometimes work as more than one type. An ordinal variable can also be treated as a quantitative variable if the scale is numeric and doesn’t need to be kept as discrete integers. For example, star ratings on product reviews are ordinal (1 to 5 stars), but the average star rating is quantitative.
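The star-rating example can be illustrated with a short Python sketch. The ratings below are made up for illustration. The median only relies on the order of the categories (an ordinal view), while the mean treats the stars as amounts (a quantitative view).

```python
from statistics import mean, median

# Hypothetical star ratings (1 to 5) from product reviews.
ratings = [5, 4, 4, 3, 5, 2, 4]

# Ordinal view: the median depends only on the ranking of the categories.
median_rating = median(ratings)

# Quantitative view: the mean assumes the gaps between stars are equal,
# and the result no longer has to be a whole number of stars.
mean_rating = mean(ratings)

print(f"median (ordinal view): {median_rating}")
print(f"mean (quantitative view): {mean_rating:.2f}")
```

Whether the mean is meaningful depends on whether you are willing to assume that the distance between adjacent ratings is the same everywhere on the scale.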

Example data sheet

To keep track of your salt-tolerance experiment, you make a data sheet where you record information about the variables in the experiment, like salt addition and plant health.

To gather information about plant responses over time, you can fill out the same data sheet every few days until the end of the experiment. This example sheet is colour-coded according to the type of variable: nominal, continuous, ordinal, and binary.
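As a rough sketch of how such a data sheet might look in code, the Python example below defines one row of the sheet and tags each column with its variable type. The column names, the wilting scale, and the values are assumptions made for illustration; they are not taken from the original data sheet.

```python
from dataclasses import dataclass, fields

@dataclass
class Observation:
    """One row of a hypothetical salt-tolerance data sheet."""
    plant_species: str      # nominal: groups with no rank or order
    salt_added_g: float     # continuous: a measured amount
    plant_height_cm: float  # continuous: a measured amount
    wilting_score: int      # ordinal: assumed scale from 1 (none) to 5 (severe)
    survived: bool          # binary: yes/no outcome

# Variable type of each column (assumed for illustration).
VARIABLE_TYPES = {
    "plant_species": "nominal",
    "salt_added_g": "continuous",
    "plant_height_cm": "continuous",
    "wilting_score": "ordinal",
    "survived": "binary",
}

def columns_of_type(kind):
    """Return the column names recorded as the given variable type."""
    return [f.name for f in fields(Observation) if VARIABLE_TYPES[f.name] == kind]

# Made-up observations, one per plant.
sheet = [
    Observation("species_a", 0.0, 14.2, 1, True),
    Observation("species_a", 5.0, 11.8, 3, True),
    Observation("species_b", 10.0, 7.5, 5, False),
]

print(columns_of_type("continuous"))
```

Recording the variable type next to each column makes it easier to pick a suitable summary or statistical test later on.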

Example data sheet showing types of variables in a plant salt tolerance experiment

Experiments are usually designed to find out what effect one variable has on another – in our example, the effect of salt addition on plant growth.

You manipulate the independent variable (the one you think might be the cause) and then measure the dependent variable (the one you think might be the effect) to find out what this effect might be.

You will probably also have variables that you hold constant (control variables) in order to focus on your experimental treatment.

Independent vs dependent vs control variables

| Type of variable | Definition | Example (salt tolerance experiment) |
|---|---|---|
| Independent variables (aka treatment variables) | Variables you manipulate in order to affect the outcome of an experiment. | The amount of salt added to each plant’s water. |
| Dependent variables (aka response variables) | Variables that represent the outcome of the experiment. | Any measurement of plant health and growth: in this case, plant height and wilting. |
| Control variables | Variables that are held constant throughout the experiment. | The temperature and light in the room the plants are kept in, and the volume of water given to each plant. |

In this experiment, we have one independent and three dependent variables.

The other variables in the sheet can’t be classified as independent or dependent, but they do contain data that you will need in order to interpret your dependent and independent variables.
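A minimal Python sketch of how the independent and dependent variables relate in an analysis: group the measured outcome (plant height) by the manipulated treatment (salt added) and compare the group means. The records are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (salt added in grams, plant height in cm).
# Salt added is the independent variable; plant height is a dependent variable.
records = [(0, 14.0), (0, 15.0), (5, 12.0), (5, 11.0), (10, 8.0), (10, 9.0)]

def mean_outcome_by_treatment(rows):
    """Average the dependent variable within each level of the independent variable."""
    groups = defaultdict(list)
    for treatment, outcome in rows:
        groups[treatment].append(outcome)
    return {level: mean(values) for level, values in sorted(groups.items())}

means = mean_outcome_by_treatment(records)
print(means)
```

Control variables do not appear in the calculation at all: because they are held constant, they cannot explain differences between the treatment groups.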

Example of a data sheet showing dependent and independent variables for a plant salt tolerance experiment.

What about correlational research?

When you do correlational research, the terms ‘dependent’ and ‘independent’ don’t apply, because you are not trying to establish a cause-and-effect relationship.

However, there might be cases where one variable clearly precedes the other (for example, rainfall leads to mud, rather than the other way around). In these cases, you may call the preceding variable (i.e., the rainfall) the predictor variable and the following variable (i.e., the mud) the outcome variable.
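To make the predictor/outcome idea concrete, here is a small Python sketch that computes a Pearson correlation coefficient between a predictor and an outcome. The rainfall and mud values are invented, and `pearson_r` is a hand-written helper rather than a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical daily measurements.
rainfall_mm = [0, 5, 10, 20, 40]          # predictor variable
mud_depth_cm = [0.0, 0.5, 1.5, 2.0, 4.5]  # outcome variable

r = pearson_r(rainfall_mm, mud_depth_cm)
print(f"r = {r:.2f}")
```

The coefficient is symmetric, so it cannot tell you which variable comes first. Labelling rainfall as the predictor reflects what you know about the sequence of events, not anything in the calculation.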

Once you have defined your independent and dependent variables and determined whether they are categorical or quantitative, you will be able to choose the correct statistical test.

But there are many other ways of describing variables that help with interpreting your results. Some useful types of variable are listed below.

| Type of variable | Definition | Example (salt tolerance experiment) |
|---|---|---|
| Confounding variables | A variable that hides the true effect of another variable in your experiment. This can happen when another variable is closely related to a variable you are interested in, but you haven’t controlled it in your experiment. | Pot size and soil type might affect plant survival as much as or more than salt additions. In an experiment, you would control these potential confounders by holding them constant. |
| Latent variables | A variable that can’t be directly measured, but that you represent via a proxy. | Salt tolerance in plants cannot be measured directly, but can be inferred from measurements of plant health in our salt-addition experiment. |
| Composite variables | A variable that is made by combining multiple variables in an experiment. These variables are created when you analyse data, not when you measure it. | The three plant-health variables could be combined into a single plant-health score to make it easier to present your findings. |
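One possible way to build a composite variable is sketched below in Python. It assumes the three plant-health variables are plant height, a wilting score from 1 (none) to 5 (severe), and survival; the scales, the maximum height, and the equal weighting are all assumptions made for illustration, not part of the original experiment.

```python
def rescale(value, low, high):
    """Map a value from the range [low, high] onto [0, 1]."""
    return (value - low) / (high - low)

def plant_health_score(height_cm, wilting_score, survived, max_height_cm=30.0):
    """Combine three plant-health variables into one score between 0 and 1."""
    height_part = rescale(height_cm, 0.0, max_height_cm)
    # Wilting runs from 1 (none) to 5 (severe), so invert it: less wilting is healthier.
    wilting_part = 1.0 - rescale(wilting_score, 1, 5)
    survival_part = 1.0 if survived else 0.0
    # Equal weights are an arbitrary choice for this sketch.
    return (height_part + wilting_part + survival_part) / 3

print(round(plant_health_score(15.0, 3, True), 3))
```

Putting each variable on a common 0 to 1 scale before averaging stops the variable with the largest raw numbers from dominating the composite.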

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

Discrete and continuous variables are two types of quantitative variables:

  • Discrete variables represent counts (e.g., the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g., water volume or weight).

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The independent variable is the amount of nutrients added to the crop field.
  • The dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design.

Cite this Scribbr article

Bevans, R. (2022, November 28). Types of Variables in Research | Definitions & Examples. Scribbr. Retrieved 3 September 2024, from https://www.scribbr.co.uk/research-methods/variables-types/

Variables in Research | Types, Definition & Examples

  • Introduction
  • What is a variable?
  • What are the 5 types of variables in research?
  • Other variables in research

Variables are fundamental components of research that allow for the measurement and analysis of data. They can be defined as characteristics or properties that can take on different values. In research design, understanding the types of variables and their roles is crucial for developing hypotheses, designing methods, and interpreting results.

This article outlines the types of variables in research, including their definitions and examples, to provide a clear understanding of their use and significance in research studies. By categorizing variables into distinct groups based on their roles in research, their types of data, and their relationships with other variables, researchers can more effectively structure their studies and achieve more accurate conclusions.

A variable represents any characteristic, number, or quantity that can be measured or quantified. The term encompasses anything that can vary or change, ranging from simple concepts like age and height to more complex ones like satisfaction levels or economic status. Variables are essential in research as they are the foundational elements that researchers manipulate, measure, or control to gain insights into relationships, causes, and effects within their studies. They enable the framing of research questions, the formulation of hypotheses, and the interpretation of results.

Variables can be categorized based on their role in the study (such as independent and dependent variables), the type of data they represent (quantitative or categorical), and their relationship to other variables (like confounding or control variables). Understanding what constitutes a variable and the various variable types available is a critical step in designing robust and meaningful research.

Variables are crucial components in research, serving as the foundation for data collection, analysis, and interpretation. They are attributes or characteristics that can vary among subjects or over time, and understanding their types is essential for any study. Variables can be broadly classified into five main types, each with its distinct characteristics and roles within research.

This classification helps researchers in designing their studies, choosing appropriate measurement techniques, and analyzing their results accurately. The five types of variables include independent variables, dependent variables, categorical variables, continuous variables, and confounding variables. These categories not only facilitate a clearer understanding of the data but also guide the formulation of hypotheses and research methodologies.

Independent variables

Independent variables are foundational to the structure of research, serving as the factors or conditions that researchers manipulate or vary to observe their effects on dependent variables. These variables are considered "independent" because their variation does not depend on other variables within the study. Instead, they are the cause or stimulus that directly influences the outcomes being measured. For example, in an experiment to assess the effectiveness of a new teaching method on student performance, the teaching method applied (traditional vs. innovative) would be the independent variable.

The selection of an independent variable is a critical step in research design, as it directly correlates with the study's objective to determine causality or association. Researchers must clearly define and control these variables to ensure that observed changes in the dependent variable can be attributed to variations in the independent variable, thereby affirming the reliability of the results. In experimental research, the independent variable is what differentiates the control group from the experimental group, thereby setting the stage for meaningful comparison and analysis.

Dependent variables

Dependent variables are the outcomes or effects that researchers aim to explore and understand in their studies. These variables are called "dependent" because their values depend on the changes or variations of the independent variables.

Essentially, they are the responses or results that are measured to assess the impact of the independent variable's manipulation. For instance, in a study investigating the effect of exercise on weight loss, the amount of weight lost would be considered the dependent variable, as it depends on the exercise regimen (the independent variable).

The identification and measurement of the dependent variable are crucial for testing the hypothesis and drawing conclusions from the research. It allows researchers to quantify the effect of the independent variable, providing evidence for causal relationships or associations. In experimental settings, the dependent variable is what is being tested and measured across different groups or conditions, enabling researchers to assess the efficacy or impact of the independent variable's variation.

To ensure accuracy and reliability, the dependent variable must be defined clearly and measured consistently across all participants or observations. This consistency helps in reducing measurement errors and increases the validity of the research findings. By carefully analyzing the dependent variables, researchers can derive meaningful insights from their studies, contributing to the broader knowledge in their field.

Categorical variables

Categorical variables, also known as qualitative variables, represent types or categories that are used to group observations. These variables divide data into distinct groups or categories that lack a numerical value but hold significant meaning in research. Examples of categorical variables include gender (male, female, other), type of vehicle (car, truck, motorcycle), or marital status (single, married, divorced). These categories help researchers organize data into groups for comparison and analysis.

Categorical variables can be further classified into two subtypes: nominal and ordinal. Nominal variables are categories without any inherent order or ranking among them, such as blood type or ethnicity. Ordinal variables, on the other hand, imply a sort of ranking or order among the categories, like levels of satisfaction (high, medium, low) or education level (high school, bachelor's, master's, doctorate).
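The practical difference between the two subtypes can be sketched in Python. Nominal categories can only be counted, whereas ordinal categories can also be sorted and compared. The sample values are invented, and the `Education` enum is a hypothetical encoding of the ranking described above.

```python
from collections import Counter
from enum import IntEnum

class Education(IntEnum):
    """Ordinal categories: the integer values only encode rank, not amounts."""
    HIGH_SCHOOL = 1
    BACHELORS = 2
    MASTERS = 3
    DOCTORATE = 4

# Nominal data: counting category frequencies is meaningful, ranking is not.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]
blood_type_counts = Counter(blood_types)
most_common_blood_type = blood_type_counts.most_common(1)[0][0]

# Ordinal data: categories can be sorted, so a middle (median) category exists.
education = [
    Education.MASTERS,
    Education.HIGH_SCHOOL,
    Education.BACHELORS,
    Education.DOCTORATE,
    Education.BACHELORS,
]
ranked = sorted(education)
middle = ranked[len(ranked) // 2]  # median category for an odd-length list

print(most_common_blood_type, middle.name)
```

Asking whether blood type "O" is greater than "A" has no sensible answer, while `Education.MASTERS > Education.BACHELORS` does; that is the distinction the two subtypes capture.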

Understanding and identifying categorical variables is crucial in research as it influences the choice of statistical analysis methods. Since these variables represent categories without numerical significance, researchers employ specific statistical tests designed for a nominal or ordinal variable to draw meaningful conclusions. Properly classifying and analyzing categorical variables allow for the exploration of relationships between different groups within the study, shedding light on patterns and trends that might not be evident with numerical data alone.

Continuous variables

Continuous variables are quantitative variables that can take an infinite number of values within a given range. These variables are measured along a continuum and can represent very precise measurements. Examples of continuous variables include height, weight, temperature, and time. Because they can assume any value within a range, continuous variables allow for detailed analysis and a high degree of accuracy in research findings.

The ability to measure continuous variables at very fine scales makes them invaluable for many types of research, particularly in the natural and social sciences. For instance, in a study examining the effect of temperature on plant growth, temperature would be considered a continuous variable since it can vary across a wide spectrum and be measured to several decimal places.

When dealing with continuous variables, researchers often use methods incorporating a particular statistical test to accommodate a wide range of data points and the potential for infinite divisibility. This includes various forms of regression analysis, correlation, and other techniques suited for modeling and analyzing nuanced relationships between variables. The precision of continuous variables enhances the researcher's ability to detect patterns, trends, and causal relationships within the data, contributing to more robust and detailed conclusions.
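As an illustration of regression with continuous variables, the Python sketch below fits a straight line by ordinary least squares. The temperature and growth values are made up, and `fit_line` is a hand-written helper that assumes a simple linear relationship.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical continuous measurements.
temperature_c = [15.0, 18.5, 21.0, 24.5, 27.0]  # independent variable
growth_cm = [2.1, 2.9, 3.4, 4.2, 4.6]           # dependent variable

slope, intercept = fit_line(temperature_c, growth_cm)
print(f"growth changes by about {slope:.2f} cm per degree Celsius")
```

The slope is only interpretable because both variables are continuous: a change of one degree means the same thing anywhere on the temperature scale.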

Confounding variables

Confounding variables are those that can cause a false association between the independent and dependent variables, potentially leading to incorrect conclusions about the relationship being studied. These are extraneous variables that were not considered in the study design but can influence both the supposed cause and effect, creating a misleading correlation.

Identifying and controlling for a confounding variable is crucial in research to ensure the validity of the findings. This can be achieved through various methods, including randomization, stratification, and statistical control. Randomization helps to evenly distribute confounding variables across study groups, reducing their potential impact. Stratification involves analyzing the data within strata or layers that share common characteristics of the confounder. Statistical control allows researchers to adjust for the effects of confounders in the analysis phase.
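Stratification can be sketched in a few lines of Python. The example below uses invented data in which a confounder (an age group) is unevenly distributed between the treated and control groups, and compares the pooled difference in means with the difference within each stratum.

```python
from statistics import mean

# Hypothetical rows: (stratum of the confounder, treated?, outcome).
rows = [
    ("young", True, 8.0), ("young", True, 8.0), ("young", True, 8.0),
    ("young", False, 7.0),
    ("old", True, 3.0),
    ("old", False, 2.0), ("old", False, 2.0), ("old", False, 2.0),
]

def difference_in_means(data):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    treated = [outcome for _, is_treated, outcome in data if is_treated]
    control = [outcome for _, is_treated, outcome in data if not is_treated]
    return mean(treated) - mean(control)

# Pooled analysis ignores the confounder.
pooled = difference_in_means(rows)

# Stratified analysis compares treated and control rows within each stratum.
strata = sorted({stratum for stratum, _, _ in rows})
per_stratum = {
    stratum: difference_in_means([r for r in rows if r[0] == stratum])
    for stratum in strata
}

print(f"pooled difference: {pooled}")
print(f"within-stratum differences: {per_stratum}")
```

In this made-up data the pooled difference is 3.5, but within each age group the difference is only 1.0. The gap arises because most treated participants are young and young participants have higher outcomes regardless of treatment.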

Properly addressing confounding variables strengthens the credibility of research outcomes by clarifying the direct relationship between the dependent and independent variables, thus providing more accurate and reliable results.

Beyond the primary categories of variables commonly discussed in research methodology, there exists a diverse range of other variables that play significant roles in the design and analysis of studies. Below is an overview of some of these variables, highlighting their definitions and roles within research studies:

  • Discrete variables: A discrete variable is a quantitative variable that represents countable data, such as the number of children in a family or the number of cars in a parking lot. Discrete variables can only take on specific values.
  • Categorical variables: A categorical variable categorizes subjects or items into groups that do not have a natural numerical order. Categorical data includes nominal variables, like country of origin, and ordinal variables, such as education level.
  • Predictor variables: Often used in statistical models, a predictor variable is used to forecast or predict the outcomes of other variables, not necessarily with a causal implication.
  • Outcome variables: These variables represent the results or outcomes that researchers aim to explain or predict through their studies. An outcome variable is central to understanding the effects of predictor variables.
  • Latent variables: Not directly observable, latent variables are inferred from other, directly measured variables. Examples include psychological constructs like intelligence or socioeconomic status.
  • Composite variables: Created by combining multiple variables, composite variables can measure a concept more reliably or simplify the analysis. An example would be a composite happiness index derived from several survey questions.
  • Preceding variables: These variables come before other variables in time or sequence, potentially influencing subsequent outcomes. A preceding variable is crucial in longitudinal studies to determine causality or sequences of events.

Research Variables 101

Independent variables, dependent variables, control variables and more

By: Derek Jansen (MBA) | Expert Reviewed By: Kerryn Warren (PhD) | January 2023

If you’re new to the world of research, especially scientific research, you’re bound to run into the concept of variables, sooner or later. If you’re feeling a little confused, don’t worry – you’re not the only one! Independent variables, dependent variables, confounding variables – it’s a lot of jargon. In this post, we’ll unpack the terminology surrounding research variables using straightforward language and loads of examples.

Overview: Variables In Research

1. What is a variable?
2. Independent variables
3. Dependent variables
4. Control variables
5. Moderating variables
6. Mediating variables
7. Confounding variables
8. Latent variables

What (exactly) is a variable?

The simplest way to understand a variable is as any characteristic or attribute that can experience change or vary over time or context – hence the name “variable”. For example, the dosage of a particular medicine could be classified as a variable, as the amount can vary (i.e., a higher dose or a lower dose). Similarly, gender, age or ethnicity could be considered demographic variables, because each person varies in these respects.

Within research, especially scientific research, variables form the foundation of studies, as researchers are often interested in how one variable impacts another, and the relationships between different variables. For example:

  • How someone’s age impacts their sleep quality
  • How different teaching methods impact learning outcomes
  • How diet impacts weight (gain or loss)

As you can see, variables are often used to explain relationships between different elements and phenomena. In scientific studies, especially experimental studies, the objective is often to understand the causal relationships between variables, in other words, the role of cause and effect. This is achieved by manipulating certain variables while controlling others – and then observing the outcome. But we’ll get into that a little later…

The “Big 3” Variables

Variables can be a little intimidating for new researchers because there are a wide variety of variables, and oftentimes, there are multiple labels for the same thing. To lay a firm foundation, we’ll first look at the three main types of variables, namely:

  • Independent variables (IV)
  • Dependent variables (DV)
  • Control variables

What is an independent variable?

Simply put, the independent variable is the “cause” in the relationship between two (or more) variables. In other words, when the independent variable changes, it has an impact on another variable.

For example:

  • Increasing the dosage of a medication (Variable A) could result in better (or worse) health outcomes for a patient (Variable B)
  • Changing a teaching method (Variable A) could impact the test scores that students earn in a standardised test (Variable B)
  • Varying one’s diet (Variable A) could result in weight loss or gain (Variable B).

It’s useful to know that independent variables can go by a few different names, including explanatory variables (because they explain an event or outcome) and predictor variables (because they predict the value of another variable). Terminology aside though, the most important takeaway is that independent variables are assumed to be the “cause” in any cause-effect relationship. As you can imagine, these types of variables are of major interest to researchers, as many studies seek to understand the causal factors behind a phenomenon.

What is a dependent variable?

While the independent variable is the “cause”, the dependent variable is the “effect” – or rather, the affected variable. In other words, the dependent variable is the variable that is assumed to change as a result of a change in the independent variable.

Keeping with the previous example, let’s look at some dependent variables in action:

  • Health outcomes (DV) could be impacted by dosage changes of a medication (IV)
  • Students’ scores (DV) could be impacted by teaching methods (IV)
  • Weight gain or loss (DV) could be impacted by diet (IV)

In scientific studies, researchers will typically pay very close attention to the dependent variable (or variables), carefully measuring any changes in response to hypothesised independent variables. This can be tricky in practice, as it’s not always easy to reliably measure specific phenomena or outcomes – or to be certain that the actual cause of the change is in fact the independent variable.

As the adage goes, correlation is not causation. In other words, just because two variables have a relationship doesn’t mean that it’s a causal relationship – they may just happen to vary together. For example, you could find a correlation between the number of people who own a certain brand of car and the number of people who have a certain type of job. Just because the number of people who own that brand of car and the number of people who have that type of job are correlated, it doesn’t mean that owning that brand of car causes someone to have that type of job or vice versa. The correlation could, for example, be caused by another factor such as income level or age group, which would affect both car ownership and job type.
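A small simulation can make this point tangible. The Python sketch below is a simplification of the car and job example: it generates a hypothetical "income" score, then derives a "car" score and a "job" score that each depend on income but not on each other. All names, coefficients, and the random seed are assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 2000

# Income is the third factor; it drives both of the other scores.
income = [rng.gauss(0, 1) for _ in range(n)]
car_score = [i + rng.gauss(0, 0.5) for i in income]
job_score = [i + rng.gauss(0, 0.5) for i in income]

# The two scores are strongly correlated even though neither causes the other.
raw_r = pearson_r(car_score, job_score)

# Subtract the income contribution (its coefficient of 1 is known here only
# because we generated the data ourselves) and the association disappears.
car_resid = [c - i for c, i in zip(car_score, income)]
job_resid = [j - i for j, i in zip(job_score, income)]
adjusted_r = pearson_r(car_resid, job_resid)

print(f"correlation ignoring income: {raw_r:.2f}")
print(f"correlation after removing income: {adjusted_r:.2f}")
```

With real data you would not know the income coefficient in advance and would have to estimate it, for example with a regression model, but the logic is the same.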

To confidently establish a causal relationship between an independent variable and a dependent variable (i.e., X causes Y), you’ll typically need an experimental design, where you have complete control over the environment and the variables of interest. But even so, this doesn’t always translate into the “real world”. Simply put, what happens in the lab sometimes stays in the lab!

As an alternative to pure experimental research, correlational or “quasi-experimental” research (where the researcher cannot manipulate or change variables) can be done on a much larger scale more easily, allowing one to understand specific relationships in the real world. These types of studies also assume some causality between independent and dependent variables, but it’s not always clear. So, if you go this route, you need to be cautious in terms of how you describe the impact and causality between variables and be sure to acknowledge any limitations in your own research.

What is a control variable?

In an experimental design, a control variable (or controlled variable) is a variable that is intentionally held constant to ensure it doesn’t have an influence on any other variables. As a result, this variable remains unchanged throughout the course of the study. In other words, it’s a variable that’s not allowed to vary – tough life 🙂

As we mentioned earlier, one of the major challenges in identifying and measuring causal relationships is that it’s difficult to isolate the impact of variables other than the independent variable. Simply put, there’s always a risk that there are factors beyond the ones you’re specifically looking at that might be impacting the results of your study. So, to minimise the risk of this, researchers will attempt (as best possible) to hold other variables constant. These factors are then considered control variables.

Some examples of variables that you may need to control include:

  • Temperature
  • Time of day
  • Noise or distractions

Which specific variables need to be controlled for will vary tremendously depending on the research project at hand, so there’s no generic list of control variables to consult. As a researcher, you’ll need to think carefully about all the factors that could vary within your research context and then consider how you’ll go about controlling them. A good starting point is to look at previous studies similar to yours and pay close attention to which variables they controlled for.

Of course, you won’t always be able to control every possible variable, and so, in many cases, you’ll just have to acknowledge their potential impact and account for them in the conclusions you draw. Every study has its limitations, so don’t get fixated or discouraged by troublesome variables. Nevertheless, always think carefully about the factors beyond what you’re focusing on – don’t make assumptions!

Other types of variables

As we mentioned, independent, dependent and control variables are the most common variables you’ll come across in your research, but they’re certainly not the only ones you need to be aware of. Next, we’ll look at a few “secondary” variables that you need to keep in mind as you design your research.

  • Moderating variables
  • Mediating variables
  • Confounding variables
  • Latent variables

Let’s jump into it…

What is a moderating variable?

A moderating variable is a variable that influences the strength or direction of the relationship between an independent variable and a dependent variable. In other words, moderating variables affect how much (or how little) the IV affects the DV, or whether the IV has a positive or negative relationship with the DV (i.e., moves in the same or opposite direction).

For example, in a study about the effects of sleep deprivation on academic performance, gender could be used as a moderating variable to see if there are any differences in how men and women respond to a lack of sleep. In such a case, one may find that gender has an influence on how much students’ scores suffer when they’re deprived of sleep.

It’s important to note that while moderators can have an influence on outcomes, they don’t necessarily cause them; rather they modify or “moderate” existing relationships between other variables. This means that it’s possible for two different groups with similar characteristics, but different levels of moderation, to experience very different results from the same experiment or study design.
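One simple way to look for moderation is to estimate the IV to DV relationship separately at each level of the moderator and compare the results. The Python sketch below does this with invented data for two levels of a moderator, labelled `group_a` and `group_b`; the numbers do not describe any real finding.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical rows: (moderator level, hours of sleep lost, test score).
data = [
    ("group_a", 0, 80), ("group_a", 2, 76), ("group_a", 4, 72),
    ("group_b", 0, 80), ("group_b", 2, 78), ("group_b", 4, 76),
]

# Estimate the effect of the IV (sleep lost) on the DV (score) per moderator level.
slopes = {}
for level in sorted({row[0] for row in data}):
    xs = [hours for group, hours, _ in data if group == level]
    ys = [score for group, _, score in data if group == level]
    slopes[level] = slope(xs, ys)

print(slopes)
```

Here both groups lose points as sleep deprivation increases, but one group loses them twice as fast. A difference in slopes like this is what a moderating variable looks like in the data; a formal analysis would typically test it with an interaction term in a regression model.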

What is a mediating variable?

Mediating variables are often used to explain the relationship between the independent and dependent variable(s). For example, if you were researching the effects of age on job satisfaction, then education level could be considered a mediating variable, as it may explain why older people have higher job satisfaction than younger people – they may have more experience or better qualifications, which lead to greater job satisfaction.

Mediating variables also help researchers understand how different factors interact with each other to influence outcomes. For instance, if you wanted to study the effect of stress on academic performance, then coping strategies might act as a mediating factor: stress levels may shape which coping strategies students use, and those strategies in turn influence academic performance. For example, students who respond to stress with effective coping strategies might perform better academically due to their improved mental state.

In addition, mediating variables can provide insight into causal relationships between two variables by helping researchers determine whether changes in one factor directly cause changes in another – or whether there is an indirect relationship between them mediated by some third factor(s). For instance, if you wanted to investigate the impact of parental involvement on student achievement, you would need to consider family dynamics as a potential mediator, since it could influence both parental involvement and student achievement simultaneously.

Mediating variables can explain the relationship between the independent and dependent variable, including whether it's causal or not.

What is a confounding variable?

A confounding variable (also known as a third variable or lurking variable) is an extraneous factor that can influence the relationship between two variables being studied. Specifically, for a variable to be considered a confounding variable, it needs to meet two criteria:

  • It must be correlated with the independent variable (this can be causal or not)
  • It must have a causal impact on the dependent variable (i.e., influence the DV)
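The two criteria above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data is randomly generated, the variable names are hypothetical, and "controlling" for the confounder is done here by removing its linear effect from both variables, which is one of several possible approaches.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return [(y - mean_y) - b * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

confounder = [rng.gauss(0, 1) for _ in range(n)]
# Criterion 1: the confounder is correlated with the independent variable.
iv = [c + rng.gauss(0, 1) for c in confounder]
# Criterion 2: the confounder causally affects the dependent variable.
# Note that the IV itself has NO effect on the DV in this simulation.
dv = [c + rng.gauss(0, 1) for c in confounder]

raw_r = pearson(iv, dv)
adjusted_r = pearson(residuals(iv, confounder), residuals(dv, confounder))

print(f"IV-DV correlation, ignoring the confounder: {raw_r:.2f}")
print(f"IV-DV correlation, confounder removed:      {adjusted_r:.2f}")
```

The first correlation is clearly positive even though the IV has no effect on the DV. Once the confounder’s influence is removed, the apparent relationship largely disappears.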

Some common examples of confounding variables include demographic factors such as gender, ethnicity, socioeconomic status, age, education level, and health status. In addition to these, there are also environmental factors to consider. For example, air pollution could confound the impact of the variables of interest in a study investigating health outcomes.

Naturally, it’s important to identify as many confounding variables as possible when conducting your research, as they can heavily distort the results and lead you to draw incorrect conclusions. So, always think carefully about what factors may have a confounding effect on your variables of interest and try to manage these as best you can.

What is a latent variable?

Latent variables are unobservable factors that can influence the behaviour of individuals and explain certain outcomes within a study. They’re also known as hidden or underlying variables, and what makes them rather tricky is that they can’t be directly observed or measured. Instead, latent variables must be inferred from other observable data points such as responses to surveys or experiments.

For example, in a study of mental health, the variable “resilience” could be considered a latent variable. It can’t be directly measured, but it can be inferred from measures of mental health symptoms, stress, and coping mechanisms. The same applies to a lot of concepts we encounter every day – for example:

  • Emotional intelligence
  • Quality of life
  • Business confidence
  • Ease of use

One way to overcome the challenge of measuring the unobservable is to use latent variable models (LVMs). An LVM is a type of statistical model that describes the relationship between observed variables and one or more unobserved (latent) variables. These models allow researchers to uncover patterns in their data that may not be visible otherwise, because of the complexity of the data and the interrelatedness of the variables. Those patterns can then inform hypotheses about cause-and-effect relationships among those variables. Powerful stuff, we say!
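The core idea behind inferring a latent variable from observed indicators can be sketched with a toy simulation. To be clear, this is not a fitted LVM: it simply assumes (hypothetically) that each survey item equals the latent trait plus random noise, and shows that combining several noisy items tracks the hidden trait better than any single item does.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical latent trait. In real research this column would not exist;
# it is only available here because the data is simulated.
resilience = [rng.gauss(0, 1) for _ in range(n)]

# Three observed indicators (e.g. survey items): latent trait plus noise.
indicators = [[r + rng.gauss(0, 1) for r in resilience] for _ in range(3)]

# Simple composite score: average the three indicators per respondent.
composite = [sum(items) / len(items) for items in zip(*indicators)]

single_item_r = [pearson(item, resilience) for item in indicators]
composite_r = pearson(composite, resilience)

print("Single items vs latent trait:", [round(r, 2) for r in single_item_r])
print("Composite vs latent trait:   ", round(composite_r, 2))
```

Real latent variable models (factor analysis, structural equation modelling and so on) go further by estimating how strongly each indicator relates to the latent variable, rather than weighting them equally.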

Let’s recap

In the world of scientific research, there’s no shortage of variable types, some of which have multiple names and some of which overlap with each other. In this post, we’ve covered some of the popular ones, but remember that this is not an exhaustive list.

To recap, we’ve explored:

  • Independent variables (the “cause”)
  • Dependent variables (the “effect”)
  • Control variables (the variable that’s not allowed to vary)


Independent vs Dependent Variables: Definitions & Examples

A variable is an important element of research. It is a characteristic, number, or quantity of any category that can be measured or counted and whose value may change with time or other parameters.  

Variables are defined in different ways in different fields. For instance, in mathematics, a variable is an alphabetic character that expresses a numerical value. In algebra, a variable represents an unknown entity, mostly denoted by a, b, c, x, y, z, etc. In statistics, variables represent real-world conditions or factors. Despite the differences in definitions, in all fields, variables represent the entity that changes and help us understand how one factor may or may not influence another factor.  

Variables in research and statistics are of different types: independent, dependent, quantitative (discrete or continuous), qualitative (nominal/categorical, ordinal), intervening, moderating, extraneous, confounding, control, and composite. In this article, we compare the first two types: independent vs dependent variables.  

What is a variable?  

Researchers conduct experiments to understand the cause-and-effect relationships between various entities. In such experiments, the entities whose values change are called variables. These variables describe the relationships among various factors and help in drawing conclusions in experiments. They help in understanding how some factors influence others. Some examples of variables include age, gender, race, income, weight, etc.   

As mentioned earlier, different types of variables are used in research. Of these, we will compare the most common types: independent vs dependent variables. The independent variable is the cause and the dependent variable is the effect; that is, independent variables influence dependent variables. In research, a dependent variable is the outcome of interest of the study and the independent variable is the factor that may influence the outcome. Let’s illustrate this with an example: in a study to analyze the effect of antibiotic use on microbial resistance, antibiotic use is the independent variable and microbial resistance is the dependent variable because antibiotic use affects microbial resistance.(1)  

What is an independent variable?  

Here is a list of the important characteristics of independent variables.(2,3)  

  • An independent variable is the factor that is being manipulated in an experiment.  
  • In a research study, independent variables affect or influence dependent variables and cause them to change.  
  • Independent variables help gather evidence and draw conclusions about the research subject.  
  • They’re also called predictors, factors, treatment variables, explanatory variables, and input variables.  
  • On graphs, independent variables are usually placed on the X-axis.  
  • Example: In a study on the relationship between screen time and sleep problems, screen time is the independent variable because it influences sleep (the dependent variable).  
  • In addition, some factors like age are independent variables because other variables such as a person’s income will not change their age.  

Types of independent variables  

Independent variables in research are of the following two types:(4)  

Quantitative  

Quantitative independent variables differ in amounts or scales. They are numeric and answer questions like “how many” or “how often.”  

Here are a few examples of quantitative independent variables:  

  • Differences in treatment dosages and frequencies: Useful in determining the appropriate dosage to get the desired outcome.  
  • Varying salinities: Useful in determining the range of salinity that organisms can tolerate.  

Qualitative  

Qualitative independent variables are non-numerical variables.  

A few examples of qualitative independent variables are listed below:  

  • Different strains of a species: Useful in identifying the strain of a crop that is most resistant to a specific disease.  
  • Varying methods of how a treatment is administered—oral or intravenous.  

A quantitative variable is represented by actual amounts and a qualitative variable by categories or groups.  

What is a dependent variable?  

Here are a few characteristics of dependent variables:(3)  

  • A dependent variable represents a quantity whose value depends on the independent variable and how it is changed.  
  • The dependent variable is influenced by the independent variable under various circumstances.  
  • It is also known as the response variable and outcome variable.  
  • On graphs, dependent variables are placed on the Y-axis.  

Here are a few examples of dependent variables:  

  • In a study on the effect of exercise on mood, the dependent variable is mood because it may change with exercise.  
  • In a study on the effect of pH on enzyme activity, the enzyme activity is the dependent variable because it changes with changing pH.   

Types of dependent variables  

Dependent variables are of two types:(5)  

Continuous dependent variables

These variables can take on any value within a given range and are measured on a continuous scale, for example, weight, height, temperature, time, distance, etc.  

Categorical or discrete dependent variables

These variables are divided into distinct categories. They are not measured on a continuous scale so only a limited number of values are possible, for example, gender, race, etc.  

Differences between independent and dependent variables  

The following table compares independent vs dependent variables.  

Aspect  Independent variable  Dependent variable 
How to identify  Manipulated or controlled  Observed or measured 
Purpose  Cause or predictor variable  Outcome or response variable 
Relationship  Independent of other variables  Influenced by the independent variable 
Control  Manipulated or assigned by researcher  Measured or observed during experiments 

Independent and dependent variable examples  

Listed below are a few examples of research questions from various disciplines and their corresponding independent and dependent variables.(6)

Discipline  Research question  Independent variable  Dependent variable 
Genetics  What is the relationship between genetics and susceptibility to diseases?  genetic factors  susceptibility to diseases 
History  How do historical events influence national identity?  historical events  national identity 
Political science  What is the effect of political campaign advertisements on voter behavior?  political campaign advertisements  voter behavior 
Sociology  How does social media influence cultural awareness?  social media exposure  cultural awareness 
Economics  What is the impact of economic policies on unemployment rates?  economic policies  unemployment rates 
Literature  How does literary criticism affect book sales?  literary criticism  book sales 
Geology  How do a region’s geological features influence the magnitude of earthquakes?  geological features  earthquake magnitudes 
Environment  How do changes in climate affect wildlife migration patterns?  climate changes  wildlife migration patterns 
Gender studies  What is the effect of gender bias in the workplace on job satisfaction?  gender bias  job satisfaction 
Film studies  What is the relationship between cinematographic techniques and viewer engagement?  cinematographic techniques  viewer engagement 
Archaeology  How does archaeological tourism affect local communities?  archaeological tourism  local community development 

  Independent vs dependent variables in research  

Experiments usually have at least two variables: independent and dependent. The independent variable is the entity that is being tested and the dependent variable is the result. Classifying independent and dependent variables as discrete or continuous can help in determining the type of analysis that is appropriate in any given research experiment, as shown in the table below.(7)  

Independent variable  Discrete dependent variable  Continuous dependent variable 
Discrete  Chi-square, logistic regression, phi, Cramer’s V  t-test, ANOVA, regression, point-biserial correlation 
Continuous  Logistic regression, point-biserial correlation  Regression, correlation 
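As a rough illustration, the table can be treated as a simple lookup keyed by the type of each variable. The sketch below assumes the table is read as a two-by-two grid (independent variable type by dependent variable type); the names `CANDIDATE_TESTS` and `candidate_tests` are hypothetical, and the result is a list of candidates to consider, not a recommendation for any particular study.

```python
# Candidate analyses keyed by (independent variable type, dependent variable type).
# "discrete" here covers categorical variables, as in the table.
CANDIDATE_TESTS = {
    ("discrete", "discrete"): [
        "chi-square", "logistic regression", "phi", "Cramer's V",
    ],
    ("discrete", "continuous"): [
        "t-test", "ANOVA", "regression", "point-biserial correlation",
    ],
    ("continuous", "discrete"): [
        "logistic regression", "point-biserial correlation",
    ],
    ("continuous", "continuous"): [
        "regression", "correlation",
    ],
}


def candidate_tests(iv_type, dv_type):
    """Return the candidate analyses for a given pair of variable types."""
    key = (iv_type.strip().lower(), dv_type.strip().lower())
    if key not in CANDIDATE_TESTS:
        raise ValueError(
            "Variable types must each be 'discrete' or 'continuous', "
            f"got {iv_type!r} and {dv_type!r}"
        )
    return CANDIDATE_TESTS[key]


# Example: treatment group (discrete IV) and blood pressure (continuous DV).
print(candidate_tests("discrete", "continuous"))
```

Choosing among the candidates still depends on details such as the number of groups, the sample size and the assumptions each test makes.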

  Here are some more research questions and their corresponding independent and dependent variables.(6)  

Research question  Independent variable  Dependent variable 
What is the impact of online learning platforms on academic performance?  type of learning  academic performance 
What is the association between exercise frequency and mental health?  exercise frequency  mental health 
How does smartphone use affect productivity?  smartphone use  productivity levels 
Does family structure influence adolescent behavior?  family structure  adolescent behavior 
What is the impact of nonverbal communication on job interviews?  nonverbal communication  job interviews 

  How to identify independent vs dependent variables  

In addition to all the characteristics of independent and dependent variables listed previously, here are a few simple steps to identify the variable types in a research question.(8)  

  • Keep in mind that there are no specific words that will always describe dependent and independent variables.  
  • If you’re given a paragraph, convert that into a question and identify specific words describing cause and effect.  
  • The word representing the cause is the independent variable and that describing the effect is the dependent variable.  

Let’s try out these steps with an example.  

A researcher wants to conduct a study to see if his new weight loss medication performs better than two best-selling alternatives. He wants to randomly select 20 subjects from Richmond, Virginia, aged 20 to 30 years and weighing above 60 pounds. Each subject will be randomly assigned to one of three treatment groups.  

To identify the independent and dependent variables, we convert this paragraph into a question, as follows: Does the new medication perform better than the alternatives? Here, the medications are the independent variable and their performances or effect on the individuals are the dependent variable.  
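The random assignment step in this example can be sketched in a few lines. This is only an illustration: the subject IDs, the group labels and the `assign_groups` helper are hypothetical, and the fixed seed is there purely to make the output repeatable.

```python
import random


def assign_groups(subjects, groups, seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(subject)
    return assignment


# 20 hypothetical subjects and three treatment groups.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
groups = ["new medication", "alternative A", "alternative B"]

assignment = assign_groups(subjects, groups, seed=42)

# The treatment group is the independent variable (set by the researcher);
# the weight change later measured for each subject is the dependent variable.
for group, members in assignment.items():
    print(group, len(members), members)
```

Because 20 does not divide evenly by three, two groups receive seven subjects and one receives six.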

Visualizing independent vs dependent variables  

Data visualization is the graphical representation of information by using charts, graphs, and maps. Visualizations help in making data more understandable by making it easier to compare elements, identify trends and relationships (among variables), among other functions.  

Bar graphs, pie charts, and scatter plots are the best methods to graphically represent variables. While pie charts and bar graphs are suitable for depicting categorical data, scatter plots are appropriate for quantitative data. The independent variable is usually placed on the X-axis and the dependent variable on the Y-axis.  

Figure 1 is a scatter plot that depicts the relationship between the number of household members and their monthly grocery expenses.(9) The number of household members is the independent variable and the expenses the dependent variable. The graph shows that as the number of members increases the expenditure also increases.  

[Figure 1: scatter plot of monthly grocery expenses (Y-axis) against number of household members (X-axis)]

Key takeaways   

Let’s summarize the key takeaways about independent vs dependent variables from this article:  

  • A variable is any entity being measured in a study.  
  • A dependent variable is often the focus of a research study and is the response or outcome. It depends on or varies with changes in other variables.  
  • Independent variables cause changes in dependent variables and don’t depend on other variables.  
  • An independent variable can influence a dependent variable, but a dependent variable cannot influence an independent variable.  
  • An independent variable is the cause and dependent variable is the effect.  

Frequently asked questions  

 1. What are the different types of variables used in research?  

The following table lists the different types of variables used in research.(10)  

Type of variable  Definition  Example 
Categorical  Measures a construct that has different categories  gender, race, religious affiliation, political affiliation 
Quantitative  Measures constructs that vary in degree or amount  weight, height, age, intelligence scores 
Independent (IV)  Measures constructs considered to be the cause  Higher education (IV) leads to higher income (DV) 
Dependent (DV)  Measures constructs that are considered the effect  Exercise (IV) will reduce anxiety levels (DV) 
Intervening or mediating (MV)  Measures constructs that intervene or stand in between the cause and effect  Incarcerated individuals are more likely to have a psychiatric disorder (MV), which leads to disability in social roles 
Confounding (CV)  “Rival explanations” that explain the cause-and-effect relationship  Age (CV) explains the relationship between increased shoe size and increase in intelligence in children 
Control variable  Extraneous variables whose influence can be controlled or eliminated  Demographic data such as gender, socioeconomic status, age 

 2. Why is it important to differentiate between independent vs dependent variables?  

  Differentiating between independent vs dependent variables is important to ensure the correct application in your own research and also the correct understanding of other studies. An incorrectly framed research question can lead to confusion and inaccurate results. An easy way to differentiate is to identify the cause and effect.  

 3. How are independent and dependent variables used in non-experimental research?  

  So far in this article, we have talked about variables in relation to experimental research, wherein variables are manipulated or measured to test a hypothesis, that is, to observe the effect on dependent variables. Let’s examine non-experimental research and how variables are used.(11) In non-experimental research, variables are not manipulated but are observed in their natural state. Researchers do not have control over the variables and cannot manipulate them based on their research requirements. For example, a study examining the relationship between income and education level would not manipulate either variable. Instead, the researcher would observe and measure the levels of each variable in the sample population. The level of control researchers have is the major difference between experimental and non-experimental research. Another difference is the causal relationship between the variables. In non-experimental research, it is not possible to establish a causal relationship because other variables may be influencing the outcome.  

 4. Are there any advantages and disadvantages of using independent vs dependent variables?

  Here are a few commonly cited advantages and disadvantages of independent and dependent variables.(12)

Advantages: 

  • Dependent variables are measured rather than set by the researcher, so they are less exposed to bias introduced through manipulation.  
  • Independent variables are usually easier to obtain than dependent variables and don’t require complex procedures to observe, because researchers can set their values directly or collect the data from respondents.  
  • Some independent variables are natural factors that cannot be manipulated. These are still easy to obtain because data collection takes less time.

Disadvantages: 

  • Obtaining dependent variables can be expensive and effort- and time-intensive, for example when they come from longitudinal research or require complex calculations.  
  • Independent variables are prone to researcher and respondent bias because they can be manipulated, and this may affect the study results.  

We hope this article has provided you with an insight into the use and importance of independent vs dependent variables, which can help you effectively use variables in your next research study.    

  1. Kaliyadan F, Kulkarni V. Types of variables, descriptive statistics, and sample size. Indian Dermatol Online J. 2019 Jan-Feb; 10(1): 82–86. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6362742/  
  2. What is an independent variable? (with uses and examples). Indeed website. Accessed March 11, 2024. https://www.indeed.com/career-advice/career-development/what-is-independent-variable  
  3. Independent and dependent variables: Differences & examples. Statistics by Jim website. Accessed March 10, 2024. https://statisticsbyjim.com/regression/independent-dependent-variables/  
  4. Independent variable. Biology online website. Accessed March 9, 2024. https://www.biologyonline.com/dictionary/independent-variable#:~:text=The%20independent%20variable%20in%20research,how%20many%20or%20how%20often  
  5. Dependent variables: Definition and examples. Clubz Tutoring Services website. Accessed March 10, 2024. https://clubztutoring.com/ed-resources/math/dependent-variable-definitions-examples-6-7-2/  
  6. Research topics with independent and dependent variables. Good research topics website. Accessed March 12, 2024. https://goodresearchtopics.com/research-topics-with-independent-and-dependent-variables/  
  7. Levels of measurement and using the correct statistical test. Univariate quantitative methods. Accessed March 14, 2024. https://web.pdx.edu/~newsomj/uvclass/ho_levels.pdf  
  8. Easiest way to identify dependent and independent variables. Afidated website. Accessed March 15, 2024. https://www.afidated.com/2014/07/how-to-identify-dependent-and.html  
  9. Choosing data visualizations. Math for the people website. Accessed March 14, 2024. https://web.stevenson.edu/mbranson/m4tp/version1/environmental-racism-choosing-data-visualization.html  
  10. Trivedi C. Types of variables in scientific research. Concepts Hacked website. Accessed March 15, 2024. https://conceptshacked.com/variables-in-scientific-research/  
  11. Variables in experimental and non-experimental research. Statistics solutions website. Accessed March 14, 2024. https://www.statisticssolutions.com/variables-in-experimental-and-non-experimental-research/#:~:text=The%20independent%20variable%20would%20be,state%20instead%20of%20manipulating%20them  
  12. Dependent vs independent variables: 11 key differences. Formplus website. Accessed March 15, 2024. https://www.formpl.us/blog/dependent-independent-variables

Organizing Your Social Sciences Research Paper

Independent and Dependent Variables

Definitions

Dependent Variable: The variable that depends on other factors that are measured. These variables are expected to change as a result of an experimental manipulation of the independent variable or variables. It is the presumed effect.

Independent Variable: The variable that is stable and unaffected by the other variables you are trying to measure. It refers to the condition of an experiment that is systematically manipulated by the investigator. It is the presumed cause.

Cramer, Duncan and Dennis Howitt. The SAGE Dictionary of Statistics. London: SAGE, 2004; Penslar, Robin Levin and Joan P. Porter. Institutional Review Board Guidebook: Introduction. Washington, DC: United States Department of Health and Human Services, 2010; "What are Dependent and Independent Variables?" Graphic Tutorial.

Identifying Dependent and Independent Variables

Don't feel bad if you are confused about what is the dependent variable and what is the independent variable in social and behavioral sciences research. However, it's important that you learn the difference because framing a study using these variables is a common approach to organizing the elements of a social sciences research study in order to discover relevant and meaningful results. Specifically, it is important for these two reasons:

  • You need to understand and be able to evaluate their application in other people's research.
  • You need to apply them correctly in your own research.

A variable in research simply refers to a person, place, thing, or phenomenon that you are trying to measure in some way. The best way to understand the difference between a dependent and independent variable is that the meaning of each is implied by what the words tell us about the variable you are using. You can do this with a simple exercise from the website, Graphic Tutorial. Take the sentence, "The [independent variable] causes a change in [dependent variable] and it is not possible that [dependent variable] could cause a change in [independent variable]." Insert the names of variables you are using in the sentence in the way that makes the most sense. This will help you identify each type of variable. If you're still not sure, consult with your professor before you begin to write.

Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design, Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349.

Structure and Writing Style

The process of examining a research problem in the social and behavioral sciences is often framed around methods of analysis that compare, contrast, correlate, average, or integrate relationships between or among variables. Techniques include associations, sampling, random selection, and blind selection. Designation of the dependent and independent variable involves unpacking the research problem in a way that identifies a general cause and effect and classifying these variables as either independent or dependent.

The variables should be outlined in the introduction of your paper and explained in more detail in the methods section. There are no rules about the structure and style for writing about independent or dependent variables but, as with any academic writing, clarity and concision are most important.

After you have described the research problem and its significance in relation to prior research, explain why you have chosen to examine the problem using a method of analysis that investigates the relationships between or among independent and dependent variables. State what it is about the research problem that lends itself to this type of analysis. For example, if you are investigating the relationship between corporate environmental sustainability efforts [the independent variable] and dependent variables associated with measuring employee satisfaction at work using a survey instrument, you would first identify each variable and then provide background information about the variables. What is meant by "environmental sustainability"? Are you looking at a particular company [e.g., General Motors] or are you investigating an industry [e.g., the meat packing industry]? Why is employee satisfaction in the workplace important? How does a company make its employees aware of sustainability efforts and why would a company even care that its employees know about these efforts?

Identify each variable for the reader and define each. In the introduction, this information can be presented in a paragraph or two when you describe how you are going to study the research problem. In the methods section, you build on the literature review of prior studies about the research problem to describe in detail background about each variable, breaking each down for measurement and analysis. For example, what activities do you examine that reflect a company's commitment to environmental sustainability? Levels of employee satisfaction can be measured by a survey that asks about things like volunteerism or a desire to stay at the company for a long time.

The structure and writing style of describing the variables and their application to analyzing the research problem should be stated and unpacked in such a way that the reader obtains a clear understanding of the relationships between the variables and why they are important. This is also important so that the study can be replicated in the future using the same variables but applied in a different way.

Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; “Case Example for Independent and Dependent Variables.” ORI Curriculum Examples. U.S. Department of Health and Human Services, Office of Research Integrity; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design, Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349; “Independent Variables and Dependent Variables.” Karl L. Wuensch, Department of Psychology, East Carolina University [posted email exchange]; “Variables.” Elements of Research. Dr. Camille Nebeker, San Diego State University.

  • Last Updated: Sep 3, 2024 1:54 PM
  • URL: https://libguides.usc.edu/writingguide

27 Types of Variables in Research and Statistics

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

In research and statistics, a variable is a characteristic or attribute that can take on different values or categories. It represents data points or information that can be measured, observed, or manipulated within a study.

Statistical and experimental analysis aims to explore the relationships between variables. For example, researchers may hypothesize a connection between a particular variable and an outcome, like the association between physical activity levels (an independent variable) and heart health (a dependent variable).

Variables play a crucial role in data analysis. Data sets collected through research typically consist of multiple variables, and the analysis is driven by how these variables are related, how they influence each other, and what patterns emerge from these relationships.

Therefore, as a researcher, your understanding of variables and their manipulation forms the crux of your study.

To help with your understanding, I’ve presented 27 of the most common types of variables below.

Types of Variables

1. Quantitative (Numerical) Variables

Definition: Quantitative variables, also known as numerical variables, are quantifiable in nature and represented in numbers, allowing the data collected to be measured on a scale or range (Moodie & Johnson, 2021). These variables generally yield data that can be organized, ranked, measured, and subjected to mathematical operations.

Explanation: The values of quantitative variables can either be counted (referred to as discrete variables) or measured (continuous variables). Quantifying data in numerical form allows for a range of statistical analysis techniques to be applied, from calculating averages to finding correlations.

Pros: They provide a precise measure, allow for a higher level of measurement, and can be manipulated statistically for inferential analysis. The resulting data is objective and consistent (Moodie & Johnson, 2021).

Cons: Collecting the data can be time-consuming and costly. Also, important context or explanation may be lost when data is purely numerical (Katz, 2006).

Quantitative Variable Example: Consider a marketing survey where you ask respondents to rate their satisfaction with your product on a scale of 1 to 10. The satisfaction score here represents a quantitative variable. The data can be quantified and used to calculate average satisfaction scores, identify the scope for product improvement, or compare satisfaction levels across different demographic groups.

2. Continuous Variables

Definition: Continuous variables are a subtype of quantitative variables that can have an infinite number of measurements within a specified range. They provide detailed insights based on precise measurements and are often represented on a continuous scale (Christmann & Badgett, 2009).

Explanation: The variable is “continuous” because there are an infinite number of possible values within the chosen range. For instance, variables like height, weight, or time are measured continuously.

Pros: They give a higher level of detail, are useful in determining precise measurements, and allow for complex statistical analysis (Christmann & Badgett, 2009).

Cons: They can easily lead to information overload due to granularity (Allen, 2017). The representation and interpretation of results may also be more complex.

Continuous Variable Example: The best real-world example of a continuous variable is time. For instance, the time it takes for a customer service representative to resolve a customer issue can range anywhere from a few seconds to several hours, and can be measured to the second or finer, providing an effectively infinite set of possible values.

3. Discrete Variables

Definition: Discrete variables are a form of quantitative variable that can only assume distinct, countable values. They are typically count-based (Frankfort-Nachmias & Leon-Guerrero, 2006).

Explanation: Discrete variables are commonly used in situations where the “count” or “quantity” is distinctly separate. For instance, the number of children in a family is a common example – you can’t have 2.5 kids.

Pros: They are easier to comprehend and simpler to analyze, as they provide direct and countable insight (Frankfort-Nachmias & Leon-Guerrero, 2006).

Cons: They might lack in-depth information because they cannot provide the granularity that continuous variables offer (Privitera, 2022).

Discrete Variable Example: The number of times a customer contacts customer service within a month. This is a discrete variable because it can only take whole-number values: you can’t call customer service 2.5 times.

4. Qualitative (Categorical) Variables

Definition: Qualitative, or categorical variables, are non-numerical data points that categorize or group data entities based on shared features or qualities (Moodie & Johnson, 2021).

Explanation: They are often used in research to classify particular traits, characteristics, or properties of subjects that are not easily quantifiable, such as colors, textures, tastes, or smells.

Pros: Essences or characteristics that cannot be measured numerically can be captured. They provide richer, subjective, and explanatory data (Moodie & Johnson, 2021).

Cons: The analysis might be challenging because these variables cannot be subjected to mathematical calculations or operations (Creswell & Creswell, 2018).

Qualitative Variable Example: Consider a survey that asks respondents to identify their favorite color from a list of choices. The color preference would be a qualitative variable as it categorizes data into different categories corresponding to different colors.

5. Nominal Variables

Definition: Nominal variables, a subtype of qualitative variables, represent categories without any inherent order or ranking (Norman & Streiner, 2008).

Explanation: Nominal variables are often used to label or categorize particular sets of items or individuals, with no intention of giving numerical value or order. For example, race, gender, or religion.

Pros: They are simple to understand and effective in segregating data into clearly defined, mutually exclusive categories (Norman & Streiner, 2008).

Cons: They can often be overly simplistic, leading to a loss of data differentiation and information (Katz, 2006). They also do not provide any directionality or order.

Nominal Variable Example: For instance, the type of car someone owns (sedan, SUV, truck, etc.) is a nominal variable. Each category is unique and one is not inherently higher, better, or larger than the others.

6. Ordinal Variables

Definition: Ordinal variables are a subtype of categorical (qualitative) variables with a key feature of having a clear, distinct, and meaningful order or ranking to the categories (De Vaus, 2001).

Explanation: Ordinal variables represent categories that can be logically arranged in a specific order or sequence but the difference between categories is unknown or doesn’t matter, such as satisfaction rating scale (unsatisfied, neutral, satisfied).

Pros: Ordinal variables allow categorization of data that also reflect some sort of ranking or order, allowing more nuanced insights from your data (De Vaus, 2001).

Cons: It becomes challenging during data analysis due to the unequal intervals (Katz, 2006). Differences between the adjacent categories are unknown and not measurable.

Ordinal Variable Example: A classic example is asking survey respondents how strongly they agree or disagree with a statement (strongly disagree, disagree, neither agree nor disagree, agree, strongly agree). The answers form an ordinal scale; they can be ranked, but the intervals between responses are not necessarily equal.
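The agree/disagree scale above can be sketched in code. This is a minimal illustration, assuming the five categories are mapped to integer ranks purely to preserve their order; the function name, the rank mapping, and the sample answers are all hypothetical. Because the intervals between categories are unknown, the sketch summarizes responses with a median rather than a mean.

```python
from statistics import median_low

# Ordered categories from the survey example; the integer ranks only
# encode order, they do not measure the distance between categories.
LIKERT_ORDER = [
    "strongly disagree",
    "disagree",
    "neither agree nor disagree",
    "agree",
    "strongly agree",
]
RANK = {label: position for position, label in enumerate(LIKERT_ORDER)}


def median_response(responses):
    """Return the (lower) median category of a list of ordinal responses."""
    ranks = [RANK[response] for response in responses]
    return LIKERT_ORDER[median_low(ranks)]


# Hypothetical answers, invented for illustration only.
answers = [
    "agree",
    "disagree",
    "strongly agree",
    "agree",
    "neither agree nor disagree",
]
print(median_response(answers))
```

Using the lower median keeps the result inside the set of real categories, so no "half-way" category is ever reported.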

7. Dichotomous (Binary) Variables

Definition: Dichotomous or binary variables are a type of categorical variable that consist of only two opposing categories like true/false, yes/no, success/failure, and so on (Adams & McGuire, 2022).

Explanation: Dichotomous variables refer to situations where there can only be two, and just two, possible outcomes – there is no middle ground.

Pros: Dichotomous variables simplify analysis. They are particularly useful for “yes/no” questions, which can be coded into a numerical format for statistical analysis (Coolidge, 2012).

Cons: Dichotomous variables might oversimplify complex responses, losing valuable information by reducing them to just two categories (Adams & McGuire, 2022).

Dichotomous Variable Example: Whether a customer completed a transaction (Yes or No) is a binary variable. Either they completed the purchase (yes) or they did not (no).

8. Ratio Variables

Definition: Ratio variables are the highest level of quantitative variables that contain a zero point or absolute zero, which represents a complete absence of the quantity (Norman & Streiner, 2008).

Explanation: Besides being able to categorize and order units, ratio variables also allow for the relative degree of difference between them to be calculated. For example, income, height, weight, and temperature (in Kelvin) are ratio variables.

Pros: Having an inherent zero value allows for a broad range of statistical analysis that involves ratios (Norman & Streiner, 2008). It provides a larger volume of information than any other variable type.

Cons: Ratio variables may give results that do not actually reflect the reality if zero does not exist (De Vaus, 2001).

Ratio Variable Example: An individual’s annual income is a ratio variable. You can say someone earning $50,000 earns twice as much as someone making $25,000. The zero point in this case would be an income of $0, which indicates that no income is being earned.

9. Interval Variables

Definition: Interval variables are quantitative variables that have equal, predictable differences between values, but they do not have a true zero point (Norman & Streiner, 2008).

Explanation: Interval variables are similar to ratio variables; both provide a clear ordering of categories and have equal intervals between successive values. The primary difference is the absence of an absolute zero.

Pros: Interval variables allow for more complex statistical analyses as they can accommodate a range of mathematical operations like addition and subtraction (Norman & Streiner, 2008).

Cons: They restrict the ability to measure the ratio of categories since there’s no true zero (Babbie, Halley & Zaino, 2007).

Interval Variable Example: The classic example of an interval variable is the temperature in Fahrenheit or Celsius. The difference between 20 degrees and 30 degrees is the same as the difference between 70 degrees and 80 degrees, but there isn’t a true zero because the scale doesn’t start from absolute nonexistence of the quantity being measured.
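The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below, with illustrative temperatures chosen for this example, shows that differences in Celsius carry over unchanged to Kelvin (a ratio scale), while a ratio computed on Celsius values does not.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15


# Illustrative temperatures, chosen only for this example.
cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale and survive conversion.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: 20 °C is not "twice as hot" as 10 °C.
naive_ratio = warm_c / cool_c
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_c:.1f} °C vs {diff_k:.1f} K")
print(f"ratio on Celsius: {naive_ratio:.3f}, ratio on Kelvin: {true_ratio:.3f}")
```

The Celsius ratio of 2 is an artifact of where the scale places its zero; on the Kelvin scale the same two temperatures differ by only a few percent.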

Related: Quantitative Reasoning Examples

10. Dependent Variables

Definition: The dependent variable is the outcome or effect that the researcher wants to study. Its value depends on or is influenced by one or more other variables known as independent variables.

Explanation: In a research study, the dependent variable is the phenomenon or behavior that may be affected by manipulations in the independent variable. It’s what you measure to see if your predictions about the effects of the independent variable are correct.

Pros: It provides the results for the research question. Without a dependent variable, it would be impossible to draw conclusions from the conducted experiment or study.

Cons: It’s not always straightforward to isolate the impact of independent variables on the dependent variable, especially when multiple independent variables are influencing the results.

Dependent Variable Example: Suppose you want to study the impact of exercise frequency on weight loss. In this case, the dependent variable is weight loss, which changes based on how often the subject exercises (the independent variable).

11. Independent Variables

Definition: The independent variable, or the predictor variable, is what the researcher manipulates to test its effect on the dependent variable.

Explanation: The independent variable is presumed to have some effect on the dependent variable in a study. It can often be thought of as the cause in a cause-and-effect relationship.

Pros: Manipulating the independent variable allows researchers to observe changes it causes in the dependent variable, aiding in understanding causal relationships in the data.

Cons: It can be challenging to isolate the impact of a single independent variable when multiple factors may influence the dependent variable.

Independent Variable Example: In a study looking at how different dosages of a medication affect the severity of symptoms, the medication dosage is an independent variable. Researchers will adjust the dosage to see what effect it has on the symptoms (the dependent variable).

See Also: Independent and Dependent Variable Examples

12. Confounding Variables

Definition: Confounding variables—also known as confounders—are variables that might distort, confuse or interfere with the relationship between an independent variable and a dependent variable, leading to a false correlation (Boniface, 2019).

Explanation: Confounders are typically related in some way to both the independent and dependent variables. Because of this, they can create or hide relationships, leading researchers to make inaccurate conclusions about causality.

Pros: Identifying potential confounders during study design can help optimize the process and strengthen the conclusions drawn (Knapp, 2017).

Cons: Confounders can introduce bias and affect the validity of a study. If overlooked, they can lead to false conclusions about correlations or cause-and-effect relationships (Boniface, 2019).

Confounding Variable Example: If you’re studying the relationship between physical activity and heart health, diet could potentially act as a confounding variable. People who are physically active often also eat healthier diets, which could independently improve heart health [National Heart, Lung, and Blood Institute].

13. Control Variables

Definition: Control variables are variables in a research study that the researcher keeps constant to prevent them from interfering with the relationship between the independent and dependent variables (Sproull, 2002).

Explanation: Control variables allow researchers to isolate the effects of the independent variable on the dependent variable, ensuring that any changes observed are solely due to the manipulation of the independent variable and not an external factor.

Pros: Control variables increase the reliability of experiments, ensure a fair comparison between groups, and support the validity of the conclusions (Sproull, 2002).

Cons: Misidentification or non-consideration of control variables might affect the outcome of the experiment, leading to biased results (Boniface, 2019).

Control Variable Example: In a study evaluating the impact of a tutoring program on student performance, some control variables could include the teacher’s experience, the type of test used to measure performance, and the student’s previous grades.

14. Latent Variables

Definition: Latent variables—also referred to as hidden or unobserved variables—are variables that are not directly observed or measured but are inferred from other variables that are observed (measured directly).

Explanation: Latent variables can represent abstract concepts like intelligence, socioeconomic status, or even happiness. They are often used in psychological and sociological research, where certain concepts can’t be measured directly.

Pros: Latent variables can help capture unseen factors and give insight into the underlying constructs affecting observable behaviors.

Cons: Inferring the values of latent variables can involve complex statistical methods and assumptions. Also, there might be several ways to interpret the values of latent variables, potentially impacting the validity and consistency of findings.

Latent Variable Example: In a study on job satisfaction, factors like job stress, financial reward, work-life balance, or relationship with colleagues can be measured directly. However, “job satisfaction” itself is a latent variable as it is inferred from these observed variables.

15. Derived Variables

Definition: Derived variables are variables that are created or developed based on existing variables in a dataset. They involve applying certain calculations or manipulations to one or more variables to create a new one.

Explanation: Derived variables can be created by either transforming a single variable (like taking the square root) or combining multiple variables (computing the ratio of two variables).

Pros: Derived variables can reduce complexity, extract more relevant information, and create new insights from existing data.

Cons: They require careful creation as any errors in the genesis of the original variables will impact the derived variable. Also, the process of deriving variables needs to be adequately documented to ensure replicability and avoid misunderstanding.

Derived Variable Example: In a dataset containing a person’s height and weight, a derived variable could be the Body Mass Index (BMI). The BMI is calculated by dividing weight (in kilograms) by the square of height (in meters).
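The BMI calculation above can be sketched as follows. The record layout, field names, and sample values are assumptions made for illustration; only the formula (weight in kilograms divided by the square of height in meters) comes from the text.

```python
def bmi(weight_kg, height_m):
    """Derive Body Mass Index from weight (kg) and height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical records, invented for illustration only.
people = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 82.5, "height_m": 1.80},
]

# Add the derived variable alongside the original ones.
for person in people:
    person["bmi"] = bmi(person["weight_kg"], person["height_m"])

for person in people:
    print(f"{person['weight_kg']} kg, {person['height_m']} m -> BMI {person['bmi']:.1f}")
```

Keeping the derivation in a single documented function is one way to address the replicability concern noted above, since the calculation is recorded in exactly one place.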

16. Time-series Variables

Definition: Time-series variables are a set of data points ordered or indexed in time order. They provide a sequence of data points, each associated with a specific instance in time.

Explanation: Time-series variables are often used in statistical models to study trends, analyze patterns over time, make forecasts, and understand underlying causes and characteristics of the trend.

Pros: Time series variables allow for the exploration of causal relationships, testing of theories, and forecasting of future values based on established patterns.

Cons: They can be difficult to work with due to issues like seasonality, irregular intervals, autocorrelation, or non-stationarity. Often, additional statistical techniques, such as decomposition, differencing, or transformations, may need to be employed.

Time-series Variable Example: The quarterly GDP (Gross Domestic Product) data over a period of several years would be an example of a time series variable. Economists use such data to examine economic trends over time.
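Differencing, one of the techniques mentioned above, replaces each observation with its change from the previous period. The sketch below is a minimal illustration; the quarterly values are made-up index numbers, not real GDP figures.

```python
def first_difference(series):
    """Return the period-to-period changes of a time-ordered series."""
    return [current - previous for previous, current in zip(series, series[1:])]


# Hypothetical quarterly index values, ordered in time (illustration only).
quarterly_gdp_index = [100.0, 102.0, 101.0, 105.0]

changes = first_difference(quarterly_gdp_index)
print(changes)  # [2.0, -1.0, 4.0]
```

The differenced series is one observation shorter than the original, because the first quarter has no earlier value to compare against.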

17. Cross-sectional Variables

Definition: Cross-sectional variables are data collected from many subjects at the same point in time or without regard to differences in time.

Explanation: This type of data provides a “snapshot” of the variables at a specific time. They’re often used in research to compare different population groups at a single point in time.

Pros: Cross-sectional data can be relatively easy and quick to collect. They are useful for examining the relationship between different variables at a given point in time.

Cons: Cross-sectional data does not provide any information about causality or the sequence of events. It’s also susceptible to “snapshot bias” since it does not take into account changes over time.

Cross-sectional Variable Example: A basic example of a set of cross-sectional data could be a national survey that asks respondents about their current employment status. The data captured represents a single point in time and does not track changes in employment over time.

18. Predictor Variables

Definition: A predictor variable—also known as independent or explanatory variable—is a variable that is being manipulated in an experiment or study to see how it influences the dependent or response variable.

Explanation: In a cause-and-effect relationship, the predictor variable is the cause. Its modification allows the researcher to study its effect on the response variable.

Pros: Predictor variables establish cause-and-effect relationships and allow for the prediction of outcomes for the response variable.

Cons: It can be challenging to isolate a single predictor variable’s impact when multiple predictor variables are involved, leading to potential interaction effects.

Predictor Variable Example: In a study evaluating the impact of studying hours on exam score, the number of studying hours is a predictor variable. Researchers alter the study duration to see its impact on the exam results (response variable).

19. Response Variables

Definition: A response variable—also known as the dependent or outcome variable—is what the researcher observes for any changes in an experiment or study. Its value depends on the predictor or independent variable.

Explanation: The response variable is the “effect” in a cause-and-effect scenario. Any changes occurring to this variable due to the predictor variable are observed and recorded.

Pros: The response variable supplies the results for the research question, offering crucial insights into the study.

Cons: It may be influenced by several predictor variables, making it difficult to isolate the effect of one specific predictor.

Response Variable Example: Continuing from the previous example, the exam score is the response variable. It changes based on the manipulation of the predictor variable, i.e., the number of studying hours.

20. Exogenous Variables

Definition: Exogenous variables are variables that are not affected by other variables in the system but can affect other variables within the same system.

Explanation: In a model, an exogenous variable is considered to be an input: it is determined outside the model, and its value is simply imposed on the system.

Pros: Exogenous variables are often used as control variables in experimental studies, making them essential for creating cause-and-effect relationships.

Cons: The relationship between exogenous variables and the dependent variable can be complex and challenging to identify precisely.

Exogenous Variable Example: In an economic model, the government’s taxation rate may be considered an exogenous variable. The rate is set externally (not determined within the economic model) but impacts variables within the model, such as business profitability.

21. Endogenous Variables

Definition: In contrast, endogenous variables are variables whose value is determined by the functional relationships within the system in an economic or statistical model. They depend on the values of other variables in the model.

Explanation: These are the “output” variables of a system, determined through cause-and-effect relationships within the system.

Pros: Endogenous variables play a significant role in understanding complex systems’ dynamics and aid in developing nuanced mathematical or statistical models.

Cons: It can be difficult to untangle the causal relationships and influences surrounding endogenous variables.

Endogenous Variable Example: To continue the previous example, business profitability in an economic model may be considered an endogenous variable. It is influenced by several other variables within the model, including the exogenous taxation rate set by the government.

22. Causal Variables

Definition: Causal variables are variables which can directly cause an effect on the outcome or dependent variable. Their value or level determines the value or level of other variables.

Explanation: In a cause-and-effect relationship, a causal variable is the cause. The understanding of causal relationships is the basis of scientific enquiry, allowing researchers to manipulate variables to see the effect.

Pros: Identifying and understanding causal variables can lead to practical interventions as it offers the opportunity to control or change the outcome.

Cons: Confusion can arise between correlation and causation. Just because two variables move together doesn’t necessarily mean that one causes the other to move.

Causal Variable Example: In a study examining the effect of fertilizer on plant growth, the type or amount of fertilizer used is the causal variable. Changing its type or amount should directly affect the outcome—plant growth.

23. Moderator Variables

Definition: Moderator variables are variables that can affect the strength or direction of the association between the predictor (independent) and response (dependent) variable. They specify when or under what conditions a relationship holds.

Explanation: The role of a moderator is to illustrate “how” or “when” an independent variable’s effect on a dependent variable changes.

Pros: The identification of the moderator variables can provide a more nuanced understanding of the relationship between independent and dependent variables.

Cons: It’s often challenging to identify potential moderators, and an appropriate experimental design is required to assess their impact.

Moderator Variable Example: If you are studying the effect of a training program on job performance, a potential moderator variable could be the employee’s education level. The influence of the training program on job performance could depend on the employee’s initial level of education.

24. Mediator Variables

Definition: Mediator variables are variables that account for, or explain, the relationship between an independent variable and a dependent variable, providing an understanding of “why” or “how” an effect occurs.

Explanation: Often, the relationship between an independent and a dependent variable isn’t direct—it’s through a third, intervening, variable known as a mediator variable.

Pros: The identification of mediators can enhance the understanding of underlying processes or mechanisms that explain why an effect exists.

Cons: The establishment of mediation effects requires strong and complex modeling techniques, and it may be difficult to establish temporal precedence, a prerequisite for mediation.

Mediator Variable Example: In a study looking at the relationship between socioeconomic status and academic performance, a mediator variable might be the access to educational resources. Socioeconomic status may influence access to educational resources, which in turn affects academic performance. The relationship between socioeconomic status and academic performance isn’t direct but through access to resources.

25. Extraneous Variables

Definition: Extraneous variables are variables that are not of primary interest to a researcher but might influence the outcome of a study. They can add “noise” to the research data if not controlled.

Explanation: An extraneous variable is anything else that has the potential to influence our dependent variable or confound our results if not kept in check, other than our independent variable.

Pros: The identification and control of extraneous variables can improve the validity of the study’s conclusions by minimizing potential sources of bias.

Cons: These variables can confuse the outcome of a study if not adequately observed, measured, and controlled.

Extraneous Variable Example: Consider an experiment to test whether temperature influences the rate of a chemical reaction. Potential extraneous variables could include the light level, humidity, or impurities in the chemicals used; each could affect the reaction rate and, thus, should be controlled to ensure valid results.

26. Dummy Variables

Definition: Dummy variables, often used in regression analysis, are artificial variables created to represent an attribute with two or more distinct categories or levels.

Explanation: They are used to turn a qualitative variable into a quantitative one to facilitate mathematical processing. Typically, dummy variables are binary – taking a value of either 0 or 1.

Pros: Using dummy variables allows the modelling of categorical or nominal variables in regression equations, which can only handle numerical values.

Cons: Including a dummy variable for every category alongside an intercept, known as the “dummy variable trap”, leads to perfect multicollinearity in regression models, so the coefficients cannot be estimated uniquely and the results become hard to interpret.

Dummy Variable Example: Consider a dataset that includes a variable “Gender” with categories “male” and “female”. A corresponding dummy variable “IsMale” could be introduced, where males get classified as 1 and females as 0.
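The “IsMale” encoding above can be sketched in a few lines. The function name, its `drop_first` parameter, and the sample values are hypothetical choices for this illustration. Dropping one category as the reference level is a common way to stay clear of the dummy variable trap.

```python
def dummy_encode(values, drop_first=True):
    """Turn a list of category labels into 0/1 dummy columns.

    With drop_first=True the alphabetically first category becomes the
    reference level and gets no column of its own.
    """
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    return [
        {f"is_{category}": int(value == category) for category in kept}
        for value in values
    ]


# Hypothetical observations, invented for illustration only.
genders = ["male", "female", "female", "male"]

for row in dummy_encode(genders):
    print(row)
```

With two categories, a single column (`is_male`) carries all the information: a 0 in that column already identifies the reference category, "female".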

27. Composite Variables

Definition: Composite variables are new variables created by combining or grouping two or more variables.

Explanation: Depending upon their complexity, composite variables can help assess concepts that are explicit (e.g., “total score”) or relatively abstract (e.g., “life quality index”).

Pros: They can simplify analysis by reducing the number of variables considered and may help in handling multicollinearity in statistical models.

Cons: The creation of composite variables requires careful consideration of the underlying variables that make up the composite. It might be hard to interpret and requires an understanding of the individual variables.

Composite Variable Example: A “Healthy Living Index” might be created as a composite of multiple variables such as eating habits, physical activity level, sleep quality, and stress level. Each of these variables contributes to the overall “Healthy Living Index”.
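One simple way to build such an index is an unweighted average of the component scores. The sketch below assumes, purely for illustration, that every component has already been scored from 0 to 10 with higher values meaning healthier (so stress is recorded as "low stress"); the field names, the equal weighting, and the sample scores are all hypothetical, and a real index might weight or standardize its components differently.

```python
from statistics import mean

# Hypothetical component names; each is assumed to be scored 0-10,
# with higher values meaning "healthier".
COMPONENTS = ("eating_habits", "physical_activity", "sleep_quality", "low_stress")


def healthy_living_index(scores):
    """Combine the component scores into one unweighted average."""
    missing = [name for name in COMPONENTS if name not in scores]
    if missing:
        raise KeyError(f"missing component scores: {missing}")
    return mean([scores[name] for name in COMPONENTS])


# Invented scores for one respondent (illustration only).
respondent = {
    "eating_habits": 8,
    "physical_activity": 6,
    "sleep_quality": 7,
    "low_stress": 5,
}
print(healthy_living_index(respondent))  # 6.5
```

Refusing to compute the index when a component is missing keeps the composite comparable across respondents, which matters for the interpretability concern raised above.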

Knowing your variables will make you a better researcher. Some you need to keep an eye out for: confounding variables, for instance, always need to be in the back of your mind. Others you need to think about during study design, matching the research design to the research objectives.

Adams, K. A., & McGuire, E. K. (2022). Research Methods, Statistics, and Applications. SAGE Publications.

Allen, M. (2017). The SAGE Encyclopedia of Communication Research Methods (Vol. 1). New York: SAGE Publications.

Babbie, E., Halley, F., & Zaino, J. (2007). Adventures in Social Research: Data Analysis Using SPSS 14.0 and 15.0 for Windows (6th ed.). New York: SAGE Publications.

Boniface, D. R. (2019). Experiment Design and Statistical Methods For Behavioural and Social Research. CRC Press. ISBN: 9781351449298.

Christmann, E. P., & Badgett, J. L. (2009). Interpreting Assessment Data: Statistical Techniques You Can Use. New York: NSTA Press.

Coolidge, F. L. (2012). Statistics: A Gentle Introduction (3rd ed.). SAGE Publications.

Creswell, J. W., & Creswell, J. D. (2018). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. New York: SAGE Publications.

De Vaus, D. A. (2001). Research Design in Social Research. New York: SAGE Publications.

Katz, M. (2006). Study Design and Statistical Analysis: A Practical Guide for Clinicians. Cambridge: Cambridge University Press.

Knapp, H. (2017). Intermediate Statistics Using SPSS. SAGE Publications.

Moodie, P. F., & Johnson, D. E. (2021). Applied Regression and ANOVA Using SAS. CRC Press.

Norman, G. R., & Streiner, D. L. (2008). Biostatistics: The Bare Essentials. New York: B.C. Decker.

Privitera, G. J. (2022). Research Methods for the Behavioral Sciences. New Jersey: SAGE Publications.

  • Indian Dermatol Online J
  • v.10(1); Jan-Feb 2019

Types of Variables, Descriptive Statistics, and Sample Size

Feroze Kaliyadan

Department of Dermatology, King Faisal University, Al Hofuf, Saudi Arabia

Vinay Kulkarni

Department of Dermatology, Prayas Amrita Clinic, Pune, Maharashtra, India

This short “snippet” covers three important aspects of statistics: the concept of variables, the importance and practical aspects of descriptive statistics, and issues related to sampling (types of sampling and sample size estimation).

What is a variable?[1, 2] To put it in very simple terms, a variable is an entity whose value varies. A variable is an essential component of any statistical data. It is a feature of a member of a given sample or population that can differ in quantity or quality from another member of the same sample or population. Variables either are the primary quantities of interest or act as practical substitutes for the same. The importance of variables is that they help in operationalization of concepts for data collection. For example, if you want to do an experiment based on the severity of urticaria, one option would be to measure the severity using a scale to grade severity of itching. This becomes an operational variable. For a variable to be “good,” it needs to have some properties such as good reliability and validity, low bias, feasibility/practicality, low cost, objectivity, clarity, and acceptance. Variables can be classified in various ways, as discussed below.

Quantitative vs qualitative

A variable can hold either qualitative or quantitative data. A variable differing in quantity is called a quantitative variable (e.g., weight of a group of patients), whereas a variable differing in quality is called a qualitative variable (e.g., the Fitzpatrick skin type).

A simple test which can be used to differentiate between qualitative and quantitative variables is the subtraction test. If you can subtract the value of one variable from the other to get a meaningful result, then you are dealing with a quantitative variable (this of course will not apply to rating scales/ranks).

Quantitative variables can be either discrete or continuous

Discrete variables are variables in which no values may be assumed between the two given values (e.g., number of lesions in each patient in a sample of patients with urticaria).

Continuous variables, on the other hand, can take any value in between the two given values (e.g., duration for which the weals last in the same sample of patients with urticaria). One way of differentiating between continuous and discrete variables is to use the “mid-way” test. If, for every pair of values of a variable, a value exactly mid-way between them is meaningful, the variable is continuous. For example, two values for the time taken for a weal to subside can be 10 and 13 min. The mid-way value would be 11.5 min which makes sense. However, for a number of weals, suppose you have a pair of values – 5 and 8 – the midway value would be 6.5 weals, which does not make sense.

Under the umbrella of qualitative variables, you can have nominal/categorical variables and ordinal variables

Nominal/categorical variables are, as the name suggests, variables which can be slotted into different categories (e.g., gender or type of psoriasis).

Ordinal variables or ranked variables are similar to categorical, but can be put into an order (e.g., a scale for severity of itching).

Dependent and independent variables

In the context of an experimental study, the dependent variable (also called the outcome variable) is directly linked to the primary outcome of the study. For example, in a clinical trial on psoriasis, the PASI (psoriasis area severity index) would possibly be one dependent variable. The independent variable (sometimes also called the explanatory variable) is something which is not affected by the experiment itself but which can be manipulated to affect the dependent variable. Other terms sometimes used synonymously include blocking variable, covariate, or predictor variable. Confounding variables are extra variables which can have an effect on the experiment. They are linked with both the dependent and independent variables and can cause a spurious association. For example, in a clinical trial for a topical treatment in psoriasis, the concomitant use of moisturizers might be a confounding variable. A control variable is a variable that must be kept constant during the course of an experiment.

Descriptive Statistics

Statistics can be broadly divided into descriptive statistics and inferential statistics.[3, 4] Descriptive statistics give a summary about the sample being studied without drawing any inferences based on probability theory. Even if the primary aim of a study involves inferential statistics, descriptive statistics are still used to give a general summary. When we describe the sample using tools such as frequency distribution tables, percentages, and measures of central tendency like the mean, we are talking about descriptive statistics. When we use a specific statistical test (e.g., the Mann–Whitney U-test) to compare scores between groups and express the result in terms of statistical significance, we are talking about inferential statistics. Descriptive statistics can help in summarizing data in the form of simple quantitative measures such as percentages or means or in the form of visual summaries such as histograms and box plots.

Descriptive statistics can be used to describe a single variable (univariate analysis) or more than one variable (bivariate/multivariate analysis). In the case of more than one variable, descriptive statistics can help summarize relationships between variables using tools such as scatter plots.

Descriptive statistics can be broadly put under two categories:

  • Sorting/grouping and illustration/visual displays
  • Summary statistics.

Sorting and grouping

Sorting and grouping is most commonly done using frequency distribution tables. For continuous variables, it is generally better to use groups in the frequency table. Ideally, group sizes should be equal (except in extreme ends where open groups are used; e.g., age “greater than” or “less than”).

Another form of presenting frequency distributions is the “stem and leaf” diagram, which is considered to be a more accurate form of description because it retains the individual data values.

Suppose the weight in kilograms of a group of 10 patients is as follows:

56, 34, 48, 43, 87, 78, 54, 62, 61, 59

The “stem” records the value of the tens place (or higher) and the “leaf” records the value in the ones place [Table 1].

Table 1: Stem and leaf plot

Stem | Leaf
0 | -
1 | -
2 | -
3 | 4
4 | 3 8
5 | 4 6 9
6 | 1 2
7 | 8
8 | 7
9 | -
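The grouping used in the stem and leaf plot can be sketched in a few lines of Python. This is only an illustration using the ten weights listed above; the function name `stem_and_leaf` is a made-up helper, not part of any library, and it assumes whole-number values.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens-and-above part (stem)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 56 -> stem 5, leaf 6
        leaves[stem].append(leaf)
    # Keep empty stems between the smallest and largest stem so gaps stay visible.
    return {s: leaves.get(s, []) for s in range(min(leaves), max(leaves) + 1)}


weights = [56, 34, 48, 43, 87, 78, 54, 62, 61, 59]
plot = stem_and_leaf(weights)
for stem, leaf_list in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaf_list)}")
```

Unlike a grouped frequency table, every original value can be read back from the plot by joining a stem with one of its leaves.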

Illustration/visual display of data

The most common tools used for visual display include frequency diagrams, bar charts (for noncontinuous variables) and histograms (for continuous variables). Composite bar charts can be used to compare variables. For example, the frequency distribution in a sample population of males and females can be illustrated as given in Figure 1.

Figure 1: Composite bar chart

A pie chart helps show how a total quantity is divided among its constituent categories. Scatter diagrams can be used to illustrate the relationship between two variables, for example, the global scores for improvement in a condition like acne given by the patient and by the doctor [Figure 2].

Figure 2: Scatter diagram

Summary statistics

The main tools used for summary statistics are broadly grouped into measures of central tendency (such as mean, median, and mode) and measures of dispersion or variation (such as range, standard deviation, and variance).

Imagine that the data below represent the weights of a sample of 15 pediatric patients arranged in ascending order:

30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86

Just having the raw data does not mean much to us, so we try to express it in terms of some values, which give a summary of the data.

The mean is the sum of all the values divided by the number of values. In this case, we get 675/15 = 45.

The problem is that some extreme values (outliers), like “86” in this case, can skew the value of the mean. In such situations, we consider other values like the median, which is the point that divides the distribution into two equal halves. It is also referred to as the 50th percentile (50% of the values are above it and 50% are below it). In our previous example, since we have already arranged the values in ascending order, we find that the point which divides it into two equal halves is the 8th value, 42. If the total number of values is even, we take the two middle values and average them to reach the median.

The mode is the most common data point. In our example, this would be 38. The mode as in our case may not necessarily be in the center of the distribution.
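The three measures can be checked with Python's standard `statistics` module. This is a small sketch using the 15 weights listed above; the second part, which drops the outlier, is an added illustration rather than something from the original text.

```python
import statistics

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]

mean = statistics.mean(weights)      # sum of values / number of values
median = statistics.median(weights)  # middle (8th) of the 15 sorted values
mode = statistics.mode(weights)      # most frequent value

print(mean, median, mode)

# Dropping the outlier (86) lowers the mean but leaves the median at 42.
trimmed = [w for w in weights if w != 86]
print(statistics.mean(trimmed), statistics.median(trimmed))
```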

For skewed data or data with outliers, the median is generally the most robust of the three measures. In a symmetric, unimodal distribution, all three coincide, whereas in skewed data the median and mean differ: both are pulled toward the tail, the mean more so than the median. For example, Figure 3 shows a right-skewed distribution (the direction of skew is named for the tail): the data extend further on the right-hand (positive) side than on the left-hand side. The mean is typically greater than the median in such cases.

Figure 3: Location of mode, median, and mean

Measures of dispersion

The range gives the spread between the lowest and highest values. In our previous example, this will be 86-30 = 56.

A more valuable measure is the interquartile range. Quartiles are the values which break the distribution into four equal parts. The 25th percentile is the data point which separates the first one-fourth of the data from the last three-fourths; the first one-fourth forms the first quartile. The 75th percentile is the data point which separates the first three-fourths from the last one-fourth (the last one-fourth being the fourth quartile). The range between the 25th percentile and the 75th percentile is called the interquartile range.
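As a rough illustration, the range and interquartile range of the 15 weights can be computed as below. Quartile conventions differ between textbooks and software; this sketch simply uses the default ("exclusive") method of Python's `statistics.quantiles`, which is one of several reasonable choices.

```python
import statistics

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]

data_range = max(weights) - min(weights)

# n=4 returns the three cut points: 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```

The single outlier (86) inflates the range, whereas the interquartile range depends only on the middle half of the data.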

Variance is also a measure of dispersion. The larger the variance, the further the individual units are from the mean. Let us consider the same example we used for calculating the mean. The mean was 45.

For the first value (30), the deviation from the mean will be −15; for the last value (86), the deviation will be +41. Similarly, we can calculate the deviations for all values in a sample. Adding these deviations and averaging might seem to give a clue to the total dispersion, but the problem is that since the deviations are a mix of negative and positive values, the final total becomes zero. To calculate the variance, this problem is overcome by adding the squares of the deviations. So the variance is the sum of the squared deviations divided by the total number in the population (for a sample we divide by “n − 1” instead). To express dispersion in the same units as the data, we take the square root of the variance, which is called the “standard deviation.”
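The steps in this calculation can be traced in Python for the same 15 weights. This is a sketch for illustration; the variable names are arbitrary.

```python
import math
import statistics

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]

mean = statistics.mean(weights)
deviations = [w - mean for w in weights]

sum_of_deviations = sum(deviations)              # positives and negatives cancel
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / len(weights)    # divide by n
sample_variance = sum_of_squares / (len(weights) - 1)  # divide by n - 1
sample_sd = math.sqrt(sample_variance)                 # same units as the data (kg)

print(sum_of_deviations, sum_of_squares)
print(round(population_variance, 2), round(sample_variance, 2), round(sample_sd, 2))
```

The same results are available directly from `statistics.pvariance`, `statistics.variance`, and `statistics.stdev`.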

The box plot

The box plot is a composite representation that portrays the median, the interquartile range, the overall range, and the outliers; some versions also mark the mean [Figure 4].

Figure 4: Box plot

The concept of skewness and kurtosis

Skewness is a measure of the asymmetry of a distribution. If the distribution curve is symmetric, it looks the same on either side of the central point. When this is not the case, it is said to be skewed. Kurtosis is a representation of outliers. Distributions with high kurtosis tend to have “heavy tails,” indicating a larger number of outliers, whereas distributions with low kurtosis have light tails, indicating fewer outliers. There are formulas to calculate both skewness and kurtosis [Figures 5–8].

Figure 5: Positive skew

Figure 6: Negative skew

Figure 7: Low kurtosis (negative kurtosis, also called platykurtic)

Figure 8: High kurtosis (positive kurtosis, also called leptokurtic)
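The article does not give the formulas for skewness and kurtosis. As one common convention (an assumption here, since several variants exist), both can be computed from central moments: skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2^2 − 3, where mk is the mean of the k-th powers of the deviations from the mean. A minimal Python sketch:

```python
def central_moment(values, k):
    """Mean of the k-th powers of the deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    # Positive for a longer right tail, negative for a longer left tail.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    # Zero for a normal distribution; positive means heavier tails.
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]
print(round(skewness(weights), 2), round(excess_kurtosis(weights), 2))
```

For the pediatric weights used earlier, both values come out positive, reflecting the long right tail created by the outlier (86).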

Sample Size

In an ideal study, we should be able to include all units of a particular population under study, something that is referred to as a census.[5, 6] This would remove the chances of sampling error (the difference between the outcome characteristics in a random sample and the true population values, something that is virtually unavoidable when you take a random sample). However, it is obvious that this would not be feasible in most situations. Hence, we have to study a subset of the population to reach our conclusions. This representative subset is a sample, and we need to have sufficient numbers in this sample to make meaningful and accurate conclusions and reduce the effect of sampling error.

We also need to know that sampling can be broadly divided into two types: probability sampling and nonprobability sampling. Examples of probability sampling include simple random sampling (each member in a population has an equal chance of being selected), stratified random sampling (in nonhomogeneous populations, the population is divided into subgroups, followed by random sampling in each subgroup), systematic sampling (sampling is based on a systematic technique, e.g., every third person is selected for a survey), and cluster sampling (similar to stratified sampling except that the clusters here are preexisting, unlike stratified sampling where the researcher decides on the stratification criteria). Nonprobability sampling, where not every unit in the population has an equal chance of inclusion in the sample, includes methods such as convenience sampling (e.g., a sample selected based on ease of access) and purposive sampling (where only people who meet specific criteria are included in the sample).
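Three of the probability sampling methods described above can be sketched with Python's `random` module. The patient IDs, the odd/even strata, and the function names are all hypothetical and chosen only to keep the example small.

```python
import random


def simple_random_sample(population, k, rng):
    """Every unit has an equal chance of being selected."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick a random start, then take every `step`-th unit."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Split the population into subgroups, then sample randomly within each."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}


rng = random.Random(0)          # fixed seed so the sketch is repeatable
patients = list(range(1, 31))   # hypothetical patient IDs 1..30

srs = simple_random_sample(patients, 6, rng)
systematic = systematic_sample(patients, 3, rng)  # every third patient
stratified = stratified_sample(
    patients, lambda pid: "odd" if pid % 2 else "even", 3, rng
)
print(srs, systematic, stratified, sep="\n")
```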

An accurate calculation of sample size is an essential aspect of good study design. It is important to calculate the sample size well in advance, rather than having to rely on post hoc analysis. A sample size that is too small may make the study underpowered, whereas a sample size which is larger than necessary might lead to a waste of resources.

We will first go through the sample size calculation for a hypothesis-based design (like a randomized control trial).

The important factors to consider for sample size calculation include study design, type of statistical test, level of significance, power and effect size, variance (standard deviation for quantitative data), and expected proportions in the case of qualitative data. These inputs come from previous data, either from previous studies or from the clinicians' experience. If the study is being conducted for the first time, a pilot study might be conducted to help generate these data for further studies based on a larger sample size. It is also important to know whether the data follow a normal distribution or not.

Two essential concepts we must understand are Type I and Type II errors. In a study that compares two groups, the null hypothesis assumes that there is no real difference between the two groups and that any observed difference is due to sampling or experimental error. When we reject a null hypothesis that is actually true, we commit a Type I error (its probability is denoted “alpha,” which corresponds to the significance level). In a Type II error (its probability is denoted “beta”), we fail to reject a null hypothesis when the alternate hypothesis is actually true. The power of a test is expressed as “1 − β.” While there are no absolute rules, the conventionally accepted maximum levels are 0.05 for α (corresponding to a significance level of 5%) and 0.20 for β (corresponding to a minimum recommended power of “1 − 0.20,” or 80%).

Effect size and minimal clinically relevant difference

For a clinical trial, the investigator will have to decide in advance what detectable change is clinically significant (for numerical data, this could be the difference between the anticipated outcome means in the two groups, whereas for categorical data, it could be the difference between the proportions of successful outcomes in the two groups). While we will not go into details of the formula for sample size calculation, some important points are as follows:

In the context where effect size is involved, the sample size is inversely proportional to the square of the effect size. What this means in effect is that reducing the effect size will lead to an increase in the required sample size.

Reducing the level of significance (alpha) or increasing power (1-β) will lead to an increase in the calculated sample size.

An increase in variance of the outcome leads to an increase in the calculated sample size.
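These three relationships can be seen in one widely used approximation for comparing the means of two equal-sized groups: n per group = 2 × (z for 1 − α/2 + z for power)² × (SD / difference)². The article does not state which formula it has in mind, so treat this as an illustrative assumption rather than the authors' method; the function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


base = n_per_group(difference=5, sd=10)
print(base)                                      # baseline
print(n_per_group(difference=2.5, sd=10))        # smaller effect -> about 4x larger
print(n_per_group(difference=5, sd=10, alpha=0.01))
print(n_per_group(difference=5, sd=10, power=0.90))
print(n_per_group(difference=5, sd=20))          # larger variance
```

Halving the difference to be detected roughly quadruples the required sample size, which is the inverse-square relationship described above.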

Note that for estimation-type studies/surveys, sample size calculation needs to consider some other factors too. These include an idea of the total population size (this generally does not make a major difference when the population size is above 20,000, so in situations where the population size is not known we can assume a population of 20,000 or more). Another factor is the “margin of error,” the amount of deviation which the investigators find acceptable, expressed as a percentage. Regarding confidence levels, a 95% confidence level is ideally the minimum recommended for surveys too. Finally, we need an idea of the expected/crude prevalence, based either on previous studies or on estimates.
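A common way to combine these factors for estimating a proportion is n0 = z² × p(1 − p) / e², followed by a finite population correction n = n0 / (1 + (n0 − 1) / N). The article does not name a formula, so this sketch is an assumption for illustration; the function and parameter names are hypothetical, and a prevalence of 0.5 is used as the most conservative default.

```python
import math
from statistics import NormalDist


def survey_sample_size(margin, prevalence=0.5, confidence=0.95, population=None):
    """Approximate sample size for estimating a proportion within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * prevalence * (1 - prevalence) / margin ** 2
    if population is not None:
        # Finite population correction: matters mostly for small populations.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(survey_sample_size(0.05))                     # population size unknown
print(survey_sample_size(0.05, population=20_000))  # close to the unknown case
print(survey_sample_size(0.05, population=500))     # noticeably smaller
```

With a 5% margin of error, the result for a population of 20,000 differs only slightly from the result with no population limit, which is consistent with the rule of thumb above.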

Sample size calculation also needs to add corrections for patient drop-outs/lost-to-follow-up patients and missing records. An important point is that in some studies dealing with rare diseases, it may be difficult to achieve desired sample size. In these cases, the investigators might have to rework outcomes or maybe pool data from multiple centers. Although post hoc power can be analyzed, a better approach suggested is to calculate 95% confidence intervals for the outcome and interpret the study results based on this.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

Types of Variables in Psychology Research

Examples of Independent and Dependent Variables

Dependent and Independent Variables

  • Intervening Variables
  • Extraneous Variables
  • Controlled Variables
  • Confounding Variables
  • Operationalizing Variables

Frequently Asked Questions

Variables in psychology are things that can be changed or altered, such as a characteristic or value. Variables are generally used in psychology experiments to determine if changes to one thing result in changes to another.

Variables in psychology play a critical role in the research process. By systematically changing some variables in an experiment and measuring what happens as a result, researchers are able to learn more about cause-and-effect relationships.

The two main types of variables in psychology are the independent variable and the dependent variable. Both variables are important in the process of collecting data about psychological phenomena.

This article discusses different types of variables that are used in psychology research. It also covers how to operationalize these variables when conducting experiments.

Students often report problems with identifying the independent and dependent variables in an experiment. While this task can become more difficult as the complexity of an experiment increases, in a psychology experiment:

  • The independent variable is the variable that is manipulated by the experimenter. An example of an independent variable in psychology: In an experiment on the impact of sleep deprivation on test performance, sleep deprivation would be the independent variable. The experimenters would have some of the study participants be sleep-deprived while others would be fully rested.
  • The dependent variable is the variable that is measured by the experimenter. In the previous example, the scores on the test performance measure would be the dependent variable.

So how do you differentiate between the independent and dependent variables? Start by asking yourself what the experimenter is manipulating. The things that change, either naturally or through direct manipulation from the experimenter, are generally the independent variables. What is being measured? The dependent variable is the one that the experimenter is measuring.

Intervening Variables in Psychology

Intervening variables, also sometimes called intermediate or mediator variables, are factors that play a role in the relationship between two other variables. In the previous example, sleep problems in university students are often influenced by factors such as stress. As a result, stress might be an intervening variable that plays a role in how much sleep people get, which may then influence how well they perform on exams.

Extraneous Variables in Psychology

Independent and dependent variables are not the only variables present in many experiments. In some cases, extraneous variables may also play a role. This type of variable is one that may have an impact on the relationship between the independent and dependent variables.

For example, in our previous example of an experiment on the effects of sleep deprivation on test performance, other factors such as age, gender, and academic background may have an impact on the results. In such cases, the experimenter will note the values of these extraneous variables so any impact can be controlled for.

There are two basic types of extraneous variables:

  • Participant variables : These extraneous variables are related to the individual characteristics of each study participant that may impact how they respond. These factors can include background differences, mood, anxiety, intelligence, awareness, and other characteristics that are unique to each person.
  • Situational variables : These extraneous variables are related to things in the environment that may impact how each participant responds. For example, if a participant is taking a test in a chilly room, the temperature would be considered an extraneous variable. Some participants may not be affected by the cold, but others might be distracted or annoyed by the temperature of the room.

Other extraneous variables include the following:

  • Demand characteristics : Clues in the environment that suggest how a participant should behave
  • Experimenter effects : When a researcher unintentionally suggests clues for how a participant should behave

Controlled Variables in Psychology

In many cases, extraneous variables are controlled for by the experimenter. A controlled variable is one that is held constant throughout an experiment.

In the case of participant variables, the experimenter might select participants that are the same in background and temperament to ensure that these factors don't interfere with the results. Holding these variables constant is important for an experiment because it allows researchers to be sure that all other variables remain the same across all conditions.

Using controlled variables means that when changes occur, the researchers can be sure that these changes are due to the manipulation of the independent variable and not caused by changes in other variables.

It is important to also note that a controlled variable is not the same thing as a control group. The control group in a study is the group of participants who do not receive the treatment or change in the independent variable.

All other variables between the control group and experimental group are held constant (i.e., they are controlled). The dependent variable being measured is then compared between the control group and experimental group to see what changes occurred because of the treatment.

Confounding Variables in Psychology

If a variable cannot be controlled for, it becomes what is known as a confounding variable. This type of variable can have an impact on the dependent variable, which can make it difficult to determine if the results are due to the influence of the independent variable, the confounding variable, or an interaction of the two.

Operationalizing Variables in Psychology

An operational definition describes how the variables are measured and defined in the study. Before conducting a psychology experiment, it is essential to create firm operational definitions for both the independent variable and dependent variables.

For example, in our imaginary experiment on the effects of sleep deprivation on test performance, we would need to create very specific operational definitions for our two variables. If our hypothesis is "Students who are sleep deprived will score significantly lower on a test," then we would have a few different concepts to define:

  • Students : First, what do we mean by "students?" In our example, let’s define students as participants enrolled in an introductory university-level psychology course.
  • Sleep deprivation : Next, we need to operationally define the "sleep deprivation" variable. In our example, let’s say that sleep deprivation refers to those participants who have had less than five hours of sleep the night before the test.
  • Test variable : Finally, we need to create an operational definition for the test variable. For this example, the test variable will be defined as a student’s score on a chapter exam in the introductory psychology course.
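One way to see what an operational definition buys you is to write it down as code, where nothing can be left vague. The sketch below encodes the definitions from the list above; the class name, field names, and the four participants' values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Operational definition from the example: fewer than five hours of sleep
# the night before the test counts as sleep deprived.
SLEEP_DEPRIVATION_CUTOFF_HOURS = 5


@dataclass
class Participant:
    """A student enrolled in the introductory psychology course."""
    hours_slept: float  # hours of sleep the night before the test
    exam_score: float   # score on the chapter exam (the test variable)

    @property
    def sleep_deprived(self) -> bool:
        return self.hours_slept < SLEEP_DEPRIVATION_CUTOFF_HOURS


# Made-up data, not results from any study.
participants = [
    Participant(4.0, 68),
    Participant(7.5, 81),
    Participant(5.0, 77),
    Participant(3.5, 70),
]

deprived_scores = [p.exam_score for p in participants if p.sleep_deprived]
rested_scores = [p.exam_score for p in participants if not p.sleep_deprived]
print(deprived_scores, rested_scores)
```

Note how the definition forces a decision about edge cases: a participant who slept exactly five hours is not classified as sleep deprived.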

Once all the variables are operationalized, we're ready to conduct the experiment.

Variables play an important part in psychology research. Manipulating an independent variable and measuring the dependent variable allows researchers to determine if there is a cause-and-effect relationship between them.

A Word From Verywell

Understanding the different types of variables used in psychology research is important if you want to conduct your own psychology experiments. It is also helpful for people who want to better understand what the results of psychology research really mean and become more informed consumers of psychology information.

Independent and dependent variables are used in experimental research. Unlike some other types of research (such as correlational studies), experiments allow researchers to evaluate cause-and-effect relationships between two variables.

Researchers can use statistical analyses to determine the strength of a relationship between two variables in an experiment. Two of the most common ways to do this are to calculate a p-value or a correlation. The p-value indicates if the results are statistically significant while the correlation can indicate the strength of the relationship.
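As a rough sketch of the correlation part, the Pearson correlation coefficient can be computed directly from its definition. The sleep and exam data below are invented for illustration, and the function name is a hypothetical helper; computing a p-value would need an additional statistical test and is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both variables' spread."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up data: hours slept and exam score for five participants.
hours_slept = [4, 5, 6, 7, 8]
exam_scores = [62, 70, 71, 80, 85]

r = pearson_r(hours_slept, exam_scores)
print(round(r, 2))
```

Values of r near +1 or −1 indicate a strong linear relationship, and values near 0 indicate a weak one.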

In an experiment on how sugar affects short-term memory, sugar intake would be the independent variable and scores on a short-term memory task would be the dependent variable.

In an experiment looking at how caffeine intake affects test anxiety, the amount of caffeine consumed before a test would be the independent variable and scores on a test anxiety assessment would be the dependent variable.

Just as with other types of research, the independent variable in a cognitive psychology study would be the variable that the researchers manipulate. The specific independent variable would vary depending on the specific study, but it might be focused on some aspect of thinking, memory, attention, language, or decision-making.

American Psychological Association. Operational definition . APA Dictionary of Psychology.

American Psychological Association. Mediator . APA Dictionary of Psychology.

Altun I, Cınar N, Dede C. The contributing factors to poor sleep experiences in according to the university students: A cross-sectional study .  J Res Med Sci . 2012;17(6):557-561. PMID:23626634

Skelly AC, Dettori JR, Brodt ED. Assessing bias: The importance of considering confounding .  Evid Based Spine Care J . 2012;3(1):9-12. doi:10.1055/s-0031-1298595

  • Evans, AN & Rooney, BJ. Methods in Psychological Research. Thousand Oaks, CA: SAGE Publications; 2014.
  • Kantowitz, BH, Roediger, HL, & Elmes, DG. Experimental Psychology. Stamford, CT: Cengage Learning; 2015.

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

The Independent Variable vs. Dependent Variable in Research

In any scientific research, there are typically two variables of interest: independent variables and dependent variables. As the backbone of scientific experiments, they help scientists understand relationships, predict outcomes and, in general, make sense of the factors that they're investigating.

Understanding the independent variable vs. dependent variable is so fundamental to scientific research that you need to have a good handle on both if you want to design your own research study or interpret others' findings.

To grasp the distinction between the two, let's delve into their definitions and roles.

  • What Is an Independent Variable?
  • What Is a Dependent Variable?
  • Research Study Example
  • Predictor Variables vs. Outcome Variables
  • Other Variables
  • The Relationship Between Independent and Dependent Variables

The independent variable, often denoted as X, is the variable that is manipulated or controlled by the researcher intentionally. It's the factor that researchers believe may have a causal effect on the dependent variable.

In simpler terms, the independent variable is the variable you change or vary in an experiment so you can observe its impact on the dependent variable.

The dependent variable, often represented as Y, is the variable that is observed and measured to determine the outcome of the experiment.

In other words, the dependent variable is the variable that is affected by the changes in the independent variable. Its values are expected to depend on the independent variable.

Let's consider an example to illustrate these concepts. Imagine you're conducting a research study aiming to investigate the effect of studying techniques on test scores among students.

In this scenario, the independent variable manipulated would be the studying technique, which you could vary by employing different methods, such as spaced repetition, summarization or practice testing.

The dependent variable, in this case, would be the test scores of the students. As the researcher following the scientific method, you would manipulate the independent variable (the studying technique) and then measure its impact on the dependent variable (the test scores).

You can also categorize variables as predictor variables or outcome variables. Sometimes a researcher will refer to the independent variable as the predictor variable since they use it to predict or explain changes in the dependent variable, which is also known as the outcome variable.

When conducting an experiment or study, it's crucial to acknowledge the presence of other variables, or extraneous variables, which may influence the outcome of the experiment but are not the focus of study.

These variables can potentially confound the results if they aren't controlled. In the example from above, other variables might include the students' prior knowledge, level of motivation, time spent studying and preferred learning style.

As a researcher, it would be your goal to control these extraneous variables to ensure you can attribute any observed differences in the dependent variable to changes in the independent variable. In practice, however, it's not always possible to control every variable.

The distinction between independent and dependent variables is essential for designing and conducting research studies and experiments effectively.

By manipulating the independent variable and measuring its impact on the dependent variable while controlling for other factors, researchers can gain insights into the factors that influence outcomes in their respective fields.

Whether investigating the effects of a new drug on blood pressure or studying the relationship between socioeconomic factors and academic performance, understanding the role of independent and dependent variables is essential for advancing knowledge and making informed decisions.

Correlation vs. Causation

Understanding the relationship between independent and dependent variables is essential for making sense of research findings. Depending on the nature of this relationship, researchers may identify correlations or infer causation between the variables.

Correlation implies that changes in one variable are associated with changes in another variable, while causation suggests that changes in the independent variable directly cause changes in the dependent variable.
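As a rough illustration of measuring a correlation, the sketch below computes a Pearson correlation coefficient by hand. The data (hours studied and test scores for six students) and the function name `pearson_r` are hypothetical, made up for this example. A coefficient close to 1 or -1 indicates a strong association, but on its own it says nothing about whether one variable causes the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: hours studied and test scores for six students.
hours_studied = [1, 2, 3, 4, 5, 6]
test_scores = [52, 55, 61, 64, 70, 73]

r = pearson_r(hours_studied, test_scores)
print(f"r = {r:.3f}")
```

To support a causal claim, you would still need a design that manipulates the independent variable and controls for extraneous variables.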

Control and Intervention

In experimental research, the researcher has control over the independent variable, allowing them to manipulate it to observe its effects on the dependent variable. This controlled manipulation distinguishes experiments from other types of research designs.

For example, in observational studies, researchers merely observe variables without intervention, meaning they don't control or manipulate any variables.

Context and Analysis

Independent, dependent and other variables can behave differently in different contexts, whether or not the researcher intends it. Their effects may differ based on factors such as participants' age and other characteristics, environmental influences and so on.

Researchers employ statistical analysis techniques to measure and analyze the relationships between these variables, helping them to draw meaningful conclusions from their data.

We created this article in conjunction with AI technology, then made sure it was fact-checked and edited by a HowStuffWorks editor.

Frequently asked questions

What are independent and dependent variables?

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The independent variable is the amount of nutrients added to the crop field.
  • The dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group . As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased .
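A first check for differential attrition is simply to compare dropout rates between groups. The sketch below does this for a two-arm study; the participant counts and the helper name `attrition_rate` are hypothetical. A gap in rates is only a warning sign: you would still need to compare the characteristics of those who dropped out with those who stayed.

```python
def attrition_rate(enrolled, completed):
    """Share of enrolled participants who left before the study ended."""
    return (enrolled - completed) / enrolled


# Hypothetical counts for a two-arm trial.
groups = {
    "intervention": {"enrolled": 100, "completed": 70},
    "control": {"enrolled": 100, "completed": 90},
}

rates = {
    name: attrition_rate(g["enrolled"], g["completed"])
    for name, g in groups.items()
}
# Absolute gap between the two dropout rates.
difference = abs(rates["intervention"] - rates["control"])

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} attrition")
print(f"difference: {difference:.0%}")
```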

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research . It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity . In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity . You need to have face validity , content validity , and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity . Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity .

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.

Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method . Unlike probability sampling (which involves some form of random selection ), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method , where there is not an equal chance for every member of the population to be included in the sample .

This means that you cannot use inferential statistics and make generalizations —often the goal of quantitative research . As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves approaching whoever happens to be available. Not everyone has an equal chance of being selected, because who ends up in the sample depends on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.
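The quota procedure described above can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `quota_sample`, the subgroup labels, and the quotas are all hypothetical, and participants are accepted in the order they happen to arrive (the convenience element).

```python
def quota_sample(arrivals, quotas):
    """Accept participants in arrival order until every subgroup quota is full."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    sample = []
    for person_id, subgroup in arrivals:
        # Accept only if this person's subgroup still has open slots.
        if remaining.get(subgroup, 0) > 0:
            sample.append((person_id, subgroup))
            remaining[subgroup] -= 1
        # Stop recruiting once all quotas are filled.
        if not any(remaining.values()):
            break
    return sample


# Hypothetical stream of people encountered during data collection.
arrivals = [
    (1, "under_30"), (2, "under_30"), (3, "30_plus"), (4, "under_30"),
    (5, "30_plus"), (6, "30_plus"), (7, "under_30"),
]
# Hypothetical quotas, assumed to mirror estimated population proportions.
quotas = {"under_30": 2, "30_plus": 2}

sample = quota_sample(arrivals, quotas)
print(sample)
```

Because acceptance depends on arrival order rather than random selection, the result is still a non-probability sample even though the subgroup proportions match the quotas.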

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .
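The difference in selection logic can be sketched in a few lines. In this illustration the population (three schools of ten students each), the function names, and the sample sizes are hypothetical; the same groups are treated first as strata and then as clusters purely to contrast the two procedures.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample of units from every stratum."""
    sample = []
    for units in strata.values():
        sample.extend(rng.sample(units, n_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return sample


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 3 groups of 10 student IDs each.
groups = {
    "school_a": [f"a{i}" for i in range(10)],
    "school_b": [f"b{i}" for i in range(10)],
    "school_c": [f"c{i}" for i in range(10)],
}

stratified = stratified_sample(groups, n_per_stratum=3, rng=rng)
clustered = cluster_sample(groups, n_clusters=2, rng=rng)

print("stratified:", stratified)  # some units from all three groups
print("clustered:", clustered)    # all units from two of the groups
```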

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity, alongside face validity, content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).
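To make the "left-hand side" and "right-hand side" terminology concrete, the sketch below fits a simple regression equation of the form y = intercept + slope * x by ordinary least squares, with the dependent variable as y and the independent variable as x. The data (nutrients per plot and crop biomass) and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of deviation products / sum of squared x deviations.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: x is the independent (right-hand-side) variable,
# y is the dependent (left-hand-side) variable.
nutrients_kg = [0, 10, 20, 30, 40]
biomass_kg = [20, 26, 31, 37, 41]

intercept, slope = fit_line(nutrients_kg, biomass_kg)
print(f"biomass = {intercept:.2f} + {slope:.2f} * nutrients")
```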

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. Structured interviews are often quantitative in nature. They are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, so you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to this stringent process they go through before publication.

In general, the peer review process consists of the following steps:

  • First, the author submits the manuscript to the editor.
  • The editor then either rejects the manuscript and sends it back to the author, or sends it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits, and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodological approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data points might be missing, extreme (outliers), duplicated, incorrectly formatted, or irrelevant. You’ll start by screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
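A screening pass of this kind might look like the sketch below. It is only one possible approach: the record layout, the function name `clean_weights`, and the plausible-range thresholds are hypothetical, and in practice you might correct or flag suspect values rather than remove them.

```python
def clean_weights(records, low=30.0, high=250.0):
    """Keep the first entry per participant if its weight is present and plausible."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        pid, weight = rec["id"], rec["weight_kg"]
        if pid in seen_ids:
            continue  # duplicate entry for a participant already seen
        seen_ids.add(pid)
        if weight is None:
            continue  # missing value
        if not (low <= weight <= high):
            continue  # outside the assumed plausible range (likely entry error)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw records with typical problems.
raw = [
    {"id": 1, "weight_kg": 70.5},
    {"id": 2, "weight_kg": None},   # missing
    {"id": 3, "weight_kg": 705.0},  # probably a misplaced decimal point
    {"id": 1, "weight_kg": 70.5},   # duplicate of the first record
    {"id": 4, "weight_kg": 82.0},
]

cleaned = clean_weights(raw)
print([rec["id"] for rec in cleaned])
```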

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .
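As a rough illustration of the point above, the following Python sketch builds two made-up datasets that each lie exactly on a straight line. Both therefore have a correlation coefficient of 1, yet their slopes differ by a factor of 20. The helper names `pearson_r` and `slope` and all of the data values are assumptions chosen for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x

xs = [1, 2, 3, 4, 5]
shallow = [0.5 * x for x in xs]  # made-up data on a line with slope 0.5
steep = [10 * x for x in xs]     # made-up data on a line with slope 10

r_shallow, r_steep = pearson_r(xs, shallow), pearson_r(xs, steep)
b_shallow, b_steep = slope(xs, shallow), slope(xs, steep)
```

Here the correlation only reports how tightly the points follow a line, while the regression slope reports how steep that line is.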

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data are from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources. This allows you to draw valid, trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests or thematic analysis)

A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize the bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.
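The third variable problem can be sketched with simulated data. In the hypothetical setup below, `z` drives both `x` and `y`, while `x` and `y` have no direct effect on each other. The raw correlation between `x` and `y` is clearly positive, but the first-order partial correlation (a standard formula used here for illustration, not a method named in the text above) is close to zero once `z` is held constant. The variable names, seed, and noise levels are all assumptions.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N = 5_000

z = [rng.gauss(0, 1) for _ in range(N)]   # hypothetical confounder
x = [zi + rng.gauss(0, 1) for zi in z]    # x depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]    # y depends on z only

r_xy = pearson_r(x, y)
r_xz = pearson_r(x, z)
r_yz = pearson_r(y, z)

# First-order partial correlation of x and y with z held constant.
r_xy_given_z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
```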

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B, but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables.

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample, the errors in different directions will tend to cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
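The difference between the two kinds of error can be sketched with a small simulation. The true weight, the size of the bias, the noise level, and the seed below are all hypothetical values chosen for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
TRUE_WEIGHT = 70.0       # hypothetical true value, in kg
BIAS = 2.0               # hypothetical miscalibration, in kg
N = 10_000

# Random error only: zero-mean noise scattered around the true value.
random_only = [TRUE_WEIGHT + rng.gauss(0, 1.0) for _ in range(N)]

# Systematic error: the same noise plus a constant offset from a miscalibrated scale.
biased = [TRUE_WEIGHT + BIAS + rng.gauss(0, 1.0) for _ in range(N)]

mean_random = sum(random_only) / N
mean_biased = sum(biased) / N
```

With many measurements, `mean_random` lands very close to the true value because the noise averages out, whereas `mean_biased` stays about 2 kg too high no matter how many measurements are taken.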

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables , use a scatterplot or a line graph.
  • If your response variable is categorical, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
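A minimal sketch of how the conditions are formed, assuming two hypothetical independent variables (caffeine dose with three levels and sleep duration with two levels):

```python
from itertools import product

# Hypothetical independent variables and their levels.
caffeine = ["none", "low", "high"]
sleep = ["4 hours", "8 hours"]

# Every level of one variable is paired with every level of the other.
conditions = list(product(caffeine, sleep))
```

This 3 x 2 design yields six conditions, such as `("none", "4 hours")` and `("high", "8 hours")`.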

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity, it also requires more participants for high statistical power than a within-subjects design.

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
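The random number generator approach might look like the following sketch. The function name `randomly_assign`, the two equal-sized groups, and the seed are assumptions for illustration; a real study may need unequal groups or more than one experimental group.

```python
import random

def randomly_assign(participant_ids, seed=None):
    """Shuffle the participant numbers and split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical sample of 20 participants numbered 1 to 20.
groups = randomly_assign(range(1, 21), seed=1)
```

Each participant number ends up in exactly one group, and fixing the seed makes the assignment reproducible for record-keeping.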

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable.
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is higher than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not ordered in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k, by dividing your population size by your target sample size.
  • Choose every kth member of the population as your sample.
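The steps above can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the random starting point within the first interval is an assumption added here (a common convention) rather than something the steps specify.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th member of a listed population."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(population) // sample_size           # sampling interval
    start = random.Random(seed).randrange(k)     # assumed: random start in the first interval
    return population[start::k][:sample_size]

# Hypothetical list of 100 people, numbered 1 to 100.
population = list(range(1, 101))
sample = systematic_sample(population, sample_size=10, seed=3)
```

With 100 people and a target of 10, the interval is k = 10, so the selected members are always exactly 10 positions apart.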

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
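The multiplication in this example can be checked by enumerating every combination of the two characteristics. The sketch below uses only the subgroup names given above.

```python
from itertools import product

locations = ["urban", "rural", "suburban"]
marital_statuses = ["single", "divorced", "widowed", "married", "partnered"]

# Each stratum is one (location, marital status) pair.
strata = list(product(locations, marital_statuses))
```

Because every participant has exactly one location and one marital status, each participant falls into exactly one of the 15 pairs.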

You should use stratified sampling when your population can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
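A minimal sketch of this two-step process, assuming simple random sampling within each stratum and an equal number of units per stratum. The function name `stratified_sample`, the made-up population, and the seed are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population of 90 people as (id, educational attainment) pairs.
levels = ["high school", "bachelor", "graduate"]
people = [(i, levels[i % 3]) for i in range(90)]

sample = stratified_sample(people, stratum_of=lambda p: p[1], per_stratum=5, seed=7)
```

Equal allocation is only one option; sampling each stratum in proportion to its size is another common choice.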

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.
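The difference between single-stage and double-stage sampling can be sketched as follows, assuming a hypothetical population of 10 schools (the clusters) with 30 students each. The names, sizes, and seed are illustrative only.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10 schools, each a cluster of 30 students.
clusters = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(30)]
    for s in range(10)
}

# Both types start by randomly selecting clusters.
selected = rng.sample(sorted(clusters), 3)

# Single-stage: collect data from every unit in the selected clusters.
single_stage = [unit for name in selected for unit in clusters[name]]

# Double-stage: draw a random sample of units within each selected cluster.
double_stage = [unit for name in selected for unit in rng.sample(clusters[name], 5)]
```

Multi-stage sampling would repeat the within-cluster step at further levels, for example sampling classes within schools before sampling students within classes.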

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity. However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey is an example of simple random sampling. In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data, because the items have a clear rank order, but the intervals between response options aren’t necessarily equal.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
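As a small sketch of how responses are combined into an overall score, assume a hypothetical four-item scale with five response options coded 1 to 5. The labels and the example answers below are made up for illustration.

```python
# Hypothetical 5-point agreement scale, coded 1 (lowest) to 5 (highest).
LABELS = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
SCORES = {label: value for value, label in enumerate(LABELS, start=1)}

# One participant's made-up answers to four Likert-type items.
responses = ["Agree", "Strongly agree", "Neutral", "Agree"]

item_scores = [SCORES[answer] for answer in responses]
total_score = sum(item_scores)
```

This participant's item scores are 4, 5, 3, and 4, giving an overall scale score of 16 out of a possible 20. Negatively worded items would need to be reverse-coded before summing, which this sketch does not cover.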

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomization , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
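The four probability sampling methods named above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the population of 100 numbered units, the two strata, and the ten clusters are all hypothetical, and real survey designs involve more care (for example, uneven cluster sizes or weighting).

```python
import random

def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, k)

def systematic_sample(population, k, rng):
    """Pick a random start, then take every step-th unit."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]

def stratified_sample(population, stratum_of, fraction, rng):
    """Sample the same fraction at random from each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep all their units."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

rng = random.Random(1)
population = list(range(100))  # hypothetical unit IDs

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
# Hypothetical strata: units 0-59 belong to "A", units 60-99 to "B".
stratified = stratified_sample(
    population, lambda u: "A" if u < 60 else "B", 0.1, rng
)
# Hypothetical clusters: ten groups of ten consecutive units.
clusters = {c: list(range(c * 10, c * 10 + 10)) for c in range(10)}
clustered = cluster_sample(clusters, 2, rng)
```

In each case the selection is driven by a random number generator rather than by the researcher's judgment, which is what gives every unit a known chance of inclusion.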

Using careful research design and sampling procedures can help you avoid sampling bias. Oversampling can be used to correct undercoverage bias.

Some common types of sampling bias include self-selection bias, nonresponse bias, undercoverage bias, survivorship bias, pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic.

A statistic refers to measures about the sample, while a parameter refers to measures about the population.
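As a rough illustration of these definitions, the sketch below computes a sampling error for a mean. The population of heights is simulated and entirely hypothetical, and `sampling_error` is a helper name introduced here for illustration.

```python
import random
import statistics

def sampling_error(sample, population):
    """Sample statistic (mean) minus population parameter (mean)."""
    return statistics.mean(sample) - statistics.mean(population)

rng = random.Random(0)

# Hypothetical population: 10,000 heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# A simple random sample of 100 members of that population.
sample = rng.sample(population, 100)

parameter = statistics.mean(population)  # describes the population
statistic = statistics.mean(sample)      # describes the sample
error = sampling_error(sample, population)

print(f"parameter: {parameter:.2f}")
print(f"statistic: {statistic:.2f}")
print(f"sampling error: {error:.2f}")
```

In practice the population parameter is usually unknown, so the sampling error cannot be computed directly and is estimated instead, for example through a standard error.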

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity: selection bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study.

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study, which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study.

Longitudinal studies are better suited to establishing the correct sequence of events, identifying changes over time, and providing insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design. In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal study | Cross-sectional study
Repeated observations | Observations at a single point in time
Observes the same group multiple times | Observes different groups (a “cross-section”) in the population
Follows changes in participants over time | Provides a snapshot of society at a given point

There are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the mean, social interaction and attrition.

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

The research methods you use depend on the type of data you need to answer your research question.

  • If you want to measure something or test a hypothesis, use quantitative methods. If you want to explore ideas, thoughts and meanings, use qualitative methods.
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables, use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables:

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels
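The last decision in the list above, how subjects are assigned to treatment levels, is often handled by random assignment. The sketch below shows one simple way to do it; the function name, the subject IDs, and the two treatment levels (borrowed from the soda example) are assumptions made for illustration.

```python
import random

def assign_to_groups(subjects, levels, rng):
    """Shuffle the subjects, then deal them out to the treatment
    levels in turn so that group sizes differ by at most one."""
    shuffled = subjects[:]  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {level: [] for level in levels}
    for i, subject in enumerate(shuffled):
        groups[levels[i % len(levels)]].append(subject)
    return groups

rng = random.Random(7)
subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical subjects
levels = ["diet soda", "regular soda"]          # levels of the independent variable

groups = assign_to_groups(subjects, levels, rng)
for level, members in groups.items():
    print(level, members)
```

Because chance alone decides who ends up in which group, potential confounding variables tend to be balanced across the groups on average.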

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables.

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.

In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.

Research Method

Dependent Variable – Definition, Types and Example

Dependent Variable

Definition:

A dependent variable is a variable in a study or experiment that is measured or observed and is affected by the independent variable. In other words, it is the variable that researchers want to understand, predict, or explain based on changes made to the independent variable.

Types of Dependent Variables

The types of dependent variables are as follows:

  • Continuous dependent variable: A continuous variable is a variable that can take on any value within a certain range. Examples include height, weight, and temperature.
  • Discrete dependent variable: A discrete variable is a variable that can only take on distinct, countable values. Examples include the number of children in a family, the number of pets someone has, and the number of cars owned by a household.
  • Categorical dependent variable: A categorical variable is a variable that can take on values that belong to specific categories or groups. Examples include gender, race, and marital status.
  • Dichotomous dependent variable: A dichotomous variable is a categorical variable that can take on only two values. Examples include whether someone is a smoker or non-smoker, or whether someone has a certain medical condition or not.
  • Ordinal dependent variable: An ordinal variable is a categorical variable that has a specific order or ranking to its categories. Examples include education level (e.g., high school diploma, college degree, graduate degree), or socioeconomic status (e.g., low, middle, high).
  • Interval dependent variable: An interval variable is a continuous variable measured on a scale with equal intervals between the values but no true zero point. Examples include temperature measured in degrees Celsius or Fahrenheit.
  • Ratio dependent variable: A ratio variable is a continuous variable that has a true zero point and equal intervals between the values. Examples include height, weight, and income.
  • Count dependent variable: A count variable is a discrete variable that represents the number of times an event occurs within a specific time period. Examples include the number of times a customer visits a store, or the number of times a student misses a class.
  • Time-to-event dependent variable: A time-to-event variable is a type of continuous variable that measures the time it takes for an event to occur. Examples include the time until a customer makes a purchase, or the time until a patient recovers from an illness.
  • Latent dependent variable: A latent variable is a variable that cannot be directly observed or measured, but is inferred from other observable variables. Examples include intelligence, personality traits, and motivation.
  • Binary dependent variable: A binary variable is a dichotomous variable with only two possible outcomes, usually represented by 0 or 1. Examples include whether a customer will make a purchase or not, or whether a patient will respond to a treatment or not.
  • Multinomial dependent variable: A multinomial variable is a categorical variable with more than two possible outcomes. Examples include political affiliation, type of employment, or type of transportation used to commute.
  • Longitudinal dependent variable: A longitudinal variable is a variable that is measured repeatedly over time in order to track change. Examples include academic performance, income, or health status.

Examples of Dependent Variable

Here are some examples of dependent variables in different fields:

  • In physics: The velocity of an object is a dependent variable as it changes in response to the force applied to it.
  • In psychology: The level of happiness or satisfaction of a person can be a dependent variable as it may change in response to different factors such as the level of stress or social support.
  • In medicine: The effectiveness of a new drug can be a dependent variable as it may be measured in relation to the symptoms of a disease.
  • In education: The grades of a student can be a dependent variable as they may be influenced by factors such as teaching methods or amount of studying.
  • In economics: The demand for a product can be a dependent variable as it may change in response to factors such as the price or availability of the product.
  • In biology: The growth rate of a plant can be a dependent variable as it may change in response to factors such as sunlight, water, or soil nutrients.
  • In sociology: The level of social support for an individual can be a dependent variable as it may change in response to factors such as the availability of community resources or the strength of social networks.
  • In marketing: The sales of a product can be a dependent variable as they may change in response to factors such as advertising, pricing, or consumer trends.
  • In environmental science: The biodiversity of an ecosystem can be a dependent variable as it may change in response to factors such as climate change, pollution, or habitat destruction.
  • In political science: The outcome of an election can be a dependent variable as it may change in response to factors such as campaign strategies, political advertising, or voter turnout.
  • In criminology: The likelihood of a person committing a crime can be a dependent variable as it may change in response to factors such as poverty, education, or socialization.
  • In engineering: The efficiency of a machine can be a dependent variable as it may change in response to factors such as the materials used, the design of the machine, or the operating conditions.
  • In linguistics: The speed and accuracy of language processing can be a dependent variable as they may change in response to factors such as linguistic complexity, language experience, or cognitive ability.
  • In history: The outcome of a historical event, such as a battle or a revolution, can be a dependent variable as it may change in response to factors such as leadership, strategy, or external forces.
  • In sports science: The performance of an athlete can be a dependent variable as it may change in response to factors such as training methods, nutrition, or psychological factors.

Applications of Dependent Variable

  • Experimental studies: In experimental studies, the dependent variable is used to test the effect of one or more independent variables on the outcome variable. For example, in a study on the effect of a new drug on blood pressure, the dependent variable is the blood pressure.
  • Observational studies: In observational studies, the dependent variable is used to explore the relationship between two or more variables. For example, in a study on the relationship between physical activity and depression, the dependent variable is the level of depression.
  • Psychology: In psychology, dependent variables are used to measure the response or behavior of individuals in response to different experimental or natural conditions.
  • Predictive modeling: In predictive modeling, the dependent variable is the outcome that the model predicts for a future event or situation. For example, in financial modeling, the dependent variable can be the future value of a stock or currency.
  • Regression analysis: In regression analysis, the value of the dependent variable is predicted from one or more independent variables based on their relationship with it. For example, in a study on the relationship between income and education, the dependent variable is income.
  • Machine learning: In machine learning, a model is trained to predict the value of the dependent variable (the target) from the values of one or more independent variables (the features). For example, in image recognition, the dependent variable is the label identifying the object in an image.
  • Quality control: In quality control, the dependent variable is used to monitor the performance of a product or process. For example, in a manufacturing process, the dependent variable can be used to measure the quality of the product and identify any defects.
  • Marketing research: In marketing research, the dependent variable is used to understand consumer behavior and preferences. For example, in a study on the effectiveness of a new advertising campaign, the dependent variable can be used to measure consumer response to the ad.
  • Social sciences research: In social sciences research, the dependent variable is used to study human behavior and attitudes. For example, in a study on the impact of social media on mental health, the dependent variable can be used to measure the level of anxiety or depression.
  • Epidemiological studies: In epidemiological studies, the dependent variable is used to investigate the prevalence and incidence of diseases or health conditions. For example, in a study on the risk factors for heart disease, the dependent variable can be used to measure the occurrence of heart disease.
  • Environmental studies: In environmental studies, the dependent variable is used to assess the impact of environmental factors on ecosystems and natural resources. For example, in a study on the effect of pollution on aquatic life, the dependent variable can be used to measure the health and survival of aquatic organisms.
  • Educational research: In educational research, the dependent variable is used to study the effectiveness of different teaching methods and instructional strategies. For example, in a study on the impact of a new teaching program on student achievement, the dependent variable can be used to measure student performance.
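To make the regression item above concrete, here is a minimal sketch of a simple linear regression in which income is the dependent variable and years of education is the independent variable. The five data points are invented purely for illustration, and the helper name `fit_line` is an assumption, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data, invented for this example only.
education_years = [10, 12, 14, 16, 18]  # independent variable
income = [25, 31, 34, 42, 43]           # dependent variable (thousands)

intercept, slope = fit_line(education_years, income)

# Predict the dependent variable for a new value of the independent variable.
predicted = intercept + slope * 15

print(f"income = {intercept:.2f} + {slope:.2f} * education_years")
print(f"predicted income at 15 years: {predicted:.2f}")
```

The direction of prediction matters: the model is fitted so that the dependent variable is explained by the independent variable, not the other way round.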

Purpose of Dependent Variable

The purpose of the dependent variable is to help researchers understand the relationship between the independent variable and the outcome they are studying. By measuring the changes in the dependent variable, researchers can determine the effects of different variables on the outcome of interest.

When to use Dependent Variable

The following are some situations in which the dependent variable is used:

  • When conducting scientific research or experiments, the dependent variable is the factor that is being measured or observed to determine its relationship with other factors or variables.
  • In statistical analysis, the dependent variable is the outcome or response variable that is being predicted or explained by one or more independent variables.
  • When formulating hypotheses, the dependent variable is the variable that is being predicted or explained by the independent variable(s).
  • When writing a research paper or report, it is important to clearly define the dependent variable(s) in order to provide a clear understanding of the research question and methods used to answer it.
  • In social sciences, such as psychology or sociology, the dependent variable may refer to behaviors, attitudes, or other measurable aspects of individuals or groups.
  • In natural sciences, such as biology or physics, the dependent variable may refer to physical properties or characteristics, such as temperature, speed, or mass.
  • The dependent variable is often contrasted with the independent variable, which is the variable that is being manipulated or changed in order to observe its effects on the dependent variable.

Characteristics of Dependent Variable

Some characteristics of a dependent variable are as follows:

  • The dependent variable is the outcome or response variable in the study.
  • Its value depends on the values of one or more independent variables.
  • The dependent variable is typically measured or observed, rather than manipulated by the researcher.
  • It can be continuous (e.g., height, weight) or categorical (e.g., yes/no, red/green/blue).
  • The dependent variable should be relevant to the research question and meaningful to the study participants.
  • It should have a clear and consistent definition and be measured or observed consistently across all participants in the study.
  • The dependent variable should be valid and reliable, meaning that it measures what it is intended to measure and produces consistent results over time.

Advantages of Dependent Variable

Some advantages of measuring a dependent variable are as follows:

  • Allows for the testing of hypotheses: By measuring the dependent variable in response to changes in the independent variable, researchers can test hypotheses and draw conclusions about cause-and-effect relationships.
  • Provides insight into the relationship between variables: The dependent variable can provide insight into how one variable is related to another, allowing researchers to identify patterns and make predictions about future outcomes.
  • Enables the evaluation of interventions: By measuring changes in the dependent variable over time, researchers can evaluate the effectiveness of interventions and determine whether they have a meaningful impact on the outcome being studied.
  • Enables the comparison of groups: The dependent variable can be used to compare groups of participants or populations, helping researchers to identify differences or similarities and draw conclusions about underlying factors that may be contributing to those differences.
  • Enables the calculation of statistical measures: By measuring the dependent variable, researchers can calculate statistical measures such as means, variances, and standard deviations, which are used to make statistical inferences about the population being studied.
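The statistical measures mentioned in the last item can be computed with Python's standard `statistics` module. The sketch below compares two groups on a dependent variable; the scores and group names are made up for illustration, and `summarize` is a helper name introduced here.

```python
import statistics

def summarize(values):
    """Mean, sample variance, and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "sd": statistics.stdev(values),
    }

# Hypothetical dependent-variable scores for two groups.
control = [4, 5, 6, 5, 5]
treatment = [6, 7, 8, 7, 7]

control_summary = summarize(control)
treatment_summary = summarize(treatment)
difference = treatment_summary["mean"] - control_summary["mean"]

print("control:  ", control_summary)
print("treatment:", treatment_summary)
print("difference in means:", difference)
```

Descriptive measures like these are the inputs to inferential tests; on their own they describe the sample rather than establish an effect in the population.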

Disadvantages of Dependent Variable

  • Limited in scope: The dependent variable is limited to the specific outcome being studied, which may not capture the full complexity of the system or phenomenon being investigated.
  • Vulnerable to confounding variables: Confounding variables, or factors that are not controlled for in the study, can influence the dependent variable and obscure the relationship between the independent and dependent variables.
  • Prone to measurement error: The dependent variable may be subject to measurement error due to issues with data collection methods or measurement instruments, which can lead to inaccurate or unreliable results.
  • Limited to observable variables: The dependent variable is typically limited to variables that can be measured or observed, which may not capture underlying or latent variables that may be important for understanding the phenomenon being studied.
  • Ethical concerns: In some cases, measuring the dependent variable may raise ethical concerns, such as in studies of sensitive topics or vulnerable populations.
  • Limited to specific time periods: The dependent variable is typically measured at specific time points or over specific time periods, which may not capture changes or fluctuations in the outcome over longer periods of time.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

RESEARCH VARIABLES: TYPES, USES AND DEFINITION OF TERMS

  • In book: Research in Education (pp.43-54)
  • Publisher: His Lineage Publishing House

Olayemi Jumoke Abiodun-Oyebanji at University of Ibadan

  • Open access
  • Published: 28 August 2024

Determinants of survival time for HIV/AIDS patients in the pastoralist region of Borena: a study at Yabelo General Hospital, South East Ethiopia

  • Galgalo Jaba Nura,
  • Kumbi Sara Wario &
  • Markos Abiso Erango

AIDS Research and Therapy volume 21, Article number: 58 (2024)

Introduction

HIV/AIDS is one of the most dangerous diseases globally, impacting public health, economics, society, political issues, and communities. As of 2023, the World Health Organization estimates that 40.4 million people are living with HIV/AIDS. This study aimed to identify the determinants of survival time for HIV/AIDS patients in the pastoralist region of Borena at Yabelo General Hospital.

Methods

This was a retrospective cohort study of 293 individuals living with HIV/AIDS, based on recorded data. The survival analysis employed Kaplan-Meier plots, the log-rank test, and the Cox proportional hazards model.
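For readers unfamiliar with the Kaplan-Meier method named above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' code, and the follow-up times and event indicators are invented for illustration; in practice a dedicated survival analysis package would normally be used.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: follow-up time for each subject
    events:    1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) pairs, one for
    each distinct time at which at least one event occurred.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up times (months) and event indicators.
durations = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t = {time}: S(t) = {prob:.2f}")
```

Censored subjects contribute to the number at risk up to their last follow-up time but never count as deaths, which is how the estimator uses incomplete follow-up without discarding it.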

Results

Of the 293 patients in the sample, 179 (61.1%) were female and 114 (38.9%) were male. Among the males, 36 (31.6%) were deceased. The Cox proportional hazards model showed that the following variables were significantly associated with the survival time of HIV/AIDS patients: gender, educational status, area of residence, tuberculosis (TB), and opportunistic infections.

Conclusions

We concluded that individuals living with HIV/AIDS in urban areas have a lower risk of death compared to those in rural areas, indicating that rural residents have a reduced survival probability. Therefore, the Borena zone administration should focus on adult patients to enhance life expectancy.

The human immunodeficiency virus (HIV) is the world’s most critical public health issue. According to estimates from the World Health Organization (WHO), approximately 40.4 million people were living with HIV by mid-2023. In the African region, an estimated 25.6 million individuals had HIV by that time, as reported by the WHO. In 2022, over 20.9 million people received antiretroviral treatment. That same year, an estimated 660,000 individuals acquired HIV, and by mid-2023, the rate of new HIV infections across all ages had decreased to 0.57 per 1,000 uninfected population, down from 1.75 in 2010 [1].

Survival patterns among African communities following HIV infection before the introduction of antiretroviral therapy (ART) served as an initial benchmark for assessing the future viability of intervention initiatives [2]. Since the advent of ART, HIV infection has transitioned from a severe condition to a chronic illness [3]. In Ethiopia, current estimates indicate a slight decline in the number of people living with HIV (PLWH), from 610,350 in 2022 to 603,537 in 2023. Reported prevalence shows that the number of PLWH in the Oromia region gradually decreased, from 158,152 in 2022 to 156,184 in 2023 [4].

The Borena community pastoralists have long existed under the Gada society’s cultural, social, community, and political organization, led by the Abba Gada or elders of Borena. Following 1950, the modern education system in Borena began, but the Gada system’s structure has been in place since around the 14th century, resulting in a lack of contemporary education. According to a report from the Ethiopia Public Health Institute [ 5 ], 2,600 adult Borena individuals are living with HIV infection, indicating that many pastoralists remain unaware of disease transmission. This vulnerability to the disease is prevalent throughout all areas of the Borena pastoralist community. Consequently, numerous individuals have been infected, primarily due to insufficient protective measures and insufficient education.

In addition, concurrent extramarital sexual activities, polygamy, and marrying a deceased wife’s sister have been identified as risk factors for HIV infection. Although not extensively documented, the practice of maintaining extramarital sexual partners by both men and women, widow inheritance, and polygamy appears to have decreased, although it continues to occur in secret [ 6 , 7 ]. Despite the lack of studies on vulnerability within the Borana population, a few behavioral and biological studies indicate a very high HIV prevalence in the region compared to similar contexts [ 8 , 9 ]. The researcher aimed to determine the survival time for HIV/AIDS patients in the pastoralist region of Borena at Yabelo General Hospital from January 2016 to December 2019. The results will provide information about the determinants of survival time for people living with HIV/AIDS in the pastoralist region of Borena.

Methods and materials

The study was conducted at Yabelo General Hospital, situated in Yabelo town, Borena Zone, one of the twenty-one zones of the Oromia Region. In 2010, the hospital was upgraded from a health center to a general hospital. It provides various services to the residents of Borena Zone and to members of other Ethiopian ethnic groups. Currently, the zone comprises ten rural pastoralist woredas and one town administration, Yabelo, which has a state function. The zone is located in the southern part of the Oromia region. It borders the West Guji Zone to the north, the Southern Nations, Nationalities, and Peoples’ Region to the west, and the Somali region to the southeast, and it shares an international boundary with Kenya to the south (Fig. 1).

Fig. 1 Map of Borena zone

According to the 2023 report from the Borena Zone Administration Office, over 1.4 million people reside in the zone, with a male-to-female ratio of 1:1. Settlement patterns vary considerably from district to district. Approximately 89% of the population lives in the rural pastoralist areas of the zone [ 10 ]. Borena Zone is one of the most predominantly pastoralist areas in Ethiopia and relies primarily on livestock rearing. The livestock population in Borena includes 1,482,053 goats, 1,179,645 sheep, 637,632 horses, 2,222 mules, 5,525 donkeys, 68,799 camels, and 185,382 cattle [ 11 ].

Source of data and study population

This is a retrospective cohort study, meaning that all events and exposures recorded in the patients’ cards and information sheets occurred in the past. The source population was all individuals diagnosed with HIV at Yabelo General Hospital who were receiving ART and attending follow-up at regular intervals. According to hospital records, 1,147 HIV patients received ART and were assessed for baseline CD4 cell count during the study period. The study encompassed all adult HIV-positive patients who visited the hospital for treatment three or more times, as well as adult HIV/AIDS patients who initiated treatment between January 2016 and December 2019. Based on the inclusion and exclusion criteria, 293 adult HIV/AIDS patients were selected from the medical records.

Sample size determination

The required sample size was calculated with the formula given in [ 12 ], so that the study would have enough power to detect statistically significant effects. Following [ 13 ], the calculation was based on the mortality rates in two groups of HIV-positive individuals on ART, with WHO clinical stage as the exposure status. The resulting sample size for this study was 293 HIV-positive subjects, taking the inclusion criteria into account (further calculations are available in the supplementary material).

Variables of the study

The outcome variable for the survival analysis is the survival time (time to death) of HIV-infected adults under follow-up. The predictors included in this study were gender, age, marital status, educational status, place of residence, WHO stage, TB, adherence to ART, functional status, family history, and opportunistic infectious diseases.

WHO stages

These are the clinical stages of patients based on CD4 values, classified into four stages: stage I, stage II, stage III, and stage IV.

Tuberculosis (TB)

Individuals with HIV and weakened immune systems are at a higher risk of contracting tuberculosis compared to those with typical immune systems.

Family History

This refers to any previous occurrence of HIV/AIDS among the patient’s family members.

Opportunistic infectious diseases

These are infections that occur more frequently and are more severe in individuals with declining immune systems.

Functional status

Working: able to perform usual work in or out of the house; Ambulatory: able to carry out activities of daily living; Bedridden: unable to perform activities of daily living [ 14 ].

Adherence to ART

Adherence was categorized as good if patients took at least 95% of the prescribed medication, fair if they took at least 85% but less than 95%, and poor if they took less than 85% of the prescribed medication [ 15 ].
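The adherence thresholds above can be expressed as a small function. The following is a minimal Python sketch for illustration only: the study's analysis was carried out in R, and the function name `classify_adherence` and its percentage input are assumptions introduced here, not part of the original study.

```python
def classify_adherence(percent_taken: float) -> str:
    """Map the percentage of prescribed doses taken to an adherence category.

    Thresholds follow the text: good >= 95%, fair 85% to < 95%, poor < 85%.
    The function name and input format are hypothetical.
    """
    if not 0 <= percent_taken <= 100:
        raise ValueError("percent_taken must be between 0 and 100")
    if percent_taken >= 95:
        return "good"
    if percent_taken >= 85:
        return "fair"
    return "poor"


# Example: a patient who took 90% of prescribed doses
print(classify_adherence(90))
```

Treating exactly 95% as good and exactly 85% as fair follows the wording "at least 95%" and "less than 85%" in the definition.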

Method of analysis

The analysis was conducted using R software version 4.3.1. It comprised descriptive statistics of the variables, the Kaplan-Meier method, the log-rank test, and the Cox proportional hazards model for the time-to-event data.

Survival analysis model

Survival analysis is a branch of statistics that investigates the expected duration until one or more events take place [ 16 ]. In these data, not all patients experience the event by the end of the observation period, so the actual survival times of some individuals living with HIV/AIDS remain unknown. This phenomenon is referred to as censoring, and it must be accounted for in the analysis to yield meaningful results [ 17 , 18 ].

Kaplan-Meier estimator

The Kaplan-Meier estimator [ 19 ] provides a non-parametric maximum likelihood estimate of the survival function.
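As an illustration of how this estimator handles censoring, the product-limit formula \(\hat{S}(t) = \prod_{t_i \le t} (1 - d_i / n_i)\), where \(d_i\) is the number of deaths at time \(t_i\) and \(n_i\) is the number of patients still at risk just before \(t_i\), can be sketched in a few lines of Python. This is a hedged sketch, not the authors' code (the study used R); the function name `kaplan_meier` and the toy follow-up times are made up for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death occurred.

    times:  follow-up durations (e.g. months)
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at t (death or censoring)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= exits[t]  # censored patients leave the risk set after t
    return curve


# Toy data (hypothetical): six patients, two of them censored
toy_times = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the estimate stays valid when some survival times are unknown.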

Cox proportional hazards model

The basic model used for the survival analysis is the Cox proportional hazards model, introduced by Cox [ 16 ]. In this model, the effect of a unit increase in a covariate is multiplicative with respect to the hazard rate. Its covariates can be time-independent. The model implies that the hazard function \(\lambda(t, X, \beta)\) is related to the covariates as the product of a baseline hazard \(\lambda_{0}(t)\) and a function of the covariates.
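For reference, the product form described above is usually written with an exponential function of the covariates. The source says only "a function of covariates", so the exponential form below is the standard Cox specification and an assumption about the fitted model; \(p\) (the number of covariates) and \(\mathrm{HR}_{j}\) are notation introduced here.

\[
\lambda(t, X, \beta) = \lambda_{0}(t)\,\exp\left(\beta_{1}X_{1} + \beta_{2}X_{2} + \dots + \beta_{p}X_{p}\right)
\]

\[
\mathrm{HR}_{j} = \exp(\beta_{j})
\]

Under this form, a one-unit increase in covariate \(X_{j}\), with the other covariates held fixed, multiplies the hazard by \(\exp(\beta_{j})\) at every time \(t\). The baseline hazard \(\lambda_{0}(t)\) cancels in the ratio, which is the proportional hazards property.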

Results

In this study, records of 293 individuals living with HIV/AIDS were included; of this total, 179 (61.1%) were female. Among these females, 33 (18.4%) had died, while the others were censored. Among the male patients, 36 (31.6%) had died. Of the total sample, 83 (28.3%) had tuberculosis (TB). Among the TB patients, 34 (41.0%) died, whereas 35 (16.7%) of the patients without TB died. Regarding functional status, 221 (75.4%) of the patients were working, 27 (9.2%) were bedridden, and 45 (15.4%) were ambulatory. Among those who were working, 50 (22.6%) died.

At baseline, 201 (68.6%) of the patients had no family members with a previous history of HIV/AIDS, while the remaining 92 (31.4%) had opportunistic infections; 35 (38.0%) of these 92 patients died (Table 1).
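The crude percentages of deaths reported in the two paragraphs above can be recomputed from the counts. The short Python sketch below is illustrative only; the group sizes for males (293 − 179) and for patients without TB (293 − 83) are derived here from the reported totals rather than stated in the text, and the helper name `percent` is an assumption.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# (deaths, group size) taken from the descriptive results above
deaths_by_group = {
    "female": (33, 179),
    "male": (36, 293 - 179),        # group size derived: 114
    "TB": (34, 83),
    "no TB": (35, 293 - 83),        # group size derived: 210
    "opportunistic infection": (35, 92),
}

crude_mortality = {group: percent(d, n) for group, (d, n) in deaths_by_group.items()}
for group, value in crude_mortality.items():
    print(f"{group}: {value}% died")
```

These are unadjusted proportions; they ignore follow-up time and censoring, which is why the Kaplan-Meier and Cox analyses are needed for inference.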

Survival analysis

Comparison of survival between groups

The survival data for this study consist of baseline information extracted for the entire sample of patients. Differences between groups were assessed using Kaplan-Meier plots and the log-rank test. Figure 2 illustrates the differences between the categorical groups in the Kaplan-Meier plots. Female patients had slightly higher survival probabilities than males throughout follow-up. By place of residence, patients from urban areas had a higher survival probability than those from rural areas. The log-rank test indicates a statistically significant difference between patients from urban areas and those from rural areas (Supplementary Table 1 ).

A Kaplan-Meier plot comparing the educational statuses of patients is presented in Fig. 2. Visually, the curves do not separate clearly: the primary and secondary education groups displayed similar patterns, and the no formal education and tertiary groups also showed similar trends. Nevertheless, the log-rank test reveals a statistically significant difference ( P = 0.02) in survival time in months among the no formal education, primary, secondary, and tertiary groups.

For TB status, the Kaplan-Meier plot indicates that individuals living with HIV/AIDS who did not have TB were more likely to survive, in terms of survival time in months, than those who had TB. The log-rank test also demonstrates a statistically significant difference between patients with TB and those without (Supplementary Table 1 ).
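The log-rank comparisons described above can be sketched for the two-group case as follows. This is a minimal, standard-library Python illustration of the usual log-rank statistic (observed minus expected deaths in one group, with the hypergeometric variance), not the authors' R code; the function name `logrank_test` and the toy data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*:  follow-up times; events_*: 1 = death observed, 0 = censored.
    Returns (chi_square, p_value) for a chi-square statistic with 1 degree of freedom.
    """
    data = [(t, e, "a") for t, e in zip(times_a, events_a)]
    data += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for row in data if row[0] >= t)                    # at risk, both groups
        n_a = sum(1 for row in data if row[0] >= t and row[2] == "a")
        d = sum(1 for row in data if row[0] == t and row[1] == 1)    # deaths at t
        d_a = sum(1 for row in data
                  if row[0] == t and row[1] == 1 and row[2] == "a")
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Toy data (hypothetical): group a dies early, group b dies late
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With more than two groups, as for educational status, the same idea generalizes to a statistic with (number of groups − 1) degrees of freedom; that extension is not shown here.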

Fig. 2 Kaplan-Meier plots of different categorical variables

Assumption checking

The results of the covariate-specific tests and the global test of the proportional hazards assumption for the Cox model are presented in Supplementary Table 2. The p-values for the covariate terms and the global test are not significant at the 5% level, indicating that the proportional hazards assumption is not violated. In the Schoenfeld residual plot (Supplementary Fig. 1), no pattern over time is observed for any variable. The proportional hazards assumption is therefore satisfied under both checks.

Multivariate analysis of the Cox-PH model

Gender, educational status, place of residence, tuberculosis, family history, and opportunistic infections were significantly associated with the survival time of adults living with HIV/AIDS on ART at the 5% level of significance. According to the adjusted hazard ratio, male HIV-infected patients had 1.69 times the hazard of death of their female counterparts (HR = 1.69, p-value = 0.036); that is, male patients faced a 69% higher hazard of death than female patients (Table 2 ).

Patients educated to the secondary level had an estimated hazard ratio of 0.31, corresponding to a 69% lower hazard of death than patients with no formal education (HR = 0.31, p-value = 0.028). HIV-infected adults with TB had 1.72 times the hazard of death of those without TB, that is, a 72% higher hazard.

Patients with a family history of HIV/AIDS had 1.66 times the hazard of death of those without such a history (HR = 1.66, p-value = 0.047). Patients with opportunistic infections had 2.30 times the hazard of death of patients without them (HR = 2.30, p-value = 0.002). However, marital status and WHO stage were not significantly associated with survival time in HIV patients.
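The hazard ratios reported above translate into percentage changes in the hazard of death as (HR − 1) × 100. The Python sketch below applies this to the reported values; it is an illustration only, and the function name `percent_change_in_hazard` and the dictionary labels are assumptions introduced here.

```python
def percent_change_in_hazard(hazard_ratio):
    """Percentage change in hazard relative to the reference group.

    Positive values mean a higher hazard, negative values a lower hazard.
    """
    if hazard_ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (hazard_ratio - 1.0) * 100.0


# Adjusted hazard ratios as reported in the text (reference group in parentheses)
reported_hazard_ratios = {
    "male (vs female)": 1.69,
    "secondary education (vs no formal education)": 0.31,
    "TB (vs no TB)": 1.72,
    "family history (vs none)": 1.66,
    "opportunistic infection (vs none)": 2.30,
}

for label, hr in reported_hazard_ratios.items():
    change = percent_change_in_hazard(hr)
    direction = "higher" if change > 0 else "lower"
    print(f"{label}: HR = {hr:.2f}, {abs(change):.0f}% {direction} hazard")
```

Note that a hazard ratio of 0.31 means a 69% lower hazard, not a "0.31-fold lower" one, and a hazard ratio of 1.72 describes the size of the increase in hazard, not the share of patients affected.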

Discussion

This study aimed to identify factors affecting the survival time of adult HIV/AIDS patients in the pastoralist area of Borena at Yabelo General Hospital from January 2016 to December 2019. In the current study, gender was significantly associated with survival time until death, consistent with several other studies [ 20 , 21 , 22 ]. The mortality risk for adult male patients was higher than that for adult female patients, possibly because female patients are more likely to know their HIV status at an earlier stage and to start ART with higher CD4 counts than males [ 20 ]. In other studies, however, gender was not associated with survival time [ 23 , 24 , 25 , 26 ].

The findings of this study revealed that individuals living with HIV/AIDS who had a secondary education had a lower hazard of death than those with no formal education. Various studies have likewise found that secondary education is linked to a lower risk of mortality among HIV-infected ART users, indicating a significant effect on the survival time of adult patients [ 25 , 27 , 28 , 29 , 30 , 31 ].

Patients living in urban areas had 0.46 times the hazard of death of patients living in rural areas (a 54% lower hazard), indicating that urban patients are more likely to survive than rural patients. Similarly, a study at Debre Tabor Referral Hospital reported that patients in urban areas had significantly higher survival rates than those from rural areas [ 32 ]. In contrast, a study examining the impact of the “universal test and treat” program on HIV treatment in the Gurage Zone found that rural patients had significantly better survival rates than urban patients [ 33 ]. Possible reasons for the urban advantage observed here include better drug adherence, improved access to services, closer proximity to health centers, better care, and differences in knowledge.

According to the findings of this study, patients with both tuberculosis (TB) and HIV faced 1.72 times the risk of death of patients without TB. Patients without TB coinfection therefore have better survival than those with it. A similar study conducted at Goba Hospital in Southeast Ethiopia found that TB coinfection at the start of ART was strongly associated with increased mortality among ART patients [ 26 , 33 ]. However, other studies did not demonstrate any association between baseline TB infection and the hazard of death [ 23 ].

According to our study, opportunistic infections among people living with HIV/AIDS are linked to increased mortality. Patients with opportunistic infections were estimated to face a higher risk of death than those without such infections. Various studies support the finding that opportunistic infections are significantly associated with the survival and mortality of HIV-infected patients [ 23 , 25 ].

Conclusion

The main objective of this study was to determine the survival time of HIV/AIDS patients in the pastoralist region of Borena at Yabelo General Hospital from January 2016 to December 2019. A total of 293 adults living with HIV/AIDS were analyzed. According to the Cox-PH model, gender, educational status, place of residence, TB, family history, and opportunistic infections were identified as factors affecting the survival time of HIV-infected individuals. Patients residing in urban areas have a lower risk of death than those living in rural areas. Therefore, the Borena zone administration should pay special attention to adult patients to enhance life expectancy.

Data availability

The data will be made available after acceptance.

Abbreviations

EPHI: Ethiopian Public Health Institute

PH: Proportional Hazard

TB: Tuberculosis

UNAIDS: Joint United Nations Programme on HIV/AIDS

WHO: World Health Organization

References

UNAIDS. The path that ends AIDS: UNAIDS Global AIDS Update 2023. Geneva: Joint United Nations Programme on HIV/AIDS; 2023. Licence: CC BY-NC-SA 3.0 IGO.

US Department of Health and Human Services. Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents. http://aidsinfo.nih.gov/OrderPublication/OrderPubsBrowseSearchResultsTable . aspx? 2009, ID = 115.

EPHI. (2023). HIV-Related Estimates and Projections in Ethiopia for the Year 2022–2023, Addis Ababa.

The Ethiopia Public Health Institute. (EPHI, 2023). HIV Related Estimates and Projections in Ethiopia for the Year 2022–2023. May 2023, Addis Ababa. https://ephi.gov.et/wp-content/uploads/2021/02/HIV-Estimates-and-projection-for-the-year-2022-and-2023.pdf

Mirgissa K, Ibrahim A, Damen HM. Extramarital sexual practices and perceived association with HIV infection among the Borana pastoral community. Ethiop J Health Dev. 2013;27(1):25–32.

Miz-Hasab Research Centre. HIV/AIDS and gender in Ethiopia: the case of ten Weredas in Oromia and Southern Nations and Nationalities people’s region. Addis Ababa: Miz-Hasab research center; 2004.

Tefera B, Ahmed Y. Contribution of the anti HIV/AIDS community conversation programs in preventing and controlling the spread of HIV/ AIDS. Ethiop J Health Dev. 2013;27(3):216–29.

Mela Research. Know Your HIV Epidemic/Know Your HIV Response (KYE/KYR) Synthesis in Oromia, Ethiopia. Addis Ababa, Ethiopia; 2014.

Collett D. Modeling survival data in medical research. Chapman and Hall/CRC; 2023.

Borena. (2023). Borena zone administration office report on the population severe drought effects in 2023.unpublished document.

Fenetahun Y, Fentahun T. Socio-economic profile of arid and semi-arid agro-pastoral region of Borana rangeland Southern Ethiopia. MOJ Eco Environ Sci. 2020;5(3):113–22.

Gebrerufael GG, Asfaw ZG, Chekole DM. The effect of longitudinal body weight and CD4 cell progression for the survival of HIV/AIDS patients. Cogent Med. 2021;8(1):1986269.

Cox DR. Regression models and life-tables. J Roy Stat Soc: Ser B (Methodol). 1972;34(2):187–202.

Tsegaye E, Worku A. Assessment of antiretroviral treatment outcome in public hospitals, South nations Nationalities and Peoples Region, Ethiopia. Ethiop J Health Dev. 2011;25:102–9.

Abbastabar H, Rezaianzadeh A, Rajaeefard A, Ghaem H, Motamedifar M, Kazeroon PA. Determining factors of CD4 cell count in HIV patients: in a historical cohort study. International Journal of Life Science and Pharma Research. 2016:93–101.

Schober P, Vetter TR. Survival analysis and interpretation of time-to-event data: the tortoise and the hare. Anesth Analgesia. 2018;127(3):792–8.

George B, Seals S, Aban I. Survival analysis and regression models. J Nuclear Cardiol. 2014;21(4):686–94.

Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958;53(282):457–81.

Mageda K, Leyna GH, Mmbaga EJ. High initial HIV/AIDS-Related mortality and predictors among patients on antiretroviral therapy in the Kagera Region of Tanzania: a five-year retrospective cohort study. AIDS Res Treat. 2012;2012(1):843598.

Zheng H, Wang L, Huang P, Norris J, Wang Q, Guo W, Peng Z, Yu R, Wang N. Incidence and risk factors for AIDS-related mortality in HIV patients in China: a cross-sectional study. BMC Public Health. 2014;14:1–9.

Mengesha S, Belayihun B, Kumie A. Predictors of survival in HIV-infected patients after initiation of HAART in Zewditu Memorial Hospital, Addis Ababa, Ethiopia. Int Sch Res Notices. 2014;2014(1):250913.

Seyoum D, Degryse JM, Kifle YG, Taye A, Tadesse M, Birlie B, Banbeta A, Rosas-Aguirre A, Duchateau L, Speybroeck N. Risk factors for mortality among adult HIV/AIDS patients following antiretroviral therapy in Southwestern Ethiopia: an assessment through survival models. Int J Environ Res Public Health. 2017;14(3):296.

Tegegne AS, Ndlovu P, Zewotir T. Determinants of CD4 cell count change and time-to default from HAART; a comparison of separate and joint models. BMC Infect Dis. 2018;18:1–1.

Setegn T, Takele A, Gizaw T, Nigatu D, Haile D. Predictors of mortality among adult antiretroviral therapy users in southeastern Ethiopia: retrospective cohort study. AIDS Res Treat. 2015;2015(1):148769.

Hassan AS, Mwaringa SM, Ndirangu KK, Sanders EJ, de Wit TF, Berkley JA. Incidence and predictors of attrition from antiretroviral care among adults in a rural HIV clinic in Coastal Kenya: a retrospective cohort study. BMC Public Health. 2015;15:1–9.

Tadesse K, Haile F, Hiruy N. Predictors of mortality among patients enrolled on antiretroviral therapy in Aksum hospital, northern Ethiopia: a retrospective cohort study. PLoS ONE. 2014;9(1):e87392.

Bello SI, Itiola OA. Drug adherence amongst tuberculosis patients in the University of Ilorin Teaching Hospital, Ilorin, Nigeria. Afr J Pharm Pharmacol. 2010;4(3):109–14.

Jarrin I, Lumbreras B, Ferrero I, Pérez-Hoyos S, Hurtado I, Hernández-Aguado I. Effect of education on overall and cause-specific mortality in injecting drug users, according to HIV and introduction of HAART. Int J Epidemiol. 2007;36(1):187–94.

Seid A, Getie M, Birlie B, Getachew Y. Joint modeling of longitudinal CD4 cell counts and time-to-default from HAART treatment: a comparison of separate and joint models. Electron J Appl Stat Anal. 2014;7(2):292–314.

Kebede MM, Zegeye DT, Zeleke BM. Predictors of CD4 count changes after initiation of antiretroviral treatment in University of Gondar Hospital, Gondar in Ethiopia. Clin Res HIV/AIDS. 2015;1(2):1–5.

Birhan H, Seyoum A, Derebe K, Muche S, Wale M, Sisay S. Joint clinical and socio-demographic determinants of CD4 cell count and body weight in HIV/TB co-infected adult patients on HAART. Sci Afr. 2022;18:e01396.

Girum T, Yasin F, Wasie A, Shumbej T, Bekele F, Zeleke B. The effect of the universal test and treat program on HIV treatment outcomes and patient survival among a cohort of adults taking antiretroviral treatment (ART) in low-income settings of Gurage Zone, South Ethiopia. AIDS Res Therapy. 2020;17:1–9.

Ayalew J, Moges H, Worku A. Identifying factors related to the survival of AIDS patients under the follow-up of antiretroviral therapy (ART): the case of South Wollo. Int J Data Envelopment Anal Oper Res. 2014;1:21–7.

Acknowledgements

First and foremost, I would like to thank the almighty God for being with me at every step of my life. Next, I would like to express my sincere gratitude to my principal advisor, Dr. Markos Abiso (PhD).

Funding

The authors received no specific funding for this work.

Author information

Authors and affiliations.

Borena Zone Labour and Social Affairs Office, Borena, Oromia, Ethiopia

Galgalo Jaba Nura

Department of Economics, Borena University, Borena, Ethiopia

Kumbi Sara Wario

Department of Statistics, Arba Minch University, Arba Minch, Ethiopia

Markos Abiso Erango

Contributions

Conceptualization: Galgalo Jaba Nura, Markos Abiso Erango. Data curation: Galgalo Jaba Nura, Kumbi Sara Wario. Formal analysis: Galgalo Jaba Nura. Investigation: Galgalo Jaba Nura, Kumbi Sara Wario, Markos Abiso Erango. Methodology: Galgalo Jaba Nura, Markos Abiso Erango. Project administration: Markos Abiso Erango. Software: Galgalo Jaba Nura, Markos Abiso Erango. Supervision: Markos Abiso Erango. Validation: Markos Abiso Erango. Writing – original draft: Kumbi Sara Wario, Markos Abiso Erango. Writing – review & editing: Kumbi Sara Wario, Markos Abiso Erango.

Corresponding author

Correspondence to Galgalo Jaba Nura .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Nura, G.J., Wario, K.S. & Erango, M.A. Determinants of survival time for HIV/AIDS patients in the pastoralist region of Borena: a study at Yabelo General Hospital, South East Ethiopia. AIDS Res Ther 21, 58 (2024). https://doi.org/10.1186/s12981-024-00644-1

Received: 03 July 2024

Accepted: 09 August 2024

Published: 28 August 2024

DOI: https://doi.org/10.1186/s12981-024-00644-1

  • Survival time
  • Yabelo General Hospital

AIDS Research and Therapy

ISSN: 1742-6405

what is variable in research study

COMMENTS

  1. Variables in Research

    It is a variable that comes in between the independent and dependent variables and is affected by the independent variable, which then affects the dependent variable. For example, in a study on the relationship between exercise and weight loss, the mediating variable could be metabolism, as exercise can increase metabolism, which can then lead ...

  2. Independent vs. Dependent Variables

    The independent variable is the cause. Its value is independent of other variables in your study. The dependent variable is the effect. Its value depends on changes in the independent variable. Example: Independent and dependent variables. You design a study to test whether changes in room temperature have an effect on math test scores.

  3. Types of Variables in Research & Statistics

    Types of Variables in Research & Statistics | Examples. Published on September 19, 2022 by Rebecca Bevans. Revised on June 21, 2023. In statistical research, a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design.

  4. Variables in Research: Breaking Down the ...

    By maintaining consistent control variables, researchers can isolate the effects of the independent variable on the dependent variable, strengthening the validity of the study. Example: In the plant growth study, the researcher might control variables such as soil type, temperature, and water supply to ensure that the observed effects on plant ...

  5. Types of Variables in Research

    In statistical research, a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design. Example: Variables If you want to test whether some plant species are more salt-tolerant than others, ...

  6. Variables in Research

    Variables can be categorized based on their role in the study (such as independent and dependent variables), the type of data they represent (quantitative or categorical), and their relationship to other variables (like confounding or control variables). Understanding what constitutes a variable and the various variable types available is a ...

  7. Independent & Dependent Variables (With Examples)

    What is a control variable? In an experimental design, a control variable (or controlled variable) is a variable that is intentionally held constant to ensure it doesn't have an influence on any other variables. As a result, this variable remains unchanged throughout the course of the study. In other words, it's a variable that's not allowed to vary - tough life 🙂

  8. Independent vs Dependent Variables: Definitions & Examples

    The independent variable is the cause and the dependent variable is the effect, that is, independent variables influence dependent variables. In research, a dependent variable is the outcome of interest of the study and the independent variable is the factor that may influence the outcome. Let's explain this with an independent and dependent ...

  9. Organizing Your Social Sciences Research Paper

    Don't feel bad if you are confused about what is the dependent variable and what is the independent variable in social and behavioral sciences research. However, it's important that you learn the difference because framing a study using these variables is a common approach to organizing the elements of a social sciences research study in order ...

  10. Variables in Research

    Variables in Research. The definition of a variable in the context of a research study is some feature with the potential to change, typically one that may influence or reflect a relationship or ...

  11. Variables in Research

    An independent variable is a variable believed to affect the dependent variable. Confounding variables are defined as interference caused by another variable. Read Variables in Research ...

  12. Types of Variables and Commonly Used Statistical Designs

    Suitable statistical design represents a critical factor in permitting inferences from any research or scientific study.[1] Numerous statistical designs are implementable due to the advancement of software available for extensive data analysis.[1] Healthcare providers must possess some statistical knowledge to interpret new studies and provide up-to-date patient care. We present an overview of ...

  13. 27 Types of Variables in Research and Statistics

    18. Predictor Variables. Definition: A predictor variable—also known as independent or explanatory variable—is a variable that is being manipulated in an experiment or study to see how it influences the dependent or response variable. Explanation: In a cause-and-effect relationship, the predictor variable is the cause.

  14. Types of Variables, Descriptive Statistics, and Sample Size

    Variables. What is a variable?[1,2] To put it in very simple terms, a variable is an entity whose value varies. A variable is an essential component of any statistical data. It is a feature of a member of a given sample or population, which is unique, and can differ in quantity or quality from another member of the same sample or population.

  15. Types of Variables in Psychology Research

    Just as with other types of research, the independent variable in a cognitive psychology study would be the variable that the researchers manipulate. The specific independent variable would vary depending on the specific study, but it might be focused on some aspect of thinking, memory, attention, language, or decision-making.

  16. Variables: Definition, Examples, Types of Variables in Research

    Quantitative Variables. Quantitative variables, also called numeric variables, are those variables that are measured in terms of numbers. A simple example of a quantitative variable is a person's age. Age can take on different values because a person can be 20 years old, 35 years old, and so on.

  17. The Independent Variable vs. Dependent Variable in Research

    The independent variable, often denoted as X, is the variable that is manipulated or controlled by the researcher intentionally. It's the factor that researchers believe may have a causal effect on the dependent variable. In simpler terms, the independent variable is the variable you change or vary in an experiment so you can observe its impact ...

  18. What are independent and dependent variables?

    A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

  19. 10 Types of Variables in Research and Statistics

    Types. Discrete and continuous. Binary, nominal and ordinal. Researchers can further categorize quantitative variables into discrete or continuous types of variables: Discrete: Any numerical variables you can realistically count, such as the coins in your wallet or the money in your savings account.

  20. Independent vs. Dependent Research Variables: Differences

    The number of hours the student studies is the independent variable because nothing directly affects the number of study hours. The grade the student earns in the class is the dependent variable because how much time the student commits to preparing can affect the grade. Related: 23 Research Databases for Professional and Academic Use.

  21. Dependent Variable

    A dependent variable is a variable in a study or experiment that is being measured or observed and is affected by the independent variable. In other words, it is the variable that researchers are interested in understanding, predicting, or explaining based on the changes made to the independent variable.

  22. Research Variables: Types, Uses and Definition of Terms

    The study revealed that as assessed by the sports officiating officials when they are grouped according to the study's variables, the results show a "high level" in all areas. Furthermore, the ...

  23. What Is an Independent Variable? (With Uses and Examples)

    An independent variable is a condition in a research study that causes an effect on a dependent variable. In research, scientists try to understand cause-and-effect relationships between two or more conditions. To identify how specific conditions affect others, researchers define independent and dependent variables.

  24. Understanding Independent and Dependent Variables in Research

    In most studies, the research question is written so that it outlines various aspects of the study, including the population and variables to be studied and the problem the study addresses.[5] When collecting information from respondents in a case study, the researcher can use various question formats to gather the data they need.

  25. Exploratory, pilot study: Treatments accessed by caregivers of children

    Down syndrome is associated with a range of developmental strengths and challenges. The treatment use of individuals with Down syndrome along with associated factors have not yet been determined. In a pilot study to address this issue, we elected to conduct an online survey rather than a classical representative population survey to generate relevant information quickly. An online survey was ...

  26. Determinants of survival time for HIV/AIDS patients in the pastoralist

    Variables of the study. The outcome variable for survival analysis is the survival time and/or time to death of patients under follow-up among HIV-infected adults. The predictors included in this study were gender, age, marital status, educational status, place of residence, WHO stages, TB, adherence to ART treatment, functional status, family ...