Cardiopulm Phys Ther J. 2009 Sep; 20(3).

Regression Analysis for Prediction: Understanding the Process

Phillip B Palmer

1 Hardin-Simmons University, Department of Physical Therapy, Abilene, TX

Dennis G O'Connell

2 Hardin-Simmons University, Department of Physical Therapy, Abilene, TX

Research related to cardiorespiratory fitness often uses regression analysis in order to predict cardiorespiratory status or future outcomes. Reading these studies can be tedious and difficult unless the reader has a thorough understanding of the processes used in the analysis. This feature seeks to “simplify” the process of regression analysis for prediction in order to help readers understand this type of study more easily. Examples of the use of this statistical technique are provided in order to facilitate better understanding.

INTRODUCTION

Graded, maximal exercise tests that directly measure maximum oxygen consumption (VO2max) are impractical in most physical therapy clinics because they require expensive equipment and personnel trained to administer the tests. Performing these tests in the clinic may also require medical supervision. As a result, researchers have sought to develop exercise and non-exercise models that would allow clinicians to predict VO2max without directly measuring oxygen uptake. In most cases, the investigators use regression analysis to develop their prediction models.

Regression analysis is a statistical technique for determining the relationship between a single dependent (criterion) variable and one or more independent (predictor) variables. The analysis yields a predicted value for the criterion resulting from a linear combination of the predictors. According to Pedhazur, 15 regression analysis has 2 uses in scientific literature: prediction, including classification, and explanation. The following provides a brief review of the use of regression analysis for prediction. Specific emphasis is given to the selection of the predictor variables (assessing model efficiency and accuracy) and cross-validation (assessing model stability). The discussion is not intended to be exhaustive. For a more thorough explanation of regression analysis, the reader is encouraged to consult one of many books written about this statistical technique (eg, Fox; 5 Kleinbaum, Kupper, & Muller; 12 Pedhazur; 15 and Weisberg 16 ). Examples of the use of regression analysis for prediction are drawn from a study by Bradshaw et al. 3 In this study, the researchers' stated purpose was to develop an equation for prediction of cardiorespiratory fitness (CRF) based on non-exercise (N-EX) data.

SELECTING THE CRITERION (OUTCOME MEASURE)

The first step in regression analysis is to determine the criterion variable. Pedhazur 15 suggests that the criterion have acceptable measurement qualities (ie, reliability and validity). Bradshaw et al 3 used VO2max as the criterion of choice for their model and measured it using a maximum graded exercise test (GXT) developed by George. 6 George 6 indicated that his protocol for testing compared favorably with the Bruce protocol in terms of predictive ability and had good test-retest reliability (ICC = .98–.99). The American College of Sports Medicine indicates that measurement of VO2max is the “gold standard” for measuring cardiorespiratory fitness. 1 These facts support the conclusion that the criterion selected by Bradshaw et al 3 was appropriate and met the requirements for acceptable reliability and validity.

SELECTING THE PREDICTORS: MODEL EFFICIENCY

Once the criterion has been selected, predictor variables should be identified (model selection). The aim of model selection is to account for the maximum variance in the criterion with the smallest number of predictors. 15 In other words, the most efficient model maximizes the value of the coefficient of determination (R²). This coefficient estimates the amount of variance in the criterion score accounted for by a linear combination of the predictor variables. The higher the value of R², the less error or unexplained variance and, therefore, the better the prediction. R² is dependent on the multiple correlation coefficient (R), which describes the relationship between the observed and predicted criterion scores. If there is no difference between the predicted and observed scores, R equals 1.00. This represents a perfect prediction with no error and no unexplained variance (R² = 1.00). When R equals 0.00, there is no relationship between the predictor(s) and the criterion and no variance in scores has been explained (R² = 0.00). In that case the chosen variables cannot predict the criterion. The goal of model selection is, as stated previously, to develop a model that results in the highest estimated value for R².
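The relationship between R and R² described above can be sketched in a few lines of Python. This is only an illustration: the function name `multiple_r` and the scores are hypothetical and are not data from any of the studies discussed.

```python
import math

def multiple_r(observed, predicted):
    """Correlation between observed and predicted criterion scores."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical VO2max scores (mL/kg/min), invented for illustration only.
observed = [30.0, 35.0, 40.0, 45.0, 50.0]
predicted = [32.0, 34.0, 41.0, 44.0, 49.0]

r = multiple_r(observed, predicted)
r_squared = r ** 2  # proportion of criterion variance accounted for
print(f"R = {r:.3f}, R^2 = {r_squared:.3f}")
```

When the predicted scores equal the observed scores exactly, the function returns 1.00, matching the perfect-prediction case described in the text.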

According to Pedhazur, 15 the value of R is often overestimated. The reasons for this are beyond the scope of this discussion; however, the degree of overestimation is affected by sample size. The larger the ratio is between the number of predictors and subjects, the larger the overestimation. To account for this, sample sizes should be large and there should be 15 to 30 subjects per predictor. 11 , 15 Of course, the most effective way to determine optimal sample size is through statistical power analysis. 11 , 15
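One common way to see how the predictor-to-subject ratio affects the estimate is the adjusted R² correction. The article does not name this correction at this point, so treat the sketch below as an assumption added for illustration. The inputs are the figures reported in this article for Bradshaw et al (R² = .87, 5 predictors, 100 subjects); the function names are hypothetical.

```python
def adjusted_r_squared(r_squared, n_subjects, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors relative to n."""
    return 1.0 - (1.0 - r_squared) * (n_subjects - 1) / (n_subjects - n_predictors - 1)

def subjects_per_predictor(n_subjects, n_predictors):
    """Ratio to compare against the suggested 15 to 30 subjects per predictor."""
    return n_subjects / n_predictors

# Figures reported in this article for the Bradshaw et al model.
adj = adjusted_r_squared(0.87, n_subjects=100, n_predictors=5)
ratio = subjects_per_predictor(100, 5)
print(f"adjusted R^2 = {adj:.3f}, subjects per predictor = {ratio:.0f}")
```

With 20 subjects per predictor the adjustment is small, which is in line with the article's later remark that this R² is not likely to be overestimated.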

Another method of determining the best model for prediction is to test the significance of adding one or more variables to the model using the partial F-test. This process, which is further discussed by Kleinbaum, Kupper, and Muller, 12 allows for exclusion of predictors that do not contribute significantly to the prediction, allowing determination of the most efficient model of prediction. In general, the partial F-test is similar to the F-test used in analysis of variance. It assesses the statistical significance of the difference between values for R² derived from 2 or more prediction models, each using a subset of the variables from the original equation. For example, Bradshaw et al 3 indicated that all variables contributed significantly to their prediction. Though the researchers do not detail the procedure used, it is highly likely that different models were tested, excluding one or more variables, and the resulting values for R² assessed for statistical difference.

Although the techniques discussed above are useful in determining the most efficient model for prediction, theory must be considered in choosing the appropriate variables. Previous research should be examined and predictors selected for which a relationship between the criterion and predictors has been established. 12 , 15

It is clear that Bradshaw et al 3 relied on theory and previous research to determine the variables to use in their prediction equation. The 5 variables they chose for inclusion (gender, age, body mass index [BMI], perceived functional ability [PFA], and physical activity rating [PAR]) had been shown in previous studies to contribute to the prediction of VO2max (eg, Heil et al; 8 George, Stone, & Burkett 7 ). These 5 predictors accounted for 87% (R = .93, R² = .87) of the variance in the predicted values for VO2max. Based on a ratio of 1:20 (predictor:sample size), this estimate of R, and thus R², is not likely to be overestimated. The researchers used changes in the value of R² to determine whether to include or exclude these or other variables. They reported that removal of PFA as a variable resulted in a decrease in R from .93 to .89. Without this variable, the remaining 4 predictors would account for only 79% of the variance in VO2max. The investigators did note that each predictor variable contributed significantly (p < .05) to the prediction of VO2max (see above discussion related to the partial F-test).
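To make the partial F-test concrete, the sketch below uses the usual R²-change form of the statistic with the rounded values quoted above (R² = .87 with 5 predictors, R² = .79 with PFA removed, 100 subjects). This is an assumption about how such a test could be run, not the procedure Bradshaw et al actually used, and because the inputs are rounded the result is only approximate. The function name `partial_f` is hypothetical.

```python
def partial_f(r2_full, r2_reduced, n_subjects, k_full, k_reduced):
    """Partial F statistic for the predictors dropped from the full model."""
    numerator = (r2_full - r2_reduced) / (k_full - k_reduced)
    denominator = (1.0 - r2_full) / (n_subjects - k_full - 1)
    return numerator / denominator

# Full model: 5 predictors; reduced model: PFA removed (4 predictors).
f_value = partial_f(0.87, 0.79, n_subjects=100, k_full=5, k_reduced=4)
df_numerator = 5 - 4
df_denominator = 100 - 5 - 1
print(f"F({df_numerator}, {df_denominator}) = {f_value:.2f}")
```

The resulting F value would then be compared with the critical value for the stated degrees of freedom to decide whether the dropped predictor contributes significantly.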

ASSESSING ACCURACY OF THE PREDICTION

Assessing accuracy of the model is best accomplished by analyzing the standard error of estimate (SEE) and the percentage that the SEE represents of the predicted mean (SEE%). The SEE represents the degree to which the predicted scores vary from the observed scores on the criterion measure, similar to the standard deviation used in other statistical procedures. According to Jackson, 10 lower values of the SEE indicate greater accuracy in prediction. Comparison of the SEE for different models using the same sample allows for determination of the most accurate model to use for prediction. SEE% is calculated by dividing the SEE by the mean of the criterion (SEE/mean criterion) and can be used to compare different models derived from different samples.
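A minimal sketch of both accuracy measures follows. It assumes the common definition of the SEE with n − k − 1 degrees of freedom (k = number of predictors); some texts use a different denominator. It also expresses SEE% as a percentage. The scores and function names are hypothetical.

```python
import math

def standard_error_of_estimate(observed, predicted, n_predictors):
    """SEE with n - k - 1 degrees of freedom (an assumed, common convention)."""
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_residual / (len(observed) - n_predictors - 1))

def see_percent(see, criterion_mean):
    """SEE expressed as a percentage of the mean criterion score."""
    return 100.0 * see / criterion_mean

# Hypothetical VO2max scores (mL/kg/min), invented for illustration only.
observed = [30.0, 35.0, 40.0, 45.0, 50.0]
predicted = [32.0, 34.0, 41.0, 44.0, 49.0]

see = standard_error_of_estimate(observed, predicted, n_predictors=1)
pct = see_percent(see, sum(observed) / len(observed))
print(f"SEE = {see:.2f} mL/kg/min, SEE% = {pct:.1f}%")
```

Because SEE% is scaled by the criterion mean, it stays comparable when two models were built on samples with different average fitness levels.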

Bradshaw et al 3 report a SEE of 3.44 mL·kg⁻¹·min⁻¹ (approximately 1 MET) using all 5 variables in the equation (gender, age, BMI, PFA, PAR). When the PFA variable is removed from the model, leaving only 4 variables for the prediction (gender, age, BMI, PAR), the SEE increases to 4.20 mL·kg⁻¹·min⁻¹. The increase in the error term indicates that the model excluding PFA is less accurate in predicting VO2max. This is confirmed by the decrease in the value for R (see discussion above). The researchers compare their model of prediction with that of George, Stone, and Burkett, 7 indicating that their model is as accurate. It is not advisable to compare models based on the SEE if the data were collected from different samples, as they were in these 2 studies. That type of comparison should be made using SEE%. Bradshaw and colleagues 3 report SEE% for their model (8.62%), but do not report values from other models in making comparisons.

Some advocate the use of statistics derived from the predicted residual sum of squares ( PRESS ) as a means of selecting predictors. 2 , 4 , 16 These statistics are used more often in cross-validation of models and will be discussed in greater detail later.

ASSESSING STABILITY OF THE MODEL FOR PREDICTION

Once the most efficient and accurate model for prediction has been determined, it is prudent that the model be assessed for stability. A model, or equation, is said to be “stable” if it can be applied to different samples from the same population without losing the accuracy of the prediction. This is accomplished through cross-validation of the model. Cross-validation determines how well the prediction model developed using one sample performs in another sample from the same population. Several methods can be employed for cross-validation, including the use of 2 independent samples, split samples, and PRESS -related statistics developed from the same sample.

Using 2 independent samples involves random selection of 2 groups from the same population. One group becomes the “training” or “exploratory” group used for establishing the model of prediction. 5 The second group, the “confirmatory” or “validatory” group, is used to assess the model for stability. The researcher compares R² values from the 2 groups, and “shrinkage,” the difference between the two values for R², is used as an indicator of model stability. There is no firm rule for interpreting the difference, but Kleinbaum, Kupper, and Muller 12 suggest that “shrinkage” values of less than 0.10 indicate a stable model. While preferable, independent samples are rarely used because of cost considerations.

A similar technique of cross-validation uses split samples. Once the sample has been selected from the population, it is randomly divided into 2 subgroups. One subgroup becomes the “exploratory” group and the other is used as the “validatory” group. Again, values for R² are compared and model stability is assessed by calculating “shrinkage.”
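The split-sample procedure can be sketched as follows. Everything here is an assumption made for illustration: the data are synthetic, a single predictor is used to keep the code short, and R² in each group is computed as 1 − SS(residual)/SS(total) using the equation fitted on the exploratory group.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(observed, predicted):
    """1 - SS_residual / SS_total for a set of predictions."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic sample (not real data): age in years, VO2max in mL/kg/min.
rng = random.Random(0)
ages = [rng.uniform(18, 65) for _ in range(100)]
vo2 = [55.0 - 0.3 * a + rng.gauss(0, 4) for a in ages]

# Randomly divide the sample into exploratory and validatory subgroups.
indices = list(range(100))
rng.shuffle(indices)
explore, validate = indices[:50], indices[50:]

# Develop the equation on the exploratory subgroup only.
intercept, slope = fit_line([ages[i] for i in explore], [vo2[i] for i in explore])

def group_r2(group):
    observed = [vo2[i] for i in group]
    predicted = [intercept + slope * ages[i] for i in group]
    return r_squared(observed, predicted)

r2_explore = group_r2(explore)
r2_validate = group_r2(validate)
shrinkage = r2_explore - r2_validate
print(f"R^2 exploratory = {r2_explore:.3f}, validatory = {r2_validate:.3f}")
print("shrinkage below 0.10" if shrinkage < 0.10 else "shrinkage of 0.10 or more")
```

The final comparison applies the 0.10 guideline attributed above to Kleinbaum, Kupper, and Muller.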

Holiday, Ballard, and McKeown 9 advocate the use of PRESS-related statistics for cross-validation of regression models as a means of dealing with the problems of data-splitting. The PRESS method is a jackknife analysis that is used to address the issue of estimate bias associated with the use of small sample sizes. 13 In general, a jackknife analysis calculates the desired test statistic multiple times with individual cases omitted from the calculations. In the case of the PRESS method, residuals, or the differences between the actual values of the criterion for each individual and the predicted value using the formula derived with the individual's data removed from the prediction, are calculated. The PRESS statistic is the sum of the squares of the residuals derived from these calculations and is similar to the sum of squares for the error (SS error ) used in analysis of variance (ANOVA). Myers 14 discusses the use of the PRESS statistic and describes in detail how it is calculated. The reader is referred to this text and the article by Holiday, Ballard, and McKeown 9 for additional information.

Once determined, the PRESS statistic can be used to calculate a modified form of R² and the SEE. The PRESS-based R² is calculated using the following formula: $R^2_{\mathrm{PRESS}} = 1 - \mathrm{PRESS}/SS_{\mathrm{total}}$, where $SS_{\mathrm{total}}$ equals the sum of squares for the original regression equation. 14 The standard error of the estimate for PRESS is calculated as follows: $SEE_{\mathrm{PRESS}} = \sqrt{\mathrm{PRESS}/n}$, where n equals the number of individual cases. 14 The smaller the difference between the 2 values for R² and for the SEE, the more stable the model for prediction. Bradshaw et al 3 used this technique in their investigation. They reported a value for $R^2_{\mathrm{PRESS}}$ of .83, a decrease of .04 from R² for their prediction model. Using the standard set by Kleinbaum, Kupper, and Muller, 12 the model developed by these researchers would appear to have stability, meaning it could be used for prediction in samples from the same population. This is further supported by the small difference between the SEE and the $SEE_{\mathrm{PRESS}}$, 3.44 and 3.63 mL·kg⁻¹·min⁻¹, respectively.
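The leave-one-out logic behind PRESS can be sketched with a single predictor. The sketch makes several assumptions: SS total is taken as the total sum of squares about the criterion mean, the PRESS-based SEE is taken as the square root of PRESS divided by n, and the data and function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def press_statistics(x, y):
    """Return PRESS, the PRESS-based R^2, and the PRESS-based SEE."""
    n = len(x)
    press = 0.0
    for i in range(n):
        # Refit the equation with case i omitted, then predict case i.
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return press, 1.0 - press / ss_total, math.sqrt(press / n)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

press, r2_press, see_press = press_statistics(x, y)

# Ordinary fit on all cases, for comparison with the PRESS-based values.
b0, b1 = fit_line(x, y)
ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_total = sum((yi - sum(y) / len(y)) ** 2 for yi in y)
r2_full = 1.0 - ss_res / ss_total
print(f"R^2 = {r2_full:.4f}, PRESS R^2 = {r2_press:.4f}, PRESS SEE = {see_press:.3f}")
```

Each case is predicted by an equation that never saw it, so PRESS is at least as large as the ordinary residual sum of squares and the PRESS-based R² is never larger than R².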

COMPARING TWO DIFFERENT PREDICTION MODELS

A comparison of 2 different models for prediction may help to clarify the use of regression analysis in prediction. Table 1 presents data from 2 studies and will be used in the following discussion.

Table 1. Comparison of Two Non-exercise Models for Predicting CRF

Variables                          Heil et al (N = 374)     Bradshaw et al (N = 100)
Intercept                          36.580                   48.073
Gender (male = 1, female = 0)      3.706                    6.178
Age (years)                        0.558                    −0.246
Age²                               −7.81 E-3                n/a
Percent body fat                   −0.541                   n/a
Body mass index (kg·m⁻²)           n/a                      −0.619
Activity code (0–7)                1.347                    n/a
Physical activity rating (0–10)    n/a                      0.671
Perceived functional ability       n/a                      0.712
R (R²)                             .88 (.77)                .93 (.87)
SEE                                4.90 mL·kg⁻¹·min⁻¹       3.44 mL·kg⁻¹·min⁻¹
SEE%                               12.7%                    8.6%

n/a = predictor not included in that model.

As noted above, the first step is to select an appropriate criterion, or outcome measure. Bradshaw et al 3 selected VO2max as their criterion for measuring cardiorespiratory fitness. Heil et al 8 used VO2peak. These 2 measures are often considered to be the same; however, VO2peak assumes that conditions for measuring maximum oxygen consumption were not met. 17 It would be optimal to compare models based on the same criterion, but that is not essential, especially since both criteria measure cardiorespiratory fitness in much the same way.

The second step involves selection of variables for prediction. As can be seen in Table 1, both groups of investigators selected 5 variables to use in their model. The 5 variables selected by Bradshaw et al 3 provide a better prediction based on the values for R² (.87 and .77), indicating that their model accounts for more variance (87% versus 77%) in the prediction than the model of Heil et al. 8 It should also be noted that the SEE calculated in the Bradshaw 3 model (3.44 mL·kg⁻¹·min⁻¹) is less than that reported by Heil et al 8 (4.90 mL·kg⁻¹·min⁻¹). Remember, however, that comparison of the SEE should only be made when both models are developed using samples from the same population. Comparing predictions developed from different populations can be accomplished using the SEE%. Review of values for the SEE% in Table 1 would seem to indicate that the model developed by Bradshaw et al 3 is more accurate because the percentage of the mean value for VO2max represented by error is less than that reported by Heil et al. 8 In summary, the Bradshaw 3 model would appear to be more efficient, accounting for more variance in the prediction using the same number of variables. It would also appear to be more accurate based on comparison of the SEE%.
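To show how a clinician might apply the two equations, the sketch below turns the Table 1 coefficients into functions. The assignment of each coefficient to a model follows one reading of the table (5 predictors per model), so check it against the original publications before any real use. The example person and all function and parameter names are hypothetical.

```python
def predict_vo2max_bradshaw(gender, age, bmi, pfa, par):
    """Bradshaw et al model from Table 1; gender is 1 for male, 0 for female."""
    return (48.073 + 6.178 * gender - 0.246 * age - 0.619 * bmi
            + 0.712 * pfa + 0.671 * par)

def predict_vo2peak_heil(gender, age, percent_fat, activity_code):
    """Heil et al model from Table 1, including the age-squared term."""
    return (36.580 + 3.706 * gender + 0.558 * age - 7.81e-3 * age ** 2
            - 0.541 * percent_fat + 1.347 * activity_code)

# Hypothetical 30-year-old man; every input value is invented for illustration.
bradshaw = predict_vo2max_bradshaw(gender=1, age=30, bmi=24.0, pfa=15, par=5)
heil = predict_vo2peak_heil(gender=1, age=30, percent_fat=18.0, activity_code=4)
print(f"Bradshaw et al: {bradshaw:.1f} mL/kg/min")
print(f"Heil et al:     {heil:.1f} mL/kg/min")
```

The two predictions need not agree, since the models use different predictors and criteria and were developed on different samples.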

The 2 models cannot be compared based on stability of the models. Each set of researchers used different methods for cross-validation. Both models, however, appear to be relatively stable based on the data presented. A clinician can assume that either model would perform fairly well when applied to samples from the same populations as those used by the investigators.

The purpose of this brief review has been to demystify regression analysis for prediction by explaining it in simple terms and to demonstrate its use. When reviewing research articles in which regression analysis has been used for prediction, physical therapists should ensure that: (1) the criterion chosen for the study is appropriate and meets the standards for reliability and validity, (2) the processes used by the investigators to assess both model efficiency and accuracy are appropriate, (3) the predictors selected for use in the model are reasonable based on theory or previous research, and (4) the investigators assessed model stability through a process of cross-validation, providing the opportunity for others to use the prediction model in different samples drawn from the same population.

Multiple Regression Analysis Example with Conceptual Framework

Multiple regression analysis is a fairly common statistical tool. Many graduate students find it too complicated to understand, but it is not that difficult to carry out, especially now that computers are everyday household items. With it, you can quickly analyze more than two variables in your research.

Multiple regression is often confused with multivariate regression. Multivariate regression, while also using several variables, deals with more than one dependent variable . Karen Grace-Martin clearly explains the difference in her post on the difference between the Multiple Regression Model and Multivariate Regression Model .


Statistical Software Applications Used in Computing Multiple Regression Analysis

Multiple regression analysis is usually run in dedicated statistical software such as the popular Statistical Package for the Social Sciences (SPSS), Statistica, or Microstat, or in open-source applications such as SOFA Statistics and JASP, among other sophisticated statistical packages.

However, a standard spreadsheet application like Microsoft Excel can also compute and model the relationship between the dependent variable and a set of predictor (independent) variables. To do this, you must first activate the statistical tools that ship with MS Excel.

Activating MS Excel

Multiple Regression Analysis Example

The study pertains to identifying the factors predicting a current problem among high school students, the long hours they spend online for a variety of reasons. The purpose is to address many parents’ concerns about their difficulty of weaning their children away from the lures of online gaming, social networking, and other engaging virtual activities.

Review of Literature on Internet Use and Its Effect on Children

Given that there is a need to use a computer to analyze multiple variable data, a principal who is nearing retirement was “forced” to buy a laptop, as she had none. Anyhow, she is very much open-minded and performed the class activities that require data analysis with much enthusiasm.

The Research on High School Students’ Use of the Internet

“Is there a significant relationship between the total number of hours spent online and the students’ age, gender, relationship with their mother, and relationship with their father?”

Although many studies have identified factors that influence the use of the internet, it is standard practice to include the respondents’ profile among the set of predictor or independent variables. Hence, the standard variables age and gender are included in the multiple regression analysis.

Findings of the Research Using Multiple Regression Analysis

The number of hours students spend online relates significantly to the number of hours a parent, specifically the mother, spends interacting with her child. These two factors are inversely (negatively) correlated.

But establishing a close bond between mother and child is a good start. Undertaking more investigations along this research concern will help strengthen the findings of this study.

Thus, this example shows how multiple regression analysis can narrow the search for solutions by focusing attention on the most influential factors.


Regression Analysis

Reference work entry. First online: 03 December 2021

Bernd Skiera, Jochen Reiner & Sönke Albers

Linear regression analysis is one of the most important statistical methods. It examines the linear relationship between a metric-scaled dependent variable (also called endogenous, explained, response, or predicted variable) and one or more metric-scaled independent variables (also called exogenous, explanatory, control, or predictor variables). We illustrate how regression analysis works and how it supports marketing decisions, e.g., the derivation of an optimal marketing mix. We also outline how to use linear regression analysis to estimate nonlinear functions such as a multiplicative sales response function. Furthermore, we show how to use the results of a regression to calculate elasticities and to identify outliers, and we discuss in detail the problems that occur in case of autocorrelation, multicollinearity, and heteroscedasticity. We use a numerical example to illustrate all calculations in detail and to outline the problems that occur in case of endogeneity.
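As a rough illustration of how a linear regression can estimate a multiplicative response function, the sketch below assumes the form sales = a · advertising^b. Taking logarithms of both sides gives ln(sales) = ln(a) + b · ln(advertising), which is linear, and the slope b is the elasticity. The functional form, variable names, and data are assumptions made for this example and are not taken from the chapter.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data generated exactly from sales = 10 * advertising ** 0.5.
advertising = [1.0, 2.0, 4.0, 8.0]
sales = [10.0 * a ** 0.5 for a in advertising]

# Regress log(sales) on log(advertising).
log_intercept, elasticity = fit_line(
    [math.log(a) for a in advertising],
    [math.log(s) for s in sales],
)
scale = math.exp(log_intercept)  # back-transform the intercept to recover a
print(f"scale a = {scale:.2f}, elasticity b = {elasticity:.2f}")
```

An elasticity of 0.5 would mean that a 1% increase in advertising is associated with roughly a 0.5% increase in sales under this model.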



Author information

Authors and affiliations.

Goethe University Frankfurt, Frankfurt, Germany

Bernd Skiera & Jochen Reiner

Kuehne Logistics University, Hamburg, Germany

Sönke Albers


Corresponding author

Correspondence to Bernd Skiera .

Editor information

Editors and affiliations.

Department of Business-to-Business Marketing, Sales, and Pricing, University of Mannheim, Mannheim, Germany

Christian Homburg

Department of Marketing & Sales Research Group, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Martin Klarmann

Marketing & Sales Department, University of Mannheim, Mannheim, Germany

Arnd Vomberg

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this entry

Cite this entry.

Skiera, B., Reiner, J., Albers, S. (2022). Regression Analysis. In: Homburg, C., Klarmann, M., Vomberg, A. (eds) Handbook of Market Research. Springer, Cham. https://doi.org/10.1007/978-3-319-57413-4_17


Print ISBN : 978-3-319-57411-0

Online ISBN : 978-3-319-57413-4

eBook Packages : Business and Management Reference Module Humanities and Social Sciences Reference Module Business, Economics and Social Sciences



Regression Analysis – Methods, Types and Examples


Regression Analysis

Regression analysis is a set of statistical processes for estimating the relationships among variables . It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables (or ‘predictors’).

Regression Analysis Methodology

Here is a general methodology for performing regression analysis:

  • Define the research question: Clearly state the research question or hypothesis you want to investigate. Identify the dependent variable (also called the response variable or outcome variable) and the independent variables (also called predictor variables or explanatory variables) that you believe are related to the dependent variable.
  • Collect data: Gather the data for the dependent variable and independent variables. Ensure that the data is relevant, accurate, and representative of the population or phenomenon you are studying.
  • Explore the data: Perform exploratory data analysis to understand the characteristics of the data, identify any missing values or outliers, and assess the relationships between variables through scatter plots, histograms, or summary statistics.
  • Choose the regression model: Select an appropriate regression model based on the nature of the variables and the research question. Common regression models include linear regression, multiple regression, logistic regression, polynomial regression, and time series regression, among others.
  • Assess assumptions: Check the assumptions of the regression model. Some common assumptions include linearity (the relationship between variables is linear), independence of errors, homoscedasticity (constant variance of errors), and normality of errors. Violation of these assumptions may require additional steps or alternative models.
  • Estimate the model: Use a suitable method to estimate the parameters of the regression model. The most common method is ordinary least squares (OLS), which minimizes the sum of squared differences between the observed and predicted values of the dependent variable.
  • Interpret the results: Analyze the estimated coefficients, p-values, confidence intervals, and goodness-of-fit measures (e.g., R-squared) to interpret the results. Determine the significance and direction of the relationships between the independent variables and the dependent variable.
  • Evaluate model performance: Assess the overall performance of the regression model using appropriate measures, such as R-squared, adjusted R-squared, and root mean squared error (RMSE). These measures indicate how well the model fits the data and how much of the variation in the dependent variable is explained by the independent variables.
  • Test assumptions and diagnose problems: Check the residuals (the differences between observed and predicted values) for any patterns or deviations from assumptions. Conduct diagnostic tests, such as examining residual plots, testing for multicollinearity among independent variables, and assessing heteroscedasticity or autocorrelation, if applicable.
  • Make predictions and draw conclusions: Once you have a satisfactory model, use it to make predictions on new or unseen data. Draw conclusions based on the results of the analysis, considering the limitations and potential implications of the findings.
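
The estimation, evaluation, and prediction steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a full workflow: the data values, the function names `fit_simple_ols` and `evaluate`, and the new input value 6.0 are all invented for the example, and only one predictor is used.

```python
from math import sqrt

def fit_simple_ols(x, y):
    """Estimate intercept b0 and slope b1 by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def evaluate(x, y, b0, b1, n_predictors=1):
    """Return R-squared, adjusted R-squared, and RMSE for the fitted line."""
    n = len(y)
    mean_y = sum(y) / n
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)       # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)  # total variation
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = sqrt(ss_res / n)
    return r2, adj_r2, rmse

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
r2, adj_r2, rmse = evaluate(x, y, b0, b1)
prediction = b0 + b1 * 6.0  # predict Y for a new, unseen X value
```

In practice a statistics package would also report p-values and confidence intervals and produce the residual plots mentioned in the diagnostic step; those are omitted here to keep the sketch short.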

Types of Regression Analysis

Types of Regression Analysis are as follows:

Linear Regression

Linear regression is the most basic and widely used form of regression analysis. It models the linear relationship between a dependent variable and one or more independent variables. The goal is to find the best-fitting line that minimizes the sum of squared differences between observed and predicted values.

Multiple Regression

Multiple regression extends linear regression by incorporating two or more independent variables to predict the dependent variable. It allows for examining the simultaneous effects of multiple predictors on the outcome variable.

Polynomial Regression

Polynomial regression models non-linear relationships between variables by adding polynomial terms (e.g., squared or cubic terms) to the regression equation. It can capture curved or nonlinear patterns in the data.
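
As a sketch of what "adding polynomial terms" means, a polynomial model of degree k in a single predictor can be written as follows. The degree k is notation assumed here for illustration. The model remains linear in the coefficients, so it can still be estimated with ordinary least squares.

```latex
Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \dots + \beta_k X^k + \varepsilon
```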

Logistic Regression

Logistic regression is used when the dependent variable is binary or categorical. It models the probability of the occurrence of a certain event or outcome based on the independent variables. Logistic regression estimates the coefficients using the logistic function, which transforms the linear combination of predictors into a probability.

Ridge Regression and Lasso Regression

Ridge regression and Lasso regression are techniques used for addressing multicollinearity (high correlation between independent variables) and variable selection. Both methods add a penalty term to the least-squares objective that shrinks the coefficients toward zero. Ridge regression uses L2 regularization, which shrinks coefficients but generally keeps all variables in the model, while Lasso regression uses L1 regularization, which can set some coefficients exactly to zero and so eliminate less important variables.
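
The two penalties can be written out explicitly. In the sketch below, the tuning parameter λ ≥ 0 and the indices i (observations) and j (predictors) are notation assumed for illustration; larger values of λ give stronger shrinkage.

```latex
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \sum_{i} \Big( y_i - \beta_0 - \sum_{j} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j} \beta_j^2

\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \sum_{i} \Big( y_i - \beta_0 - \sum_{j} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j} \lvert \beta_j \rvert
```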

Time Series Regression

Time series regression analyzes the relationship between a dependent variable and independent variables when the data is collected over time. It accounts for autocorrelation and trends in the data and is used in forecasting and studying temporal relationships.

Nonlinear Regression

Nonlinear regression models are used when the relationship between the dependent variable and independent variables is not linear. These models can take various functional forms and require estimation techniques different from those used in linear regression.

Poisson Regression

Poisson regression is employed when the dependent variable represents count data. It models the relationship between the independent variables and the expected count, assuming a Poisson distribution for the dependent variable.

Generalized Linear Models (GLM)

GLMs are a flexible class of regression models that extend the linear regression framework to handle different types of dependent variables, including binary, count, and continuous variables. GLMs incorporate various probability distributions and link functions.
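
In general form, a GLM relates the expected value of the dependent variable to the linear combination of predictors through a link function g. The symbols g and μ are notation assumed here; the three links shown are the conventional choices for linear, logistic, and Poisson regression respectively.

```latex
g\big(E[Y \mid X_1, \dots, X_n]\big) = \beta_0 + \beta_1 X_1 + \dots + \beta_n X_n

\text{identity: } g(\mu) = \mu \qquad
\text{logit: } g(\mu) = \log\frac{\mu}{1 - \mu} \qquad
\text{log: } g(\mu) = \log \mu
```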

Regression Analysis Formulas

Regression analysis involves estimating the parameters of a regression model to describe the relationship between the dependent variable (Y) and one or more independent variables (X). Here are the basic formulas for linear regression, multiple regression, and logistic regression:

Linear Regression:

Simple Linear Regression Model: Y = β0 + β1X + ε

Multiple Linear Regression Model: Y = β0 + β1X1 + β2X2 + … + βnXn + ε

In both formulas:

  • Y represents the dependent variable (response variable).
  • X represents the independent variable(s) (predictor variable(s)).
  • β0, β1, β2, …, βn are the regression coefficients or parameters that need to be estimated.
  • ε represents the error term, the part of Y that the predictors do not explain. Its sample estimate is the residual (the difference between the observed and predicted values).

Multiple Regression:

Multiple regression extends the concept of simple linear regression by including multiple independent variables.

Multiple Regression Model: Y = β0 + β1X1 + β2X2 + … + βnXn + ε

The formulas are similar to those in linear regression, with the addition of more independent variables.
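
For reference, the ordinary least squares estimates of the coefficients in the multiple regression model have a standard closed form. The matrix notation is an assumption introduced here: the design matrix (bold X) has one row per observation and a leading column of ones for the intercept, bold y is the vector of observed outcomes, and the expression requires the matrix product being inverted to be invertible (which fails under perfect multicollinearity).

```latex
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top} \mathbf{X})^{-1} \mathbf{X}^{\top} \mathbf{y}
```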

Logistic Regression:

Logistic regression is used when the dependent variable is binary or categorical. The logistic regression model applies a logistic or sigmoid function to the linear combination of the independent variables.

Logistic Regression Model: p = 1 / (1 + e^-(β0 + β1X1 + β2X2 + … + βnXn))

In the formula:

  • p represents the probability of the event occurring (e.g., the probability of success or belonging to a certain category).
  • X1, X2, …, Xn represent the independent variables.
  • e is the base of the natural logarithm.

The logistic function ensures that the predicted probabilities lie between 0 and 1, allowing for binary classification.
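
The logistic formula above can be evaluated directly. The sketch below uses made-up coefficients and feature values, and the function names and the 0.5 classification threshold are assumptions chosen for illustration; in a real analysis the coefficients would be estimated from data (typically by maximum likelihood).

```python
from math import exp

def predict_probability(intercept, coefficients, features):
    """Apply the logistic function to the linear combination of predictors."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + exp(-linear))

def classify(probability, threshold=0.5):
    """Turn a predicted probability into a binary class label."""
    return 1 if probability >= threshold else 0

# Hypothetical coefficients and one observation, invented for illustration.
intercept = -1.0
coefficients = [0.8, -0.5]
features = [2.0, 1.0]

p = predict_probability(intercept, coefficients, features)  # linear part = 0.1
label = classify(p)
```

Whatever the coefficient values, the returned probability always lies strictly between 0 and 1, which is the property the text describes.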

Regression Analysis Examples

Regression Analysis Examples are as follows:

  • Stock Market Prediction: Regression analysis can be used to predict stock prices based on various factors such as historical prices, trading volume, news sentiment, and economic indicators. Traders and investors can use this analysis to make informed decisions about buying or selling stocks.
  • Demand Forecasting: In retail and e-commerce, real-time regression analysis can help forecast demand for products. By analyzing historical sales data along with real-time data such as website traffic, promotional activities, and market trends, businesses can adjust their inventory levels and production schedules to meet customer demand more effectively.
  • Energy Load Forecasting: Utility companies often use real-time regression analysis to forecast electricity demand. By analyzing historical energy consumption data, weather conditions, and other relevant factors, they can predict future energy loads. This information helps them optimize power generation and distribution, ensuring a stable and efficient energy supply.
  • Online Advertising Performance: Regression analysis can be used to assess the performance of online advertising campaigns. By analyzing real-time data on ad impressions, click-through rates, conversion rates, and other metrics, advertisers can adjust their targeting, messaging, and ad placement strategies to maximize their return on investment.
  • Predictive Maintenance: Regression analysis can be applied to predict equipment failures or maintenance needs. By continuously monitoring sensor data from machines or vehicles, regression models can identify patterns or anomalies that indicate potential failures. This enables proactive maintenance, reducing downtime and optimizing maintenance schedules.
  • Financial Risk Assessment: Real-time regression analysis can help financial institutions assess the risk associated with lending or investment decisions. By analyzing real-time data on factors such as borrower financials, market conditions, and macroeconomic indicators, regression models can estimate the likelihood of default or assess the risk-return tradeoff for investment portfolios.

Importance of Regression Analysis

Importance of Regression Analysis is as follows:

  • Relationship Identification: Regression analysis helps in identifying and quantifying the relationship between a dependent variable and one or more independent variables. It allows us to determine how changes in independent variables impact the dependent variable. This information is crucial for decision-making, planning, and forecasting.
  • Prediction and Forecasting: Regression analysis enables us to make predictions and forecasts based on the relationships identified. By estimating the values of the dependent variable using known values of independent variables, regression models can provide valuable insights into future outcomes. This is particularly useful in business, economics, finance, and other fields where forecasting is vital for planning and strategy development.
  • Causality Assessment: While correlation does not imply causation, regression analysis provides a framework for assessing causality by considering the direction and strength of the relationship between variables. It allows researchers to control for other factors and assess the impact of a specific independent variable on the dependent variable. This helps in determining the causal effect and identifying significant factors that influence outcomes.
  • Model Building and Variable Selection: Regression analysis aids in model building by determining the most appropriate functional form of the relationship between variables. It helps researchers select relevant independent variables and eliminate irrelevant ones, reducing complexity and improving model accuracy. This process is crucial for creating robust and interpretable models.
  • Hypothesis Testing: Regression analysis provides a statistical framework for hypothesis testing. Researchers can test the significance of individual coefficients, assess the overall model fit, and determine if the relationship between variables is statistically significant. This allows for rigorous analysis and validation of research hypotheses.
  • Policy Evaluation and Decision-Making: Regression analysis plays a vital role in policy evaluation and decision-making processes. By analyzing historical data, researchers can evaluate the effectiveness of policy interventions and identify the key factors contributing to certain outcomes. This information helps policymakers make informed decisions, allocate resources effectively, and optimize policy implementation.
  • Risk Assessment and Control: Regression analysis can be used for risk assessment and control purposes. By analyzing historical data, organizations can identify risk factors and develop models that predict the likelihood of certain outcomes, such as defaults, accidents, or failures. This enables proactive risk management, allowing organizations to take preventive measures and mitigate potential risks.

When to Use Regression Analysis

  • Prediction: Regression analysis is often employed to predict the value of the dependent variable based on the values of independent variables. For example, you might use regression to predict sales based on advertising expenditure, or to predict a student’s academic performance based on variables like study time, attendance, and previous grades.
  • Relationship analysis: Regression can help determine the strength and direction of the relationship between variables. It can be used to examine whether there is a linear association between variables, identify which independent variables have a significant impact on the dependent variable, and quantify the magnitude of those effects.
  • Causal inference: Regression analysis can be used to explore cause-and-effect relationships by controlling for other variables. For example, in a medical study, you might use regression to determine the impact of a specific treatment while accounting for other factors like age, gender, and lifestyle.
  • Forecasting: Regression models can be utilized to forecast future trends or outcomes. By fitting a regression model to historical data, you can make predictions about future values of the dependent variable based on changes in the independent variables.
  • Model evaluation: Regression analysis can be used to evaluate the performance of a model or test the significance of variables. You can assess how well the model fits the data, determine if additional variables improve the model’s predictive power, or test the statistical significance of coefficients.
  • Data exploration: Regression analysis can help uncover patterns and insights in the data. By examining the relationships between variables, you can gain a deeper understanding of the data set and identify potential patterns, outliers, or influential observations.

Applications of Regression Analysis

Here are some common applications of regression analysis:

  • Economic Forecasting: Regression analysis is frequently employed in economics to forecast variables such as GDP growth, inflation rates, or stock market performance. By analyzing historical data and identifying the underlying relationships, economists can make predictions about future economic conditions.
  • Financial Analysis: Regression analysis plays a crucial role in financial analysis, such as predicting stock prices or evaluating the impact of financial factors on company performance. It helps analysts understand how variables like interest rates, company earnings, or market indices influence financial outcomes.
  • Marketing Research: Regression analysis helps marketers understand consumer behavior and make data-driven decisions. It can be used to predict sales based on advertising expenditures, pricing strategies, or demographic variables. Regression models provide insights into which marketing efforts are most effective and help optimize marketing campaigns.
  • Health Sciences: Regression analysis is extensively used in medical research and public health studies. It helps examine the relationship between risk factors and health outcomes, such as the impact of smoking on lung cancer or the relationship between diet and heart disease. Regression analysis also helps in predicting health outcomes based on various factors like age, genetic markers, or lifestyle choices.
  • Social Sciences: Regression analysis is widely used in social sciences like sociology, psychology, and education research. Researchers can investigate the impact of variables like income, education level, or social factors on various outcomes such as crime rates, academic performance, or job satisfaction.
  • Operations Research: Regression analysis is applied in operations research to optimize processes and improve efficiency. For example, it can be used to predict demand based on historical sales data, determine the factors influencing production output, or optimize supply chain logistics.
  • Environmental Studies: Regression analysis helps in understanding and predicting environmental phenomena. It can be used to analyze the impact of factors like temperature, pollution levels, or land use patterns on phenomena such as species diversity, water quality, or climate change.
  • Sports Analytics: Regression analysis is increasingly used in sports analytics to gain insights into player performance, team strategies, and game outcomes. It helps analyze the relationship between various factors like player statistics, coaching strategies, or environmental conditions and their impact on game outcomes.

Advantages and Disadvantages of Regression Analysis

Advantages of Regression Analysis

  • Provides a quantitative measure of the relationship between variables
  • Helps in predicting and forecasting outcomes based on historical data
  • Identifies and measures the significance of independent variables on the dependent variable
  • Provides estimates of the coefficients that represent the strength and direction of the relationship between variables
  • Allows for hypothesis testing to determine the statistical significance of the relationship
  • Can handle both continuous and categorical variables
  • Offers a visual representation of the relationship through the use of scatter plots and regression lines
  • Provides insights into the marginal effects of independent variables on the dependent variable

Disadvantages of Regression Analysis

  • Assumes a linear relationship between variables, which may not always hold true
  • Requires a large sample size to produce reliable results
  • Assumes no multicollinearity, meaning that independent variables should not be highly correlated with each other
  • Assumes the absence of outliers or influential data points
  • Can be sensitive to the inclusion or exclusion of certain variables, leading to different results
  • Assumes the independence of observations, which may not hold true in some cases
  • May not capture complex non-linear relationships between variables without appropriate transformations
  • Requires the assumption of homoscedasticity, meaning that the variance of errors is constant across all levels of the independent variables

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

A Study on Multiple Linear Regression Analysis

  • December 2013
  • Procedia - Social and Behavioral Sciences 106:234–240
  • CC BY-NC-ND 3.0

Gülden Kaya Uyanık at Sakarya University

Neşe Güler at Sakarya University

Understanding and interpreting regression analysis

Volume 24, Issue 4

  • http://orcid.org/0000-0002-7839-8130 Parveen Ali 1 , 2 ,
  • http://orcid.org/0000-0003-0157-5319 Ahtisham Younas 3 , 4
  • 1 School of Nursing and Midwifery , University of Sheffield , Sheffield , South Yorkshire , UK
  • 2 Sheffield University Interpersonal Violence Research Group , The University of Sheffield SEAS , Sheffield , UK
  • 3 Faculty of Nursing , Memorial University of Newfoundland , St. John's , Newfoundland and Labrador , Canada
  • 4 Swat College of Nursing , Mingora, Swat , Pakistan
  • Correspondence to Ahtisham Younas, Memorial University of Newfoundland, St. John's, NL A1C 5S7, Canada; ay6133{at}mun.ca

https://doi.org/10.1136/ebnurs-2021-103425

  • statistics & research methods

Introduction

A nurse educator is interested in finding out the academic and non-academic predictors of success in nursing students. Given the complexity of educational and clinical learning environments, many demographic, clinical and academic factors (age, gender, previous educational training, personal stressors, learning demands, motivation, assignment workload, etc) influence nursing students’ success, so she was able to list various potential contributing factors relatively easily. Nevertheless, not all of the identified factors will be plausible predictors of increased success. Therefore, she could use a powerful statistical procedure called regression analysis to identify whether the likelihood of increased success is influenced by factors such as age, stressors, learning demands, motivation and education.

What is regression?

Purposes of regression analysis

Regression analysis has four primary purposes: description, estimation, prediction and control. 1 , 2 By description, regression can explain the relationship between dependent and independent variables. Estimation means that by using the observed values of independent variables, the value of the dependent variable can be estimated. 2 Regression analysis can be useful for predicting the outcomes and changes in dependent variables based on the relationships between dependent and independent variables. Finally, regression enables controlling for the effect of one or more independent variables while investigating the relationship of one independent variable with the dependent variable. 1

Types of regression analyses

There are commonly three types of regression analyses, namely, linear, logistic and multiple regression. The differences among these types are outlined in table 1 in terms of their purpose, nature of dependent and independent variables, underlying assumptions, and nature of curve. 1 , 3 However, more detailed discussion for linear regression is presented as follows.

Comparison of linear, logistic and multiple regression

Linear regression and interpretation

Linear regression analysis involves examining the relationship between one independent variable and one dependent variable. Statistically, the relationship between one independent variable (x) and a dependent variable (y) is expressed as: y = β0 + β1x + ε. In this equation, β0 is the y intercept and refers to the estimated value of y when x is equal to 0. The coefficient β1 is the regression coefficient and denotes the estimated increase in the dependent variable for every unit increase in the independent variable. The symbol ε is a random error component and signifies the imprecision of regression, indicating that, in actual practice, the independent variables cannot perfectly predict the change in any dependent variable. 1 Multiple linear regression follows the same logic as univariate linear regression except that (a) in multiple regression, there is more than one independent variable and (b) there should be non-collinearity among the independent variables.

Factors affecting regression

Linear and multiple regression analyses are affected by factors, namely, sample size, missing data and the nature of sample. 2

A small sample size may only demonstrate connections among variables with a strong relationship. Therefore, sample size must be chosen based on the number of independent variables and the expected strength of the relationship.

Many missing values in the data set may affect the sample size. Therefore, all the missing values should be adequately dealt with before conducting regression analyses.

The subsamples within the larger sample may mask the actual effect of independent and dependent variables. Therefore, if subsamples are predefined, a regression within the sample could be used to detect true relationships. Otherwise, the analysis should be undertaken on the whole sample.

Building on the nurse educator’s research interest described at the beginning, let us consider a study by Ali and Naylor. 4 They were interested in identifying the academic and non-academic factors which predict the academic success of nursing diploma students. This purpose is consistent with one of the above-mentioned purposes of regression analysis (ie, prediction). Ali and Naylor’s chosen academic independent variables were preadmission qualification, previous academic performance and school type, and the non-academic variables were age, gender, marital status and time gap. To achieve their purpose, they collected data from 628 nursing students aged 15–34 years. They used both linear and multiple regression analyses to identify the predictors of student success. For analysis, they examined the relationship of academic and non-academic variables across different years of study and noted that academic factors accounted for 36.6%, 44.3% and 50.4% of the variability in academic success of students in year 1, year 2 and year 3, respectively. 4

Ali and Naylor presented the relationship among these variables using scatter plots, which are commonly used graphs for data display in regression analysis (see examples of various scatter plots in figure 1). 4 In a scatter plot, the clustering of the dots denotes the strength of the relationship, whereas the direction indicates the nature of the relationship among variables as positive (ie, an increase in one variable is accompanied by an increase in the other) or negative (ie, an increase in one variable is accompanied by a decrease in the other).

An Example of Scatter Plot for Regression.

Table 2 presents the results of regression analysis for academic and non-academic variables for year 4 students’ success. The significant predictors of student success are denoted with a significant p value. For every significant predictor, the beta value indicates the percentage increase in students’ academic success with a one unit increase in the variable.

Regression model for the final year students (N=343)

Conclusions

Regression analysis is a powerful and useful statistical procedure with many implications for nursing research. It enables researchers to describe, predict and estimate the relationships and draw plausible conclusions about the interrelated variables in relation to any studied phenomena. Regression also allows for controlling one or more variables when researchers are interested in examining the relationship among specific variables. Some of the key considerations are presented that may be useful for researchers undertaking regression analysis. While planning and conducting regression analysis, researchers should consider the type and number of dependent and independent variables as well as the nature and size of sample. Choosing a wrong type of regression analysis with small sample may result in erroneous conclusions about the studied phenomenon.

Ethics statements

Patient consent for publication.

Not required.

  • Montgomery DC ,
  • Schneider A ,

Twitter @parveenazamali, @Ahtisham04

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Provenance and peer review Commissioned; internally peer reviewed.

Research-Methodology

Regression Analysis

Regression analysis is a quantitative research method which is used when the study involves modelling and analysing several variables, where the relationship includes a dependent variable and one or more independent variables. In simple terms, regression analysis is a quantitative method used to test the nature of relationships between a dependent variable and one or more independent variables.

The basic form of regression models includes unknown parameters (β), independent variables (X), and the dependent variable (Y).

A regression model, basically, specifies the relation of the dependent variable (Y) to a function of the independent variables (X) and the unknown parameters (β):

                                    Y  ≈  f (X, β)   

The regression equation can be used to predict the values of ‘y’ if the value of ‘x’ is given, where ‘y’ and ‘x’ are two sets of measures from a sample of size ‘n’. The formulae for the regression equation would be

[Figure: formulae for the regression equation (image not reproduced)]
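
The original formula image is not available here. As a stand-in, the standard least-squares expressions for fitting a straight line to n pairs of (x, y) values are sketched below. The symbols a (intercept) and b (slope) are assumptions and may not match the notation of the original figure.

```latex
\hat{y} = a + b x, \qquad
b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left( \sum x \right)^2}, \qquad
a = \frac{\sum y - b \sum x}{n}
```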

Do not be intimidated by the visual complexity of the correlation and regression formulae above. You don’t have to apply the formulae manually; correlation and regression analyses can be run with popular analytical software such as Microsoft Excel, Microsoft Access, SPSS and others.

Linear regression analysis is based on the following set of assumptions:

1. Assumption of linearity. There is a linear relationship between dependent and independent variables.

2. Assumption of homoscedasticity. The variance of the errors (residuals) is constant across all values of the independent variables.

3. Assumption of absence of collinearity or multicollinearity. The independent variables are not highly correlated with one another.

4. Assumption of normal distribution. The errors (residuals) of the model are normally distributed.

John Dudovskiy

Simple Linear Regression | An Easy Introduction & Examples

Published on February 19, 2020 by Rebecca Bevans . Revised on June 22, 2023.

Simple linear regression is used to estimate the relationship between two quantitative variables. You can use simple linear regression when you want to know:

  • How strong the relationship is between two variables (e.g., the relationship between rainfall and soil erosion).
  • The value of the dependent variable at a certain value of the independent variable (e.g., the amount of soil erosion at a certain level of rainfall).

Regression models describe the relationship between variables by fitting a line to the observed data. Linear regression models use a straight line, while logistic and nonlinear regression models use a curved line. Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.

If you have more than one independent variable, use multiple linear regression instead.

Assumptions of simple linear regression

Simple linear regression is a parametric test, meaning that it makes certain assumptions about the data. These assumptions are:

  • Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t change significantly across the values of the independent variable.
  • Independence of observations: the observations in the dataset were collected using statistically valid sampling methods, and there are no hidden relationships among observations.
  • Normality: The data follows a normal distribution.

Linear regression makes one additional assumption:

  • The relationship between the independent and dependent variable is linear: the line of best fit through the data points is a straight line (rather than a curve or some sort of grouping factor).

If your data do not meet the assumptions of homoscedasticity or normality, you may be able to use a nonparametric test instead, such as the Spearman rank test.
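
To illustrate the nonparametric alternative mentioned above, the following sketch computes Spearman's rank correlation as the Pearson correlation of the ranks, with tied values given their average rank. The function names and sample values are invented for illustration; a statistics package would normally be used and would also report a p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: a monotonic but non-linear relationship.
rho = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```

Because it works on ranks, the coefficient captures any monotonic relationship, not only a straight-line one.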

If your data violate the assumption of independence of observations (e.g., if observations are repeated over time), you may be able to perform a linear mixed-effects model that accounts for the additional structure in the data.

Simple linear regression formula

The formula for a simple linear regression is:

y = \beta_0 + \beta_1 x + \epsilon

  • y is the predicted value of the dependent variable (y) for any given value of the independent variable (x).
  • β0 is the intercept, the predicted value of y when x is 0.
  • β1 is the regression coefficient: how much we expect y to change as x increases by one unit.
  • x is the independent variable (the variable we expect is influencing y).
  • ε is the error of the estimate: the variation in y that the line does not account for.

Linear regression finds the line of best fit through your data by searching for the regression coefficient (β1) that minimizes the total error (ε) of the model.
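To make the idea of a "line of best fit" concrete, here is a minimal sketch in plain Python of the usual least-squares solution for one independent variable. The function name and the tiny dataset are made up for illustration; they are not the income and happiness data used in this article.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of the cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                   # estimate of the regression coefficient
    intercept = y_bar - slope * x_bar   # estimate of the intercept
    return intercept, slope


# Hypothetical, perfectly linear data generated from y = 1 + 2x.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(x_values, y_values)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

Because the made-up data lie exactly on a line, the fit recovers that line; with real data the points scatter around the line and the leftover scatter is the error term.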

While you can perform a linear regression by hand, this is a tedious process, so most people use statistical programs to help them quickly analyze the data.

Simple linear regression in R

R is a free, powerful, and widely used statistical program. Download the dataset to try it yourself using our income and happiness example.

Dataset for simple linear regression (.csv)

Load the income.data dataset into your R environment, and then generate a linear model describing the relationship between income and happiness by calling lm() with a formula and the data, for example lm(happiness ~ income, data = income.data).

This call takes the data you have collected (data = income.data) and calculates the effect that the independent variable income has on the dependent variable happiness, using the equation for the linear model.

To learn more, follow our full step-by-step guide to linear regression in R.

Interpreting the results

To view the results of the model, you can call the summary() function in R on the fitted model.

This function takes the most important parameters from the linear model and puts them into a table, which looks like this:

[Figure: simple linear regression summary output in R]

This output table first repeats the formula that was used to generate the results (‘Call’), then summarizes the model residuals (‘Residuals’), which give an idea of how well the model fits the real data.

Next is the ‘Coefficients’ table. The first row gives the estimates of the y-intercept, and the second row gives the regression coefficient of the model.

Row 1 of the table is labeled (Intercept). This is the y-intercept of the regression equation, with a value of 0.20. You can plug this into your regression equation if you want to predict happiness values across the range of income that you have observed.

The next row in the ‘Coefficients’ table is income. This is the row that describes the estimated effect of income on reported happiness.

The Estimate column is the estimated effect, also called the regression coefficient. The number in the table (0.713) tells us that for every one-unit increase in income (where one unit of income = 10,000) there is a corresponding 0.71-unit increase in reported happiness (where happiness is a scale of 1 to 10).
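As a rough illustration of "plugging in" the two estimates, the sketch below uses the intercept (0.20) and income coefficient (0.713) quoted above to predict happiness. The function name is hypothetical, and the allowed range (1.5 to 7.5 units of 10,000) is taken from the 15,000 to 75,000 income range reported in this article.

```python
INTERCEPT = 0.20            # (Intercept) estimate quoted in the text
SLOPE = 0.713               # income estimate quoted in the text
OBSERVED_RANGE = (1.5, 7.5)  # incomes of 15,000 to 75,000, in units of 10,000


def predict_happiness(income_units):
    """Predict happiness (1 to 10 scale) for an income given in units of 10,000."""
    low, high = OBSERVED_RANGE
    if not low <= income_units <= high:
        # Refuse to extrapolate beyond the incomes that were actually observed.
        raise ValueError("income is outside the observed range")
    return INTERCEPT + SLOPE * income_units


# Predicted happiness for an income of 50,000 (5.0 units).
prediction = predict_happiness(5.0)
print(f"predicted happiness: {prediction:.3f}")
```

The range check is a design choice rather than part of the regression itself: it keeps the sketch from returning predictions for incomes the model never saw.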

The Std. Error column displays the standard error of the estimate. This number shows how much variation there is in our estimate of the relationship between income and happiness.

The t value column displays the test statistic. Unless you specify otherwise, the test statistic used in linear regression is the t value from a two-sided t test. The larger the test statistic, the less likely it is that our results occurred by chance.

The Pr(>|t|) column shows the p value. This number tells us how likely we would be to see an effect of income on happiness at least as large as the estimated one if the null hypothesis of no effect were true.

Because the p value is so low (p < 0.001), we can reject the null hypothesis and conclude that income has a statistically significant effect on happiness.
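The Std. Error and t value columns are linked: the t value is the estimate divided by its standard error. The sketch below computes both for the slope with the standard textbook formulas, on a small made-up dataset (not the article's data). It stops short of the p value, which needs the t distribution and is best left to a statistics package.

```python
from math import sqrt
from statistics import mean


def slope_t_statistic(x, y):
    """Return (slope, standard error of the slope, t value) for a simple regression."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares: squared gaps between observed and fitted y.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    std_error = sqrt(rss / (n - 2) / sxx)
    if std_error == 0:
        raise ValueError("perfect fit: the t value is undefined")
    return slope, std_error, slope / std_error


# Hypothetical data for illustration only.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.0, 4.0, 5.0, 4.0, 5.0]
slope, std_error, t_value = slope_t_statistic(x_values, y_values)
print(f"slope = {slope:.3f}, SE = {std_error:.3f}, t = {t_value:.3f}")
```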

The last three lines of the model summary are statistics about the model as a whole. The most important thing to notice here is the p value of the model. Here it is significant (p < 0.001), which means that the model explains the observed data significantly better than a model with no independent variable.

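Presenting the results

When reporting your results, include the estimated effect (i.e. the regression coefficient), standard error of the estimate, and the p value. You should also interpret your numbers to make it clear to your readers what your regression coefficient means. For example: we found a significant relationship between income and happiness (p < 0.001), with a 0.71-unit increase in reported happiness for every 10,000 increase in income.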

It can also be helpful to include a graph with your results. For a simple linear regression, you can simply plot the observations on the x and y axis and then include the regression line and regression function.

[Figure: simple linear regression graph]

Can you predict values outside the range of your data?

No! We often say that regression models can be used to predict the value of the dependent variable at certain values of the independent variable. However, this is only true for the range of values where we have actually measured the response.

We can use our income and happiness regression analysis as an example. Between 15,000 and 75,000, we found a regression coefficient of 0.73 ± 0.0193. But what if we did a second survey of people making between 75,000 and 150,000?

[Figure: extrapolating data in R]

The regression coefficient for the relationship between income and happiness is now 0.21, or a 0.21-unit increase in reported happiness for every 10,000 increase in income. While the relationship is still statistically significant (p < 0.001), the slope is much smaller than before.

[Figure: extrapolating data in R, graph]

What if we hadn’t measured this group, and instead extrapolated the line from the 15–75k incomes to the 75–150k incomes?

You can see that if we simply extrapolated from the 15–75k income data, we would overestimate the happiness of people in the 75–150k income range.

[Figure: curved data line]

If we instead fit a curve to the data, it seems to fit the actual pattern much better.

It looks as though happiness actually levels off at higher incomes, so we can’t use the same regression line we calculated from our lower-income data to predict happiness at higher levels of income.

Even when you see a strong pattern in your data, you can’t know for certain whether that pattern continues beyond the range of values you have actually measured. Therefore, it’s important to avoid extrapolating beyond what the data actually tell you.
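The danger of extrapolation can be reproduced with a small simulation. The sketch below assumes a made-up "true" relationship in which happiness levels off at high incomes (this curve is purely hypothetical and is not fitted to the survey data), fits a straight line to the lower-income range only, and then compares the line's prediction with the assumed true value at a much higher income.

```python
from math import exp
from statistics import mean


def true_happiness(income_units):
    """Hypothetical relationship that levels off at high incomes."""
    return 1.0 + 5.0 * (1.0 - exp(-income_units / 3.0))


def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


# "Observed" incomes between 15,000 and 75,000, in units of 10,000.
observed_income = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
observed_happiness = [true_happiness(v) for v in observed_income]
intercept, slope = fit_line(observed_income, observed_happiness)

# Extrapolate the straight line to an income of 150,000 (15.0 units).
new_income = 15.0
extrapolated = intercept + slope * new_income
actual = true_happiness(new_income)
print(f"line predicts {extrapolated:.2f}, assumed true value is {actual:.2f}")
```

Under these assumptions the straight line keeps climbing while the assumed true curve flattens, so the extrapolated value overshoots, which is the same pattern described for the higher-income group above.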

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

Statistics

  • Chi square test of independence
  • Statistical power
  • Descriptive statistics
  • Degrees of freedom
  • Pearson correlation
  • Null hypothesis

Methodology

  • Double-blind study
  • Case-control study
  • Research ethics
  • Data collection
  • Hypothesis testing
  • Structured interviews

Research bias

  • Hawthorne effect
  • Unconscious bias
  • Recall bias
  • Halo effect
  • Self-serving bias
  • Information bias

Frequently asked questions about simple linear regression

A regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables).

A regression model can be used when the dependent variable is quantitative, except in the case of logistic regression, where the dependent variable is binary.

Simple linear regression is a regression model that estimates the relationship between one independent variable and one dependent variable using a straight line. Both variables should be quantitative.

For example, the relationship between temperature and the expansion of mercury in a thermometer can be modeled using a straight line: as temperature increases, the mercury expands. This linear relationship is so certain that we can use mercury thermometers to measure temperature.

Linear regression most often uses mean-square error (MSE) to calculate the error of the model. MSE is calculated by:

  • measuring the distance of the observed y-values from the predicted y-values at each value of x;
  • squaring each of these distances;
  • calculating the mean of the squared distances.

Linear regression fits a line to the data by finding the regression coefficient that results in the smallest MSE.
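The three MSE steps listed above translate almost line for line into code. The following is a minimal sketch with made-up observed and predicted values; the function name is only illustrative.

```python
def mean_squared_error(observed, predicted):
    """Mean of the squared distances between observed and predicted y-values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1: distance of each observed value from its prediction.
    distances = [o - p for o, p in zip(observed, predicted)]
    # Step 2: square each distance.
    squared = [d ** 2 for d in distances]
    # Step 3: take the mean of the squared distances.
    return sum(squared) / len(squared)


# Hypothetical values for illustration only.
observed_y = [3.0, 5.0, 8.0]
predicted_y = [2.0, 5.0, 6.0]
mse = mean_squared_error(observed_y, predicted_y)
print(f"MSE = {mse:.4f}")
```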

Bevans, R. (2023, June 22). Simple Linear Regression | An Easy Introduction & Examples. Scribbr. Retrieved September 18, 2024, from https://www.scribbr.com/statistics/simple-linear-regression/

A Refresher on Regression Analysis

Understanding one of the most important types of data analysis.

You probably know by now that whenever possible you should be making data-driven decisions at work. But do you know how to parse through all the data available to you? The good news is that you probably don’t need to do the number crunching yourself (hallelujah!) but you do need to correctly understand and interpret the analysis created by your colleagues. One of the most important types of data analysis is called regression analysis.

  • Amy Gallo is a contributing editor at Harvard Business Review, cohost of the Women at Work podcast, and the author of two books: Getting Along: How to Work with Anyone (Even Difficult People) and the HBR Guide to Dealing with Conflict. She writes and speaks about workplace dynamics.

  • Open access
  • Published: 17 September 2024

Exploring the impact of perceived early marriage on women’s education and employment in Bangladesh through a mixed-methods study

  • Md. Nuruzzaman Khan 1,2
  • Shimlin Jahan Khanam 1
  • Md. Mostaured Ali Khan 3
  • Md Arif Billah 4
  • Shahinoor Akter 5

Scientific Reports volume 14, Article number: 21683 (2024)

  • Epidemiology
  • Medical research

Child marriage negatively affects women’s socio-economic empowerment, particularly in education and employment. This study aimed to explore women’s perspectives on the timing of their marriages, considering their educational and employment status at the time. It also sought to identify factors influencing early-married women’s perception of their marriages as timely. We analyzed both quantitative and qualitative data. The quantitative data included a sample of 5,596 women aged 15–24 from the 2017/18 Bangladesh Demographic and Health Survey. Additionally, we collected qualitative data through six in-depth interviews, two focus group discussions, and 13 key informant interviews. We used a multilevel mixed-effects Poisson regression model to examine the relationship between women’s formal employment, education, and child marriage. Thematic analysis was employed for the qualitative data. Around 62% of the women analysed reported that their marriage occurred early, with a mean age at marriage of 15.2 years. Approximately 55% of all early-married women believed their marriages occurred at the right time, especially those who were employed at the time of their marriage. Among this subset, we also noticed a higher likelihood of discontinuing work and education following marriage. Qualitative findings revealed reasons behind this perception, such as escaping poverty, safety concerns, limited job prospects, and the impact of non-marital relationships and societal norms. While many early-married women perceived their marriage as timely, particularly those initially employed, this decision often coincides with a subsequent withdrawal from work and education. This underscores the pressing need for policies and programs aimed at educating women about the legal age for marriage and the negative consequences associated with early marriage, while also equipping them with knowledge and resources for informed decision-making.

Introduction

Child marriage is a pervasive issue in low- and middle-income countries (LMICs), especially in South Asia and sub-Saharan Africa [1]. The 2022 UNICEF report on child marriage revealed that approximately 12 million girls in LMICs are married before the age of 18 each year, which translates to one in every five girls in those settings [2]. In Bangladesh, the situation is even more alarming, with recent estimates indicating that about 59% of women aged 20–24 years were married before turning 18 [3], and 22% were married before the age of 15 [4]. The rate is higher still in rural areas, particularly those with widespread poverty, low school enrolment, and significant concerns about family reputation [4, 5, 6].

Child marriage can have devastating consequences for girls and their families, as it often leads to a cycle of poverty and disempowerment [7]. Girls who are forced into early marriage often have to drop out of school, which restricts their educational opportunities and limits their economic prospects [8]. Additionally, child marriage has been associated with higher rates of domestic violence and divorce, which can have significant negative impacts on girls’ physical and mental health, hampering their development and wellbeing [9, 10]. Moreover, child marriage has serious implications for maternal health. Child brides are more likely to experience complications during pregnancy and childbirth, such as obstetric fistula and maternal mortality [11, 12]. Children born to child brides are also at a higher risk of mortality and malnutrition [13]. This leads to an intergenerational effect because malnourished children are often more likely to drop out of school and subsequently become child brides themselves [14]. These adverse consequences, coupled with the high number of girls affected, highlight a significant burden in LMICs [1]. Child marriage therefore poses challenges to achieving the Sustainable Development Goals (SDGs) related to health and well-being (SDG 3), gender equality (SDG 5), education (SDG 4), and poverty reduction (SDG 1) [15].

Socio-demographic factors associated with child marriage have been extensively studied in LMICs, including Bangladesh [16, 17, 18, 19]. Although such observational studies have played a crucial role in developing relevant policies and programs to reduce the occurrence of child marriage, they fall short of comprehensively addressing the issue. Girls’ views on their marriage and marriage age can carry significant weight and override socio-demographic factors by mediating their roles [20]. It is commonly assumed that every child marriage occurs at the parents’ wish and is not desired by the girls. However, this is not always the case. The underlying reasons why girls choose to marry at an early age instead of continuing education and work remain understudied [21]. Therefore, it is crucial to explore the percentage of girls who perceive their marital age as appropriate, as this can inform the development of more effective policies and programs. However, this requires comprehensive research that integrates the determinants of early marriage with the perceptions of those who marry early. Unfortunately, such information is largely absent from the existing literature because of the nature of the available data in LMICs, including Bangladesh, where DHS surveys serve as primary data sources [16, 17, 18, 19, 22]. These surveys provide important information on the prevalence of early marriage and its socio-demographic predictors but lack content on girls’ views of early marriage, which typically requires qualitative study. This indicates a need for mixed-methods studies, which are mostly lacking in LMICs, with none conducted in Bangladesh [23, 24, 25, 26]. To address these limitations, we aimed to investigate women’s perceptions of getting married at an earlier age, taking into account their education and employment status at the time of marriage. We also sought to identify the factors that influenced early-married women to perceive their marriages as occurring at the right age.

Study design

The study applied a sequential explanatory mixed-methods design in which the analysis of secondary quantitative data was followed by the collection and analysis of qualitative data [27]. The qualitative findings, therefore, aimed to explain and interpret the findings of the quantitative study.

Quantitative study

Data source and sampling

This study analysed data from the most recent 2017/18 Bangladesh Demographic and Health Survey (BDHS). The survey employed a two-stage stratified random sampling method to select the respondents. In the first stage, 672 Primary Sampling Units (PSUs) were selected from a list of 293,579 PSUs generated during the 2011 National Population Census of Bangladesh, excluding three PSUs due to extreme floods. In the second stage, 30 households were randomly selected from each of the PSUs, using probability proportional to PSU size. This generated a list of 20,160 households, of which 19,457 were interviewed. There were 20,376 eligible respondents in the selected households, with the eligibility criteria being (i) being a married woman of reproductive age and (ii) having spent the night before the survey in the selected household. Of them, data were collected from 20,127 women. Details of the BDHS survey procedure have been published elsewhere [3]. A sub-sample of 5,596 of these women was analysed in this study, selected on the basis of two inclusion criteria: (i) aged 15–24 years (to ensure inclusion of only recently married women, following the recommendation of the global literature [2, 6, 7, 9, 24, 28]) and (ii) married at the time of the survey.
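The sampling figures above can be cross-checked with simple arithmetic. The following sketch uses only the counts reported in this paragraph; the variable names are ours, and the response rates are derived quantities rather than values stated by the survey.

```python
# Counts reported in the sampling description.
psus_selected = 672
households_per_psu = 30
households_listed = 20_160
households_interviewed = 19_457
women_eligible = 20_376
women_interviewed = 20_127
women_analysed = 5_596

# The household list should equal PSUs multiplied by households per PSU.
expected_households = psus_selected * households_per_psu

# Derived proportions (not reported in the text, computed here for illustration).
household_response_rate = households_interviewed / households_listed
women_response_rate = women_interviewed / women_eligible
analysed_share = women_analysed / women_interviewed

print(f"expected households: {expected_households}")
print(f"household response rate: {household_response_rate:.1%}")
print(f"women response rate: {women_response_rate:.1%}")
print(f"share of interviewed women analysed: {analysed_share:.1%}")
```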

Outcome variable

The focus of our quantitative study was child marriage, which we defined as a binary outcome variable (yes or no). To collect these data, the BDHS asked women to report the age at which they first began living with their spouse, or with their first spouse in the case of more than one marriage. We categorized responses as either child marriage (1, if the marriage occurred before the woman turned 18) or normal-aged marriage (0, if the marriage occurred at age 18 or later), according to the universal recommendation, which is also followed in Bangladesh [2].
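The outcome coding described above can be sketched as a small function. The function name and the example ages are hypothetical; only the cut-off of 18 years comes from the text.

```python
AGE_THRESHOLD = 18  # marriages before this age are coded as child marriage


def code_child_marriage(age_at_first_cohabitation):
    """Return 1 for child marriage (before age 18) and 0 otherwise."""
    if age_at_first_cohabitation < 0:
        raise ValueError("age must be non-negative")
    return 1 if age_at_first_cohabitation < AGE_THRESHOLD else 0


# Hypothetical reported ages, for illustration only.
reported_ages = [15, 17, 18, 21]
coded = [code_child_marriage(age) for age in reported_ages]
prevalence = sum(coded) / len(coded)
print(f"coded outcome: {coded}, prevalence: {prevalence:.2f}")
```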

Exposure variables

Working status and educational status of respondents at the time of their marriage, as well as following marriage, were our primary exposure variables. The BDHS collected these data by asking whether the respondents were working or studying in school just before they got married. If the response was affirmative, the respondents were then asked two follow-up questions to determine their work or education continuity and the number of years of continuity. These follow-up questions were: (i) Did you continue working/studying after marriage? and (ii) If yes, for how long? Based on the responses, we created four variables: (i) study before marriage (yes, no), (ii) work before marriage (yes, no), (iii) after marriage study (no, continue less than a year, continue less than five years), and (iv) after marriage work (no, continue less than a year, continue less than five years). We also considered additional exposure variables by reviewing the available literature for Bangladesh and other LMICs [16, 17, 18, 19]. These variables included respondents’ age, education level, partner’s education level, partner’s occupation, wealth quintile, place of residence, and region of residence.

Statistical analysis

Descriptive statistics, including frequency and percentage, were used to describe the characteristics of the respondents. A multilevel mixed-effects Poisson regression model was used to explore the association between early marriage and the working and studying status of the respondents, as well as the continuity of their working and studying status following child marriage. The reasons for using a multilevel Poisson regression model were the high prevalence of child marriage (> 10%) and the clustered structure of the BDHS data. Previous studies have found that simple logistic regression analysis produces less precise findings when the prevalence of the outcome variable is high and the data come from a clustered structure [29]. Both unadjusted and adjusted models were run: in the unadjusted model, one exposure variable at a time was considered with the child marriage variable, and in the adjusted model the other factors were adjusted for. Multicollinearity was checked before running each model. Results were reported as unadjusted or adjusted Prevalence Ratios (PRs) with corresponding 95% confidence intervals. Stata version 18.0 was used for data analysis.
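For readers unfamiliar with prevalence ratios, the sketch below shows the simplest case: one binary exposure, no clustering, and no adjustment, where the PR is the ratio of the two prevalences and an approximate 95% confidence interval is obtained on the log scale. This is a simplified illustration with hypothetical counts, not the multilevel Stata model used in this study, and its intervals would not match those of a mixed-effects analysis.

```python
from math import exp, log, sqrt


def prevalence_ratio(exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total, z=1.96):
    """Return (PR, lower, upper) using a log-scale normal approximation."""
    if min(exposed_cases, unexposed_cases) <= 0:
        raise ValueError("both groups need at least one case")
    if exposed_cases > exposed_total or unexposed_cases > unexposed_total:
        raise ValueError("cases cannot exceed group totals")
    pr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Standard error of log(PR) for two independent binomial proportions.
    se_log_pr = sqrt(1 / exposed_cases - 1 / exposed_total
                     + 1 / unexposed_cases - 1 / unexposed_total)
    lower = exp(log(pr) - z * se_log_pr)
    upper = exp(log(pr) + z * se_log_pr)
    return pr, lower, upper


# Hypothetical counts: 60 of 100 exposed and 40 of 100 unexposed have the outcome.
pr, ci_lower, ci_upper = prevalence_ratio(60, 100, 40, 100)
print(f"PR = {pr:.2f}, 95% CI = {ci_lower:.2f} to {ci_upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level, which is how the PRs and aPRs reported in this study can be read.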

Qualitative exploration

In our quantitative study, we identified a higher prevalence of child marriage among working women. To explore why working women get married at an early age, we conducted a qualitative study between January 2023 and January 2024, as this information was not available in the BDHS survey data. The Gazipur district of Dhaka division was purposively selected as the study area. This area has a higher concentration of ready-made garment and small-scale industries, where the majority of workers are women and married [30]. Two focus group discussions (FGDs) involving 16 participants (8 in each FGD), six in-depth interviews (IDIs), and 13 key informant interviews (KIIs) were conducted using two pre-developed interview topic guides. The topic guides covered several areas, including marriage experience, perceptions at the time of marriage, education and employment after marriage, respondents’ perspectives on marriage over time, and community and religious norms related to early marriage. The length of these interviews ranged from 75 to 90 minutes.

The participants included in the FGDs and IDIs were selected purposively based on the following criteria: (i) currently aged between 15 and 24 years, (ii) married before their 18th birthday, (iii) involved in either work or education just before marriage, and (iv) currently either continuing education or work or having left them to become housewives. These criteria aimed to ensure that the qualitative study participants were similar to those in the quantitative study. To recruit the participants, data collectors first approached them by sharing the details of the study plan and then collected data on the participants’ preferred dates. To protect the privacy of the participants and limit interruptions during the interview process, the participant and interviewer chose a private location. Prior to the qualitative data collection, participants were again briefed about its objectives and assured of the confidentiality of their responses. Informed consent was obtained from participants above 18 years old, while for participants under 18, informed consent was obtained from their legal guardians (father or husband).

Additionally, 13 key informant interviews (KIIs, male = 9; female = 4) were conducted with managers of ready-made garment factories (n = 2), small-scale industries (n = 1), local leaders (members of the Pourosova, n = 3 [male = 2, female = 1]), the Upazila Nirbahi officer (n = 1, female), and parents of girls who married at an early age (n = 6 [male = 4, female = 2]). Their perspectives on early marriage were sought due to their significant involvement in the issue, including shaping cultural norms and exerting social pressures that perpetuate early marriage practices.

Participation was completely voluntary, and no participants were provided with any gifts or incentives to participate in this study. Experienced social researchers were involved in the qualitative data collection. The FGDs and IDIs were conducted by a female interviewer (second author). The KIIs were conducted by two male interviewers (first and third authors). All conversations during FGDs, IDIs and KIIs were audio recorded with consent. The recorded audio files were subsequently reviewed and translated into English by the respective research team members. Relevant sections on the reasons behind early marriage and its impact on work and educational pursuits were extracted and discussed among the team to ensure accurate interpretation and presentation of the data. Qualitative data were thematically analysed [31] using NVivo version 12.10 [32]. Key themes identified in the analyses were synthesized and presented as study findings. Ethical approval for the qualitative study was obtained from the Institutional Review Board of the University of Rajshahi.

Background characteristics of the respondents

Table 1 presents the background characteristics of the respondents included in the quantitative exploration. The mean age at marriage was 15.2 years (SD, ± 1.41) and the mean years of education were 6.9 years (SD, ± 2.95). About 32.8% of the respondents reported being engaged in a formal job. Of the respondents analyzed, 61.7% reported that their marriage occurred before they reached 18 years of age. Over half (55.1%) of those who reported being married before 18 years said that their marriage occurred at the right time, while 44.9% felt that they should have delayed their marriage.

Respondents’ perception about their marriage timing

Table 2 illustrates the distribution of respondents’ perception of their marriage timing according to their socio-demographic characteristics. Among the respondents who believed that their marriage occurred too early and should have been delayed, 49.8% were aged between 15 and 19 years, and 41.2% were aged between 20 and 24 years. Among early-married women who believed that their marriage occurred too early and should have been delayed, 36.6% were illiterate, roughly half the 63.4% of illiterate women who thought their marriage occurred at the right age.

How women’s education and employment status before marriage influences their perception of marriage age

The unadjusted associations suggest that early-married women who thought their marriage occurred at the right time were more likely to have worked before marriage (PR = 1.88, 95% CI = 1.34–2.64) and less likely to have studied before marriage (PR = 0.55, 95% CI = 0.47–0.66) than women who thought they should have delayed their marriage (Table 3). After adjusting for confounding variables, both associations remained significant. The association with work was attenuated: early-married women who thought their marriage occurred at the right time were still more likely to have worked before marriage (aPR = 1.47, 95% CI = 1.01–2.18), although the lower confidence limit was close to 1. The adjusted likelihood of early marriage was lower among girls who were studying (aPR = 0.43, 95% CI = 0.35–0.53).

Impact of early marriage on the continuation of education and employment

Table 4 presents the results of the analysis of the association between women’s perception of their marriage timing and their continuation of education and work after marriage. In the unadjusted analysis, early-married women who thought that their marriage occurred at the right time were more likely to report continuing their education for up to five years (PR = 1.73, 95% CI 1.31–2.28) than women who did not continue their education. However, this association was not significant for those who continued their education for less than one year. In the adjusted analysis, the association between continuing education and women’s perception of their marriage timing was attenuated, with women who continued their education for less than five years having a non-significantly lower likelihood of perceiving their marriage as occurring at the right time (aPR = 0.92, 95% CI 0.65–1.28). The association between continuing work and women’s perception of their marriage timing remained non-significant in the adjusted analysis.

Through qualitative investigation, we explored why working women who married at an earlier age believed that their marriage took place at the right time, and why studying women who married at an earlier age perceived their marriage as occurring too early, as revealed in our quantitative analysis. The characteristics of the participants in the qualitative interviews are presented in supplementary Tables 1, 2 and 3. Our findings uncovered a range of factors that can be categorized into distinct thematic patterns (supplementary Table 4). These include: (i) getting married as a way to recover from poverty, (ii) marriage perceived as a means to ensure the safety and security of young unmarried girls, (iii) less hope for job prospects, and (iv) intimate relationships and social norms.

Getting married as a way to recover from poverty

Participants reported that a common influencing factor behind early marriage among young girls was their perception that it would help them escape poverty.

“I had dreams of studying and becoming a government service holder, but poverty choked those dreams before they could bloom. Marriage was the only path open to me, even if it means leaving those dreams behind.” (IDI participant 3, age 20). “My parents couldn’t afford to keep me in school anymore, and marriage seemed like the only way to have a roof over my head and food on the table.” (IDI participant 2, age 19).

Most female participants reported that they entered the labour market before reaching their 18th birthday. Since their families were from lower socio-economic backgrounds, they started looking for a job to support their families. They often engaged in low-wage occupations, such as house cleaning or garment factories.

“I worked in house cleaning before my marriage and received a very low wage, which was not enough for my living. As a result, I could not send money to my parents, even though they expected me to do so.” (FGD 1 participant, age 19).

According to them, like many young girls, they also moved to the city from rural areas leaving their families behind in search of jobs and started living in rental accommodations. The income derived from these jobs proved insufficient to meet their daily expenses, including paying for food, rent and utilities while providing financial assistance to their families. This issue was also highlighted in the KIIs. One ready-made garment manager reported that young female ready-made garment factory workers’ wages are not enough to support themselves and their parents. Therefore, they usually decide or agree to get married at an early age to overcome their financial struggles.

“Girls who started working here usually work on a daily basis and earn only 150–200 BDT (1.5-2 USD) per day, which is not enough to maintain their daily expenses. What’s even worse is that many of these girls migrated from rural areas to work here, meaning they have to pay for their rent and other associated costs. It’s no surprise that many of them end up choosing to get married, as it seems like the only way out of this financial struggle”. (Ready-garment manager, male, age 45).

Marriage was perceived as a means to ensure the safety and security of young unmarried girls

In the context of working girls who opt for early marriage, safety and security emerge as crucial factors. Participants reported that young girls working in different industries (such as ready-made garment factories) usually work alongside male workers. Due to the demands of their jobs, they usually spend prolonged hours working together at the workplace. The joint nature of their work and the long hours spent together often create tension and a sense of insecurity among the girls and within their families. They fear exposure to physical and sexual abuse or violence at the workplace, which reinforces the inclination towards early marriage.

“My parents were concerned about the potential risks of sexual and physical abuse when working outside the home, especially during evening hours and interacting with male colleagues. I would hold the same belief and take similar precautions if I had a young daughter engaged in employment, as there are multiple reasons to support this perspective”. (FGD 1 participant, age 18). “Living alone as a young woman in this city felt dangerous. Marriage, even if it’s not ideal, meant having someone to protect me and a family to belong to.” (IDI participant 2, age 20).

This fear was also reflected in the KIIs with fathers. Families of young girls working in ready-made garment factories tend to marry off their daughters to protect them from any future risk of abuse.

“Soon after starting working at age 15, I arranged her marriage. Although it was not my intention, we did not feel secure leaving my young daughter outside the house, as she could be at risk of rape or harassment from strangers. My neighbours also suggested that I do so”. (Father of a married and working young girl, age 55).

One key informant, a supervisor in the ready-made garment industry, also confirmed that young girls working in the industry often become victims of physical and sexual harassment. He added that sexual violence and harassment were underreported, as in other sectors in Bangladesh, and remained a concerning reality. Nevertheless, the families of the young girls were aware of these risks, even if based on limited evidence, and this awareness significantly influenced their decision to marry off their girls at a young age, against the law.

“A significant number of girls who start working at a young age face violence from their male counterparts, including sexual violence. We are aware of this, and our organization has very strict laws against it. However, these incidents often go unreported, similar to other sectors in Bangladesh”. (A supervisor of a garment factory, male, age 38).

Parents of young school- or college-going girls also voiced concern about their daughters’ safety while traveling to school or college. Local boys or men sometimes harassed these girls on their way home or to college, and the parents usually perceived that marrying the girls off was the only solution.

“I wanted my daughter to continue her education. However, a mischievous boy started following her to school. I contacted his parents and asked them to discipline their son and prevent him from following my daughter, but it did not work. Eventually, I had to marry off my daughter when she was only 16 years old”. (Father of a girl who was married off early while studying, age 38).

However, during the FGDs, female participants expressed the view that the claims regarding safety concerns and incidents of violence were not always accurate. They argued that, although these claims might stem from parents’ genuine apprehensions regarding safety and their desire to uphold societal prestige, their parents mostly forced them to get married at an early age against their will. This highlights the need to delve deeper into the complexities surrounding early marriage and the factors that shape this decision.

“I started working as a house cleaner when I was 15 and got married at age 16. I wasn’t intending to get married at that time, but my parents pressured me to do so. They had heard that young girls working as house cleaners often face sexual and physical violence from homeowners. Though my homeowners treated me like their daughter. Therefore, they were strict in their decision to marry me off”. (FGD 2 participant, age 20).

Less hope for job prospects

Several female participants highlighted the interconnectedness between school dropout, limited employment opportunities for people with lower educational attainment, early entry into the workforce, and early marriage. According to them, education could not guarantee job prospects in Bangladesh, and securing a job could be even more challenging for individuals with low academic achievement. They therefore perceived that entering the job market early would be more worthwhile for them and their families than continuing their education, which required financial resources.

“Studying seemed pointless when I knew there were barely any jobs at the end. Marriage offered a chance at some stability, even if it wasn’t the kind I hoped for.” (IDI participant 5, age 21).

One FGD participant reported that she did not believe she had the necessary qualities to compete in the increasingly competitive job market. She felt early employment would give her practical work experience, opening opportunities for better job prospects. Hence, she decided to drop out of school. Other FGD participants agreed with her, indicating they shared similar perceptions.

“Obtaining a job after completing education is only possible for highly meritorious students. I did not fall into that category. Hence, I decided to start working with the hope that by the time I finished my education, I would have accumulated several years of work experience, which would undoubtedly enhance my chances of securing a comparatively better job”. (FGD 2 participant, age 19).

Some guardians of young school or college-going girls also perceived that education would not guarantee any employment for their daughters and married them off at an earlier age.

“Why would I continue my daughter’s education? What hope was there? Even many educated and meritorious students are now unemployed. Therefore, I married off my daughter when she was only 15, and she is now leading her own life. I have no concern now”. (Father of a girl who was married off early while studying, age 40).

Intimate relationships and social norms

Several female participants revealed that working girls were found to be more susceptible to developing relationships with their colleagues. These close bonds often evolve into sexual relationships, contributing to early marriage among working girls.

“Shortly after I began working, I entered into a relationship with one of my colleagues, which later turned into a sexual relationship. We then decided to marry, although our parents were unhappy with our decision. However, we did it without their approval because we knew what we did was not right according to our religion”. (FGD 1 participant, age 18).

Parents’ concern that their daughters might form a workplace relationship without their permission was identified as a growing driver of early marriage, even in the absence of an intimate relationship.

“Parents of the earlier-aged working girls are mostly uneducated and strongly influenced by the social norms and misconceptions. They believe female and male could not be co-worker. Therefore, they prefer to marry off their daughters earlier”. (Upazila Nirbahi officer, age 38).

Importantly, this perception was found to be common among parents of both working and studying girls. However, it was a less significant concern for the parents of studying girls, because their awareness of the negative effects of child marriage motivated them to prioritize their daughters’ education and safety above all else.

“I had concerns about the possibility of my school-going daughters engaging in intimate relationships, which have become more common due to modernization. However, my intention was not to abruptly end her education and arrange her marriage solely based on this risk. I believed she was young and had a promising future ahead. Eventually, I did make the decision to marry her, but it was primarily motivated by the opportunity to find a comparatively better groom within my family lineage”. (Father of a girl who was married off early while studying, age 38).

Social norms were highlighted as significant reasons for early marriage among the women who married at a younger age.

“Marriage is a way to uphold our traditions, to show respect to our families and ancestors. Even if I had doubts, I knew I had to follow the path laid out for me.” (IDI participant 4, age 22).

One participant mentioned that she felt she did not belong in her society and remained a minor in society’s eyes when she saw many of her peers getting married at an earlier age than her.

“Everyone around me was getting married young, building families. It felt like I was the only one left behind, stuck in a childhood that was no longer fitting. Marriage was a way to belong, to be seen as a responsible adult.” (IDI participant 5, age 21).

Another participant explicitly described why marriage was important for gaining status in society.

“Marriage is a social currency here. It defines your status, your worth. Choosing a career over marriage felt like choosing shame over acceptance, a path less traveled and less understood.” (IDI participant 3, age 19).

The primary objective of this study was to explore the perspectives of girls regarding the timing of their own marriages, taking into consideration their educational and employment backgrounds at the time of marriage. Furthermore, we aimed to investigate the factors that influenced earlier-aged women to perceive their marriages as occurring at the appropriate time. Our findings indicate that among the total population of earlier-aged women, 55% believe that their marriages took place at the right time, with a higher percentage observed among women who were employed at the time of marriage. Among those who held this perception, there was a notable trend of discontinuing work and education after getting married. Through qualitative analysis, we gained insights into the underlying reasons why these women considered their marriages to be timely, including the desire to escape poverty, concerns regarding safety and security, and the influence of intimate relationships and societal norms.

The study findings convey three significant messages concerning early marriage in the country. Firstly, a substantial portion of early marriages are a result of girls’ choice. Secondly, the engagement of girls in formal employment contributes to an increase in early marriage rates, unless measures are implemented to ensure economic security and safety. Lastly, early-married girls who believe that their marriages occurred at the right time are more likely to discontinue their education and withdraw from the workforce.

The perception of working girls who marry early that their marriage occurred at the right time can be understood from two distinct directions. Firstly, these girls may lack awareness regarding the appropriate age for marriage and the potential negative consequences associated with marrying at a young age. Secondly, their working environment and the challenges they face may have influenced their decision to marry early, despite being aware of the adverse outcomes of early marriage. This may include parental pressure for marriage once girls start working, especially if they work alongside male colleagues or develop intimate relationships with them, which can conflict with societal norms 33 . Traditional patriarchal values in Bangladesh further reinforce these pressures 33 . Regardless of the direction, these perceptions indicate a failure of policies and programs, which can result in long-term burdens for the country.

If the first direction holds true, it suggests that a portion of women have not received the message regarding the appropriate age for marriage and the consequences of early marriage. Factors such as early dropout from education to enter the workforce and limited exposure to mass media due to work obligations may contribute to this lower level of awareness 28 , 29 , 34 . These directions are influenced by various socio-demographic and socio-cultural factors. While leaving the parental home to enter the workforce can indicate a degree of freedom for girls, this is not always the case in many LMICs, including Nepal and India 8 , 18 . In these contexts, girls who start working early, often without continuing their education, typically come from low-income families where they are expected to support their families. Early marriage remains a long-standing norm in these communities, perpetuated by the fact that their mothers and grandmothers also married at a young age. Furthermore, these girls often move from their parental homes to their workplace, which is often seen as contrary to social norms in Bangladesh as well as other LMICs 35 .

On the other hand, if the second direction is true, it indicates a failure of long-standing governmental priorities to ensure girls’ continued education and prevent child labour, as well as a failure to ensure the safety of working girls 18 . Though Bangladesh has made remarkable progress in reducing violence against women, incidents still occur frequently, with one in three women experiencing it 36 . Importantly, news of such incidents usually spreads widely, along with additional rumours, causing concern for the girls’ security and motivating their parents to marry them off early. However, the above explanations may not hold for girls who continue their education. They are usually more aware of the appropriate age for marriage and the risks of early marriage, and they usually stay at home, where security concerns are fewer. They also tend to come from comparatively better-off families.

In this study, we also found a lower likelihood of continuing work after marriage. This change is mainly due to family pressure or the intention to have a child, as well as the desire to give more time to the family 37 , 38 . Moreover, we found that while women are employed, their wages in the factories are very low. This financial constraint limits their opportunities, often pushing them towards early marriage or leading them to leave work altogether to assume traditional roles as wives, mothers, and homemakers 39 . However, this practice can have several adverse consequences for women’s economic flexibility, empowerment and decision-making abilities in the family 40 , 41 . For instance, stopping work can lead to pregnancy at an earlier age, which is associated with various adverse maternal and child health outcomes, including lower utilization of maternal healthcare services, pregnancy complications, and maternal and child mortality 40 , 42 . Furthermore, this trend indicates a significant dropout of girls from education and work, which negatively affects women’s empowerment and decision-making abilities 40 , 41 . These consequences can lead to higher household poverty, greater sensitivity to economic shocks, and less income diversification 41 . These factors, in turn, can have significant intergenerational impacts, resulting in poorer health among children and lower investment in education and other forms of human capital accumulation 43 , 44 . All of these factors increase the likelihood of early marriage in subsequent generations 43 .

Regardless of the explanation, these findings indicate challenges for the country in achieving the relevant SDG targets related to the improvement of sexual and reproductive health rights as well as equity. This highlights the need for policies and programs to educate and raise awareness among studying and working girls about the appropriate age for marriage and the adverse effects of early marriage. Increasing the entry-level wages of working girls is also important. Nevertheless, existing initiatives will remain ineffective without appropriate engagement of multiple stakeholders, including girls’ parents and local leaders, in implementing target-oriented policies and programs to reduce early marriage. Reducing gender-based violence and improving girls’ safety in the workplace are also crucial to reducing early marriage.

This study exhibits several notable strengths as well as a few limitations. It stands as the first investigation in Bangladesh and other LMICs that delves into women’s perceptions of their marriage age, accounting for their educational and employment statuses, and utilizing nationally representative quantitative survey data. Additionally, the qualitative survey findings offer insight into the motivating factors behind the marriage decisions of working and educated girls. The qualitative interviews were conducted by the authors of this study, who have extensive experience in academia and public health research. They hold postgraduate degrees in population science, public health, and anthropology, and possess substantial expertise in conducting research in LMICs, including Bangladesh. All authors agreed on and approved the interpretation presented in the manuscript. The study employed appropriate statistical modelling techniques to analyze the data, incorporating a diverse range of confounding variables. As a result, the reported findings possess sufficient robustness to inform national-level policies and programs. However, one key limitation of this study is that the quantitative data analyzed in this study were derived from a cross-sectional survey, which restricts the ability to establish causal relationships. While the study explored cultural factors associated with early marriage through qualitative analysis, these factors were not adjusted for in the quantitative analysis due to their unavailability within the survey. We were also unable to account for other important factors, such as spousal age differences and the extent of early marriage within the women’s families, due to a lack of data, despite their relevance to the occurrence of early marriage. Furthermore, the women’s age of marriage was self-reported, introducing the potential for recall bias. 
Nevertheless, any such bias is expected to be random in nature and should not significantly skew the reported results in any particular direction. Conducting a qualitative study was another strength of this study, where representation of women (in IDIs and FGDs) and men (in the KIIs) allowed us to capture diverse perspectives on this complex issue. We conducted the qualitative study in a purposively selected district and used its data to explain and interpret findings from the nationally representative quantitative data. However, this comparison may introduce errors due to different social norms and cultural issues regarding early marriage in various parts of the country. Moreover, the perception of early marriage among this group of women might differ slightly from that of women in other regions and rural areas due to factors such as their relocation, economic stability, community engagement, and relatively higher decision-making autonomy. This indicates a need for qualitative interviews to be conducted in different regions of the country. However, we were unable to do so due to a lack of funding. It is worth noting that although participants included in our qualitative survey were from only one district, a significant portion of them reported their origins as being from different parts of the country, including rural areas, rather than their present location. They had moved to this district for employment, given that ready-made garment and small-scale industries are predominantly located in this area. Moreover, we conducted qualitative interviews in 2023–2024, while the quantitative data we analysed were collected in 2017–2018. Comparing data from different time points may introduce some distortion in the reported associations and conclusions. However, we could not address this issue further as the quantitative survey data we analysed are the most recent available in Bangladesh.

Our findings revealed that approximately 55% of women who married at an early age believed that their marriage took place at the right time. Among early-married women, those who were employed at the time of their marriage were more likely to perceive their marriage as timely, whereas those who were pursuing studies at the time of their marriage were more inclined to view their marriage as occurring too early and to feel it should have been delayed. Multiple factors emerged as influential in shaping the perception of earlier-aged married women regarding the timing of their marriage, including the desire to escape poverty, concerns related to safety and security, and the influence of intimate relationships and societal norms. These findings highlight that a significant proportion of early-married women believe their marriage occurred at the right time, indicating a gap in policies and programs designed to raise awareness about the risks of early marriage and early childbearing. It is crucial for policies and programs to prioritize comprehensive education for all girls and those around them, including parents, to ensure they are informed about the appropriate age for marriage and the potential consequences of early marriage. Additionally, those involved in decision-making and upholding social norms around early marriage should receive extensive counselling on its adverse effects. This focus should particularly target working girls, who may be more vulnerable to early marriage.

Data availability

The data that support the findings of this study are available from The DHS Program, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the corresponding author upon reasonable request and with permission of The DHS Program. To proceed, researchers are required to submit a research proposal via the website (https://dhsprogram.com/data/available-datasets.cfm). Subsequently, the designated individual will review the proposal and approve access for data download. We are unable to share the qualitative interview data due to restrictions imposed by the ethical review committee.

Abbreviations

LMICs: Low- and middle-income countries

DHS: Demographic Health Survey

BDHS: Bangladesh Demographic Health Survey

aOR: Adjusted odds ratio

CI: Confidence interval

SDG: Sustainable development goal

NIPORT: National Institute of Population Research and Training

PSU: Primary sampling unit

1. Subramanee, S. D. et al. Child marriage in South Asia: a systematic review. Int. J. Environ. Res. Public Health 19 (22), 15138 (2022).

2. UNICEF. UN General Assembly Resolutions on Child Marriage. https://www.unicef.org/protection/child-marriage (2022).

3. National Institute of Population Research and Training (NIPORT). Bangladesh Demographic and Health Survey 2017–2018 (2020).

4. Bangladesh Bureau of Statistics (BBS) & UNICEF Bangladesh Progotir Pathey, Bangladesh Multiple Indicator Cluster Survey 2019, Survey Findings Report (2019).

5. Marphatia, A. A., Ambale, G. S. & Reid, A. M. Women’s marriage age matters for public health: a review of the broader health and social implications in South Asia. Front. Public Health 5, 269 (2017).

6. UNICEF. Ending Child Marriage: A Profile of Progress in Ethiopia (UNICEF, 2018).

7. Scolaro, E. et al. Child marriage legislation in the Asia-Pacific region. Rev. Faith Int. Affairs 13 (3), 23–31 (2015).

8. Sekine, K. & Hodgkin, M. E. Effect of child marriage on girls’ school dropout in Nepal: analysis of data from the multiple Indicator Cluster Survey 2014. PLoS ONE 12 (7), e0180176 (2017).

9. Islam, M. M. Child marriage, marital disruption, and marriage thereafter: evidence from a national survey. BMC Women’s Health 485, 22485 (2022).

10. Naved, R. T. et al. Impact of Tipping Point Initiative, a Social Norms Intervention, in Addressing Child Marriage and Other Adolescent Health and Behavioral Outcomes in a Northern District of Bangladesh (2022).

11. Dadras, O., Khampaya, T. & Nakayama, T. Child marriage, reproductive outcomes, and service utilization among young Afghan women: findings from a nationally Representative Survey in Afghanistan. Stud. Fam. Plann. 53 (3), 417–431 (2022).

12. Fan, S. & Koski, A. The health consequences of child marriage: a systematic review of the evidence. BMC Public Health 22 (1), 1–17 (2022).

13. Hossain, M. M. et al. Child marriage and its association with morbidity and mortality of under-5 years old children in Bangladesh. PLoS ONE 17 (2), e0262927 (2022).

14. Khanam, S. J. & Khan, M. N. Effects of parental migration on early childhood development of left-behind children in Bangladesh: evidence from a nationally representative survey. PLoS ONE 18 (11), e0287828 (2023).

15. Julitta Onabanjo, M. M. M. F. Urgent Action Needed to Meet SDG to End Child Marriage by 2030 (UNICEF, 2022).

16. Kamal, S. M. et al. Child marriage in Bangladesh: trends and determinants. J. Biosoc. Sci. 47 (1), 120–139 (2015).

17. MacQuarrie, K., Juan, C. & Fish, T. D. Trends, Inequalities, and Contextual Determinants of Child Marriage in Asia (ICF, 2019).

18. Billah, M. A. et al. Spatial Pattern and Influential Factors for Early Marriage: Evidence from Bangladesh Demographic Health Survey Data 2017–2018 (2023).

19. Modak, P. Determinants of girl-child marriage in high prevalence states in India. J. Int. Women’s Stud. 20 (7), 374–394 (2019).

20. Boertien, D. & Härkönen, J. Why does women’s education stabilize marriages? The role of marital attraction and barriers to divorce. Demogr. Res. 38, 1241–1276 (2018).

21. Mourtada, R., Schlecht, J. & DeJong, J. A qualitative study exploring child marriage practices among Syrian conflict-affected populations in Lebanon. Confl. Health 11 (1), 53–65 (2017).

22. Biswas, R. K., Khan, J. R. & Kabir, E. Trend of child marriage in Bangladesh: a reflection on significant socioeconomic factors. Child Youth Serv. Rev. 104, 104382 (2019).

23. Saleh, A. M. et al. Exploring Iraqi people’s perception about early marriage: a qualitative study. BMC Women’s Health 22 (1), 393 (2022).

24. UNICEF. Child marriage knowledge, attitudes, and perceptions among affected communities in Albania. UNICEF 1, 1–101 (2018).

25. Seta, R. Child marriage and its impact on health: a study of perceptions and attitudes in Nepal. J. Glob. Health Rep. 7, e2023073 (2023).

26. Mardi, A. et al. Perceptions of teenage women about marriage in adolescence in an Iranian setting: a qualitative study. Electron. Phys. 10 (2), 6292 (2018).

27. Maxwell, J. A. Qualitative Research Design: An Interactive Approach (Sage, 2012).

28. UNICEF. Child Marriage, Adolescent Pregnancy and School Dropout in South Asia. https://www.unicef.org/rosa/reports/child-marriage-adolescent-pregnancy-and-school-dropout-south-asia (2022).

29. Barros, A. J. & Hirakata, V. N. Alternatives for logistic regression in cross-sectional studies: an empirical comparison of models that directly estimate the prevalence ratio. BMC Med. Res. Methodol. 3 (1), 1–13 (2003).

30. Swazan, I. S. & Das, D. Bangladesh’s emergence as a ready-made garment export leader: an examination of the competitive advantages of the garment industry. Int. J. Glob. Bus. Competitive. 17 (2), 162–174 (2022).

31. Clarke, V. & Braun, V. Thematic analysis. J. Posit. Psychol. 12 (3), 297–298 (2017).

32. Dhakal, K. NVivo. J. Med. Libr. Assoc. 110 (2), 270 (2022).

33. Ferdous, D. S., Saha, P. & Yeasmin, F. Preventing Child, Early and Forced Marriage in Bangladesh: Understanding Socio-economic Drivers and Legislative Gaps (2019).

34. Bajracharya, A., Psaki, S. & Sadiq, M. Child Marriage, Adolescent Pregnancy and School Dropout in South Asia (Report by the Population Council for the United Nations Children’s Fund Regional Office for South Asia, 2019).

35. Jayachandran, S. Social norms as a barrier to women’s employment in developing countries. IMF Econ. Rev. 69 (3), 576–595 (2021).

36. Shahen, M. A. Gender-based violence in Bangladesh: a critical analysis. Int. J. Qual. Res. 1 (2), 127–139 (2021).

37. Bhowmik, J., Biswas, R. K. & Hossain, S. Child marriage and adolescent motherhood: a nationwide vulnerability for women in Bangladesh. Int. J. Environ. Res. Public Health 18 (8), 4030 (2021).

38. The Daily Star. From Early Marriage to Risky Pregnancy: We Must Break the Cycle (Dhaka, 2022).

39. Nagata, H. Female workers’ skills, wages, and householding in Bangladesh’s readymade garment industry: the case of a Japanese multinational company. J. Interdiscp. Econ. 32 (1), 47–74 (2020).

40. Ahinkorah, B. O. et al. Girl child marriage and its association with maternal healthcare services utilization in sub-saharan Africa. BMC Health Serv. Res. 22 (1), 777 (2022).

41. Abera, M. et al. Early marriage and women’s empowerment: the case of child-brides in Amhara National Regional State, Ethiopia. BMC Int. Health Hum. Rights 20 (1), 1–16 (2020).

42. Li, C., Cheng, W. & Shi, H. Early Marriage and Maternal Health care Utilisation: Evidence from sub-Saharan Africa 101054 (Economics & Human Biology, 2021).

43. Sekhri, S. & Debnath, S. Intergenerational consequences of early age marriages of girls: Effect on children’s human capital. J. Dev. Stud. 50 (12), 1670–1686 (2014).

44. Otoo-Oyortey, N. & Pobi, S. Early marriage and poverty: exploring links and key policy issues. Gend. Dev. 11 (2), 42–51 (2003).

Acknowledgements

We are thankful to MEASURE DHS for the data support and are also grateful to icddr,b, where the data for this study were analysed. We also acknowledge the Governments of Bangladesh, Canada, Sweden and the UK for providing core/unrestricted support to run icddr,b. The authors also acknowledge the support of the Health System and Population Studies Division of icddr,b and the Department of Population Science of Jatiya Kabi Kazi Nazrul Islam University, where this study was designed and conducted.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Department of Population Science, Jatiya Kabi Kazi Nazrul Islam University, Namapara, Mymensingh, 2220, Bangladesh

Md. Nuruzzaman Khan & Shimlin Jahan Khanam

Nossal Institute for Global Health, Melbourne School of Population and Global health, The University of Melbourne, Melbourne, 3010, Australia

Md. Nuruzzaman Khan

Maternal and Child Health Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), 68 Shaheed Tajuddin Ahmed Sarani, Mohakhali, Dhaka, 1212, Bangladesh

Md. Mostaured Ali Khan

Health System and Population Studies Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), 68 Shaheed Tajuddin Ahmed Sarani, Mohakhali, Dhaka, 1212, Bangladesh

Md Arif Billah

La Trobe Rural Health School, John Richards Centre for Rural Ageing Research, La Trobe University, Melbourne, VIC, 3689, Australia

Shahinoor Akter

Contributions

Khan MN designed the study. Khan MN, Khanam SJ and Khan MMA collected qualitative data. Khan MN and Billah MA analysed quantitative data while all authors analysed qualitative data. Khan MN, Khan MMA, Billah MA and Khanam SJ wrote the first draft of this manuscript. Akter S critically reviewed and edited the previous versions of this manuscript. All authors approved this final version of the manuscript.

Corresponding author

Correspondence to Md. Nuruzzaman Khan.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

The quantitative data analysed in this study were extracted from a survey that was approved by the institutional review board of ICF macro (Inner City Fund) and the National Research Ethics Committee of the Bangladesh Medical Research Council. Informed consent was obtained from all participants. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. No separate ethical approval was required to conduct this study. We obtained permission to access this survey and conduct this research. All methods were performed in accordance with the relevant guidelines and regulations. Ethical approval for conducting the qualitative survey was obtained from the Institutional Review Board of the University of Rajshahi (123/430/IAMEBBC/IBSc), ensuring compliance with ethical guidelines and protocols. Informed consent was obtained from all participants who were 18 years of age or older. For participants who were under 18 years of age, their legal guardian, such as their husband or father, provided informed consent on their behalf. This process ensured that all participants had a clear understanding of the survey’s purpose, procedures, and potential risks, and voluntarily agreed to participate. The same process was applied to illiterate respondents as well.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Khan, M.N., Khanam, S.J., Khan, M.M.A. et al. Exploring the impact of perceived early marriage on women’s education and employment in Bangladesh through a mixed-methods study. Sci Rep 14, 21683 (2024). https://doi.org/10.1038/s41598-024-73137-w

Received: 15 February 2024

Accepted: 13 September 2024

Published: 17 September 2024

DOI: https://doi.org/10.1038/s41598-024-73137-w

  • Child marriage
  • Working status
  • Educational status

regression analysis research paper sample

COMMENTS

  1. (PDF) Regression Analysis

    7.1 Introduction. Regression analysis is one of the most frequently used tools in market research. In its simplest form, regression analysis allows market researchers to analyze relationships ...

  2. PDF Using regression analysis to establish the relationship between home

    Home environment and reading achievement research has been largely dominated by a focus on early reading acquisition, while research on the relationship between home environments and reading success with preadolescents (Grades 4-6) has been largely overlooked. There are other limitations as well. Clarke and Kurtz-Costes (1997) argued that prior ...

  3. (PDF) Linear regression analysis study

    Linear regression is a statistical procedure for calculating the value of a dependent variable from an independent variable. Linear regression measures the association between two variables. It is ...

  4. Linear Regression Analysis

    Linear regression is used to study the linear relationship between a dependent variable Y (blood pressure) and one or more independent variables X (age, weight, sex). The dependent variable Y must be continuous, while the independent variables may be either continuous (age), binary (sex), or categorical (social status).

  5. Multiple Regression: Methodology and Applications

    Abstract. Multiple regression is one of the most significant forms of regression and has a wide range of applications. The study of the implementation of multiple regression analysis in different ...

  6. Multiple Linear Regression

    Learn how to use multiple linear regression to estimate the relationship between two or more independent variables and one dependent variable. See a step-by-step guide with R code, assumptions, interpretation, and visualization.

  7. The clinician's guide to interpreting a regression analysis

    Regression analysis is an important statistical method that is commonly used to determine the relationship between several factors ... Logistic regression in medical research. Anesth Analg. 2021 ...

  8. PDF Multiple Regression Analysis

    Learn how to use multiple regression analysis to predict or explain the value of a quantitative criterion variable using several quantitative or dichotomous variables. Explore the statistical methods, the types of variables, and the general linear model underlying multiple regression analysis.

  9. PDF Fundamentals of Multiple Regression

    Learn the basic ideas and methods of multiple regression analysis, including how to interpret the coefficients and control for other variables. This chapter also provides examples, homework exercises, and guidance on research design and paper writing.

  10. PDF Multiple Regression Analysis of Performance Indicators in the Ceramic

    The research methodology is based on statistical analysis, which in this paper includes the multiple regression analysis. This type of analysis is used for modeling and analyzing several variables. The multiple regression analysis extends regression analysis (Titan et al.) by describing the relationship between a dependent

  11. Regression Analysis for Prediction: Understanding the Process

    Regression analysis is a statistical technique for determining the relationship between a single dependent (criterion) variable and one or more independent (predictor) variables. The analysis yields a predicted value for the criterion resulting from a linear combination of the predictors. According to Pedhazur, 15 regression analysis has 2 uses ...

  12. Multiple Regression Analysis Example with Conceptual Framework

    Learn how to use multiple regression analysis to predict the number of hours spent online by high school students based on their age, gender, and parental relationship. See the research question, literature review, findings, and conclusion of this study.

  13. An Introduction to Regression Analysis

    Alan O. Sykes, "An Introduction to Regression Analysis" (Coase-Sandor Institute for Law & Economics Working Paper No. 20, 1993). This Working Paper is brought to you for free and open access by the Coase-Sandor Institute for Law and Economics at Chicago Unbound. It has been accepted for inclusion in Coase-Sandor Working Paper Series in Law and ...

  14. A Study on Multiple Linear Regression Analysis

    Regression models with one dependent variable and more than one independent variable are called multilinear regression. In this study, the data for multilinear regression analysis come from Sakarya University Education Faculty students' lessons (measurement and evaluation, educational psychology, program development, counseling and ...

  15. PDF Presentation of Regression Results Regression Tables

    The "Raw output" below is for one of those regressions. The table below reports results from all 6 regressions. Raw output for 1968-2004 regression of a standard Phillips Curve: Dependent Variable: INFLATION100 Method: Least Squares Date: 04/26/05 Time: 13:16 Sample: 1968M12 2004M12 Included observations: 433. Variable.

  16. Regression Analysis

    Learn how to use linear regression analysis to examine the linear relationship between a dependent and one or more independent variables. See how to apply regression analysis to marketing problems, such as deriving an optimal marketing mix, and how to deal with common issues, such as endogeneity.

  17. Regression Analysis

    Learn how to use regression analysis to estimate the relationships among variables and make predictions. Explore different types of regression models, such as linear, multiple, logistic, and nonlinear, with formulas and examples.

  18. A Study on Multiple Linear Regression Analysis

    In this study, the data for multilinear regression analysis come from Sakarya University Education Faculty students' lessons (measurement and evaluation, educational psychology, program development ...

  19. Handbook of Regression Modeling in People Analytics: With Examples in R

    Logistic regression belongs to the class of the generalized linear models (GLM), and examples in R with this function are given on various data with plotting the results. Models by one and many input variables are considered, the coefficients are interpreted via the log odds linear link function, and goodness-of-fit and model parsimony are ...

  20. Understanding and interpreting regression analysis

    Linear regression analysis involves examining the relationship between one independent variable and one dependent variable. Statistically, the relationship between one independent variable (x) and a dependent variable (y) is expressed as: y = β₀ + β₁x + ε. In this equation, β₀ is the y intercept and refers to the estimated value of y when x is equal to 0.

  21. Regression Analysis

    Regression analysis is a quantitative method to test the relationship between a dependent variable and one or more independent variables. It is not part of operations research, but it can be used for modelling and analysing data in various fields.

  22. Simple Linear Regression

    Learn how to use simple linear regression to estimate the relationship between two quantitative variables. Find out the assumptions, formula, steps, and interpretation of this statistical method with examples and R code.

  23. A Refresher on Regression Analysis

    Learn what regression analysis is and how to use it to make data-driven decisions at work. This article explains the basics of regression analysis, its applications, and its limitations.

  24. Exploring the impact of perceived early marriage on women's ...

    Thematic analysis was employed for the qualitative data. Around 62% of the total women analysed reported that their marriage occurred early; the mean age at marriage was 15.2 years.
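
Several of the entries above describe simple linear regression, where a dependent variable y is related to one independent variable x through y = β0 + β1·x + ε. As a minimal sketch of how the intercept and slope can be estimated by ordinary least squares, the following Python uses only the standard library. The function names and the small data set are hypothetical and chosen purely for illustration; they are not taken from any of the listed sources.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Estimate (b0, b1) for y = b0 + b1 * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # estimated slope
    b0 = y_bar - b1 * x_bar  # estimated intercept (predicted y when x = 0)
    return b0, b1


def predict(b0, b1, x_new):
    """Predicted value of the criterion for a new predictor value."""
    return b0 + b1 * x_new


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
b0, b1 = fit_simple_linear_regression(x, y)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"prediction at x = 6: {predict(b0, b1, 6.0):.2f}")
```

The closed-form estimates here apply only to the single-predictor case. Multiple regression, as described in other entries, would normally be fitted with a statistics package rather than by hand.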