Corresponding author: Department of Psychology, Faculty of Arts and Social Sciences, National University of Singapore, Block AS4, Level 2, 9 Arts Link, Singapore 117570. Tel.: +(65) 6516-3702; Fax: +(65) 6773-1843; E-mail: mikewlcheung@nus.edu.sg
Search for other works by this author on:Alcohol and Alcoholism, Volume 57, Issue 1, January 2022, Pages 5–15, https://doi.org/10.1093/alcalc/agab044
30 June 2021 25 July 2020 Revision received: 01 June 2021 Revision requested: 02 June 2021 02 June 2021 30 June 2021Mike W-L Cheung, Synthesizing Indirect Effects in Mediation Models With Meta-Analytic Methods, Alcohol and Alcoholism, Volume 57, Issue 1, January 2022, Pages 5–15, https://doi.org/10.1093/alcalc/agab044
Navbar Search Filter Mobile Enter search term Search Navbar Search Filter Enter search term SearchA mediator is a variable that explains the underlying mechanism between an independent variable and a dependent variable. The indirect effect indicates the effect from the predictor to the outcome variable via the mediator. In contrast, the direct effect represents the predictor's effort on the outcome variable after controlling for the mediator.
A single study rarely provides enough evidence to answer research questions in a particular domain. Replications are generally recommended as the gold standard to conduct scientific research. When a sufficient number of studies have been conducted addressing similar research questions, a meta-analysis can be used to synthesize those studies' findings.
The main objective of this paper is to introduce two frameworks to integrating studies using mediation analysis. The first framework involves calculating standardized indirect effects and direct effects and conducting a multivariate meta-analysis on those effect sizes. The second one uses meta-analytic structural equation modeling to synthesize correlation matrices and fit mediation models on the average correlation matrix. We illustrate these procedures on a real dataset using the R statistical platform.
ConclusionThis paper closes with some further directions for future studies.
A mediator is a variable that explains the underlying mechanism between an independent variable and a dependent variable ( Baron and Kenny, 1986). The essential step in a mediation analysis is to decompose the total effect on the components of indirect and direct effects. The indirect effect indicates the effect from the predictor to the outcome variable via the mediator. In contrast, the direct effect represents the predictor’s effect on the outcome variable after controlling for the mediator. The seminal work on mediation and moderation by Baron and Kenny (1986) has been cited more than 104,000 times in Google Scholar. Theirs is the most cited article published before 2000 in the Journal of Personality & Social Psychology ( Quiñones-Vidal et al., 2004). Mediation models, for instance, the Theory of Planned Behaviors ( Ajzen, 2002) and the Reasoned Action Approach (RAA; Fishbein and Ajzen, 2010), have been frequently applied to study alcohol misuse in primary studies (e.g. Conner et al., 1999; Hasking and Schofield, 2015; DiBello et al., 2020).
A single study rarely provides enough evidence to answer research questions in a particular domain. Replications are generally regarded as the gold standard in conducting scientific research ( Open Science Collaboration, 2015). When a sufficient number of studies have been conducted on similar research questions, a meta-analysis can be used to synthesize those studies’ findings (e.g. Anderson and Maxwell, 2016; Hedges and Schauer, 2019). Several methodological advancements have been made on how to analyze mediation models in primary studies, including, for instance, defining effect sizes for a mediation analysis ( Preacher and Kelley, 2011), constructing confidence intervals (CIs) on the indirect effects ( Cheung, 2007, 2009; MacKinnon et al., 2007), combining mediation with moderation ( Edwards and Lambert, 2007), extending mediation to multilevel models ( Preacher et al., 2010), applying a Bayesian framework on a mediation analysis ( Yuan and MacKinnon, 2009) and drawing causal interpretations ( Imai et al., 2010). However, there have been few studies focusing on how to synthesize findings in a mediation analysis (e.g. Shadish, 1996; Eden et al., 2015; Cheung and Cheung, 2016; Huang et al., 2016; Cheung and Hong, 2017).
The main objective of this paper is to introduce two different frameworks to synthesize studies using mediation analysis. The intended audience for this paper is applied researchers with knowledge on meta-analysis but not meta-analytic structural equation modeling (MASEM). In the following sections, we first add the background of a simple mediation model. Specifically, we present common effect sizes in a mediation analysis and discuss their pros and cons as effect sizes in a meta-analysis. We also introduce methods to estimate the sampling variances and covariances of the effect sizes. Then, we briefly review how a meta-analysis can be applied to these effect sizes. In the section after that, we present MASEM to synthesize the mediation analysis. We contrast the pros and cons of these two frameworks to integrating research findings in mediation models. After introducing the theoretical background, a real example is used to illustrate the procedures using the R statistical platform ( R Development Core Team, 2021). This paper closes with some further directions for future studies.
There are two general frameworks to synthesizing indirect effects ( Cheung and Cheung, 2016)—multivariate meta-analysis on the indirect effects, and MASEM. In the first framework, the indirect effect (and the direct effect) is first computed in each study and then meta-analyzed (see Aloe and Thompson, 2013; Becker and Wu, 2007; Gasparrini et al., 2012; Hafdahl, 2009 for some applications using other effect sizes). In contrast, the correlation matrices are analyzed in the first stage of MASEM. In the second stage of the analysis, a mediation model is fitted on the average correlation matrix (see Hagger et al., 2018; Zhang et al., 2021 for some applications). These two frameworks are presented below.
Under this framework, researchers either extract the reported indirect (and direct) effects or compute the indirect (and direct) effects from the correlation matrices in the primary studies. These effect sizes are used as inputs in subsequent meta-analyses.
Let us consider a simple mediation model displayed in Fig. 1. We ignore the variable names in the figures; these variables will be used in the illustrations later. The model includes a dependent variable Beh (y), a mediator Int (m) and a predictor PB (x). The indirect effect is estimated by the product term ab, whereas the direct effect is represented by the path coefficient c’. When the variables are continuous, the total effect c is the sum of the indirect and direct effects, i.e. c = ab + c’. This relationship may not hold if the mediator or dependent variable is either binary or categorical ( Mackinnon and Dwyer, 1993). The first step is to compute the effect size and its sampling variance in each study.
A mediation model with one mediator.
Several measures of indirect effects have been proposed and discussed in the literature (e.g. Alwin and Hauser, 1975; Sobel, 1982). Two such measures are the ratio of the indirect effect to the direct effect (ab/c’) and the ratio of the indirect effect to the total effect (ab/(ab + c’)). However, these two measures suffer from several limitations when applied in a meta-analysis. The value of the ratio of the indirect effect to the direct effect may vary from very large (positive) to very small (negative) if c’ is close to 0. The second measure was used to call the proportion of the indirect effect to the total effect. However, the so-called proportion can be larger than 1 when ab and c’ are in the opposite sign.
Preacher and Kelley (2011) provided a comprehensive treatment of various potential effect sizes in mediation analysis. It should be noted that their discussion mainly centered on primary studies. There are additional concerns when applying these effect sizes in a meta-analysis. They recommended two new indices of effect sizes—the residual-based index (Γ) and the maximum possible indirect effect (κ 2 ). Γ indicates the extent to which the variance in m is explained by x, and the variance in y is explained jointly by x and m. As there is no analytic method to obtain the sampling variance of Γ, Preacher and Kelley (2011) used a non-parametric bootstrap method to estimate its CI. Since raw data are required, and its applicability to a meta-analysis is unclear, we will not consider it further. Regarding κ 2 , it can be interpreted as the estimated indirect effect (ab) relative to the maximum possible indirect effect that can be obtained in the data. However, Wen and Fan (2015) found several mathematical difficulties in using κ 2 as an effect size in mediation analysis. These problems rule out the use of κ 2 in primary studies and meta-analyses.
One intuitive measure of effect size in a mediation analysis is simply the estimated indirect effect (ab), which is called the unstandardized indirect effect. The unstandardized indirect effect is scale-dependent because it is based on raw scores. Its value varies as a function of the variances of x and y. It is useful when all studies are in the same scale. However, it is usually not suitable for a meta-analysis in which studies may be in different scales. In the context of a meta-analysis, we may standardize the data and estimate the standardized indirect effect ( Cheung, 2009). We assume that the variables are standardized in the remaining article to avoid introducing extra notations for the unstandardized versus standardized parameters. Cheung (2009) shows that there are differences in estimating the sampling variance of the standardized indirect effect when the variances are treated as constants or variables. Specifically, the diagonals of a covariance matrix are random variables, whereas the diagonals of a correlation matrix are always fixed at one. We ignore the effect of standardization, assuming that the sample sizes in the primary studies are reasonably large. When the sample sizes are too small, readers may refer to Cheung (2009) on how to correct them.
The standardized indirect effect is represented here by yab. Several formulas have been suggested to estimate the sampling variance (Vab) of the standardized indirect effect (see Cheung, 2009). The sampling variance of yab is
where Va and Vb are the sampling variances of a and b, respectively, and Cab is the sampling covariance of a and b. Because the last term, 2abCab, is relatively small compared with the other terms, some authors (e.g. Sobel, 1982) have dropped it from their calculations. As it is straightforward to keep this term in a computer program, we do not simplify this term in our analyses. The Appendix provides detailed derivations on how to compute the sampling variance–covariance matrices using the delta method.
In real applications, mediation models tend to be much more complicated than those with one mediator (e.g. Cheung, 2007; Taylor et al., 2008). Figure 2 shows a parallel mediation model with two specific mediators. The indirect effects via m1 Cap and m2 Aut are ab and cd, respectively. Their sampling variances are the same as that in Equation (1), whereas their sampling covariance is slightly more complicated (see Supplementary Materials 1 ). Researchers may conduct a multivariate meta-analysis with the two indirect effects (and the direct effect) as the effect sizes.
A mediation model with two parallel mediators.
Figure 3 shows a serial mediation model with two intermediate mediators. The indirect effect via the two intermediate mediators is yabc = abc, whereas its sampling variance can be estimated by
$$\beginA mediation model with two serial mediators.
As illustrated in the above examples, it is challenging to define effect sizes in complex mediation models. Let us consider the serial mediation model in Fig. 3 as an example. The product term abc is the indirect effect via m1 Cap and m2 Int. There are other potential definitions of indirect effects, such as ae via m1 and dc via m2. It is possible to include all of them as effect sizes: abc, ae and dc. One practical issue is that these effect sizes will be highly correlated because they share some parameters. When these effect sizes are meta-analyzed, researchers should apply a multivariate meta-analysis, which takes the dependence of the effect sizes into account.
Another issue is that it becomes more and more challenging to meet the conditional normal distribution when more and more product terms are involved. For example, it is more difficult for yabc in Fig. 3 to achieve a conditional normal distribution than it is for yab in Fig. 1, given the same condition. Taylor et al. (2008) found that the standard errors worked fine unless two or three of the paths in abc were zero, in which the bias could be overestimated by 20–30%. Readers should be cautious when two or more paths are close to zero. Since the use of the delta method is tedious and prone to human error, in Supplementary Materials 1 , we show how numerical and symbolic computations in R using the metaSEM ( Cheung, 2015a) and symSEM ( Cheung, 2020) packages can be used to do the calculations.
Once we have computed the effect sizes and their sampling variances and covariances, we may conduct a standard meta-analysis. Since the fixed-effect (or common-effect) model is a special case of the random-effects model ( Borenstein et al., 2010), we will only focus on the random- and mixed-effects meta-analyses here. The general random-effects meta-analytic model in the ith study is
$$\beginwhere yabi is the standardized indirect effect, β0 is the average population indirect effect, τ 2 = Var(ui) is the heterogeneity variance of the random effects and Vi = Var(ei) is the known sampling variance of the indirect effect. Several estimation methods, including the methods of moments, maximum likelihood (ML) and restricted ML, can be used to estimate the parameters and make inferences on the parameter estimates (e.g. Viechtbauer, 2005). Both the absolute heterogeneity index τ 2 and the relative heterogeneity index I 2 can be used to quantify the degree of heterogeneity ( Borenstein et al., 2017).
There are a few remarks worth noting here. As shown in Equation ( 3), there are two sources of ‘variances’ in a meta-analytic model—τ 2 and Vi. Vi is assumed approximately normally distributed with a known variance so that we may estimate the heterogeneity τ 2 . It is well known in the literature on mediation analysis that the estimated indirect effect is only approximately normally distributed when the sample sizes are huge ( MacKinnon et al., 2002). Bootstrap and likelihood-based (LB) CIs are generally recommended to construct CIs for hypothesis testing. Thus, the formulas for the sampling variances of indirect effects should only be applied when the sampling sizes are reasonably large.
Second, a small or zero indirect effect does not necessarily mean that there is no effect from the predictor x to the outcome variable y; there may still be a substantial direct effect c’. Meta-analyzing indirect effects alone may cause one to overlook the critical point of direct effects. Cheung and Cheung (2016) recommended conducting a multivariate meta-analysis by including both indirect and direct effects. Researchers may then get a comprehensive picture of both the direct and indirect effects, and the total effect. It will also be possible to estimate the heterogeneity variances and the covariance of the indirect and direct effects.
Suppose that the degree of heterogeneity is non-trivial; we may want to explain the heterogeneity by conducting a mixed-effects meta-analysis. The model with a study-level moderator (xi) in the ith study is
$$\beginwhere β0 is the population indirect effect when xi is 0, and τ 2 = Var(ui) is the residual heterogeneity variance of the random effects after controlling for xi. Besides testing the moderator’s statistical significance, we may also calculate an R 2 index that indicates the percentage of the heterogeneity variance that can be explained by the moderator ( Raudenbush, 2009). The above mixed-effects meta-analysis can easily be extended to a multivariate mixed-effects meta-analysis with several indirect effects and direct effects (e.g. Jackson et al., 2011; Cheung, 2013).
Interpreting β0 and β1 in Equation ( 4) would seem be a straightforward process. However, there is more than one interpretation of β1, which deserves some discussion. β1 can be interpreted as the expected change in the indirect effect when xi increases by 1 unit. That is,
$$\beginThe critical point in the above interpretation is to treat the indirect effect as a single entity in each study. We may rewrite the above equation in two different, but equivalent, ways
In the first equation in Equation (6), β1 indicates the expected change in ai when xi/bi increases by 1 unit. In other words, β1 is the expected change in ai when xi increases by 1 unit ‘given bi is 1’. The same interpretations apply to the second equation. When we interpret one parameter, other parameters have to be conditioned on some values. Researchers may choose the interpretations that best fit their research questions. The interpretations will become even more complicated when the indirect effect is calculated from a serial mediation model yabc = abc.
Another framework to synthesizing mediation models is MASEM. There are several approaches to conducting MASEM, including the univariate-r approach ( Viswesvaran and Ones, 1995), generalized least squares ( Becker, 1992, 1995), two-stage structural equation modeling (TSSEM; Cheung, 2014; Cheung and Chan, 2005a) and one-stage MASEM (OSMASEM; Jak and Cheung, 2020) (see Cheung, 2015b, 2019; Cheung and Hafdahl, 2016 for discussions on their similarities and differences). In this paper, we will focus on the TSSEM and OSMASEM approaches.
In the first stage of the analysis, the correlation matrices are meta-analyzed via a multivariate approach by taking the sampling variance–covariance matrices of correlations into account. To fix the notations, suppose that Ri is a sample correlation matrix in the ith study. For ease of presentation, we transform it into a vector of correlation coefficients ri via a vechs() operator, which takes the lower triangle elements without the diagonals by a column-order major, that is ri = vechs(Ri). For example, if Ri is a 4 × 4 correlation matrix, ri is a column vector with 6 = 4 × (4–1)/2 correlation coefficients. If some studies do not report the full correlation matrix, the elements in ri are reduced accordingly. The model for a multivariate meta-analysis is
$$\beginwhere ρ is the vector of average correlations adjusting for the number of correlations in the model, Tρ 2 = Var(ui) is the heterogeneity matrix of the random effects and Vi = Var(ei) is the known sampling covariance matrix of the sampling errors ( Olkin and Siotani, 1976). The variance–covariance matrices of the sampling errors Vi depend on the sample sizes in the studies. In contrast, the heterogeneity matrix Tρ 2 represents the true variability of the study-specific correlation coefficients in the population.
There are two possible sources of missing data in Ri. Some participants may not provide data in a few variables in the primary studies. Researchers may use methods, such as list-wise deletion or pairwise deletion, to handle the missing data at the participant level. Unfortunately, meta-analysts cannot address this type of missing value as raw data are unlikely available in a meta-analysis. MASEM (and SEM) expect the sample size is the same in the same study. The second type of missing value is missing variables. As different researchers conduct the primary studies, it is likely that not all variables are measured in the studies. The full information maximum likelihood (FIML) estimation, which is unbiased and efficient when missing data are either missing completely at random or missing at random ( Enders, 2010), is used in the TSSEM and OSMASEM to estimate the parameters.
After the stage-one analysis, we have an average correlation vector |$\hat$| , its asymptotic sampling covariance matrix |$<\hat>_$| , which indicates the precision of the estimates, and a heterogeneity matrix |$<\hat
The fit function tries to find the parameter estimates that minimize the discrepancy between our data r and the proposed model ρ(θ) by taking the precision matrix Vr into account. Likelihood-ratio test statistics and various goodness-of-fit indices may be used to test the exact and approximate fit of the proposed model. The parameter estimates divided by their standard errors (SEs) approximately follows a standard normal distribution.
The above procedure can be extended to handle categorical moderators. Essentially, studies are grouped according to the categorical moderators. Separate stage-one and stage-two analyses are applied to studies with a multiple-group SEM approach (e.g. Cheung and Chan, 2005b ; Jak and Cheung, 2018).
The TSSEM approach works well when there is no moderator or when there are categorical moderators. When the primary research questions are related to some continuous moderators, the TSSEM approach may not work well. One way of addressing this problem is to categorize the continuous moderator into several categories. However, categorizing continuous moderators may lead to problems, such as loss of information about individual differences, loss of power, the occurrence of spurious significant main effects or interactions, and the risk of overlooking nonlinear effects ( MacCallum et al., 2002). Jak and Cheung (2020) proposed a novel approach to test categorical and continuous moderators in MASEM. The model of OSMASEM in the ith study is
$$\beginThe model means that the sample correlation matrix ri is a function of the proposed correlation structure ρ(θi) after taking the heterogeneity variance Tρ 2 = Var(ui) and known sampling error Vi = Var(ei) into account. The FIML estimation method is used to conduct statistical inferences. Thus, it shares the same benefits as TSSEM in handling missing data.
There are some critical differences between the TSSEM and OSMASEM approaches. First, the proposed correlation structure ρ(θi) is included in Equation (9). Instead of using two stages of analysis in the TSSEM approach, OSMASEM only uses one analysis stage. The proposed structural equation models are fitted directly on the correlation matrices without estimating an average correlation matrix. Both the parameter θ in the structural equation models and the heterogeneity variances Tρ 2 are estimated simultaneously. Second, there is a subscript i in ρ(θi), meaning that the correlation structure may vary according to the moderators in the ith study. Both categorical and continuous moderators can be added to the proposed structural equation models ρ(θi). For a simple mediation model shown in Fig. 1, we may fit the model with a moderator xi in the ith study as
These equations state that the parameters, ai, bi and ci are modeled by the moderator xi. Researchers may decide which parameters are modeled. By comparing the models with and without the moderator, we may test the statistical significance of the moderator.
Cheung and Cheung (2016) provided detailed comparisons between these two frameworks to conducting MASEM. We will only summarize a few key points here. Since the indirect effects and possibly the direct effects are used as the effect sizes in the meta-analysis in the first framework, sufficient statistics are required to compute the direct and indirect effects in each study. Therefore, studies can be included in the meta-analysis when either the indirect or direct effects or the correlation matrices are reported in the primary studies. On the other hand, the TSSEM or OSMASEM approaches use correlation matrices as the inputs. All studies can be included in MASEM as long as the studies have reported some elements in the correlation matrix, e.g. x and m, x and y, or m and y. Missing correlations are handled with the FIML estimation method.
The meta-analysis on the indirect effects framework requires the indirect effects to be defined before the meta-analysis. As illustrated in Figs 2 and 3, there may be more than one way to define the indirect effects in complex mediation models. Different definitions may lead to different results and interpretations. Extra care has to be exercised in making judgment calls ( Aguinis et al., 2011; Aytug et al., 2012). Moreover, some studies may use models of a single mediator, parallel or serial mediators. This makes it challenging to properly meta-analyze the indirect effects. In contrast, under the MASEM framework, indirect effects are only defined when fitting the structural equation models in the stage-two analysis. It is relatively easy to explore different definitions of indirect effects in the second approach.
Another issue relates to the inclusion of moderators. As the indirect effect is a product term, there is more than one interpretation in the regression coefficient from the predictor to the indirect effect, as shown in Equations (5) and (6). On the other hand, there are no such issues in MASEM. Supposing that the model in Equation (10) in MASEM is correctly specified, the relationship between the moderator and the indirect effect would indeed be a quadratic rather than a linear one
In other words, the linear model in Equation (5) may not fully capture the quadratic relationship in Equation (11).
Another essential difference between these two frameworks is their empirical performance. In a recent simulation, van Zundert and Miočević (2020) compared six methods, including the meta-analysis on the indirect effects and MASEM, to synthesize indirect effects. They found that MASEM had the best performance among these methods in terms of bias, precision and coverage. On the other hand, a meta-analysis on the indirect effects tended to underestimate the parameters and coverage. This is expected because the sampling distributions of correlation coefficients in MASEM are closer to normal distributions than those in the indirect effects. In sum, MASEM (TSSEM and OSMASEM) are more flexible and have better statistical performance. The main limitation of MASEM, however, is that it requires correlation matrices as inputs. Researchers should use MASEM unless correlation matrices are not available.
We use a meta-analysis reported by Hagger et al. (2018), which was based on McEachan et al. (2016), to demonstrate how to synthesize indirect effects. Hagger et al. (2018) conducted a MASEM on the RAA model using various health behaviors, including alcohol drinking as the outcome variable. The data set has 81 correlation matrices of variables. The full model includes past behavior (PB) as a predictor with several multiple mediators, such as experiential attitude, instrumental attitude, injunctive norm, descriptive norm, autonomy (Aut), and capacity (Cap), one serial mediator (intention; Int) and one outcome variable (behavior; Beh). We selected PB, Aut, Cap and Beh to illustrate procedures introduced in the paper. We also selected behavior frequency (high versus low), how often the behaviors are measured, as a potential study-level moderator. It should be noted that the variables selected were only a small subset of a larger model in the original study. Readers should refer to Hagger et al. (2018) for the substantive interpretations of the RAA model. All of the analyses were conducted in R ( R Development Core Team, 2021) with the metaSEM package ( Cheung, 2015a). They are available in Supplementary Materials 2 , while the updated analyses can be accessed in Github at https://github.com/mikewlcheung/code-in-articles. The statistical inferences were based on 0.05 significance level.
Let us first illustrate how to synthesize the indirect effects with one mediator. Figure 1 displays a model with PB as the predictor, Int as the mediator and BEH as the outcome variable. We first apply the meta-analysis on the indirect effects and then the MASEM. The MASEM approach allows missing data in the correlation matrices, whereas the meta-analysis on the indirect effect approach does not. We only used 29 studies without any missing correlation for the ease of comparisons. Table 1 shows the results of both approaches.
Results of the one-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Indirect effect (ab) | 0.145 | 0.105 | 0.185 | 0.010 | 0.904 |
Direct effect (c’) | 0.428 | 0.347 | 0.508 | 0.043 | 0.935 |
Meta-analytic structural equation modeling | |||||
PB → Int (a) | 0.537 | 0.471 | 0.604 | ||
Int → Beh (b) | 0.273 | 0.156 | 0.384 | ||
PB → Beh (c’) | 0.380 | 0.260 | 0.499 | ||
Indirect effect (ab) | 0.146 | 0.086 | 0.210 | ||
Direct effect (c’) | 0.380 | 0.260 | 0.499 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Indirect effect (ab) | 0.145 | 0.105 | 0.185 | 0.010 | 0.904 |
Direct effect (c’) | 0.428 | 0.347 | 0.508 | 0.043 | 0.935 |
Meta-analytic structural equation modeling | |||||
PB → Int (a) | 0.537 | 0.471 | 0.604 | ||
Int → Beh (b) | 0.273 | 0.156 | 0.384 | ||
PB → Beh (c’) | 0.380 | 0.260 | 0.499 | ||
Indirect effect (ab) | 0.146 | 0.086 | 0.210 | ||
Direct effect (c’) | 0.380 | 0.260 | 0.499 |
Note. X → Y represents the path coefficient from variable X to variable Y. Lower bound and upper bound represent the 95% CI.
Results of the one-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Indirect effect (ab) | 0.145 | 0.105 | 0.185 | 0.010 | 0.904 |
Direct effect (c’) | 0.428 | 0.347 | 0.508 | 0.043 | 0.935 |
Meta-analytic structural equation modeling | |||||
PB → Int (a) | 0.537 | 0.471 | 0.604 | ||
Int → Beh (b) | 0.273 | 0.156 | 0.384 | ||
PB → Beh (c’) | 0.380 | 0.260 | 0.499 | ||
Indirect effect (ab) | 0.146 | 0.086 | 0.210 | ||
Direct effect (c’) | 0.380 | 0.260 | 0.499 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Indirect effect (ab) | 0.145 | 0.105 | 0.185 | 0.010 | 0.904 |
Direct effect (c’) | 0.428 | 0.347 | 0.508 | 0.043 | 0.935 |
Meta-analytic structural equation modeling | |||||
PB → Int (a) | 0.537 | 0.471 | 0.604 | ||
Int → Beh (b) | 0.273 | 0.156 | 0.384 | ||
PB → Beh (c’) | 0.380 | 0.260 | 0.499 | ||
Indirect effect (ab) | 0.146 | 0.086 | 0.210 | ||
Direct effect (c’) | 0.380 | 0.260 | 0.499 |
Note. X → Y represents the path coefficient from variable X to variable Y. Lower bound and upper bound represent the 95% CI.
We conducted a multivariate random-effects meta-analysis on the indirect and direct effects. The average indirect effect was moderate (0.145) and statistically significant, whereas the average direct effect was even much stronger (0.428) and statistically significant. The population effect sizes seem to be heterogeneous, with χ 2 (df = 56) = 1031.86, P < 0.001. The degree of heterogeneity of the indirect effect (τ 2 = 0.010, I 2 = 0.904) and direct effect (τ 2 = 0.043, I 2 = 0.935) was substantial. The considerable degree of heterogeneity suggests that the effect varies across studies. We may include study-level moderators to explain the heterogeneity across studies.
We further included the behavior frequency (high versus low) as a moderator to predict the indirect and direct effects. The estimated slopes and their 95% Wald CIs on the indirect and direct effects were 0.005 (−0.115, 0.126) and −0.038 (−0.266, 0.190), respectively. Both R 2 s were small (0.005 and 0.002); the likelihood-ratio test in testing both regression coefficients was statistically non-significant, χ 2 (df = 2) = 0.119, P = 0.942. Thus, there was not enough evidence to support the claim that the behavior frequency predicts the indirect and direct effects. Although the slope on the indirect effect is not statistically significant, we still illustrate how to interpret it for the sake of pedagogical purposes. The value of 0.005 can be interpreted as (a) the expected change on a * b when the behavior frequency is changed from low to high; (b) the expected change on a when the behavior frequency is changed from low to high, assuming b is one; and (c) the expected change on b when the behavior frequency is changed from low to high, assuming a is one.
Regarding the TSSEM, we estimated an average correlation matrix with a random-effects model in the first stage of analysis. The average correlations of the pairs Int_Beh, PB_Beh and PB_Int were 0.477, 0.526 and 0.537, respectively. The population correlation matrices were heterogeneous with χ 2 (df = 84) = 1199.94, P < 0.001. The heterogeneity variances τ 2 (and their I 2 ) of the pairs Int_Beh, PB_Beh and PB_Int were 0.033 (0.940), 0.041 (0.959) and 0.030 (0.948), respectively. The large degree of heterogeneity in the correlation matrices suggests that moderators may be used to explain why some studies have larger correlations than others. We will elaborate on the findings later.
In the stage-two analysis, we fitted the mediation model on the average correlation matrix. Since the mediation model was just-identified, there was no test statistic on its model fit. The estimated path coefficients (and their 95% LBCIs) on a, b and c’ were 0.537 (0.471, 0.604), 0.273 (0.156, 0.384) and 0.380 (0.260, 0.499), respectively. Figure 4 displays the model with the parameter estimates. The estimated indirect and direct effects (and their 95% LBCIs) were 0.146 (0.086, 0.210) and 0.380 (0.260, 0.499), respectively. All the estimated coefficients were statistically significant. The average indirect and direct effects are comparable to those in the meta-analysis of the indirect and direct effects.
A mediation model with one mediator in the illustration.
The next step was to test the moderating effect of using the behavior frequency in predicting the regression coefficients a, b and c’ using the OSMASEM. The overall comparison between the models with and without the moderator was statistically non-significant, at χ 2 (df = 3) = 0.615, P = 0.893. The estimated regression coefficients on predicting a, b and c’ were −0.056, −0.038 and 0.012, respectively. All of them were statistically non-significant. Therefore, there was not enough evidence to support the claim that the behavior frequency predicts the mediation model’s path coefficients.
Next, we illustrate how to synthesize the findings for a more complex mediation model shown in Fig. 2. In this model, Cap and Aut are two parallel mediators explaining the effect from PB to Beh. There were 13 studies with a total of 3958 participants. The findings of the meta-analysis on the indirect effects and MASEM are shown in Table 2.
Results of the two multiple-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_Cap (ab) | 0.069 | 0.034 | 0.104 | 0.003 | 0.827 |
Ind_Aut (cd) | 0.005 | -0.001 | 0.011 | 0.000 | 0.000 |
Dir_PB (e) | 0.244 | 0.130 | 0.360 | 0.038 | 0.930 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.435 | 0.362 | 0.508 | ||
Cap→Beh (b) | 0.176 | 0.033 | 0.319 | ||
PB → Aut (c) | 0.208 | 0.084 | 0.331 | ||
Aut → Beh (d) | -0.065 | -0.233 | 0.093 | ||
PB → Beh (e) | 0.574 | 0.466 | 0.685 | ||
Cap↔Aut (f) | 0.264 | 0.117 | 0.410 | ||
Ind_Cap (ab) | 0.077 | 0.015 | 0.139 | ||
Ind_Aut (cd) | -0.013 | -0.067 | 0.016 | ||
Dir_PB (e) | 0.264 | 0.117 | 0.410 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_Cap (ab) | 0.069 | 0.034 | 0.104 | 0.003 | 0.827 |
Ind_Aut (cd) | 0.005 | -0.001 | 0.011 | 0.000 | 0.000 |
Dir_PB (e) | 0.244 | 0.130 | 0.360 | 0.038 | 0.930 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.435 | 0.362 | 0.508 | ||
Cap→Beh (b) | 0.176 | 0.033 | 0.319 | ||
PB → Aut (c) | 0.208 | 0.084 | 0.331 | ||
Aut → Beh (d) | -0.065 | -0.233 | 0.093 | ||
PB → Beh (e) | 0.574 | 0.466 | 0.685 | ||
Cap↔Aut (f) | 0.264 | 0.117 | 0.410 | ||
Ind_Cap (ab) | 0.077 | 0.015 | 0.139 | ||
Ind_Aut (cd) | -0.013 | -0.067 | 0.016 | ||
Dir_PB (e) | 0.264 | 0.117 | 0.410 |
Note. Ind_X is the indirect effect via variable X. X → Y represents the path coefficient from variable X to variable Y. X↔Y represents the correlation between variables X and Y. Lower bound and upper bound represent the 95% CI.
Results of the two multiple-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_Cap (ab) | 0.069 | 0.034 | 0.104 | 0.003 | 0.827 |
Ind_Aut (cd) | 0.005 | -0.001 | 0.011 | 0.000 | 0.000 |
Dir_PB (e) | 0.244 | 0.130 | 0.360 | 0.038 | 0.930 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.435 | 0.362 | 0.508 | ||
Cap→Beh (b) | 0.176 | 0.033 | 0.319 | ||
PB → Aut (c) | 0.208 | 0.084 | 0.331 | ||
Aut → Beh (d) | -0.065 | -0.233 | 0.093 | ||
PB → Beh (e) | 0.574 | 0.466 | 0.685 | ||
Cap↔Aut (f) | 0.264 | 0.117 | 0.410 | ||
Ind_Cap (ab) | 0.077 | 0.015 | 0.139 | ||
Ind_Aut (cd) | -0.013 | -0.067 | 0.016 | ||
Dir_PB (e) | 0.264 | 0.117 | 0.410 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_Cap (ab) | 0.069 | 0.034 | 0.104 | 0.003 | 0.827 |
Ind_Aut (cd) | 0.005 | -0.001 | 0.011 | 0.000 | 0.000 |
Dir_PB (e) | 0.244 | 0.130 | 0.360 | 0.038 | 0.930 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.435 | 0.362 | 0.508 | ||
Cap→Beh (b) | 0.176 | 0.033 | 0.319 | ||
PB → Aut (c) | 0.208 | 0.084 | 0.331 | ||
Aut → Beh (d) | -0.065 | -0.233 | 0.093 | ||
PB → Beh (e) | 0.574 | 0.466 | 0.685 | ||
Cap↔Aut (f) | 0.264 | 0.117 | 0.410 | ||
Ind_Cap (ab) | 0.077 | 0.015 | 0.139 | ||
Ind_Aut (cd) | -0.013 | -0.067 | 0.016 | ||
Dir_PB (e) | 0.264 | 0.117 | 0.410 |
Note. Ind_X is the indirect effect via variable X. X → Y represents the path coefficient from variable X to variable Y. X↔Y represents the correlation between variables X and Y. Lower bound and upper bound represent the 95% CI.
Let us first consider the meta-analysis on the indirect effects. The average parallel indirect effects via Cap (ab) and Aut (cd) were 0.069 and 0.005, respectively, whereas the average direct effect from PB (e) was 0.244. All the estimated effects were statistically significant except for the indirect effect via Aut. As a whole, these results suggest that the indirect effects are weak, whereas the remaining direct effect from PB is very strong. Regarding their heterogeneity, the heterogeneity variances τ 2 (and I 2 ) of the indirect effects via Cap, Aut and the direct effect from PB were 0.003 (0.827), 0.000 (0.000) and 0.038 (0.930), respectively. The findings suggest that the indirect effect via Aut is very stable across studies, whereas the indirect effect via Cap and the direct effect vary across studies. Since there are only 13 studies in this meta-analysis, future studies may need to verify these results.
Next, we report the results of the TSSEM approach. Since the mediation model was just-identified, there were no fit statistics on the model fit. Figure 5 displays the model with the parameter estimates. The average parallel indirect effects via Cap (ab) and Aut (cd) and the direct effect from PB (e) were 0.077, −0.013 and 0.264, respectively. The results are similar to that of the meta-analysis on the indirect and direct effects except that the estimated direct effect from PB is weaker.
A mediation model with two parallel mediators in the illustration.
Finally, we illustrate how to synthesize a mediation model with two serial mediators shown in Fig. 3. There are three indirect effects in this model—the serial indirect effect via Cap and Aut (abc), the indirect effects via Cap (ae), and via Int (dc). We may also include the direct effect from PB (f) in the meta-analysis. There were 20 studies with a total of 6207 participants. The findings of the meta-analysis on the indirect effects and MASEM are shown in Table 3.
Results of the two serial-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_CapInt (abc) | 0.024 | 0.013 | 0.035 | 0.000 | 0.865 |
Ind_Cap (ae) | 0.028 | 0.007 | 0.049 | 0.002 | 0.755 |
Ind_Int (dc) | 0.086 | 0.050 | 0.122 | 0.005 | 0.931 |
Dir_PB (f) | 0.411 | 0.336 | 0.487 | 0.024 | 0.883 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.383 | 0.311 | 0.456 | ||
Cap→Int (b) | 0.392 | 0.309 | 0.472 | ||
Int → Beh (c) | 0.216 | 0.042 | 0.381 | ||
PB → Int (d) | 0.411 | 0.327 | 0.493 | ||
Cap→Beh (e) | 0.093 | -0.038 | 0.220 | ||
PB → Beh (f) | 0.399 | 0.263 | 0.535 | ||
Ind_CapInt (abc) | 0.032 | 0.006 | 0.063 | ||
Ind_Cap (ae) | 0.036 | -0.015 | 0.083 | ||
Ind_Int (dc) | 0.089 | 0.018 | 0.160 | ||
Dir_PB (f) | 0.399 | 0.263 | 0.535 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_CapInt (abc) | 0.024 | 0.013 | 0.035 | 0.000 | 0.865 |
Ind_Cap (ae) | 0.028 | 0.007 | 0.049 | 0.002 | 0.755 |
Ind_Int (dc) | 0.086 | 0.050 | 0.122 | 0.005 | 0.931 |
Dir_PB (f) | 0.411 | 0.336 | 0.487 | 0.024 | 0.883 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.383 | 0.311 | 0.456 | ||
Cap→Int (b) | 0.392 | 0.309 | 0.472 | ||
Int → Beh (c) | 0.216 | 0.042 | 0.381 | ||
PB → Int (d) | 0.411 | 0.327 | 0.493 | ||
Cap→Beh (e) | 0.093 | -0.038 | 0.220 | ||
PB → Beh (f) | 0.399 | 0.263 | 0.535 | ||
Ind_CapInt (abc) | 0.032 | 0.006 | 0.063 | ||
Ind_Cap (ae) | 0.036 | -0.015 | 0.083 | ||
Ind_Int (dc) | 0.089 | 0.018 | 0.160 | ||
Dir_PB (f) | 0.399 | 0.263 | 0.535 |
Note. Ind_X is the indirect effect via variable X. Ind_XY is the indirect effect via variables X and Y. X → Y represents the path coefficient from variable X to variable Y. Lower bound and upper bound represent the 95% CI.
Results of the two serial-mediator model
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_CapInt (abc) | 0.024 | 0.013 | 0.035 | 0.000 | 0.865 |
Ind_Cap (ae) | 0.028 | 0.007 | 0.049 | 0.002 | 0.755 |
Ind_Int (dc) | 0.086 | 0.050 | 0.122 | 0.005 | 0.931 |
Dir_PB (f) | 0.411 | 0.336 | 0.487 | 0.024 | 0.883 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.383 | 0.311 | 0.456 | ||
Cap→Int (b) | 0.392 | 0.309 | 0.472 | ||
Int → Beh (c) | 0.216 | 0.042 | 0.381 | ||
PB → Int (d) | 0.411 | 0.327 | 0.493 | ||
Cap→Beh (e) | 0.093 | -0.038 | 0.220 | ||
PB → Beh (f) | 0.399 | 0.263 | 0.535 | ||
Ind_CapInt (abc) | 0.032 | 0.006 | 0.063 | ||
Ind_Cap (ae) | 0.036 | -0.015 | 0.083 | ||
Ind_Int (dc) | 0.089 | 0.018 | 0.160 | ||
Dir_PB (f) | 0.399 | 0.263 | 0.535 |
. | Estimate . | Lower bound . | Upper bound . | τ 2 . | I 2 . |
---|---|---|---|---|---|
Meta-analysis on the indirect effects | |||||
Ind_CapInt (abc) | 0.024 | 0.013 | 0.035 | 0.000 | 0.865 |
Ind_Cap (ae) | 0.028 | 0.007 | 0.049 | 0.002 | 0.755 |
Ind_Int (dc) | 0.086 | 0.050 | 0.122 | 0.005 | 0.931 |
Dir_PB (f) | 0.411 | 0.336 | 0.487 | 0.024 | 0.883 |
Meta-analytic structural equation modeling | |||||
PB → Cap (a) | 0.383 | 0.311 | 0.456 | ||
Cap→Int (b) | 0.392 | 0.309 | 0.472 | ||
Int → Beh (c) | 0.216 | 0.042 | 0.381 | ||
PB → Int (d) | 0.411 | 0.327 | 0.493 | ||
Cap→Beh (e) | 0.093 | -0.038 | 0.220 | ||
PB → Beh (f) | 0.399 | 0.263 | 0.535 | ||
Ind_CapInt (abc) | 0.032 | 0.006 | 0.063 | ||
Ind_Cap (ae) | 0.036 | -0.015 | 0.083 | ||
Ind_Int (dc) | 0.089 | 0.018 | 0.160 | ||
Dir_PB (f) | 0.399 | 0.263 | 0.535 |
Note. Ind_X is the indirect effect via variable X. Ind_XY is the indirect effect via variables X and Y. X → Y represents the path coefficient from variable X to variable Y. Lower bound and upper bound represent the 95% CI.
Let us first consider the meta-analysis on the indirect effects. The average indirect effects via Cap and Int (abc), Cap (ae), Int (dc) and direct effect from PB (f) were 0.024, 0.028, 0.086 and 0.411, respectively. All the estimated effects were statistically significant. Regarding their heterogeneity, the heterogeneity variances τ 2 (and I 2 ) varies from 0.000 (0.865) to 0.024 (0.931). These results suggest that the indirect effects are weak, whereas the remaining direct effect from PB is very strong.
Next, we report the results of the TSSEM approach. Since the mediation model was just-identified, there were no fit statistics on the model fit. Figure 6 displays the model with the parameter estimates. The average indirect effects via Cap and Int (abc), Cap (ae), Int (dc) and direct effect from PB (f) were 0.032, 0.036, 0.089 and 0.399, respectively. The results are similar to that of the meta-analysis on the indirect and direct effects.
A mediation model with two serial mediators in the illustration.
This article demonstrated how primary studies in mediation models could be synthesized via the meta-analysis on the indirect effects and MASEM. In the first framework, we showed how the standardized indirect effects and direct effects could be used as effect sizes in a multivariate meta-analysis. The delta method was used to compute the sampling variance–covariance matrix of indirect and direct effects. In the latter framework, MASEM (TSSEM and OSMASEM) were introduced to combine correlation matrices and to fit mediation models on the data. A real dataset from Hagger et al. (2018) was used to illustrate these procedures.
This paper also highlighted some theoretical and practical differences between these two frameworks. For example, the meta-analysis on the indirect effects requires the indirect (and direct) effects as the effect sizes, whereas MASEM uses the correlation matrices as the inputs. The heterogeneity of population effects is also defined differently under these two approaches. Another neglected difference is how the moderator is used under these two approaches. Since the indirect effect is defined as a product term of at least two parameters, there is more than one way to interpret the regression coefficient. More importantly, the linear relationship between the moderator and the regression coefficients in MASEM may become a quadratic relationship in the meta-analysis on the indirect effects. We hope that these approaches’ pros and cons will be explored in future studies to synthesize indirect effects.
These two frameworks are not without their own limitations. We have mainly focused on correlation matrices because studies are likely different in their measurements and response scales. If all the measurements and response scales are the same, the above procedures can be extended to covariance matrices (e.g. Cheung and Chan, 2009). A second issue is that this study only focuses on the association (correlation) among the variables. If the mediation models involve experimental manipulations, a meta-analysis on the mean differences may be a better option.
We have introduced two general frameworks to synthesize complex effects. There are other meta-analytic approaches (e.g. Liu et al., 2015; Jiao et al., 2020) that could also be used. It is of interest to explore them as well. All of these methods are based on summary statistics. When raw data are accessible to researchers, the individual patient data meta-analysis, also known as the integrative data analysis (e.g. Huh et al., 2015; Mun et al., 2015), is a favorable alternative. Before closing this paper, it is crucial to mention that all these methods assume that the published studies reasonably represent the pool of empirical findings. That is, there is no publication bias. The presence of publication bias may lead to a biased estimate of the effect. Researchers may refer to techniques to address publication bias in meta-analysis (e.g. Ferguson and Brannick, 2012; Rothstein and Bushman, 2012; Coburn and Vevea, 2015).
This work was supported by the Academic Research Fund Tier 1 (FY2017-FRC1-008) from the Ministry of Education, Singapore.