  • The use of -KHB- command in mediation analysis of logistic regressions

    Hi everyone,

    I am currently using the -khb- command in Stata to run a mediation analysis with logistic regressions. I have a binary DV, a binary IV, and groups of mediators that include categorical, continuous, and dichotomous variables. Stata produces output with this command, but there are several things I would like to understand better:

    1, How should I interpret the coefficients in the "components of difference" table? I know that they add up to the indirect effect, but why are some of them positive and others negative?

    2, For the variables whose coefficients are negative, "P_Reduced" is also negative. Doesn't "P_Reduced" represent the percentage of the total effect explained? How can it be negative? Does it imply a suppression effect?

    3, Searching this forum, I found that some people have tried to bootstrap the indirect effects to test whether they are significant, but the answers are not very clear. Can anyone show an example of using the bootstrap with the -khb- command and interpret the results?

    Thank you all in advance!
    Last edited by Vivian Luo; 10 Jan 2017, 19:17. Reason: khb

  • #2
    You did not get a quick response. You should read the FAQ on asking questions - providing code, Stata output, and maybe sample data (using dataex) increases your chances of a response.

    However, I suspect the problem in your case is that khb is a user written command and folks who might respond simply don't use it. No one is likely to read a paper to learn a technique and then try to puzzle through a new software routine to help you. You probably should contact the authors of the routine directly.



    • #3
      Phil pointed out the most important issues. Let me stress once more that khb is a user-written command, so please state where it comes from, so we can be sure to talk about the same thing. Another, perhaps minor issue, is the preference for full real names here on Statalist.

      Concerning the questions, you have already answered 1 and 2 yourself. Negative values occur if the coefficient in the "full" model is larger than the one in the reduced model. You can call this a suppression effect.
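
      A worked example may help with the signs of the individual components. This is a sketch of the standard KHB decomposition, not output from khb. The symbols and all numbers are illustrative assumptions, and P_Reduced is assumed to be the component expressed as a percentage of the reduced-model coefficient, as the question describes it.

      ```latex
      % beta_R : coefficient of X in the (rescaled) reduced model
      % beta_F : coefficient of X in the full model
      % gamma_j: coefficient of mediator z_j in the full model
      % theta_j: slope of X in a linear regression of z_j on X (and any controls)
      \[
        \underbrace{\beta_R - \beta_F}_{\text{Diff}} \;=\; \sum_{j} \gamma_j \theta_j ,
        \qquad
        \text{P\_Reduced}_j \;=\; 100 \cdot \frac{\gamma_j \theta_j}{\beta_R}
      \]
      % Hypothetical numbers: beta_R = 0.50, gamma_1 theta_1 = 0.15, gamma_2 theta_2 = -0.05
      \[
        \text{Diff} = 0.15 + (-0.05) = 0.10, \qquad \beta_F = 0.50 - 0.10 = 0.40
      \]
      \[
        \text{P\_Reduced}_1 = 100 \cdot \frac{0.15}{0.50} = 30\%, \qquad
        \text{P\_Reduced}_2 = 100 \cdot \frac{-0.05}{0.50} = -10\%
      \]
      ```

      A component is negative whenever $\gamma_j$ and $\theta_j$ have opposite signs. In the hypothetical numbers above, the second mediator pushes the coefficient of X up rather than down once it enters the model, which is what a suppression effect looks like for a single mediator.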

      As for the bootstrap, I do not believe that khb directly supports this. Otherwise it would be available as a vce-type. Of course nobody stops you from using the bootstrap prefix with khb. Here is an example (using khb *! version 2.11 September 4, 2015 @ 15:57:12 from SSC)

      Code:
      sysuse nlsw88
      
      bootstrap _b[collgrad:Diff] : khb logit union collgrad || hours wage
      Best
      Daniel
      Last edited by daniel klein; 12 Jan 2017, 23:47.
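
      Since the original question also asked how to read the bootstrap results, here is a minimal sketch that extends the example above. It assumes khb is installed from SSC and that the code is run from a do-file (because of the /// line continuation). The number of replications and the seed are arbitrary illustrative choices, not recommendations from this thread.

      ```stata
      * Assumes: ssc install khb has been run beforehand.
      sysuse nlsw88, clear

      * Re-run the whole khb procedure on each bootstrap sample and collect
      * the indirect effect (Reduced minus Full) of collgrad each time.
      bootstrap _b[collgrad:Diff], reps(500) seed(12345): ///
          khb logit union collgrad || hours wage

      * Show a percentile confidence interval based on the bootstrap draws.
      estat bootstrap, percentile
      ```

      The first table reports the observed difference with a bootstrap standard error and a normal-approximation interval; estat bootstrap adds the percentile interval. If the interval excludes zero, the indirect effect of collgrad through hours and wage is distinguishable from zero at that confidence level. The estimate is on the logit coefficient scale.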



      • #4
        Hi Phil,

        Thank you so much for your response! I have read the FAQs and I would like to apologize for being unclear in my original post. Here is more information:

        1, Stata code. I referred to the code provided in this article:

        Kohler, U., Karlson, K.B., Holm, A. (2011). Comparing Coefficients of Nested Nonlinear Probability Models. The Stata Journal 11:420-438.

        Specifically, an example of my code is:
        Code:
        khb logit Y X || M1 M2 M3, concomitant(C1 C2 C3) summary disentangle
        bootstrap _b[Diff]: khb logit Y X || M1 M2 M3
        2, Stata output. Here is partial output:
        [Image: output.png]



        Thanks again for responding!

        Best,
        Vivian



        • #5
          Hi Daniel,

          I have emailed the site administrator about changing my name, and I hope it will be updated soon.

          Your answers to my questions are super helpful! And I am very grateful for your insights! Now I am wondering if you can help me with a few further questions.

          1, I tried to calculate bootstrapped confidence interval using both the prefix and the vce option. However, the results are different. Please see the following codes:

          Code:
          khb logit Y X || incomeCat2 incomeCat3 incomeCat4, summary disentangle vce(bootstrap, reps(50) seed(1234))
          Code:
          bootstrap _b[Diff], reps(50) seed(1234): khb logit Y X || incomeCat2 incomeCat3 incomeCat4
          The result for the first command is:
          [Image: 1.png]


          The result for the second command is:
          [Image: 2.png]


          Why are the CIs of the indirect effect different? Moreover, I did not find a way to specify control variables when using bootstrap prefix. So now I am using the vce option which follows the main command where control variables are specified by "concomitant" option, and I hope the CI produced by this option is as accurate as that produced by the bootstrap prefix.

          2, I am running the KHB command on different samples, and I wonder whether the coefficients for the full and indirect effects are comparable across samples. If not, how can I compare them?

          3, It seems that for each model only one confidence interval for the indirect effect is calculated. However, what if I want to know the confidence interval for each group of mediators? Do I have to include each group separately in the model? I hope to know whether the effect of a group of mediators is significant in the full model where all mediators and IVs are included.

          4, Finally, a small question: as I see only one "Number of Observations" is reported in the result, does KHB make sure that the samples used before and after including mediators are the same?

          Thank you very much! I will try to contact the authors as well since it is a user written command.

          Best,
          Vivian



          • #6
            1, I tried to calculate bootstrapped confidence interval using both the prefix and the vce option. However, the results are different. Please see the following codes:

            Code:
            khb logit Y X || incomeCat2 incomeCat3 incomeCat4, summary disentangle vce(bootstrap, reps(50) seed(1234))
            Code:
            bootstrap _b[Diff], reps(50) seed(1234): khb logit Y X || incomeCat2 incomeCat3 incomeCat4
            [...]

            Why are the CIs of the indirect effect different?
            Be careful here. Your first code instructs logit to calculate bootstrap standard errors, not khb. As mentioned, khb does not support bootstrap standard errors. The output you get is arguably misleading since the full and reduced model implement bootstrap standard errors, as indicated in the table, but I do not think that the estimation of the difference (the indirect effect) does. You should ask Uli Kohler (author of khb) for clarification.

            Moreover, I did not find a way to specify control variables when using bootstrap prefix.
            This is a bug in khb. The code that is supposed to deal with the bootstrap prefix messes up the options. I could give you an ad-hoc fix but it is better to report the problem to Uli Kohler.

            I cannot give much useful advice on 2. and 3.

            4, Finally, a small question: as I see only one "Number of Observations" is reported in the result, does KHB make sure that the samples used before and after including mediators are the same?
            Yes.

            Best
            Daniel



            • #7
              Hi Daniel,

              Thank you so much for your prompt reply! I appreciate your kind help! I will contact Uli Kohler for further clarification. Thank you!

              Best,
              Vivian
              Last edited by Vivian Luo; 19 Jan 2017, 13:03.



              • #8
                Originally posted by Vivian Luo

                ...

                1, I tried to calculate bootstrapped confidence interval using both the prefix and the vce option. However, the results are different. Please see the following codes:

                Code:
                khb logit Y X || incomeCat2 incomeCat3 incomeCat4, summary disentangle vce(bootstrap, reps(50) seed(1234))
                Code:
                bootstrap _b[Diff], reps(50) seed(1234): khb logit Y X || incomeCat2 incomeCat3 incomeCat4
                Why are the CIs of the indirect effect different?
                The standard error of the difference is calculated from the variance-covariance matrix of the full model and the scale parameter. If the variance-covariance matrix is estimated with a bootstrap, the standard error of the difference shown by -khb- "sort of" inherits that feature, but the standard error of the difference is not itself bootstrapped. Whether it is correct to call the SE of the difference that -khb- shows "bootstrapped" depends on whether the technique used to derive the SE from the bootstrapped VCE invokes additional assumptions. Admittedly, I must leave this debate to the statisticians. The SE is derived using Sobel's delta method; see our paper on khb in SJ11-3.

                I'm willing to put any other column title here if someone makes a good proposal. It should be noted, though, that the SEs of the reduced and full models are bootstrapped in a strict sense and only the SE of the difference is bootstrapped in a loose sense.
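
                To make the distinction concrete, the two quantities can be written side by side. This is only a sketch: it uses the textbook first-order Sobel formula for a single mediator, ignoring any covariance between the two estimates, whereas the expression khb actually uses is the more general one described in the SJ11-3 paper. The symbols are assumptions introduced here for illustration.

                ```latex
                % gamma: coefficient of the mediator in the full model
                % theta: slope of X in the linear regression of the mediator on X
                % Delta-method (Sobel) standard error of the indirect effect delta = gamma * theta:
                \[
                  \widehat{\mathrm{SE}}_{\text{delta}}(\hat\delta)
                  \;\approx\;
                  \sqrt{\hat\gamma^{2}\,\widehat{\mathrm{Var}}(\hat\theta)
                        + \hat\theta^{2}\,\widehat{\mathrm{Var}}(\hat\gamma)}
                \]
                % Bootstrap standard error: re-estimate delta on each of B resamples
                % and take the standard deviation of the B estimates delta*_b:
                \[
                  \widehat{\mathrm{SE}}_{\text{boot}}(\hat\delta)
                  \;=\;
                  \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}
                        \bigl(\hat\delta^{*}_{b} - \bar{\delta}^{*}\bigr)^{2}}
                \]
                ```

                Under this reading, a vce() option on the logit changes only the variance inputs taken from the logit's VCE while the delta-method formula stays the same, whereas the bootstrap prefix re-estimates the difference itself in every replication. That would explain why the two intervals need not agree.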

                Originally posted by Vivian Luo
                Moreover, I did not find a way to specify control variables when using bootstrap prefix. So now I am using the vce option which follows the main command where control variables are specified by "concomitant" option, and I hope the CI produced by this option is as accurate as that produced by the bootstrap prefix.
                This is a side-effect of the simplified syntax for separating the reduced and full model with ||. Please use parentheses to separate the models:

                Code:
                bootstrap _b[Diff], reps(50) seed(1234): khb logit Y (X) (incomeCat2 incomeCat3 incomeCat4), c(<controls>)
                Not sure if one should call this a bug or a side-effect of the simplified syntax. The last time I looked at this I thought there is not an easy way out of this -- for me, at least. I may look at this one more time but it is not on top of my "to do" list.
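
                For readers who want to try the parenthesized form on public data, here is a hedged sketch using the nlsw88 example data from earlier in the thread. The choice of age and south as control variables is purely illustrative, and the code assumes khb from SSC and a do-file (because of the /// line continuation).

                ```stata
                * Assumes: ssc install khb has been run beforehand.
                sysuse nlsw88, clear

                * Key variable and mediators go in separate parentheses instead of
                * being split by ||, so that concomitant() is passed through to khb
                * when the bootstrap prefix is used.
                * age and south are arbitrary stand-ins for control variables.
                bootstrap _b[collgrad:Diff], reps(200) seed(1234): ///
                    khb logit union (collgrad) (hours wage), concomitant(age south)
                ```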


                Originally posted by Vivian Luo
                2, I am running the KHB command on different samples, and I wonder whether the coefficients for the full and indirect effects are comparable across samples. If not, how can I compare them?
                Naturally, there is much to say about the "comparability" of coefficients between samples, and all of it applies here as well. If you are willing to compare logit coefficients between samples, then you can also compare the khb-corrected coefficients between samples. Note, however, that the scaling problem that makes -khb- necessary also applies to the comparison between samples.

                The problem here is that the scale parameter is likely not constant across samples. Proposed solutions include "heterogeneous choice models", as implemented in R. Williams's -oglm- command, and Breen/Karlson/Holm's "correlation metric", as implemented in the user-written command -nlcorr-. ("Average marginal effects" are also sometimes proposed as a solution to the scaling problem, but I think they make things worse on many occasions.) In any case, -khb- does not correct for differences in the scaling of the coefficients between samples/groups.


                Originally posted by Vivian Luo
                3, It seems that for each model only one confidence interval for the indirect effect is calculated. However, what if I want to know the confidence interval for each group of mediators? Do I have to include each group separately in the model? I hope to know whether the effect of a group of mediators is significant in the full model where all mediators and IVs are included.
                If I understand this correctly, you may want to try something like this

                Code:
                . khb logit y x || z1, c(z2 z3)
                . khb logit y x || z2, c(z1 z3)
                . khb logit y x || z3, c(z1 z2)
                where x are the covariate(s) for which the indirect effect(s) are of interest, and z1, z2, z3 are varlists for the (groups of) mediators you want to use.
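
                A runnable version of this pattern on the nlsw88 example data might look as follows. The grouping of mediators is a made-up illustration, not a substantive model, and the locals only work if the lines are run together (for example from a do-file). It assumes khb from SSC.

                ```stata
                * Assumes: ssc install khb has been run beforehand.
                sysuse nlsw88, clear

                * Hypothetical mediator groups; each local may hold several variables.
                local z1 hours
                local z2 wage
                local z3 tenure ttl_exp

                * One run per group: the group of interest is the mediator,
                * the remaining groups enter as concomitant (control) variables.
                khb logit union collgrad || `z1', concomitant(`z2' `z3')
                khb logit union collgrad || `z2', concomitant(`z1' `z3')
                khb logit union collgrad || `z3', concomitant(`z1' `z2')
                ```

                Because every run involves the same set of variables, the three models should be estimated on the same observations, which keeps the results comparable across runs.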



                • #9
                  Originally posted by Ulrich Kohler
                  I'm willing to put any other column title here if someone makes a good proposal. It should be noted though, that the SE of the reduced and full model are bootstrapped in a strict sense and only the SE of the difference is bootstrapped in a loose sense.
                  Maybe some note below the table would be an idea? Perhaps the note could be issued when the user specifies a vce() type that is not supported by khb.

                  Intuitively, I would be reluctant to call the SE bootstrapped, because even though the coefficients in the full and reduced models are estimated repeatedly the difference is not, right? From my understanding a bootstrapped SE is derived from repeated estimates of the point estimate in question, which is the difference in this case. I could well be wrong and it is indeed up to statisticians to look at the math behind the intuition.

                  Originally posted by Ulrich Kohler
                  This is a side-effect of the simplified syntax for separating the reduced and full model with ||. Please use parentheses to separate the models:

                  Code:
                  bootstrap _b[Diff], reps(50) seed(1234): khb logit Y (X) (incomeCat2 incomeCat3 incomeCat4), c(<controls>)
                  Not sure if one should call this a bug or a side-effect of the simplified syntax. The last time I looked at this I thought there is not an easy way out of this -- for me, at least. I may look at this one more time but it is not on top of my "to do" list.
                  Maybe the alternative syntax could be documented in the help file then? If you really find the time to revisit this, and I agree that it does not seem urgent, you could think about parsing local 0 using gettoken before the syntax call, instead of the subinstr functions that cannot detect nested parentheses and are sensitive to the number of blank spaces used.

                  Best
                  Daniel



                  • #10
                    Hi Daniel,

                    Yes, maybe a footnote might be a solution.

                    However, we do not directly support any kind of vce for the difference. It is always calculated with the delta method. What is your intuition: Is the Delta-Method SE based on robust standard errors robust in the same sense as the standard errors of the building blocks? In that sense the footnote should always be "S.E. for difference is estimated using the Delta-Method". Wondering if this makes things much clearer.

                    Ok, that subinstr() thing looks odd. I'll change that.

                    Uli



                    • #11
                      Uli,

                      that is a good point. I guess the standard error for the difference should indeed be labeled as delta-method s.e. regardless of the vce-type used in the full and reduced model. Technically, this would also correctly reflect the fact that it is estimated in exactly the same way, regardless of what is specified in the vce() option.

                      The current situation is unclear because of the label for that column, but maybe even more because the documentation explicitly mentions robust and cluster as types to be specified in the vce() option. This gives the impression that these types are somehow supported and leaves the user a bit clueless of how other types are handled if they do not produce an error. If there is nothing special about robust and cluster s.e. then I would not discuss this option at all in the help and subsume it under the other model-options.

                      As for the question whether the delta-method s.e. inherits any of the properties of the s.e. that it is based on, I have no clue. I have not looked into the literature either, but this would be interesting to know.

                      Best
                      Daniel



                      • #12
                        Hi Uli and Daniel,

                        Thank you so much for all your responses! They are super informative and helpful! I have learned a lot and will continue with my exploration of this amazing command!

                        Best,
                        Vivian









                        • #13
                          Originally posted by Ulrich Kohler

                          If I understand this correctly, you may want to try something like this

                          Code:
                          . khb logit y x || z1, c(z2 z3)
                          . khb logit y x || z2, c(z1 z3)
                          . khb logit y x || z3, c(z1 z2)
                          where x are the covariate(s) for which the indirect effect(s) are of interest, and z1, z2, z3 are varlists for the (groups of) mediators you want to use.
                          Hi Dr. Kohler,

                          Thank you so much for your prior help! As shown above, the suggestion for finding confidence interval for each mediator while keeping all other mediators in the model is to treat the others as control variables. I have followed this suggestion in my work. However, now I am wondering how I can achieve the same purpose without treating any mediators as control variables. i.e. I hope to find each Z's confidence interval when all Zs are included as mediators as in:

                          Code:
                          khb logit Y X || z1 z2 z3
                          It seems that the above code produces different results (e.g., confounding % of z1) from:

                          Code:
                          khb logit y x || z1, c(z2 z3)
                          The interpretation is also different: while the first command recognizes multiple pathways, the second command only recognizes one pathway each time. I am hoping to have a model that talks about the multiple pathways that link X to Y and about whether each one of the pathways is significant (while the other pathways are functioning at the same time).

                          Thank you and any advice is appreciated!

                          Best,
                          Vivian



                          • #14
                            I'm trying to use this command in Stata 13, but I get this error:
                            "
                            (khb() in lkhb, compiled by Stata 13.1, is too new to be run by this version of Stata and
                            so was ignored)
                            <istmt>: 3499 khb() not found"

                            Is there a version of this command that works on Stata 13? What should I do?

                            Thanks in advance.

                            Sorry for posting in such an old thread.



                            • #15
                              Just update to 13.1 if you have 13. Type

                              update all

                              and follow the instructions.
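
                              As a rough sketch of the full sequence, assuming an internet connection and the usual permissions to update the installation, the steps could look like this. The error message above suggests the installed executable is older than 13.1, which the first line should confirm.

                              ```stata
                              * Show the version of the running Stata executable (e.g. 13 or 13.1).
                              display c(stata_version)

                              * Check whether official updates are available, then install them.
                              update query
                              update all

                              * After updating and restarting Stata, confirm which khb is found.
                              which khb
                              ```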
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              Stata Version: 17.0 MP (2 processor)

                              WWW: https://www3.nd.edu/~rwilliam
