  • lincom related question

    Hello,

    Hope you all are well.

    I have a question regarding the use of lincom in finding the mean difference in salary before and after a training intervention.

    I have read up on lincom and, from what I understood, it is a postestimation command mainly used after the mean command.

    What I would like to ask is this: if we want to calculate means directly from variables, without running a regression, would we still use lincom, like this?

    lincom mean2- mean1

    And if we wanted to do this by gender, how would the gender variable be used with the above command (if it is correct)?

    Thank you.

  • #2
    lincom is a post-estimation command. This means it uses the results left behind by an estimation command. So it cannot be used on its own.

    To get what you want I would use regress in combination with margins. (In the example below, my price is your salary, my good is your training intervention, and my foreign is your gender.)

    Code:
    sysuse auto, clear
    gen good = rep78 > 3 if rep78 < .
    reg price i.foreign##i.good, vce(robust)
    margins, dydx(good) over(foreign)

    You can actually do this all in one go without margins; see this Stata tip: https://www.stata-journal.com/articl...article=st0250

    Code:
    reg price ibn.foreign i.foreign#i.good, hascons vce(robust)
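
    Since the thread started with lincom, here is a hedged cross-check that is not part of the original reply: the same two group-specific differences can be recovered with lincom after the first regression. The sketch assumes the auto data and the good indicator defined above, and relies on the coefficient names 1.good and 1.foreign#1.good that factor-variable notation produces.

    ```stata
    sysuse auto, clear
    gen good = rep78 > 3 if rep78 < .

    * interaction model, as in the first regression above
    reg price i.foreign##i.good, vce(robust)

    * difference in mean price (good vs. not good) among domestic cars
    lincom 1.good

    * the same difference among foreign cars: main effect plus interaction
    lincom 1.good + 1.foreign#1.good
    ```

    These two estimates should agree with the margins, dydx(good) over(foreign) results and with the two interaction coefficients from the one-step regression.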
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Thank you for the reply, Maarten. I was wondering, however, whether, if I had to do this without running a regression, I would use:
      mean salary1
      then store its estimate followed by:

      mean salary2
      then store its estimate followed by:

      lincom mean2 - mean1
      from where I can then use the by() command to get results over gender.

      Do you think this approach will also work, and is it a problem to calculate the mean difference in salary using this method?
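
      For reference, a minimal sketch of what a working mean-plus-lincom version might look like. It rests on several assumptions: the data are in wide format with one row per trainee, the variable names salary1 and salary2 are taken from the post, and the numbers are simulated purely for illustration. lincom works on the currently active estimation results, so both means are estimated in a single mean call rather than stored separately.

      ```stata
      * hypothetical wide-format data: one row per trainee, made up for illustration
      clear
      set seed 12345
      set obs 200
      gen salary1 = rnormal(10000, 2000)           // salary before training
      gen salary2 = salary1 + rnormal(500, 1000)   // salary after training

      * estimate both means jointly, so that their covariance is available
      mean salary1 salary2

      * difference in means, with standard error and confidence interval
      lincom salary2 - salary1

      * cross-check: the paired t test reports the same mean difference
      ttest salary2 == salary1
      ```

      lincom has no by() option. A by-gender version would need mean with over(), where the coefficient names depend on the Stata version, which is one reason the regression approach may be simpler.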



      • #4
        The syntax seems wrong, and the fact that you asked a question about it suggests you tried it and did not get it to work. But the bigger question is why you would want to do it this way. Remember, regression is just about conditional means and differences in conditional means, and differences in conditional means are exactly what you want. So why jump through these hoops if a simpler, more direct approach is available?

        You may have a legitimate reason, but you need to tell us. Otherwise we start guessing, and in all likelihood guess wrong, and all kinds of miscommunications start to happen.
        Last edited by Maarten Buis; 17 Sep 2019, 06:44.
        Maarten L. Buis



        • #5
          Sorry for bothering you again with questions, but I guess I am still a bit confused. Actually, this is a question that I was thinking about how to approach, as I don't have data to check whether the result will be accurate or not. So I figured that if I can use the mean command, I can use postestimation commands after it, as it allows for them. Then again, if you don't mind me asking, what if I wanted to calculate just the difference in salary without looking at it by gender: would that still require running a regression? If yes, would it be something like this:

          Code:
          reg salary i.training, vce(robust)
          margins, dydx(training)



          • #6
            To make things clearer: there was a vocational training program that took place last year, and recently a survey was conducted to determine how effective the program was. One of the measures being used is whether taking up the training had any significant impact on trainees' monthly salary. My initial thought was that by applying lincom we could generate the mean salary for trainees in each sector and take the difference between before and after the intervention to see whether the training helped, with p-values to determine whether the results are significant. The confusion in my head is this: because there is no proper estimation taking place, would lincom be the right approach?



            • #7
              If you want the difference between means, then just the regression is enough: the regression coefficients are the differences in means. If you don't believe me (and you should not believe a random person on the internet) then try it out yourself:

              Code:
              . sysuse nlsw88, clear
              (NLSW, 1988 extract)
              
              . // see the mean wages for union and non-union members
              . table union, c(mean wage)
              
              ----------------------
              union     |
              worker    | mean(wage)
              ----------+-----------
               nonunion |   7.204669
                  union |   8.674294
              ----------------------
              
              .
              . // union members earn on average 1.47 dollars per hour more
              . di 8.674294 - 7.204669
              1.469625
              
              .
              . // the regression tells you that non-union members earn on average 7.20 dollars
              . // per hour and union members earn 1.47 dollars per hour more
              . // which is exactly the same as what we got by comparing means
              . reg wage i.union
              
                    Source |       SS           df       MS      Number of obs   =     1,878
              -------------+----------------------------------   F(1, 1876)      =     44.23
                     Model |  751.256803         1  751.256803   Prob > F        =    0.0000
                  Residual |  31862.1856     1,876  16.9841074   R-squared       =    0.0230
              -------------+----------------------------------   Adj R-squared   =    0.0225
                     Total |  32613.4424     1,877  17.3753023   Root MSE        =    4.1212
              
              ------------------------------------------------------------------------------
                      wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                     union |
                    union  |   1.469625   .2209702     6.65   0.000     1.036252    1.902999
                     _cons |   7.204669   .1094804    65.81   0.000     6.989953    7.419385
              ------------------------------------------------------------------------------

              In that case you don't need margins. I used margins in the previous example because you wanted this separately for males and females, i.e. you wanted an interaction effect. margins is convenient, but even then not necessary, as I also showed.

              Your description sounds to me like you are trying to reinvent linear regression. I have said it before, but I'll repeat it again: linear regression is not magic, it just compares means, which is exactly what you are trying to do. So I would just use regression.
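
              A further hedged illustration of the same point, added as a sketch rather than as part of the original answer: the classic two-sample t test with equal variances is the same comparison of means, so its t statistic should match the one from the regression above (with conventional standard errors) up to sign. The scalar name t_ttest is just a label chosen for this example.

              ```stata
              sysuse nlsw88, clear

              * classic two-sample t test, equal variances assumed
              ttest wage, by(union)
              scalar t_ttest = r(t)

              * same comparison as a regression with conventional standard errors
              reg wage i.union

              * ttest reports nonunion minus union, so the sign is flipped
              display "t from ttest      = " t_ttest
              display "t from regression = " _b[1.union]/_se[1.union]
              ```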
              Maarten L. Buis



              • #8
                Sorry for the late response, but I had a good laugh at your answer! Frankly, though, thank you for helping me understand this. Hope to see you around.



                • #9
                  I am back again, though this time with a slightly different question, related to equality of medians. I read that you need to use quantile regression for this, but will it follow the same method as stated above? I feel that there won't be a need for margins in quantile regression, as it will just take the difference between the two medians of before-training and after-training income, for data on 4700 participants in which everyone received training.



                  • #10
                    Code:
                    . sysuse nlsw88, clear
                    (NLSW, 1988 extract)
                    
                    .
                    . table union, c(p50 wage)
                    
                    ----------------------
                    union     |
                    worker    |  med(wage)
                    ----------+-----------
                     nonunion |   6.191625
                        union |    8.05153
                    ----------------------
                    
                    . di  8.05153 - 6.191625
                    1.859905
                    
                    .
                    . qreg wage i.union
                    Iteration  1:  WLS sum of weighted deviations =  2847.5866
                    
                    Iteration  1: sum of abs. weighted deviations =  2847.0028
                    Iteration  2: sum of abs. weighted deviations =  2784.9206
                    Iteration  3: sum of abs. weighted deviations =  2777.7789
                    
                    Median regression                                   Number of obs =      1,878
                      Raw sum of deviations 2850.964 (about 6.6022515)
                      Min sum of deviations 2777.779                    Pseudo R2     =     0.0257
                    
                    ------------------------------------------------------------------------------
                            wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                           union |
                          union  |   1.859905   .2379648     7.82   0.000     1.393201    2.326608
                           _cons |   6.191625   .1179004    52.52   0.000     5.960395    6.422855
                    ------------------------------------------------------------------------------
                    Maarten L. Buis



                    • #11
                      Thank you, Maarten. You have helped me understand this concept really well, and I really appreciate your help.



                      • #12
                        Hello,

                        I have a small query which I would like to discuss. When I made a table of median income before and after the intervention, I got the following results:

                        Before: 11600
                        After: 12000

                        However, when I run qreg using the commands listed above, I get _cons = 11700 and After = 300. Should this not show _cons as 11600 and After = 400 instead? I took a random sample from the dataset and exported it here, but obviously it may not replicate the exact result. What could possibly be the reason behind this?

                        The commands I used are:

                        Code:
                        table time, c(p50 income)
                        qreg income i.time

                        Thank you.

                        Dataset:

                        Code:
                        * Example generated by -dataex-. To install: ssc install dataex
                        clear
                        input str14 Q2_TraineeID float(female time income)
                        "J7-273-002-19"  0 0 10800
                        "J8-145-007-23"  0 0     .
                        "J7-264-009-13"  0 0  1500
                        "C2-453-009-2"   0 0     .
                        "J7-135-007-12"  0 0     .
                        "J7-165-024-22"  0 0     .
                        "W3-254-007-16"  1 0     .
                        "W3-012-018-7"   1 0     .
                        "WE1-325-048-14" 1 0     .
                        "WE1-101-001-8"  1 0     .
                        "WE1-069-022-14" 1 0     .
                        "WE1-205-004-1"  1 0     .
                        "SS-109-009-26"  0 0  5000
                        "J9-349-001-24"  1 0     .
                        "J9-034-007-1"   0 0 12000
                        "J8-254-066-6"   0 0     .
                        "J7-165-003-11"  0 0     .
                        "J7-300-003-11"  0 0     .
                        "J7-278-007-23"  0 0     .
                        "J7-195-014-12"  0 0     .
                        "J7-015-012-2"   1 0     .
                        "J7-163-002-11"  1 0     .
                        "J7-087-009-21"  0 0     .
                        "J7-099-016-15"  0 0     .
                        "J7-020-053-25"  0 0     .
                        "J7-012-011-7"   1 0     .
                        "GT-320-006-4"   1 0     .
                        "I3-369-010-1"   0 0     .
                        "I3-008-019-13"  0 0     .
                        "I3-326-007-3"   0 0     .
                        "I3-390-003-18"  0 0     .
                        "I3-369-014-1"   0 0     .
                        "I2-210-004-24"  0 0     .
                        "I2-209-024-2"   1 0     .
                        "I2-156-023-10"  0 0 12000
                        "I2-122-021-3"   0 0 13000
                        "I2-202-046-11"  0 0 10000
                        "I1-062-005-18"  0 0     .
                        "WE2-256-004-9"  1 0  6000
                        "C2-426-004-5"   0 0 12000
                        "C2-241-029-18"  0 0     .
                        "C2-238-009-25"  0 0     .
                        "WE1-122-044-2"  1 1 10000
                        "WE1-069-008-10" 1 1     .
                        "TP1-805-011-22" 1 1     .
                        "TP1-805-018-26" 1 1     .
                        "SS-029-025-6"   1 1     .
                        "J9-558-001-13"  0 1     .
                        "J9-552-001-9"   0 1     .
                        "J9-367-001-9"   0 1 15000
                        "J9-349-001-24"  1 1     .
                        "J8-145-023-25"  0 1     .
                        "J8-254-054-21"  1 1     .
                        "J8-332-018-16"  1 1     .
                        "J8-271-004-5"   1 1     .
                        "J8-805-011-20"  1 1     .
                        "J8-254-045-12"  1 1     .
                        "J7-195-008-16"  0 1     .
                        "J7-291-007-3"   0 1     .
                        "J7-146-045-3"   0 1 15000
                        "J7-059-018-3"   0 1     .
                        "J7-053-008-16"  0 1     .
                        "J7-145-157-19"  0 1     .
                        "J7-292-029-8"   0 1     .
                        "J7-017-002-2"   0 1     .
                        "J7-012-031-24"  1 1 14000
                        "J7-124-012-20"  0 1     .
                        "J7-146-050-7"   0 1     .
                        "J7-192-029-2"   0 1     .
                        "J7-145-078-8"   0 1     .
                        "J7-287-002-11"  0 1     .
                        "J7-189-045-22"  0 1     .
                        "J7-101-006-17"  1 1     .
                        "J7-017-008-2"   0 1 12000
                        "J7-004-035-9"   1 1     .
                        "J7-195-010-26"  0 1     .
                        "GT1-531-007-13" 0 1     .
                        "GT1-522-001-13" 1 1 15000
                        "GT-301-002-10"  0 1 17000
                        "I3-221-009-12"  0 1  7000
                        "I3-156-001-5"   0 1 13000
                        "I3-223-053-5"   1 1     .
                        "I3-212-007-13"  1 1     .
                        "I2-122-046-13"  0 1     .
                        "I2-223-004-11"  1 1     .
                        "I2-203-022-1"   0 1 15000
                        "I2-122-039-5"   0 1     .
                        "I2-122-021-3"   0 1 18000
                        "I2-202-009-21"  0 1 15000
                        "I1-122-025-17"  0 1 16690
                        "I1-122-081-18"  1 1 15500
                        "I1-122-034-24"  0 1     .
                        "I1-062-008-8"   0 1 18000
                        "WE2-361-016-17" 1 1     .
                        "WE2-261-001-1"  1 1 10000
                        "WE2-363-012-10" 1 1 30000
                        "C3-458-032-24"  0 1 18000
                        "C2-427-005-4"   0 1 15000
                        "C2-429-002-12"  0 1  7000
                        "C2-249-001-23"  1 1     .
                        end
                        label values female female
                        label def female 0 "Male", modify
                        label def female 1 "Female", modify
                        label values time time
                        label def time 0 "Before", modify
                        label def time 1 "After", modify
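
                        One possible explanation, offered as a sketch rather than a diagnosis of this particular dataset: when a group has an even number of non-missing observations, the median is not unique. table and summarize report the midpoint of the two middle values, whereas qreg minimizes the sum of absolute deviations, and every value between the two middle observations achieves that minimum, so qreg can report one of the observed values instead of the midpoint. The example below uses the auto data, which has 74 observations, as a stand-in.

                        ```stata
                        sysuse auto, clear

                        * 74 cars, so the median is the midpoint of the 37th and 38th sorted prices
                        summarize price, detail
                        display "summarize median = " r(p50)

                        * median regression with only a constant
                        qreg price
                        display "qreg constant    = " _b[_cons]
                        ```

                        If the two numbers differ here, the same mechanism could produce 11600 in the table and 11700 from qreg, and both would be valid medians. Counting the non-missing incomes in each period, for instance with count if time == 0 & !missing(income), would show whether the group sizes are even.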
