  • Different individuals in one merged data set

    Hi all,
    I have a problem: I have already merged data from different waves of a survey into one data set using the ID numbers. I asked the statistical center, and they replied that the individuals are not the same across years but were nevertheless given the same ID numbers in the merged data set. Which panel data regression can I use, then? I ruled out fixed effects because the individuals differ over time. Can I use random effects, and how should I justify that choice?
    All the best,

  • #2
    Do you have an indicator showing which data set each observation came from? If so, you can create an ID based on that.

    Code:
    egen id = group(obsid, dataset)



    • #3
      Oops, there should be no comma inside group(). The line should read: egen id = group(obsid dataset)
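
      A minimal sketch of the corrected command on made-up data. The variable names obsid and wave and the four rows are hypothetical, not taken from the original poster's file, and the approach only helps if a wave or source indicator is actually available:

      ```stata
      * Toy data: the same obsid is reused in two waves for different people
      clear
      input obsid wave
      1 1
      2 1
      1 2
      2 2
      end

      * group() takes a space-separated varlist, with no comma
      egen id = group(obsid wave)

      * isid exits with an error if id does not uniquely identify the rows
      isid id
      list, sepby(wave)
      ```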



      • #4
        No, all the IDs are simply merged into one variable. Is it a huge mistake if I treat them as the same individuals for the purpose of the model? (They were chosen randomly in the same area, but they are different individuals.)



        • #5
          My reading of #1 is that you are asking about modeling options and not how to create identifiers. You do not have panel data but pooled cross-sections, so rule out panel data models (FE, RE). Thus, assuming a continuous outcome, you are confined to linear regression, but you may include time effects. If you want the coefficient on a variable to vary across time, interact the variable with the time dummies. You will still have the problem of unobserved heterogeneity that is endemic in cross-sectional analyses.
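
          As a rough sketch of the two specifications described here, using the Grunfeld data purely as a stand-in for pooled cross-sections (an assumption for illustration; substitute your own outcome, regressors, and wave variable):

          ```stata
          webuse grunfeld, clear

          * Pooled OLS with time effects and heteroskedasticity-robust SEs
          regress invest mvalue kstock i.year, vce(robust)

          * Let the slope on mvalue differ by year:
          * c.mvalue##i.year expands to mvalue, the year dummies, and their interactions
          regress invest c.mvalue##i.year kstock, vce(robust)

          * Joint test that the year-specific slope shifts are all zero
          testparm i.year#c.mvalue
          ```

          If the joint test does not reject, the simpler model with a common slope and year dummies may be adequate.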
          Last edited by Andrew Musau; 25 Jan 2022, 06:22.



          • #6
            Yes, exactly, thank you very much. So instead of xtreg, should I just use the reg command? Should I also declare the data as panel data with xtset, or what would the commands look like if I run pooled OLS?



            • #7
              If you include time effects, you can use either regress or xtreg, fe. With regress, you include time dummies and there is no need for xtset, whereas with xtreg, fe, you xtset using the time variable. If we suppose that the Grunfeld dataset consists of pooled cross-sections, the following are equivalent:

              Code:
              webuse grunfeld, clear
              gen id=_n
              xtset year
              xtreg invest mvalue kstock, fe cluster(id) nonest dfadj
              regress invest mvalue kstock i.year, robust
              Clustering at the observation level yields White standard errors in xtreg.

              Results:

              Code:
              . xtreg invest mvalue kstock, fe cluster(id) nonest dfadj
              
              Fixed-effects (within) regression               Number of obs      =       200
              Group variable: year                            Number of groups   =        20
              
              R-sq:  within  = 0.8038                         Obs per group: min =        10
                     between = 0.9325                                        avg =      10.0
                     overall = 0.8122                                        max =        10
              
                                                              F(2,178)           =    192.01
              corr(u_i, Xb)  = 0.0565                         Prob > F           =    0.0000
              
                                                 (Std. Err. adjusted for 200 clusters in id)
              ------------------------------------------------------------------------------
                           |               Robust
                    invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    mvalue |   .1167978   .0066825    17.48   0.000     .1036107    .1299849
                    kstock |   .2197066   .0575203     3.82   0.000     .1061972     .333216
                     _cons |   -41.0225   13.59568    -3.02   0.003    -67.85195   -14.19304
              -------------+----------------------------------------------------------------
                   sigma_u |  15.309436
                   sigma_e |  98.099115
                       rho |  .02377594   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              
              . 
              . regress invest mvalue kstock i.year, robust
              
              Linear regression                               Number of obs     =        200
                                                              F(21, 178)        =      23.74
                                                              Prob > F          =     0.0000
                                                              R-squared         =     0.8170
                                                              Root MSE          =     98.099
              
              ------------------------------------------------------------------------------
                           |               Robust
                    invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    mvalue |   .1167978   .0066825    17.48   0.000     .1036107    .1299849
                    kstock |   .2197066   .0575203     3.82   0.000     .1061972     .333216
                           |
                      year |
                     1936  |  -17.21234   32.30392    -0.53   0.595    -80.96027    46.53559
                     1937  |  -34.49127   39.41671    -0.88   0.383    -112.2755    43.29291
                     1938  |  -28.44276   26.44632    -1.08   0.284    -80.63142    23.74589
                     1939  |  -56.24304   30.29841    -1.86   0.065    -116.0334     3.54727
                     1940  |  -30.50473   26.88752    -1.13   0.258    -83.56404    22.55459
                     1941  |  -2.627113   27.07946    -0.10   0.923    -56.06521    50.81098
                     1942  |  -1.422156   27.25112    -0.05   0.958      -55.199    52.35469
                     1943  |  -21.80127   26.90696    -0.81   0.419    -74.89895    31.29641
                     1944  |  -22.11735   25.50119    -0.87   0.387    -72.44091    28.20622
                     1945  |  -33.59647   24.79854    -1.35   0.177    -82.53343     15.3405
                     1946  |  -7.028064   31.03571    -0.23   0.821    -68.27333     54.2172
                     1947  |  -5.246123   31.18714    -0.17   0.867    -66.79023    56.29799
                     1948  |  -3.919472   39.34635    -0.10   0.921    -81.56481    73.72587
                     1949  |  -28.79332   35.41855    -0.81   0.417    -98.68761    41.10098
                     1950  |  -28.35409   38.03333    -0.75   0.457    -103.4083    46.70015
                     1951  |  -11.67194   45.29242    -0.26   0.797    -101.0511    77.70727
                     1952  |  -5.613218   50.98644    -0.11   0.912    -106.2289    95.00245
                     1953  |   2.448996   52.49263     0.05   0.963     -101.139     106.037
                     1954  |  -12.31488   41.68043    -0.30   0.768    -94.56625     69.9365
                           |
                     _cons |  -23.57497   14.25381    -1.65   0.100    -51.70316    4.553223
              ------------------------------------------------------------------------------
              
              .
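
              If the individual year coefficients are not of interest, areg is another way to absorb them. This is a sketch on the same data, offered as an addition rather than something stated in the posts above. The coefficients on mvalue and kstock should coincide with the two results above, because all three commands fit the same model:

              ```stata
              webuse grunfeld, clear

              * Absorb year effects instead of estimating them as explicit dummies
              areg invest mvalue kstock, absorb(year) vce(robust)
              ```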

