
  • trend and time

    I want to estimate a difference model with panel data.
    Using two years of data, for example, I want to estimate the effect of an independent variable while including a trend.
    The problem is that when I use gen time=_n, the time order gets mixed up each time, so the results differ every time I run the analysis.
    The other problem is that if I create the time variable as, for example, 1 and 2 for every id, it is dropped because of collinearity and its coefficient comes out as zero.
    How can I handle this?

  • #2
    The problem is with your command -by occ, sort: gen time = _n-. The variable occ does not determine a unique sort order of the data, because there can be multiple observations with the same value of occ. In Stata, when you -sort- the data on a variable or list of variables that do not uniquely identify the observations, the observations that are duplicates on the sort key variables are sorted into random order, irreproducibly. That's why you're getting different results every time. So you either need another variable (or several other variables) that, together with occ, uniquely identify the observations, or you need to tell Stata not to randomize the sorting but to retain the existing order. It seems that you do not have any other variable, such as a date, that, together with occ, uniquely identifies the observations. It seems that, instead, you want to take the order in which the data originally appear as the correct time order. If that is correct, you can do that with:
    Code:
    sort occ, stable
    by occ: gen time = _n
    Note that you must specify the -stable- option in the -sort- command and you must not specify a -sort- option in the -by occ:...- command.
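
    As a minimal sketch of the approach above, on made-up toy data (the values and the variable entry_check are hypothetical and exist only to verify the result; they are not from the original poster's data set):

    ```stata
    * Toy data: two rows per occupation, typed in the intended time order
    clear
    input str1 occupation entry_check
    "a" 1
    "a" 2
    "b" 1
    "b" 2
    end
    encode occupation, gen(occ)

    * -stable- keeps the existing relative order of rows within each occ
    sort occ, stable
    by occ: gen time = _n

    * time reproduces the order of entry on every run,
    * and occ and time together uniquely identify the observations
    assert time == entry_check
    isid occ time
    ```

    If -isid- or -assert- stops with an error in real data, the rows were not in the assumed order to begin with, and a genuine date variable would be the safer sort key.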

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    In the future, when showing Stata output, copy the output directly from your Results window or log file to the clipboard and then paste it into the forum editor between code delimiters. If you are unfamiliar with code delimiters, please read forum FAQ #12 for instructions.



    • #3
      Thank you very much, sir!

      I'm not sure I have pasted the results the way you described, but I have included them here.
      I have run into another issue, this time with collinearity. As you can see, my model is a simple difference model for panel data with a trend, but now the trend is omitted in every time period. I estimated the difference model for four time periods relative to 2014, so the time gap is 1, 2, 3, 4 for 2015, 2016, 2017, 2018 respectively, and I used the time gap as the trend variable.
      Is the problem in how I specified the model, or in my code?
      Thank you in advance for taking the time to answer.


      . drop in 39/40
      (2 observations deleted)

      . encode occupation, gen(occ)

      . sort occ, stable

      . by occ: gen time=_n

      . tsset occ time
      panel variable: occ (strongly balanced)
      time variable: time, 1 to 2
      delta: 1 unit

      . gen lemp=log(근로자수)

      . reg D.lemp trend D.lnMWRIIWMW

      note: trend omitted because of collinearity

            Source |       SS           df       MS      Number of obs   =        19
      -------------+----------------------------------   F(1, 17)        =      0.41
             Model |  .016417834         1  .016417834   Prob > F        =    0.5315
          Residual |  .684092334        17  .040240726   R-squared       =    0.0234
      -------------+----------------------------------   Adj R-squared   =   -0.0340
             Total |  .700510168        18  .038917232   Root MSE        =     .2006

      ------------------------------------------------------------------------------
            D.lemp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             trend |          0  (omitted)
                   |
        lnMWRIIWMW |
               D1. |  -.0059717   .0093492    -0.64   0.532    -.0256969    .0137534
                   |
             _cons |    .083711   .0534071     1.57   0.135    -.0289683    .1963902
      ------------------------------------------------------------------------------
      Last edited by Yunjeong Kwon; 27 Dec 2019, 00:46.



      • #4
        As you have only two time periods per occ, the differences are defined for only one observation per occ. When time = 1, D.lemp and D.lnMWRIIWMW will both be missing, so the observation will be omitted from the analysis. Only time = 2 observations are included in your analysis. Because your example data is shown as a screenshot, which is not helpful, I cannot explore your data in Stata, but a quick visual inspection seems to suggest that whenever time = 2, trend = 1, which would make trend a constant, hence its omission due to collinearity (with the _cons term of the model). You can check this yourself easily by running your regression and then following it with:

        Code:
        tab trend if e(sample)
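
        The same mechanism can be reproduced on a small made-up panel. This is only a sketch: the variable names id, y and x and all values are hypothetical, and trend is assumed here to be the gap from the first period.

        ```stata
        * Hypothetical toy panel: 3 units observed in 2 periods each
        clear
        input id time y x
        1 1 2.0 1.0
        1 2 2.3 1.4
        2 1 3.1 0.9
        2 2 3.0 1.6
        3 1 1.5 1.2
        3 2 1.9 1.3
        end
        tsset id time
        gen trend = time - 1

        * D.y and D.x are missing when time == 1, so only time == 2 rows are used
        regress D.y trend D.x

        * Within the estimation sample trend takes a single value,
        * so it cannot be separated from _cons
        tab trend if e(sample)
        assert e(N) == 3
        ```

        With only two periods per unit, every row that survives differencing shares the same value of trend, so a separate trend coefficient is not identified no matter how the variable is coded.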



        • #5
          Thank you very much sir

          Your advice is always helpful.

          I'm not good at using the statalist tool, but next time, I will try to show my data in stata.

          Have a good day!

