
  • dependent variable does not vary within a panel

    Dear all,

    I am using a panel dataset at the individual i - product k - year t level and my dependent variable is at the product - year level.

    That is, I estimate \( Y_{k,t} = a_i + b X_{i,k,t} + \epsilon_{i,k,t} \), with standard errors clustered at the individual level (using xtreg, fe vce(cluster)).

    I am wondering if my regression makes sense, since my dependent variable does not change within individuals for a given product - year. Is this a problem?
    I considered collapsing my data at the product-year level, but this aggregation would dramatically reduce my sample size (I have many individuals) and I prefer to avoid it.

    Many thanks for your feedback!
    Best

  • #2
    Francesco:
    it is difficult to advise without seeing your results. You can post them using the code delimiters function (# icon) of the advanced editor (A icon at the top right of the reply section of the screen).
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Thank you Carlo,

      Actually the output looks OK. What I want to understand is the advantages and drawbacks of running \( Y_{k,t} = a_i + b X_{i,k,t} + \epsilon_{i,k,t} \) versus \( Y_{k,t} = a + b X_{k,t} + \epsilon_{k,t} \), where \( X_{k,t} \) is the average of \( X_{i,k,t} \) over i for a given (k, t) combination.

      The code would look like
      xtset individual
      xtreg Y X, fe vce(cluster individual)

      and for the second specification I would first
      collapse Y X, by(product time)  // Y is unchanged by the collapse, since it does not vary over individuals for a given k-t
      reg Y X

      In my view, the main advantage of the main specification is that it allows the inclusion of individual fixed effects. However, the dependent variable will be the same for all the investors in a given product-day combination. I wonder whether this is an issue or not.

      Thanks!


      • #4
        Francesco:
        provided that a dependent variable that does not vary for a given product-day combination is plausible in your research field, switching to -collapse- will throw away the panel data structure of your data set.
        Probably you have already checked it yourself, but I would also investigate if the random-effect specification outperforms the fixed-effect one via -help hausman-.
        Kind regards,
        Carlo
        (Stata 19.0)
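
        A minimal sketch of the fixed-effects versus random-effects comparison suggested above, assuming the variable names used in #3 (individual, Y, X). The stored-estimate names fe and re are arbitrary labels. Both models are fitted with the default VCE because -hausman- cannot be used after estimation with vce(robust) or vce(cluster).

        ```stata
        * Sketch only: individual, Y and X are the names used in post #3
        xtset individual

        * Fixed effects with the default (non-clustered) VCE
        xtreg Y X, fe
        estimates store fe

        * Random effects on the same sample
        xtreg Y X, re
        estimates store re

        * Consistent estimator first, efficient estimator second
        hausman fe re
        ```

        A rejection would favor the fixed-effects specification. It would not by itself settle whether an individual-level regression is appropriate for a dependent variable measured at the product-year level.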


        • #5
          Although I do not know what you are trying to estimate (economically), I have a hard time believing that your model is not misspecified. Since a variation in \( X_{i,k,t} \) across individuals does not affect your dependent variable, it must be reflected in an equivalent variation of your error term. And because \( X_{i,k,t} \) is allowed to vary over time as well, the fixed effects will not fully capture this cross-individual variation. Consequently, your regressor is clearly correlated with the error term, which invalidates the whole approach.
          https://www.kripfganz.de/stata/
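
          The argument above can be written out under one explicit assumption, which is not stated in the thread: that the true relationship holds at the product-year level, with \( \bar{X}_{k,t} \) the average of \( X_{i,k,t} \) over individuals and \( u_{k,t} \) an error uncorrelated with \( X_{i,k,t} \). The individual effects \( a_i \) are left out to keep the algebra short.

          ```latex
          % Assumed product-year model (an assumption made for illustration)
          \[ Y_{k,t} = a + b\,\bar{X}_{k,t} + u_{k,t} \]

          % Add and subtract b X_{i,k,t} to obtain the individual-level regression
          \[ Y_{k,t} = a + b\,X_{i,k,t} + \epsilon_{i,k,t},
             \qquad
             \epsilon_{i,k,t} = u_{k,t} - b\,\bigl(X_{i,k,t} - \bar{X}_{k,t}\bigr) \]

          % Within-cell deviations are uncorrelated with the cell mean, so
          \[ \operatorname{Cov}\bigl(X_{i,k,t},\, \epsilon_{i,k,t}\bigr)
             = -\,b\,\operatorname{Var}\bigl(X_{i,k,t} - \bar{X}_{k,t}\bigr) \]
          ```

          Under this assumption the covariance is nonzero whenever \( b \neq 0 \) and \( X_{i,k,t} \) varies across individuals within a product-year pair, which is the correlation between regressor and error term described above.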


          • #6
            Thank you, Sebastian, for your input!

            However, since I add individual fixed effects, I am essentially interested in how the within-individual variation of X comoves with the within-individual variation of Y.
            So actually I don't think the model is misspecified. Maybe what you suggest is to add time fixed effects as well as individual fixed effects?


            • #7
              Francesco:
              does the -hausman- test shed some light on this issue?
              Kind regards,
              Carlo
              (Stata 19.0)


              • #8
                Maybe I'm reading this incorrectly, but I think we are talking at cross purposes.

                When you say your dv is at the product year level, you're saying you have no data on individual differences with respect to the dv, but you do have differences at the individual level in the x's? So what you're doing is having a pile of different person-year-product level variables explaining a variable that is clearly at the total product sales per year level (or some such)?

                So the data for one product year might look like:

                dv    person   product   year
                1.2   1        1         1
                1.2   2        1         1
                1.2   3        1         1

                I think you do want to aggregate to the product year. It doesn't make sense to treat the person-year-product as the observation unit when you don't have person-year-product data on the dv.




                • #9
                  Hi Phil,
                  thank you for your comment.

                  The idea is that you can consider the dependent variable as some financial return over the period from, say, (t+1) to (t+2), t being the time of the current observation. I do not want to lose the individual component of the data (by aggregating at the product year level) because that would drastically reduce my sample (I have many individuals in my sample) and I want to understand the predictive power of X over Y. When you think about it, the fact that the dependent variable does not vary over individuals for a given product year pair does not violate any of the OLS assumptions. The only threat I see is that, of course, the disturbances for all the observations in a product-year pair will be highly correlated (not perfectly because of the fixed effects). So I should be fine as long as I cluster at the year level and/or add year or product/year fixed effects. What do you think?
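
                  A rough sketch of what that proposal could look like in Stata, assuming the variable names from #3 (individual, product, time, Y, X). The variable cell is a hypothetical identifier introduced here for the product-year pair. Because individuals are not nested within years or within product-year pairs, -xtreg, fe- needs the nonest option to accept these cluster variables.

                  ```stata
                  * Sketch only: individual, product, time, Y, X as named in post #3
                  xtset individual

                  * Hypothetical id for each product-year pair
                  egen cell = group(product time)

                  * Individual and year fixed effects, clustering on year
                  xtreg Y X i.time, fe vce(cluster time) nonest

                  * Same model, clustering on the product-year pair instead
                  xtreg Y X i.time, fe vce(cluster cell) nonest
                  ```

                  One caveat on the fixed-effects part: a full set of product-year fixed effects would absorb Y entirely, because Y is constant within each product-year pair.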


                  • #10
                    Ask yourself: Why should the return of some asset depend on characteristics of any particular individual? Your individual component does not explain anything. You can only hope to learn something from average individual characteristics. Having a large sample size on the individual level will not help you to get better estimates here because individual-level variation has zero predictive power.
                    https://www.kripfganz.de/stata/


                    • #11
                      Aggregating to the product year does lose sample size, but probably gives you more meaningful estimates. With the individual level data, suppose you doubled the number of individuals. You would have no additional data on your dv. But, you'd double your apparent sample size.

                      This isn't a simple statistical issue. It is more of an issue of framing. With product-year data on the dv, it makes sense to use product year data on the iv's.
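
                      The point about apparent sample size can be illustrated with a small simulation. This is a sketch on made-up data, not the poster's data: the variable cell, the seed, the 200 product-year pairs, the slope of 0.5 and the 50 copies are all arbitrary assumptions, and X is held constant within each pair to keep the example simple.

                      ```stata
                      * Simulated data only; nothing here comes from the original dataset
                      clear
                      set seed 12345
                      set obs 200                     // 200 product-year pairs
                      gen cell = _n                   // hypothetical product-year id
                      gen X = rnormal()
                      gen Y = 0.5*X + rnormal()

                      * Product-year level regression
                      regress Y X

                      * Pretend each pair is observed for 50 individuals (exact copies)
                      expand 50

                      * Same coefficient, but the conventional standard error shrinks
                      regress Y X

                      * Clustering on the pair brings the standard error back up
                      regress Y X, vce(cluster cell)
                      ```

                      The copies add no information about Y, yet the second regression reports a much smaller standard error than the first. The clustered version gives a standard error close to the one from the product-year regression.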
