  • Is xtreg inappropriate for a non-negative dependent variable?

    Hello,

    I'm working on a model with R&D intensity (R&D divided by sales) as the dependent variable. Values of this variable in my panel range from 0 to about 1.5. I have the following questions:

    Would xtreg give me unreliable estimation for this dependent variable?
    Would glm with a link (such as log) and another distribution family (such as poisson) be preferable to xtreg even if using xtreg is not technically incorrect?

    Also, expanding the above questions, what if I have a non-negative dependent variable that ranges, say, from 0 to 2500 (e.g. R&D expenditure)?
    Would the appropriateness of using either xtreg or a glm model change from the above scenario at all?

    Many thanks in advance for your help.

  • #2
    Steve:
    welcome to this forum.
    If you have panel data and your regressand is continuous, -xtreg- seems feasible, especially for the second scenario.
    Kind regards,
    Carlo
    (Stata 18.0 SE)

    • #3
      Thank you kindly, Carlo.

      From a theoretical perspective, how would you compare the two approaches?
      Is there any reason one might discount either of them?

      • #4
        To better illustrate my point of concern, kindly see the charts below:

        This is the actual distribution of R&D Expenditure in my panel
        [Image: Actual.jpg]

        This is the distribution of predicted values using fixed-effects regression (xtreg)
        [Image: xtreg.jpg]

        And this is the distribution of predicted values using glm with a log link and a Poisson family:
        [Image: Poisson.jpg]

        I see so many papers using xtreg, which leads me to believe it's OK to have this mismatch in the distribution. On the other hand, having so many clearly wrong predicted values (negative ones) concerns me.
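
        A minimal sketch of how the negative predictions described above might be counted after a fixed-effects regression. The variable names (firm, year, rd, x1, x2) are hypothetical and do not come from the original post, and the choice of the xbu prediction (linear prediction plus the estimated firm effect) is an assumption about what was plotted:

        ```stata
        * Hypothetical names: firm (panel id), year, rd (R&D expenditure), x1 x2 (regressors)
        xtset firm year
        xtreg rd x1 x2 i.year, fe vce(cluster firm)

        * Linear prediction including the estimated firm effect, estimation sample only
        predict rd_hat_fe if e(sample), xbu

        * Nothing constrains a linear prediction to be non-negative
        count if rd_hat_fe < 0
        ```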

        Thanks so much for any help in advance.

        • #5
          xtgee allows a log link and so guarantees positive predictions.
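
          A minimal sketch of this suggestion, assuming hypothetical variable names (firm, year, rd, x1, x2) and assuming a Poisson family with an exchangeable working correlation, neither of which is specified above:

          ```stata
          * Hypothetical names: firm (panel id), year, rd (outcome), x1 x2 (regressors)
          xtset firm year

          * Population-averaged model with a log link; family and correlation are assumptions
          xtgee rd x1 x2 i.year, family(poisson) link(log) corr(exchangeable) vce(robust)

          * Predicted mean is exp(xb), which is positive by construction
          predict rd_hat_gee if e(sample), mu
          summarize rd_hat_gee
          ```

          With a log link the fitted mean is an exponential of the linear index, so the minimum reported by summarize should be strictly positive.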

          • #6
            Steve:
            if the predicted values cannot be negative (and your model is correctly specified), it's wise to switch to -xtgee-, as Nick helpfully suggested.
            Kind regards,
            Carlo
            (Stata 18.0 SE)

            • #7
              Thank you so much for your time and responses.

              • #8
                You should really use xtpoisson with the fe option and vce(robust). I’d use RD as the dependent variable and log(sales) as an explanatory variable.
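
                A minimal sketch of this specification, assuming hypothetical variable names (firm, year, rd, sales, x1, x2); the extra regressors and year dummies are illustrative only:

                ```stata
                * Hypothetical names: firm (panel id), year, rd (R&D expenditure), sales
                xtset firm year

                * ln() returns missing where sales <= 0, so those observations drop out
                generate ln_sales = ln(sales)

                * Fixed-effects Poisson with cluster-robust standard errors
                xtpoisson rd ln_sales x1 x2 i.year, fe vce(robust)
                ```

                Firms whose rd is zero in every period do not contribute to the fixed-effects estimator and are dropped from the estimation sample.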

                • #9
                  Hi Jeff,
                  Thanks so much for the answer.

                  Comparing my models, the results are almost identical whether I use glm with a Poisson family or xtpoisson (provided I include both time and firm dummies in the glm model). Is there any difference between the two, theoretically or practically?
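
                  A sketch of how such a side-by-side comparison might be set up, assuming hypothetical variable names (firm, year, rd, sales, x1, x2) and firm-clustered standard errors in both models:

                  ```stata
                  * Hypothetical names: firm (panel id), year, rd (outcome), sales, x1 x2
                  xtset firm year
                  generate ln_sales = ln(sales)

                  * Conditional fixed-effects Poisson
                  xtpoisson rd ln_sales x1 x2 i.year, fe vce(robust)
                  estimates store fe_pois

                  * Poisson GLM with explicit firm and year dummies
                  glm rd ln_sales x1 x2 i.year i.firm, family(poisson) link(log) vce(cluster firm)
                  estimates store dummy_pois

                  * Slope coefficients and standard errors side by side
                  estimates table fe_pois dummy_pois, keep(ln_sales x1 x2) b se
                  ```

                  With a large number of firms, the dummy-variable version can be slow to fit, since it estimates one coefficient per firm.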

                  Also, I would love to hear your perspective on how "technically" wrong it would be to use xtreg, or linear regression in general, for a continuous DV that is strictly positive and takes large values (such as R&D expenditure).
                  I'm curious about this mostly because, in my experience, some reviewers at top journals in my field are not comfortable with using xtpoisson (or a Poisson family in general) for a continuous dependent variable (any reference on this topic would be appreciated).

                  Thank you very much again.
