
  • Question about the results in piecewise Poisson regression

    Hi all,

    I am running the following Poisson regression, which includes the interaction between iv1 and iv2 together with the squared terms of both variables:
    Code:
    poisson income c.iv1##c.iv2 c.iv1#c.iv1 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, vce(robust)
    [Image attachment: 0911_1.png showing the regression output]

    Then I drop one of the squared terms, c.iv1#c.iv1, and the remaining squared term becomes significant.
    Code:
    poisson income c.iv1##c.iv2 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, vce(robust)
    [Image attachment: 0913_1.png showing the regression output]


    But if I then also drop the other squared term, c.iv2#c.iv2, the coefficients become insignificant again:
    [Image attachment: 0916_1.png showing the regression output]

    I have checked the correlations between the variables, but found no serious collinearity:

    Code:
    . corr income iv1 iv2
    (obs=267)

                 |   income      iv1      iv2
    -------------+---------------------------
          income |   1.0000
             iv1 |   0.0318   1.0000
             iv2 |  -0.0178   0.2790   1.0000


    Is there something wrong here?

    Thanks,
    David
    Last edited by David Lu; 11 May 2016, 01:25.

  • #2
    David:
    you may have a singleton dummy among your predictors (please see http://www.stata.com/statalist/archi.../msg00851.html).
    You also report a very high pseudo R-squared with mostly non-significant predictors: I would take a look at -estat vce- after -poisson-.
    Kind regards,
    Carlo
    (Stata 19.0)
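    As a rough sketch of how one might look for singleton dummies (categories of indcode or areacode that occur only once), reusing the variable names from the model in #1; the exact check below is only illustrative, not something prescribed in this thread:

    Code:
    * count the observations in each industry and area category
    bysort indcode: gen int ind_n = _N
    bysort areacode: gen int area_n = _N

    * categories with a single observation are singleton dummies
    tab indcode if ind_n == 1
    tab areacode if area_n == 1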



    • #3
      Originally posted by Carlo Lazzaro View Post
      David:
      you may have a singleton dummy among your predictors (please see http://www.stata.com/statalist/archi.../msg00851.html).
      You also report a very high pseudo R-squared with mostly non-significant predictors: I would take a look at -estat vce- after -poisson-.
      Hi Carlo,

      You're absolutely right. The model has tons of dummies because it controls for industry and regional effects. But after looking at -estat vce- after -poisson-, I am still confused about what the problem is. Can you explain a bit?



      Thanks,
      David



      • #4
        David:
        I meant that you may have two problems:
        - a singleton dummy, which makes the calculation of the Wald test unfeasible;
        - multicollinearity, which inflates your pseudo-Rsq but leaves non-significant coefficients: you may want to eyeball the -estat vce, corr- output to sniff out the potential culprit(s) if that is the issue.
        Kind regards,
        Carlo
        (Stata 19.0)
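        A minimal sketch of the check described above, assuming the full model from #1 has just been fit; the 0.8 cut-off in the comment is only a rough rule of thumb, not a formal threshold:

        Code:
        * refit the full model from #1
        poisson income c.iv1##c.iv2 c.iv1#c.iv1 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, vce(robust)

        * correlation matrix of the estimated coefficients; very large
        * off-diagonal entries (say, |r| > 0.8) flag nearly collinear terms
        estat vce, corr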



        • #5
          Originally posted by Carlo Lazzaro View Post
          David:
          I meant that you may have two problems:
          - a singleton dummy, which makes the calculation of the Wald test unfeasible;
          - multicollinearity, which inflates your pseudo-Rsq but leaves non-significant coefficients: you may want to eyeball the -estat vce, corr- output to sniff out the potential culprit(s) if that is the issue.
          Hi Carlo,

          I checked, and there is no singleton dummy in my model. I have also checked for multicollinearity: none of the variables are highly correlated (all below 0.4). So that is why I am confused about the erratic significance of the coefficients.

          Thanks,
          David



          • #6
            David:
            - missing Wald test: there was a similar thread on this forum some time ago: http://www.statalist.org/forums/foru...andard-errors;
            - if multicollinearity is not the issue, it may be that you have included too many interactions or, in general terms, too many predictors (by the way: are all of them useful for your research purposes? What does the literature in your research field suggest for dealing with the same research topic?).
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Originally posted by Carlo Lazzaro View Post
              David:
              - missing Wald test: there was a similar thread on this forum some time ago: http://www.statalist.org/forums/foru...andard-errors;
              - if multicollinearity is not the issue, it may be that you have included too many interactions or, in general terms, too many predictors (by the way: are all of them useful for your research purposes? What does the literature in your research field suggest for dealing with the same research topic?).
              Hi Carlo,

              The missing Wald test is the lesser concern for me. What I am really worried about is the erratic significance of the coefficients. My models include two variables as squared terms, and what I want to know is why they are insignificant when both are in the same model but become significant when one is dropped. I suspect multicollinearity, but I can find no evidence of it. And if I do piecewise regression, I doubt my logic will make sense to a reviewer: why drop one squared variable while keeping the other in the model? Since I would assume that all of the insignificant squared terms should be dropped, how can I convince others that keeping one, dropping the other, and thereby improving the model is justified?

              By the way, it seems that I misunderstood the concept of a singleton dummy in my context. If industries that appear only once count as singleton dummies, then the sample has 7 of them.

              Best,
              David
              Last edited by David Lu; 11 May 2016, 07:21.
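              As a possible way to look at the two squared terms jointly rather than one at a time, here is a sketch of a joint Wald test after the full model; this is only an illustration based on the specification in #1, not something suggested earlier in the thread:

              Code:
              * full model with both squared terms
              poisson income c.iv1##c.iv2 c.iv1#c.iv1 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, vce(robust)

              * joint test that both squared terms are zero; terms that are
              * individually insignificant but jointly significant can be a
              * symptom of collinearity between the two quadratics
              test c.iv1#c.iv1 c.iv2#c.iv2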



              • #8
                David:
                I understand your concerns: that's why I previously suggested that you skim through the literature in your research field and see whether any example of what you're after already exists (even better if published in your target journal).
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Originally posted by Carlo Lazzaro View Post
                  David:
                  I understand your concerns: that's why I previously suggested that you skim through the literature in your research field and see whether any example of what you're after already exists (even better if published in your target journal).
                  Hi Carlo,

                  Yes, I have seen some similar cases. But they don't do the piecewise regression and only report the results of the final model. So I have lost the trail and can find no concrete explanation of why this happens. Have you come across a similar example before?

                  Best,
                  David



                  • #10
                    David:
                    just out of curiosity: as income is a continuous variable, why use -poisson- instead of -regress-?
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Originally posted by Carlo Lazzaro View Post
                      David:
                      just out of curiosity: as income is a continuous variable, why use -poisson- instead of -regress-?
                      Hi Carlo,

                      Yes, it is a continuous variable, but it is highly right-skewed, and -regress- is not well suited to that kind of data. I have to either transform the data or use another estimator. I consulted the literature and asked questions here; most replies suggested not taking the log of the data and instead using -poisson- or -glm- with a log link, which also fits nonnegative continuous variables, not only counts.

                      Best,
                      David
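                      A minimal sketch of the two estimators mentioned above for a nonnegative, right-skewed outcome; the covariate list simply reuses the one from #1, and robust standard errors are assumed in both calls:

                      Code:
                      * Poisson pseudo-maximum likelihood for a continuous, nonnegative depvar
                      poisson income c.iv1##c.iv2 c.iv1#c.iv1 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, vce(robust)

                      * equivalent GLM formulation with a log link
                      glm income c.iv1##c.iv2 c.iv1#c.iv1 c.iv2#c.iv2 cv1 cv2 cv3 cv4 i.indcode i.areacode, family(poisson) link(log) vce(robust)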



                      • #12
                        David:
                        I suppose you refer to http://blog.stata.com/2011/08/22/use...tell-a-friend/.
                        However, if this suggestion does not ease your research procedure, the choice between -regress- and -poisson- may depend on some features of your data, for instance how many zeros you have in the depvar before deciding to log income.
                        As an aside, normality of the depvar is not required for -regress-.
                        Kind regards,
                        Carlo
                        (Stata 19.0)
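                        A quick sketch of the data checks implied here, assuming the depvar is income as in #1:

                        Code:
                        * how many zeros are in the depvar?
                        count if income == 0

                        * inspect the skewness of income before deciding whether to log it
                        summarize income, detail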



                        • #13
                          Originally posted by Carlo Lazzaro View Post
                          David:
                          I suppose you refer to http://blog.stata.com/2011/08/22/use...tell-a-friend/.
                          However, if this suggestion does not ease your research procedure, the choice between -regress- and -poisson- may depend on some features of your data, for instance how many zeros you have in the depvar before deciding to log income.
                          As an aside, normality of the depvar is not required for -regress-.
                          Hi Carlo,

                          That's where I started from; it is a very inspiring and helpful post. My depvar is strictly positive, with no zeros. I previously decided to use the log, but some scholars disagree with that. Personally, both -regress- and -poisson- fit my data, and both yield significant results. But since the argument for -poisson- with skewed data is much stronger, and it also handles nonnegative variables well, I go for -poisson- or -glm-. What I am worried about is whether it makes sense to other researchers to drop one squared term while retaining the other without a specific explanation or argument. Does it make sense to you?

                          David



                          • #14
                            David:
                            I'm afraid it does not, unless some theoretical argument can support your choice.
                            Kind regards,
                            Carlo
                            (Stata 19.0)



                            • #15
                              Originally posted by Carlo Lazzaro View Post
                              David:
                              I'm afraid it does not, unless some theoretical argument can support your choice.
                              Hi Carlo,

                              That's what I am confused about. The studies that skip the piecewise regression escape the theoretical question of why one squared term is dropped and the other kept. But I prefer the piecewise approach and would like to explain why it is reasonable. Do you know of any literature that could help me explain keeping only one squared term in the model and dropping the other?

                              Thanks,
                              David

