
  • Descriptives per regression model

    Good afternoon,

    I need summary statistics for my variables, restricted to the observations used in my main analysis.
    When I run the usual descriptives, each variable is summarized over a different number of observations.
    I found that I'm supposed to use something like e(sample); however, I get an error that this option is not allowed.

    Thanks,
    Pia

  • #2
    As per the FAQ, please show us exactly what you typed (in CODE blocks) and exactly what Stata returned to you.



    • #3
      Pia:
      as an aside to Rich's helpful recommendation, you can obtain the descriptive statistics for the complete-case sample as follows:
      Code:
      . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
      (1978 automobile data)
      
      . egen wanted=rowmiss( price mpg rep78 headroom trunk weight length turn displacement gear_ratio )
      
      . tabstat price mpg rep78 weight length turn foreign if wanted==0, stat(count mean sd p50 min max)
      
         Stats |     price       mpg     rep78    weight    length      turn   foreign
      ---------+----------------------------------------------------------------------
             N |        69        69        69        69        69        69        69
          Mean |  6146.043  21.28986  3.405797  3032.029  188.2899   39.7971  .3043478
            SD |   2912.44  5.866408  .9899323  792.8515   22.7474  4.441051  .4635016
           p50 |      5079        20         3      3200       193        40         0
           Min |      3291        12         1      1760       142        31         0
           Max |     15906        41         5      4840       233        51         1
      --------------------------------------------------------------------------------
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)
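
      For reference, the e(sample) route mentioned in the original question can be sketched as follows. This is a minimal illustration on the auto data shipped with Stata, not the poster's actual model: the regress specification is an assumption, chosen only because rep78 has missing values, and insample is a hypothetical variable name. Note that e(sample) is a function used in an if qualifier after an estimation command rather than an option, which may explain the "option not allowed" error.

      ```stata
      * Load the example data shipped with Stata
      sysuse auto, clear

      * Fit any model; rep78 has missing values, so some cars are dropped
      regress price mpg rep78 weight

      * Summarize only the observations the model actually used
      tabstat price mpg rep78 weight if e(sample), stat(count mean sd p50 min max)

      * Optionally keep the estimation sample as a variable for later use
      generate byte insample = e(sample)
      ```

      Because e(sample) is overwritten by the next estimation command, storing it in a variable keeps the sample available for later tables.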



      • #4
        Thank you for your answers, and sorry for not posting as per the FAQ in the first place!
        One more question: I standardised all variables for the regression analysis, but would like the descriptives for the unstandardised values. However, Stata reports more observations when I use the unstandardised variables. Why could that be?



        • #5
          Pia:
          the previous advice to post as per the FAQ still holds.
          Please note that seeing what you typed and what Stata gave you back is worth more than any detailed description. Thanks.
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Good afternoon,
            this is what I typed in. My supervisor wants me to report the summary statistics for these 1,054 observations. However, as seen below, I standardized all variables but want the descriptives for the original values (i.e. without the "z" prefix). When I run xtreg with the non-standardized variables, I get 1,066 observations.

            xtreg zESG zlnRnD_lag1 zlnTotAssets_lag1 zROA_lag1 zFemBM_lag1 zDivIndex_lag1 zCSRcomm2_lag1 zlnGDP_lag1 i.
            > Year1 i.Year2 i.Year3 i.Year4 i.Year5 i.IndSens, re vce(cluster CountryCode)
            note: 0.Year1 omitted because of collinearity.

            Random-effects GLS regression                   Number of obs     =      1,054
            Group variable: id                              Number of groups  =        292

            R-squared:                                      Obs per group:
                 Within  = 0.2647                                         min =          1
                 Between = 0.5473                                         avg =        3.6
                 Overall = 0.4945                                         max =          5

                                                            Wald chi2(12)     =     641.12
            corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000



            • #7
              This is the same attempt with unstandardized values:

              xtreg ESG lnRnD_lag1 lnTotAssets_lag1 ROA_lag1 FemaleBM_lag1 DivIndex_lag1 CSRcomm2_lag1 lnGDP_lag1 i.Year1
              > i.Year2 i.Year3 i.Year4 i.Year5 i.IndSens, re vce(cluster CountryCode)
              note: 0.Year1 omitted because of collinearity.

              Random-effects GLS regression                   Number of obs     =      1,066
              Group variable: id                              Number of groups  =        295

              R-squared:                                      Obs per group:
                   Within  = 0.2666                                         min =          1
                   Between = 0.5456                                         avg =        3.6
                   Overall = 0.4968                                         max =          5

                                                              Wald chi2(12)     =     929.20
              corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000



              • #8
                Basically, my question is: why do the two regressions use a different number of observations?
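
                One way to track down such a difference is to mark the estimation sample of each model and then look at the observations that appear in only one of them. The sketch below uses the auto data and two regress models as stand-ins; in_a and in_b are hypothetical variable names, and with the real data the two xtreg commands from the posts above would replace the regress lines.

                ```stata
                * Load the example data shipped with Stata
                sysuse auto, clear

                * Model A includes rep78, which has missing values
                quietly regress price mpg weight rep78
                generate byte in_a = e(sample)

                * Model B omits rep78, so it keeps more observations
                quietly regress price mpg weight
                generate byte in_b = e(sample)

                * Cross-tabulate the two estimation samples
                tabulate in_a in_b

                * Inspect the observations used by B but not by A
                list make price mpg weight rep78 if in_b & !in_a

                * Show which variables are missing in those observations
                misstable summarize price mpg weight rep78 if in_b & !in_a
                ```

                The off-diagonal cell of the cross-tabulation counts the observations that only one model uses, and the listing shows which variable is responsible.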



                • #9

                  If the within-group variance is zero, you will get a missing value for the standardized score. You standardize as follows: $$Z= \frac{x-\mu}{\sigma} $$ where \(x\) is the observed value, \(\mu\) is the mean of the group and \(\sigma\) is the standard deviation of the group. With zero variance, \(\sigma = 0\), the division is undefined, and Stata returns a missing value, so the observation drops out of the regression. You can check this:

                  Code:
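                  * Sort ESG within id; if the first and last values coincide, ESG never varies in that panel
                  * (missing values sort last, so a panel with some missing ESG is not flagged)
                  bys id (ESG): gen zerovariance= ESG[1]==ESG[_N]
                  list if zerovariance, sepby(id)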
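
                  To see the mechanism in isolation, here is a small made-up panel standardized within id. The data values and the names x, m, s and z are purely illustrative, and the sketch assumes the z-variables were built group by group in this way, which the thread does not confirm.

                  ```stata
                  clear
                  input id x
                  1 2
                  1 4
                  1 6
                  2 5
                  2 5
                  3 7
                  end

                  * Group mean and group standard deviation
                  bysort id: egen double m = mean(x)
                  bysort id: egen double s = sd(x)

                  * Standardize within group
                  generate double z = (x - m) / s

                  * id 2 is constant (sd = 0) and id 3 has a single observation
                  * (sd missing), so z is missing for both panels
                  list, sepby(id)
                  count if missing(z)
                  ```

                  Any observation with a missing standardized value is dropped by the estimation command, which is one way the standardized model can end up with fewer observations than the unstandardized one.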



                  • #10
                    Thank you for your answer - that definitely helps for my understanding!

