Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to store p-value and CI in a new variable in prais-winsten regression?

    Hi!

    I'm new to Stata! I have a database model as shown below (hypothetical data). When running Prais-winsten regression, I would like to store the p-values and confidence intervals in a new variable.

    The command “ statsby, by(id): prais log10_prevame year ”, stores only the beta values in a new variable. How do I store the p-value and CI as well?

    I also tested the regsave command (below) but it only stores the values of the last prais-wisten regression and not the set of regressions by id (my database is approximately 2000 id)

    by(id): prais log10_prevame year
    regsave, tstat pval ci

    Click image for larger version

Name:	stata.png
Views:	2
Size:	100.4 KB
ID:	1679169


    I thank the help of all you.

  • #2
    Something like this:
    Code:
    levelsof id, local(ids)
    gen b = .
    gen ll = .
    gen ul = .
    gen pvalue = .
    
    foreach i of local ids {
        capture prais log10_prevname year if id == `i'
        if c(rc) == 0 {
            matrix M = r(table)
            foreach x in b ll ul pvalue {
                quietly replace `x' = M["`x'", "year"] if id == `i'
            }
        }
        else if inlist(c(rc), 2000, 2001) {
            continue
        }
        else {
            display as error "Unexpected error: id == `i'"
            exit(c(rc))
        }
    }
    Warning: Because it is not possible to import data from a screenshot into Stata, this code is untested and may contain typos or other errors. In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Added: The -capture- and -if ... else if ... else- structure are there in anticipation of the possibility that for some id's there will be insufficiently many observations with non-missing values of log10_prevname and year to carry out the -prais- command. Because large data sets often have gaps of this nature, such errors are anticipated and to avoid Stata's default response of terminating execution, this code allows the completion of the task for all id's having sufficient associated data. If, however, -prais- encounters other problems and cannot successfully execute, this is an unanticipated and abnormal condition, and the code will terminate execution and display a corresponding error message.
    Last edited by Clyde Schechter; 24 Aug 2022, 19:46.

    Comment


    • #3
      Another option to what Clyde proposes is to first retrieve the betas and standard errors with -statsby-, e.g.,

      Code:
      statsby _b _se, by(rep): reg price mpg headroom
      and then to construct manually the confidence intervals from the information. Confidence interval for the coefficient on mpg is just [ _b[mpg] - 1.96*_se[mpg] , _b[mpg] + 1.96*_se[mpg]].

      The pvalue for the coefficient on mpg is 2*normal(_b[mpg]/_se[mpg])

      Comment


      • #4
        Hi, thank you very much Clyde e Joro!

        Joro, unfortunately the values don't match the regression output.

        Clyde, here's an example using the dataex command. I tried the code you provided and at the end I get the message "matrix operators that return matrices not allowed in this context" I think I'm missing some simple detail (Stata is new to me, but I'm increasingly surprised by the possibilities)

        I thank the help of all you.


        copy starting from the next line ------ ----------------
        Code:
        * Example generated by -dataex-. To install: ssc install    dataex
        clear
        input long codigo_ibge int ano float logprev_ame_
        110002 2015   1.69897
        110002 2016         2
        110002 2017 1.7781513
        110002 2018 1.8037053
        110002 2019         2
        110012 2015         2
        110012 2016 1.7781513
        110012 2017         2
        110012 2018         2
        110012 2019   1.30103
        110029 2015 1.5126595
        110029 2016 1.6651118
        110029 2017 1.5948814
        110029 2018  1.686381
        110029 2019  1.533344
        110032 2015  1.363178
        110032 2016  .7695511
        110032 2017 1.4259688
        110032 2018  1.498405
        110032 2019   1.39794
        end
        copy up to and including the previous line - ----------------

        Comment


        • #5
          The values do not exactly match the regression output, because whoever programmed -prais- used finite sample t-statistics, which are not justified in this context because -prais- is a two step GLS, and therefore the procedure has only asymptotic justification.

          Here is how the values can be made to match the regression output:

          Code:
          . webuse idle
          
          . tsset t
          
          Time variable: t, 1 to 30
                  Delta: 1 unit
          
          . prais usr idle, nolog
          
          
          Prais–Winsten AR(1) regression with iterated estimates
          
                Source |       SS           df       MS      Number of obs   =        30
          -------------+----------------------------------   F(1, 28)        =      7.12
                 Model |  43.0076941         1  43.0076941   Prob > F        =    0.0125
              Residual |  169.165739        28  6.04163354   R-squared       =    0.2027
          -------------+----------------------------------   Adj R-squared   =    0.1742
                 Total |  212.173433        29  7.31632528   Root MSE        =     2.458
          
          ------------------------------------------------------------------------------
                   usr | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                  idle |  -.1356522   .0472195    -2.87   0.008    -.2323769   -.0389275
                 _cons |   15.20415   4.160391     3.65   0.001     6.681978    23.72633
          -------------+----------------------------------------------------------------
                   rho |   .5535476
          ------------------------------------------------------------------------------
          Durbin–Watson statistic (original)    = 1.295766
          Durbin–Watson statistic (transformed) = 1.476004
          
          . dis "lower bound CI = " _b[idle] - invttail(28, .025)*_se[idle]
          lower bound CI = -.23237687
          
          . dis "higher bound CI = " _b[idle] + invttail(28, .025)*_se[idle]
          higher bound CI = -.03892751
          
          . dis "p-value = " 2*ttail(28, abs(_b[idle]/_se[idle]))
          p-value = .00767544

          Comment


          • #6
            Thanks for the explanations, Joro! The values match

            Now, in a large database like the one I have, how can I generate the p-value variable and CI for a large number of regressions?

            With the command "statsby _b _se, by( codigo_ibge ): prais logprev_ame_ ano", I can calculate p-value and CI for all "codigo_ibge", but not correctly because I don't have "df".
            Last edited by Andressa Freire; 25 Aug 2022, 11:54.

            Comment


            • #7
              Re #4: I cannot reproduce your problem with your example data. The code, edited to use the actual variable names in your data set, runs with no error messages and produces plausible looking output:

              Code:
              . * Example generated by -dataex-. To install: ssc install    dataex
              . clear
              
              . input long codigo_ibge int ano float logprev_ame_
              
                    codigo_ibge       ano  logprev~_
                1. 110002 2015   1.69897
                2. 110002 2016         2
                3. 110002 2017 1.7781513
                4. 110002 2018 1.8037053
                5. 110002 2019         2
                6. 110012 2015         2
                7. 110012 2016 1.7781513
                8. 110012 2017         2
                9. 110012 2018         2
               10. 110012 2019   1.30103
               11. 110029 2015 1.5126595
               12. 110029 2016 1.6651118
               13. 110029 2017 1.5948814
               14. 110029 2018  1.686381
               15. 110029 2019  1.533344
               16. 110032 2015  1.363178
               17. 110032 2016  .7695511
               18. 110032 2017 1.4259688
               19. 110032 2018  1.498405
               20. 110032 2019   1.39794
               21. end
              
              .
              . xtset codigo_ibge ano
              
              Panel variable: codigo_ibge (strongly balanced)
               Time variable: ano, 2015 to 2019
                       Delta: 1 unit
              
              .
              . levelsof codigo_ibge, local(ids)
              110002 110012 110029 110032
              
              . gen b = .
              (20 missing values generated)
              
              . gen ll = .
              (20 missing values generated)
              
              . gen ul = .
              (20 missing values generated)
              
              . gen pvalue = .
              (20 missing values generated)
              
              .
              . foreach i of local ids {
                2.     capture prais logprev_ame_ ano if codigo_ibge == `i'
                3.     if c(rc) == 0 {
                4.         matrix M = r(table)
                5.         foreach x in b ll ul pvalue {
                6.             quietly replace `x' = M["`x'", "ano"] if codigo_ibge == `i'
                7.         }
                8.     }
                9.     else if inlist(c(rc), 2000, 2001) {
               10.         continue
               11.     }
               12.     else {
               13.         display as error "Unexpected error: codigo_ibge == `i'"
               14.         exit(c(rc))
               15.     }
               16. }
              
              .
              . list, noobs clean
              
                  codigo~e    ano   logpre~_           b          ll         ul     pvalue  
                    110002   2015    1.69897    .0158911   -.0685422   .1003244   .5914078  
                    110002   2016          2    .0158911   -.0685422   .1003244   .5914078  
                    110002   2017   1.778151    .0158911   -.0685422   .1003244   .5914078  
                    110002   2018   1.803705    .0158911   -.0685422   .1003244   .5914078  
                    110002   2019          2    .0158911   -.0685422   .1003244   .5914078  
                    110012   2015          2   -.0595957   -.3509707   .2317792   .4716642  
                    110012   2016   1.778151   -.0595957   -.3509707   .2317792   .4716642  
                    110012   2017          2   -.0595957   -.3509707   .2317792   .4716642  
                    110012   2018          2   -.0595957   -.3509707   .2317792   .4716642  
                    110012   2019    1.30103   -.0595957   -.3509707   .2317792   .4716642  
                    110029   2015    1.51266    .0072499   -.0277862    .042286   .5572233  
                    110029   2016   1.665112    .0072499   -.0277862    .042286   .5572233  
                    110029   2017   1.594881    .0072499   -.0277862    .042286   .5572233  
                    110029   2018   1.686381    .0072499   -.0277862    .042286   .5572233  
                    110029   2019   1.533344    .0072499   -.0277862    .042286   .5572233  
                    110032   2015   1.363178    .1282266   -.0481095   .3045626   .1036335  
                    110032   2016   .7695511    .1282266   -.0481095   .3045626   .1036335  
                    110032   2017   1.425969    .1282266   -.0481095   .3045626   .1036335  
                    110032   2018   1.498405    .1282266   -.0481095   .3045626   .1036335  
                    110032   2019    1.39794    .1282266   -.0481095   .3045626   .1036335
              It is easy for me to imagine that you neglected to change some mention of the variable names shown in the original code, but that should not produce the error message you got. I really don't know what to say. Can you post back with example data that actually produces the error you are encountering?

              Comment


              • #8
                Thank you so much, Clyde!

                I applied the code exactly and at the end this message appears in red, so that the generated variables (b, ll, ul and pvalue) are still missing. The only difference is that I'm doing it in the full database.
                My version of stata is 13.1. I don't know if it could be related. (p.s: sorry, I couldn't send using dataex). I send the print below

                Click image for larger version

Name:	stata_1.png
Views:	1
Size:	303.0 KB
ID:	1679293
                Click image for larger version

Name:	stata_2.png
Views:	1
Size:	117.8 KB
ID:	1679294


                Comment


                • #9
                  Version 13.1 is pretty old by now. Much has changed since then, and it is possible that some of the syntax I am using here was introduced after that. I no longer have version 13.1 on any of my computers, so, not knowing what has changed since then, I can't really fix the code to run on the older version.

                  Here's what I suggest you do. Immediately after the -capture prais...- command, put in two new lines of code:
                  Code:
                  set tracedepth 1
                  set trace on
                  That way, the commands in the loop will themselves print out and we can see which of them is triggering that error message. Seeing that, I'll at least know which command is the source of the problem, and I'll see if I can dredge up from memory what the syntax for that used to be.

                  By the way, the Forum FAQ does specifically say that if you are not using the current version of Stata when you post, you should state what version you are using. In truth, I don't think that would have made a difference here: I probably would not have realized that I was using syntax beyond what was available in version 13. But still.

                  Comment


                  • #10
                    Thanks for the explanations, Clyde!
                    I followed your lead and the error appears in the matrix. As soon as possible I will update Stata, but I appreciate the help to solve my problem at this time.

                    Click image for larger version

Name:	stata_3.png
Views:	1
Size:	103.1 KB
ID:	1679315

                    Comment


                    • #11
                      Try this (changes from earlier code in bold face):
                      Code:
                      xtset codigo_ibge ano
                      
                      levelsof codigo_ibge, local(ids)
                      
                      gen b = .
                      gen ll = .
                      gen ul = .
                      gen pvalue = .
                      foreach i of local ids {
                          capture prais logprev_ame_ ano if codigo_ibge == `i'
                          if c(rc) == 0 {
                              matrix M = r(table)
                              local c: colnumb M "ano"
                              foreach x in b ll ul pvalue {
                                  local r: rownumb M "`x'"
                                  quietly replace `x' = M[`r', `c'] if codigo_ibge == `i'
                              }
                          }
                          else if inlist(c(rc), 2000, 2001) {
                              continue
                          }
                          else {
                              display as error "Unexpected error: codigo_ibge == `i'"
                              exit(c(rc))
                          }
                      }
                      If that doesn't eliminate that error message, I'm kind of stuck and hope somebody else who still uses 13.1 or remembers it better than I do will jump in. If it does eliminate that message but leads to a different one, do the -set tracedepth 1- -set trace on- thing again and post back with what happens.

                      Comment


                      • #12
                        The first thing that OP should check is whether -r(table)- exists at all in Stata 13. The object -r(table)- was relatively recently and quietly introduced, and Stata Corp documented it in Stata 17 manual only after I and William Lisowski asked for it to be documented.

                        Comment


                        • #13
                          This is the thread where I am raging about the quiet introduction of -r(table)-. https://www.statalist.org/forums/for...-it-documented

                          Somebody there says that -r(table)- was introduced in Stata 12. So then this should not be the problem.

                          Comment


                          • #14
                            Thank you so much, Clyde and Joro!

                            Clyde, I was able to run my analyzes on Stata 17 successfully. The code provided last still didn't run on Stata 13, but I appreciate you helping me come up with a solution to follow up on my analysis.

                            Joro, I'm going to read about r(table) in Stata 13 through the post you sent. Thank you for presenting more possibilities to do the analyses.

                            Comment

                            Working...
                            X