
  • re: #209 - to follow up on Leonardo Guizzetti's comment, please see
    Code:
    help datetime##s4
    and, if this is not what you mean, please clarify



    • Might there be a way to add an option to the –centile– command that makes it compute percentiles in precisely the same way as –summarize, d– and –estat bootstrap, percentile– (and probably other commands) do?
      Code:
      . sysuse auto
      (1978 automobile data)
      
      . tempfile tboot
      
      . qui bootstrap, reps(500) saving(`tboot'): reg price mpg, level(50)
      
      . estat bootstrap, percentile
      
      Linear regression                               Number of obs     =         74
                                                      Replications      =        500
      
      ------------------------------------------------------------------------------
                   |    Observed               Bootstrap
             price | coefficient       Bias    std. err.  [50% conf. interval]
      -------------+----------------------------------------------------------------
               mpg |  -238.89435  -6.540619   56.911373   -283.3891  -206.8067   (P)
             _cons |   11253.061   133.0593   1356.8499    10512.26   12328.61   (P)
      ------------------------------------------------------------------------------
      Key: P: Percentile
      
      . use `tboot'
      (bootstrap: regress)
      
      . sum, d
      
                                 _b[mpg]
      -------------------------------------------------------------
            Percentiles      Smallest
       1%    -380.3653      -436.6451
       5%    -344.9039      -410.6723
      10%    -324.1469      -385.3353       Obs                 500
      25%    -283.3891      -385.0602       Sum of wgt.         500
      
      50%    -237.3672                      Mean           -245.435
                              Largest       Std. dev.      56.91137
      75%    -206.8067      -121.5362
      90%    -175.8918      -111.9515       Variance       3238.904
      95%    -159.8261      -107.5473       Skewness      -.2622398
      99%     -130.299      -100.4849       Kurtosis       2.846957
      
                                _b[_cons]
      -------------------------------------------------------------
            Percentiles      Smallest
       1%     8331.229       7704.635
       5%     9282.614        7916.47
      10%     9701.941        8226.62       Obs                 500
      25%     10512.26       8312.646       Sum of wgt.         500
      
      50%     11250.26                      Mean           11386.12
                              Largest       Std. dev.       1356.85
      75%     12328.61       14624.41
      90%     13168.51       14663.63       Variance        1841042
      95%     13638.37          14861       Skewness       .1261091
      99%     14560.25       15460.15       Kurtosis       2.709669
      
      . centile *, c(25 75)
      
                                                                Binom. interp.   
          Variable |       Obs  Percentile    Centile        [95% conf. interval]
      -------------+-------------------------------------------------------------
            _b_mpg |       500         25   -283.3945       -288.5923   -277.2365
                   |                   75   -206.7666       -212.8225   -200.2001
           _b_cons |       500         25    10512.16        10293.56    10656.54
                   |                   75    12332.11        12182.53    12567.85
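
      Until such an option exists, one possible workaround is the lower-level –_pctile– command, whose default definition is documented as the same one used by –summarize–. The sketch below uses the auto data; the scalar names are only illustrative. The –altdef– option switches to an interpolating formula, and whether it reproduces –centile– exactly in every case is an assumption to check against the manual rather than something shown here.

      ```stata
      sysuse auto, clear

      * Percentiles as reported by -summarize, detail-
      quietly summarize price, detail
      scalar p25_sum = r(p25)
      scalar p75_sum = r(p75)

      * -_pctile- with its default definition; results are left in r(r1), r(r2)
      _pctile price, percentiles(25 75)
      scalar p25_pct = r(r1)
      scalar p75_pct = r(r2)
      display "summarize: " p25_sum "  " p75_sum
      display "_pctile:   " p25_pct "  " p75_pct

      * The interpolating alternative, shown next to -centile- for comparison
      _pctile price, percentiles(25 75) altdef
      display "altdef:    " r(r1) "  " r(r2)
      centile price, centile(25 75)
      ```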



      • 1. Just a brief response to the request for a joyplots command by Igor Paploski. I agree with what Maarten Buis and Nick Cox said. Joyplots are very context-specific and work under certain circumstances. The number of parameters required to provide full functionality is quite a lot. It is definitely quicker to create them manually. As Nick said, they are basically overlapping area graphs and the guide provides the code. If you have the code block in place, you can easily copy and modify it to fit different data sets. There is not so much programming involved here. If the creation of extra junk variables is the issue, then just replace them with tempvars. I intentionally don't go so far in the guides to accommodate more basic users, but there is definitely a lot of scope for code optimization.

        2. My wish list for Stata:

        a) dofile editor:
        i) It would be great if syntax highlighting used different colors for the different commenting-out styles (* vs // vs /* */). This would help distinguish actual comments inside the code from code that has been commented out.

        b) Graph options
        i) ability to scale marker weights based on a starting and ending value
        ii) ability to scale colors, and line widths based on a starting and ending value
        iii) Marker labels can be read from variables. Please also allow this for angles, colors, marker sizes. This will make grouping so much easier on graphs

        c) A long-term wishlist
        i) Ability to read pixels (to process images, geotiffs, edge detection, native machine learning routines, etc)
        ii) color scaling within graph elements defined by starting and ending color. For example, higher values on the y-axis are darker than lower values for the same line.
        Last edited by Asjad Naqvi; 17 Dec 2021, 15:00.



        • For Stata 18, I would like a horizontal scroll bar in the Results window instead of having results wrapped. Or is there already an option to turn on a horizontal scroll bar for the Results window, like the one in the Do-file Editor preferences? Thanks and best regards



          • An option to suppress repeated categorical axes in a by-graph would be very helpful. The repetition of axis labels can be switched off using the by-option suboption noiylabel / noixlabel, but this only works for numerical axes, not for categorical ones. This issue has been discussed before, for example:
            Suppressing Repeated Categorical Axis Labels in by-graphs
            In by cat_var bar graph, how to show var names only on the first graph y-axis
            Multiple graphs combined with the over option (and only one legend)



            • An option for log to close when a do-file exits with error.
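
              Until something like this is built in, a wrapper do-file is one possible workaround. The sketch below assumes the real work lives in a file called analysis.do and that the log should go to analysis.log; both names are placeholders.

              ```stata
              * master.do: hypothetical wrapper around the real do-file
              capture log close _all
              log using analysis.log, replace text

              * Run the do-file, show its output, and trap any error
              capture noisily do analysis.do
              local rc = _rc

              * The log is closed whether or not the do-file failed
              log close

              * Pass the original return code on to the caller
              if `rc' {
                  display as error "analysis.do stopped with r(`rc')"
                  exit `rc'
              }
              ```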



              • Following execution of a tabi command, ensure that an r(p_exact) stored value (i.e., 1.00) is saved even in the absence of table output. See discussion below.

                https://www.statalist.org/forums/for...o-output-shown



                • I am opposed to #217 - I don't think that the result makes any sense



                  • Originally posted by Rich Goldstein
                    I am opposed to #217 - I don't think that the result makes any sense
                    My sentiments exactly. Just because other software produces a result doesn't mean that it is correct in doing so.



                    • In 2019, Goncalo Cotovio asked if there is an immediate form of the paired t-test. The answer then, as now, is that there is not an immediate form of the paired t-test. But why not? The paired t-test can easily be computed from the following summary data:
                      • n = the number of paired scores
                      • r = the correlation between the paired scores
                      • #mean1 = mean of 1st sample
                      • #sd1 = SD of 1st sample
                      • #mean2 = mean of 2nd sample
                      • #sd2 = SD of 2nd sample
                      This is exactly the same number of arguments needed for the immediate form of the unpaired t-test:

                      Code:
                          Immediate form of two-sample t test
                      
                              ttesti #obs1 #mean1 #sd1 #obs2 #mean2 #sd2 [, options2]
                      For an immediate form of the paired t-test, the syntax might be something like this:

                      Code:
                          Immediate form of two-sample paired t test
                      
                              ttesti #obs #r #mean1 #sd1 #mean2 #sd2, paired [options]
                      The following code shows the needed computations and compares the results to those from -ttest- using the raw data.

                      Cheers,
                      Bruce

                      Code:
                      clear
                      webuse fuel
                      summarize  
                      * Write the needed summary measures to the dataset
                      quietly summarize mpg1
                      generate byte n = r(N) in 1
                      generate m1 = r(mean) in 1
                      generate sd1 = r(sd) in 1
                      quietly summarize mpg2
                      generate m2 = r(mean) in 1
                      generate sd2 = r(sd) in 1
                      quietly pwcorr mpg1 mpg2
                      generate r = r(rho) in 1
                      * Now compute the paired t-test from the summary data
                      generate mdiff = m1-m2
                      generate sddiff = sqrt(sd1^2+sd2^2-2*r*sd1*sd2)
                      generate sediff = sddiff/sqrt(n)
                      generate tobs = mdiff/sediff
                      generate byte df = n-1
                      generate pval = ttail(df,abs(tobs))*2
                      list mdiff-pval in 1
                      * Compare results to those from -ttest-
                      ttest mpg1==mpg2
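
                      As a rough sketch of what such an immediate command could do internally, the same computations can be wrapped in a small r-class program. The name pairedti, its argument order, and the returned names r(t), r(df_t), and r(p) are hypothetical choices for illustration, not an existing or proposed official syntax.

                      ```stata
                      * Hypothetical helper: paired t test from summary data
                      capture program drop pairedti
                      program define pairedti, rclass
                          version 14
                          * Arguments: n, correlation, mean1, sd1, mean2, sd2
                          args n r m1 sd1 m2 sd2
                          tempname sddiff se t df p
                          * SD of the paired differences from the two SDs and r
                          scalar `sddiff' = sqrt(`sd1'^2 + `sd2'^2 - 2*`r'*`sd1'*`sd2')
                          scalar `se' = `sddiff'/sqrt(`n')
                          scalar `t'  = (`m1' - `m2')/`se'
                          scalar `df' = `n' - 1
                          * Two-sided p-value
                          scalar `p'  = 2*ttail(`df', abs(`t'))
                          display as text "t(" `df' ") = " as result %9.4f `t' ///
                              as text "   P>|t| = " as result %6.4f `p'
                          return scalar t    = `t'
                          return scalar df_t = `df'
                          return scalar p    = `p'
                      end

                      * Usage: feed in summary statistics taken from the fuel data
                      webuse fuel, clear
                      quietly summarize mpg1
                      local n  = r(N)
                      local m1 = r(mean)
                      local s1 = r(sd)
                      quietly summarize mpg2
                      local m2 = r(mean)
                      local s2 = r(sd)
                      quietly correlate mpg1 mpg2
                      pairedti `n' `=r(rho)' `m1' `s1' `m2' `s2'
                      ```

                      The printed t statistic and p-value can then be compared with those from -ttest mpg1==mpg2- on the raw data.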
                      --
                      Bruce Weaver
                      Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                      Version: Stata/MP 18.0 (Windows)



                      • Currently, the shell command is ignored when running Stata in batch mode in Windows (leaving the notice "request ignored because of batch mode" in the log-file). It would be nice if this were rectified in a future update.



                          • As discussed here, margins should not be available after nonlinear models with fixed effects are estimated. The explanation is simple: any interesting quantity that we may want to compute will depend on the values of the fixed effects, which are not estimated by these commands. Therefore, margins computes something that is meaningless most of the time. This could be done in a future update, but at least it would be good to have this looked into in the next version.



                            • This should be simple, but I can't get it to work. I have a data set attached in which I want to create a variable for US states. The state variables in the World Values Survey are X048ISO and X048WVS. I created a variable called region from X048WVS. I then wanted to create a state-level variable for the cases where X048WVS is not equal to a region value. So I typed and received:

                            . gen str24 state = "."

                            . des state

                                          storage   display    value
                            variable name   type    format     label      variable label
                            -------------------------------------------------------------------------------
                            state           str24   %24s

                            . replace state = X048WVS if region1 == "."
                            type mismatch
                            r(109);

                            where region is taken from the following values of X048WVS:

                            . fre X048WVS if S003 == 840

                            X048WVS -- Region where the interview was conducted (WVS)
                            --------------------------------------------------------------------------
                                                                    | Freq.  Percent    Valid     Cum.
                            ----------------------------------------+---------------------------------
                            Valid -2     No answer                  |    55     0.62     0.62     0.62
                                  840001 US: New England            |   364     4.13     4.13     4.75
                                  840002 US: Middle Atlantic States |   962    10.91    10.91    15.66
                                  840003 US: South Atlantic         |   971    11.01    11.01    26.67
                                  840004 US: East South Central     |   462     5.24     5.24    31.91
                                  840005 US: West South Central     |   670     7.60     7.60    39.51
                                  840006 US: East North Central     |   996    11.29    11.29    50.80
                                  840007 US: West North Central     |   418     4.74     4.74    55.54
                                  840008 US: Rocky Mountain state   |   369     4.18     4.18    59.72
                                  840009 US: Northwest              |   156     1.77     1.77    61.49
                                  840010 US: California             |   445     5.05     5.05    66.54
                                  840011 US: Alaska                 |     3     0.03     0.03    66.57
                                  840012 US: Hawai                  |     4     0.05     0.05    66.62
                                  840013 US: Pacific                |   348     3.95     3.95    70.56
                                  840201 US: AL Alabama             |    28     0.32     0.32    70.88
                                  840202 US: AR Arkansas            |    11     0.12     0.12    71.01
                                  840203 US: AZ Arizona             |    73     0.83     0.83    71.83
                                  840204 US: CA California          |   282     3.20     3.20    75.03
                                  840205 US: CO Colorado            |    73     0.83     0.83    75.86
                                  840206 US: CT Connecticut         |    26     0.29     0.29    76.15
                                  :                                 |     :        :        :        :
                                  840231 US: NJ New Jersey          |    56     0.63     0.63    89.36
                                  840232 US: NM New Mexico          |    27     0.31     0.31    89.67
                                  840233 US: NV Nevada              |    22     0.25     0.25    89.92
                                  840234 US: NY New York            |   118     1.34     1.34    91.26
                                  840235 US: OH Ohio                |   101     1.15     1.15    92.40
                                  840236 US: OK Oklahoma            |    28     0.32     0.32    92.72
                                  840237 US: OR Oregon              |    33     0.37     0.37    93.09
                                  840238 US: PA Pennsylvania        |    83     0.94     0.94    94.04
                                  840239 US: RI Rhode Island        |     8     0.09     0.09    94.13
                                  840240 US: SC South Carolina      |    12     0.14     0.14    94.26
                                  840241 US: SD South Dakota        |    24     0.27     0.27    94.53
                                  840242 US: TN Tennessee           |    55     0.62     0.62    95.16
                                  840243 US: TX Texas               |   177     2.01     2.01    97.17
                                  840244 US: UT Utah                |    22     0.25     0.25    97.41
                                  840245 US: VA Virginia            |    68     0.77     0.77    98.19
                                  840246 US: VT Vermont             |     6     0.07     0.07    98.25
                                  840247 US: WA Washington          |    65     0.74     0.74    98.99
                                  840248 US: WI Wisconsin           |    68     0.77     0.77    99.76
                                  840249 US: WV West Virginia       |    19     0.22     0.22    99.98
                                  840250 US: WY Wyoming             |     2     0.02     0.02   100.00
                            Total                                   |  8819   100.00   100.00
                            --------------------------------------------------------------------------


                            and I wanted to attach values to state from X048WVS when X048WVS is a state, not a region. But my command didn't work (see above). I tried to upload the data set (very small), but it wouldn't load. But this should be obvious to people who can do this better than I can.

                            Thanks,

                            Ric Uslaner



                            • Euslaner (post #223) -

                              This is off-topic here, which is the Wishlist for Stata 18.

                              Please post your question to a new topic, preferably using code delimiters [CODE] and [/CODE] to render the example output more readable.



                              • I know I usually say stuff like this, but I think it has to be said for those interested in treatment effects: another interesting algorithm we might find useful would be the Bayesian Structural Time Series approach. It's a method that's related to synthetic controls, but of course differs radically in the underlying theoretical framework. It's already been applied in a few interesting contexts, and I'd hate to see R beat Stata further (as much as it currently does, anyhow) in the treatment effects department.

                                I know StataCorp did quite a lot with teffects in 16 and 17, but I essentially think we should have, and should want to see, formal StataCorp implementations of synthetic controls, RD, and other treatment effect estimators, just as we have with difference-in-differences in (I think?) Stata 17.
