
  • New -dstat- package available from SSC

    Thanks to Kit Baum, a new package called dstat is available from the SSC Archive. dstat unites a variety of methods to describe (univariate) statistical distributions. Covered are
    • density estimation
    • histograms
    • cumulative distribution functions
    • probability distributions
    • quantile functions
    • Lorenz curves
    • percentile shares
    and a large collection of summary statistics such as
    • classical and robust measures of location, scale, skewness, and kurtosis
    • inequality and poverty measures
    Particular features of the command are
    • consistent standard errors supporting complex sample designs for all covered statistics
    • simultaneous analysis of multiple variables across multiple subpopulations
    • covariate balancing based on reweighting techniques (inverse probability weighting and entropy balancing), including appropriate correction of standard errors
    • generating influence functions (or recentered influence functions) for further analysis, e.g. in a RIF regression
    To install the package, type

    Code:
    . ssc install dstat
    Since the latest update of moremata is required, you may also want to type

    Code:
    . ssc install moremata, replace
    Furthermore, dstat calls coefplot when drawing graphs. To install or update coefplot, type

    Code:
    . ssc install coefplot, replace
    Writing dstat was low-hanging fruit after having developed reldist (see https://ideas.repec.org/p/bss/wpaper/37.html). However, consider this a beta release. The code is complex and long (a bit more than 6000 lines) and I certainly did not catch all bugs in my tests. If you encounter any issues, please let me know (either via email or by posting an issue at https://github.com/benjann/dstat).

    ben


  • #2
    I tried the program but I failed with the first example:

    Code:
    . update query
    (contacting http://www.stata.com)
    
    Update status
        Last check for updates:  25 Nov 2020
        New update available:    none         (as of 25 Nov 2020)
        Current update level:    19 Nov 2020  (what's new)
    
    Possible actions
    
        Do nothing; all files are up to date.
    
    . h dstat
    
    . sysuse nlsw88, clear
    (NLSW, 1988 extract)
    
    . dstat (mean gmean median trim5 winsor5 huber95 hl) wage, over(union)
    invalid 'user'
                     stata():  3598  Stata returned error
            _dstat_cstripe():     -  function returned error
             dstat_cstripe():     -  function returned error
                     dstat():     -  function returned error
                     <istmt>:     -  function returned error
    r(3598);
    I've installed coefplot (version 1.8.3) earlier and updated moremata by adoupdate. What is invalid "user" and how can I avoid that?



    • #3
      Many thanks for this grand unified package. As a fan of influence functions myself, I am looking forward to parmesting it.
      All the best
      Roger



      • #4
        Marc, can you tell me the exact Stata version you are using (type about to find out)? This error should not be possible in any of the Stata versions supported by dstat.
        ben



        • #5
          Hi Ben, same error here:


          . sysuse nlsw88, clear
          (NLSW, 1988 extract)

          . ssc install moremata, replace

          . ssc install coefplot, replace

          . dstat (gini mld) wage, rif(gini mld)
          invalid 'user'
          stata(): 3598 Stata returned error
          _dstat_cstripe(): - function returned error
          dstat_cstripe(): - function returned error
          dstat(): - function returned error
          <istmt>: - function returned error
          r(3598);

          . dstat (mean gmean median trim5 winsor5 huber95 hl) wage, over(union)
          invalid 'user'
          stata(): 3598 Stata returned error
          _dstat_cstripe(): - function returned error
          dstat_cstripe(): - function returned error
          dstat(): - function returned error
          <istmt>: - function returned error
          r(3598);

          . about

          Stata/MP 16.1 for Windows (64-bit x86-64)
          Revision 05 Nov 2020
          Copyright 1985-2019 StataCorp LLC

          Luis



          • #6
            Very strange. I still cannot reproduce the error. I tried Stata 14, 15, and 16 on Mac and Stata 16 on Windows.
            There is a piece of code in dstat that is executed under version 14.2, user to handle an issue with how Stata sets the column stripes of matrices. Option user is necessary in this context, but it appears to cause the error that Marc and Luis encountered. The option has existed since Stata 14, and it puzzles me how it can cause an error.



            • #7
              Ben, tried on my "old" v15.1

              . dstat (gini mld) wage, rif(gini mld)
              invalid 'user'
              stata(): 3598 Stata returned error
              _dstat_cstripe(): - function returned error
              dstat_cstripe(): - function returned error
              dstat(): - function returned error
              <istmt>: - function returned error
              r(3598);

              . about

              Stata/MP 15.1 for Windows (64-bit x86-64)
              Revision 26 Aug 2019
              Copyright 1985-2017 StataCorp LLC

              Hope it gives you some clue.



              • #8
                Ben, voila:

                Stata/MP 16.1 for Windows (64-bit x86-64)
                Revision 19 Nov 2020
                Copyright 1985-2019 StataCorp LLC

                Originally posted by Ben Jann:
                Marc, can you tell me the exact Stata version you are using (type about to find out)? This error should not be possible in any of the Stata versions supported by dstat.
                ben
                Code:
                set trace on
                dstat (gini mld) wage, rif(gini mld)
                I get the following failure message (reduced to the last parts - if you need full trace please notify me):

                Code:
                      -------------------------------------------------------------------------------------- begin dstat.Parse_slist ---
                      - gettoken touse slist : 0
                      - mata: dstat_slist_expand()
                      - c_local slist `"`slist'"'
                      = c_local slist `"(gini mld) wage"'
                      - c_local varlist: list uniq vlist
                      ---------------------------------------------------------------------------------------- end dstat.Parse_slist ---
                    - }
                    - else {
                      fvexpand `varlist' if `touse'
                      local varlist `r(varlist)'
                      }
                    - foreach plv of local plvars {
                      capt assert(`plv'>0) if `touse'
                      if _rc==1 exit _rc
                      else if _rc {
                      di as err `"`plv': poverty line must be positive"'
                      exit _rc
                      }
                      }
                    - if "`over'"!="" {
                    = if ""!="" {
                      capt assert ((`over'==floor(`over')) & (`over'>=0)) if `touse'
                      if _rc==1 exit _rc
                      if _rc {
                      di as err "variable in over() must be integer and nonnegative"
                      exit 452
                      }
                      qui levelsof `over' if `touse', local(overlevels)
                      local N_over: list sizeof overlevels
                      local overlevels "`overlevels'"
                      local over_labels
                      foreach o of local overlevels {
                      local olab: label (`over') `o'
                      local over_labels `"`over_labels' `"`olab'"'"'
                      }
                      local over_labels: list clean over_labels
                      if `"`bal_method'"'!="" {
                      if "`bal_ref'"!="" {
                      if `:list bal_ref in overlevels'==0 {
                      di as err "{bf:balance()}: no observations in reference distribution"
                      exit 2000
                      }
                      }
                      if "`bal_wvar'"!="" {
                      tempname BAL_WVAR
                      qui gen double `BAL_WVAR' = `wvar' if `touse'
                      }
                      fvexpand `bal_varlist' if `touse'
                      local bal_varlist2 `r(varlist)'
                      }
                      }
                    - tempname b id _N _W
                    - if "`subcmd'"!="summarize" tempname AT
                    = if "summarize"!="summarize" tempname AT
                    - if inlist("`subcmd'", "density", "quantile", "summarize") tempname BW
                    = if inlist("summarize", "density", "quantile", "summarize") tempname BW
                    - mata: dstat()
                invalid 'user'
                                 stata():  3598  Stata returned error
                        _dstat_cstripe():     -  function returned error
                         dstat_cstripe():     -  function returned error
                                 dstat():     -  function returned error
                                 <istmt>:     -  function returned error
                    -------------------------------------------------------------------------------------------- end dstat._Estimate ---
                    c_local generate_quietly `generate_quietly'
                    exit
                    }
                  ----------------------------------------------------------------------------------------------- end dstat.Estimate ---
                  if c(stata_version)<15 {
                  _estimates unhold `ecurrent', not
                  }
                  }
                ---------------------------------------------------------------------------------------------------------- end dstat ---
                r(3598);



                • #9
                  I am still puzzled why this error occurs and cannot reproduce it on any of the Stata installations I have access to. However, I have now posted an update of dstat on GitHub that contains code to circumvent the error should it occur. The consequence is that in some situations the displayed output table might look slightly awkward (results are not affected). To install the update from GitHub, type

                  Code:
                  . net install dstat, replace from(https://raw.githubusercontent.com/benjann/dstat/main/)
                  The update should also become available from SSC in some time.

                  ben



                  • #10
                    Sorry to say - with a freshly opened Stata:
                    Code:
                    . net install dstat, replace from(https://raw.githubusercontent.com/benjann/dstat/main/)
                    checking dstat consistency and verifying not already installed...
                    installing into c:\ado\plus\...
                    installation complete.
                    
                    . h dstat
                    
                    . sysuse nlsw88, clear
                    (NLSW, 1988 extract)
                    
                    . dstat (mean gmean median trim5 winsor5 huber95 hl) wage, over(union)
                    invalid 'user'
                                     stata():  3598  Stata returned error
                            _dstat_cstripe():     -  function returned error
                             dstat_cstripe():     -  function returned error
                                     dstat():     -  function returned error
                                     <istmt>:     -  function returned error
                    r(3598);
                    
                    . which dstat
                    c:\ado\plus\d\dstat.ado
                    *! version 1.0.2  27nov2020  Ben Jann



                    • #11
                      FWIW, you could use option user in calls to the version command even back in Stata 11; the option would probably not have an effect but it would not cause an error either. So this is indeed very strange. I notice that both Marc and Luis run Stata MP; Ben, do you have an MP version, too?



                      • #12
                        Still puzzling. I posted yet another update to GitHub. Can you try once more?



                        • #13
                          yes, Stata MP
                          Code:
                          . about
                          
                          Stata/MP 16.1 for Mac (64-bit Intel)
                          Revision 19 Nov 2020
                          Copyright 1985-2019 StataCorp LLC



                          • #14
                            OK, I think I got it:

                            Code:
                            . set dp comma
                            . mata : stata(sprintf("version %g, user", 16.1))
                            invalid 'user' 
                                             stata():  3598  Stata returned error
                                             <istmt>:     -  function returned error
                            r(3598);
                            With dp set to comma, the user version is displayed with a decimal comma:

                            Code:
                            . mata : st_numscalar("c(userversion)")
                              16,1
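
                            A possible reading of the output above: under set dp comma, the %g format writes the version number as 16,1, so the string passed to stata() becomes version 16,1, user, which Stata presumably can no longer parse as a version number followed by option user. Below is a minimal sketch of one way to guard against this, assuming the command string is built with sprintf() as in the reproduction above. The variable name v is only illustrative, and this is not necessarily how dstat itself was fixed.

                            ```stata
                            mata:
                            // format the user version, then force a period as the decimal mark
                            // (under -set dp comma- the %g format can yield e.g. "16,1")
                            v = subinstr(sprintf("%g", st_numscalar("c(userversion)")), ",", ".")
                            // the command string now reads e.g. "version 16.1, user" under either dp setting
                            stata("version " + v + ", user")
                            end
                            ```

                            Only the formatted number is passed through subinstr(), so the comma that introduces option user is left untouched. Temporarily switching to set dp period around the call would be another way to get the same effect.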



                            • #15
                              Thanks, awesome, you are a hero!

