  • Output of 'ci proportions' depends on variable format

    Why does the result of 'ci proportions x' depend on the display format of variable x? This seems very strange. I am using Stata 15. See below:

    . describe x

                  storage   display    value
    variable name   type    format     label      variable label
    -------------------------------------------------------------------------------------
    x               byte    %8.0f                 Example variable

    . tab x, missing

        Example |
       variable |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |     86,913       98.73       98.73
              1 |      1,119        1.27      100.00
    ------------+-----------------------------------
          Total |     88,032      100.00

    . ci proportions x

                                                             -- Binomial Exact --
        Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
    -------------+---------------------------------------------------------------
               x |     88,032           0           0               0           0

    . format x %10.0g

    . ci proportions x

                                                             -- Binomial Exact --
        Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
    -------------+---------------------------------------------------------------
               x |     88,032    .0127113    .0003776        .0119817    .0134733

    Last edited by Gabor Mihala; 27 Mar 2019, 19:37.

  • #2
    I can't reproduce your results. Here's what I get:
    Code:
    . clear
    
    . // create data
    . set obs 88032
    number of observations (_N) was 0, now 88,032
    
    . gen byte x = _n <= 1119
    
    . //
    . //
    . desc x
    
                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    x               byte    %8.0g                 
    
    . tab x
    
              x |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |     86,913       98.73       98.73
              1 |      1,119        1.27      100.00
    ------------+-----------------------------------
          Total |     88,032      100.00
    
    . ci proportions x   // works fine for me
    
                                                             -- Binomial Exact --
        Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
    -------------+---------------------------------------------------------------
               x |     88,032    .0127113    .0003776        .0119817    .0134733

    • #3
      Thanks, Mike. I had the %8.0f format to start with.
      Sorry about the formatting of my first post. I've been a Stata user since 2011, so this relatively simple issue is very surprising!

      • #4
        Welcome to Statalist, and thank you for the provocative question.

        My example below shows that the format assigned to x is applied by ci to all the values reported for x, other than Obs.

        This seems like a questionable decision in some cases, especially for 0/1 variables with a %n.0f format assigned, and a decision that should be more easily overridden with an option to the ci command. Nevertheless, ci appears to have operated this way for some time now, so I suppose it is not an oversight.

        Added in edit: the summarize command does not suffer from this issue; it apparently ignores the format assigned to x. I don't see why the ci command does what it does.
        Code:
        . clear
        
        . // create data
        . set obs 88032
        number of observations (_N) was 0, now 88,032
        
        . gen byte x = _n <= 1119
        
        . tab x
        
                  x |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  0 |     86,913       98.73       98.73
                  1 |      1,119        1.27      100.00
        ------------+-----------------------------------
              Total |     88,032      100.00
        
        . // try different formats
        . format x %8.0g
        
        . ci proportions x
        
                                                                 -- Binomial Exact --
            Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
        -------------+---------------------------------------------------------------
                   x |     88,032    .0127113    .0003776        .0119817    .0134733
        
        . format x %8.4f
        
        . ci proportions x
        
                                                                 -- Binomial Exact --
            Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
        -------------+---------------------------------------------------------------
                   x |     88,032      0.0127      0.0004          0.0120      0.0135
        
        . format x %8.0f
        
        . ci proportions x
        
                                                                 -- Binomial Exact --
            Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
        -------------+---------------------------------------------------------------
                   x |     88,032           0           0               0           0
        
        . // try older version
        . version 13: ci x, binomial
        
                                                                 -- Binomial Exact --
            Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
        -------------+---------------------------------------------------------------
                   x |     88,032           0           0               0           0
        
        .
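
        A possible workaround while x keeps its %8.0f format, offered as a sketch rather than an official recommendation: the display format appears to affect only the printed table, so the stored results can be shown with a format of your own choosing. This assumes ci proportions leaves its results in r(proportion), r(se), r(lb), and r(ub), as listed under "Stored results" in help ci for Stata 15. The data setup repeats the example above.

        ```stata
        clear
        set obs 88032
        generate byte x = _n <= 1119
        format x %8.0f

        * The printed table would show zeros under %8.0f, so suppress it
        * and read the stored results instead (names assumed from help ci)
        quietly ci proportions x

        * Show the stored results with an explicit format of our own
        display "proportion = " %9.7f r(proportion)
        display "std. err.  = " %9.7f r(se)
        display "95% CI     = [" %9.7f r(lb) ", " %9.7f r(ub) "]"
        ```

        Temporarily assigning a general format, as in post #1 (format x %10.0g), is the other obvious route.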
        Last edited by William Lisowski; 28 Mar 2019, 06:41.

        • #5
          Thank you, William.

          I tend to use the %8.0f variable format; maybe I should switch to the %8.0g format.
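
          To see what the choice between these formats does to a value like the proportion in this thread, here is a small sketch that uses display with an explicit format. The value 1119/88032 comes from the tabulation above; nothing here depends on ci itself.

          ```stata
          * Same number, three display formats
          display %8.0f 1119/88032   // fixed format, zero decimals: rounds to 0
          display %8.4f 1119/88032   // fixed format, four decimals
          display %8.0g 1119/88032   // general format: Stata picks the decimals
          ```

          A fixed format with zero decimals suits the 0/1 values themselves but not statistics computed from them, which is why the general format is the safer default here.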

          • #6
            In post #4 I gave an incomplete description of how the summarize command formats its output. It chooses a general format as a default, but includes an option to override the default with the variable's assigned format.

            This struck me as an elegant solution to providing a useful default format while allowing customization, so I asked Stata Technical Services about the difference in the approaches between the summarize and ci commands.

            In response I was told:

            "Thank you for bringing this to our attention. I have passed this along to the relevant developers, and we will release a fix in a future update. We apologize for any inconvenience."

            I appreciate that response.
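
            For anyone who wants to compare the two behaviours of summarize described at the top of this post, a minimal sketch using the same simulated data as in post #4. The format option is documented in help summarize; what each call prints is described in the comments only as what I would expect, not as verified output.

            ```stata
            clear
            set obs 88032
            generate byte x = _n <= 1119
            format x %8.0f

            * Default: summarize chooses a general format and ignores %8.0f,
            * so the mean should appear with its decimals
            summarize x

            * With the format option, summarize uses the variable's own
            * display format, so the mean should be shown rounded under %8.0f
            summarize x, format
            ```

            Either way the stored result r(mean) keeps full precision; only the printed table changes.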

            • #7
              Thank you, William
