  • Getting more decimals from matcell

    Dear statalisters,

    I am trying to obtain the percentages from a tabm table that uses weights, and I am getting what looks like a precision error. For variable r12 and Categ5, the reported proportion is 53.85%. However, if I calculate the same proportion from the results stored by matcell(), I get 53.83%. Is there any way to make matcell() store more decimals?

    Below you have my code and results:

    . tabm r1-r12 [aw=wgt], row nofreq matcell(freqs)

                      |                    values
             variable |   Categ1   Categ2   Categ3   Categ4   Categ5 |     Total
    ------------------+----------------------------------------------+----------
                   r1 |     0.00     0.00     0.71    45.33    53.96 |    100.00
                   r2 |     0.00     0.00     1.93    44.11    53.96 |    100.00
                   r3 |     0.00     0.00     4.21    41.83    53.96 |    100.00
                   r4 |     4.83     1.48     0.69    39.15    53.85 |    100.00
                   .. |     ....     ....     ....    .....    ..... |    ......
                  r12 |     0.00     0.00     0.00    46.15    53.85 |    100.00
    ------------------+----------------------------------------------+----------
                Total |     1.76     0.80     1.47    42.09    53.88 |    100.00

    . mat li freqs

    freqs[12,5]
                 c1         c2         c3         c4         c5
     r1           0          0  18.687787  1194.2136  1421.6247
     r2           0          0  50.843143  1162.0582  1421.6247
     r3           0          0  110.89391  1102.0074  1421.6247
     r4   127.37912  39.013458  18.216818  1033.5905  1421.6247
    .................................................. .......
    r12    58.60397  45.169448  74.980403  1039.4461  1421.6247


    . mat percent = 100*freqs/2641

    . mat li percent

    percent[12,5]
                 c1         c2         c3         c4         c5
     r1           0          0   .7076027  45.218234  53.829032
     r2           0          0  1.9251474  44.000689  53.829032
     r3           0          0  4.1989364    41.7269  53.829032
     r4   4.8231396  1.4772229  .68976971  39.136331  53.829032
    .................................................. .......
    r12   2.2190068  1.7103161  2.8390914  39.358049  53.829032





    Thank you in advance,

    Aina

  • #2
    Welcome to Statalist.

    Your problem is that the tabm output reports 53.85% for r12 c5, while the corresponding freqs cell is 1421.6247, and multiplying that by 100 and dividing by 2641 gives a smaller value, 53.83%.

    I note however that if you total the values of the freqs row for r12, the total is 2639.824621, and if you divide by that, rather than 2641, the result is 53.852998.
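
The two denominators can be compared directly. The lines below are only an arithmetic check using the figures quoted in this thread; they should reproduce the 53.83 and 53.85 results discussed above.

```stata
* Percentage for r12, Categ5 when dividing by 2641
display %9.6f 100 * 1421.6247 / 2641

* Same cell when dividing by the total of the r12 row of freqs
display %9.6f 100 * 1421.6247 / 2639.824621
```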



    • #3
      Thank you William,

      However, I would prefer working with the true values as this code is going to be used with different sets of data.



      • #4
        We should note, as the Statalist FAQ asks us to do, that tabm is a community-contributed command that is part of the tab_chi package available from SSC; see
        Code:
        ssc describe tab_chi
        for details. It took a while to figure that out, which is why the FAQ asks that questions include that information for commands that are not part of base Stata.

        If you were to run the commands
        Code:
        tabm r1-r12 [aw=wgt], row matcell(freqs)
        matrix list freqs
        you would find that the weighted frequencies reported in the tabm display are exactly those found in the freqs matrix.

        Each row of percentages in the tabm display is the corresponding row of frequencies, with each frequency divided by the total of that row and then multiplied by 100.

        Done that way, the total of the percentages will be 100. If you divide by something other than the total of the row - by the "true value" for example - the percentages will no longer total to 100%.
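
The row-percentage rule described above can be sketched with the saved matrix. This is a minimal sketch, not tabm's own code. The two-row freqs matrix is typed in from rows r1 and r4 of the listing in #1 so that the example stands alone; in practice freqs would be whatever matcell(freqs) returned. The name percent is only illustrative.

```stata
* Stand-in for the matrix saved by matcell(freqs): rows r1 and r4 from #1
matrix freqs = (0, 0, 18.687787, 1194.2136, 1421.6247)
matrix freqs = freqs \ (127.37912, 39.013458, 18.216818, 1033.5905, 1421.6247)

* Divide each cell by the total of its own row and multiply by 100
mata:
F = st_matrix("freqs")
st_matrix("percent", 100 * F :/ rowsum(F))
end

matrix list percent, format(%9.2f)
```

With these two rows, the Categ5 column should come out close to the 53.96 and 53.85 shown in the tabm display. A row whose total is zero would produce missing values, so real data may need a guard for that case.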

        Your problem is not a lack of precision in the freqs matrix returned by the matcell() option.

        You do not explain why you think 2641 is the "true value", nor do you explain what the "true value" is. Perhaps that is the number of observations in your dataset.

        If so, I expect that (at least) one observation was dropped because of missing values. This would explain why the total of row r12 is 2639.824621 rather than 2641, which is what I would expect when weighting with analytical weights. You could confirm this by running the commands without weighting the tabm command and with the missing option added.
        Code:
        tabm r1-r12, row matcell(freqs) missing
        matrix list freqs
        Last edited by William Lisowski; 10 Jun 2020, 09:29.



        • #5
          Dear William,

          Thank you for your prompt and comprehensive reply. I apologize for not being as clear as I should have been.

          You are right, there are some missing values that do change the total number of observations (the true value I referred to).

          I will follow your advice and calculate the percentages using each row total.

          Thank you,

          Aina
