Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • issue with collect recode

    I am trying to get a table that has percent distribution of some categorical variables, 95% CIs for the percent, and p-value from a chi2 test comparing two levels of the 'by' variable. My code goes like this:

    dtable i.sex i.hdage i.white, by(ld10, nototals tests) svy
    collect style autolevels result _dtable_test, clear
    collect: proportion sex hdage white [iw=gross4_ind], percent over(ld10)
    collect composite define _r_ci = _r_lb _r_ub, delimiter(",") trim replace
    collect style cell result[_r_ci], sformat("(%s)") nformat(%6.0f)
    collect style cell result[fvpercent], sformat("%s%%") nformat(%6.0f)
    collect style autolevels result fvpercent _r_ci _dtable_test
    collect layout (colname var) (ld10#result)

    The table below is the closest to what I am trying to achieve, but obviously I want the p-values to be in the upper half of the table and I don't need the bottom half of the table. I understand that I need to use collect recode (or collect remap). I have spent a lot of time trying this but to no good result. How do I identify what I need to collect recode, and to what? Thanks in advance!


    Click image for larger version

Name:	Screenshot 2025-08-15 at 10.23.42.png
Views:	1
Size:	123.7 KB
ID:	1781028





    Attached Files
    Last edited by Filip Sosenko; 15 Aug 2025, 03:35.

  • #2
    Please provide a reproducible example.

    Comment


    • #3
      Thank you Andrew. I was hoping I can avoid this.

      Code:
      clear
      input byte sex long gross4_ind byte ld2
      1 1 1
      2 1 1
      2 1 1
      2 1 1
      2 1 1
      1 1 2
      2 1 1
      1 1 1
      1 1 1
      1 1 1
      2 1 2
      2 1 1
      2 1 1
      2 1 1
      2 1 1
      1 1 2
      1 1 1
      1 1 2
      1 1 1
      2 1 2
      end
      label values sex SEX
      label def SEX 1 "Male", modify
      label def SEX 2 "Female", modify
      label values ld2 ld2
      label def ld2 1 "Not disabled" 2 "Disabled", modify
      Code:
      svyset [iw=gross4_ind]
      dtable i.sex, by(ld2, nototals tests) svy
      collect style autolevels result _dtable_test, clear
      collect: proportion sex [iw=gross4_ind], percent over(ld2)
      collect composite define _r_ci = _r_lb _r_ub, delimiter(",") trim replace
      collect style cell result[_r_ci], sformat("(%s)") nformat(%6.0f) 
      collect style cell result[fvpercent], sformat("%s%%") nformat(%6.0f) 
      collect style autolevels result fvpercent _r_ci _dtable_test
      
      collect recode colname [email protected] = 1.sex  [email protected] = 1.sex [email protected] = 2.sex [email protected] = 2.sex
      collect layout (colname var) (ld2#result)

      Comment


      • #4
        Thanks for the reproducible example. See

        Code:
        help collect addtags

        Code:
        clear
        input byte sex long gross4_ind byte ld2
        1 1 1
        2 1 1
        2 1 1
        2 1 1
        2 1 1
        1 1 2
        2 1 1
        1 1 1
        1 1 1
        1 1 1
        2 1 2
        2 1 1
        2 1 1
        2 1 1
        2 1 1
        1 1 2
        1 1 1
        1 1 2
        1 1 1
        2 1 2
        end
        label values sex SEX
        label def SEX 1 "Male", modify
        label def SEX 2 "Female", modify
        label values ld2 ld2
        label def ld2 1 "Not disabled" 2 "Disabled", modify
        
        collect clear
        svyset [iw=gross4_ind]
        dtable i.sex, by(ld2, nototals tests) svy
        collect style autolevels result _dtable_test, clear
        collect: proportion sex [iw=gross4_ind], percent over(ld2)
        collect composite define _r_ci = _r_lb _r_ub, delimiter(",") trim replace
        collect style cell result[_r_ci], sformat("(%s)") nformat(%6.0f) 
        collect style cell result[fvpercent], sformat("%s%%") nformat(%6.0f) 
        collect style autolevels result fvpercent _r_ci _dtable_test
        collect recode colname [email protected] = 1.sex  [email protected] = 1.sex [email protected] = 2.sex [email protected] = 2.sex
        collect addtags colname[], fortags(colname[2.sex] result[_dtable_test]) replace
        collect layout (colname) (ld2#result)
        Res.:

        Code:
        . 
        . collect layout (colname) (ld2#result)
        
        Collection: DTable
              Rows: colname
           Columns: ld2#result
           Table 1: 3 x 5
        
        -------------------------------------
                             ld2             
               Not disabled   Disabled   Test
        -------------------------------------
        sex                                  
          Male  40% (18,67) 60% (18,91)      
                60% (33,82) 40%  (9,82) 0.457
        -------------------------------------

        Comment


        • #5
          Thanks Andrew. That works in this example, but not if I have more than one categorical variable. How do I amend the collect addtag line when I have 2+ categorical variables?

          Comment


          • #6
            Again, you'd have to create a reproducible example of the problem (with more than 1 categorical variable).

            Comment


            • #7
              Thanks Andrew.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input byte(sex hdage) long gross4_ind byte ld2
              2 2 1 1
              1 3 1 2
              1 2 1 1
              2 3 1 1
              1 3 1 1
              2 2 1 1
              1 1 1 1
              1 1 1 1
              2 1 1 2
              1 1 1 1
              2 3 1  2
              1 2 1 2
              2 2 1 1
              2 1 1 1
              1 3 1 2
              2 3 1 1
              2 2 1 1
              1 2 1 2
              2 1 1 1
              2 2 1 1
              end
              label values sex SEX
              label def SEX 1 "Male", modify
              label def SEX 2 "Female", modify
              label values hdage HDAGE
              label def HDAGE 1 "Age 16 to 24", modify
              label def HDAGE 2 "Age 25 to 34", modify
              label def HDAGE 3 "Age 35 to 44", modify
              label values ld2 ld2
              label def ld2 1 "Not disabled" 2 "Disabled", modify
              Code:
              svyset [iw=gross4_ind]
              dtable i.sex i.hdage, by(ld2, nototals tests) svy
              collect style autolevels result _dtable_test, clear
              collect: proportion sex hdage [iw=gross4_ind], percent over(ld2)
              collect composite define _r_ci = _r_lb _r_ub, delimiter(",") trim replace
              collect style cell result[_r_ci], sformat("(%s)") nformat(%6.0f)
              collect style cell result[fvpercent], sformat("%s%%") nformat(%6.0f)
              collect style autolevels result fvpercent _r_ci _dtable_test
              
              collect recode colname [email protected] = 1.sex  [email protected] = 1.sex [email protected] = 2.sex [email protected] = 2.sex
              collect recode colname [email protected] = 1.hdage  [email protected] = 1.hdage [email protected] = 2.hdage [email protected] = 2.hdage [email protected] = 3.hdage [email protected] = 3.hdage
              collect layout (colname var) (ld2#result)

              Comment


              • #8
                Thanks for the example. Consider:

                Code:
                clear
                input byte(sex hdage) long gross4_ind byte ld2
                2 2 1 1
                1 3 1 2
                1 2 1 1
                2 3 1 1
                1 3 1 1
                2 2 1 1
                1 1 1 1
                1 1 1 1
                2 1 1 2
                1 1 1 1
                2 3 1  2
                1 2 1 2
                2 2 1 1
                2 1 1 1
                1 3 1 2
                2 3 1 1
                2 2 1 1
                1 2 1 2
                2 1 1 1
                2 2 1 1
                end
                label values sex SEX
                label def SEX 1 "Male", modify
                label def SEX 2 "Female", modify
                label values hdage HDAGE
                label def HDAGE 1 "Age 16 to 24", modify
                label def HDAGE 2 "Age 25 to 34", modify
                label def HDAGE 3 "Age 35 to 44", modify
                label values ld2 ld2
                label def ld2 1 "Not disabled" 2 "Disabled", modify
                
                svyset [iw=gross4_ind]
                dtable i.sex i.hdage, by(ld2, nototals tests) svy
                collect style autolevels result _dtable_test, clear
                collect: proportion sex hdage [iw=gross4_ind], percent over(ld2)
                collect composite define _r_ci = _r_lb _r_ub, delimiter(",") trim replace
                collect style cell result[_r_ci], sformat("(%s)") nformat(%6.0f)
                collect style cell result[fvpercent], sformat("%s%%") nformat(%6.0f)
                collect style autolevels result fvpercent _r_ci _dtable_test
                collect recode colname [email protected] = 1.sex  [email protected] = 1.sex [email protected] = 2.sex [email protected] = 2.sex
                collect recode colname [email protected] = 1.hdage  [email protected] = 1.hdage [email protected] = 2.hdage [email protected] = 2.hdage [email protected] = 3.hdage [email protected] = 3.hdage
                collect layout (colname var) (ld2#result)
                
                qui collect levelsof var
                collect remap var[`s(levels)']= colname[`s(levels)'], fortags(result[_dtable_test])
                collect layout (colname) (ld2#result)
                Res.:

                Code:
                . collect layout (colname var) (ld2#result)
                
                Collection: DTable
                      Rows: colname var
                   Columns: ld2#result
                   Table 1: 14 x 5
                
                ---------------------------------------------
                                             ld2             
                               Not disabled   Disabled   Test
                ---------------------------------------------
                sex                                          
                  Male          36% (15,64) 67% (25,92)      
                  Female        64% (36,85) 33%  (8,75)      
                hdage                                        
                  Age 16 to 24  36% (15,64) 17%  (2,66)      
                  Age 25 to 34  43% (19,70) 33%  (8,75)      
                  Age 35 to 44  21%  (7,52) 50% (15,85)      
                sex                                          
                  Male          36%         67%         0.229
                  Female        64%         33%              
                hdage                                        
                  Age 16 to 24  36%         17%         0.444
                  Age 25 to 34  43%         33%              
                  Age 35 to 44  21%         50%              
                ---------------------------------------------
                
                . 
                . 
                . 
                . qui collect levelsof var
                
                . 
                . collect remap var[`s(levels)']= colname[`s(levels)'], fortags(result[_dtable_test])
                (2 items remapped in collection DTable)
                
                . 
                . collect layout (colname) (ld2#result)
                
                Collection: DTable
                      Rows: colname
                   Columns: ld2#result
                   Table 1: 7 x 5
                
                ---------------------------------------------
                                             ld2             
                               Not disabled   Disabled   Test
                ---------------------------------------------
                sex                                          
                  Male          36% (15,64) 67% (25,92) 0.229
                  Female        64% (36,85) 33%  (8,75)      
                hdage                                        
                  Age 16 to 24  36% (15,64) 17%  (2,66) 0.444
                  Age 25 to 34  43% (19,70) 33%  (8,75)      
                  Age 35 to 44  21%  (7,52) 50% (15,85)      
                ---------------------------------------------

                Comment


                • #9
                  That worked perfectly, thanks Andrew!

                  Comment

                  Working...
                  X