Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Table with double sorting and two cases plus t-stat

    Hi everybody,
    I have the following dataset:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date PFnumber Return MeR SMB HML RF) byte(dunemp dinflation dfomc duninfl dunfomc dinflfomc dall) float Exret double firstbeta float(firstbetastatic size B2M exretA exretN)
     2758 1 -.46  .04  -.3   .18 .016 0 0 0 0 0 0 0      -.476 1.4879545137011279 1.0128056 1 1     .      -.476
    11611 5 1.19  .64  .75   -.1 .018 0 0 0 0 0 0 0      1.172   .529238700062847  .7444988 1 5     .      1.172
     6312 2  .08  .03  .12   .27 .019 0 0 0 0 0 0 0       .061  .5535918342666777  .8976381 1 2     .       .061
     9611 5  .04    0  .19   -.4 .024 0 0 0 0 0 0 0 .015999999  .4749225004099712  .7444988 1 5     . .015999999
    13907 4  .41 1.01  .29  -.53 .021 0 0 0 0 0 0 0       .389   .469481372849392  .7557166 1 4     .       .389
    14266 3 -.11 -.66  .41   .15 .019 0 0 0 0 0 0 0      -.129  .6212392009081008  .7998899 1 3     .      -.129
     5205 3 -.18    0 -.06     0 .036 0 0 0 0 0 0 0      -.216   .831576466663534  .7998899 1 3     .      -.216
    17447 3 -.59 -.27 -.18  -.29 .014 0 0 0 0 0 0 0      -.604 1.0311082397263975  .7998899 1 3     .      -.604
     4140 5 -.76 -.57  .23  -.17 .015 0 0 0 0 0 0 0      -.775 1.0797220465688908  .7444988 1 5     .      -.775
    15504 5 -.44 -1.1 -.28   .43 .006 0 1 0 1 0 1 1      -.446  .4334751155209491  .7444988 1 5 -.446          .
    12263 3 -.12  .54 -.51   .08 .011 0 0 0 0 0 0 0      -.131  .5380154342389192  .7998899 1 3     .      -.131
     3728 3  .06  .39 -.15  -.46 .027 0 0 0 0 0 0 0       .033 1.1760068623254416  .7998899 1 3     .       .033
     7565 5   .7  1.5 -.31 -1.09 .036 0 0 1 0 1 1 1       .664    .69895095080145  .7444988 1 5  .664          .
    15075 5  .94 2.87 -.35 -1.36  .02 0 0 0 0 0 0 0        .92  .2934694266349103  .7444988 1 5     .        .92
     5919 4  .21  .98 -.49   .39 .017 0 0 0 0 0 0 0       .193  .7090840191200938  .7557166 1 4     .       .193
    10233 5 1.03  .74 -.08   .29 .015 0 0 0 0 0 0 0      1.015  .5712486792705823  .7444988 1 5     .      1.015
    12822 1  .84  .13  .17   -.2 .021 0 0 0 0 0 0 0   .8189999  .7636763451047806 1.0128056 1 1     .   .8189999
    13207 2  .28 -.19  .37   .27  .02 0 0 0 0 0 0 0        .26  .7328015471319895  .8976381 1 2     .        .26
     2484 2 -.24 -.31  .04   .05 .022 0 1 0 1 0 1 1      -.262 1.2391570819266435  .8976381 1 2 -.262          .
     3065 2   .5 -.13  .46  -.23  .02 0 0 0 0 0 0 0        .48 1.4494875065537562  .8976381 1 2     .        .48
    end
    format %td date
    My aim is to contain a table of means of "Exret" sorted by size (categorical variable that goes from 1 to 5) and B2M (categorical variable that goes from 1 to 5). I would like to sort the Exret means on the rows by size and on the columns by B2M. To have an idea of what I mean:

    table size B2M, c(mean Exret)

    However, I need to add to modifications to this table. First, I have a dummy variable ("dall") according to which I would sort the variable Exret, i.e. I need a table as the one produced with the code of above but where every entry should be divided into the categories of "dall" (which means zero and one). Furthermore, every entry should report the t-statistic for the null hypothesis of a zreo mean. I attach below a visual example of what I would like to do.
    I tried to create two distinct variable of Exret, one for each value of "dall" but I can get only two distinct table and I want a unique one. Furthermore, I can't find how to get the t-statistic below the mean.
    Dall B2M
    Size 1 2 3 4 5
    0 1 5.3
    (1.95)
    6.6
    (1.24)
    5.3
    (1.95)
    6.6
    (1.24)
    5.3
    (1.95)
    1 4.1
    (7)
    1.89
    (2.02)
    4.1
    (7)
    1.89
    (2.02)
    4.1
    (7)
    0 2 1.3
    (1.95)
    4.6
    (1.24)
    5.3
    (1.95)
    6.99
    (1.24)
    5.3
    (6.95)
    1 4.1
    (7)
    1.89
    (2.02)
    4.1
    (7)
    1.89
    (2.02)
    4.1
    (7)
    0 3 5.3
    (1.95)
    6.6
    (1.24)
    5.3
    (1.95)
    6.6
    (1.24)
    5.3
    (1.95)
    1 1.3
    (1.95)
    4.6
    (1.24)
    5.3
    (1.95)
    6.99
    (1.24)
    5.3
    (6.95)
    0 4 5.3
    (1.95)
    6.6
    (1.24)
    5.3
    (1.95)
    6.99
    (1.24)
    5.3
    (6.95)
    1 4.1
    (7)
    1.89
    (2.02)
    4.1
    (7)
    4.6
    (1.24)
    5.3
    (1.95)
    0 5 5.3
    (1.95)
    6.6
    (1.24)
    4.1
    (7)
    4.6
    (1.24)
    5.3
    (1.95)
    1 4.6
    (1.24)
    5.3
    (1.95)
    6.99
    (1.24)
    5.3
    (6.95)
    5.3
    (1.95)

  • #2
    This will be close to what you want:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date PFnumber Return MeR SMB HML RF) byte(dunemp dinflation dfomc duninfl dunfomc dinflfomc dall) float Exret double firstbeta float(firstbetastatic size B2M exretA exretN)
     2758 1 -.46  .04  -.3   .18 .016 0 0 0 0 0 0 0      -.476 1.4879545137011279 1.0128056 1 1     .      -.476
    11611 5 1.19  .64  .75   -.1 .018 0 0 0 0 0 0 0      1.172   .529238700062847  .7444988 1 5     .      1.172
     6312 2  .08  .03  .12   .27 .019 0 0 0 0 0 0 0       .061  .5535918342666777  .8976381 1 2     .       .061
     9611 5  .04    0  .19   -.4 .024 0 0 0 0 0 0 0 .015999999  .4749225004099712  .7444988 1 5     . .015999999
    13907 4  .41 1.01  .29  -.53 .021 0 0 0 0 0 0 0       .389   .469481372849392  .7557166 1 4     .       .389
    14266 3 -.11 -.66  .41   .15 .019 0 0 0 0 0 0 0      -.129  .6212392009081008  .7998899 1 3     .      -.129
     5205 3 -.18    0 -.06     0 .036 0 0 0 0 0 0 0      -.216   .831576466663534  .7998899 1 3     .      -.216
    17447 3 -.59 -.27 -.18  -.29 .014 0 0 0 0 0 0 0      -.604 1.0311082397263975  .7998899 1 3     .      -.604
     4140 5 -.76 -.57  .23  -.17 .015 0 0 0 0 0 0 0      -.775 1.0797220465688908  .7444988 1 5     .      -.775
    15504 5 -.44 -1.1 -.28   .43 .006 0 1 0 1 0 1 1      -.446  .4334751155209491  .7444988 1 5 -.446          .
    12263 3 -.12  .54 -.51   .08 .011 0 0 0 0 0 0 0      -.131  .5380154342389192  .7998899 1 3     .      -.131
     3728 3  .06  .39 -.15  -.46 .027 0 0 0 0 0 0 0       .033 1.1760068623254416  .7998899 1 3     .       .033
     7565 5   .7  1.5 -.31 -1.09 .036 0 0 1 0 1 1 1       .664    .69895095080145  .7444988 1 5  .664          .
    15075 5  .94 2.87 -.35 -1.36  .02 0 0 0 0 0 0 0        .92  .2934694266349103  .7444988 1 5     .        .92
     5919 4  .21  .98 -.49   .39 .017 0 0 0 0 0 0 0       .193  .7090840191200938  .7557166 1 4     .       .193
    10233 5 1.03  .74 -.08   .29 .015 0 0 0 0 0 0 0      1.015  .5712486792705823  .7444988 1 5     .      1.015
    12822 1  .84  .13  .17   -.2 .021 0 0 0 0 0 0 0   .8189999  .7636763451047806 1.0128056 1 1     .   .8189999
    13207 2  .28 -.19  .37   .27  .02 0 0 0 0 0 0 0        .26  .7328015471319895  .8976381 1 2     .        .26
     2484 2 -.24 -.31  .04   .05 .022 0 1 0 1 0 1 1      -.262 1.2391570819266435  .8976381 1 2 -.262          .
     3065 2   .5 -.13  .46  -.23  .02 0 0 0 0 0 0 0        .48 1.4494875065537562  .8976381 1 2     .        .48
    end
    format %td date
    
    keep size B2M dall Exret
    
    capture program drop once
    program define once
        ttest Exret = 0
        gen mean_Exret = r(mu_1)
        gen tstat_Exret = r(t)
        keep if _n == 1
        exit
    end
    
    runby once, by(size B2M dall)
    
    drop Exret
    rename *_Exret =_B2M_
    reshape wide mean_Exret_B2M_ tstat_Exret_B2M_, i(size dall) j(B2M)
    rename mean_* *_mean
    rename tstat_* *_tstat
    ds Exret_B2M_*_mean
    local stubs `r(varlist)'
    local stubs: subinstr local stubs "mean" "", all
    
    reshape long `stubs', i(dall size) j(statistic) string
    sort size dall
    rename Exret_B2M_#_ B2M_#
    The exact layout you specify is not possible as a Stata data set because 1, 2, 3, 4, and 5 are not legal variable names, and numeric variables cannot contain parentheses.

    Some simpler code that will show a table in the Results window that is also close to, but not exactly like, your layout is:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date PFnumber Return MeR SMB HML RF) byte(dunemp dinflation dfomc duninfl dunfomc dinflfomc dall) float Exret double firstbeta float(firstbetastatic size B2M exretA exretN)
     2758 1 -.46  .04  -.3   .18 .016 0 0 0 0 0 0 0      -.476 1.4879545137011279 1.0128056 1 1     .      -.476
    11611 5 1.19  .64  .75   -.1 .018 0 0 0 0 0 0 0      1.172   .529238700062847  .7444988 1 5     .      1.172
     6312 2  .08  .03  .12   .27 .019 0 0 0 0 0 0 0       .061  .5535918342666777  .8976381 1 2     .       .061
     9611 5  .04    0  .19   -.4 .024 0 0 0 0 0 0 0 .015999999  .4749225004099712  .7444988 1 5     . .015999999
    13907 4  .41 1.01  .29  -.53 .021 0 0 0 0 0 0 0       .389   .469481372849392  .7557166 1 4     .       .389
    14266 3 -.11 -.66  .41   .15 .019 0 0 0 0 0 0 0      -.129  .6212392009081008  .7998899 1 3     .      -.129
     5205 3 -.18    0 -.06     0 .036 0 0 0 0 0 0 0      -.216   .831576466663534  .7998899 1 3     .      -.216
    17447 3 -.59 -.27 -.18  -.29 .014 0 0 0 0 0 0 0      -.604 1.0311082397263975  .7998899 1 3     .      -.604
     4140 5 -.76 -.57  .23  -.17 .015 0 0 0 0 0 0 0      -.775 1.0797220465688908  .7444988 1 5     .      -.775
    15504 5 -.44 -1.1 -.28   .43 .006 0 1 0 1 0 1 1      -.446  .4334751155209491  .7444988 1 5 -.446          .
    12263 3 -.12  .54 -.51   .08 .011 0 0 0 0 0 0 0      -.131  .5380154342389192  .7998899 1 3     .      -.131
     3728 3  .06  .39 -.15  -.46 .027 0 0 0 0 0 0 0       .033 1.1760068623254416  .7998899 1 3     .       .033
     7565 5   .7  1.5 -.31 -1.09 .036 0 0 1 0 1 1 1       .664    .69895095080145  .7444988 1 5  .664          .
    15075 5  .94 2.87 -.35 -1.36  .02 0 0 0 0 0 0 0        .92  .2934694266349103  .7444988 1 5     .        .92
     5919 4  .21  .98 -.49   .39 .017 0 0 0 0 0 0 0       .193  .7090840191200938  .7557166 1 4     .       .193
    10233 5 1.03  .74 -.08   .29 .015 0 0 0 0 0 0 0      1.015  .5712486792705823  .7444988 1 5     .      1.015
    12822 1  .84  .13  .17   -.2 .021 0 0 0 0 0 0 0   .8189999  .7636763451047806 1.0128056 1 1     .   .8189999
    13207 2  .28 -.19  .37   .27  .02 0 0 0 0 0 0 0        .26  .7328015471319895  .8976381 1 2     .        .26
     2484 2 -.24 -.31  .04   .05 .022 0 1 0 1 0 1 1      -.262 1.2391570819266435  .8976381 1 2 -.262          .
     3065 2   .5 -.13  .46  -.23  .02 0 0 0 0 0 0 0        .48 1.4494875065537562  .8976381 1 2     .        .48
    end
    format %td date
    
    keep size B2M dall Exret
    
    capture program drop once
    program define once
        ttest Exret = 0
        gen mean_Exret = r(mu_1)
        gen tstat_Exret = r(t)
        keep if _n == 1
        exit
    end
    
    runby once, by(size B2M dall)
    
    drop Exret
    tabdisp dall B2M, by(size) c(mean_Exret tstat_Exret) format(%3.2f)
    Again, this will not put the t-statistics in parentheses because -tabdisp- doesn't have that capability.

    Note: Your data example contains only one value of size, so the results from these examples do not show the full layout.

    It is, in principle, possible to create the exact layout you are looking for by using a lot of display statements in a lengthy and complicated block of code. It is unlikely that it is worth the effort to do that just to get a minor change in the layout.


    Comment


    • #3
      Dear Mr. Schechter,
      thank you a lot for your answer. As you said, the result is very very close to what I exactly want. Only one remark: you say that "The exact layout you specify is not possible as a Stata data set because 1, 2, 3, 4, and 5 are not legal variable names, and numeric variables cannot contain parentheses."
      Actually I could use strings for both Size and B2M. In particular, I would like to define the 5 categories of size as "Small, 2, 3, 4, Large" and those for B2M as "Growth, 2, 3, 4, Value".
      Could the result be then improved? Is that what you mean? If not, I am already well satisfied with this.

      Comment


      • #4
        In referring to strings, I was thinking about the values in the cells of the tables, not the headers/stubs. It's the parentheses. Now, of course, you could change the variables B2M_* to strings and then add the parentheses following the code in the first block:

        Code:
        tostring B2M_*, replace format(%3.2f) force
        foreach v of varlist B2M_* {
            replace `v'= "(" + `v' + ")" if statistic == "tstat"
        }
        Similarly you could
        Code:
        rename B2M_1 Growth
        rename B2m_5 Value
        label define size 1 "Small" 5 "Large"
        label values size size
        and that would get you even closer.

        Of course, you are now in the realm of a data set that is meant for human eyes but is useless for further analysis.

        Comment


        • #5
          I understand what you mean. I guess then the second solution you provided me with it the closest possible. Thank you

          Comment

          Working...
          X