  • Confidence interval for weighted average

    Dear all,

    I would appreciate some help. I want to get the confidence interval for a weighted average.
    The weighting is variance-based: first, compute each group's share of the total variance (the weight), then sum the products of mktinfo_share and these weights.
    Code:
    egen total_var = total(var_RET)
    gen var_weight = var_RET/ total_var
    gen mktinfo_share_weight = mktinfo_share * var_weight
    egen mktinfo_share_weighted = total(mktinfo_share_weight)
    With the usual -ci- command, the final weighted average is a single value repeated across all rows, so typing
    Code:
    ci means mktinfo_share_weighted, level(99)
    does not give a correct interval estimate.
    Therefore, I would like to ask whether there is another way to obtain the confidence interval for such a weighted average.
    The idea may follow from the basic definition of a CI:
    mean ± t-value × standard deviation / sqrt(number of observations)
    As long as I can work out the part after the ±, I can reconstruct the confidence interval.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id var_RET mktinfo)
      1 .00025453698  .00009306284
      2  .0010932052 1.6395407e-12
      3 .00013873918 .000015883144
      4 .00016952788  .00005152942
      5 .00015983787 .000065545595
      6 .00024326568  .00002890711
      7 .00016386366 .000017440707
      8 .00028597284  .00011248127
      9 .00023394095  .00008871407
     10 .00034012095  3.477177e-06
     11  .0007160784  .00013231797
     12  .0004039086  .00002951982
     13   .000220585 .000011787646
     14 .00010360737 .000011535892
     15 .00012653777  .00002533065
     16  .0003179252  .00002351869
     17 .00017056165 .000016126445
     18  .0005434161  .00006128963
     19 .00010800704  1.460614e-06
     20 .00020800895  .00007542817
     21 .00023145023  .00003084261
     22   .000695263  .00008889187
     23 .00019090103 .000018818724
     24 .00022233406 .000034568246
     25 .00009257844 .000011349546
     26  .0004938965  2.323115e-06
     27 .00050838693  .00001416657
     28  .0003364799  .00011926446
     29 .00027392787 3.8549214e-07
     30  .0003095213 .000072238654
     31  .0002522189  .00005137531
     32 .00015967217 .000017764502
     33   .000359111  .00003512886
     34  .0004625673  .00002947951
     35  .0001509002  6.536637e-06
     36  .0002188882 .000035863814
     37 .00023711416 .000018214803
     38 .00015291903  7.146552e-06
     39 .00015346557  .00002819447
     40 .00053468865  .00005954452
     41  .0003625301 .000026964013
     42 .00023529546  .00013698229
     43 .00023128066 .000012145727
     44  .0002992186  .00003733501
     45 .00045417435  .00002429819
     46 .00023781646   .0000285013
     47 .00016155763  .00003025555
     48  .0009921843 1.6805366e-06
     49  .0001302785 .000011270977
     50 .00004920922  8.644625e-06
     51  .0005262289  .00002889953
     52  .0003162579  .00005325446
     53  .0004438201  .00005493474
     54  .0006693757  .00004698702
     56 .00011338643   4.96667e-06
     57 .00015950597  .00008380983
     58  .0011968252  .00013898936
     59 .00040132646  .00004648703
     60  .0005500457   .0001222533
     61 .00035834225 .000013325222
     62  .0006459293  .00005520241
     63  .0004001613  .00008322814
     64  .0004163074   .0000336237
     65  .0011161747  .00004507854
     66  .0004083046   .0000481796
     67 .00011901466  .00005155581
     68  .0004205853  .00010002564
     69  .0006257504 .000027802735
     70  .0003540163 .000031119293
     71 .00028855423 .000062650455
     72    .00020876  .00001857438
     73 .00011886151 .000012743667
     74   .001184069   .0000846461
     75 .00008185631   .0000181655
     76 .00017324876  .00005542144
     77 .00010932072  3.303772e-06
     78  .0004687891  .00003822233
     79 .00026965962 .000020909243
     80 .00005452983  3.757928e-06
     81  .0002793335  .00006898495
     82 .00017664494  .00005756211
     83  .0001835153  .00008233645
     84  .0003393123 .000027126114
     85 .00022325464 .000019327326
     86   .000078872 .000017647546
     87  .0004393946  .00001253071
     88  .0005859645  .00006841045
     89  .0004759088  .00007075154
     90 .00031620075  .00005308945
     91 .00025006154  .00004220495
     92  .0007614839   .0001366144
     93 .00005937216  .00001791885
     94 .00044717905   9.66545e-07
     95 .00009891028   .0000674664
     96  .0004589452  .00013130865
     97 .00019592948   .0001016715
     98 .00027104112  .00008730995
     99   .000129755  7.287784e-06
    100  .0003375435   .0000227227
    101  .0010927363  .00020377597
    end
    I forgot to mention: id identifies a group of stock-year observations. The variance is first calculated within each id, and then duplicates are dropped so that only one observation remains per id.



    Last edited by Wen-Hung Hsu; 18 Dec 2022, 13:28.

  • #2
    I have a second question:
    Still for the weighted average, how can I perform a t-test of the difference in means between different periods for the same variable?
    For example,
    Code:
    gen sub = 1 
    replace sub = 2 if year >1996
    bysort sub: egen total_var_B = total(var_RET)
    gen var_weight_B = var_RET/ total_var_B
    gen mktinfo_share_weight_B = mktinfo_share_w * var_weight_B
    bysort sub: egen mktinfo_share_B = total(mktinfo_share_weight_B)
    Since the weighted average is still only a single value within each sub-period, how can I perform the t-test for the difference in means between the two sub-periods?

    • #3
      Re #1:
      Code:
      mean mktinfo [aweight = var_RET]
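To connect this to the "mean ± t-value × standard error" idea in #1, here is a minimal sketch, assuming the -dataex- example from #1 is in memory. The names tv_check, wprod_check, wavg_check, half, lb and ub are hypothetical and used only for illustration. It checks that -mean- with aweights reproduces the hand-built weighted average, and then rebuilds the 99% interval from the stored coefficient, standard error and degrees of freedom. Whether the aweight-based standard error suits your design is something you would need to judge.

```stata
* Sketch only: assumes the -dataex- example from #1 is loaded
* (variables id, var_RET, mktinfo).

* Weighted mean with a 99% confidence interval
mean mktinfo [aweight = var_RET], level(99)

* Rebuild the interval by hand: estimate +/- t critical value * standard error
* (0.005 is the upper-tail probability for a two-sided 99% interval)
scalar half = invttail(e(df_r), 0.005) * _se[mktinfo]
scalar lb = _b[mktinfo] - half
scalar ub = _b[mktinfo] + half
display "99% CI by hand: [" lb ", " ub "]"

* Cross-check the point estimate against the hand-built weighted average
egen double tv_check = total(var_RET)
gen double wprod_check = mktinfo * var_RET / tv_check
egen double wavg_check = total(wprod_check)
display "hand-built weighted average = " wavg_check[1]
display "mean with aweights          = " _b[mktinfo]
```

The two displayed point estimates should agree up to rounding, and the hand-built interval should match the one in the -mean- output table.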
      Re #2:
      Code:
      regress mktinfo i.sub [aweight = var_RET]
      The difference in weighted means between the two sub-periods, along with its standard error, confidence interval and test statistics, appears in the 2.sub row of the output table.
      Note: The code shown for #2 is untested because the example data does not include a year variable.
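For anyone who wants to try the regression approach on the example data, here is a minimal sketch. Because the data has no year variable, it splits the observations with an arbitrary, purely illustrative grouping: sub_demo is a hypothetical stand-in for the real sub, and wmean1, wmean2 and wdiff are hypothetical scalar names. It then checks that the 2.sub_demo coefficient equals the difference between the two weighted sub-period means.

```stata
* Sketch only: assumes the -dataex- example from #1 is loaded.
* sub_demo is an arbitrary illustrative split, NOT the real sub-periods.
gen byte sub_demo = cond(id <= 50, 1, 2)

* Weighted regression on the group indicator
regress mktinfo i.sub_demo [aweight = var_RET]

* Weighted mean within each group, computed separately
summarize mktinfo [aweight = var_RET] if sub_demo == 1
scalar wmean1 = r(mean)
summarize mktinfo [aweight = var_RET] if sub_demo == 2
scalar wmean2 = r(mean)

* The coefficient on 2.sub_demo should match the difference in weighted means
scalar wdiff = wmean2 - wmean1
display "difference in weighted means = " wdiff
display "2.sub_demo coefficient       = " _b[2.sub_demo]
```

With the real data, replace sub_demo with sub; the t statistic and confidence interval in the 2.sub row are then the test of the difference between sub-periods.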

      Added: General principle: in Stata, calculations involving weighted means are usually done with the unweighted data and a reference in the command to the desired weighting. While it is fine to calculate weighted means as new variables, they usually don't play a role in weighted analyses of the data.
      Last edited by Clyde Schechter; 18 Dec 2022, 15:12.
