
  • skewness from moments2 into data

    Hi,

    I am trying to use moments2 and then use the skewness, kurtosis, and SD it reports as variables in my data.
    The following is an example of my data (just a small part of the complete data set). My full data has 24 IDs x 20 years x 12 months x 30 days of observations.

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float edate byte mnth int yr str5 id double rate
    15706 1 2003 "A"        .
    15707 1 2003 "A" 1.776182
    15708 1 2003 "A" 1.774442
    15709 1 2003 "A"        .
    15710 1 2003 "A"        .
    15711 1 2003 "A"  1.74304
    15712 1 2003 "A" 1.741966
    15713 1 2003 "A" 1.747326
    15714 1 2003 "A" 1.729514
    15715 1 2003 "A" 1.723984
    15716 1 2003 "A"        .
    15717 1 2003 "A"        .
    15718 1 2003 "A" 1.716656
    15719 1 2003 "A" 1.712678
    15720 1 2003 "A" 1.712331
    15721 1 2003 "A" 1.706551
    15722 1 2003 "A"  1.69095
    15723 1 2003 "A"        .
    15724 1 2003 "A"        .
    15725 1 2003 "A" 1.694922
    15726 1 2003 "A" 1.702244
    15727 1 2003 "A" 1.704889
    15728 1 2003 "A" 1.690992
    15729 1 2003 "A" 1.690931
    15730 1 2003 "A"        .
    15731 1 2003 "A"        .
    15732 1 2003 "A" 1.687029
    15733 1 2003 "A" 1.699389
    15734 1 2003 "A" 1.692584
    15735 1 2003 "A" 1.701247
    15736 1 2003 "A" 1.700906
    15737 2 2003 "A"        .
    15738 2 2003 "A"        .
    15739 2 2003 "A" 1.715444
    15740 2 2003 "A" 1.696636
    15741 2 2003 "A" 1.688818
    15742 2 2003 "A" 1.693517
    15743 2 2003 "A" 1.694411
    15744 2 2003 "A"        .
    15745 2 2003 "A"        .
    15746 2 2003 "A" 1.686714
    15747 2 2003 "A" 1.698113
    15748 2 2003 "A" 1.689498
    15749 2 2003 "A" 1.693252
    15750 2 2003 "A" 1.684981
    15751 2 2003 "A"        .
    15752 2 2003 "A"        .
    15753 2 2003 "A" 1.696342
    15754 2 2003 "A" 1.693112
    15755 2 2003 "A" 1.687039
    15756 2 2003 "A" 1.677878
    15757 2 2003 "A" 1.668297
    15758 2 2003 "A"        .
    15759 2 2003 "A"        .
    15760 2 2003 "A" 1.661226
    15761 2 2003 "A" 1.651105
    15762 2 2003 "A" 1.651353
    15763 2 2003 "A" 1.644982
    15764 2 2003 "A" 1.652384
    15706 1 2003 "B"        .
    15707 1 2003 "B"  7.11009
    15708 1 2003 "B" 7.148095
    15709 1 2003 "B"        .
    15710 1 2003 "B"        .
    15711 1 2003 "B" 7.082285
    15712 1 2003 "B" 7.125659
    15713 1 2003 "B" 7.159198
    15714 1 2003 "B" 7.071191
    15715 1 2003 "B" 7.073217
    15716 1 2003 "B"        .
    15717 1 2003 "B"        .
    15718 1 2003 "B" 7.042848
    15719 1 2003 "B" 7.024203
    15720 1 2003 "B" 7.059662
    15721 1 2003 "B" 7.035498
    15722 1 2003 "B" 6.979253
    15723 1 2003 "B"        .
    15724 1 2003 "B"        .
    15725 1 2003 "B" 6.977377
    15726 1 2003 "B" 6.977659
    15727 1 2003 "B" 6.937955
    15728 1 2003 "B" 6.913173
    15729 1 2003 "B" 6.897533
    15730 1 2003 "B"        .
    15731 1 2003 "B"        .
    15732 1 2003 "B" 6.841766
    15733 1 2003 "B" 6.881199
    15734 1 2003 "B" 6.840817
    15735 1 2003 "B" 6.917008
    15736 1 2003 "B"  6.87574
    15737 2 2003 "B"        .
    15738 2 2003 "B"        .
    15739 2 2003 "B" 6.932333
    15740 2 2003 "B" 6.870911
    15741 2 2003 "B"  6.81549
    15742 2 2003 "B"  6.87441
    15743 2 2003 "B" 6.887571
    15744 2 2003 "B"        .
    15745 2 2003 "B"        .
    15746 2 2003 "B" 6.876665
    end
    format %d edate
    ------------------ copy up to and including the previous line ------------------


    To find skewness by id, mnth, and yr, I used the following command: "bys id mnth yr: moments2 rate"

    -> id = A, mnth = 1, yr = 2003
    ----------------------------------------------------------
       n = 22 |       mean          SD    skewness    kurtosis
    ----------+-----------------------------------------------
     exchrate |      1.715       0.026       1.155       0.519
    ----------------------------------------------------------
    -> id = A, mnth = 2, yr = 2003
    ----------------------------------------------------------
       n = 20 |       mean          SD    skewness    kurtosis
    ----------+-----------------------------------------------
     exchrate |      1.681       0.020      -0.591      -0.621
    ----------------------------------------------------------
    -> id = B, mnth = 1, yr = 2003
    ----------------------------------------------------------
       n = 22 |       mean          SD    skewness    kurtosis
    ----------+-----------------------------------------------
     exchrate |      6.999       0.100      -0.057      -1.214
    ----------------------------------------------------------
    -> id = B, mnth = 2, yr = 2003
    ----------------------------------------------------------
       n = 20 |       mean          SD    skewness    kurtosis
    ----------+-----------------------------------------------
     exchrate |      6.899       0.033      -0.652       0.260
    ----------------------------------------------------------

    What I want to do is to use the skewness as data. I would like to create a data set like the following:


    edate  mnth  yr    id   rate      skewness  SD     kurtosis
    .
    .
    15712  1     2003  "A"  1.741966   1.155    0.026   0.519
    15713  1     2003  "A"  1.747326   1.155    0.026   0.519
    .
    .
    15740  2     2003  "A"  1.696636  -0.591    0.020  -0.621
    15741  2     2003  "A"  1.688818  -0.591    0.020  -0.621
    .
    .
    15707  1     2003  "B"  7.110090  -0.057    0.100  -1.214
    15708  1     2003  "B"  7.148095  -0.057    0.100  -1.214
    .
    .
    15739  2     2003  "B"  6.932333  -0.652    0.033   0.260
    15740  2     2003  "B"  6.870911  -0.652    0.033   0.260

    Is there an efficient way to do this?
    I always appreciate your help and time.

  • #2
    What is moments2? https://www.statalist.org/forums/help#stata applies. You're asked to explain the provenance of community-contributed commands you refer to.



    • #3
      I apologize for the lack of explanation and hope the following information is enough.

      I am using moments2 from SSC in Stata 14.2.

      package moments2 from http://fmwww.bc.edu/RePEc/bocode/m
      TITLE
      'MOMENTS2': module to compute skewness and kurtosis measures

      DESCRIPTION/AUTHOR(S)

      moments2 calculates various measures of skewness and kurtosis.
      Based on Nicholas Cox's moments, it also calculates mean and
      standard deviation for a list of variables. moments2 differs
      from moments only in allowing different measures of skewness and
      kurtosis and making the measures used in SAS and SPSS the
      default.

      KW: moments
      KW: kurtosis
      KW: skewness
      KW: summary statistics

      Requires: Stata version 8.2

      Distribution-Date: 20130119

      Author: Dirk Enzmann, University of Hamburg
      Support: email [email protected]
      Thanks a lot.



      • #4
        Thanks for the detail. moments2 has documented capability to save results as a matrix, whence svmat in principle could be used.

        But I have to say that calling it up repeatedly as you do seems very roundabout and isn't compatible with saving as matrices.

        I'd use rangestat (SSC) as you aren't using any of the extra definitions supported by moments2.

        Code:
        . sysuse auto, clear
        (1978 Automobile Data)
        
        . rangestat (mean) mpg (sd) mpg (skewness) mpg (kurtosis) mpg, int(rep78 0 0)
        
        . tabdisp rep78, c(mpg_*)
        
        ------------------------------------------------------------------------------
        Repair    |
        Record    |
        1978      |     mean of mpg        sd of mpg  skewness of mpg  kurtosis of mpg
        ----------+-------------------------------------------------------------------
                1 |              21        4.2426407                0                1
                2 |          19.125        3.7583241        .22412358        1.6421611
                3 |       19.433333        4.1413252        .35552274         3.081041
                4 |       21.666667        4.9348699       -.13262994        1.9664636
                5 |       27.363636        8.7323849       -.01518308        1.5515859
                . |                                                                   
        ------------------------------------------------------------------------------
        At worst you need to define an interval variable:

        Code:
        egen interval = group(id  mnth yr), label
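
        Applied to the example data in #1, that idea might look like the following minimal sketch. It assumes rangestat is installed (ssc install rangestat) and that the new variables follow the varname_stat naming pattern visible in the tabdisp call above, so rate_sd, rate_skewness and rate_kurtosis are assumed names. The values follow rangestat's definitions, which need not agree with the defaults of moments2.

        ```stata
        * Assumes the dataex example from #1 is already in memory
        * and that rangestat has been installed from SSC
        egen interval = group(id yr mnth), label

        * interval(interval 0 0) limits each observation's window
        * to observations in its own id-year-month group
        rangestat (sd) rate (skewness) rate (kurtosis) rate, interval(interval 0 0)

        * Each observation now carries the statistics of its group
        list edate id rate rate_sd rate_skewness rate_kurtosis in 1/5
        ```

        Because the results are stored in every observation of each group, there is no need to loop over groups or to collect matrices afterwards.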



        • #5
          Thanks for the advice. It works perfectly.
          I have one more question, though. The skewness and kurtosis of mpg from rangestat differ from those from moments2, and moments2 gives the same skewness and kurtosis as Excel.

          Code:
          . sysuse auto, clear
          (1978 Automobile Data)
          
          . rangestat (mean) mpg (sd) mpg (skewness) mpg (kurtosis) mpg, int(rep78 0 0)
          
          . tabdisp rep78, c(mpg_*)
          
          ------------------------------------------------------------------------------
          Repair    |
          Record    |
          1978      |     mean of mpg        sd of mpg  skewness of mpg  kurtosis of mpg
          ----------+-------------------------------------------------------------------
                  1 |              21        4.2426407                0                1
                  2 |          19.125        3.7583241        .22412358        1.6421611
                  3 |       19.433333        4.1413252        .35552274         3.081041
                  4 |       21.666667        4.9348699       -.13262994        1.9664636
                  5 |       27.363636        8.7323849       -.01518308        1.5515859
                  . |                                                                  
          ------------------------------------------------------------------------------

          Code:
           bys rep78: moments2 mpg
          
          -------------------------------------------------------------------------------------------------------------------------------
          -> rep78 = 1
          
          --------------------------------------------------------------
                  n = 2 |       mean          SD    skewness    kurtosis
          --------------+-----------------------------------------------
          Mileage (mpg) |     21.000       4.243                        
          --------------------------------------------------------------
          
          -------------------------------------------------------------------------------------------------------------------------------
          -> rep78 = 2
          
          --------------------------------------------------------------
                  n = 8 |       mean          SD    skewness    kurtosis
          --------------+-----------------------------------------------
          Mileage (mpg) |     19.125       3.758       0.280      -1.451
          --------------------------------------------------------------
          
          -------------------------------------------------------------------------------------------------------------------------------
          -> rep78 = 3
          
          --------------------------------------------------------------
                 n = 30 |       mean          SD    skewness    kurtosis
          --------------+-----------------------------------------------
          Mileage (mpg) |     19.433       4.141       0.375       0.327
          --------------------------------------------------------------
          
          -------------------------------------------------------------------------------------------------------------------------------
          -> rep78 = 4
          
          --------------------------------------------------------------
                 n = 18 |       mean          SD    skewness    kurtosis
          --------------+-----------------------------------------------
          Mileage (mpg) |     21.667       4.935      -0.145      -0.966
          --------------------------------------------------------------
          
          -------------------------------------------------------------------------------------------------------------------------------
          -> rep78 = 5
          
          --------------------------------------------------------------
                 n = 11 |       mean          SD    skewness    kurtosis
          --------------+-----------------------------------------------
          Mileage (mpg) |     27.364       8.732      -0.018      -1.581
          --------------------------------------------------------------
          
          -------------------------------------------------------------------------------------------------------------------------------
          From Excel:
          rep78  mean         SD           skewness      kurtosis
          1      21           4.2426
          2      19.125       3.758324095   0.279531215  -1.451461687
          3      19.43333333  4.141325236   0.374514797   0.326528863
          4      21.66666667  4.934869925  -0.145004776  -0.965967701
          5      27.3636363   8.732384866  -0.017693498  -1.580690186
          Does this comparison imply that moments2 is better?
          If so, I must use moments2 and use the skewness and kurtosis as data. Any advice will be greatly appreciated.

          Thanks a lot.





          • #6
            If you want to take MS Excel as a reference standard, then so be it. Really, not my choice and not my recommendation.

            There are different formulas for skewness and kurtosis, hinging on

            1. correction factors intended to make estimates more nearly unbiased, which bite most in small samples

            2. whether 3 is subtracted in reporting kurtosis (which summarize doesn't do and rangestat doesn't do but moments2 and Excel apparently do).

            The help for rangestat documents its choices with a reference to

            https://www.stata-journal.com/sjpdf....iclenum=st0204

            which I am partial to as a survey of the main issues.

            As far as #1 is concerned, expecting good estimates of skewness and kurtosis from small samples is extreme statistical optimism, given their dependence on third and fourth powers of deviations. I chose the auto data as a convenient mutual sandbox, but the reservation remains.
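
            For anyone who wants the adjusted figures as variables anyway, here is a minimal sketch of converting rangestat's results. It assumes (this is not confirmed anywhere in the thread) that the defaults of moments2 and Excel are the usual small-sample adjusted measures, G1 = g1 * sqrt(n(n-1))/(n-2) and G2 = (n-1)/((n-2)(n-3)) * ((n+1)(b2-3) + 6), where g1 and b2 are the moment-based skewness and kurtosis that rangestat returns. The names skew_adj and kurt_adj are hypothetical, and mpg_count assumes the same varname_stat naming seen for the other statistics.

            ```stata
            sysuse auto, clear

            * (count) gives n per group; skewness and kurtosis are moment-based,
            * and kurtosis is reported without subtracting 3
            rangestat (count) mpg (skewness) mpg (kurtosis) mpg, interval(rep78 0 0)

            * Assumed adjusted skewness G1, defined only for n > 2
            gen double skew_adj = mpg_skewness * sqrt(mpg_count * (mpg_count - 1)) / (mpg_count - 2) if mpg_count > 2

            * Assumed adjusted excess kurtosis G2, defined only for n > 3
            gen double kurt_adj = (mpg_count - 1) / ((mpg_count - 2) * (mpg_count - 3)) * ((mpg_count + 1) * (mpg_kurtosis - 3) + 6) if mpg_count > 3

            tabdisp rep78, c(mpg_count skew_adj kurt_adj)
            ```

            Worked by hand for rep78 == 2 (n = 8, g1 = .22412358, b2 = 1.6421611), these formulas give about 0.2795 and -1.4515, in line with the moments2 and Excel figures in #5. That agreement is only a matter of definition and does not make the estimates any more reliable in small samples.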

