  • Descriptive statistics from previous year

    Hello everyone!

    My name is Fábio and I am completely new to Stata, so here is a beginner's question.
    Imagine I have four columns: id, year, cash and neighbourhood (a dummy). Each id is a different company and has seven rows, one per year (so row 1 = id 1; row 2 = id 1; ... ; row 8 = id 2 and so on). Is it possible to calculate the average cash across all companies, but using cash from the year before the dummy equals 1?
    In other words, if one company has dummy = 1 in 2010 and 2012 and another company has dummy = 1 in 2009, is it possible to calculate the average of cash in 2009 and 2011 (first company) and cash in 2008 (second company)?

    I hope you can understand what I mean. Thank you a lot!

    EDIT: I mean "Average" in the title instead of "Descriptive statistics".
    Last edited by Fabio Brandao; 05 May 2018, 16:39.

  • #2
    Fabio:
    are you looking for something along the following lines?
    Code:
    . set obs 3
    number of observations (_N) was 0, now 3
    
    . g id=_n
    
    . expand 7
    (18 observations created)
    
    . sort id
    
    . bysort id: g year=_n
    
    . g cash_flow=runiform()
    
    . bysort id: g counter=1 if _n>=2
    (3 missing values generated)
    
    . xtset id year
           panel variable:  id (strongly balanced)
            time variable:  year, 1 to 7
                    delta:  1 unit
    
    . xtsum cash_flow if counter==1
    
    Variable         |      Mean   Std. Dev.       Min        Max |    Observations
    -----------------+--------------------------------------------+----------------
    cash_f~w overall |  .5844709   .3270967   .0454151   .9805113 |     N =      18
             between |             .0879652   .4830158   .6394438 |     n =       3
             within  |             .3186381   .0791027    1.07634 |     T =       6
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo,
      Thank you for your reply! Well, more or less. I want what you did there, but with previous years. For example, for all companies with the dummy variable == 1 in 2010, I want the average of their 2009 cash flows.
      Kind regards.
      Last edited by Fabio Brandao; 06 May 2018, 14:42.



      • #4
        Fabio:
        Code:
        set obs 3
        g id=_n
        expand 7
        sort id
        bysort id: g year=_n
        g cash_flow=runiform()
        bysort id: g counter=1 if _n>=2
        xtset id year
        bysort year: xtsum cash_flow if counter[_n-1]!=.
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 1
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |         .          .          .          . |     N =       0
                 between |                    .          .          . |     n =       0
                 within  |                    .          .          . |     T =       .
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 2
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .3291338   .0880321   .2668857   .3913819 |     N =       2
                 between |             .0880321   .2668857   .3913819 |     n =       2
                 within  |                    0   .3291338   .3291338 |     T =       1
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 3
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .3774329   .4318475   .1196613    .875991 |     N =       3
                 between |             .4318475   .1196613    .875991 |     n =       3
                 within  |                    0   .3774329   .3774329 |     T =       1
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 4
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .3291699   .3785143   .0285569   .7542434 |     N =       3
                 between |             .3785143   .0285569   .7542434 |     n =       3
                 within  |                    0   .3291699   .3291699 |     T =       1
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 5
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .8189051   .1079441   .6950234   .8927587 |     N =       3
                 between |             .1079441   .6950234   .8927587 |     n =       3
                 within  |                    0   .8189051   .8189051 |     T =       1
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 6
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .5406453    .172116   .3508549   .6866152 |     N =       3
                 between |              .172116   .3508549   .6866152 |     n =       3
                 within  |                    0   .5406453   .5406453 |     T =       1
        
        -----------------------------------------------------------------------------------------------------------------------
        -> year = 7
        
        Variable         |      Mean   Std. Dev.       Min        Max |    Observations
        -----------------+--------------------------------------------+----------------
        cash_f~w overall |  .4576063   .4370836   .0711051   .9319346 |     N =       3
                 between |             .4370836   .0711051   .9319346 |     n =       3
                 within  |                    0   .4576063   .4576063 |     T =       1
        
        .
        Kind regards,
        Carlo
        (Stata 19.0)
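
        A possible alternative for the "previous year" part, as a minimal sketch rather than a definitive solution: once the data are xtset, the time-series operators L. and F. look up the previous or next year of the same company, whereas an explicit subscript such as counter[_n-1] points to the previous row in the current sort order. The variable names cash and dummy and the toy values below are assumptions, built to mimic the example in the first post.

        ```stata
        * Toy panel: 3 companies, years 2006-2012 (made-up values)
        clear
        set obs 3
        generate id = _n
        expand 7
        bysort id: generate year = 2005 + _n
        generate cash = 100*id + (year - 2005)

        * Company 1 has dummy == 1 in 2010 and 2012, company 2 in 2009
        generate byte dummy = (id == 1 & inlist(year, 2010, 2012)) | (id == 2 & year == 2009)

        xtset id year

        * L.cash is the same company's cash one year earlier
        generate prev_cash = L.cash

        * Pooled mean of previous-year cash over all company-years with dummy == 1
        * (here: company 1 in 2009 and 2011, company 2 in 2008)
        summarize prev_cash if dummy == 1

        * The same mean, split by the year in which dummy == 1
        tabstat prev_cash if dummy == 1, by(year) statistics(mean n)
        ```

        If a year is missing for a company, L.cash is missing in the following year, so gaps in the panel are not silently bridged.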



        • #5
          Dear Carlo,

          Along the same lines, is it possible to do the same when dummy = 1, but calculating the average over the 3 preceding years? Should I just change the following line of the code:
          Code:
           
           bysort year: xtsum cash_flow if counter[_n-3]!=.

          Another difficulty arises when the ID is held not in long but in wide layout, i.e., instead of having one column ID, I have many columns, ID_1, ID_2, ..., ID_k: could I still use xtset?
          Or should I first transform it?
          Many thanks



          • #6
            Marie: That's several questions at once. Some of them deserve a new thread.

            The average for the 3 previous years can be got easily with various programs. See https://www.statalist.org/forums/for...s-observations for examples. Check the help files for how to get means, not medians.

            Once you've calculated means, you can ignore observations you don't care about.
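
            As a hedged illustration of one such program: rangestat is community-contributed and has to be installed from SSC first. The sketch below assumes variables named id, year, cash_flow and dummy; the data are simulated. Subscripting with [_n-3] would pick up a single observation three rows earlier, not a mean over three years.

            ```stata
            * ssc install rangestat    // run once if rangestat is not installed
            clear
            set seed 12345
            set obs 3
            generate id = _n
            expand 7
            bysort id: generate year = 2005 + _n
            generate cash_flow = runiform()
            generate byte dummy = runiform() < 0.5

            * For each observation, mean and count of the same company's cash_flow
            * over years t-3 to t-1 (the current year is excluded)
            rangestat (mean) cash_flow (count) cash_flow, interval(year -3 -1) by(id)

            * By default the results are named cash_flow_mean and cash_flow_count
            * Keep only windows with all three earlier years, then average where dummy == 1
            summarize cash_flow_mean if dummy == 1 & cash_flow_count == 3
            ```

            Dropping the condition on cash_flow_count would also admit means based on only one or two earlier years.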

            If your panels are held in separate variables in a wide layout, you must reshape long before you can xtset (or easily use any of the programs in this territory).
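
            A minimal sketch of that reshape, assuming the wide layout has one row per year and one cash column per company, named cash_1, cash_2 and cash_3 (hypothetical names and values; the actual layout in #5 is not shown).

            ```stata
            clear
            input year cash_1 cash_2 cash_3
            2008 10 20 30
            2009 11 21 31
            2010 12 22 32
            end

            * The stub is cash_; the suffix after it (1, 2, 3) becomes the new variable id
            reshape long cash_, i(year) j(id)
            rename cash_ cash

            * One row per company-year, so the panel structure can now be declared
            xtset id year
            ```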



            • #7
              Thank you. I decided to use rangestat.
