Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to calculate the average for "two consecutive observations" by group in panel data with conditions relating to missing observations?

    I am looking for a way to calculate the average for each "two consecutive observations" the by group in panel data. Let's say the group or firms is id, the time the is year, the original variable is asset, the wanted variable is asset_ave

    asset_ave at year (t)= (asset at year( t) + asset at year (t-1))/2


    With three conditions here:
    1. The starting observation of asset_av should be missing due to there is no previous-year data for calculating

    2. If there is a missing observation in two consecutive observations of a firm => set the value of asset_ave=".". For example, due to the missing observation of asset in year 2002, the value of asset_av in years 2002 and 2003 are missing.
    firm asset asset_av year
    A 100 . 2000
    A 200 150 2001
    A . . 2002
    A 300 . 2003
    A 50 175 2004

    3. If the two consecutive observations for a firm are not for two consecutive years, the asset_av has also been set as missing variables, for example, from the table below, due to years 2001 and 2005 are not two consecutive years, the asset_av receive the missing results for the year 2005
    firm asset asset_av year
    B 100 . 2000
    B 200 150 2001
    B 100 . 2005
    B 500 300 2006
    B 400 450 2007
    Could you please provide me the code for calculating the asset_ave and satisfy these requirements?

    A sample example is
    firm asset asset_av year
    2554Q9 100 . 2000
    2554Q9 300 200 2001
    2554Q9 200 . 2003
    2554Q9 100 150 2004
    2564UU 200 . 2000
    2564UU 100 150 2001
    2564UU 800 450 2002
    2564UU 1200 1000 2003
    2564UU 300 750 2004
    2563CS 200 . 2000
    2563CS 400 300 2001
    2563CS . . 2002
    2563CS 900 . 2003
    2563CS 100 500 2004
    2557AZ 200 . 2000
    2557AZ 600 . 2002
    2557AZ 100 . 2004
    Thanks in advance! Please let me know if there is anything I did not explain clearly.

  • #2
    Dear all!

    I sucessfully generated the code satisfying the condition above (ay least from the result I got at hand similar to the results above. Could you please kindly check whether I did not miss anything and suggest me if there is any way to shorten the code?

    Code:
    encode FIRM, generate(FIRM2)
    sort FIRM2 year
    by FIRM2: gen lag1ASSET = ASSET[_n-1] if YEAR==YEAR[_n-1]+1
    
    gen ASSET_AVE=(ASSET+lag1ASSET)/2
    The result I got is
    Click image for larger version

Name:	Capture.PNG
Views:	1
Size:	18.3 KB
ID:	1681875

    Comment


    • #3
      Please use -dataex- to create the data example so it is easy for others to copy it into their Stata do-file, and help you with this.

      Comment


      • #4
        Thank you Hemanshu Kumar for notice, I post my data example using -dataex- here:

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str6 firm int(asset year)
        "2554Q9"  100 2000
        "2554Q9"  300 2001
        "2554Q9"  200 2003
        "2554Q9"  100 2004
        "2564UU"  200 2000
        "2564UU"  100 2001
        "2564UU"  800 2002
        "2564UU" 1200 2003
        "2564UU"  300 2004
        "2563CS"  200 2000
        "2563CS"  400 2001
        "2563CS"    . 2002
        "2563CS"  900 2003
        "2563CS"  100 2004
        "2557AZ"  200 2000
        "2557AZ"  600 2002
        "2557AZ"  100 2004
        end
        Best regards and Thanks in advance.

        Comment


        • #5
          Your code seems to be fine, but it can be condensed into one line:

          Code:
          bysort firm (year): gen asset_ave = (asset + asset[_n-1])/2 if year == year[_n-1] + 1

          Comment


          • #6
            Originally posted by Hemanshu Kumar View Post
            Your code seems to be fine, but it can be condensed into one line:

            Code:
            bysort firm (year): gen asset_ave = (asset + asset[_n-1])/2 if year == year[_n-1] + 1
            It fits great in my case, thanks a heap Hemanshu Kumar , it saves me from creating too many variables that mess the dataset

            Comment

            Working...
            X