  • How to generate delta variables to run convergence regressions

    Dear all,

    This is about creating dependent variables for a convergence study with panel data in long form.
    My panel dataset looks like the example below:

    Code:
    reg_cod  reg_code_n           year      var1      delta_var1        
    -------------------------------------------------------------------------
    AT         11                 1983      .              
    AT         11                 1984      .              
    AT         11                 1985      .              
    AT         11                 1986      .              
    AT         11                 1987      .              
    AT         11                 1988      .               
    AT         11                 1989      .               
    AT         11                 1990      .              
    AT         11                 1991      .2600     
    AT         11                 1992      .2611     
    AT         11                 1993      .             
    AT         11                 1994      .2789     
    AT         11                 1995      .2766     
    AT         11                 1996      .             
    AT         11                 1997      .2656     
    AT         11                 1998      .             
    AT         11                 1999      .             
    AT         11                 2000      .2550     
    AT         11                 2001      .            
    AT         11                 2002      .            
    AT         11                 2003      .            
    AT         11                 2004      .2682     
    AT         11                 2005      .            
    AT         11                 2006      .            
    AT         11                 2007      .2834    
    AT         11                 2008      .            
    AT         11                 2009      .            
    AT         11                 2010      .2792   
    AT         11                 2011      .           
    AT         11                 2012      .          
    AT         11                 2013      .2776  
    AT         11                 2014      .          
    AT         11                 2015      .          
    ---------------------------------------------------------------------------
    AT1       111                 1983      .           
    AT1       111                 1984      .3011   
    AT1       111                 1985      .3324  
    ...    
    ...
    ...        
    ---------------------------------------------------------------------------
    For some variables, I am trying to create the change over time, using the last and the first observed value for each region (var1 at time 1 - var1 at time 0). The column delta_var1 should record each calculated change in the row of the corresponding initial t (e.g., var1 in 2013 - var1 in 1991 in the cell for year 1983, etcetera). The goal is to run convergence regressions.

    My questions are:
    - How do I generate these delta variables?
    - How do I take into account in the analysis that the first and last values are not the same for all the observations I have?

    Thanks in advance for your help.

    Kind Regards,
    Francesco




  • #2
    First, if it is not essential to have them, I would delete observations that are all missing before an observed value or all missing after. That is, AT 2014 and 2015, and 1983 to 1990. This could be done with a bunch of drop if... statements.

    Second, you can then get the first value with bysort reg_id (year): g first=var1[1]
    and the last value with bysort reg_id (year): g last=var1[_N]
    You can spread them across all rows of a region with
    bysort reg_id: egen first1=mean(first)
    Now you can do your differences.
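
The trimming step above can also be done without a separate drop if statement per region. What follows is only a minimal sketch on hypothetical toy data: the region codes R1 and R2, the values, and the helper variables n_seen and n_total are assumptions for illustration, and reg_cod stands in for whatever the region identifier is called in the real data.

```stata
* Toy data: hypothetical values, not taken from the thread
clear
input str3 reg_cod year var1
"R1" 1990 .
"R1" 1991 .26
"R1" 1992 .
"R1" 1993 .28
"R1" 1994 .
"R2" 1990 .30
"R2" 1991 .
"R2" 1992 .33
"R2" 1993 .
"R2" 1994 .
end

* Running count of observed values within each region, in year order
bysort reg_cod (year): gen n_seen  = sum(!missing(var1))
bysort reg_cod (year): gen n_total = n_seen[_N]

* Leading rows: nothing observed yet. Trailing rows: everything already seen.
drop if n_seen == 0
drop if n_seen == n_total & missing(var1)

* Interior gaps are kept; the first and last rows are now observed values
bysort reg_cod (year): gen first = var1[1]
bysort reg_cod (year): gen last  = var1[_N]
gen delta_var1 = last - first
list reg_cod year var1 first last delta_var1, sepby(reg_cod)
```

Putting year in parentheses after bysort fixes the order within each region, so var1[1] and var1[_N] refer to the earliest and latest remaining years rather than to whatever order the sort happens to leave.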



    • #3
      Dear Phil,

      Thanks for your help, but I can't delete the missing observations before and after the observed values.
      Based on your commands, I managed to generate two variables holding the first and last values for each region (using the variable wave, which was not in the dataset example above), with the following:
      Code:
      gsort region_cod wave_n year
      bysort region_cod: gen first = var1[1]
      
      gsort region_cod -wave_n year
      bysort region_cod: gen last = var1[1]
      
      gen delta_var1 = last - first
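
One caveat with sorting on wave: Stata sorts missing values after all non-missing ones, so rows with a missing wave (such as 1991 and 1992 for AT in the example) cannot supply the first value. If the aim is instead the first and last non-missing var1 with every row kept, a sketch along the following lines may work. The toy data are hypothetical, and first_year, last_year and span are names introduced here only for illustration.

```stata
* Toy data: hypothetical values, not taken from the thread
clear
input str3 region_cod year var1
"R1" 1990 .
"R1" 1991 .26
"R1" 1992 .
"R1" 1993 .28
"R1" 1994 .
"R2" 1990 .30
"R2" 1991 .
"R2" 1992 .33
"R2" 1993 .
"R2" 1994 .
end

* Earliest and latest year in which var1 is observed, per region
bysort region_cod: egen first_year = min(cond(!missing(var1), year, .))
bysort region_cod: egen last_year  = max(cond(!missing(var1), year, .))

* Copy the values observed in those two years to every row of the region
bysort region_cod: egen first = max(cond(year == first_year, var1, .))
bysort region_cod: egen last  = max(cond(year == last_year,  var1, .))

gen delta_var1 = last - first
gen span = last_year - first_year   // number of years the change covers
```

No rows are dropped here, and first_year and last_year stay in the data, which helps later when regions have to be grouped by the period they cover.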

      Although I now have this delta variable, I am still wondering how to solve the second part of my problem:
      How do I take into account in the analysis that the first and last values are not the same for all the observations I have?
      Is there a way to do this?

      The idea is to run, based on data availability, separate regressions for subsamples, for example: countries within 1985-2013, countries within 1990-2013, countries within 1998-2013.
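
One possible way to set up such subsamples, sketched under several assumptions: the data have been reduced to one row per region; first_year and last_year (hypothetical names) hold the first and last years with data; and the change is divided by the number of years it covers so that different spans are comparable. The toy values and the regression specification are placeholders, not a recommended convergence model.

```stata
* Toy cross-section: one row per region, hypothetical values
clear
input str3 region_cod first_year last_year first last
"R1" 1984 2013 .30 .31
"R2" 1985 2013 .28 .29
"R3" 1985 2013 .34 .31
"R4" 1985 2013 .33 .32
"R5" 1990 2013 .25 .27
"R6" 1990 2013 .35 .33
"R7" 1991 2013 .26 .28
"R8" 1998 2013 .29 .29
end

* Average change per year, so that spans of different length are comparable
gen span = last_year - first_year
gen ann_change = (last - first) / span

* Subsample flags by data availability
gen byte s1985 = first_year <= 1985 & last_year >= 2013
gen byte s1990 = first_year <= 1990 & last_year >= 2013

* Placeholder convergence regressions: change on initial level
regress ann_change first if s1985
regress ann_change first if s1990
```

In this sketch each region still uses its own first year, so within a subsample the initial values can refer to slightly different years. A stricter version would measure every region at the common start year, which is only possible where var1 is observed in that year.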

      This is the panel dataset example (with the new variable wave):

      Code:
      reg_cod  reg_code_n    wave     year      var1      delta_var1        
      -------------------------------------------------------------------------
      AT         11          .       1983      .              
      AT         11          .       1984      .              
      AT         11          .       1985      .              
      AT         11          .       1986      .              
      AT         11          .       1987      .              
      AT         11          .       1988      .               
      AT         11          .       1989      .               
      AT         11          .       1990      .              
      AT         11          .       1991      .2600     
      AT         11          .       1992      .2611     
      AT         11          .       1993      .             
      AT         11          4       1994      .2789     
      AT         11          4       1995      .2766     
      AT         11          .       1996      .             
      AT         11          4       1997      .2656     
      AT         11          .       1998      .             
      AT         11          .       1999      .             
      AT         11          5       2000      .2550     
      AT         11          .       2001      .            
      AT         11          .       2002      .            
      AT         11          .       2003      .            
      AT         11          6       2004      .2682     
      AT         11          .       2005      .            
      AT         11          .       2006      .            
      AT         11          7       2007      .2834    
      AT         11          .       2008      .            
      AT         11          .       2009      .            
      AT         11          8       2010      .2792   
      AT         11          .       2011      .           
      AT         11          .       2012      .          
      AT         11          9       2013      .2776  
      AT         11          .       2014      .          
      AT         11          .       2015      .          
      ---------------------------------------------------------------------------
      AT1       111          .       1983      .           
      AT1       111          .       1984      .3011   
      AT1       111          .       1985      .3324  
      ...    
      ...
      ...        
      ---------------------------------------------------------------------------
      Kind regards,
      Francesco
      Last edited by Francesco Savoia; 12 Jul 2017, 10:46. Reason: dataset example modified
