  • Expand down an observation a specific number of times

    Hi,

    I have observations that I want to duplicate for some of the following years only.

    For instance, the observation for 2009 is .6491228, and I want to duplicate this value for 2010 and 2011 only. I tried carryforward, but that command fills forward until the next non-missing observation, which is not what I want. I also tried expand but found nothing satisfactory.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str52 CountryName float(year boycott)
    
    "Brazil" 2007 .
    "Brazil" 2008 .
    "Brazil" 2009 .6491228
    "Brazil" 2010 .
    "Brazil" 2011 .
    "Brazil" 2012 .
    "Brazil" 2013 .
    "Brazil" 2014 .6721311
    "Brazil" 2015 .
    "Brazil" 2016 .
    "Brazil" 2017 .
    "Brazil" 2018 .
    end
    If possible, I need code that lets me define the exact number of years over which to extend each observation.

    I would appreciate any help. Thank you.
    Last edited by Zsolt Marai; 08 Dec 2022, 07:16.

  • #2
    You can create a variable with such information.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str52 CountryName float(year boycott)
    
    "Brazil" 2007 .
    "Brazil" 2008 .
    "Brazil" 2009 .6491228
    "Brazil" 2010 .
    "Brazil" 2011 .
    "Brazil" 2012 .
    "Brazil" 2013 .
    "Brazil" 2014 .6721311
    "Brazil" 2015 .
    "Brazil" 2016 .
    "Brazil" 2017 .
    "Brazil" 2018 .
    end
    
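    * Keep only the years that have a value
    drop if missing(boycott)
    * For example: 3 copies in total for 2009, 2 for every other year
    g toexpand=cond(year==2009,3, 2)
    expand toexpand, g(new)
    * Sort originals (new==0) first so each copy takes the previous year plus one
    bys CountryName year (new): replace year= year[_n-1]+1 if new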
    Use

    Code:
    encode CountryN, g(country)
    xtset country year
    tsfill, full
    to restore the missing values if needed.

    Result:

    Code:
    . l, sepby(C)
    
         +---------------------------------------------+
         | Countr~e   year    boycott   toexpand   new |
         |---------------------------------------------|
      1. |   Brazil   2009   .6491228          3     0 |
      2. |   Brazil   2010   .6491228          3     1 |
      3. |   Brazil   2011   .6491228          3     1 |
      4. |   Brazil   2014   .6721311          2     0 |
      5. |   Brazil   2015   .6721311          2     1 |
         +---------------------------------------------+
    Last edited by Andrew Musau; 08 Dec 2022, 07:55.
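
As an alternative to dropping and expanding rows, the value can be carried forward in place for a fixed number of years. The following is only a minimal sketch under stated assumptions: the local `k` (number of years to fill) and the helper variable `lastyear` are illustrative names that do not appear in the thread, and the data are rebuilt with `set obs` rather than `input` to keep the example short.

```stata
* Rebuild the example panel from #1 (Brazil, 2007-2018)
clear
set obs 12
gen str52 CountryName = "Brazil"
gen int year = 2006 + _n
gen float boycott = .
replace boycott = .6491228 if year == 2009
replace boycott = .6721311 if year == 2014

* Hypothetical setting: fill at most k years after each observed value
local k = 2

* Year of the most recent observed value, carried down within country
bysort CountryName (year): gen int lastyear = year if !missing(boycott)
by CountryName: replace lastyear = lastyear[_n-1] if missing(lastyear)

* Fill forward only while the gap to the last observed year is at most k
by CountryName: replace boycott = boycott[_n-1] if missing(boycott) & year - lastyear <= `k'
```

With `k` set to 2, the 2009 value should reach 2010 and 2011 and stop there. If the number of years differs by observation, `k` could in principle be replaced by a variable holding that number.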

    • #3
      Thank you for the answer.

      However, I am wondering whether the cond() function would still be suitable when there are more than two kinds of observations.

      More precisely, I have the following periods from another sample, and each of them corresponds to one observation in the sample:

      1995-1998
      1999-2004
      2005-2009
      2010-2014
      2017-2022

      For each of these periods, I want to "expand" the years and assign the observation from the first year of the period (which I actually have, see below) to the following years of the same period. So the same value from 1995 to 1998, from 1999 to 2004, and so on.

      Here is a better subset of my observations (the observation for 1999 is missing, but that does not matter):

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str52 CountryName float(year gentrust)
      "Brazil" 1995 2.8193834
      "Brazil" 1996 .
      "Brazil" 1997 .
      "Brazil" 1998 .
      "Brazil" 1999 .
      "Brazil" 2000 .
      "Brazil" 2001 .
      "Brazil" 2002 .
      "Brazil" 2003 .
      "Brazil" 2004 .
      "Brazil" 2005 9.201624
      "Brazil" 2006 .
      "Brazil" 2007 .
      "Brazil" 2008 .
      "Brazil" 2009 .
      "Brazil" 2010 6.576272
      "Brazil" 2011 .
      "Brazil" 2012 .
      "Brazil" 2013 .
      "Brazil" 2014 .
      "Brazil" 2015 .
      "Brazil" 2016 .
      "Brazil" 2017 6.647399
      "Brazil" 2018 .
      "Brazil" 2019 .
      "Brazil" 2020 .
      end
      I think that using the cond() function would require creating a new variable for each period. Am I right?

      English is not my first language so I hope that my message is clear enough.

      Thanks.
      Last edited by Zsolt Marai; 08 Dec 2022, 14:59.

      • #4
        Code:
        gen byte era = 1 if inrange(year, 1995, 1998)
        replace era = 2 if inrange(year, 1999, 2004)
        replace era = 3 if inrange(year, 2005, 2009)
        replace era = 4 if inrange(year, 2010, 2014)
        replace era = 5 if inrange(year, 2017, 2022)
        
        by CountryName era (year), sort: replace gentrust = gentrust[1]
        sort CountryName year
        Note: It seems that your year ranges, which I have called era in this code, have no place for years 2015 and 2016. Is that correct?

        The above code also assumes that the non-missing value of gentrust falls in the first year of each era. (This is true in your example data.) The following code relaxes that assumption: the non-missing value may fall in any year of the era, and it is carried forward from that year to the end of the era.
        Code:
        gen byte era = 1 if inrange(year, 1995, 1998)
        replace era = 2 if inrange(year, 1999, 2004)
        replace era = 3 if inrange(year, 2005, 2009)
        replace era = 4 if inrange(year, 2010, 2014)
        replace era = 5 if inrange(year, 2017, 2022)
        
        by CountryName era (year), sort: replace gentrust = gentrust[_n-1] if _n > 1 & missing(gentrust)
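
When there are many periods, the repeated `replace` lines can be generated by a loop over the period boundaries. The following is a minimal sketch of the first variant above, not a replacement for it: the locals `starts` and `ends` are illustrative names, the boundaries are those listed in #3, and the example data are rebuilt with `set obs` to keep the block self-contained.

```stata
* Rebuild the example panel from #3 (Brazil, 1995-2020)
clear
set obs 26
gen str52 CountryName = "Brazil"
gen int year = 1994 + _n
gen float gentrust = .
replace gentrust = 2.8193834 if year == 1995
replace gentrust = 9.201624  if year == 2005
replace gentrust = 6.576272  if year == 2010
replace gentrust = 6.647399  if year == 2017

* Period boundaries from #3, held in two parallel lists (hypothetical names)
local starts 1995 1999 2005 2010 2017
local ends   1998 2004 2009 2014 2022

* Number the eras 1-5; years outside every range (2015, 2016) keep era missing
gen byte era = .
forvalues i = 1/5 {
    local lo : word `i' of `starts'
    local hi : word `i' of `ends'
    replace era = `i' if inrange(year, `lo', `hi')
}

* Spread the first-year value across each era, skipping unassigned years
by CountryName era (year), sort: replace gentrust = gentrust[1] if !missing(era)
sort CountryName year
```

Adding a period then means appending one number to each list and raising the upper limit of the loop.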

        • #5
          It appears that you want to create 5-year periods, in which case you can use the -floor()- or -ceil()- functions. For more on this, see https://journals.sagepub.com/doi/pdf...867X1801800311

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str52 CountryName float(year gentrust)
          "Brazil" 1995 2.8193834
          "Brazil" 1996         .
          "Brazil" 1997         .
          "Brazil" 1998         .
          "Brazil" 1999         .
          "Brazil" 2000         .
          "Brazil" 2001         .
          "Brazil" 2002         .
          "Brazil" 2003         .
          "Brazil" 2004         .
          "Brazil" 2005  9.201624
          "Brazil" 2006         .
          "Brazil" 2007         .
          "Brazil" 2008         .
          "Brazil" 2009         .
          "Brazil" 2010  6.576272
          "Brazil" 2011         .
          "Brazil" 2012         .
          "Brazil" 2013         .
          "Brazil" 2014         .
          "Brazil" 2015         .
          "Brazil" 2016         .
          "Brazil" 2017  6.647399
          "Brazil" 2018         .
          "Brazil" 2019         .
          "Brazil" 2020         .
          end
          
          g period= floor(year/5)*5
          bys CountryN period (gentrust): replace gentrust= gentrust[1]
          Result:

          Code:
          . sort C y
          
          . l, sepby(C p)
          
               +-------------------------------------+
               | Countr~e   year   gentrust   period |
               |-------------------------------------|
            1. |   Brazil   1995   2.819383     1995 |
            2. |   Brazil   1996   2.819383     1995 |
            3. |   Brazil   1997   2.819383     1995 |
            4. |   Brazil   1998   2.819383     1995 |
            5. |   Brazil   1999   2.819383     1995 |
               |-------------------------------------|
            6. |   Brazil   2000          .     2000 |
            7. |   Brazil   2001          .     2000 |
            8. |   Brazil   2002          .     2000 |
            9. |   Brazil   2003          .     2000 |
           10. |   Brazil   2004          .     2000 |
               |-------------------------------------|
           11. |   Brazil   2005   9.201624     2005 |
           12. |   Brazil   2006   9.201624     2005 |
           13. |   Brazil   2007   9.201624     2005 |
           14. |   Brazil   2008   9.201624     2005 |
           15. |   Brazil   2009   9.201624     2005 |
               |-------------------------------------|
           16. |   Brazil   2010   6.576272     2010 |
           17. |   Brazil   2011   6.576272     2010 |
           18. |   Brazil   2012   6.576272     2010 |
           19. |   Brazil   2013   6.576272     2010 |
           20. |   Brazil   2014   6.576272     2010 |
               |-------------------------------------|
           21. |   Brazil   2015   6.647399     2015 |
           22. |   Brazil   2016   6.647399     2015 |
           23. |   Brazil   2017   6.647399     2015 |
           24. |   Brazil   2018   6.647399     2015 |
           25. |   Brazil   2019   6.647399     2015 |
               |-------------------------------------|
           26. |   Brazil   2020          .     2020 |
               +-------------------------------------+
          
          .

          • #6
            "It appears that you want to create 5-year periods"
            It doesn't look that way to me. It seems in #3 that some of the periods are 4 years, some 5, and some 6.
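
The mismatch can be checked directly by comparing the fixed-width bins from #5 with the start year of each period listed in #3. This is only an illustrative sketch: the variable `start` and the locals `starts` and `ends` are names introduced here, and the years 1995-2022 are generated rather than taken from the real data.

```stata
* One row per year, 1995-2022
clear
set obs 28
gen int year = 1994 + _n

* Fixed 5-year bins, as computed in #5
gen int period = floor(year/5)*5

* Start year of the irregular period from #3 (missing for 2015 and 2016)
gen int start = .
local starts 1995 1999 2005 2010 2017
local ends   1998 2004 2009 2014 2022
forvalues i = 1/5 {
    local lo : word `i' of `starts'
    local hi : word `i' of `ends'
    replace start = `lo' if inrange(year, `lo', `hi')
}

* Years that the fixed bins assign to a different period than #3 does
list year period start if period != start & !missing(start), noobs
```

For example, 1999 falls in the bin starting at 1995 although it opens the 1999-2004 period, which is why the 1995 value spreads into 1999 in the listing in #5.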

            • #7
              Ah, then there is no easy way around it.

              • #8
                Thanks to both of you for your help.

                The code that Clyde has suggested seems to work well (I have an observation for the first year of each period).
