Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Replace values within the same column

    Could I get help on how to replace values within the same column? Here is an example:

    country year countryname fx
    250 1997 France 5.83669
    250 1998 France 5.89952
    250 1999 France .
    250 2000 France .
    250 2001 France .
    250 2002 France .
    250 2003 France .
    1000 1997 Euro Area .
    1000 1998 Euro Area .
    1000 1999 Euro Area .938283
    1000 2000 Euro Area 1.08271
    1000 2001 Euro Area 1.11653
    1000 2002 Euro Area 1.05756
    1000 2003 Euro Area .884048

    Starting in 1999, the "fx" values for France are missing, and I intend to replace these missing values with the corresponding ones from the Euro Area for each respective year. Considering that I aim to fill in missing "fx" values for all EU countries, would splitting the data by country and merging them to fill in the missing values be the most efficient approach?

    Many thanks in advance!

  • #2
    That depends on what you mean by "efficient." If you mean "minimizes the number of CPU cycles that the program needs to execute" then no, probably not. If you mean relatively simple, easy to read, and easy to program, then maybe. At the risk of being pedantic, I would call the latter an "elegant" rather than "efficient" solution. I can picture what you are describing, but not the implementation. Alternatively, I might use the reshape command like so:

    Code:
    clear
    input int(country year) str80(countryname) double(fx)
    250 1997 "France" 5.83669
    250 1998 "France" 5.89952
    250 1999 "France" .
    250 2000 "France" .
    250 2001 "France" .
    250 2002 "France" .
    250 2003 "France" .
    1000 1997 "Euro Area" .
    1000 1998 "Euro Area" .
    1000 1999 "Euro Area" .938283
    1000 2000 "Euro Area" 1.08271
    1000 2001 "Euro Area" 1.11653
    1000 2002 "Euro Area" 1.05756
    1000 2003 "Euro Area" .884048
    end
    
    
    reshape wide countryname fx, i(year) j(country)
    replace fx250 = fx1000 if missing(fx250)
    reshape long countryname fx, i(year) j(country)
    sort country year
    list, clean noobs
    Code:
    . list, clean noobs
    
        year   country   country~e        fx  
        1997       250      France   5.83669  
        1998       250      France   5.89952  
        1999       250      France   .938283  
        2000       250      France   1.08271  
        2001       250      France   1.11653  
        2002       250      France   1.05756  
        2003       250      France   .884048  
        1997      1000   Euro Area         .  
        1998      1000   Euro Area         .  
        1999      1000   Euro Area   .938283  
        2000      1000   Euro Area   1.08271  
        2001      1000   Euro Area   1.11653  
        2002      1000   Euro Area   1.05756  
        2003      1000   Euro Area   .884048

    Comment


    • #3
      Jaehee:
      do you mean something along the following lines?
      Code:
      . bysort year: replace fx=fx[2] if fx[1]==. & country==250
      
      . list
      
           +--------------------------------------+
           | country   year   country~e        fx |
           |--------------------------------------|
        1. |    1000   1997   Euro_Area         . |
        2. |     250   1997      France   5.83669 |
        3. |     250   1998      France   5.89952 |
        4. |    1000   1998   Euro_Area         . |
        5. |    1000   1999   Euro_Area   .938283 |
           |--------------------------------------|
        6. |     250   1999      France         . |
        7. |     250   2000      France   1.08271 |
        8. |    1000   2000   Euro_Area   1.08271 |
        9. |    1000   2001   Euro_Area   1.11653 |
       10. |     250   2001      France         . |
           |--------------------------------------|
       11. |    1000   2002   Euro_Area   1.05756 |
       12. |     250   2002      France         . |
       13. |     250   2003      France   .884048 |
       14. |    1000   2003   Euro_Area   .884048 |
           +--------------------------------------+
      
      . bysort year: replace fx=fx[1] if fx[2]==. & country==250
      
      . list
      
           +--------------------------------------+
           | country   year   country~e        fx |
           |--------------------------------------|
        1. |    1000   1997   Euro_Area         . |
        2. |     250   1997      France   5.83669 |
        3. |     250   1998      France   5.89952 |
        4. |    1000   1998   Euro_Area         . |
        5. |    1000   1999   Euro_Area   .938283 |
           |--------------------------------------|
        6. |     250   1999      France   .938283 |
        7. |     250   2000      France   1.08271 |
        8. |    1000   2000   Euro_Area   1.08271 |
        9. |    1000   2001   Euro_Area   1.11653 |
       10. |     250   2001      France   1.11653 |
           |--------------------------------------|
       11. |    1000   2002   Euro_Area   1.05756 |
       12. |     250   2002      France   1.05756 |
       13. |     250   2003      France   .884048 |
       14. |    1000   2003   Euro_Area   .884048 |
           +--------------------------------------+
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        A straightforward way might be to create a variable to store the Euro value, like so:

        Code:
        egen euro_fx = max(cond(country == 1000, fx, .)), by(year)
        Then you can plug in its values as needed. For instance,

        Code:
        replace fx = euro_fx if missing(fx) & country != 1000
        drop euro_fx
        which gives:

        Code:
        . list if country == 250, noobs sep(0)
        
          +---------------------------------------+
          | country   year   countr~e          fx |
          |---------------------------------------|
          |     250   1997     France     5.83669 |
          |     250   1998     France     5.89952 |
          |     250   1999     France   .93828303 |
          |     250   2000     France     1.08271 |
          |     250   2001     France   1.1165299 |
          |     250   2002     France     1.05756 |
          |     250   2003     France   .88404799 |
          +---------------------------------------+
        Last edited by Hemanshu Kumar; 10 Jul 2023, 02:36.

        Comment


        • #5
          Many thanks to Daniel, Carlo and Hemanshu! I think I actually overcomplicated things. Thanks for suggesting much simpler approaches.

          Comment

          Working...
          X