Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create same variable across groups in panel data

    Hello Statalists,

    Hope you can advise on this.

    Lets say I have data N=20 and T=100, and I want to import/paste one variable (let's say from excel) to be common across the 20 cross sections.
    I can past it for the first group (id=1) which creates missing values for the other 19, but what is the efficient way to do this for the rest of the groups? Ideally pasting in 20 times is time consuming.

    Thank you.

  • #2
    Stata has extensive capabilities to import and merge data from different sources. The exact code will depend on what your data looks like exactly, but in any case pasting won't be the best option, especially if you want to be able to reproduce your results.

    My best guess is that you will need some combination of import excel and merge.

    Here's an example which may be similar to what you are looking for.
    Code:
    clear
    input str3 country long year byte value1
    "FRA" 2000 1
    "FRA" 2001 2
    "US"  2000 3
    "US"  2001 4
    end
    
    tempfile original
    save `original'
    
    clear
    input str3 country byte value2
    "FRA" 10
    "US"  20
    end
    
    merge 1:m country using `original'
    order country year value1 value2
    sort country year
    list
    Code:
    . list
    
         +------------------------------------------------+
         | country   year   value1   value2        _merge |
         |------------------------------------------------|
      1. |     FRA   2000        1       10   matched (3) |
      2. |     FRA   2001        2       10   matched (3) |
      3. |      US   2000        3       20   matched (3) |
      4. |      US   2001        4       20   matched (3) |
         +------------------------------------------------+
    If you still run into problems, please post back with more details about your data (make sure to use dataex) and what you are trying to do exactly.

    Comment


    • #3
      Thank you Wouter for your reply.

      My question comes in case you already did merging and you are working on a dataset. Let's take it from your code for simplicity, assume we have this data (country year value1 value2) that you created in stata and saved at .dta file, assume I want to create another variable in this .dta file called value3, and has the values of 0.1 for year 2000 and 0.9 for year 2001 repeated for FRA and US. How shall we code this? One can do it with data editor in stata, to simply create value3 and input 0.1 and 0.9 for 2000 and 2001, however, what if at that point, we have 20 country that we want to repeat value3 variable for them?

      Comment


      • #4
        So if I understand you correctly you want to add data that has values for each year, which are the same for all panels.

        If you want to do it manually (assuming your time variable is called year):
        Code:
        gen value3 = .
        replace value3 = 0.1 if year == 2000
        replace value3 = 0.9 if year == 2001
        etc.
        If you have a separate dataset with variables year and value3 you can also use merge. In this case you would merge on year instead of on country.

        Comment


        • #5
          Thank you Wouter, you understood my point.
          Basically, if I have long N X T the way to do it is to create a .dta file with value3 variable for all years say 20 years using the merge function.
          Can you please provide the syntax for merging on year repeated across countries? assuming we have value3.dta with values 0.1 for 2000 and 0.9 for 2001, and I want to merge it with data.dta file with value1 and value2 variables.

          Comment


          • #6
            Code:
            clear
            input str3 country long year byte (value1 value2)
            "FRA" 2000 1 10
            "FRA" 2001 2 10
            "US"  2000 3 20
            "US"  2001 4 20
            end
            
            tempfile original
            save `original'
            
            clear
            input year value3
            2000 0.1
            2001 0.9
            end
            
            merge 1:m year using `original'
            order country year value1 value2 value3
            sort country year
            list
            
                 +---------------------------------------------------------+
                 | country   year   value1   value2   value3        _merge |
                 |---------------------------------------------------------|
              1. |     FRA   2000        1       10       .1   matched (3) |
              2. |     FRA   2001        2       10       .9   matched (3) |
              3. |      US   2000        3       20       .1   matched (3) |
              4. |      US   2001        4       20       .9   matched (3) |
                 +---------------------------------------------------------+

            Comment


            • #7
              It works well, thank you Wouter.

              Comment

              Working...
              X