Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Joining datasets when a variable is in a different format

    Hi all,

    I currently have a data set that looks like this where each quarter is represented by a seperate variable and contains values of trade balances (simple values have been put in for example)
    naic year q1 q2 q3 q4
    321111 1997 5 9 4 2
    393333 1998 4 10 4 1
    I want to join this with a data set that contains the same naic numbers and the same years but quarter is its own variable (this set contains other variables of industry characteristics- M Y and K)
    naic year M Y K qtr
    321111 1997 46 45 55 1
    321111 1997 29 45 44 2
    321111 1997 34 54 75 3
    321111 1997 23 34 53 4
    321111 1998 54 44 43 1
    I want to join these data sets, preferably so the first takes the form of the second with qtr as one variable taking on the sequence of values 1 2 3 4 (essentially want the last 4 columns of the first table to be in one column and hence to expand the observations for each naic code by 4 whilst keeping the values for each quarter)



  • #2
    reshape the first dataset before you merge.

    Code:
    clear 
    input naic    year    q1    q2    q3    q4
    321111    1997    5    9    4    2
    393333    1998    4    10    4    1
    end 
    
    reshape long q, i(naic year) j(qtr) 
    
    save first , replace 
    
    clear 
    input naic    year    M    Y    K    qtr
    321111    1997    46    45    55    1
    321111    1997    29    45    44    2
    321111    1997    34    54    75    3
    321111    1997    23    34    53    4
    321111    1998    54    44    43    1
    
    end
    
    merge 1:1 naic year qtr using first 
    
    list , sepby(naic) 
    
         +-----------------------------------------------------------+
         |   naic   year    M    Y    K   qtr    q            _merge |
         |-----------------------------------------------------------|
      1. | 321111   1997   46   45   55     1    5       matched (3) |
      2. | 321111   1997   29   45   44     2    9       matched (3) |
      3. | 321111   1997   34   54   75     3    4       matched (3) |
      4. | 321111   1997   23   34   53     4    2       matched (3) |
      5. | 321111   1998   54   44   43     1    .   master only (1) |
         |-----------------------------------------------------------|
      6. | 393333   1998    .    .    .     1    4    using only (2) |
      7. | 393333   1998    .    .    .     2   10    using only (2) |
      8. | 393333   1998    .    .    .     3    4    using only (2) |
      9. | 393333   1998    .    .    .     4    1    using only (2) |
         +-----------------------------------------------------------+
    At some point soon you'll benefit from a Stata quarterly date:

    Code:
    gen qdate = yq(year, qtr) 
    format qdate %tq

    Comment


    • #3
      thanks Nick.

      Comment

      Working...
      X