Merge Issue

Syed Raza

Join Date: Feb 2018
Posts: 50

18 Sep 2019, 00:24

Hello Everyone,

Hope you are doing great. I am running my code, but right in the middle of my code I need to merge one file which contains calculated weights from interview frequencies. But when I run my code Stata says that I am using the old syntax of a merge. Please have a look at the data and suggest what should I do to avoid this problem. I have read the manual for Merge command but still confused. I have attached complete data from weights file and example of data from main file.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(wt06 wt12 wt24 wt36 cum36 wt06x wt12x wt24x wt36x cum36x)
      2.620274       5.240549    .14448807     -4.951572    .14448807     1.0909091     2.1818182 .6857143 -.8103896 .6857143
  5.498567e-17  1.0997134e-16     .9724288     1.9448576     .9724288 -4.516161e-17 -9.032323e-17 .6857143 1.3714286 .6857143
-1.3746417e-17 -2.7492835e-17 -6.65337e-18 1.4186095e-17 -6.65337e-18             0             0        0         0        0
end

Now here is the type of data from the main file

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double hhidpn byte hacohort float(age married phealth rgovt sgovt sitearnsemp ritearnsemp working)
1100032701 1 50 1   0         0     0 16832.605 15360.938 100
1100032701 1 53 1 100     12000     0 13345.703         0   0
1100032702 1 47 1   0         0     0 18127.424  14268.16 100
1100032702 1 50 1   0         0 12000         0 13345.703 100
1100067401 3 58 0   0         0     0         0 10145.454 100
1100067401 3 62 0   0     16672     0         0         0   0
1100067401 3 60 0   0     15400     0         0         0   0
1100121501 3 71 0   0     14400     0         0         0   0
1100121501 3 75 0   0     16800     0         0         0   0
1100121501 3 73 0   0     15400     0         0         0   0
1100149201 3 62 1   0         0 28000  5890.279         0   0
1100149201 3 58 1   0         0     0  45090.91         0   0
1100149201 3 60 1   0         0     0  45090.91         0   0
1100149202 3 58 1   0         0     0         0  47345.45 100
1100149202 3 60 1   0         0     0         0  48096.28 100
1100149202 3 62 1   0     28000     0         0  5890.279   0
1100181601 1 60 1   0      6000     0 2330.6687  2530.856   0
1100181602 1 55 1   0         0  6000  2848.595  1295.541 100
1100181602 1 57 1   0         0     .         .         0 100
1100188101 3 84 0   0     11680     0         0         0   0
1100188101 3 82 1 100     10640 11410         0         0   0
1100188101 3 80 1   0      8760 11780         0         0   0
1100188102 3 87 1   0     11410 10640         0         0   0
1100188102 3 86 1   0     11780  8760         0         0   0
1100188102 3 89 1   0     19100     0         0         0   0
1100193701 3 50 0 100      4800     0         0         0   .
1100213201 1 72 0 100      6600     0         0         0   0
1100213201 1 69 0 100      9600     0         0 1147.6398   0
1100213601 3 62 1 100     19600 22400         0         0   0
1100213601 3 60 1   0     18000 20100         0  16061.21   0
1100213601 3 58 1   0         0 17520         0  33818.18   0
1100213603 3 57 1   0     17520     0  33818.18         0   0
1100213603 3 61 1   0     22400 19600         0         0   0
1100213603 3 59 1   0     20100 18000  16061.21         0   0
1100218002 3 85 1 100     14000 14000         0         0   0
1100218002 3 83 1   0     12600 12600         0         0   0
1100218003 3 71 1 100     14000 14000         0         0   0
1100218003 3 69 1   0     12600 12600         0         0   0
1100235501 1 70 1   0     14400  7080         0         0   0
1100235502 1 70 1   0      7080 14400         0         0   0
1100252501 3 67 1   0     14700 16730         0         0   0
1100252501 3 71 1   0     15400 16800         0         0   0
1100252501 3 69 1   0     15220 17960         0         0   0
1100252502 3 69 1   0     17960 15220         0         0   0
1100252502 3 67 1   0     16730 14700         0         0   0
1100252502 3 71 1   0     16800 15400         0         0   .
1100257301 3 74 1   0      5600 22980         0         0   0
1100257301 3 76 1   0      6240 23010         0         0   0
1100257302 3 70 1   0     23010  6240         0         0   0
1100257302 3 68 1   0     22980  5600         0         0   0
1100280001 3 56 0   0         0     0         0         0 100
1100280001 3 59 0   0     27120     0         0  37866.08   0
1100280001 3 57 0   0         0     0         0   22482.1   0
1100296502 3 88 0   0         0     0         0         0   0
1100296502 3 90 0   0      9000     0         0         0   0
1100296502 3 86 0   0      8400     0         0         0   0
1100319401 3 74 1   0     21140     .         .         0   0
1100319401 3 73 1   0     21580     0         0         0   0
1100319401 3 76 1   0     21440     .         .         0   0
1100319402 3 72 1   0         0 21580         0         0   0
1100321301 3 69 0   0     14000     0         0         0   0
1100321301 3 68 0   0     13200     0         0         0   0
1100321301 3 71 0   0     14560     0         0         0   0
1100368301 3 49 1   0         0     0 25983.635  22861.09 100
1100368302 3 50 1   0         0     0  22861.09 25983.635 100
1100423401 1 52 1   0     12072     .         .         0   0
1100423401 1 59 1   0     14840     0 15781.817         0   0
1100423401 1 55 1   0     12600     0 13345.703         0   0
1100423402 1 52 1   0         0 12600         0 13345.703 100
1100423402 1 56 1   0         0 14840         0 15781.817 100
1100437901 1 70 0   0      6576     0         0         0   0
end

Here is my code which illustrates how i am merging these two datasets

Code:

clear all

local output "D:\Share\Output"
local programs "D:\Share\Programs"

local samples "under60_hosp_insured" // age60to64_hosp_insured over65_hosp_insured"
local fes "cohortXwaveXcountry"
local outcomes "oop_spend working ritearnsemp sitearnsemp hgovt hittot rhlthlm phealth fphealth"
local isocountry "Austria Belgium Czech Denmark France Germany Italy Spain Sweden Switzerland "

cap log close
log using "D:\Share\output\implied_effects_log.log", replace

foreach samp in `samples' {
foreach fe in `fes' {
foreach cont in `isocountry' {

mat drop _all
local colnames ""
local col_list_1 ""
local col_list_2 ""
local col_list_IE ""

* Keep only those hospitalized 
use "D:\Share\Data\SHARE_long.dta" if `samp'==1 & `cont'==1, clear

*Get implied effects weighting matrix to be used at the end of this do-file
preserve
do "D:\Share\Programs\SHARE Implied Effects Weights.do"
restore
merge using "D:\Share\output\implied_effects_matrix.dta"
drop _merge

mkmat wt06  wt12  wt24  wt36  cum36  if _n<=3
mkmat wt06x wt12x wt24x wt36x cum36x if _n<=3


* Generate event time dummies
drop if evt_time<-$window
drop if evt_time>$window

forv i = 0/$window {
    if (`i' < 0) {
        local j = abs(`i')
        gen evt_f`j' = (evt_time == `i')
    }
    else {
        gen evt_l`i' = (evt_time == `i')
    }
}

egen cohortXwave = group(hacohort wave)
egen cohortXwaveXcountry = group(hacohort wave isocountry)

* Define number of variables for "implied effects" matrix
local J = 0
foreach outcome of varlist `outcomes' {
    local J = `J' + 1
}
matrix define results_IEs = J(33,`J',.)

local j = 1
foreach v of varlist `outcomes' { 

    local controls ""
    if "`fe'"=="hhidpn" {
        xi i.wave
        drop _Iwave_11
        local controls "_I*"
        if regexm("`v'","_c")==1 {
            drop _Iwave_2 _Iwave_3 
        }
        
        drop _Iwave_2 _Iwave_3 
    }

    di "FE = `fe', Sample = `samp', rregi = `cont' , Var = `v'"
    *areg `v' evt_time evt_l* `controls' [pweight=rwtresp], absorb(`fes')  cluster(hhidpn)
    areg `v' evt_time evt_l* `controls' [pweight=rwtresp], absorb(`fes')  cluster(hhidpn)
    *areg `v' evt_time evt_l* `controls' i.country_fe  [pweight=rwtresp], absorb(`fes')  cluster(hhidpn)
    *areg `v' evt_time evt_l* `controls' dum*  [pweight=rwtresp], absorb(`fes')  cluster(hhidpn)
    *reg `v' evt_time evt_l* `controls' i.country_fe i.cohortXwave [pweight=rwtresp], cluster(hhidpn)
    *reghdfe `v' evt_time evt_l* `controls' [pweight=rwtresp], absorb(country_fe cohortXwave) cluster(hhidpn)
        
    *log close
    
    *Saves N, number of individuals, and effective sample size to matrix
    local N = e(N)
    local C = e(N_clust)
    local R= e(r2)
    
    * Save first four rows as N, unique individuals, weighted individuals, and R-squared
    di "`N' \ `C' \ `R' " 
    mat input N=(`N' \ `C' \ `R' )
    mat rown N="N" "Indiv" "R2"
    
    * Save coefficients and add to column
    matrix eb = e(b)
    matrix eb = (N\ eb')
    

    * Save variance-covariance matrix
    matrix var= (e(V))
    local colnames: coln var
    matrix list var            // YU ADDED THIS
    
    local n=0
    * Drop SE matrix from prior run
    cap mat drop se

    * Clean up matrices for output
    foreach col in `colnames'  {
        local n=`n'+1
        mat c`n'=var[`n'..., `n']
        local rownames: rown c`n'

        foreach w of local rownames  {
            local rw_c`n' `rw_c`n'' `w'_`col'
        }
        
        matrix rown c`n'= `rw_c`n''
        matrix coln c`n'= `v'
        matrix se=(nullmat(se)\ c`n')
        cap mat drop c`n' 
        local rw_c`n' ""
    }
    
    if regexm("`v'","_c")==1 {
        mat se=(N\se)
        matrix results_ses_2=(nullmat(results_ses_2), se)
        matrix results_coefs_2 = (nullmat(results_coefs_2), eb)
        local col_list_2 `col_list_2' `v'
    }    
        
     {
        mat se=(N\se)
        matrix results_ses_1=(nullmat(results_ses_1), se)
        matrix results_coefs_1 = (nullmat(results_coefs_1), eb)
        local col_list_1 `col_list_1' `v'
    }
    
    * Calculating implied effects:
    * (lincom takes the last estimates stored)
        
    *Using Earnings weights
    lincom wt06[1,1]*evt_l0 + wt06[2,1]*evt_l1 + wt06[3,1]*evt_l2
        matrix results_IEs[1,`j'] = r(estimate)
        matrix results_IEs[2,`j'] = r(se)
    lincom wt12[1,1]*evt_l0 + wt12[2,1]*evt_l1 + wt12[3,1]*evt_l2
        matrix results_IEs[3,`j'] = r(estimate)
        matrix results_IEs[4,`j'] = r(se)
    lincom wt24[1,1]*evt_l0 + wt24[2,1]*evt_l1 + wt24[3,1]*evt_l2
        matrix results_IEs[5,`j'] = r(estimate)
        matrix results_IEs[6,`j'] = r(se)    
    lincom wt36[1,1]*evt_l0 + wt36[2,1]*evt_l1 + wt36[3,1]*evt_l2
        matrix results_IEs[7,`j'] = r(estimate)
        matrix results_IEs[8,`j'] = r(se)
    lincom cum36[1,1]*evt_l0 + cum36[2,1]*evt_l1 + cum36[3,1]*evt_l2
        matrix results_IEs[9,`j'] = r(estimate)
        matrix results_IEs[10,`j'] = r(se)    
    test wt12[1,1]*evt_l0 + wt12[2,1]*evt_l1 + wt12[3,1]*evt_l2 = wt36[1,1]*evt_l0 + wt36[2,1]*evt_l1 + wt36[3,1]*evt_l2
        matrix results_IEs[11,`j'] = r(p)
    
    *Using OOP weights
    lincom wt06x[1,1]*evt_l0 + wt06x[2,1]*evt_l1 + wt06x[3,1]*evt_l2
        matrix results_IEs[12,`j'] = r(estimate)
        matrix results_IEs[13,`j'] = r(se)
    lincom wt12x[1,1]*evt_l0 + wt12x[2,1]*evt_l1 + wt12x[3,1]*evt_l2
        matrix results_IEs[14,`j'] = r(estimate)
        matrix results_IEs[15,`j'] = r(se)
    lincom wt24x[1,1]*evt_l0 + wt24x[2,1]*evt_l1 + wt24x[3,1]*evt_l2
        matrix results_IEs[16,`j'] = r(estimate)
        matrix results_IEs[17,`j'] = r(se)    
    lincom wt36x[1,1]*evt_l0 + wt36x[2,1]*evt_l1 + wt36x[3,1]*evt_l2
        matrix results_IEs[18,`j'] = r(estimate)
        matrix results_IEs[19,`j'] = r(se)
    lincom cum36x[1,1]*evt_l0 + cum36x[2,1]*evt_l1 + cum36x[3,1]*evt_l2
        matrix results_IEs[20,`j'] = r(estimate)
        matrix results_IEs[21,`j'] = r(se)
    test wt12x[1,1]*evt_l0 + wt12x[2,1]*evt_l1 + wt12x[3,1]*evt_l2 = wt36x[1,1]*evt_l0 + wt36x[2,1]*evt_l1 + wt36x[3,1]*evt_l2
        matrix results_IEs[22,`j'] = r(p)    
    
    *Using LFP weights
    lincom wt06x[1,1]*evt_l0 + wt06x[2,1]*evt_l1 + wt06x[3,1]*evt_l2
        matrix results_IEs[23,`j'] = r(estimate)
        matrix results_IEs[24,`j'] = r(se)
    lincom wt12x[1,1]*evt_l0 + wt12x[2,1]*evt_l1 + wt12x[3,1]*evt_l2
        matrix results_IEs[25,`j'] = r(estimate)
        matrix results_IEs[26,`j'] = r(se)
    lincom wt24x[1,1]*evt_l0 + wt24x[2,1]*evt_l1 + wt24x[3,1]*evt_l2
        matrix results_IEs[27,`j'] = r(estimate)
        matrix results_IEs[28,`j'] = r(se)    
    lincom wt36x[1,1]*evt_l0 + wt36x[2,1]*evt_l1 + wt36x[3,1]*evt_l2
        matrix results_IEs[29,`j'] = r(estimate)
        matrix results_IEs[30,`j'] = r(se)
    lincom cum36x[1,1]*evt_l0 + cum36x[2,1]*evt_l1 + cum36x[3,1]*evt_l2
        matrix results_IEs[31,`j'] = r(estimate)
        matrix results_IEs[32,`j'] = r(se)
    test wt12x[1,1]*evt_l0 + wt12x[2,1]*evt_l1 + wt12x[3,1]*evt_l2 = wt36x[1,1]*evt_l0 + wt36x[2,1]*evt_l1 + wt36x[3,1]*evt_l2
        matrix results_IEs[33,`j'] = r(p)    
    
    local col_list_IE `col_list_IE' `v'    
    local j = `j' + 1    
        
} // outcomes

* Labeling rows of implied effects table
* NOTE: 36a indicate the annual effect at 36 months
    * the 36m are the average annual effects that are presented in the paper
local r="b_6mEarn se_6mEarn b_12mEarn se_12mEarn b_24mEarn se_24mEarn b_36mEarn se_36mEarn b_cum36Earn se_cum36Earn p_Earn b_6mOOP se_6mOOP b_12mOOP se_12mOOP b_24mOOP se_24mOOP b_36mOOP se_36mOOP b_cum36OOP se_cum36OOP p_OOP b_6mLFP se_6mLFP b_12mLFP se_12mLFP b_24mLFP se_24mLFP b_36mLFP se_36mLFP b_cum36LFP se_cum36LFP p_LFP"
mat rown results_IEs=`r'
        
* Outputting and saving results
local types = "coefs ses"
foreach type of local types {
      drop _all         
     mat coln results_`type'_1=`col_list_1'
     svmat2 results_`type'_1, names(col) rnames(var) 

     order var
    
     outsheet using "`output'\SHARE_ES_`type'_`samp'_`cont'_$window.txt", replace
 } // end foreach type of local types

drop _all
mat coln results_IEs=`col_list_IE'
svmat2 results_IEs, names(col) rnames(var) full
order var
mat list results_IEs
outsheet using "`output'\SHARE_IEs_`samp'_`cont'_$window.txt", replace  
  
} // samples
} // fes
} //country

log close

Tags: None

Marie Hanemann

Join Date: Sep 2019

Posts: 4
#2

18 Sep 2019, 01:12

merge using "D:\Share\output\implied_effects_matrix.dta"
drop _merge

Hello!
Have you tried typing "help merge"?
When merging, you need to clarify what Variable to merge and if you want to merge 1:1 or many:1.
Your code for merging should look something like this:

Code:

merge 1:1 Var1 Var2 using matrix.dta

Maybe this link can help you clarify this a little more:
https://www.princeton.edu/~otorres/Merge101.pdf
Comment
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#3

18 Sep 2019, 01:16

You need to specify the type of merge and the variables that you seek to merge on.
Example:

Code:

merge 1:1 group_var id_var using "D:\Share\output\implied_effects_matrix.dta" drop _merge

See explanation of 1:1 m:1 or 1:m in the help for merge.
Comment

Announcement

Merge Issue

Comment

Comment