Hello Stata users,
I have a complicated issue with my code and output data, and I may not be explaining it very clearly.
I use several nested loops because I have several versions of the input data and have to produce multiple combinations of output data.
For confidentiality reasons I cannot share the full data or all of the commands, but I am sharing the most important parts.
The main issue is that, after running the commands, values in the output datasets are sometimes identical when they should not be, or very different when they should be identical. My guess is that the nested loops are causing this.
I am sorry if the code is too complicated to follow. I am now fixing this code (originally written by a collaborator who is no longer available), and I am lost myself. If you spot anything that looks suspicious or problematic, any clue would help me a lot.
The input files are:
i) F_data
ii) F_data_agg
iii) F_data_incl_01
iv) F_data_incl_01_agg
The code below produces the following output files:
i) F_data_agg_all
ii) F_data_agg_all_incl_01
iii) F_data_agg_all_restricted
iv) F_data_agg_all_restricted_incl_01
v) F_data_agg_E
vi) F_data_agg_E_G
vii) F_data_agg_E_G_incl_01
viii) F_data_agg_E_G_restricted
ix) F_data_agg_E_G_restricted_incl_01
x) F_data_agg_E_incl_01
xi) F_data_agg_E_restricted
xii) F_data_agg_E_restricted_incl_01
And the main issues are:
- Mismatch in Aggregated Data (AGG) for F: F's values are different across datasets that should include the same countries (e.g., AGG_E vs AGG_All), even though F is present in both.
- Mismatch in Sector-Level Data (BYA7) for F: sector-level values for F also differ across datasets (BYA7_E vs BYA7_All), and even between BYA7_E and BYA7_E_G for specific years and sectors.
- INCL01 Files Are Not Working: datasets marked as INCL01 show no difference compared to those that exclude them, which is unexpected.
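A note that may be relevant to the third symptom: in Stata, a local macro set inside a loop body keeps its value in later iterations until it is explicitly reset, so a label assigned in only one branch of an `if` can carry over into passes where it should be empty. The toy loop below is only a sketch of that behavior; `suffix` and `n_set` are made-up names for illustration and are not taken from the code in this post.

```stata
* Toy example: suffix is assigned only when flag == 1 and is never reset
local suffix
local n_set 0
foreach pass in 1 2 {
    foreach flag in 0 1 {
        if `flag' == 1 local suffix _incl_01
        * no else branch clears suffix, so it keeps its last value
        display "pass `pass', flag `flag': suffix = [`suffix']"
        if "`suffix'" != "" local ++n_set
    }
}
display "suffix was non-empty in `n_set' of 4 iterations"
```

Only the very first iteration sees an empty suffix; from the second pass on, the flag == 0 iterations still carry `_incl_01`. If something similar happens with a file-name suffix, the plain and INCL01 runs could end up reading the same input file.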
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str3 country str52 years_available double(year L_surv_TM N_surv_TM FL_surv_TM L_micro_ent L_incl01_all)
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2004 22708 10649 71196 29498 1932196
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2004 18035 8317 63398 29498 1932196
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2007 23331 10938 45133 30090 2027543
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2007 18853 8715 43077 30090 2027543
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2009 22262 10464 41325 27563 2008853
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2009 18429 8476 37432 27563 2008853
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2010 21075 10064 39043 26168 2015871
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2010 17286 8137 36031 26168 2015871
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2012 21097 9990 38736 26560 2099436
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2012 17245 8041 36594 26560 2099436
"A" "2004, 2007, 2009, 2010, 2012, 2015" 2015 19403 9287 41796 23930 2147934
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2004 86321 37751 160920 108242 8715591
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2004 70442 30365 144751 108242 8715591
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2007 91696 42599 162922 107482 9294105
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2007 76680 35325 151956 107482 9294105
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2009 77386 36249 143959 85937 9001447
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2009 64224 29618 135831 85937 9001447
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2010 78611 36655 149241 87384 9003113
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2010 64864 29938 143103 87384 9003113
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2012 82908 39480 154221 93419 9297148
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2012 68299 32008 148197 93419 9297148
"C" "2004, 2007, 2009, 2010, 2012, 2015" 2015 78671 36948 155331 84040 9680785
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2009 101465 31940 153074 194675 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2009 81546 25125 143940 194675 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2010 111299 35443 166755 208871 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2010 91108 28578 164804 208871 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2013 91431 28861 144195 160895 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2013 75333 23569 141765 160895 0
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2016 109862 34344 170014 185413 21736817
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2016 91086 28148 166430 185413 21736817
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2017 108391 33585 167936 180646 22249238
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2017 90474 27627 171700 180646 22249238
"D" "2009, 2010, 2013, 2016, 2017, 2019" 2019 108606 33264 171597 174953 22856094
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2004 19232 9381 31241 18419 1530635
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2004 15078 7211 23134 18419 1530635
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2007 22013 10591 32775 22924 1684170
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2007 17728 8646 30426 22924 1684170
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2009 16190 8659 25980 14386 1414464
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2009 13258 7093 24728 14386 1414464
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2010 17506 9212 28001 16233 1408086
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2010 14227 7427 26638 16233 1408086
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2012 16005 8536 25676 14767 1424842
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2012 13130 7105 24265 14767 1424842
"N" "2004, 2007, 2009, 2010, 2012, 2015" 2015 18294 9305 29963 17648 1537269
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 243433 143292 384249 188943 10484806
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 187718 109003 286269 188943 10484806
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 211583 129336 293260 188821 11606121
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 170431 102388 247903 188821 11606121
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 183046 112335 260754 153222 10322587
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 145978 89110 224918 153222 10322587
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 169655 106144 262393 141682 10166067
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 136064 84812 231917 141682 10166067
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 187847 114920 290850 162287 9593800
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 155703 94880 267868 162287 9593800
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2016 233260 139551 351678 200450 10426943
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2016 199114 117355 311578 200450 10426943
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2017 200320 123088 297658 166122 10833494
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2017 166429 101416 274242 166122 10833494
"E" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2019 198243 126692 292526 147985 11489587
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 8923 5193 12364 6856 1028952
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 7472 4274 11647 6856 1028952
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 11956 6981 16427 9340 1111747
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 9953 5717 15131 9340 1111747
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 10395 6246 16119 8061 1073390
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 8430 5035 14925 8061 1073390
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 11219 6626 18648 8372 1059324
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 9330 5440 17657 8372 1059324
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 10401 5954 16357 8459 1074190
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 8618 4829 16527 8459 1074190
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2016 8047 4226 14316 6937 1031664
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2016 6347 3209 13224 6937 1031664
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2017 8158 4355 15487 7001 1047200
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2017 6336 3218 15151 7001 1047200
"F" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2019 7677 3982 13119 7512 1102451
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2004 202012 94859 258461 192454 13114840
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2004 146715 62096 200416 192454 13114840
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2007 164437 53851 190119 206134 13454244
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2007 138865 45544 174856 206134 13454244
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2009 268699 157870 315314 201103 12839429
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2009 222278 129035 286674 201103 12839429
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2010 173618 84696 205281 163209 13177550
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2010 153542 73255 206513 163209 13177550
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2013 174734 86831 232596 143463 13139786
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2013 144641 70544 220300 143463 13139786
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2016 186651 99124 272274 151073 13690344
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2016 155838 81484 286550 151073 13690344
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2017 182051 101110 294877 127955 13732391
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2017 156111 85198 298824 127955 13732391
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2018 205354 109370 311174 150327 14048514
"R" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2018, 2019" 2019 216933 115998 323086 153341 14082833
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 0 0 0 294589 16087778
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2004 143846 74797 254545 294589 16087778
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 228362 130852 322061 287181 16345134
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2007 162408 93637 291745 287181 16345134
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 153028 84517 261951 183707 16026887
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2009 0 0 0 183707 16026887
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 0 0 0 157341 15583405
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2010 0 0 0 157341 15583405
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 0 0 0 286470 16758307
"G" "2004, 2007, 2009, 2010, 2013, 2016, 2017, 2019" 2013 190122 102710 317644 286470 16758307
end
and the code is:
Code:
clear all

local all A C D N E F R G I N P S
local E A D N E F R I P S
local E_G `E' G
local varlist contrib_survmicroent_NJCR startup_ratio av_sz_survmicroent av_sz_microent survival_sh peg peg_ratio nrp_sh_gr

foreach benchmark in all E E_G {
    foreach restr in 0 1 {
        foreach incl01 in 0 1 {
            foreach agg in 0 1 {

                * *****************************************
                local countrylist ``benchmark''
                local lbl_bench `benchmark'
                if `restr'==1 local lbl_bench `benchmark'_restricted
                local macrosect
                if `incl01'==1 {
                    local lbl _incl_01
                    local lblimport _incl_01
                }
                else local lbl
                if `agg'==1 {
                    local agg _agg
                    local lblagg agg
                }
                else {
                    local agg
                    local lblagg byA7
                    local macrosect macrosect
                }

                use "${input}/F_data`lblimport'`agg'.dta", clear
                if `restr'==1 drop if inlist(country, "A", "C", "P", "S")

                * cleaning
                drop if country =="F" & year==2004
                drop if country =="F" & year==2007
                cap rename *_incl01* **
                rename FL_surv_TM FL_surv
                rename L_surv_TM L_surv
                local varlistnumbers L_all emp_surv FL_surv
                drop years_available

                * relevant countries
                _keepcountry, countryvar(country) countrylist("`countrylist'")

                * variables
                keep country `macrosect' j year `varlist' `varlistnumbers'
                tempfile indicators
                save `indicators', replace

                * *****************************************
                use `indicators', clear

                * do not include France
                drop if country=="FRA"

                * select cohorts
                keep if inlist(year,2004,2007,2009, 2010, 2012, 2013, 2016, 2017, 2019)

                * cohort variable
                gen cohort = string(year)
                replace cohort="2015 or 2016" if inlist(year,2015,2016)
                replace cohort="2012 or 2013" if inlist(year,2012,2013)

                * coverage by variable
                gen cty_year = country + " (" + string(year) + ")"
                egen group=group(`macrosect' j cohort)
                levelsof group , clean local(gplist)
                foreach var in `varlist' {
                    gen `var'_cov=""
                    foreach gp in `gplist' {
                        levelsof cty_year if group==`gp' & !missing(`var'), clean local(coverage) sep(", ")
                        replace `var'_cov="`coverage'" if group==`gp'
                    }
                }

                * data checks
                summ `varlist', d

                * cross-country average: collapse
                local varcount
                local varcoverage
                foreach var in `varlist' {
                    local varcount `varcount' c_`var'=`var'
                    local varcoverage `varcoverage' `var'_cov
                }
                collapse (mean) `varlist' (firstnm) `varcoverage' (count) `varcount', by(j `macrosect' cohort )
                for varlist `varlist': replace X=. if c_X==0
                foreach var in `varlist' {
                    replace `var'_cov=string(c_`var')+" countries: " + `var'_cov
                }
                drop c_*
                gen country="Average"
                tempfile xcavg
                save `xcavg', replace

                * *****************************************
                use `indicators', clear
                keep if country=="F"

                * cohort variable
                gen cohort = string(year)
                replace cohort="2015 or 2016" if inlist(year,2015,2016)
                replace cohort="2012 or 2013" if inlist(year,2012,2013)

                * coverage by variable
                gen cty_year = country + " (" + string(year) + ")"
                egen group=group(`macrosect' j cohort)
                levelsof group , clean local(gplist)
                foreach var in `varlist' {
                    gen `var'_cov=""
                    foreach gp in `gplist' {
                        levelsof cty_year if group==`gp' & !missing(`var'), clean local(coverage) sep(", ")
                        replace `var'_cov="`coverage'" if group==`gp'
                    }
                }

                * cross-country average: collapse
                local varcount
                local varcoverage
                foreach var in `varlist' {
                    local varcount `varcount' c_`var'=`var'
                    local varcoverage `varcoverage' `var'_cov
                }
                collapse (mean) `varlist' `varlistnumbers' (firstnm) `varcoverage' (count) `varcount' , by(j `macrosect' cohort )
                for varlist `varlist': replace X=. if c_X==0
                foreach var in `varlist' {
                    replace `var'_cov=`var'_cov
                }
                drop c_*
                gen country="F"

                * *****************************************
                append using `xcavg'
                sort country `macrosect' j cohort
                order country `macrosect' j cohort
                rename j horizon
                cap mkdir "${output}/`lblagg'"
                export excel "${output}/`lblagg'/F_data`lblagg'_`lbl_bench'`lbl'.xlsx", replace firstrow(variables)

            }
        }
    }
}
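One way to check which input file each pass actually reads is a dry run of the four loops that only builds and prints the file names, without touching any data. The sketch below assumes the intent is that incl01==0 should read F_data or F_data_agg and incl01==1 should read F_data_incl_01 or F_data_incl_01_agg. It resets every label at the top of each iteration and uses hypothetical names (`lblfile`, `infile`, `n_plain`) that are not in the original code.

```stata
* Dry run: build and display input file names only; no data is read or written
local n_plain 0
foreach benchmark in all E E_G {
    foreach restr in 0 1 {
        foreach incl01 in 0 1 {
            foreach agg in 0 1 {
                * reset every label first so nothing carries over from the last pass
                local lblimport
                local lblfile
                if `incl01' == 1 local lblimport _incl_01
                * separate name, so the loop macro agg is not overwritten
                if `agg' == 1 local lblfile _agg
                local infile F_data`lblimport'`lblfile'
                display "`benchmark' restr=`restr' incl01=`incl01' agg=`agg' -> `infile'.dta"
                * count incl01==0 passes that point at a plain (non-INCL01) file
                if `incl01' == 0 & strpos("`infile'", "_incl_01") == 0 local ++n_plain
            }
        }
    }
}
display "`n_plain' of 12 incl01==0 iterations point at a plain file"
```

Adding a similar `display` line just before the `use` command in the real loop would show whether the file names there match this list.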
I apologize if the code leaves you as lost as I am; I know nobody can magically solve the issue, or even guess what it is, from this alone.
I truly understand how you may feel, and I am in the same position.
Even from a quick, high-level look, any clue about possible causes would help me a lot.
Thank you so much in advance.