Hi, I am cleaning data for multiple countries. To make this job easier, I have defined a program for the repeated parts of the code, and I call it for each country.
My issue is that this program works for the majority of the countries (like Bangladesh) but does not work for some of them (like Yemen). Would you mind helping me with this? Thank you.
I have created sample datasets for Bangladesh and Yemen below using the dataex command. This is the program I have defined:
Code:
// Some observations are labeled "Don't Know" and take the value -9.
// This program recodes them (and any other negative codes) as missing.
program replace_m
    // Proceed only if all listed variables exist; describe fails (_rc != 0) if any one is absent
    capture describe d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a
    if _rc == 0 {
        keep d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a
        rename (d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a) (total_sales total_input_costs total_cost_labor machinery_replacement_value num_perm_worker num_temp_worker interview_year total_cost_fuel total_cost_electricity net_book_value_machinery interview_year_end total_rent industry)
        replace total_sales = . if total_sales < 0
        replace total_input_costs = . if total_input_costs < 0
        replace total_cost_labor = . if total_cost_labor < 0
        replace machinery_replacement_value = . if machinery_replacement_value < 0
        replace total_cost_fuel = . if total_cost_fuel < 0
        replace num_perm_worker = . if num_perm_worker < 0
        replace num_temp_worker = . if num_temp_worker < 0
        replace total_cost_electricity = . if total_cost_electricity < 0
        replace net_book_value_machinery = . if net_book_value_machinery < 0
        replace total_rent = . if total_rent < 0
    }
end
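One way to narrow this down might be to check which of the variables passed to describe are actually present in each country's file. Because the describe call is wrapped in capture, a single absent variable makes _rc nonzero and the whole if block is skipped without any message. The sketch below is only a diagnostic idea: check_vars is a hypothetical helper name, not part of my original code, and it assumes the country's data are already in memory.

```stata
* Hypothetical diagnostic helper (an assumption, not part of the original program).
* It reports which of the variable names passed to it are absent from the data in memory.
capture program drop check_vars
program define check_vars, rclass
    local names `0'                      // everything typed after the command name
    local missing
    foreach v of local names {
        // exact: do not accept abbreviations of existing variable names
        capture confirm variable `v', exact
        if _rc {
            local missing `missing' `v'
        }
    }
    if "`missing'" == "" {
        display as text "all variables found"
    }
    else {
        display as error "not found: `missing'"
    }
    return local missing `missing'       // also available afterwards as r(missing)
end
```

Calling it right after use in a country program, for example check_vars d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a, would list the names that make the describe call fail for that country.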
And I call it for two countries as below:
Code:
program clean_yemen
    use "./data/WBES_FIRM/Yemen/Yemen-2010-full-data-.dta", clear
    gen ave_exchange_rate = 202.84666667 // Official average exchange rate, YER per USD, for 2009; source: FAO STAT
    gen country_name = "Yemen"
    gen country_iso = "YEM"
    gen year = 2010
    replace_m
    save "./output/country_Yemen.dta", replace
end
Code:
program clean_bangladesh
    use "./data/WBES_FIRM/Bangladesh/Bangladesh-2013-full-data.dta", clear
    gen ave_exchange_rate = 74.1524 // Official average exchange rate, BDT per USD, for 2011; source: FAO STAT
    gen country_name = "Bangladesh"
    gen country_iso = "BGD"
    gen year = 2013
    replace_m
    save "./output/country_Bangladesh.dta", replace
end
Yemen's sample data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double(d2 n2e n2a n7a) int(l1 l6 a14y) double n2f long n2b double n6a int a15y byte a4a
-9 . -9 . 250 30 2010 . -9 . 2010 52
5000000 4000000 1600000 6000000 5 0 2010 120000 200000 -9 2010 28
8000000 6000000 1556000 -9 7 0 2010 -8 1202000 -9 2010 18
10000000 . 1000000 . 10 0 2010 . 1500000 . 2010 52
-9 -9 -9 -9 10 20 2010 -9 200000 -9 2010 2
30000000 20000000 6000000 40000000 15 0 2010 30000000 2700000 26000000 2010 26
-9 -9 -8 100000 5 0 2010 -8 20000 -9 2010 18
500000 200000 -9 10000 5 0 2010 -8 14400 0 2010 18
-9 . -9 . 18 4 2010 . -9 . 2010 55
-9 -9 -9 -9 10 9 2010 -9 200000 -9 2010 28
-9 . -8 . 6 12 2010 . -8 . 2010 52
1.200e+08 4000000 6000000 35000000 31 7 2010 500000 100000 20000000 2010 26
-9 . 2296000 . 15 11 2010 . -9 . 2010 45
200000 600000 400000 600000 5 0 2010 36000 -9 50000 2010 26
2493925376 . 117896720 . 300 100 2010 . 49738168 . 2010 51
12000000 1400000 2000000 0 5 3 2010 -8 90000 0 2010 2
5000000 . -9 . 5 0 2010 . 660000 . 2010 52
6000000 . 2000000 . 10 0 2010 . -9 . 2010 55
5000000 . 1500000 . 7 2 2010 . 600000 . 2010 52
-9 -9 -9 -9 10 0 2010 -9 -9 -9 2010 2
9846871 . 9617782 . 25 10 2010 . 200000 . 2010 51
-9 -9 -9 -9 160 0 2010 -9 -9 -9 2010 28
1000000 . 200000 . 5 0 2010 . 120000 . 2010 28
-9 7000000 3000000 2.000e+08 60 10 2010 200000 1500000 2.000e+08 2010 28
-9 . -9 . 10 0 2010 . 360000 . 2010 45
5.000e+08 4.980e+08 71000000 1.200e+08 200 0 2010 7000000 5800000 1.200e+08 2010 52
60000000 36000000 15000000 -9 95 20 2010 1000000 0 1.520e+08 2010 26
-9 4.000e+08 1.500e+08 -9 365 45 2010 10000000 -9 -9 2010 24
-9 -8 -8 -9 220 0 2010 1000000 6000000 -8 2010 24
5.840e+08 . 15000000 . 25 4 2010 . -9 . 2010 60
-9 -9 -9 -9 240 0 2010 -9 -9 -9 2010 24
1000000 500000 -9 100000 13 6 2010 -8 60000 -9 2010 18
-9 . -9 . 38 0 2010 . 2000000 . 2010 52
3857281024 . 4000000 . 40 10 2010 . 6000000 . 2010 52
60000000 . 11000000 . 60 30 2010 . 936000 . 2010 52
-9 . -8 . 9 1 2010 . 300000 . 2010 51
4.000e+08 . 20000000 . 5000 200 2010 . 100000 . 2010 45
1.000e+08 . 15000000 . 50 0 2010 . 800000 . 2010 52
-9 . -8 . 10 0 2010 . -8 . 2010 50
1200000 . 2000000 . 7 4 2010 . 300000 . 2010 55
1.800e+09 500000 2.000e+08 5.500e+08 500 250 2010 65000 2200000 4.500e+08 2010 25
1.000e+08 6.000e+08 2880000 1000000 8 10 2010 1000000 200000 2000000 2010 28
50000000 . 12000000 . 40 0 2010 . 180000 . 2010 51
-9 -8 -8 2000000 14 3 2010 -9 600000 4000000 2010 28
2000000 1000000 400000 18000000 15 10 2010 360000 240000 18000000 2010 26
15000000 3000000 600000 6000000 6 0 2010 1440000 840000 5000000 2010 26
1000000 . 1000000 . 5 8 2010 . 500000 . 2010 50
2.400e+10 -8 -8 -9 1357 44 2010 -8 -8 -9 2010 15
11000000512 5.200e+09 2381659904 -9 1696 216 2010 2.800e+08 32000000 2381659904 2010 15
-9 -8 -8 -9 200 10 2010 -8 -8 -8 2010 2
1.4336e+09 . 50000000 . 40 0 2010 . 840000 . 2010 50
-9 -9 -9 -9 -9 0 2010 -9 -9 -9 2010 2
30000000 . 42000000 . 12 4 2010 . 480000 . 2010 55
10000000 -9 3000000 50000000 7 4 2010 -9 840000 50000000 2010 26
20000000 20000000 3000000 50000000 25 8 2010 4000000 72000 50000000 2010 26
9000000 . 1500000 . 5 0 2010 . -9 . 2010 52
1000000 -8 1080000 70000 5 2 2010 -9 72000 70000 2010 26
4800000 . 1640000 . 5 0 2010 . 96000 . 2010 55
15000000 9000000 2200000 4000000 10 0 2010 50000 240000 4000000 2010 2
10000000 6840000 1000000 800000 5 0 2010 580000 60000 1000000 2010 15
2880000 . 216000 . 8 10 2010 . 360000 . 2010 55
3000000 . 300000 . 5 0 2010 . 72000 . 2010 52
2000000 500000 1200000 1000000 5 2 2010 -8 192000 500000 2010 18
6499999744 . 35000000 . 69 12 2010 . 7800000 . 2010 51
5000000 -9 1800000 56000 8 3 2010 2500 36000 56000 2010 18
2.000e+08 . 13000000 . 39 12 2010 . 15000000 . 2010 55
4.320e+09 . 23881630 . 30 4 2010 . 1200000 . 2010 51
5760000 1000000 3012000 300000 24 15 2010 -8 150000 700000 2010 18
-9 . 5000000 . 9 4 2010 . 600000 . 2010 52
3000000 2000000 1200000 3000000 9 10 2010 500000 -9 2000000 2010 26
35000000 10000000 15000000 10000000 25 10 2010 500000 5000000 10000000 2010 26
-9 . 1000000 . 18 1 2010 . 150000 . 2010 51
72000000 . 1500000 . 35 0 2010 . 840000 . 2010 55
4000000 900000 1200000 1000000 7 0 2010 -9 200000 500000 2010 2
12000000 . 2000000 . 5 0 2010 . 3000000 . 2010 52
-9 . -9 . 10 0 2010 . 840000 . 2010 55
7000000 . 3600000 . 40 10 2010 . 400000 . 2010 55
1.360e+08 . 36000000 . 75 10 2010 . 23000000 . 2010 55
-9 . 60000000 . 170 0 2010 . 12000000 . 2010 55
-9 40600000 26200000 80000000 40 0 2010 3050000 960000 86000000 2010 28
-9 . -9 . 20 0 2010 . -9 . 2010 52
-9 . 7200000 . 30 0 2010 . 1200000 . 2010 51
3.000e+09 1.528e+10 255630800 8814000128 500 0 2010 1.170e+08 -9 2344999936 2010 17
3.000e+08 . 2000000 . 6 10 2010 . 240000 . 2010 52
8.800e+09 . 24000000 . 48 0 2010 . 12000000 . 2010 51
4.000e+09 2.000e+09 65460000 3.000e+09 150 10 2010 1.500e+08 10000000 2.000e+09 2010 15
44999999488 4.000e+09 4000000 7.000e+09 500 200 2010 1.200e+09 -9 4.000e+09 2010 15
4.710e+08 1.870e+08 18000000 1.600e+08 121 27 2010 3600000 14400000 14200000 2010 18
75000000 39000000 1.850e+08 -9 38 0 2010 400000 1600000 -9 2010 28
50000000 20000000 18000000 -9 75 0 2010 6000000 4800000 5.000e+08 2010 28
10000000 3600000 4320000 -9 9 0 2010 -8 72000 -8 2010 28
1800000 . 1800000 . 5 1 2010 . 60000 . 2010 52
-9 -9 -9 1800000 5 0 2010 360000 500000 1000000 2010 18
4015000 250000 2281250 500000 6 0 2010 50000 144000 100000 2010 18
3273659904 . 33000000 . 29 0 2010 . 480000 . 2010 51
48000000 . 1500000 . 5 0 2010 . 48000 . 2010 52
3600000 . 1200000 . 5 0 2010 . 360000 . 2010 55
1.560e+08 1.400e+08 16000000 -9 33 30 2010 21000000 10000000 -9 2010 26
4000000 1800000 2160000 -9 5 1 2010 96000 120000 -9 2010 2
18250000 . 24000000 . 10 3 2010 . 1560000 . 2010 55
end
And Bangladesh's sample data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double(d2 n2e n2a n7a) int(l1 l6 a14y) long(n2f n2b) double n6a int a15y byte a4a
1.934e+08 1.200e+08 35000000 6.000e+09 600 80 2013 400000 5000000 2.000e+09 2013 15
90000000 70000000 4000000 12000000 60 0 2013 30000 60000 8000000 2013 31
5.400e+08 4.000e+08 14000000 60000000 110 0 2013 300000 1000000 40000000 2013 19
1.000e+09 6.000e+08 2.500e+08 2.500e+08 1800 0 2013 10000000 6000000 1.500e+08 2013 18
1.400e+09 9.000e+08 3.200e+08 2.300e+08 3000 0 2013 5000000 6000000 18000000 2013 18
7.200e+08 5.700e+08 90000000 45000000 1100 0 2013 1200000 9000000 30000000 2013 18
3.000e+08 2.700e+08 10000000 60000000 110 0 2013 2000000 600000 50000000 2013 29
1.600e+08 80000000 40000000 12000000 860 0 2013 1300000 1300000 10000000 2013 17
1.200e+08 67000000 24000000 20000000 350 0 2013 1700000 800000 15000000 2013 17
2.000e+08 30000000 50000000 15000000 350 0 2013 10000000 1000000 10000000 2013 18
6.000e+08 1.000e+08 3.000e+08 -9 250 150 2013 1000000 2000000 -9 2013 18
1.800e+08 1.037e+08 6634000 -9 1700 0 2013 2900000 6000000 -9 2013 18
3.054e+08 2.300e+08 40000000 60000000 448 0 2013 1200000 8000000 10900000 2013 18
9600000 6000000 600000 -9 12 0 2013 12000 140000 -9 2013 15
3.415e+08 1.600e+08 1.000e+08 30000000 1000 0 2013 900000 500000 20000000 2013 18
1400000 1200000 200000 70000 5 0 2013 0 20000 50000 2013 27
2.600e+08 1.800e+08 12000000 30000000 500 0 2013 1000000 300000 14000000 2013 18
6.600e+08 5.500e+08 24000000 50000000 500 0 2013 0 884000 30000000 2013 18
6.357e+08 3.552e+08 36200000 2.600e+08 350 0 2013 10400000 300000 1.500e+08 2013 18
1.450e+09 9.700e+08 75000000 1.200e+09 1450 0 2013 30000000 0 8.700e+08 2013 17
1.490e+09 6.400e+08 123423564 1.200e+09 1100 200 2013 5000000 1200000 1.000e+09 2013 17
8.500e+08 3.000e+08 1.500e+08 2.000e+09 900 30 2013 140000000 0 1.500e+09 2013 17
5.400e+08 4.700e+08 36000000 25000000 500 0 2013 10800000 0 20000000 2013 18
1500000 0 50000 60000 4 0 2013 0 12000 40000 2013 27
18500000 2200000 3000000 3000000 91 10 2013 1500000 1000000 2000000 2013 15
7.200e+09 2.000e+09 1.200e+09 3.000e+09 2600 500 2013 500000000 200000000 2.000e+09 2013 24
8.000e+08 4.000e+08 20000000 2.000e+09 1200 0 2013 50000000 50000000 2.000e+09 2013 24
2.400e+08 1.500e+08 30000000 1.500e+08 250 0 2013 400000 4500000 5.000e+08 2013 24
3.400e+08 1.300e+08 70000000 1.500e+08 72 0 2013 2000000 2000000 90000000 2013 24
3.000e+08 20000000 96000000 2.000e+08 400 0 2013 2400000 2600000 1.000e+09 2013 18
2.000e+08 1.500e+08 6500000 -9 100 30 2013 100000 84000 52400000 2013 19
40000000 15000000 2700000 500000 30 30 2013 150000 3500000 150000 2013 19
3000000 1000000 960000 700000 20 0 2013 240000 120000 500000 2013 29
3.600e+08 2.900e+08 45000000 40000000 800 0 2013 7200000 3000000 30000000 2013 18
6.000e+08 4.900e+08 48000000 55000000 830 0 2013 3000000 2500000 60000000 2013 18
6.000e+08 3.000e+08 2.000e+08 6500000 700 200 2013 5000000 10000000 6000000 2013 17
37000000 5000000 12000000 40000000 200 0 2013 1600000 2000000 20000000 2013 15
6.500e+08 5.500e+08 22000000 5.000e+08 150 10 2013 700000 2500000 2.000e+08 2013 19
3.000e+08 1.300e+08 1.200e+08 7.000e+08 550 50 2013 10000000 10000000 5.000e+08 2013 24
8.900e+08 6.500e+08 30000000 45000000 2050 0 2013 5000000 15000000 30000000 2013 18
1.300e+08 60000000 40000000 32000000 470 0 2013 2000000 6000000 28100000 2013 18
3.080e+09 2.050e+09 3.200e+08 1.400e+08 5000 0 2013 30000000 20000000 1.200e+08 2013 18
8.000e+08 4.600e+08 50000000 60000000 550 0 2013 6000000 20000000 50000000 2013 18
12000000 4500000 3600000 30000000 24 0 2013 0 360000 25000000 2013 19
577500 . 80000 . 20 0 2013 . 120000 . 2013 52
1.160e+09 9.200e+08 1.800e+08 1.500e+08 1150 0 2013 8600000 3800000 1.300e+08 2013 17
32500000 10000000 9000000 40000000 110 0 2013 1200000 1800000 20000000 2013 17
3.000e+08 2.630e+08 21000000 15000000 250 0 2013 3744000 1080000 1.200e+08 2013 18
3.600e+08 3.070e+08 26000000 20000000 305 0 2013 3000000 1800000 15000000 2013 18
95000000 . 7100000 . 70 0 2013 . 1300000 . 2013 45
-9 -9 -9 -9 270 0 2013 -9 -9 -9 2013 15
50000000 . 4000000 . 35 5 2013 . 700000 . 2013 52
1.859e+10 -9 -9 -9 1202 0 2013 -9 -9 -9 2013 24
1.900e+08 60000000 1.000e+08 70000000 1350 0 2013 1000000 6600000 50000000 2013 18
1.500e+08 40000000 40000000 20000000 200 0 2013 5000000 2000000 15000000 2013 18
4.800e+08 4.230e+08 38000000 2.500e+08 270 0 2013 3000000 2500000 2.000e+08 2013 18
1.800e+09 1.620e+09 84000000 90000000 1400 0 2013 42000000 600000 70000000 2013 18
35454000 15160000 3700000 1.407e+08 350 0 2013 51000 4054000 10000000 2013 18
1.100e+08 50000000 30000000 50000000 25 15 2013 2000000 4000000 30000000 2013 15
30000000 9000000 8400000 3.000e+08 123 0 2013 2500000 1300000 15000000 2013 31
5.000e+09 2.000e+09 3.000e+08 2.000e+09 1500 0 2013 30000000 30000000 4.500e+08 2013 23
1.000e+08 500000 4000000 15000000 330 0 2013 300000 1000000 10000000 2013 18
1.600e+09 1.000e+09 3.000e+08 15000000 2500 0 2013 10000000 15000000 10000000 2013 17
1.300e+08 69000000 31200000 8.000e+08 300 0 2013 21600000 480000 5.200e+08 2013 18
60000000 27000000 20000000 10000000 350 0 2013 1600000 1200000 8000000 2013 18
1.200e+08 90000000 14400000 50000000 120 0 2013 1800000 4800000 45000000 2013 19
90000000 76000000 6000000 9000000 120 0 2013 600000 264000 6000000 2013 18
3.225e+08 6450000 23400000 -9 1500 30 2013 11500000 200000 -9 2013 18
2.000e+08 1.300e+08 52800000 50000000 550 0 2013 3000000 2160000 40000000 2013 18
6.000e+08 4.780e+08 1.000e+08 50000000 1500 0 2013 6000000 1800000 40000000 2013 17
4.800e+08 2.915e+08 1.500e+08 95000000 1950 0 2013 20000000 3600000 80000000 2013 24
2.400e+08 1.200e+08 20000000 46000000 250 0 2013 84000000 120000 40000000 2013 18
1.200e+08 83000000 25200000 9000000 320 0 2013 108000000 1000000 7000000 2013 18
851772215 732479813 40627181 -9 550 50 2013 4729666 1194353 21920845 2013 17
3.500e+08 -9 -9 -9 275 0 2013 -9 -9 -9 2013 19
3700000 1500000 800000 400000 10 0 2013 -9 120000 200000 2013 27
-9 -9 -9 -9 2200 500 2013 -9 -9 -9 2013 18
6.000e+08 4.870e+08 80000000 86000000 1100 0 2013 13500000 6000000 80000000 2013 18
9.000e+08 70000000 55000000 6.500e+08 600 60 2013 12000000 10000000 6.000e+08 2013 17
7.000e+08 5.000e+08 12500000 2.500e+08 100 30 2013 7000000 6000000 2.000e+08 2013 31
90000000 63000000 13000000 14000000 250 0 2013 3000000 720000 12000000 2013 31
3.600e+08 2.300e+08 82000000 40000000 750 0 2013 8000000 3600000 35000000 2013 24
6000000 4500000 240000 2000000 7 0 2013 60000 18000 1500000 2013 15
4905783521 2.1618e+09 1152050000 7.000e+08 1907 0 2013 354200000 482000000 5.000e+08 2013 17
3.600e+08 3.000e+08 15800000 19000000 110 0 2013 3600000 1200000 17000000 2013 31
2.400e+09 2.100e+09 1.170e+08 1.000e+08 650 0 2013 18000000 6000000 80000000 2013 24
2.000e+08 1.600e+08 12000000 50000000 140 30 2013 2000000 2600000 40000000 2013 17
4.800e+08 4.300e+08 8600000 8000000 60 0 2013 1500000 1200000 6000000 2013 31
5.200e+08 3.640e+08 1.200e+08 15000000 1400 0 2013 4800000 2400000 12000000 2013 18
3.000e+08 2.330e+08 36000000 1.200e+08 2500 0 2013 3600000 1800000 1.000e+08 2013 24
4000000 2400000 480000 1000000 5 0 2013 24000 78000 800000 2013 15
2500000 . 85700 . 4 1 2013 . 9600 . 2013 52
186020000 1.400e+08 28000000 20000000 300 0 2013 800000 1500000 12000000 2013 26
-9 . 13500000 . 120 0 2013 . 200000 . 2013 60
3.000e+08 2.000e+08 30000000 1.800e+08 225 0 2013 1600000 3700000 1.300e+08 2013 31
1.500e+09 1.000e+09 1.200e+08 3.000e+08 2000 0 2013 1500000 4000000 1.500e+08 2013 18
34186960 20000000 3500000 2000000 35 80 2013 200000 300000 1000000 2013 24
15000000 10000000 400000 1000000 4 25 2013 0 150000 600000 2013 15
1.350e+08 62000000 16000000 30000000 200 0 2013 1500000 500000 20000000 2013 18
2.300e+08 1.200e+08 46500000 50000000 500 0 2013 200000 1800000 30000000 2013 18
end
label values d2 LABF
label values l1 LABF
label values n2a LABF
label values n2e LABF
label values n2f LABF
label values n2b LABF
label values n6a LABF
label values n7a LABF
label def LABF -9 "DON'T KNOW", modify
label values l6 L6
label def L6 0 "NO FULL-TIME SEASONAL OR TEMPORTARY WORKERS", modify
label values a4a LABC
label def LABC 15 "Food", modify
label def LABC 17 "Textiles", modify
label def LABC 18 "Garments", modify
label def LABC 19 "Leather", modify
label def LABC 23 "Refined petroleum product", modify
label def LABC 24 "Chemicals", modify
label def LABC 26 "Non metallic mineral products", modify
label def LABC 27 "Basic metals", modify
label def LABC 29 "Machinery and equipment (29 & 30)", modify
label def LABC 31 "Electronics (31 & 32)", modify
label def LABC 45 "Construction Section F: F", modify
label def LABC 52 "Retail", modify
label def LABC 60 "Transport Section I: (60-64) I", modify
