Dear all,
I have two datasets (A and B) from two sections (A and B) of the same survey (Ghana Living Standard Survey, GLSS 6). Dataset A is on FINANCIAL SERVICES of households, while dataset B is on ASSETS AND DURABLE CONSUMER GOODS of the same households.
Neither dataset has a unique identifier. I want to know the proportion of households owning various assets and consumer durables by locality.
1. How do I create an ID that uniquely identifies observations in these two datasets so that I can merge them?
2. Please help me understand what could have gone wrong when a crosstab drops some observations. Is it a problem with the dataset or with the Stata command?
For example:
tab loc5 (where loc5 is the locality variable)

          loc5 |      Freq.     Percent
---------------+-----------------------
  Accra (GAMA) |      1,697       10.12
   Other Urban |      5,748       34.27
 Rural Coastal |      1,156        6.89
  Rural Forest |      3,863       23.03
Rural Savannah |      4,308       25.69
---------------+-----------------------
         Total |     16,772      100.00
tab s12aq16 (where s12aq16 records the answer to "Do you have an insurance policy?")

    s12aq16 |      Freq.     Percent
------------+-----------------------
        yes |     13,822       22.54
         no |     47,500       77.46
------------+-----------------------
      Total |     61,322      100.00
When I then cross-tabulate loc5 by s12aq16 (or the other way round), some of my observations are dropped, as shown below.
tab loc5 s12aq16

               |       s12aq16
          loc5 |       yes         no |     Total
---------------+----------------------+----------
  Accra (GAMA) |       553      1,141 |     1,694
   Other Urban |     2,129      3,608 |     5,737
 Rural Coastal |       237        918 |     1,155
  Rural Forest |       873      2,989 |     3,862
Rural Savannah |       589      3,708 |     4,297
---------------+----------------------+----------
         Total |     4,381     12,364 |    16,745
I expected the total to be at least 16,772, not 16,745.
So what went wrong, and why were 27 observations dropped?
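A possible check for the crosstab, offered as a sketch rather than a diagnosis: a two-way tabulate leaves out any observation that is missing on either variable unless the missing option is given. If that is the cause here, the 27 observations would be those with a value for loc5 but no value for s12aq16. The tables above also suggest that loc5 itself is non-missing for only 16,772 of the observations that have an insurance answer. The commands below use only the two variable names already shown.

```stata
* Show missing values as their own row and column instead of leaving them out
tabulate loc5 s12aq16, missing

* Count observations that have a locality but no insurance answer
count if !missing(loc5) & missing(s12aq16)

* Inspect those observations
list loc5 s12aq16 if !missing(loc5) & missing(s12aq16)
```

If the count comes back as 27, the gap is explained by missing values in the data and not by a problem with the tabulate command.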
Please sort me out.
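For the merge question, a minimal sketch, assuming the household is identified by a combination of variables present in both files. The variable names clust and nh and the file names sectionA.dta and sectionB.dta are placeholders, not taken from the post, and should be replaced with whatever the actual files contain. A single numeric ID is not strictly required, since merge accepts several key variables.

```stata
* Placeholder names: clust and nh as the household key,
* sectionA.dta and sectionB.dta as the two files. Adjust to the real data.

* 1. See how many rows share the same candidate key in section B
use "sectionB.dta", clear
duplicates report clust nh

* 2. isid stops with an error if clust nh does not uniquely identify rows.
*    If it fails (e.g. one row per asset per household), the file needs to be
*    reshaped or collapsed to one row per household before merging.
isid clust nh
tempfile hhB
save `hhB'

* 3. Merge onto section A; m:1 allows several rows per household in A
use "sectionA.dta", clear
merge m:1 clust nh using `hhB'

* 4. Check which observations matched
tabulate _merge
```

The m:1 form is chosen here on the assumption that section A may hold more than one row per household; if both files turn out to be one row per household, merge 1:1 on the same key would be the stricter choice.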