I have several questions about a project using data from the IPUMS General Household Survey in Nigeria. I have two samples, for the years 2006 and 2007.
I need to analyse the socio-economic factors that influence whether a household owns a computer, has access to one, or neither. Since the outcome has three categories, I plan to use multinomial regression.
The following is the sample description for the 2006 sample (the 2007 sample differs slightly in the total number of households and EAs included): the sample followed a two-stage, replicated and rotable design in which enumeration areas (EAs) demarcated for the 1991 Population Census served as the primary sampling units and housing units (HUs) as the secondary sampling units. Sixty EAs per state and 30 EAs in the Federal Capital Territory, Abuja, were randomly selected. In each EA, 10 households were selected randomly from a list of all households in the EA. In total, 21,900 housing units from 2,190 enumeration areas were included in the sample. The selected EAs were distributed across urban and rural areas.
The sample is weighted, meaning each record in the sample represents a certain number of households in the population. If I understand correctly, these are post-stratification weights.
My questions concern how to declare the survey design with svyset:
1. Should/can I use the weights in the sample as pweights?
2. Should I use the urban/rural variable as strata identifier?
3. Should I pool the data or perform the analysis on each year separately?
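To make the questions concrete, here is a rough sketch of the kind of setup I mean. All variable names (ea_id, hh_weight, state, urban, hhsize, computer) are placeholders, not the actual IPUMS names, and treating state as the stratum is only an assumption based on the "sixty EAs per state" part of the description. Whether urban/rural belongs there instead is exactly question 2.

```stata
* Sketch only: every variable name below is a hypothetical placeholder.
*   ea_id     = enumeration area identifier (primary sampling unit)
*   hh_weight = household weight supplied with the sample
*   state     = state identifier (assumed stratum; see question 2)
*   computer  = 0 none, 1 has access, 2 owns

* Declare the design: EAs as PSUs, supplied weights as pweights
svyset ea_id [pweight = hh_weight], strata(state)

* Design-based multinomial logit with "none" as the base outcome
svy: mlogit computer i.urban hhsize, baseoutcome(0)
```

If the two years were pooled (question 3), I assume the strata would have to be kept distinct by year, for example with a combined identifier built via egen stratum = group(year state), but I am not sure that is the right approach.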
Thank you in advance. Any help will be appreciated.