Dear All,
I am working with a dataset containing multiple observations for each case (one case per respondent), where each observation represents an alternative that may be chosen. More specifically, I have individuals nested in 14 countries, and each individual (idno) has a set of alternatives to choose from (party_choice, varying by country). The variable of interest is vote (1 when the party is chosen, 0 otherwise).
I have both case-specific variables (specific to each idno) like gender, education, age and country, as well as alternative-specific variables (that vary with each observation) such as dist_lr and dir_lr.
The dataset looks like this:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float idno long(country party_choice) str8 party_voted float(vote dist_lr dir_lr)
18422 1 66 "Grune" 0 5.9 -7.6
18422 1 101 "Grune" 0 3 4
18422 1 40 "Grune" 1 1.5 10
18422 1 33 "Grune" 0 8.1 -16.400002
18422 1 65 "Grune" 0 4.9 -3.6000004
12487 1 66 "OVP" 1 1.9 0
12487 1 33 "OVP" 0 4.1000004 0
12487 1 101 "OVP" 0 1 0
12487 1 40 "OVP" 0 2.5 0
12487 1 65 "OVP" 0 .9000001 0
21381 1 33 "OVP" 0 4.1000004 0
21381 1 65 "OVP" 0 .9000001 0
21381 1 66 "OVP" 1 1.9 0
21381 1 101 "OVP" 0 1 0
21381 1 40 "OVP" 0 2.5 0
8283 1 40 "OVP" 0 5.5 -7.5
8283 1 101 "OVP" 0 4 -3
8283 1 65 "OVP" 0 2.1 2.7
8283 1 33 "OVP" 0 1.1000004 12.3
8283 1 66 "OVP" 1 1.0999999 5.7
18744 1 66 "NEOS" 0 3.9 -3.8
18744 1 65 "NEOS" 1 2.9 -1.8000002
18744 1 40 "NEOS" 0 .5 5
18744 1 101 "NEOS" 0 1 2
18744 1 33 "NEOS" 0 6.1 -8.200001
25697 1 66 "FPO" 0 1.0999999 5.7
25697 1 33 "FPO" 1 1.1000004 12.3
25697 1 65 "FPO" 0 2.1 2.7
25697 1 40 "FPO" 0 5.5 -7.5
25697 1 101 "FPO" 0 4 -3
23799 1 40 "SPO" 0 .5 7.5
23799 1 101 "SPO" 1 2 3
23799 1 66 "SPO" 0 4.9 -5.7
23799 1 33 "SPO" 0 7.1 -12.3
23799 1 65 "SPO" 0 3.9 -2.7
26323 1 101 "FPO" 0 1 0
26323 1 33 "FPO" 1 4.1000004 0
26323 1 40 "FPO" 0 2.5 0
26323 1 66 "FPO" 0 1.9 0
26323 1 65 "FPO" 0 .9000001 0
18772 1 33 "SPO" 0 4.1000004 0
18772 1 65 "SPO" 0 .9000001 0
18772 1 101 "SPO" 1 1 0
18772 1 66 "SPO" 0 1.9 0
18772 1 40 "SPO" 0 2.5 0
24353 1 101 "FPO" 0 1 0
24353 1 65 "FPO" 0 .9000001 0
24353 1 66 "FPO" 0 1.9 0
24353 1 40 "FPO" 0 2.5 0
24353 1 33 "FPO" 1 4.1000004 0
17765 1 33 "Grune" 0 7.1 -12.3
17765 1 40 "Grune" 1 .5 7.5
17765 1 65 "Grune" 0 3.9 -2.7
17765 1 66 "Grune" 0 4.9 -5.7
17765 1 101 "Grune" 0 2 3
17033 1 33 "SPO" 0 4.1000004 0
17033 1 40 "SPO" 0 2.5 0
17033 1 101 "SPO" 1 1 0
17033 1 65 "SPO" 0 .9000001 0
17033 1 66 "SPO" 0 1.9 0
21874 1 65 "OVP" 0 .0999999 .9000001
21874 1 101 "OVP" 0 2 -1
21874 1 33 "OVP" 0 3.1000004 4.1000004
21874 1 40 "OVP" 0 3.5 -2.5
21874 1 66 "OVP" 1 .9000001 1.9
14926 1 65 "SPO" 0 .9000001 0
14926 1 40 "SPO" 0 2.5 0
14926 1 101 "SPO" 1 1 0
14926 1 66 "SPO" 0 1.9 0
14926 1 33 "SPO" 0 4.1000004 0
15558 1 66 "SPO" 0 2.9 -1.9
15558 1 65 "SPO" 0 1.9 -.9000001
15558 1 40 "SPO" 0 1.5 2.5
15558 1 33 "SPO" 0 5.1 -4.1000004
15558 1 101 "SPO" 1 0 1
8221 1 101 "SPO" 1 0 1
8221 1 65 "SPO" 0 1.9 -.9000001
8221 1 33 "SPO" 0 5.1 -4.1000004
8221 1 66 "SPO" 0 2.9 -1.9
8221 1 40 "SPO" 0 1.5 2.5
22207 1 40 "OVP" 0 2.5 0
22207 1 33 "OVP" 0 4.1000004 0
22207 1 65 "OVP" 0 .9000001 0
22207 1 101 "OVP" 0 1 0
22207 1 66 "OVP" 1 1.9 0
17667 1 65 "OVP" 0 . .
17667 1 33 "OVP" 0 . .
17667 1 66 "OVP" 1 . .
17667 1 101 "OVP" 0 . .
17667 1 40 "OVP" 0 . .
9545 1 33 "NEOS" 0 3.1000004 4.1000004
9545 1 65 "NEOS" 1 .0999999 .9000001
9545 1 40 "NEOS" 0 3.5 -2.5
9545 1 66 "NEOS" 0 .9000001 1.9
9545 1 101 "NEOS" 0 2 -1
8266 1 66 "OVP" 1 .0999999 3.8
8266 1 65 "OVP" 0 1.0999999 1.8000002
8266 1 101 "OVP" 0 3 -2
8266 1 40 "OVP" 0 4.5 -5
8266 1 33 "OVP" 0 2.1000004 8.200001
end
label values country country_label
label def country_label 1 "Austria", modify
label values party_choice party_choice
label def party_choice 33 "FPO", modify
label def party_choice 40 "Grune", modify
label def party_choice 65 "NEOS", modify
label def party_choice 66 "OVP", modify
label def party_choice 101 "SPO", modify
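I was advised to use cmclogit to estimate the effect of different measures (dist_lr, dir_lr, etc.) on vote.
I have thus declared the data: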
Code:
cmset idno party_choice
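Before fitting anything, it may help to inspect the declared structure. The following is only a sketch, assuming the variables shown in the data example; n_chosen and flag are hypothetical names introduced here for illustration and do not come from my dataset.

```stata
* Declare the choice-model structure, as above
cmset idno party_choice

* Tabulate the choice sets, i.e. which combinations of parties respondents face
cmchoiceset

* Count chosen alternatives within each case (n_chosen is a hypothetical name).
* Every case should show exactly one vote == 1; note that the table counts
* observations, not cases.
bysort idno: egen n_chosen = total(vote)
tabulate n_chosen

* Flag observations that would be excluded from estimation and the reason
* (flag is a hypothetical name); missing dist_lr values should show up here
cmsample dist_lr, choice(vote) generate(flag)
tabulate flag
```

Cases without exactly one chosen alternative, or with missing alternative-specific values, would be worth looking at before estimation.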
I then tried to run cmclogit: first a bivariate regression, then adding the case-specific variables, and finally with standard errors clustered by respondent.
Code:
cmclogit vote dist_lr
cmclogit vote dist_lr, casevars(i.gender agea eduyrs)
cmclogit vote dist_lr, casevars(i.gender agea eduyrs) vce(cl idno)
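One way to narrow the problem down might be to fit the simplest model on a single country with a capped number of iterations. This is a debugging sketch, not a recommended specification: the restriction to country == 1 (Austria in the value labels above) and the limit of 30 iterations are arbitrary choices made for illustration.

```stata
* Cap the iterations so that a non-converging fit stops instead of running for hours
cmclogit vote dist_lr if country == 1, iterate(30)

* Same model without alternative-specific constants, to see whether
* the constants are related to the convergence problem
cmclogit vote dist_lr if country == 1, noconstant iterate(30)
```

If the single-country model converges quickly, the pooled set of alternatives across the 14 countries would be the next thing to look at.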
The issue I encounter is that the initial maximum likelihood iterations never end. I waited a long time (4 hours), but the model did not converge.
More specifically, an example of the output I receive at each iteration is:
Code:
Iteration 21: Log likelihood -22325.631 (not concave)
Does anyone know why?
More generally, given the structure of the data, I have some questions:
1. Is cmclogit the most appropriate conditional logit model to use in this case? Or would the cmmixlogit command (mixed logit choice model) be more appropriate, given the alternative-specific variables to be included in the analysis?
2. Do I need to estimate standard errors clustered by respondent id, or do cmclogit and cmmixlogit already account for the non-independence of observations?
3. Since it is a cross-sectional study with individuals nested in countries, should country appear as a variable in the casevars box?
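To make questions 1 and 3 concrete, the specifications I have in mind would look roughly like the following. This is only a syntax sketch; whether either one is appropriate for these data is exactly what I am asking, and treating dist_lr as the random coefficient is an assumption made for illustration.

```stata
* Question 1: mixed logit with a random coefficient on dist_lr
cmmixlogit vote, random(dist_lr) casevars(i.gender agea eduyrs)

* Question 3: conditional logit with country added as a case-specific variable
cmclogit vote dist_lr, casevars(i.gender agea eduyrs i.country)
```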
Sincerely
Mattia