Generating a single graph representing kdensity, spikes, and dots on individual and party variables.

Mattia Gatti

Join Date: May 2023
Posts: 42

01 Nov 2023, 12:06

Hi there,

I am opening a new post to ask for your suggestions on a problem I face when building graphs that combine individual-level and party-level data in Stata 18.

I previously opened a related thread on the forum (https://www.statalist.org/forums/for...-kdensity-plot), but I think it is best to keep the two separate because this issue is more involved.

I am working with the 2019 Chapel Hill Expert Survey (CHES), which contains information on parties nested in 14 countries, and with the European Social Survey (ESS), which contains information on individuals nested in 14 countries.

The goal is to produce, for each country, kdensity graphs with a spike marking the median value among individuals and dots marking party positions, on four socio-economic and socio-cultural issues.

For individuals, I have attitudes on the four issues: Redistribution_norm, Immigration_policy_norm, euftf_norm and freehms_norm. For each issue I also generated a variable holding the median value, plus the lower-limit and upper-limit variables needed to draw the spike; for redistribution these are median_Redistribution_norm, lowlim_Red_medianline and upplim_Red_medianline. Finally, I have each party's position on each of the four issues, by country. These are stored in a multitude of variables such as red_normSocialDemocrats or imm_normSocialDemocrats, each taking the same value for all individuals in the party's country and missing for everyone else. To build them, I used the following code:

Code:

// Keep the 2019 CHES wave and the four normalized party positions

use "/Users/mattiagatti/Desktop/1999-2019_CHES_dataset_meansv3 Supervisors.dta"

keep if year==2019

keep country party redistribution_norm multiclt_immig_policy_norm genderequality_norm eu_position_norm

save CHES2019_Supervisors_new.dta


// I rename the cntry var in ESS to do the merging


use ESS-Data-Wizard-subset-2023-10-25.dta

rename cntry country

save ESS_round9&10_subset_new2.dta


// I actually found out that the two country variables are different and need harmonization
use ESS_round9&10_subset_new2.dta

gen str3 iso_country = country

order iso_country, first
sort iso_country
bro 

drop country

save ESS_round9&10_subset_new2.dta, replace


// let's harmonize also on CHES

use "/Users/mattiagatti/Desktop/CHES2019_Supervisors_new.dta"


gen str3 iso_country = ""
replace iso_country = "BE" if country == 1
replace iso_country = "DK" if country == 2
replace iso_country = "DE" if country == 3
replace iso_country = "GR" if country == 4
replace iso_country = "ES" if country == 5
replace iso_country = "FR" if country == 6
replace iso_country = "IE" if country == 7
replace iso_country = "IT" if country == 8
replace iso_country = "NL" if country == 10
replace iso_country = "GB" if country == 11 // possible mismatch: CHES has UK while ESS has GB, so I use GB here to harmonize
replace iso_country = "PT" if country == 12
replace iso_country = "AT" if country == 13
replace iso_country = "FI" if country == 14
replace iso_country = "SE" if country == 16


order iso_country, first
sort iso_country
drop country 

save CHES2019_Supervisors_new.dta, replace // extension moved to the end so the file name matches the later -use- and -merge-


// Change in wide format the CHES dataset 

sort iso_country party

// I need to clean the values of the party variable (replacing characters such as / and -), drop a party (DieTier), and shorten the variable names

replace party = subinstr(party, "/", "_", .)
replace party = subinstr(party, "-", "_", .)
rename redistribution_norm red_norm
rename genderequality_norm gender_norm
rename multiclt_immig_policy_norm mult_imm_norm


// I now change into wide format and I generate as many variables as there are parties x4 (the four issue items)

reshape wide red_norm mult_imm_norm gender_norm eu_position_norm, i(iso_country) j(party) string //party is string variable

save CHES2019_Supervisors_new.dta, replace

// I now take the ESS and merge

use "/Users/mattiagatti/Desktop/ESS_round9&10_subset_new2.dta"
sort iso_country


merge m:1 iso_country using CHES2019_Supervisors_new.dta //PERFECT MATCHING (remember UK and GB difference)

save ESS_round9&10_subset_new2.dta, replace
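The code above does not show how the median and limit variables were created. A minimal sketch of one way to build them follows. It assumes the median is taken within country and that the spike endpoints are the constants 0 and 1.7 seen in the data listing for Austria. The toy observations, including the "BE" rows, are hypothetical and only make the example runnable on its own.

```stata
clear
* Hypothetical toy data: a few respondents in two countries
input str3 iso_country float Redistribution_norm
"AT" 0
"AT" .25
"AT" .25
"AT" .75
"BE" .5
"BE" .5
"BE" 1
end

* Country-specific median, repeated on every respondent row of that country
egen median_Redistribution_norm = median(Redistribution_norm), by(iso_country)

* Constant endpoints of the vertical median line
* (0 and 1.7 are assumed here for every country)
gen lowlim_Red_medianline = 0
gen upplim_Red_medianline = 1.7

list, sepby(iso_country)
```

The same three lines would be repeated, or wrapped in a foreach loop, for the other three issue variables.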

I now possess a dataset that looks like this:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str3 iso_country long idno float(Redistribution_norm median_Redistribution_norm lowlim_Red_medianline upplim_Red_medianline red_normFPO red_normSPO red_normOVP red_normGrune red_normNEOS)
"AT" 21646   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 36779 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 15085   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 25655   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 28773 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 34466   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 14698   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 36504 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  9180 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 66304 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 51621 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 64890 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 57650 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 22869 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 54587 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 49061   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 58981  .5 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 45293   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 47422 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 36275 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 23027 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 45610 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 24917   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  7843   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 38383   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 40509 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 29817 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 46621 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 54803 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 27444 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  7276   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  6174 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 51638 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  7501   1 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 51522   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 35957   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 51361 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 24636 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 28416 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  3262   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  7189 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT"   815  .5 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 67043 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 67849 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  5633 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 11642   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 22281 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 46283   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 32498  .5 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 44448 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 49705 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 45030   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  6364 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 18035 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 26283   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 60931   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 52326   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 65146   . .25 0 1.7 .56 .24 .62 .25 .69
"AT" 25078 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 60876 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  2153 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 47267 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  7046 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 11103 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 22467 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  6871 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 10728 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 21379   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 28210 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 35939   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  8675   1 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 14562   1 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 26645   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 29529   . .25 0 1.7 .56 .24 .62 .25 .69
"AT" 28694 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 34590 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  9441 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 20175 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 62112 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 57815   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  8286 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  9858 .75 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 34369 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 52138   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 16018 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  9976   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 35804   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 52560 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 39420   0 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 22258 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 35783 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 50091 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 21455  .5 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 35698 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 26741 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT"  6568  .5 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 40391 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 12455 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 10273 .25 .25 0 1.7 .56 .24 .62 .25 .69
"AT" 68393 .75 .25 0 1.7 .56 .24 .62 .25 .69
end
label values Redistribution_norm Redistribution_std1
label def Redistribution_std1 0 "0. Left", modify
label def Redistribution_std1 1 "1. Right", modify

Note: the five party variables above (FPO, SPO, OVP, Grune, NEOS) refer to Austrian parties.

Now, for each country, I would like to draw kdensity plots of the distribution of individual preferences on each of the four issues, and superimpose a spike at the individuals' median value plus several dots along the x axis showing the positions the different political parties take on the issue.

The problem is that I do not have a party_id variable, which makes the task complex. The only code I am able to run takes the following form:

Code:

twoway kdensity Redistribution_norm if iso_country=="AT", bw(0.2) title(Redistribution Policy) range(0 1) xlabel(0(0.25)1) ///
    || rspike lowlim_Red_medianline upplim_Red_medianline median_Redistribution_norm if iso_country=="AT", ytitle(Probability density) xtitle("`: var label Red'") /// rspike rather than spike, because twoway spike accepts only one y variable
    || scatter red_normNEOS red_normGrune red_normFPO red_normOVP red_normSPO if iso_country=="AT", ylabel(0(0.1)1) ytitle("Redistribution")

As you can see, I am forced to list every party variable for the country I am examining (in this case AT). I would like to work with a single variable in the scatter part of the command, adding something like the option "over(party identifier) asyvars", so that all the parties of a given iso_country are plotted. I suppose the problem is made worse by the fact that the units of observation are individuals.
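A minimal sketch of what such a single-variable layout could look like follows. It is an assumption about one possible restructuring, not a tested solution for the full data. Instead of reshaping CHES to wide format and merging, the party rows are kept long and appended below the individual rows, so one variable (party_pos) holds every party position and another (party) holds the name. The names party_pos and party_y and the simulated respondents are hypothetical; the five Austrian positions are taken from the data listing above. As far as I know, over() and asyvars belong to graph bar, graph dot and graph box rather than to twoway, so the sketch uses marker labels to identify the parties.

```stata
clear
* Party-level rows in long format: one row per party
input str3 iso_country str10 party float party_pos
"AT" "FPO"   .56
"AT" "SPO"   .24
"AT" "OVP"   .62
"AT" "Grune" .25
"AT" "NEOS"  .69
end
gen byte party_y = 0          // y coordinate that places the dots on the x axis
tempfile parties
save `parties'

* Hypothetical individual-level data standing in for the ESS respondents
clear
set seed 2023
set obs 200
gen str3 iso_country = "AT"
gen float Redistribution_norm = floor(5*runiform())/4   // values 0, .25, ..., 1
egen median_Redistribution_norm = median(Redistribution_norm), by(iso_country)
gen lowlim_Red_medianline = 0
gen upplim_Red_medianline = 1.7

* Stack the party rows under the individual rows; each block is missing
* on the other block's variables, so every plot uses only its own rows
append using `parties'

twoway (kdensity Redistribution_norm if iso_country == "AT", bw(0.2) range(0 1)) ///
       (rspike lowlim_Red_medianline upplim_Red_medianline median_Redistribution_norm if iso_country == "AT") ///
       (scatter party_y party_pos if iso_country == "AT", mlabel(party) mlabangle(45) mlabposition(12)), ///
       title("Redistribution Policy") xlabel(0(0.25)1) ///
       ytitle("Probability density") xtitle("Redistribution") ///
       legend(order(1 "Individuals" 2 "Median" 3 "Parties"))
```

With this layout the scatter part no longer depends on which parties exist in a country, so the condition on iso_country could presumably be moved into a foreach loop over the values returned by levelsof iso_country.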

I hope the presentation of my problem was not too messy, and I thank you in advance for any fruitful suggestion to face the issue.

Best
Mattia
