  • Normalization of variables with scores in graph line

    Dear Statalist users,

    I'm a PhD student and a Stata beginner. My question may seem silly to many of you, but I have spent two days on this issue without solving it. I use a survey that records the income decile and electoral choice of every respondent. From a previous analysis I have a score for each party, which I need to use to "weight" the actual percentages of votes for each party: the vote of a given respondent should no longer count as 1 but as the score of the party he/she voted for.

    My objective is to draw a line graph with the weighted percentage of votes for a group of parties taken together on the y axis and the income deciles on the x axis. My code stops working once the score is applied.

    To draw the unweighted graph for, say, the two parties M5S and LN together, I wrote the following code, where votes for each party are stored in dummy variables (1 = yes; 0 = no):


    **** LN

    egen LN_income = total(LN), by(income)
    egen LN_income_norm = count(LN), by(income)
    gen LN_new = LN_income / LN_income_norm
    sort income

    ** error bar
    gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)

    ** to plot
    gen LN_new_low = LN_new - LN_new_error
    gen LN_new_high = LN_new + LN_new_error


    ********* M5S

    egen M5S_income = total(M5S), by(income)
    egen M5S_income_norm = count(M5S), by(income)
    gen M5S_new = M5S_income / M5S_income_norm
    sort income

    ** error bar
    gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)

    ** to plot
    gen M5S_new_low = M5S_new - M5S_new_error
    gen M5S_new_high = M5S_new + M5S_new_error


    **** sum of the two parties

    gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
    gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
    sort income
    ** error bar
    gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
    ** to plot
    gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
    gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error

    *** final plot (basic form with no options)

    line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
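
    Since total()/count() of a 0/1 dummy is just its mean, the unweighted shares above can also be obtained in one step. The following is a minimal sketch on made-up toy data: the seven respondents and the names LN_or_M5S, n_income, sum_share and sum_se are hypothetical and not taken from the survey. It also assumes that the combined share is treated as a single proportion over the respondents of each decile, so the standard error divides by that count once rather than by the sum of the two party counts.

    ```stata
    * Toy data (hypothetical): 3 respondents in decile 1, 4 in decile 2
    clear
    input income LN M5S
    1 1 0
    1 0 1
    1 0 0
    2 0 1
    2 1 0
    2 0 0
    2 0 0
    end

    * 1 if the respondent voted for either party, 0 otherwise
    gen byte LN_or_M5S = (LN == 1 | M5S == 1)

    * Mean of a dummy within a decile = vote share in that decile
    egen sum_share = mean(LN_or_M5S), by(income)

    * Number of respondents per decile (nonmissing LN)
    egen n_income = count(LN), by(income)

    * Standard error of one proportion over n_income respondents
    * (assumption: simple random sampling within each decile)
    gen sum_se = sqrt(sum_share * (1 - sum_share) / n_income)
    ```

    On this toy data the combined share is 2/3 in decile 1 and 0.5 in decile 2.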




    When I apply the score, the code becomes:


    ******************************** LN

    egen LN_income = total(LN), by(income)
    egen LN_income_norm = count(LN), by(income)
    gen LN_new = (LN_income / LN_income_norm) * 0.78
    sort income

    ** error bar
    gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)

    ** to plot
    gen LN_new_low = LN_new - LN_new_error
    gen LN_new_high = LN_new + LN_new_error


    ***************************** M5S

    egen M5S_income = total(M5S), by(income)
    egen M5S_income_norm = count(M5S), by(income)
    gen M5S_new = (M5S_income / M5S_income_norm) * 0.56
    sort income

    ** error bar
    gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)

    ** to plot
    gen M5S_new_low = M5S_new - M5S_new_error
    gen M5S_new_high = M5S_new + M5S_new_error

    **************** sum of the two parties
    gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
    gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
    sort income

    ** error bar
    gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
    ** to plot
    gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
    gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error

    *** final plot (basic form)

    line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
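
    One thing worth checking in the listing above: sum_M5S_LN is built from the unweighted ratios, not from LN_new and M5S_new, so the scores 0.78 and 0.56 never reach the plotted series. A minimal sketch of one way to carry the scores through, assuming the target is the sum of score-weighted votes divided by the number of respondents in the decile. The toy data and the names w_vote and sum_w are hypothetical.

    ```stata
    * Toy data (hypothetical): 3 respondents in decile 1, 4 in decile 2
    clear
    input income LN M5S
    1 1 0
    1 0 1
    1 0 0
    2 0 1
    2 1 0
    2 0 0
    2 0 0
    end

    * Each vote counts as its party's score instead of 1
    * (0.78 and 0.56 are the scores used in the post)
    gen w_vote = 0.78 * LN + 0.56 * M5S

    * Score-weighted share per decile:
    * sum of weighted votes / number of respondents
    egen sum_w = mean(w_vote), by(income)
    ```

    Here sum_w is (0.78 + 0.56)/3 in decile 1 and (0.78 + 0.56)/4 in decile 2. Whether this is the right normalization depends on what the weighted percentage is meant to represent.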




    The problem is with the normalization: the graph has a credible shape, but the normalised values are wrong. I concluded that I must be missing an extra step, namely normalising by the sum of the two parties*mean_score after obtaining percentage*score, but I got lost on how to do this (provided I'm right).
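
    To make the normalization question concrete, the quantities involved can be written out. The notation is an assumption introduced here, not part of the original code: $n_{p,d}$ is the number of respondents in decile $d$ who voted for party $p$, $N_d$ is the number of respondents in decile $d$, $s_p$ is the score of party $p$ (0.78 for LN, 0.56 for M5S), and $\mathcal{P}$ is the set of all parties. The third line is only one possible reading of "normalising", and it would need a score for every party, not just the two shown.

    ```latex
    % Unweighted combined share in decile d
    P_d = \frac{n_{LN,d} + n_{M5S,d}}{N_d}

    % Score-weighted share, denominator left unchanged
    W_d = \frac{s_{LN}\, n_{LN,d} + s_{M5S}\, n_{M5S,d}}{N_d}

    % One possible renormalization (assumption): divide by the total
    % weight of all votes cast in the decile
    \tilde{W}_d = \frac{s_{LN}\, n_{LN,d} + s_{M5S}\, n_{M5S,d}}
                       {\sum_{q \in \mathcal{P}} s_q\, n_{q,d}}
    ```

    With both scores below 1, $W_d \le P_d$ in every decile, so weighted values that sit below the unweighted ones are expected under the second definition.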

    I thank you in advance for your help,

    J.






