  • Create variable for number of times dichotomous variable is 1

    I want to create a line graph showing the rate of the event described by the dichotomous variable vote_rd over time; in other words, the graph should show how many times vote_rd == 1 in each year. For this, I believe I need a variable holding the number of times vote_rd == 1, but I am unsure how to create it. The dataset contains a number of countries, of which I intend to use only "DK", "AT", "FI", "FR", "SE" and "NL".

    Example of the data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float vote_rd double year str2 cntry
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    1 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    0 2002 "DK"
    end

  • #2
    Code:
    egen count = total(vote_rd), by(year)
    might be a start and

    Code:
    egen count = total(vote_rd) if inlist(cntry, "DK", "AT", "FI", "FR", "SE", "NL"), by(year)
    might be one step further.
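
    Because the stated goal is a line per country, a further step might be to count within each country and year rather than within year alone. The following is only a sketch on toy data; the variable name count_cy and the "US" row are made up for illustration and are not from the original dataset.

    ```stata
    * Toy data, assumed for illustration only
    clear
    input float vote_rd double year str2 cntry
    0 2002 "DK"
    1 2002 "DK"
    1 2002 "DK"
    1 2002 "AT"
    0 2002 "AT"
    1 2004 "DK"
    0 2002 "US"
    end

    * Number of times vote_rd == 1 within each country-year,
    * computed only for the six countries of interest
    * (count_cy is a hypothetical name; other countries get missing)
    egen count_cy = total(vote_rd) if inlist(cntry, "DK", "AT", "FI", "FR", "SE", "NL"), by(cntry year)

    list, sepby(cntry year)
    ```

    With by(year) alone, every country in a given year receives the same pooled total, so country-specific lines would coincide.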



    • #3
      Originally posted by Nick Cox
      Code:
      egen count = total(vote_rd), by(year)
      might be a start and

      Code:
      egen count = total(vote_rd) if inlist(cntry, "DK", "AT", "FI", "FR", "SE", "NL"), by(year)
      might be one step further.
      It seems to have worked: the frequencies of vote_rd and the values of count match perfectly. Thank you for your help.



      • #4
        The counts work well, but I have been getting some very messy graphs. I think I need to use gsort in this situation. Would you know how to sort the observations?



        • #5
          Show example data, resulting graphs and the code that produced them to get precise advice.



          • #6
            Here is a data example with randomly chosen observations, given that the dataset is quite large:

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(count_vrd vote_rd) double year float essround str2 cntry
            496 0 2002 1 "DK"
            496 0 2002 1 "DK"
            418 0 2008 4 "DK"
            418 0 2008 4 "DK"
            418 0 2008 4 "DK"
            418 0 2008 4 "DK"
            805 0 2012 6 "DK"
            418 0 2008 4 "FR"
            805 0 2012 6 "FR"
            805 0 2012 6 "FR"
            805 0 2012 6 "FR"
            496 0 2002 1 "NL"
            496 0 2002 1 "NL"
            496 1 2002 1 "NL"
            496 0 2002 1 "NL"
            418 1 2008 4 "NL"
            418 0 2008 4 "NL"
            329 1 2010 5 "NL"
            475 1 2014 7 "NL"
            475 0 2014 7 "NL"
            475 0 2014 7 "NL"
            496 0 2002 1 "AT"
            460 0 2006 3 "AT"
            460 0 2006 3 "AT"
            475 0 2014 7 "AT"
            475 0 2014 7 "AT"
            330 0 2004 2 "FI"
            330 0 2004 2 "FI"
            418 0 2008 4 "FI"
            805 0 2012 6 "FI"
             94 0 2016 8 "FI"
             94 0 2016 8 "FI"
             94 0 2016 8 "FI"
             94 0 2016 8 "FI"
             94 0 2016 8 "FI"
            330 0 2004 2 "SE"
            330 0 2004 2 "SE"
            418 0 2008 4 "SE"
             94 0 2016 8 "SE"
             94 0 2016 8 "SE"
            330 0 2004 2 "AT"
            330 0 2004 2 "AT"
            460 0 2006 3 "SE"
            460 0 2006 3 "SE"
            460 0 2006 3 "SE"
            460 0 2006 3 "NL"
            805 0 2012 6 "NL"
            250 0    . 8 "AT"
            250 1    . 8 "AT"
            250 0    . 8 "AT"
            end
            The code I was using to create a line graph was:
            Code:
            twoway (line count_vrd year if cntry == "AT") ///
            (line count_vrd year if cntry == "DK") ///
            (line count_vrd year if cntry == "FR") ///
            (line count_vrd year if cntry == "FI") ///
            (line count_vrd year if cntry == "SE") ///
            (line count_vrd year if cntry == "NL")
            I am definitely using the wrong method here, since what I am aiming to do is display the percentage of observations with vote_rd == 1 over time by country. That is why I first asked about creating a variable counting the number of times vote_rd == 1.

            The graph I obtained looks like this:
            [Attached graph]



            • #7
              That graph is clearly not what you want: it plots counts rather than the percentages you are after, the data are in the wrong sort order, and the legend is useless. Superimposed line charts often result in spaghetti, so here is one guess at what you (should) want. See also

              Code:
              . search spaghetti, sj
              
              Search of official help files, FAQs, Examples, and Stata Journals
              
              SJ-21-2 gr0087  . . Front-and-back plots to ease spaghetti and paella problems
                      (help fabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
                      Q2/21   SJ 21(2):539--554
                      explores front-and-back plots, in which each subset of data
                      is shown separately with the other subsets as backdrop
              
              SJ-19-4 gr0080  . . . . . .  Some simple devices to ease the spaghetti problem
                      . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                      Q4/19   SJ 19(4):989--1008                               (no commands)
                      gives guidance on avoiding the spaghetti problem in graphics
                      (where multiple time series or other functional traces show
                      mostly a tangled mess)
              Code:
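              * percentage of observations with vote_rd == 1 within each country-year
              egen pc_vote = mean(100 * vote_rd), by(cntry year)
              label var pc_vote "% voting yes"
              * tag one observation per country-year so each point is plotted once
              egen tag = tag(cntry year)

              * install from Stata Journal first (fabplot, package gr0087)
              fabplot line pc_vote year if tag, by(cntry) frontopts(lwidth(*3))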
              [Attached image: vote_rd.png]
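
              If fabplot is not installed, something similar can be approximated with official commands only. The following is a rough sketch on made-up toy data, not the thread's dataset: it computes the percentage per country-year as above, then draws one panel per country with by(), and also an overlay in which the sort option of line fixes the ordering problem and legend(order()) gives readable labels.

              ```stata
              * Toy data, assumed for illustration only
              clear
              input float vote_rd double year str2 cntry
              0 2002 "DK"
              1 2002 "DK"
              0 2004 "DK"
              0 2004 "DK"
              1 2002 "NL"
              1 2002 "NL"
              0 2004 "NL"
              1 2004 "NL"
              end

              * percentage with vote_rd == 1 per country-year, one tagged row each
              egen pc_vote = mean(100 * vote_rd), by(cntry year)
              label var pc_vote "% with vote_rd == 1"
              egen tag = tag(cntry year)

              * small multiples: one panel per country, points connected in year order
              twoway line pc_vote year if tag, sort by(cntry)

              * overlay: sort each line by year and label the legend explicitly
              twoway (line pc_vote year if tag & cntry == "DK", sort) ///
                     (line pc_vote year if tag & cntry == "NL", sort), ///
                     legend(order(1 "DK" 2 "NL")) ytitle("% with vote_rd == 1")
              ```

              The sort option orders each line by year at plotting time, so no separate sort or gsort of the dataset is needed for the graph.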
