  • Generating New Variable in Time Series Data

    Hello and thank you for taking the time to help.

    I am working on an analysis of voter roll purges and want to create a new variable to regress against CCES turnout data, but I haven't used Stata in over a year and my quant class didn't cover this type of variable generation.

    The variables I have consist of the total number of removed voters and the total number of registered voters for 50 states (minus a few removed for lack of data) in each election year (2010, 2012, 2014, 2016, and 2018). I'd like to divide the number of purged voters (total_removed) in a given election year (for example, 2018) by the total registered voters (total_reg) in the prior election year (in this case, 2016) in each state. I'm pretty far out of my depth, and I've guessed at a few formulas, but nothing has worked.

    Do I need to create dummy variables for each pair of years? As in, create a variable for each 2018-2016 pair, then 2016-2014, etc.? That's the only thing I can think of, but it also sounds very time-intensive and clunky for my regression.

    Thank you for any help you can provide.

  • #2
    Well, you don't show any example data, so the code will have to be based on several assumptions. I assume that there is a variable, state, that identifies states, and that it is a numeric variable. I assume that there is a variable, year, that identifies year. Consistent with what you say in #1, I also assume that the year is always even, so that the "prior" year is two years earlier.

    Code:
    xtset state year, delta(2)
    gen wanted = total_removed/L1.total_reg
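    Because of delta(2), L1.total_reg is the value of total_reg from two years earlier in the same state. See -help tsvarlist- for more information on lagged variables and other time-series operators.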

    If your state variable is string rather than numeric, -encode- it. -help encode-
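
    As a rough illustration of the steps above, here is a minimal sketch on made-up numbers. The variable name state_name and all of the values are assumptions for illustration only; they are not taken from your data.

    ```stata
    * Toy data (invented values): two states, two election years
    clear
    input str2 state_name int year long(total_removed total_reg)
    "AL" 2016 100 1000
    "AL" 2018 150 1200
    "AK" 2016  20  400
    "AK" 2018  30  500
    end

    * -xtset- needs a numeric panel variable, so encode the string
    encode state_name, gen(state)

    * Observations are two years apart, so one lag = two years
    xtset state year, delta(2)

    * Purged voters this election over registered voters last election
    gen wanted = total_removed / L1.total_reg

    list state_name year total_removed total_reg wanted, sepby(state)
    ```

    In this sketch, wanted is missing for 2016 in both states, because there is no earlier election in the toy data to take the lag from. The same will be true for the first year of each state in real data.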

    If this does not work in your data, please use the -dataex- command to show example data when you post back. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      Thank you for the help! Your advice worked: I created the variable and appended the data into a single sheet.

      I've run into a new issue that you may be able to help with. Using survey data from CCES, I need to determine turnout by state per election. I've tried using xtset with turnout as my panel variable and year as my time variable, but I get an error message that says "repeated time values within panel". Because it's survey data, there are numerous observations per year. I've recoded the turnout data so that "0" means the respondent did not vote and "1" means they did, so the mean of vv_turnout_gvm is my turnout rate, but I cannot figure out how to break that down so I can see the turnout rate in a given state during a given election. I've also created a year_code so that each year can be recorded as 1, 2, etc. instead of the actual year, if that helps.


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long(year vv_turnout_gvm state)
      2010 1 42
      2010 1 12
      2010 1 47
      2010 1 28
      2010 1  8
      2010 1  6
      2010 1 34
      2010 1 37
      2010 1 48
      2010 1 13
      2010 1  6
      2010 1 34
      2010 1 27
      2010 1  8
      2010 1 12
      2010 1 17
      2010 1 48
      2010 1 34
      2010 1 46
      2010 1 41
      2010 1 32
      end
      label values vv_turnout_gvm vv_turnout_gvm
      label def vv_turnout_gvm 1 "Voted", modify
      label values state state
      label def state 1 "Alabama", modify
      label def state 2 "Alaska", modify
      label def state 4 "Arizona", modify
      label def state 6 "California", modify
      label def state 8 "Colorado", modify
      label def state 9 "Connecticut", modify
      label def state 12 "Florida", modify

      Thank you for any guidance you can share. Please let me know if more information could help.



      • #4
        The example data you show and your description leave some things unclear. Is each observation a single voter? If not, what is it?

        It doesn't make much sense to set vv_turnout_gvm as the panel variable. A panel variable is a variable that identifies groups of observations, and within that group you have multiple observations (typically spaced out over time). So, a sensible way to declare a panel variable here would be to declare it to be state.

        Now, if there is just one (or no) observation per state per year in your data, you can -xtset state year-. But if the data set is a data set of voters, there will be multiple observations for each combination of state and year. In that case you just -xtset state- and that's it.
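
        If you are unsure which case applies, one way to check is sketched below on made-up voter-level rows. The toy values are assumptions for illustration, and delta(2) assumes observations are two years apart, as in #2.

        ```stata
        * Toy voter-level data (invented): several rows per state-year
        clear
        input int year byte vv_turnout_gvm byte state
        2010 1 6
        2010 0 6
        2010 1 8
        2012 1 6
        2012 1 8
        end

        * How many observations share the same state-year combination?
        duplicates report state year

        * -isid- errors out if state and year do not uniquely identify rows
        capture isid state year
        if _rc == 0 {
            * at most one row per state per year: a time variable is allowed
            xtset state year, delta(2)
        }
        else {
            * repeated state-year rows: declare the panel variable only
            xtset state
        }
        ```

        In this toy data, state 6 appears twice in 2010, so -isid- fails and only the panel variable is declared.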

        If the observations in your data set are voters, then to get the turnout rate in each state in each year (there is no variable identifying elections, so I suppose year is what you meant), you can run:

        Code:
        by state year, sort: egen turnout_rate = mean(vv_turnout_gvm)
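
        To see what that line produces, here is a small sketch on invented voter-level rows. The values and the helper variable first are assumptions for illustration; it also assumes vv_turnout_gvm is coded 0/1 as described in #3.

        ```stata
        * Toy voter-level data (invented values), turnout coded 0/1
        clear
        input int year byte vv_turnout_gvm byte state
        2010 1 6
        2010 0 6
        2010 1 6
        2010 1 8
        2010 0 8
        2012 1 6
        2012 1 6
        end

        * Mean of a 0/1 variable within state-year = turnout rate
        by state year, sort: egen turnout_rate = mean(vv_turnout_gvm)

        * Flag one row per state-year so each rate is listed once
        egen byte first = tag(state year)
        list state year turnout_rate if first, noobs
        ```

        Every voter in the same state and year gets the same turnout_rate, so the tag is only there to avoid listing the rate once per respondent.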



        • #5
          by state year, sort: egen turnout_rate = mean(vv_turnout_gvm)
          That seems to have worked. The data sample I pasted in my last comment was individual voters. Thank you for the help, you're a lifesaver!
