Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating variable storing means by state and year

    Greetings,

    I'm running Stata 15.1 on OSX. My ultimate objective is to examine the extent that each US state's political ideology has varied over time. I'm unsure how to go about this though. My current approach--which is computationally demanding, given the size of the dataset (N=392,755)--is to do the following:

    Code:
    egen ideo_AL=wtmean(ideo5) if state==1, by(year) weight(weight_cumulative)
    Code:
    egen ideo_AK=wtmean(ideo5) if state==2, by(year) weight(weight_cumulative)
    And so on and so forth. The obvious issue with this approach (which I'm hoping is surmountable) is that I'm forced to create separate mean-variables for each state. This is both time (and CPU) consuming and also leaves me with dozens of variables. I'm thus wondering if there is a way to store all the annual state means in a single variable. Thanks in advance for your input!

    Here is some sample data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(year ideo5 state) double weight_cumulative
    2006 2 37 1.3460122911501249
    2006 3 39  .7038786650023755
    2006 2 34 1.1582975461680245
    2006 2 17 1.0166128334258435
    2006 2 36  .6563246369597865
    2006 2 48   .609829636694957
    2006 4 27  .5647936660597331
    2006 2 32   .609829636694957
    2006 2 48 .24095777953558578
    2006 3 24  .8015982440280928
    2006 4 12  .5647936660597331
    2006 3 18 1.1582975461680245
    2006 5  6  .6563246369597865
    2006 5  6 .44513128169733296
    2006 3 41  1.448635774935967
    2006 2 26 .24095777953558578
    2006 3 53   .609829636694957
    2006 4 27 .44513128169733296
    2006 1 18   .609829636694957
    2006 4 36  .5647936660597331
    2006 3 20  .5647936660597331
    2006 6 42 1.1582975461680245
    2006 2  8 1.1582975461680245
    2006 4 19  .5247828540709243
    2006 5  1 1.1582975461680245
    2006 3 26 .24095777953558578
    2006 4 39  .8958343645587167
    2006 2 13   .609829636694957
    2006 3 13 1.0941220443843926
    2006 4 19   .270499072840533
    2006 4 18 1.1582975461680245
    2006 5 32  1.448635774935967
    2006 4 12  .5647936660597331
    2006 4  6 .44513128169733296
    2006 5  6  .5647936660597331
    2006 3 48   .609829636694957
    2006 2  6  .5247828540709243
    2006 3 18   .609829636694957
    2006 1  6  1.448635774935967
    2006 4 48 .24095777953558578
    2006 2 18 .24095777953558578
    2006 4 38   .609829636694957
    2006 4 17 1.2466091687511938
    2006 4 48  1.448635774935967
    2006 5  6 1.3460122911501249
    2006 3 26  .6563246369597865
    2006 2  5  .8323721437736751
    2006 2  6  .6563246369597865
    2006 3 39 .47906925029552666
    2006 3 53 .27526857046048253
    2006 3 48 .24095777953558578
    2006 3 53 .24095777953558578
    2006 5 53  1.448635774935967
    2006 3 48 .24095777953558578
    2006 2 36   .609829636694957
    2006 2  6 .24095777953558578
    2006 4 42 .44513128169733296
    2006 3 26   .609829636694957
    2006 4 26 .24095777953558578
    2006 2 27  .8323721437736751
    2006 2  6 .24095777953558578
    2006 3 18  .5247828540709243
    2006 3  4 .44513128169733296
    2006 3 13 .44513128169733296
    2006 3 39   .609829636694957
    2006 6 40 .44513128169733296
    2006 2 39 1.3460122911501249
    2006 1 37 .47906925029552666
    2006 2 48  .5247828540709243
    2006 3 47 .24095777953558578
    2006 3  6  .8958343645587167
    2006 3  6 .24095777953558578
    2006 4 48  .8269102220986886
    2006 4 49 .24095777953558578
    2006 4 27 1.3460122911501249
    2006 3  4   .609829636694957
    2006 2 33  .8323721437736751
    2006 3 39 .24631205437842263
    2006 5 37  .5247828540709243
    2006 4 55   .609829636694957
    2006 4  6 1.3460122911501249
    2006 4 37   .609829636694957
    2006 3 18  .5647936660597331
    2006 4 20  1.448635774935967
    2006 4 34 .24095777953558578
    2006 4 17 1.2466091687511938
    2006 4 48 .44513128169733296
    2006 3 39 .24095777953558578
    2006 3 36  .6563246369597865
    2006 4 17  .7038786650023755
    2006 5 12 1.1582975461680245
    2006 4 26 .44513128169733296
    2006 4  4 .24095777953558578
    2006 5 48 .24095777953558578
    2006 6 48 .24095777953558578
    2006 2 53 .24095777953558578
    2006 4 26 .24095777953558578
    2006 4  6  .6563246369597865
    2006 4 31   .270499072840533
    2006 3  6  1.048714303130492
    end
    label values ideo5 ideo5
    label def ideo5 1 "Very Liberal", modify
    label def ideo5 2 "Liberal", modify
    label def ideo5 3 "Moderate", modify
    label def ideo5 4 "Conservative", modify
    label def ideo5 5 "Very Conservative", modify
    label def ideo5 6 "Not Sure", modify
    label values state state
    label def state 1 "Alabama", modify
    label def state 4 "Arizona", modify
    label def state 5 "Arkansas", modify
    label def state 6 "California", modify
    label def state 8 "Colorado", modify
    label def state 12 "Florida", modify
    label def state 13 "Georgia", modify
    label def state 17 "Illinois", modify
    label def state 18 "Indiana", modify
    label def state 19 "Iowa", modify
    label def state 20 "Kansas", modify
    label def state 24 "Maryland", modify
    label def state 26 "Michigan", modify
    label def state 27 "Minnesota", modify
    label def state 31 "Nebraska", modify
    label def state 32 "Nevada", modify
    label def state 33 "New Hampshire", modify
    label def state 34 "New Jersey", modify
    label def state 36 "New York", modify
    label def state 37 "North Carolina", modify
    label def state 38 "North Dakota", modify
    label def state 39 "Ohio", modify
    label def state 40 "Oklahoma", modify
    label def state 41 "Oregon", modify
    label def state 42 "Pennsylvania", modify
    label def state 47 "Tennessee", modify
    label def state 48 "Texas", modify
    label def state 49 "Utah", modify
    label def state 53 "Washington", modify
    label def state 55 "Wisconsin", modify

  • #2
    Just to elaborate a bit more: the reason I'm not going about this via a regression model (i.e. reg ideo5 i.year##i.state) is because I want to create a dataset of aggregate variables. Keeping the respondent-level observations naturally slows down estimation. I also want to append state-level data from other datasets to determine whether they're related to temporal variation in state-level ideology. In other words, I'm trying to assemble a panel dataset of aggregate variables.

    Comment


    • #3
      Code:
      by year state, sort:egen ideo_AL=wtmean(ideo5), weight(weight_cumulative)

      Comment


      • #4
        You are using the community-contributed egen function wtmean from the package _gwtmean on SSC, as you are asked to explain.

        Unusually this package has no help file, but in comments in the .ado file the syntax is explained as

        Code:
        egen newvar = wtmean(exp), by(byvar) weight(wgt_expr)
        which seems to imply that only a single by() variable is allowed. But that is misleading because the precise syntax for the option is

        Code:
        by(string) 
        so multiple variables are allowed. (If the string were not a legal varlist, calling the function would fail.) Hence this would work as well as Clyde's fine solution:

        Code:
        egen ideo = wtmean(ideo5), weight(weight_cumulative) by(state year)
        where we don't need _AL (flagging Alabama). Incidentally, 55 states? Perhaps you're including DC, Puerto Rico, etc., but make sure that one state does not appear in multiple versions.





        Comment


        • #5
          For weighted average means, you can also use asgen (available from SSC). The following code will find the weighted average mean for each state in each year

          Code:
          ssc install asgen
          help asgen
          bys year state: asgen ideo_AL2=ideo5, weight(weight_cumulative)
          Regards
          --------------------------------------------------
          Attaullah Shah, PhD.
          Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
          FinTechProfessor.com
          https://asdocx.com
          Check out my asdoc program, which sends outputs to MS Word.
          For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

          Comment

          Working...
          X