I am using Stata version 15.1 and I have 47 variables and 1,010 observations in my dataset. I am doing educational research with 112 colleges included (This is setup as a panel data set). However, I have expanded it so that I can capture the average prices for all colleges in set categories. I now have 69,999 observations in my data set.
My goal:
I am trying to create a variable that will represents the weighted average price of colleges per state within a year range on my panel data set. For example the average price that would fill for this field would account for 40% of the average for the state the college is in plus 60% of the average of states that the college is not in.
Articles and forums I have reviewed
https://www.statalist.org/forums/for...weight-my-data
https://www.stata.com/manuals13/svyp...tification.pdf
These were very helpful, but it left me unsure how to account for set weights when I want it to account for the current state for the year
What I have been able to do so far that has been effective:
I have been able to generate a average per year for the following categories- 1—Public, four-year or above 2—Private not-for-profit, four-year or above 3—Private for-profit, four-year or above 4—Public, two-year. I have all the individual colleges in the data set and I know their prices. The variable for the price I want to average is "avgprice_per_year."
What I am trying with code modification narrative:
I now need to take this one step further to account for a weighted average.
My first thought was to modify the code that is working to provide an average per year per type (see below):
As I consider how to account for the averages on a per-state basis my best assessment is to simply add state like the following and then this would put the value per state per year with each college throughout my data set.
For example, Anywhere University Arizona 2008 would have the same value as Unknown University Arizona 2008. However, Questionable University Arkansas 2008 would have a different average since it is a different state even though it is the same year.
This is the code I was thinking would work:
Still Problematic
This unfortunately is not working and I am not sure how to account for the weighted average I want for each college in each state per year. I would be most grateful for some guidance in this regard.
Data Ex information (I populated the field I am struggling with, “weighted_average_60_40” as 1)
My goal:
I am trying to create a variable that will represents the weighted average price of colleges per state within a year range on my panel data set. For example the average price that would fill for this field would account for 40% of the average for the state the college is in plus 60% of the average of states that the college is not in.
Articles and forums I have reviewed
https://www.statalist.org/forums/for...weight-my-data
https://www.stata.com/manuals13/svyp...tification.pdf
These were very helpful, but it left me unsure how to account for set weights when I want it to account for the current state for the year
What I have been able to do so far that has been effective:
I have been able to generate a average per year for the following categories- 1—Public, four-year or above 2—Private not-for-profit, four-year or above 3—Private for-profit, four-year or above 4—Public, two-year. I have all the individual colleges in the data set and I know their prices. The variable for the price I want to average is "avgprice_per_year."
What I am trying with code modification narrative:
I now need to take this one step further to account for a weighted average.
My first thought was to modify the code that is working to provide an average per year per type (see below):
sort year
forval sec = 1/6 {
by year : egen _sector_`sec' = mean(avg_tuition_fees_ft) if sector==`sec'
by year : egen sector_`sec' = max(_sector_`sec')
local lab: label sectordf `sec'
label var sector_`sec' `"`lab'"'
}
drop _sector*
*****************Rename******************
rename sector_1 Public_4yr_or_above
rename sector_2 Private_NFP_4yr_or_above
rename sector_3 Private_FP_4yr_or_above
rename sector_4 Public_2yr
rename sector_5 Private_NFP_2yr
rename sector_6 Private_FP_2yr
forval sec = 1/6 {
by year : egen _sector_`sec' = mean(avg_tuition_fees_ft) if sector==`sec'
by year : egen sector_`sec' = max(_sector_`sec')
local lab: label sectordf `sec'
label var sector_`sec' `"`lab'"'
}
drop _sector*
*****************Rename******************
rename sector_1 Public_4yr_or_above
rename sector_2 Private_NFP_4yr_or_above
rename sector_3 Private_FP_4yr_or_above
rename sector_4 Public_2yr
rename sector_5 Private_NFP_2yr
rename sector_6 Private_FP_2yr
For example, Anywhere University Arizona 2008 would have the same value as Unknown University Arizona 2008. However, Questionable University Arkansas 2008 would have a different average since it is a different state even though it is the same year.
This is the code I was thinking would work:
sort year state
forval sec = 1/6 {
by year state : egen _sector_`sec' = mean(avg_tuition_fees_ft) if sector==`sec'
by year state : egen sector_`sec' = max(_sector_`sec')
local lab: label sectordf `sec'
label var sector_`sec' `"`lab'"'
}
drop _sector*
*****************Rename******************
rename sector_1 Public_4yr_or_above
rename sector_2 Private_NFP_4yr_or_above
rename sector_3 Private_FP_4yr_or_above
rename sector_4 Public_2yr
rename sector_5 Private_NFP_2yr
rename sector_6 Private_FP_2yr
forval sec = 1/6 {
by year state : egen _sector_`sec' = mean(avg_tuition_fees_ft) if sector==`sec'
by year state : egen sector_`sec' = max(_sector_`sec')
local lab: label sectordf `sec'
label var sector_`sec' `"`lab'"'
}
drop _sector*
*****************Rename******************
rename sector_1 Public_4yr_or_above
rename sector_2 Private_NFP_4yr_or_above
rename sector_3 Private_FP_4yr_or_above
rename sector_4 Public_2yr
rename sector_5 Private_NFP_2yr
rename sector_6 Private_FP_2yr
Still Problematic
This unfortunately is not working and I am not sure how to account for the weighted average I want for each college in each state per year. I would be most grateful for some guidance in this regard.
Data Ex information (I populated the field I am struggling with, “weighted_average_60_40” as 1)
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input long unitid int year str20 state double avg_tuition_fees_ft float(Public_4yr_or_above Private_NFP_4yr_or_above Private_FP_4yr_or_above Public_2yr Private_NFP_2yr Private_FP_2yr weighted_average_60_40) 101189 2008 "Alabama" 12700 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 101189 2009 "Alabama" 13690 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 101189 2010 "Alabama" 14820 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 101189 2011 "Alabama" 15880 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 101189 2012 "Alabama" 17380 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 101189 2013 "Alabama" 18230 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 101189 2014 "Alabama" 18750 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 101189 2015 "Alabama" 19280 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 101189 2016 "Alabama" 20130 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 105899 2008 "Arizona" 14440 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 105899 2009 "Arizona" 15177 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 105899 2010 "Arizona" 16174 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 105899 2011 "Arizona" 18968 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 105899 2012 "Arizona" 21040 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 105899 2013 "Arizona" 21832 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 105899 2014 "Arizona" 22466 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 105899 2015 "Arizona" 23110 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 105899 2016 "Arizona" 23896 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 107044 2008 "Arkansas" 13130 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 107044 2009 "Arkansas" 13580 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 107044 2010 "Arkansas" 14040 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 107044 2011 "Arkansas" 14610 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 107044 2012 "Arkansas" 15240 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 107044 2013 "Arkansas" 16170 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 107044 2014 "Arkansas" 17040 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 107044 2015 "Arkansas" 17805 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 107044 2016 "Arkansas" 18440 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 107141 2008 "Arkansas" 18066 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 107141 2009 "Arkansas" 18880 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 107141 2010 "Arkansas" 19730 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 107141 2011 "Arkansas" 20766 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 107141 2012 "Arkansas" 21774 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 107141 2013 "Arkansas" 22734 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 107141 2014 "Arkansas" 23588 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 107141 2015 "Arkansas" 24468 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 107141 2016 "Arkansas" 25324 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 109785 2008 "California" 26640 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 109785 2009 "California" 27750 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 109785 2010 "California" 28800 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 109785 2011 "California" 29940 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 109785 2012 "California" 31076 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 109785 2013 "California" 32306 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 109785 2014 "California" 33096 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 109785 2015 "California" 34754 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 109785 2016 "California" 36120 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 110097 2008 "California" 26424 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 110097 2009 "California" 27744 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 110097 2010 "California" 28852 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 110097 2011 "California" 29908 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 110097 2012 "California" 31004 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 110097 2013 "California" 32142 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 110097 2014 "California" 33322 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 110097 2015 "California" 34498 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 110097 2016 "California" 36696 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 110361 2008 "California" 22330 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 110361 2009 "California" 23266 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 110361 2010 "California" 24654 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 110361 2011 "California" 25808 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 110361 2012 "California" 26900 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 110361 2013 "California" 28122 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 110361 2014 "California" 29422 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 110361 2015 "California" 30384 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 110361 2016 "California" 31372 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 112075 2008 "California" 23930 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 112075 2009 "California" 24950 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 112075 2010 "California" 26000 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 112075 2011 "California" 27300 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 112075 2012 "California" 28500 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 112075 2013 "California" 29630 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 112075 2014 "California" 30640 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 112075 2015 "California" 31690 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 112075 2016 "California" 32780 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 112084 2008 "California" 20980 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 112084 2009 "California" 21610 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 112084 2010 "California" 22692 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 112084 2011 "California" 23814 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 112084 2012 "California" 25108 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 112084 2013 "California" 25888 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 112084 2014 "California" 27090 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 112084 2015 "California" 19791 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 112084 2016 "California" 21650 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 114813 2008 "California" 23202 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 114813 2009 "California" 23898 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 114813 2010 "California" 23904 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 114813 2011 "California" 24790 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 114813 2012 "California" 25336 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 114813 2013 "California" 25716 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 114813 2014 "California" 26638 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 114813 2015 "California" 27954 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 114813 2016 "California" 29370 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 117104 2008 "California" 11650 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 117104 2009 "California" 12500 6265.079 20407.69 15560.724 2956.39 9989.72 14087.479 1 117104 2010 "California" 12560 6608.007 21402.32 15243.452 3165.12 10624.734 13521.01 1 117104 2011 "California" 12500 7072.479 22200.95 15185.118 3381.933 11853.928 13948.85 1 117104 2012 "California" 13100 7360.464 23126.385 15275.388 3519.0115 12151.585 14208.567 1 117104 2013 "California" 12500 7573.859 23792.57 15208.097 3661.56 12504.496 14351.418 1 117104 2014 "California" 12860 7750.858 24454.877 15483.184 3777.615 12589.093 14354.9 1 117104 2015 "California" 13320 7991.114 25185.45 15604.883 3924.161 12476.803 14654.785 1 117104 2016 "California" 14138 8037.394 26062.97 15333.85 4075.6096 13207.118 14619.66 1 120537 2008 "California" 21890 5922.394 19520.313 15352.144 2820.107 9714.963 12881.91 1 end label values unitid unitiddf label values year yeardf label values avg_tuition_fees_ft tuition_fees_ftdf
Comment