Say my question is: has the income gap between whites and blacks decreased over time?
I start with a simple analysis of the raw data: I can tabulate the mean wage for each group in each year and compute the difference. I can also test whether this difference is significant with a t-test, restricting each test to the year being analyzed.
The issues are:
1. Plotting this requires some data "destruction" with table, replace or some sort of collapsing, which I would have loved to avoid. I want a plot where the x axis is time and the y axis is the difference in wages between the groups. Is there any way to achieve this?
2. Running a t-test for each year separately cannot answer whether the difference narrows or widens over time. Say that in 1990 the difference is 1.8 and it is statistically significant, and in 1991 the difference is 1.799 and it is also statistically significant. Is the difference between 1.8 and 1.799 statistically significant? Separate t-tests obviously cannot answer that.
Code example:
Code:
clear all
webuse nlswork, clear
drop if race == 3

* estimating the difference in means
bysort year: ttest ln_wage, by(race)

* graphing the difference
collapse ln_wage, by(race year)
reshape wide ln_wage, i(year) j(race)
rename ln_wage1 ln_wage_white
rename ln_wage2 ln_wage_black
gen diff = ln_wage_white - ln_wage_black
twoway line diff year, yline(0) ylabel(-0.2(0.1)0.2)
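One possible way to approach both issues at once, offered as a sketch and not a definitive answer, is to fit a single regression with race-by-year interactions. The interaction coefficients then measure how the gap in each year differs from the gap in the base year, so changes in the gap can be tested directly, and margins with marginsplot can draw the year-by-year gap without collapsing or reshaping the data. The choices below are assumptions: the person-years are pooled, standard errors are clustered on idcode because the same women appear in several years, and years 69 and 70 are picked only as an illustration (the years in this dataset are coded 68 to 88). Note that the plotted gap is black minus white, the opposite sign of diff in the code above.

```stata
* Sketch only: pooled regression with person-clustered standard errors (an assumption)
webuse nlswork, clear
drop if race == 3

* Year-specific gaps: the race#year interactions let the gap differ by year
regress ln_wage i.race##i.year, vce(cluster idcode)

* Joint test that the gap is the same in every year
testparm i.race#i.year

* Is the gap in year 70 different from the gap in year 69?
lincom 2.race#70.year - 2.race#69.year

* Estimated gap (black minus white) in each year, with confidence intervals,
* plotted without collapsing or reshaping the data
margins year, dydx(race)
marginsplot, yline(0)

* Alternative: a single linear trend in the gap;
* the coefficient on 2.race#c.year is the yearly change in the gap
regress ln_wage i.race##c.year, vce(cluster idcode)
```

If the collapse-based plot is still preferred, wrapping that part of the code in preserve and restore brings the original data back afterwards.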