Hello Statalist,
I have what is probably an easy problem to solve, but I can't figure out how to do it efficiently in Stata.
I am working with a dataset of ethnic group power relationships in which the unit of analysis is the ethnic group. I need to integrate parts of it into my primary dataset, whose unit of analysis is the country.
The ethnic group data currently record each group over a date range (from/to) rather than with one line per year, as a panel/time-series dataset would. Here is an example of the data in its current format:
gwid | statename | from | to | group | groupid | gwgroupid | umbrella | size | status | reg_aut |
2 | United States of America | 1946 | 1965 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1946 | 1965 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1946 | 1965 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
The columns I care about are from and to. I need to transform the data into a yearly time series for each of these groups. This is what I'd like the data to look like:
gwid | statename | year | group | groupid | gwgroupid | umbrella | size | status | reg_aut |
2 | United States of America | 1946 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1947 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1948 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1949 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1950 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1951 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1952 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1953 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1954 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1955 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1956 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1957 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1958 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1959 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1960 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1961 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1962 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1963 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1964 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1965 | Whites | 1000 | 201000 | | 0.69099998 | MONOPOLY | |
2 | United States of America | 1946 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1947 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1948 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1949 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1950 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1951 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1952 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1953 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1954 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1955 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1956 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1957 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1958 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1959 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1960 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1961 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1962 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1963 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1964 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1965 | African Americans | 3000 | 203000 | | 0.124 | DISCRIMINATED | FALSE |
2 | United States of America | 1946 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1947 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1948 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1949 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1950 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1951 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1952 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1953 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1954 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1955 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1956 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1957 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1958 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1959 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1960 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1961 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1962 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1963 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1964 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
2 | United States of America | 1965 | American Indians | 5000 | 205000 | | 0.0078 | POWERLESS | TRUE |
Of course, this was relatively easy to do in a spreadsheet where the data currently reside. However, there are over 8000 groups that I need to do this for.
I was wondering if anyone has an idea of how I could use Stata code to transform the date ranges into yearly time-series data. Everything will stay in one dataset; I'm not breaking the individual panels out into their own datasets. The number of observations will, of course, increase substantially, but it will make integrating the data into my primary dataset much easier. I've been thinking about this for a few days, but I haven't come up with even a way to approach the problem, as I've never encountered this type of transformation before.
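For concreteness, here is a minimal sketch of the kind of transformation described above, not a definitive solution. It assumes from and to are numeric year variables and that each row of the original data is one group-period. The toy data keep only a few of the columns shown above, and spell_id is a hypothetical helper variable introduced for illustration.

```stata
* Toy data mirroring the layout above (only a subset of the columns)
clear
input long gwgroupid int(from to) float size str13 status
201000 1946 1965 0.691  "MONOPOLY"
203000 1946 1965 0.124  "DISCRIMINATED"
205000 1946 1965 0.0078 "POWERLESS"
end

* Tag each original row so its copies can be identified later
* (spell_id is a hypothetical helper, not part of the original data)
gen long spell_id = _n

* Replace each row with one copy per year in its range, endpoints included
expand to - from + 1

* Number the copies within each original row and turn that into a year
bysort spell_id: gen int year = from + _n - 1

drop from to spell_id
order gwgroupid year
sort gwgroupid year

* Sanity checks: 3 groups x 20 years, and one row per group-year
assert _N == 60
isid gwgroupid year
```

One caveat with this approach: expand leaves a row as a single copy when the expression is missing or less than 1, so rows where from or to is missing, or where to is earlier than from, would need to be checked separately. If the same group appears in several periods whose endpoints overlap (for example one period ending and the next starting in the same year), isid would flag the duplicate group-years.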
Thanks for any suggestions. I look forward to your ideas!
Best,
Bob