
  • repeated time values within panel r(451)

    Hello, this is my first post! I am using YTS data from the CDC; I'll put a link down below. I am trying to run a fixed-effects regression on my data, but when I run xtset Stateid YEAR I get error r(451). From what I can tell, it is saying that whenever the state and year are the same there is a repeated time value, even though the rest of the data in the row is different. If someone knows what the problem is or how I can work around it, that would be great!

    If you need any more details, please let me know. Thank you!

    What I am trying to do in Stata:
    xtset Stateid YEAR
    xtreg Data_Value policy Gender Education i.YEAR, fe robust

    policy = 1 when LocationAbbr is "CA" or "HI" and YEAR > 2015, and 0 otherwise

    Preview of the data as I currently have it:
    [Image: Screenshot 2021-11-20 093400.jpg]


    Original dataset:
    https://chronicdata.cdc.gov/Survey-D...Data/4juz-x2tp

  • #2
    It seems then that
    Code:
    xtset Stateid
    without specifying the year will be appropriate for your data.



    • #3
      Please read the forum FAQ, with particular emphasis on #12, for excellent advice on the most helpful ways to show example data, and other advice for improving your chances of getting a timely and useful response to your question. Your screenshot is, on my computer, unreadable--this is a common problem with screenshots on this forum. Even if it were readable, it would not be helpful for many purposes because there is no way to import a screenshot into Stata if it becomes necessary to work with the data to try out alternative code in order to answer your question. In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
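
      For instance, a minimal sketch of a -dataex- call for this thread might look like the following. The variable names are taken from #1 and are assumptions, since they cannot be verified from the screenshot.

      ```stata
      * Show the first 20 observations of the variables used in the model.
      * Variable names come from post #1 and are assumed to exist as written.
      dataex Stateid YEAR Data_Value policy Gender Education in 1/20
      ```

      Pasting the resulting output between code delimiters lets others recreate the example data exactly.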

      That said, in this case, looking at the description of the data set found at the link you provided makes it possible to understand what is going on and suggest a solution.

      Your data set contains multiple observations per state-year combination because the observations within each combination reflect different subpopulations, such as sex and race. They also reflect different "topics" and may be denominated in different units. Running any kind of regression analysis on a heterogeneous hodge-podge of data like this would be a serious waste of time. Now, it may be that you have already extracted from the full data set a subset that refers to a single "topic" (or perhaps a small number of highly related topics). And you may also have selected the particular combination(s) of sex and race that you are interested in. This must be done actively: "Male," "Female," and "All" are all in the data set together. You could keep male, or female, or both if you want to examine the effect of sex in your models. Or, if sex effects are of no interest to your research questions, you could keep just the "All" rows and drop the male and female rows. But there is no analysis in which it would make sense to retain all of "Male," "Female," and "All."
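
      A minimal sketch of that selection step follows. It assumes that the CDC file's variable names (TopicDesc, Gender) carry over to your data and that they are still string variables; I cannot confirm either from here, so check with -tabulate- first. The topic value is only a placeholder, and the label for the combined rows may read "All" or "Overall" in your copy. If you have already encoded these variables as numeric, compare against the numeric codes instead.

      ```stata
      * Inspect what is actually in the data before filtering.
      tabulate TopicDesc
      tabulate Gender

      * Keep one topic (placeholder value; substitute the one you are studying).
      keep if TopicDesc == "Cigarette Use (Youth)"

      * Either keep the sex-specific rows ...
      keep if inlist(Gender, "Male", "Female")
      * ... or, if sex is of no interest, keep only the combined rows instead
      * (use whichever label the tabulate output shows):
      * keep if Gender == "Overall"
      ```

      The same pattern applies to race and any other stratifying variable: decide which rows answer your research question and drop the rest.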

      So if you have already selected an appropriate set of observations for analysis (I can't tell because your screenshot isn't readable), then you may still be, appropriately, left with data that has multiple observations per time period per state. No problem. The time variable is optional in -xtset- and is needed only for analyses that rely on time-series operators like lags, leads, and seasonal differences, or that use an autoregressive correlation structure. But your regression model involves none of those things. So, again, conditional on your having filtered out observations so that you are left with a coherent set that makes sense to analyze at all, just -xtset Stateid- and go from there.
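
      Put together, the sequence might look like this sketch. It again assumes the variable names from #1 and that the data have already been reduced to a coherent subset.

      ```stata
      * Confirm that state-year pairs repeat (this is what triggers r(451)).
      duplicates report Stateid YEAR

      * Declare only the panel variable; no time variable is needed here.
      xtset Stateid

      * Fixed-effects regression from #1 with robust standard errors.
      xtreg Data_Value policy Gender Education i.YEAR, fe vce(robust)
      ```

      With -xtreg, fe-, vce(robust) is equivalent to clustering on the panel variable, here Stateid.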

      Added: Crossed with #2.

