  • Unbalanced panel data with VERY irregular time intervals

    I'm currently facing a challenge in analyzing unbalanced panel data characterized by highly irregular time intervals. My online search has identified existing methodologies that typically handle two distinct types of panel data with irregular time intervals: 1) panel data in which a fraction of observations have irregular time intervals due to missing values, and 2) panel data collected through surveys conducted in different waves with irregular yet consistent time intervals, such as surveys conducted in 1965, 1966, 1968, and 1969.

    However, the nature of my dataset is different. It comprises observations of homes that were traded at least once between 2006 and 2020. Descriptive analysis revealed that about half of these homes were traded only once, approximately 30% were traded twice, and the remaining 20% were traded between 3 and 6 times within this 15-year timeframe. For homes involved in more than one transaction, the first transaction could occur at any point between 2006 and 2019, with subsequent transaction(s) happening between 2007 and 2020. For instance, Home X was traded in 2006 and 2012, Home Y in 2009, 2015, and 2018, and Home Z was traded once in 2010.
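
    To make the structure concrete, here is a minimal Python sketch of the long-format layout (one row per transaction) using only the three example homes above. The names `sales`, `n_sales`, and `gaps` are hypothetical, and the toy records are not my real data.

```python
from collections import Counter

# Toy long-format records: one (home, year) row per transaction.
# These are only the three example homes from the post, not real data.
sales = [
    ("X", 2006), ("X", 2012),
    ("Y", 2009), ("Y", 2015), ("Y", 2018),
    ("Z", 2010),
]

# Group the transaction years by home.
years_by_home = {}
for home, year in sales:
    years_by_home.setdefault(home, []).append(year)

# Number of transactions per home (the "panel length" of each unit).
n_sales = {home: len(years) for home, years in years_by_home.items()}

# Years elapsed between consecutive sales of the same home.
gaps = {}
for home, years in years_by_home.items():
    ordered = sorted(years)
    gaps[home] = [later - earlier for earlier, later in zip(ordered, ordered[1:])]

# How many homes were traded once, twice, three times, ...
count_dist = Counter(n_sales.values())

print(n_sales)
print(gaps)
print(dict(count_dist))
```

    Homes with a single sale (like Z) have no gap at all, and the gaps for the others differ from home to home, which is what I mean by "very irregular".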

    The dependent variable is the transaction price in log form. My key independent variable is a binary variable indicating whether a purchase is made using cash or a loan. I also have some time-variant variables at the neighborhood level and a few time-invariant variables representing housing size, age, and structural type.

    Could you suggest a specific econometric model, along with a corresponding Stata function or R package, that would be suitable for analyzing this unique dataset? Do I also have to deal with spatial autocorrelation in my modeling analysis as some homes are geographically close to each other?

  • #2
    though your data is not randomized, I think that the following article will still be of interest: Pullenayegum, EM and Scharfstein, DO (2022), "Randomized trials with repeatedly measured outcomes: handling irregular and potentially informative assessment times", Epidemiologic Reviews, 44: 121-137

    however, I'm not sure I see/understand how your data is "panel"

    standard mixed effects models in Stata (e.g., -mixed-) will estimate these models; they will not correct for spatial autocorrelation (if any), but maybe you could handle that via some form of "neighborhood" effect?
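
    as a rough sketch of what I have in mind (the notation is mine, and the nesting of homes within neighborhoods is an assumption about your data, not something you stated), a model with random intercepts for neighborhood and for home could be written as:

```latex
\ln P_{it} = \beta_0 + \beta_1 \,\mathrm{Cash}_{it}
           + \mathbf{x}_{n(i)t}'\boldsymbol{\gamma}
           + \mathbf{z}_i'\boldsymbol{\delta}
           + u_{n(i)} + v_i + \varepsilon_{it},
\qquad
u_{n(i)} \sim N(0,\sigma_u^2),\;
v_i \sim N(0,\sigma_v^2),\;
\varepsilon_{it} \sim N(0,\sigma_\varepsilon^2)
```

    here i indexes homes, t the transaction date, and n(i) the neighborhood of home i; x holds the time-varying neighborhood variables and z the time-invariant home characteristics. In Stata this would correspond roughly to -mixed lnprice i.cash ... || nbhd: || home:-, where all variable names are hypothetical. Nothing in this setup requires equally spaced or equally many observations per home.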

    I am not personally a fan of log-transforming variables (though I know some people who really like it and suggest it for virtually everything) but maybe a mixed-effects Poisson model will do at least as well
