  • Calculating first lag autocorrelation coefficient - repeated time values in sample - sample may not include multiple panels.

    Hello,

    I am fairly new to Stata and I am currently trying to find the AR(1) coefficient (which is to my knowledge the autocorrelation at lag 1) of my regression model.

    My data looks as follows: in total I have 6000 unique lots over the time period 2009-2019
    lot_identifier  Lot_price  Auction_year  auction_month  Auction_location  lot_size  ln_real_price  limited_edition_dummy
    1               12560      2009          1              London            35        9.45           1
    2               3400       2009          6              New York          56        8.24           0
    3               2351       2010          1              London            12        4.245          1
    4               453        2010          5              Dubai             34        7.32           1
    5               6587       2011          7              Dubai             64        8.234          0
    6               7809       2011          5              New York          12        4.34           0
    7               4086       2011          8              Dubai             7         5.236          0
    8               2354       2012          1              Dubai             23        2.45           1
    9               2654       2012          9              London            24        2.234          0
    10              3685       2013          10             Dubai             75        6.24           0
    11              56966      2013          3              London            54        9.234          0
    12              5373       2014          11             Dubai             53        1.34           1
    13              9832       2015          3              New York          43        3.234          0
    14              24609      2015          1              London            12        6.432          1
    15              95028      2016          9              Dubai             67        7.235          0
    16              3456       2017          12             New York          34        2.245          1
    17              7795       2017          3              London            23        6.325          0
    18              3264       2018          11             Dubai             64        1.325          1
    19              9245       2019          3              Dubai             41        4.246          0
    I performed a regression like this to find the effects of the independent variables on the dependent variable 'ln_real_price':
    reg ln_real_price i.auction_year i.auction_month i.auction_location lot_size limited_edition_dummy, robust baselevels
    Now I want to find the autocorrelation of the first lag. First of all, I am not sure which method is best to use to find this coefficient. Secondly, when trying different methods Stata tells me to set a time variable. I want to set 'auction_year' as my time variable but when I do
    tsset auction_year
    Stata gives me the error 'repeated time values in sample'.
    Then if I try
    tsset A auction_year
    I get this
    tsset A auction_year
    panel variable: A (unbalanced)
    time variable: auction_year, 2009 to 2019
    delta: 1 unit
    And if I then try for example
    corrgram ln_real_price
    or
    wntestq ln_real_price, lags(1)
    I get the error 'sample may not include multiple panels'.

    Maybe I am using entirely the wrong tests for the lag-1 autocorrelation of my regression model, or maybe the problem is something else. Can somebody please help me find the AR(1) coefficient?

    Kind regards,

    Bas van den Boomen

  • #2
    -corrgram- is only usable with time series data. You do not have time series data: for most years you have several observations. So it is not even possible to define lags in this data. Look at the observation with lot_identifier = 6. The year is 2011. But the example shows two different observations for 2010 (lots 3 and 4): which one of them is supposed to be "the lag" of that observation? It is impossible to say. The very concept of lags and autocorrelation is inapplicable to this kind of data.
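
    A quick way to see the repeated years for yourself is sketched below. It assumes the year variable is named auction_year, as in the -tsset- command in the question (the data listing shows it capitalized, and Stata is case-sensitive).

```stata
* Sketch: count how often each year occurs.
* Assumes the year variable is named auction_year (as in the -tsset- command).
duplicates report auction_year

* One row per year with its number of lots; any frequency above 1 means
* -tsset auction_year- will fail with "repeated time values in sample".
tabulate auction_year
```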

    It looks, from your example data tableau, as if you really have panel data here, with three panels (New York, London, and Dubai). And maybe you want to do a separate -corrgram- for each of those? A separate command, with an appropriate -if- clause, for each of those cities would do the trick. If the real data has many panels, use a -foreach- loop.
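
    A minimal sketch of that per-city loop follows. It rests on assumptions that are not in the question: that auction_location is a numeric, value-labeled variable (which the use of i.auction_location suggests), that the lowercase variable names from the regression command are the real ones, and that averaging ln_real_price within each city and year is acceptable. The averaging step is a swapped-in simplification, needed because a single city can still have several lots in one year. The lag-1 autocorrelation it reports describes those city-year averages, not the regression model.

```stata
* Sketch only, under the assumptions stated above.
preserve

* Reduce to one observation per city and year so that lags are well defined.
collapse (mean) ln_real_price, by(auction_location auction_year)

* Declare the panel structure: city is the panel, year is the time variable.
xtset auction_location auction_year

* Run -corrgram- separately for each city, restricting the sample to one panel.
* Gaps in a city's yearly series are not handled here.
levelsof auction_location, local(cities)
foreach c of local cities {
    display as text "City: `: label (auction_location) `c''"
    corrgram ln_real_price if auction_location == `c', lags(1)
}

restore
```

    The -preserve- / -restore- pair keeps the original lot-level data intact, since -collapse- replaces the data in memory.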
