  • Repeat Sales Price Index

    Hello.

    I am currently running a project where I wish to create a price index on the returns of collectible assets using data from auction websites.

    My panel data contain item_ID, sale_date, and price. In total I have collected 28,000 unique sales across 5,000 different items. Each sale observation has at least one corresponding sale of the identical asset at another date. In most cases the item has only been sold twice, but in some cases it has been sold over 300 times.

    Is there an approach in Stata that will allow me to quickly take the log price difference between consecutive sales? For example, an item with four sales would give 3 returns: sale1 & sale2, sale2 & sale3, sale3 & sale4.

    I was initially considering a Repeat Sales Regression methodology as proposed by Bailey, Muth, Nourse (1963), but this approach is proving difficult to implement when I have more than 2 sales observations per item. Attached is a .do file which is along the lines of what I wish to do.

    Thank you in advance!

    Example-RepeatedSales.do

  • #2
    Welcome to Statalist. You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Note that most of us are not from your research area, so few of us, if any, will know what a repeat sales regression methodology means. In addition, many of us will not open files from people we don't know, which is why we recommend including the code in the posting using code delimiters. You should cut the program down to what you really need to illustrate your problem. It is important to explain in general terms what you want to calculate and what you want to estimate so that we can help you better.

    One way to think of this is that you have panel data where the panel is the asset, but the sale times are not regularly spaced. You can xtset the data as a panel without specifying the time variable. What you lose then is that you cannot use the L. or D. operators. Instead, you can make sure your data are sorted correctly (by item, then by sale date) and then refer to previous observations with a relative position indicator. For example, the one-period log difference could be:
    g logdiff1 = log(price) - log(price[_n-1]) if item==item[_n-1]

    You want the if condition so it won't lag across items.
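
    The sorting step and the item check can also be combined using a by prefix. This is only a sketch: it assumes long-form data with the variable names from the original question (item_ID, sale_date, price), and logdiff is a made-up name.

    ```stata
    * Sort by item, then by sale date within item, and difference within each item.
    * Under by:, _n counts within the group, so price[_n-1] never reaches
    * back into the previous item and the first sale of each item is missing.
    bysort item_ID (sale_date): gen logdiff = log(price) - log(price[_n-1])
    ```

    If an item has two sales on the same date, the order among those ties is arbitrary, so a tie-breaking variable would be needed inside the parentheses.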
    An alternative would be to create an artificial time variable that simply counts up over the sales within each asset, so the 1st sale is time 1, the 2nd is time 2, etc. Then you could use the L. and D. operators freely.
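
    A minimal sketch of that counter approach, again assuming the variable names item_ID, sale_date, and price from the original question; item_num, sale_no, lnprice, and logdiff are illustrative names only:

    ```stata
    * xtset needs a numeric panel identifier; group() works for string or numeric IDs
    egen item_num = group(item_ID)

    * Sale counter within item: 1 for the first sale, 2 for the second, and so on
    bysort item_num (sale_date): gen sale_no = _n

    * Declare the panel with the counter as the time variable
    xtset item_num sale_no

    * Time-series operators apply to variables, so take logs first
    gen lnprice = ln(price)
    gen logdiff = D.lnprice
    ```

    Here D.lnprice is the difference from the previous sale of the same item, whatever the calendar gap between the two sales, and it is missing for each item's first sale.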


    It just occurred to me that I assumed you had the data in long form. That is, you have a separate observation for each sale. In Stata, it is much easier to work in long form for this kind of problem than in wide form, where you have each sale as a separate variable. If your data are in wide form, see the reshape command.
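
    As a rough illustration of the reshape step, suppose (purely as an assumption, since the actual layout was not posted) that the wide data have one row per item_ID with variables price1, price2, ... and sale_date1, sale_date2, ...:

    ```stata
    * Wide to long: one row per item and sale; sale_no takes the numeric suffix
    reshape long price sale_date, i(item_ID) j(sale_no)

    * Items with fewer sales than the maximum leave empty rows behind
    drop if missing(price)
    ```

    After this, each observation is a single sale and the within-item differencing described above applies directly.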



    • #3
      Phil Bromiley Thank you SO much. I have been bashing my head against the wall with this problem and the code you provided above executed precisely what I needed.

      For those stumbling across this post in the future:

      panel data: item_type date_sale
      Code:
      g logdiff1=(log(price) - log(price[_n-1])) if item_type==item_type[_n-1]
      Produces this output.
      sale  item_type  price  date_sale  logdiff1
         1          1   2100     Jan-17         .
         2          1   3100     Oct-17     .3894
         3          2   2600     Dec-17         .
         4          2   1850     Feb-18    -.3404
         5          2   2200     Mar-18     .1732
         6          2   2500     Apr-18     .1278
         7          3   1400     Jan-17         .
         8          3   1300     Feb-17    -.0743
         9          3    950     Mar-17    -.3134
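
      For anyone who wants to try this on the toy data, here is a self-contained sketch. The year and month columns are my own numeric stand-in for date_sale, used only so that the sort is chronological; they are an assumption and not part of the original dataset.

      ```stata
      clear
      input item_type price year month
      1 2100 2017 1
      1 3100 2017 10
      2 2600 2017 12
      2 1850 2018 2
      2 2200 2018 3
      2 2500 2018 4
      3 1400 2017 1
      3 1300 2017 2
      3  950 2017 3
      end

      * Build a monthly date so that sorting on it is chronological
      gen date_sale = ym(year, month)
      format date_sale %tm

      * The [_n-1] reference is only meaningful once rows are ordered by item, then date
      sort item_type date_sale
      gen logdiff1 = log(price) - log(price[_n-1]) if item_type == item_type[_n-1]

      list item_type price date_sale logdiff1, sepby(item_type)
      ```

      The explicit sort matters: without it, price[_n-1] simply refers to whatever row happens to sit above in the dataset.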



      • #4
        Tom Cordiez, I hope you are doing well! I am also using an RSR (repeat sales regression) approach to compile a price index for collectibles. It would be great if we could have a chat, as I am pretty lost.

        Thanks and looking forward.

