
  • Logarithmic vs Level Form of Independent Variable in a Hedonic Price Model

    I am regressing real estate price on distance from a rail station, where distance enters as two dummy variables (halfkm and onekm). Can anyone help explain why the p-values of the variables of interest (halfkm and onekm) differ between the level form and the log form of the dependent variable? Furthermore, which should we use as the dependent variable, the level form or the log form? (Regression results for the two models are shown below.) Apologies for the messy results; I can't upload an image.

    Clarification: in the level-form regression both halfkm and onekm are significant, while in the log-form regression halfkm is insignificant and onekm is significant.

    1st model (level form):

    . reg realestateprice improvements area area2 jobsaccessible jobswithin jobpop barangaypopulation halfkm onekm commercial misc , robust

    Linear regression Number of obs = 4436
    F( 11, 4424) = 76.18
    Prob > F = 0.0000
    R-squared = 0.4609
    Root MSE = 2.3e+07

    ------------------------------------------------------------------------------------
    | Robust
    realestateprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
    improvements | 1.173872 .1393172 8.43 0.000 .9007405 1.447003
    area | 36781.77 9161.338 4.01 0.000 18820.96 54742.58
    area2 | -.81506 .6277304 -1.30 0.194 -2.045726 .4156057
    jobsaccessible | 12.73107 5.022244 2.53 0.011 2.884964 22.57719
    jobswithin | 169.8118 37.36524 4.54 0.000 96.55724 243.0664
    jobpop | -227163.1 121080.6 -1.88 0.061 -464541.6 10215.43
    barangaypopulation | -15.09364 11.23772 -1.34 0.179 -37.12519 6.937906
    halfkm | 7584262 3160256 2.40 0.016 1388578 1.38e+07
    onekm | -2790965 844281.5 -3.31 0.001 -4446180 -1135751
    commercial | 2348497 3034705 0.77 0.439 -3601043 8298038
    misc | 5697798 9687808 0.59 0.556 -1.33e+07 2.47e+07
    _cons | -1.40e+07 4193190 -3.34 0.001 -2.22e+07 -5786588
    ------------------------------------------------------------------------------------

    2nd model (log form):

    . reg lprice improvements area area2 jobsaccessible jobswithin jobpop barangaypopulation halfkm onekm commercial misc , robust

    Linear regression Number of obs = 4436
    F( 11, 4424) = 172.15
    Prob > F = 0.0000
    R-squared = 0.5233
    Root MSE = .71524

    ------------------------------------------------------------------------------------
    | Robust
    lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
    improvements | 8.36e-09 7.87e-09 1.06 0.288 -7.07e-09 2.38e-08
    area | .0015044 .0000942 15.96 0.000 .0013196 .0016891
    area2 | -6.68e-08 7.44e-09 -8.97 0.000 -8.13e-08 -5.22e-08
    jobsaccessible | 1.67e-06 8.85e-08 18.81 0.000 1.49e-06 1.84e-06
    jobswithin | 4.38e-06 7.03e-07 6.23 0.000 3.00e-06 5.75e-06
    jobpop | .0156883 .0030317 5.17 0.000 .0097447 .0216319
    barangaypopulation | -2.91e-07 3.19e-07 -0.91 0.363 -9.17e-07 3.35e-07
    halfkm | .0281292 .0533367 0.53 0.598 -.0764374 .1326959
    onekm | -.0862971 .0364874 -2.37 0.018 -.1578306 -.0147635
    commercial | .558586 .0640047 8.73 0.000 .4331047 .6840673
    misc | .2639745 .0957014 2.76 0.006 .0763519 .4515971
    _cons | 13.86445 .0414546 334.45 0.000 13.78318 13.94573
    ------------------------------------------------------------------------------------
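
    A note on reading the log-form coefficients: since halfkm and onekm are dummies, a coefficient b in the lprice regression is commonly converted to a percentage effect as 100*(exp(b) - 1). The sketch below assumes lprice is the natural log of realestateprice (the post does not say so explicitly) and simply reuses the variable names from the regressions above.

    ```stata
    * Sketch only: assumes lprice = ln(realestateprice).
    * Run from a do-file (/// continues a line).
    regress lprice improvements area area2 jobsaccessible jobswithin jobpop ///
        barangaypopulation halfkm onekm commercial misc, vce(robust)

    * Percentage effect implied by each dummy, with delta-method standard errors
    nlcom (halfkm_pct: 100*(exp(_b[halfkm]) - 1)) ///
          (onekm_pct:  100*(exp(_b[onekm])  - 1))

    * Same conversion by hand for the reported onekm coefficient
    display 100*(exp(-.0862971) - 1)
    ```

    With the reported coefficient of -.0863, the last line gives about -8.3, i.e. under this model properties flagged by onekm are priced roughly 8% lower, other things equal.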

    https://imgur.com/p1tpIff - Regression Result with level form
    https://imgur.com/lpggYq1 - Regression Result with log form

    Last edited by Kier Ballar; 07 Dec 2017, 08:50.

  • #2
    Dear Kier,

    The two models are estimating different things, so it is not surprising that the results are different. Which model to use depends on what you want to do with it; I have discussed that choice in the following (rather obscure) paper:

    Reis, H. and Santos Silva, J.M.C. (2006), Hedonic Prices Indexes for New Passenger Cars in Portugal (1997-2001), Economic Modelling, 23(6), pp. 890-908.
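
    To spell out what "estimating different things" means here, a sketch in generic notation may help ($y$ stands for realestateprice and $x$ for the regressors; neither symbol comes from the thread). The level regression models the conditional mean of the price, while the log regression models the conditional mean of the log price, and one is not a simple transformation of the other:

    ```latex
    \begin{align*}
    \text{level form:} \quad & E[y \mid x] = x'\beta \\
    \text{log form:}   \quad & E[\ln y \mid x] = x'\gamma \\
    \text{Jensen's inequality:} \quad & E[\ln y \mid x] \le \ln E[y \mid x]
    \end{align*}
    ```

    So $\beta$ measures changes in the expected price in currency units, while $\gamma$ measures approximate proportional changes in price. That is one reason a dummy such as halfkm can look significant in one model and not in the other.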

    Best wishes,

    Joao



    • #3
      Dear Joao,

      Thank you so much for your advice! I will read your paper once I get back to my university, since it has access to ScienceDirect. I will update you once I have obtained it. Thank you so much!

      Respectfully,

      Kier



      • #4
        Cross-posted at https://www.reddit.com/r/stata/comme...dent_variable/

        Please note our policy on cross-posting, explicit in the advice that all are asked to read before posting. You are asked to tell us about it.

        Conversely, anyone interested in the Reddit thread would presumably like to know about suggestions here.



        • #5
          Dear Nick,

          Deepest apologies.

          Respectfully



          • #6
            EDIT: Title should be "Logarithmic vs Level Form of the DEPENDENT Variable in a Hedonic Price Model" --- I don't know how to edit the original post


            EDIT: One professor advised us to include all regressions of the model in the paper (5 regressions overall). The dependent variables are: 1) realestateprice, 2) log of real estate price, 3) price per square meter, 4) price per square meter, 5) realestateprice, with observations restricted to properties priced at 20 000 000 or less to exclude outliers.

            We observed that the halfkm dummy is significant only in the regression with realestateprice as the dependent variable, and that its significance goes away once we restrict the data to exclude outliers. We also regressed each of the 5 dependent variables on halfkm and onekm alone, and in those two-regressor models halfkm and onekm are always significant. Maybe this is because the effect of train station proximity is overpowered as more variables (job accessibility) are added.

            The professor theorized that proximity to train stations in our city does not really affect property prices, because the areas where the stations were built are historically part of the city's core: properties near a train station were desirable even before the train line was built. Do you think his explanation can save our thesis? (There was also a study conducted in our country showing that proximity is significant for residential properties but insignificant for commercial properties.) Or is the t-statistic understated because only 300 of our 4000 properties are within half a km of a train station?
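
            For reference, the specifications listed above could be run along these lines (the third and fourth are described identically, so only one price-per-square-meter model appears). This is a sketch only: pricesqm is a hypothetical name for price per square meter, assumed here to be realestateprice/area; lprice is assumed to be the natural log of realestateprice; and the global macro is just a convenience.

            ```stata
            * Sketch; run from a do-file (/// continues a line).
            global controls improvements area area2 jobsaccessible jobswithin jobpop ///
                barangaypopulation commercial misc

            * Hypothetical variable: price per square meter (assumes area is in square meters)
            generate pricesqm = realestateprice/area

            * How many properties fall in each distance band?
            tabulate halfkm onekm

            * Full model versus proximity dummies only, for each dependent variable
            foreach y in realestateprice lprice pricesqm {
                regress `y' halfkm onekm $controls, vce(robust)
                regress `y' halfkm onekm, vce(robust)
            }

            * Level model restricted to properties priced at 20 000 000 or less
            regress realestateprice halfkm onekm $controls ///
                if realestateprice <= 20000000, vce(robust)
            ```

            Running the full and two-regressor versions side by side makes it easier to see how the halfkm and onekm estimates move once the controls are added.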

            Further, how should a boxcox result be interpreted, and how do we decide which kind of boxcox model to use? Some papers advise using boxcox to find the correct functional form of the hedonic price model.
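
            As a starting point on the boxcox question, here is a minimal sketch that transforms only the dependent variable. It reuses the regressors from the first post and is an illustration under that assumption, not a recommendation of a final specification.

            ```stata
            * Box-Cox transform of the dependent variable only.
            * The dependent variable must be strictly positive.
            boxcox realestateprice improvements area area2 jobsaccessible jobswithin ///
                jobpop barangaypopulation halfkm onekm commercial misc, model(lhsonly)

            * The output reports the estimated transformation parameter (/theta) and
            * likelihood-ratio tests of theta = -1, theta = 0 and theta = 1:
            *   theta = 1  corresponds to the level form,
            *   theta = 0  corresponds to the log form,
            *   theta = -1 corresponds to the reciprocal of price.
            ```

            The other choices are model(rhsonly), model(lambda) and model(theta), which also transform regressors. Because those transformations need strictly positive values, dummies such as halfkm and onekm would have to be listed in notrans().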

            I also posted this here:

            https://www.reddit.com/r/academiceco...dent_variable/
            https://www.reddit.com/r/AskEconomic...dent_variable/
            https://www.reddit.com/r/stata/comme...dent_variable/
            Last edited by Kier Ballar; 08 Dec 2017, 12:40.

