  • Question on AIC (and Log-likelihood)

    Hello everybody.

    I have two quick theoretical questions about the AIC, which arose while performing some regression analysis in Stata.

    Firstly, when comparing two nested models, fit can be compared using the AIC, defined as AIC = 2k - 2ln(L), where k is the number of estimated parameters and L is the maximized likelihood.

    Now, I have two models:

    Model 1 with: AIC = 301, ln(L) = -470, Pseudo R^2 = 0.12
    Model 2 with: AIC = 1200, ln(L) = -560, Pseudo R^2 = 0.07

    The question is: is it correct to say that Model 1 is to be preferred over Model 2, since the former has a larger Pseudo R^2, a lower AIC, and, at the same time, a ln(L) (the log-likelihood) that is smaller in absolute value? Does that also mean that, when the log-likelihood is negative, I should select the model with the higher (i.e., closer to 0) ln(L)?
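As a minimal sketch of the formula quoted above (AIC = 2k - 2ln(L)), the Python snippet below computes the AIC from a log-likelihood and a parameter count. The function name `aic` and the value k = 5 are assumptions made only for illustration; the post does not report k. It shows that, for a fixed k, the model whose ln(L) is closer to zero gets the lower AIC. It also shows that, because k cannot be negative, the AIC can never fall below -2ln(L), so under this formula an AIC of 301 would not match ln(L) = -470.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L). Lower is better."""
    return 2 * k - 2 * log_likelihood


# k = 5 is a hypothetical parameter count, not a value from the post.
k = 5
aic_1 = aic(-470.0, k)  # log-likelihood reported for Model 1
aic_2 = aic(-560.0, k)  # log-likelihood reported for Model 2

# With the same k, the higher (closer to zero) ln(L) gives the lower AIC.
model_1_preferred = aic_1 < aic_2

# Because k >= 0, the AIC is bounded below by -2 ln(L).
aic_lower_bound_1 = -2 * -470.0

print(aic_1, aic_2, model_1_preferred, aic_lower_bound_1)
```

If the models have different numbers of parameters, the 2k term shifts each AIC accordingly, which is the penalty for model size.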


    Secondly, I wanted to ask whether it is possible to use the AIC to compare the same model estimated with two different estimators (e.g., GMM and ML), or whether in this case the AIC is useless and I should consider just the log-likelihood.

    Thanks!

    Kodi

  • #2
    Hello.

    To compare the same model estimated with different estimators, I think using the R^2 can be a way to discriminate among the different estimators.


    • #3
      Hmm... I'm not quite sure about this. Would anyone else like to help me? Many thanks.


      • #4
        All fit indices involve a somewhat arbitrary decision on how much to penalize the model for its number of parameters.

        Many would prefer AIC or BIC over R-squared, particularly in a model with only a pseudo R-squared. If you were doing regression, and the models are nested, then the model including all the variables will have an R-squared at least as high as any model using only part of the variables; this is why people use adjusted R-squared.

        However, if the models are nested, then you can do direct tests to compare the models.
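One common direct test for nested models fitted by maximum likelihood on the same sample is the likelihood-ratio test. The sketch below is only an illustration under stated assumptions: it treats the model with ln(L) = -560 from the first post as the restricted model, the one with ln(L) = -470 as the full model, and uses a hypothetical df = 3 for the number of restrictions, none of which the thread confirms. The helper names `chi2_sf` and `lr_test` are made up for this example; the chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)               # closed form for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))    # closed form for df = 1
    # Recurrence on the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return q


def lr_test(ll_restricted: float, ll_full: float, df: int):
    """Return the LR statistic 2 * (ll_full - ll_restricted) and its p-value."""
    stat = 2.0 * (ll_full - ll_restricted)
    return stat, chi2_sf(stat, df)


# Log-likelihoods from the first post; df = 3 is a hypothetical value.
stat, p_value = lr_test(-560.0, -470.0, df=3)
print(stat, p_value)
```

A small p-value would favor the full model over the restricted one. In Stata, the `lrtest` command performs this comparison on stored estimation results.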


        • #5
          Hi Hannon,

          If you are not sure which criterion should be used, you can go a step further by computing the Root Mean Square Error (RMSE), Mean Absolute Prediction Error (MAPE), and Mean Prediction Error (MPE), both in-sample and out-of-sample, to see which model provides the best predictions given your dataset.

          For further information regarding the three indicators, please refer to this paper: https://onlinelibrary.wiley.com/doi/....1002/hec.1498
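The three indicators named above can be sketched in a few lines of Python. This is an illustrative version under assumptions, not the paper's exact definitions: errors are taken as predicted minus actual (the sign convention for MPE varies between sources), the function name `prediction_errors` is hypothetical, and the numbers are made up.

```python
import math


def prediction_errors(actual, predicted):
    """Return RMSE, MAPE (mean absolute error) and MPE (mean signed error)."""
    # Assumed sign convention: error = predicted - actual.
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mpe = sum(errors) / n
    mape = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"RMSE": rmse, "MAPE": mape, "MPE": mpe}


# Made-up observed values and model predictions, for illustration only.
actual = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 11.0, 9.0, 17.0]

metrics = prediction_errors(actual, predicted)
print(metrics)
```

Running the same function once on the estimation sample and once on a held-out sample gives the in-sample and out-of-sample versions, and the model with the smaller errors out-of-sample would be the one to favor on predictive grounds.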

          Hope this helps.

          Best regards,

          DL


          • #6
            Ok, this was useful.

            Thank you all for your kind answers.

            Kodi
