  • Singular matrix estat kmo

    Dear Stata users,
    I am currently trying to get some practice with Stata.
    I decided to run a regression on a dataset that I built myself from a survey.
    The regression explains lnP as a function of many variables (27); some are dummies and some are perfectly correlated.
    Since the mean VIF was higher than 10, I decided to run an ACP.
    Everything runs fine, but when I run the KMO test (estat kmo) I get the error "correlation matrix is singular".
    I would like to know why this happens and how to fix it.

    Thanks in advance for your help,
    Marco

  • #2
    Marco (as per the FAQ, please consider the preference for full given and family names on this forum. Thanks):
    - as per the FAQ, your chances of getting helpful replies depend on posting what you typed and what you got from Stata. Besides, you may want to share an example/excerpt of your dataset (type -search dataex- from within Stata and follow the instructions reported in the help file);
    That said:
    - a high VIF might be immaterial if it refers to controls and worrisome if it refers to predictors;
    - 27 independent variables might be fine if you have a sample of 10 to 15 observations per predictor; otherwise, your results will probably be affected by overfitting;
    - if by ACP you mean the Italian acronym for PCA (principal component analysis), you may want to take a look at https://stats.stackexchange.com/ques...trix-and-pca.;
    - does kmo stand for Kaiser-Meyer-Olkin?
    Kind regards,
    Carlo
    (Stata 19.0)
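
The data excerpt requested above could be produced along these lines. This is only a sketch: the variable names are taken from the regression posted later in the thread, and the choice of the first 20 observations is an arbitrary assumption.

```stata
* Recent Stata releases ship with -dataex-; on older ones it may need
* to be installed first (assumption: SSC is reachable from your machine)
* ssc install dataex

* Export the first 20 observations of a few variables in a form that
* others can paste straight into Stata (variable subset chosen for illustration)
dataex lnP Motiv2 Motiv3 Adv2 Adv3 in 1/20
```

The output of -dataex- can then be copied into the post between code delimiters, together with the exact commands typed and the output Stata returned.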



    • #3
      Dear Carlo,

      - Here is some information about my dataset:
      The dependent variable is studied through more than 50 variables, which I grouped according to the kind of information they carry.
      One group is about the profile of the product, the second one is about a set of common characteristics, and so is the last one.
      The sample consists of more than 400 observations, in groups of about 45 or 70.

      - What I am trying to do, since I am not very experienced, is the following:

      The regression is more or less the following:
      reg lnP Motiv2 Motiv3 Adv2 Adv3 Shop2 Shop3 Style2 Style3 Impulsive Discount Online Frequency Middel End Indiferent Partner1 Partner2 Partner3 Partner4

      For some of them I observe a really high mean VIF. Am I right to say that the coefficients are more volatile because of collinearity?
      And so, is it a good idea to try to reduce this collinearity by opting for PCA?

      - I ran the following code:

      global profil Motiv2 Motiv3 Adv2 Adv3 Shop2 Shop3 Style2 Style3 Impulsive Discount Online Frequency Middel End Indiferent Partner1 Partner2 Partner3 Partner4
      pca $profil
      screeplot, yline(1)
      pca $profil, comp(6)
      predict pc1 pc2 pc3 pc4 pc5 pc6
      estat kmo //Matrix singular problem

      - My problem actually arises when I run the Kaiser-Meyer-Olkin test. From the link in your previous message, does it mean that the correlation matrix generated from my variables is singular and therefore not suited to PCA?

      Thank you for your consideration,
      Marco



      • #4
        Marco:
        again, your last post is not the best way of eliciting positive replies (see the FAQ. Thanks).
        That said, your problem probably rests on quasi-extreme multicollinearity: try to rethink your regression specification first.
        Kind regards,
        Carlo
        (Stata 19.0)
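
One way to check the multicollinearity diagnosis is sketched below, assuming the variable list from post #3. The KMO measure relies on partial correlations, which are obtained from the inverse of the correlation matrix, so an exact linear dependence among the variables (for instance a complete set of dummies for one survey item, which is only a guess about this dataset) would make the matrix singular. The sketch inspects the eigenvalues of the correlation matrix and then uses the programmer's command -_rmcoll- to drop exactly collinear variables; the local macro name `keep` is hypothetical.

```stata
* Variable list copied from post #3
global profil Motiv2 Motiv3 Adv2 Adv3 Shop2 Shop3 Style2 Style3 ///
    Impulsive Discount Online Frequency Middel End Indiferent ///
    Partner1 Partner2 Partner3 Partner4

* 1. Eigenvalues of the correlation matrix: a value at (or numerically
*    near) zero signals an exact linear dependence among the variables
quietly correlate $profil
matrix C = r(C)
matrix symeigen V L = C
matrix list L

* 2. Let Stata identify and drop the exactly collinear variables
_rmcoll $profil, forcedrop
local keep `r(varlist)'
display "Variables omitted: " r(k_omitted)
display "Variables kept:    `keep'"

* 3. Re-run the PCA and the KMO measure on the reduced list
pca `keep'
estat kmo
```

Dropping exactly collinear variables only removes exact singularity. If the variables remain nearly collinear, the KMO value may still be poor, and rethinking the specification, as suggested above, is the more substantive fix.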

