  • Problem making centered matrix

    Hello, to complete my thesis I need to calculate the Mahalanobis distance across several dimensions recorded in a survey. The problem arises when creating the centered matrix, which I apparently need in order to generate the required values. I am a complete novice in Stata and not a statistician; I study IB. This will probably be the most complex operation in my thesis, so I need some help (and my professors are all on vacation right now). I asked ChatGPT to provide some code, and I think most of it will work except for creating the matrix. The code is as follows:

    *calculate mahalanobis/formative distance for perceived psychic distance
    * Calculate means
    summarize nl_lg_hm nl_rel_hm nl_edu_hm nl_econ_hm nl_dem_hm nl_pol_hm, meanonly
    matrix means = r(mean)
    * Calculate centered matrix
    matrix centered = J(_N, 1, 1) * means - (nl_lg_hm \ nl_rel_hm \ nl_edu_hm \ nl_econ_hm \ nl_dem_hm \ nl_pol_hm)
    * Calculate covariance matrix
    matrix covmatrix = (centered' * centered) / (_N - 1)
    * Calculate Mahalanobis distance
    gen mahalanobis_distance = .
    forval i = 1/`=_N' {
        matrix obs = centered[`i', ]
        matrix m_dist = obs * inv(covmatrix) * obs'
        replace mahalanobis_distance = sqrt(m_dist[1,1]) in `i'
    }

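    For reference, the code above appears intended to compute the standard Mahalanobis distance of each observation from the sample mean. The notation below is assumed for illustration and is not from the generated code: x_i is the row vector of the six variables for observation i, x-bar is the row vector of their means, and S is the sample covariance matrix.

```latex
\[
  d_i = \sqrt{(x_i - \bar{x})\, S^{-1}\, (x_i - \bar{x})^{\top}},
  \qquad
  S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^{\top} (x_i - \bar{x})
\]
```

    The sign of the centering (mean minus value, as in the code, or value minus mean) does not affect the result, because the centered vector appears twice in the product.
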
    The problem is that creating the matrix always ignores or cannot find the first variable, no matter whether it really is in my dataset or is completely made up. The variables in the code above do exist; I quadruple-checked. However, when I check the data with "summarize", the variables all show only zeros. I tried to work around this by creating new diff_ variables using "egen":
    * Calculate means using egen
    egen mean_nl_lg_hm = mean(nl_lg_hm)
    egen mean_nl_rel_hm = mean(nl_rel_hm)
    egen mean_nl_edu_hm = mean(nl_edu_hm)
    egen mean_nl_econ_hm = mean(nl_econ_hm)
    egen mean_nl_dem_hm = mean(nl_dem_hm)
    egen mean_nl_pol_hm = mean(nl_pol_hm)
    * Generate differences and create centered matrix
    gen double diff_nl_lg_hm = nl_lg_hm - mean_nl_lg_hm
    gen double diff_nl_rel_hm = nl_rel_hm - mean_nl_rel_hm
    gen double diff_nl_edu_hm = nl_edu_hm - mean_nl_edu_hm
    gen double diff_nl_econ_hm = nl_econ_hm - mean_nl_econ_hm
    gen double diff_nl_dem_hm = nl_dem_hm - mean_nl_dem_hm
    gen double diff_nl_pol_hm = nl_pol_hm - mean_nl_pol_hm

    Now the new diff_ variables actually had values. Still, every attempt to create a centered matrix ignored the first variable, reporting it like this:

    . matrix define centered_matrix = diff_nl_lg_hm diff_nl_rel_hm diff_nl_edu_hm diff_nl_econ_hm diff_nl_dem_hm diff_nl_pol_hm
    diff_nl_lg_hm not found
    r(111);

    or
    . matrix centered = diff_nl_lg_hm \ diff_nl_rel_hm \ diff_nl_edu_hm \ diff_nl_econ_hm \ diff_nl_dem_hm \ diff_nl_pol_hm
    diff_nl_lg_hm not found
    r(111);

    or
    matrix centered = (diff_nl_lg_hm \ diff_nl_rel_hm \ diff_nl_edu_hm \ diff_nl_econ_hm \ diff_nl_dem_hm \ diff_nl_pol_hm)

    or
    matrix centered = (diff_nl_lg_hm diff_nl_rel_hm diff_nl_edu_hm diff_nl_econ_hm diff_nl_dem_hm diff_nl_pol_hm)

    I think I tried a few more variations, which according to ChatGPT should have the same effect and are just written differently. Please help.


  • #2
    I am sorry, I also have no idea why it changed the format. I had spacing in there before posting.
