. // looking at retrieving the contents of a global macro in mata

. global testmac wow. what a fun narrative

. dis "$testmac"

wow. what a fun narrative

. mata

------------------------------------------------- mata (type end to exit) -----

: "$testmac"

wow. what a fun narrative

: maccontents=st_global("testmac")

: maccontents

wow. what a fun narrative

:

: end

-------------------------------------------------------------------------------

]]>

I have a loop for selecting control observations without replacement for my case observations in a case control study. Unfortunately, this runs slowly because the dataset is quite large (around 5000 cases and 8 million controls to select from.) Thus, I’m looking to recreate the following loop in Mata with any tweaks that may improve the execution time as this has to be run for multiple case-control configurations. Currently, a single run is an over-night endeavor using Stata 18 on Windows Server 2012R2.

Due to legal reasons, I cannot disclose the dataset. Instead, I will try my best to illustrate the loop with the auto dataset.

I am aware of the excellent calipmatch package, which address 95% of my issue. However, one of the matching criteria is that the control observations must have a date value that is larger than that of the case observation, which to my understanding is not possible with calipmatch, because calipmatch only allows exact matches or caliper widths going above and below a certain score and not 'greater than' criterions. I tried to understand the calipmatch source code and modify it to my needs, but it is beyond my Mata skill level.

A part of the reason why the current loop runs slow is that it identifies all possible matches and then finds the first X matches. In this case, it is two matches. In my actual data, there will be 30 matches and there will be two more exact matching criteria, i.e., having the same value on a categorical variable. In the following example, imagine that the variable 'price' would correspond to the date variable in the actual dataset. I have annotated it to illustrate my intention.

Code:

sysuse auto generate byte case = strpos(make, "Buick") > 0 // Make a case group, in this case it is the Buick cars. count if case == 0 // 67 control observations to select from. gen float tmp = runiform() sort tmp // Setting the order of the observation at random gen tmp2 = . // Making temporary variable that will be rewritten throughout the loop. gen sto = . // Making a storage variable that will assign matched observations with the same ID as each case. bysort sto case: replace sto = _n if case == 1 // Adding IDs to case numbers. forvalues i=1/67{ levelsof price if sto == `i', local(A) // Saving the value of price in a local. replace tmp2 = `i' if price > `A' & missing(sto) // Identifying all observations with prices higher than the ith case. replace sto = `i' if price > `A' & missing(sto) & sum(tmp2 == `i') <= 2 // Saving the first two matches as controls. replace tmp2 = . // Resets the temporary storage. }

Best regards Soeren]]>

I am trying to loop in Mata. I need to loop in Mata because I am creating matrices that are used in a solver. In the following code I am creating the vector firstterm, but I cannot figure out how to loop over the S_pp matrices (from S_pp1-S_pp100), where $models=100.

mata:

for (i = 1; i <= $models; i++) {

for (j = 1; j <= $models; j++) {

for (k = 1; k <= $models; k++) {

if (i == j & j == k) {

firstterm = J($draws, 1, .)

for (n = 1; n <= $draws; n++) {

firstterm[n, 1] = sim_B_rand_price_sim[n, 1]^2 * (1 - 2 * sim_shares_sim[i, n]) * sim_shares_sim[i, n] * (1 - sim_shares_sim[i, n]) / $draws

}

S_ppi[j, k] = sum(firstterm)

.......

The rest of the code is not needed to answer the question. If anyone could help me figure out how to edit S_ppi[j,k] so that the code will loop over the existing matrices S_pp1-S_pp100, I will be eternally grateful. ]]>

Code:

: Cmat = (2,2,3\1,2,2\1,2,0) : : // Eigenvalues and Eigenvectors : eigenvalues = . : eigenvectors = . : : eigensystem(Cmat, eigenvectors, eigenvalues,) : // Extract real part of eigenvalues and eigenvectors : eigenvectors = Re(eigenvectors) : eigenvalues = Re(eigenvalues') : : : eigenvectors 1 2 3 +----------------------------------------------+ 1 | -.7621332906 -.5136120346 -.8772027338 | 2 | -.5282854667 -.3152370696 .4686056046 | 3 | -.3742556787 .7980152053 .104518664 | +----------------------------------------------+ : eigenvalues 1 +---------------+ 1 | 4.859523389 | 2 | -1.43366463 | 3 | .5741412412 | +---------------+

Code:

: sort(eigenvalues,-1) 1 +---------------+ 1 | 4.859523389 | 2 | .5741412412 | 3 | -1.43366463 | +---------------+

Thank you for your help!

Best,

Ali

]]>