Good morning,
I wrote a Mata function that uses panelsetup() to do something by groups, and it seemed to be working after I spent almost 40 hours debugging it. Then I tried it on another dataset, and it failed.
After scratching my head for another business day, I finally figured out what the problem is: when my data have missing values, I exclude those observations with my -mark- command, and from then on I cannot put my results back into my data, because the results vector has fewer rows than the original data. So the question is: how do you store a Mata results matrix back in your Stata data when missing values make the dimensions of the two differ?
On the other dataset, the function failed with the message
Code:
st_store(): 3200 conformability error
myquantile_by(): - function returned error
<istmt>: - function returned error
r(3200);
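The mismatch behind this error can be checked directly. The following is a minimal sketch, assuming the auto dataset used in the example further down and a marker variable named touse: it compares the number of rows that st_data() returns when a selection variable is given with the number of observations in the dataset.

```stata
sysuse auto, clear
mark touse if !missing(rep78)

mata:
// st_data() with a selection variable returns only the rows where touse != 0
y = st_data(., "price", "touse")
rows(y)       // number of selected observations
st_nobs()     // number of observations in the whole dataset
end
```

If the two numbers differ, a call of the form st_store(., varindex, X) cannot conform, because the first argument . refers to every observation in the dataset while X has only as many rows as were selected.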
Here is an example of the problem. The following Mata function replicates what -egen, min()- and -egen, max()- do, that is, it calculates the min and max by groups.
The function works fine when I have no missing data, and indeed produces the same results as -egen, min()- and -egen, max()-.
Code:
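clear
clear mata
mata:
// Compute the min and max of var within each group and store them as my_min and my_max
void function mean_by_store(string scalar var, string scalar groupid, string scalar touse)
{
    real scalar i, j, j0, j1, min, max
    real colvector id, y
    real matrix info, result
    string colvector minmax

    minmax = ("min" \ "max")
    // Read only the observations for which touse != 0
    id = st_data(., groupid, touse)
    y = st_data(., var, touse)
    result = J(rows(y), 2, .)
    // One row per group: first and last row of the group (data must be sorted by groupid)
    info = panelsetup(id, 1)
    for (i=1; i<=rows(info); i++) {
        j0 = info[i, 1]
        j1 = info[i, 2]
        min = min(y[|j0\j1|])
        max = max(y[|j0\j1|])
        // Repeat the group statistics on every row of the group
        for (j=j0; j<=j1; j++) {
            result[j,1] = min
            result[j,2] = max
        }
    }
    // Create the new variables and store the results in all observations
    for (j=1; j<=2; j++) st_store(., st_addvar("double", "my_"+minmax[j]), result[,j])
}
end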
sysuse auto
keep price rep
sort rep
mark touse
mata: mean_by_store("price", "rep", "touse")
egen min = min(price), by(rep)
egen max = max(price), by(rep)
summ my_min min my_max max
. summ my_min min my_max max

    Variable |        Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      my_min |         74    3589.203    261.2575       3291       4195
         min |         74    3589.203    261.2575       3291       4195
      my_max |         74    13178.01    2871.961       4934      15906
         max |         74    13178.01    2871.961       4934      15906
So far, so good. But when I exclude the observations with missing values through my -mark- statement, it all falls to pieces:
Code:
. sysuse auto, clear
(1978 Automobile Data)
.
. keep price rep
.
. sort rep
.
. mark touse if !missing(rep)
.
. mata: mean_by_store("price", "rep", "touse")
st_store(): 3200 conformability error
mean_by_store(): - function returned error
<istmt>: - function returned error
r(3200);
end of do-file
r(3200);
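One possible way around the error, sketched here under assumptions rather than offered as a definitive answer, is to give st_store() the same selection variable that st_data() received. st_store() accepts an optional selectvar argument between the variable index and the matrix, so the results are written only to the observations where touse != 0 and the remaining observations keep the missing values that st_addvar() created. The function name minmax_by_store and the local names lo and hi are hypothetical; everything except the last line follows the function above.

```stata
clear
clear mata
mata:
// Hypothetical variant of mean_by_store(): only the st_store() call differs
void function minmax_by_store(string scalar var, string scalar groupid, string scalar touse)
{
    real scalar i, j, j0, j1, lo, hi
    real colvector id, y
    real matrix info, result
    string colvector minmax

    minmax = ("min" \ "max")
    id = st_data(., groupid, touse)      // selected observations only
    y = st_data(., var, touse)
    result = J(rows(y), 2, .)
    info = panelsetup(id, 1)             // data must be sorted by groupid
    for (i=1; i<=rows(info); i++) {
        j0 = info[i, 1]
        j1 = info[i, 2]
        lo = min(y[|j0\j1|])
        hi = max(y[|j0\j1|])
        for (j=j0; j<=j1; j++) {
            result[j,1] = lo
            result[j,2] = hi
        }
    }
    // Pass touse as selectvar so the rows written match the rows read
    for (j=1; j<=2; j++) {
        st_store(., st_addvar("double", "my_"+minmax[j]), touse, result[,j])
    }
}
end

sysuse auto, clear
keep price rep78
sort rep78
mark touse if !missing(rep78)
mata: minmax_by_store("price", "rep78", "touse")
egen min = min(price) if touse, by(rep78)
egen max = max(price) if touse, by(rep78)
summ my_min min my_max max
```

Because the same touse variable selects the rows on the way in and on the way out, result always has as many rows as st_store() expects. Another option that may be worth trying is st_view() with the same selection variable, which gives a view onto the selected observations that can be assigned to directly.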
