I've got a precision problem with optic cup-to-disc ratio (a clinical measure of nerve tissue; > 0.5 means glaucoma is a possibility). It's normally recorded to 1 dp, but borderline cases can be recorded halfway between (e.g. 0.65).

There are only a few of these, so I need to round them to 1 dp. The database is imported from Excel, and OpticDiscAssessmentVerticalC (double) shows 1 or 2 dp when viewed in the data browser, but

results in cdr (float), which shows 8 dp for some values when viewed in the data browser, e.g. 0.4000001 or 0.69999999. You can see from the table below that the cut-offs are inconsistent. I've considered using egen, specifying each cut point to 2 dp and then relabelling, but this seems inelegant.

It's a small problem but an annoying one. I'd be very grateful if anyone can suggest a better way.

Ali

Ali Poostchi

Ophthalmology Registrar, Nottingham, UK

table cdr, contents (min OpticDiscAssessmentVerticalC max OpticDiscAssessmentVerticalC)

cdr | min(OpticD~C) | max(OpticD~C) |

0 | 0 | 0 |

.1 | .1 | .15 |

.2 | .2 | .2 |

.3 | .25 | .35 |

.4 | .4 | .4 |

.5 | .45 | .5 |

.6 | .55 | .6 |

.7 | .65 | .7 |

.8 | .75 | .8 |

.9 | .85 | .95 |

1 | 1 | 1 |
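
A possible explanation and workaround, offered as a sketch rather than a definitive answer: values such as 0.65 have no exact binary representation, so rounding them directly in units of 0.1 can land on either side of the halfway point, and 0.1 itself cannot be stored exactly in a float. Working in integer hundredths first makes the halves exact. The names cdr100 and cdr10 are hypothetical, and rounding halves up is an assumed rule, not something stated in the post.

```stata
* Sketch only: cdr100 and cdr10 are hypothetical names.
* Step 1: integer hundredths, so 0.65 becomes exactly 65.
gen int cdr100 = round(100 * OpticDiscAssessmentVerticalC)

* Step 2: tenths as an integer, with halves always rounded up (assumed rule).
* 65/10 = 6.5 is exact in binary, so floor(6.5 + 0.5) = 7 every time.
gen byte cdr10 = floor(cdr100 / 10 + 0.5)
label variable cdr10 "Cup-to-disc ratio, in tenths"

* Check the cut-offs against the original values.
tabstat OpticDiscAssessmentVerticalC, by(cdr10) statistics(min max)
```

Keeping the result as an integer number of tenths avoids the display problem altogether; value labels (0 "0.0", 1 "0.1", and so on) can supply the decimal form if needed.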

I am having trouble constructing McElroy's goodness-of-fit statistic for seemingly unrelated regressions with system-wide constraints.

I have three equations, and the vector of explanatory variables is the same for all three. By using

I would like to construct a system-wide measure of goodness of fit, following McElroy's (1977) R^2

[equation image not preserved]

Here bz is the vector of estimated slope coefficients and Z is the corresponding vector of explanatory variables. S^{-1} is the inverse of the consistent estimator of the cross-equation variance-covariance matrix.

A is defined as

A = I_n - (1/n) l l'

where l is a (nx1) column vector of ones and n is the number of observations.

Using the estimated coefficient matrix e(b) (partitioned into slope and constant terms) and the variance-covariance matrix of the residuals e(Sigma), I constructed McElroy's R^2. The matrix calculation runs without error, but the problem is that the LHS and RHS (in the definition of R^2) are not equal.

When we have a system-wide constraint, should the constraint matrix be included in some way when calculating the R^2?

It would be greatly appreciated if you could give me some tips to untangle this puzzle.

Thanks in advance!

Kensuke

I tried the following, but it did not work:

Code:

. areg d_pc i.diasem##i.turno i.anomes, absorb(mun_res codestab) cluster(mun_res codestab)
absorb(): too many variables specified
r(103);
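
The error arises because areg's absorb() option accepts only one variable. One way around this, sketched here on the assumption that installing a community-contributed command is acceptable, is reghdfe from SSC, which allows several absorbed fixed effects and multi-way clustering. Whether two-way clustering on mun_res and codestab is appropriate for these data is a separate question that the sketch does not settle.

```stata
* reghdfe is community-contributed, not official Stata (assumption: it can be installed).
* ssc install ftools
* ssc install reghdfe

* Absorb both sets of fixed effects and cluster on both dimensions.
reghdfe d_pc i.diasem##i.turno i.anomes, absorb(mun_res codestab) vce(cluster mun_res codestab)
```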

1) I'm having trouble conducting a Hausman test for my xtpoisson model. I get this response:

chi2(37) = (b-B)'[(V_b-V_B)^(-1)](b-B)
         = -4631.45    chi2<0 ==> model fitted on these
                       data fails to meet the asymptotic
                       assumptions of the Hausman test;
                       see suest for a generalized test

I've done it in the correct order, i.e. fe then re. As I understand it, I can't use "suest" with xtpoisson? It requires predict and score, which I don't find as options for xtpoisson. I have done the Hausman test with xtreg instead, but since I am supposed to use xtpoisson I want to make the test work with this specification.

2) I also want to do a goodness-of-fit test on my model, but I get this response:

. estat gof

subcommand estat gof is unrecognized

r(321);

So I conclude that this does not work with xtpoisson either. Are there any equivalent tests I can use? I am concerned about overdispersion in my data, since the variance is far greater than the mean. Or can I disregard this when using fixed effects and robust SEs with the xtpoisson command?

My dependent variable is FDI flow, and since I do not take logs it is highly skewed.

I'm trying to run a bootstrap to analyze whether a variable that is considered endogenous mediates the effects of the independent variables. Since my DV is a censored variable with different censoring points, and the main equation is conditioned on a subset of the data, I use the CMP (conditional mixed process) approach by Roodman (2008) to estimate three equations jointly.

Here is the model:

==========

cmp (MMM = IV1 IV2 IV3 CONTROL1 CONTROL2) ///
    (DV = MMM IV1 IV2 IV3 CONTROL1 CONTROL2 AAA BBB) ///
    (SELECT = CONTROL1 CONTROL2 CCC DDD), ///
    ind($cmp_cont "cond(SELECT,cond(CENSOR1==1,$cmp_right,$cmp_cont),$cmp_out)" $cmp_probit) nonrtol

===========

Now I want to see whether "MMM" mediates the effect of "IV1", "IV2" and "IV3" on "DV", and here is the syntax:

===============

capture program drop bootmm

program bootmm, rclass

syntax [if] [in]

    cmp (MMM = IV1 IV2 IV3 CONTROL1 CONTROL2) ///
        (DV = MMM IV1 IV2 IV3 CONTROL1 CONTROL2 AAA BBB) ///
        (SELECT = CONTROL1 CONTROL2 CCC DDD), ///
        ind($cmp_cont "cond(SELECT,cond(CENSOR1==1,$cmp_right,$cmp_cont),$cmp_out)" $cmp_probit) nonrtol

return scalar ind_IV1 = [MMM]_b[IV1]*[DV]_b[MMM]

return scalar ind_IV2 = [MMM]_b[IV2]*[DV]_b[MMM]

return scalar ind_IV3 = [MMM]_b[IV3]*[DV]_b[MMM]

end

bootstrap r(ind_IV1) r(ind_IV2) r(ind_IV3), bca reps(1000) nodots: bootmm

============

I keep getting the following error message:

"insufficient observations to compute jackknife standard errors

no results will be saved

r(2000);"

What should I do?

Thanks,

Hannah

I am comparing 4 treatments over time.

I did a log-rank test of equality across the strata (0, 1, 2, 3), which is significant.

I also did pairwise log-rank tests (0 vs 1, 0 vs 2, 0 vs 3, 1 vs 2, 1 vs 3, 2 vs 3):

sts test trt, logrank

sts test trt if trt~=2 & trt~=3, logrank

sts test trt if trt~=1 & trt~=3, logrank

sts test trt if trt~=1 & trt~=2, logrank

sts test trt if trt~=0 & trt~=2, logrank

sts test trt if trt~=0 & trt~=3, logrank

sts test trt if trt~=0 & trt~=1, logrank

I also did a landmark analysis at 1, 3 and 5 years; at these three landmarks there is no significant difference between treatments:

sts test trt if landmark==1

sts test trt if landmark==3

sts test trt if landmark==5

My supervisor asked me to adjust the log-rank tests for multiple comparisons.

I looked at the SAS example

http://support.sas.com/documentation...examples02.htm

I am trying to do the same in Stata and need advice.

I am attaching the data.

********here is the program**********************

cd "C:\statistics\survival"

import delimited C:\statistics\survival\bmt.csv, clear

*a data frame with 137 observations on the following 22 variables.

la de group 1 "1-ALL" 2 "AML low-risk" 3 "high-risk"

la val group group

la var t2 "disease-Free survival time"

rename t2 T

rename d1 Status

la de Status 1 "Dead" 0 "Alive"

la val Status Status

la var d2 "relapse indicator"

la de d2 1 "Relapsed" 0 "Disease-Free"

stset T, failure(Status)

sts graph, by(group) risktable

sts test group

Can you please advise me how to do this adjustment in Stata?

Thanks

Sugi
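
One simple possibility, sketched under stated assumptions, is a Bonferroni adjustment of the six pairwise log-rank p-values. This is not the adjustment method of the SAS example, only a conservative alternative. The sketch assumes the data are already stset and that trt is coded 0 to 3 as in the pairwise tests above; it relies on sts test leaving r(chi2) and r(df) behind. The local macro names are hypothetical.

```stata
* Bonferroni adjustment for the 6 pairwise log-rank tests (sketch).
* Assumes: data are stset, trt takes the values 0, 1, 2, 3.
local m = 6
forvalues i = 0/2 {
    local first = `i' + 1
    forvalues j = `first'/3 {
        * Log-rank test restricted to the two groups being compared
        quietly sts test trt if inlist(trt, `i', `j'), logrank
        local p    = chi2tail(r(df), r(chi2))
        local padj = min(1, `m' * `p')
        display "trt `i' vs `j': chi2 = " %6.2f r(chi2) "  p = " %6.4f `p' "  Bonferroni p = " %6.4f `padj'
    }
}
```

Each raw p-value is multiplied by the number of comparisons and capped at 1; a pair is then declared different only if its adjusted p-value is below the chosen significance level.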

I'm using time-series data and want to generate a return series. Each return is supposed to be calculated relative to the most recent data point that is at least 3 seconds earlier than the current one. You'll see this in the data example below: for observation 9, the return is calculated relative to the 4th observation, while the 10th return is calculated as 10th price/9th price - 1, since these are the first time stamps that are at least 3 seconds apart. I've tried -rolling- and -cond()-.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(price return date)
  .4563140351911763                   0 1676192401234
 .12123021072046514                   0 1676192401456
.015799518412771385                   0 1676192401789
  .5531145142756151                   0 1676192401998
 .36208761774438625                   0 1676192402234
  .4010092064132522                   0 1676192402456
  .5990130135963275                   0 1676192402789
.010658344783998186                   0 1676192402998
 .48161974610468405  -.1292585284343224 1676192405000
 .06049858759734017  -.8743851594818326 1676192410000
  .8339152409690928  12.784044786621696 1676192460000
   .783335628399825  11.947998614669551 1676192461000
  .5320383300504032 -.32080412180761453 1676192465000
end
format %tcCCYY-NN-DD_HH:MM:SS.sss date

Thank you in advance


This is my problem: only those who participate in treatment are asked whether they are participating voluntarily or are forced to do so. So the data look something like this:

id | wave | Treatment (yes=1) | voluntary (yes=1) |

1 | 1 | 0* | (missing) |

1 | 2 | 1 | 1 |

1 | 3 | 1 | 1 |

2 | 1 | 0 | (missing) |

2 | 2 | 0 | (missing) |

2 | 3 | 1 | 0 |

I think that a fixed-effects analysis needs a transition from zero to one in any case, including for an interaction term, and that this is why Stata tells me that I have "no observations" when I try to run the analysis. I looked it up, and there definitely are cases, so it is not a problem of too many missing values. However, I do not have any transition in my data from voluntary == 0 to voluntary == 1.

I thought about changing the missing values of the "voluntary" variable in the wave just before participation to zero:

sort id wave

by id: replace voluntary = 0 if missing(voluntary) & !missing(voluntary[_n+1])

So it looks like this:

id | wave | Treatment (yes=1) | voluntary (yes=1), old variable | voluntary (yes=1), new variable |

1 | 1 | 0 | (missing) | 0 |

1 | 2 | 1 | 1 | 1 |

1 | 3 | 1 | 1 | 1 |

2 | 1 | 0 | (missing) | (missing) |

2 | 2 | 0 | (missing) | 0 |

2 | 3 | 1 | 1 | 0 |

I am very interested in your feedback. Or maybe you have another idea on how to tackle this problem? Thank you very much in advance!

Below I show a few observations from a panel data set that is organized by id number and year.

Each subsidiary (BVDID) is located in a given country (COUNTRY_S), belongs to a given parent firm (guo_bvdid) and is observed across several years (year). NEWID identifies one subsidiary-parent pair observed over several years.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
* The input line was lost from the post; the storage types here are inferred from the values shown
input float NEWID str15 BVDID str2 COUNTRY_S str12 guo_bvdid str2 COUNTRY_HQ int year
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2006
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2007
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2008
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2009
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2010
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2011
 5 "AR30-50119642-4" "AR" "LU0B95859"    "LU" 2012
14 "AR30-50278659-4" "AR" "BE0400454404" "BE" 2004
14 "AR30-50278659-4" "AR" "BE0400454404" "BE" 2005
14 "AR30-50278659-4" "AR" "BE0400454404" "BE" 2006
14 "AR30-50278659-4" "AR" "BE0400454404" "BE" 2007
14 "AR30-50278659-4" "AR" "BE0400454404" "BE" 2008
56 "AR30-63945397-5" "AR" "ESA28015865"  "ES" 2007
56 "AR30-63945397-5" "AR" "ESA28015865"  "ES" 2009
56 "AR30-63945397-5" "AR" "ESA28015865"  "ES" 2010
end

What I would like is a table with the number of subsidiaries per country, e.g. AR has 129 subsidiaries...

I have tried egen COUNT_SUB = count(NEWID), by(guo_bvdid COUNTRY_S), but this gives the number of observations for a subsidiary. I think that -egen- is correct in this case, but I don't know how to use it properly.

Thank you,

An
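
A minimal sketch of one way to count each subsidiary once per country, assuming BVDID uniquely identifies a subsidiary: the tag() function of -egen- marks exactly one observation per distinct combination of its arguments, so tabulating the tagged observations gives the number of subsidiaries rather than the number of subsidiary-years. The variable names one_per_sub and N_SUB are hypothetical.

```stata
* Flag exactly one observation per subsidiary-country pair
egen byte one_per_sub = tag(BVDID COUNTRY_S)

* Table of subsidiaries per country (each subsidiary counted once)
tabulate COUNTRY_S if one_per_sub

* Optionally, store the count on every observation of that country
egen N_SUB = total(one_per_sub), by(COUNTRY_S)
```

With the three subsidiaries in the example data, all located in AR, the table would show a single row for AR with a frequency of 3.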


I need to compare two groups using a Mann-Whitney U test. I usually run this test as |ranksum [depvar], by([grouping var])|, but the result gives me the p-value for a two-sided test. In the present case, I need the p-value for a one-sided test, and I cannot find any option that provides it. I know that ttest reports one-sided p-values, but the assumptions of that test are not met in my case, so I need to use a Mann-Whitney U test. Do you know of any option of the rank-sum test, or any other approach, that gives the p-value for the one-sided test?

Many thanks
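
One possibility, sketched here rather than offered as a definitive answer, is to compute the one-sided p-value from the z statistic that ranksum leaves behind in r(z), using the normal approximation. The names depvar and groupvar are placeholders. Which tail matches the hypothesis depends on which group ranksum treats as the first group, so the sign of z should be checked against the rank sums shown in the output.

```stata
* depvar and groupvar are placeholders for the actual variable names
ranksum depvar, by(groupvar)

* r(z) is the normal-approximation test statistic stored by ranksum
display "one-sided p (upper tail) = " %6.4f 1 - normal(r(z))
display "one-sided p (lower tail) = " %6.4f normal(r(z))
```

When z has the sign predicted by the one-sided alternative, the relevant one-sided p-value equals half the two-sided p-value reported by ranksum.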

I'm trying to create a single regression table that reports results for all observations and then for a subset of observations. I can't seem to find a way to do this with esttab or outreg2, and I'm unfamiliar with other ways to produce LaTeX files from Stata. The desired output looks something like this:

[image of the desired table not preserved]

Any ideas?

Thanks! Best,

Cecil

Thank you!

As a new member of Statalist, I have a question about closing a thread I started. The Statalist FAQ says:

Closing a thread you started is important, especially by reporting what solved your problem. You can then thank those who tried to help.

I have searched the website for a while but could not find out how to do that.

Best,

Tyrah

I want to find optimal weights for combining different forecasts by minimizing a quadratic form subject to some constraints. It's a quadratic programming problem of the form:

min_w (w' * S * w) subject to sum(w_{i})=1 and 0<=w_{i}<=1 for all i, where w is an m-x-1 vector.

I know that this could be solved in R using the quadprog package. However, my whole analysis is set up in Stata and I would prefer solving the problem in Stata. How can I do this?

Any pointers would be appreciated, thanks!

Boris