I ran an intraclass correlation in Stata (. icc) to assess test-retest reliability for my study participants (basically, they completed a similar set of questionnaires 1 week apart).

I then tried to use the prefix command bootstrap:

. bootstrap ICC = r(icc_i), reps(100) cluster(ID) : icc score ID visit

I ended up with this error message:

Bootstrap replications (10)

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5

xxxxxxxxxx

insufficient observations to compute bootstrap standard errors

no results will be saved

Wondering whether anyone can help me with this?

Thank you
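One possible cause, offered as an assumption rather than something verified on these data: when whole participants are resampled with cluster(ID), a participant drawn twice keeps the same ID, so icc no longer sees one observation per target and rater and every replication fails (the row of x's). A minimal sketch of the usual workaround is to let bootstrap create a fresh identifier with idcluster() and use that identifier inside icc. The name newID is hypothetical.

```stata
* Sketch only: newID is a hypothetical name for the variable that
* idcluster() creates so each resampled cluster gets a unique identifier.
bootstrap ICC = r(icc_i), reps(100) cluster(ID) idcluster(newID): ///
    icc score newID visit
```

If replications still fail, adding the noisily option to bootstrap should display the underlying error from icc for each replication.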

What I did not really understand is why, when appending one of my datasets, Stata shows a message (not in red, so not an error that stops the process) stating "var A was int now float". What does it mean? It happens with only one of the datasets that I appended, even though I coded and treated the variables in exactly the same way in all of them. Is it something I should be worried about?

Thanks a lot, G.

I am trying to graph the grades of students who are divided into two groups (control and treatment). One set of variables that I would like to graph (test scores) is divided into 11 categories (sections of the test) over 10 waves. So I would like to make 11 graphs, one for each section of the test, with the average score on the y-axis and the period on the x-axis. I want the two groups plotted as separate time series in each graph, with the standard error shown as whiskers around each point.

I also have data from a survey collected over 3 waves with over 60 different variables. So I would also like to make time series graphs as above but with these variables.

I hope that these graphs could be done using a loop, since 70+ graphs will take a long time to code individually.

Thank you for your help.
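A minimal sketch of one way such a loop could look, under stated assumptions: the variable names score1-score11, group (0 = control, 1 = treatment) and wave are all hypothetical and would need to be replaced with the real ones. Each pass collapses the data to group-by-wave means and standard errors, then draws connected lines with capped whiskers.

```stata
* Hypothetical names: score1-score11 (test sections), group (0/1), wave.
foreach v of varlist score1-score11 {
    preserve
    * Mean and standard error of the mean for each group in each wave
    collapse (mean) m = `v' (semean) se = `v', by(group wave)
    generate hi = m + se
    generate lo = m - se
    twoway (connected m wave if group == 0, sort)  ///
           (rcap hi lo wave if group == 0)         ///
           (connected m wave if group == 1, sort)  ///
           (rcap hi lo wave if group == 1),        ///
           legend(order(1 "Control" 3 "Treatment")) ///
           ytitle("Mean score") title("`v'") name(g_`v', replace)
    restore
}
```

The same loop should carry over to the survey variables by changing the varlist; the 3 waves there need no other change.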

I hope someone can help me.

I have obtained the means and medians of a variable for every month of the year (thus 12 means and 12 medians, January through December).

I want to perform a difference-in-means test and Mood's nonparametric median test to check whether January is different from all the other months.

I know how I can do it when I want to compare only two months, but how can I compare Jan with all other months of the year for both tests?

Thank you kindly in advance!
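A minimal sketch of one approach, assuming the observation-level data are still available (the tests cannot be run on the 12 summary values alone) and using hypothetical names: x for the variable of interest and month coded 1 to 12. An indicator for January turns the problem into a two-group comparison.

```stata
* Hypothetical names: x is the variable of interest, month is coded 1-12.
generate byte january = (month == 1) if !missing(month)

* Two-sample t test of means: January versus all other months pooled
ttest x, by(january)

* Nonparametric test of equality of medians across the same two groups
median x, by(january)
```

This pools the other 11 months into one group. Comparing January with each month separately would mean 11 tests and would call for some multiple-comparison adjustment.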


I have a str variable called 'gender', all observations are either 'male' or 'female'.

I want to gen a new int variable that auto-populates the observations as '1' for female and '2' for male.

Variations on a theme have included this

. generate female=0

. replace female = 1 if genderofpi=female

I assume actually those two commands can be more simply done in one, but a solution via any route would be very much appreciated!

Thanks.
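A minimal sketch of how this could be done in one line, assuming the string variable is named gender (as in the description; the attempt above uses genderofpi) and holds exactly "male" or "female". String comparisons need == and quotation marks around the value. The new variable names are hypothetical.

```stata
* 0/1 indicator: 1 if female, 0 if male, missing if gender is empty
generate byte female = (gender == "female") if !missing(gender)

* Alternative coding requested in the post: 1 = female, 2 = male
generate byte gender_num = cond(gender == "female", 1, ///
    cond(gender == "male", 2, .))
```

If the values contain stray spaces or capital letters, comparing against lower(trim(gender)) instead of gender may be safer.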

I recently explored the functions of -predictnl- and now prefer it over -predict- due to its flexibility.

One key feature appears to be the -variance()- option, which gives you a variance estimate for each predicted value. However, I have not yet understood exactly how Stata calculates these figures. Here's an example:

Code:

. clear

. input x y

             x          y
  1. 1 13
  2. 1 21
  3. 2 27
  4. 1 20
  5. 2 29
  6. 2 19
  7. 1 11
  8. 1 12
  9. 1 12
 10. 1 20
 11. end

. regress y x

      Source |       SS           df       MS      Number of obs   =        10
-------------+----------------------------------   F(1, 8)         =      8.40
       Model |  186.685714         1  186.685714   Prob > F        =    0.0199
    Residual |  177.714286         8  22.2142857   R-squared       =    0.5123
-------------+----------------------------------   Adj R-squared   =    0.4513
       Total |       364.4         9  40.4888889   Root MSE        =    4.7132

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   9.428571   3.252419     2.90   0.020     1.928479    16.92866
       _cons |   6.142857   4.483151     1.37   0.208    -4.195307    16.48102
------------------------------------------------------------------------------

. predictnl yp2 = predict(), variance(yvar)

. list *

     +------------------------------+
     | x    y        yp2       yvar |
     |------------------------------|
  1. | 1   13   15.57143   3.173469 |
  2. | 1   21   15.57143   3.173469 |
  3. | 2   27         25   7.404762 |
  4. | 1   20   15.57143   3.173469 |
  5. | 2   29         25   7.404762 |
     |------------------------------|
  6. | 2   19         25   7.404762 |
  7. | 1   11   15.57143   3.173469 |
  8. | 1   12   15.57143   3.173469 |
  9. | 1   12   15.57143   3.173469 |
 10. | 1   20   15.57143   3.173469 |
     +------------------------------+

The variance of $g(\theta, \mathbf{x}_i)$ is estimated by

$$\operatorname{Var}\{g(\theta, \mathbf{x}_i)\} = \mathbf{G}\mathbf{V}\mathbf{G}'$$

As a next step, I would like to create prediction (not confidence) intervals using the predicted variances. Is there a command that would do this for me, or would I need to do it by hand?

Best,

Jakob
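In this particular example the yvar values can be checked by hand: with a two-valued x, the fitted value is just the group mean, so its variance is the residual mean square divided by the group size, 22.2142857/7 = 3.173469 for the seven observations with x = 1 and 22.2142857/3 = 7.404762 for the three with x = 2. A prediction interval additionally needs the residual variance. The following is a minimal sketch, assuming a linear regression with homoskedastic errors; the new variable names are hypothetical.

```stata
* Continues from the regress and predictnl commands in the example above.
* Forecast variance = variance of the fitted value + residual variance.
generate double se_fcast = sqrt(yvar + e(rmse)^2)

* 95% prediction interval based on the t distribution with e(df_r) df
generate double pi_lower = yp2 - invttail(e(df_r), 0.025) * se_fcast
generate double pi_upper = yp2 + invttail(e(df_r), 0.025) * se_fcast

list x y yp2 pi_lower pi_upper
```

After regress, predict with the stdf option returns the same forecast standard error directly, which may be a useful cross-check.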

I'm working on a dataset that contains the variable OCCHISCO (Occupation of workers) that I converted to HISCLASS (occupational classification system).

So now I have occupation as a code number (from 1 to 12), but I still have untranslated occupations that I need to drop, as you can see below.

[Image: attachment not preserved]

How can I do this? I tried:

Code:

keep if occhisco == "1" | occhisco == "2" | occhisco == "3"| occhisco == "4"| occhisco == "5"| occhisco == "6"| occhisco == "7"| occhisco == "8"| occhisco == "9"| occhisco == "10"| occhisco == "11"| occhisco == "12"

r(109);

So I tried to add the value label but it dropped all my observations

Code:

keep if occhisco == "1":occhisco_lbl | ....
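A minimal sketch of what may be going on, stated as an assumption: r(109) is Stata's type-mismatch error, which suggests occhisco is a numeric variable with value labels, so it cannot be compared with strings such as "1". If the translated classes really are stored as the numbers 1 to 12, a numeric range check should be enough.

```stata
* Assumes occhisco is numeric (with value labels), as r(109) suggests.
* Inspect the mapping between numeric codes and label text first.
label list occhisco_lbl

* Keep only the 12 translated classes, assuming they are coded 1 to 12
keep if inrange(occhisco, 1, 12)
```

The "text":label syntax looks up the label text, not the numeric code, so "1":occhisco_lbl returns missing unless some label reads exactly "1". That would explain why every observation was dropped.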

I would like to ask the following:

My model is initially OLS, with poverty incidence as the dependent variable and a vector of predictors.

I have been advised to consider transforming poverty incidence into deciles and using that as Y. So far, ordered logit looks like it could fit, with 10 categories representing the deciles. However, in what I have read, dependent variables in ologit usually have only about three categories; I have not seen 10. Is there a limit to the number of categories of the dependent (LHS) variable?

Also, are there alternative models in which I could use the decile version of a continuous variable as the dependent variable?

Thank you very much.


I am working with survey data to find the determinants of labour force participation. When I run the logistic regression I get the following error message:

an error occurred when svy executed logistic

r(2000);

Stata says the error is due to no observations on which to perform the statistical calculation. However this is not the case, as I do not have any missing observations in my dataset.

Below is the code I used.

Code:

svy: logistic workedinlast12months i.employstatus headofhousehold i.age i.married i.educ i.firstbirthage i.childrenaged5andunder v213 i.firstmarriageage i.region urban

P.S- I had to eliminate some of my variables in order to post here, as it initially exceeded the linesize limit.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str15 caseid float(workedinlast12months employstatus headofhousehold age firstmarriageage childrenaged5andunder educ married urban) byte mom_once float(survey weight clusters strata surveyyear)
" 2162 2 1" 1 3 1 7 1 2 1 2 0 1 1 1.275161 1 2 1
" 3 86 1 1" 1 3 1 6 2 2 1 3 0 1 1 1.275161 2 2 1
" 5192 1 1" 1 3 1 7 1 1 1 3 0 1 1 1.275161 3 2 1
" 12 14 2 1" 1 3 1 4 1 2 1 3 0 1 1 1.275161 4 2 1
" 12 63 1 1" 1 3 1 5 1 2 1 3 0 1 1 1.275161 4 2 1
end
label values clusters clusters
label def clusters 1 "1 2", modify
label def clusters 2 "1 3", modify
label def clusters 3 "1 5", modify
label def clusters 4 "1 12", modify
label values strata strata
label def strata 2 "1 north/ extreme north/ ad rural", modify

Thank you in advance
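A minimal diagnostic sketch, assuming the r(2000) arises because no observation is complete on all model variables at once (each variable can have few missing values while no row has all of them nonmissing). The variable list is copied from the svy: logistic command above; n_missing is a hypothetical name.

```stata
* Number of model variables that are missing in each observation
egen byte n_missing = rowmiss(workedinlast12months employstatus        ///
    headofhousehold age married educ firstbirthage                      ///
    childrenaged5andunder v213 firstmarriageage region urban)

* Observations usable in the estimation: complete on every model variable
count if n_missing == 0

* Which variables contribute the missing values
misstable summarize workedinlast12months employstatus headofhousehold  ///
    age married educ firstbirthage childrenaged5andunder v213          ///
    firstmarriageage region urban
```

If the count is zero, variables that are only defined for a subgroup (for example age at first birth or first marriage) would be the first place to look.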

I have got three clusters from my data and I would like to turn these into dummy variables. For example if a country is part of the first cluster or the second etc.

Wondering what would be the easiest way to do this?

Any help would be wonderful!

N
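A minimal sketch, assuming cluster membership is stored in a single variable with values 1, 2 and 3. The name cluster is hypothetical.

```stata
* Hypothetical name: cluster holds the cluster membership (1, 2 or 3).
* Creates one 0/1 indicator per category: cluster_1, cluster_2, cluster_3
tabulate cluster, generate(cluster_)
```

For use in a regression, factor-variable notation (i.cluster) avoids creating the dummies at all.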

I was running a regression of CAR (cumulative abnormal return) for delisting firms.

CAR ranges from a minimum of -1200% to a maximum of 3300%.

So I tried a t test:

ttest car == 0

The T-value is 0.000.

Is this right?

What should I do?

thanks


I am using the following command to generate a graph:

twoway lfitci WorkQuantity WageOffer_PairedEmployer || scatter WorkQuantity WageOffer_PairedEmployer

The outcome is the following (after changing some colours and layouts):

[Image: graph not preserved]

As you can see, the zero of the y-axis is not exactly at the bottom left of the graph, but slightly higher. This is annoying because I am comparing multiple similar graphs and this one looks misleading at a quick glance. I tried using the Graph Editor to correct the axis, but nothing changes. I presume it is something Stata does automatically because it plots the regression line, but

Many thanks in advance,

Bill
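A minimal sketch of two options that may help, offered as assumptions about the cause rather than a confirmed fix: by default twoway leaves a small margin between the plotted data and the axes, and the y-axis range follows the data, including the confidence band drawn by lfitci.

```stata
* plotregion(margin(zero)) removes the gap between the plot and the axes;
* yscale(range(0)) asks for the y-axis range to include zero.
twoway lfitci WorkQuantity WageOffer_PairedEmployer           ///
    || scatter WorkQuantity WageOffer_PairedEmployer,         ///
    plotregion(margin(zero)) yscale(range(0))
```

If the confidence band dips below zero, the axis will still extend below zero, because yscale(range()) can widen the range but not narrow it.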


DateBecame: 19550101

My other variable

DateLeft: 19961001

Thanks in advance

Code:

```
logit ind i.factor continuous
contrast factor#c.continuous
Contrasts of marginal linear predictions
Margins : asbalanced
------------------------------------------------------------
| df chi2 P>chi2
-------------------------+----------------------------------
factor#c.continuous | (omitted)
------------------------------------------------------------
```
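One reading of the output above, stated as an assumption: the contrast is reported as (omitted) because the term factor#c.continuous is not part of the fitted model, which only contains the main effects. A minimal sketch that includes the interaction in the model before testing it:

```stata
* ## expands to both main effects plus the factor-by-continuous interaction
logit ind i.factor##c.continuous

* Joint test of the interaction terms
contrast factor#c.continuous
```

This assumes the interaction is what the test was meant to address; if only the main effect of factor is of interest, contrast factor after the original model should work as is.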