<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
	<channel>
		<title>Statalist - Forums for Discussing Stata</title>
		<link>https://www.statalist.org/forums/</link>
		<description>Talk about all things related to Stata</description>
		<language>en</language>
		<lastBuildDate>Thu, 25 Jun 2026 01:55:14 GMT</lastBuildDate>
		<generator>vBulletin</generator>
		<ttl>60</ttl>
		<image>
			<url>images/misc/rss.png</url>
			<title>Statalist - Forums for Discussing Stata</title>
			<link>https://www.statalist.org/forums/</link>
		</image>
		<item>
			<title>nbreg DiD: pooled post-period AME outside range of year-specific AMEs from separate model specifications</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786404-nbreg-did-pooled-post-period-ame-outside-range-of-year-specific-ames-from-separate-model-specifications</link>
			<pubDate>Wed, 24 Jun 2026 20:50:52 GMT</pubDate>
			<description>I am running a difference-in-differences analysis with a negative binomial model and report three sets of estimates: a Year 1 effect, a Year 2+...</description>
			<content:encoded>I am running a difference-in-differences analysis with a negative binomial model and report three sets of estimates: a Year 1 effect, a Year 2+ effect, and a separate pooled post-period effect from an independent regression. In several subgroups, the pooled estimate falls outside the range defined by the Year 1 and Year 2+ estimates. Is this expected behavior when the pooled and period-specific estimates come from separate model specifications with different treatment indicators? Are there published examples in the health policy literature where this pattern is documented or discussed?</content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Shin Lee</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786404-nbreg-did-pooled-post-period-ame-outside-range-of-year-specific-ames-from-separate-model-specifications</guid>
		</item>
		<item>
			<title>New -mixedpower- package for calculating power and sample size analytically for linear mixed and marginal models available from SSC</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786402-new-mixedpower-package-for-calculating-power-and-sample-size-analytically-for-linear-mixed-and-marginal-models-available-from-ssc</link>
			<pubDate>Wed, 24 Jun 2026 18:45:05 GMT</pubDate>
			<description>Dear Statalist users, 
 
With thanks to Kit Baum, I would like to introduce a new package mixedpower. As the title suggests, the eponymous program...</description>
			<content:encoded><![CDATA[Dear Statalist users,<br />
<br />
With thanks to Kit Baum, I would like to introduce a new package <b>mixedpower</b>. As the title suggests, the eponymous program calculates power and sample size analytically for linear mixed models, typically for use in planning an RCT with longitudinal continuous outcomes.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code"> ssc install mixedpower</pre>
</div>There is flexibility in specifying the treatment effects, even allowing user-specified functions of the schedule time list (which needn't actually represent time), as well as the random effects or within-subject error structure. These aspects can even differ between treatment and control groups.<br />
<br />
All variance parameter inputs may be entered manually or instead read in 'automatically' from a 'suitable' mixed model in memory, saving time and reducing the risk of a mistake.<br />
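<br />
As a minimal sketch of what a 'suitable' mixed model in memory might look like, the code below uses only standard Stata commands (not <b>mixedpower</b> itself) to fit a random-intercept, random-slope model to the pig data and display the variance parameters it estimates. The model is chosen purely for illustration and is an assumption, not one taken from the help files.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">webuse pig, clear
* illustrative model (an assumption): random intercept and slope, unstructured covariance
mixed weight week || id: week, covariance(unstructured) reml
* display the estimated random-effects covariance matrix
estat recovariance
* display the implied within-subject correlation matrix and standard deviations
estat wcorrelation</pre>
</div>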
<br />
One may calculate power/sample size accounting for both a) the amount of longitudinal data collected at a given timepoint due to staggered recruitment and b) dropout, simultaneously. Dropout rates may differ between control and treatment groups, as can the allocation ratio.<br />
<br />
One may also estimate power in situations where the nature of the treatment effect is mis-specified. For example, what is the power when you assume a proportionate slope effect for the treatment group in the analysis model, if the true treatment effect in fact changes non-linearly over time? <b>mixedpower</b> will also provide the resulting slope effect estimate.<br />
<br />
Additional programs in the package calculate power and sample size for 1) a multivariate mixed model (<b>mvmixedpower</b>), when you might want to synthesise multiple continuous outcomes to increase power, especially for an interim analysis, and 2) a mixed model for 'directly measured' difference data (<b>dmmixedpower</b>). These programs come with fewer features.<br />
<br />
The help files are unapologetically extensive and come with lots of examples. Here are a few:<br />
<br />
1) First load Stata's pig dataset, create a fake treatment group, and shift week so that it starts at time zero. If we fit an unstructured marginal model, sometimes known as a mixed model for repeated measures (MMRM), to the first 5 measures (if week&lt;=4), we may calculate power for a similarly scheduled trial with n=100, testing just the final difference effect (which equals 2), whilst automatically loading all the variance parameters.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">webuse pig , clear
gen trt=id&gt;=25
replace week=week-1
mixed weight i.week i.week#1.trt if week&lt;=4 || id: ,nocons resid(unstr, t(week)) reml
mixedpower, trtspec(factor) sched(0 1 2 3 4) altcont(factor) diff(0.5(0.5)2) lctest(0 0 0 1) n(100) marginal errxt(auto) nohead</pre>
</div>This will give the following output, including the mixed model syntax for the implied analysis.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">
Mixed model syntax:
constraint 1 _b[0.time#1.trt]=0
mixed depvar i.time   i.time#1.trt   , constraints(1) || id_level2: , nocons resid(unstructured , t(time))
-------------------------------------------------------------------------------------------------------------------------

Calculating power for a 2-level mixed model with factor treatment effect parameterisation:
  visit schedule      = 0 1 2 3 4
  treatment effect(s) = 0.5 1 1.5 2
  alpha               = 0.050
  total sample size   = 100
  n in control arm    = 50
  n in treatment arm  = 50
  power               = 0.9147</pre>
</div>2) An example emphasising the generalisability of <b>mixedpower</b> by recreating the result of Stata's own power command for a cluster-randomised trial (output suppressed):<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">power twomeans 0 0.4, power(0.9) m1(25) m2(25) sd(2) rho(0.1) kratio(2)
mixedpower, schedule(1(1)25) trtspec(intercept) altcont(noslope) difference(0.4) marginal errxt(input(exchangeable 4 0.1)) power(0.9) arat(1 2)</pre>
</div>3) An example incorporating partial follow-up due to both staggered recruitment and dropout. Note, the output will also give the number of subjects reaching each visit of the schedule list:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">mixedpower, trtspec(slope) schedule(0 1 2 3 4) diff(0.5) cov(10, 1\1, 2) error(10) n(500) alpha(0.1) strec(0.05 0.1 0.15 0.2 0.4) drop(0.1 0.05 0.05 0.05 0.75)</pre>
</div>with output...<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">-------------------------------------------------------------------------------------------------------------------------
Mixed model syntax:
mixed depvar c.time   c.time#1.trt    || id_level2: time , cov(unstr)  resid(independent , t(time))
-------------------------------------------------------------------------------------------------------------------------

Calculating power for a 2-level mixed model with slope treatment effect parameterisation:
  visit schedule      = 0 1 2 3 4
  treatment effect(s) = 0.5
  alpha               = 0.100
  total sample size   = 500
  n in control arm    = 250
  n in treatment arm  = 250
  power               = 0.7718

Table of control and treatment group numbers (rounded) reaching each visit:
           |  visit 1  visit 2  visit 3  visit 4  visit 5
           |  time=0   time=1   time=2   time=3   time=4
-----------+-------------------------------------------------
control    |  225      191      159      120      75  
treatment  |  225      191      159      120      75</pre>
</div>4) An example employing user-supplied functions. Linear slopes are assumed for both groups in the analysis, but in fact the disease progression shows an 'early decline' followed by a long plateau from about year 2 to the end, and the treatment effect is actually proportional to this complex function of time. This example recreates a result from a simulation study by Morgan et al.**<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">mixedpower, trtspec(slope) sched(0(1)5) diff(-0.05) cov(0.5, .0354\.0354, 0.01) error(0.15) n(230) actualcont(user(-5*exp(-2*x)+5)) cbeta(6 0.2) actualtrt(user(-5*exp(-2*x)+5)) nosyn</pre>
</div>Output:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">Calculating power for a 2-level mixed model with slope treatment effect parameterisation but with actual user treatment effect:
  visit schedule      = 0 1 2 3 4 5
  treatment effect(s) = -0.05
  alpha               = 0.050
  total sample size   = 230
  n in control arm    = 115
  n in treatment arm  = 115
  power               = 0.5287</pre>
</div>Please feel free to ask questions about the package, or to report any bugs you spot, either here or by email.<br />
<br />
Matthew Burnell<br />
MRC Centre of Research Excellence in Clinical Trial Innovation<br />
University College London<br />
London, UK<br />
<a href="mailto:m.burnell@ucl.ac.uk">m.burnell@ucl.ac.uk</a><br />
<br />
<br />
** Katy E. Morgan, Ian R. White, Chris Frost. How important is the linearity assumption in a sample size calculation for a randomised controlled trial where treatment is anticipated to affect a rate of change? BMC Medical Research Methodology (2023) 23:274 <a href="https://doi.org/10.1186/s12874-023-02093-2" target="_blank">https://doi.org/10.1186/s12874-023-02093-2</a>]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Matthew Burnell</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786402-new-mixedpower-package-for-calculating-power-and-sample-size-analytically-for-linear-mixed-and-marginal-models-available-from-ssc</guid>
		</item>
		<item>
			<title>Data Cleaning</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786386-data-cleaning</link>
			<pubDate>Tue, 23 Jun 2026 11:37:28 GMT</pubDate>
			<description>I am trying to run panel regression. Variables in sample have different number of observations starting from 17000 to 24000. I want to use maximum...</description>
			<content:encoded>I am trying to run a panel regression. The variables in my sample have different numbers of observations, ranging from 17000 to 24000. I want to use the maximum number of observations but have the same number of observations in all the tested models. I tried the drop command, but it reduces the sample to only a few thousand observations. How do I ensure the same number of observations across models while keeping as many as possible?</content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>aima khan</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786386-data-cleaning</guid>
		</item>
		<item>
			<title>collect all results into a single table using collect</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786382-collect-all-results-into-a-single-table-using-collect</link>
			<pubDate>Mon, 22 Jun 2026 15:11:11 GMT</pubDate>
			<description>I am using stata 18.5 for windows 
 
I have the following code. I want to see all the results of the loop into a single stacked table but only get...</description>
			<content:encoded><![CDATA[I am using Stata 18.5 for Windows.<br />
<br />
I have the following code. I want to see all the results of the loop in a single stacked table, but I only get the result from the last iteration.<br />
I've tried multiple variations of the code and cannot figure it out.<br />
<br />
collect clear<br />
svyset [pweight=weightvar]<br />
<br />
foreach var in var list gender married etc... {<br />
<br />
collect _r_b _r_ci: svy: mean hours_worked, over(year_group `var')<br />
<br />
collect style cell result[_r_b _r_ci], nformat(%4.1f)<br />
collect layout (`var'#result) (year_group#cmdset)<br />
}<br />
<br />
collect style head cmdset, title(hide) level(hide)<br />
collect preview<br />
<br />
<br />
<br />
 ]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Debbie Burke</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786382-collect-all-results-into-a-single-table-using-collect</guid>
		</item>
		<item>
			<title>syntax model GBTM</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786375-syntax-model-gbtm</link>
			<pubDate>Mon, 22 Jun 2026 05:34:39 GMT</pubDate>
			<description>Good morning to everybody  I have 4 variables measured at 3 timepoints: 12 months, 18 months, and 24 months. Is the syntax for choosing the GBTM...</description>
			<content:encoded><![CDATA[<br />
 Good morning everybody. I have 4 variables measured at 3 time points: 12 months, 18 months, and 24 months. Is the syntax for choosing the GBTM model correct? I have 106 adults (at least two measurements per outcome).<br />
 Censoring limits were defined outcome by outcome according to the observed empirical range, with a small margin beyond the extremes, as no unambiguous theoretical limits were available for these standardized variables.<br />
 Based on the data structure and on criteria of parsimony, stability, and interpretability, with three time points the search was limited to polynomial orders 0 and 1, without exploring quadratic terms. Although a quadratic specification is technically possible with three measurement occasions, it is often poorly informative and potentially unstable, especially with multiple outcomes and a small sample size. What do you think? Thanks in advance to everybody.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">**************************************************** 
* GBTM MULTI-OUTCOME (4 outcomes, cnorm)
* FINAL OPERATIONAL VERSION
* CORRECT VERSION: pass uses OCC_pp, also checks TotProb and adds diagnostic entropy
* Consistent with: Klijn + Nagin multitrajectory + recent review
*
* LOGIC:
* STEP 0  = preliminary univariate exploration of individual outcomes
* STEP 1  = choice of K in the multi-outcome model with equal initial order
* STEP 2  = fixed K, structured comparison of all plausible 0/1 models
* STEP 2B = refit/inspection of finalist models
*
* CRITERIA:
* - BIC = primary criterion
* - APPA / OCC_pp / minP / minTotProb / mismatch = adequacy/support criteria
* - relative entropy = additional diagnostic of assignment clarity; NOT included in the pass
* - absolute number of groups = descriptive; does NOT qualify
* - DELTABIC &lt;= 2 = competing models
* - final decision = BIC + parsimony + interpretability + classification diagnostics
* - with 3 time points and MAXORDER=1, the possible orders are 0/1
****************************************************

clear all
set more off
set seed 12345
set sortseed 12345

cd &quot;C:\Users\xxxxxxxxxxx\Desktop\LCA_prova&quot;

global DATAFILE &quot;databasex.dta&quot;
global IDVAR    &quot;id&quot;

****************************************************
* OUTCOME
****************************************************
global VAR1 &quot;var1_12 var1_18 var1_24&quot;
global VAR2 &quot;var2_12 var2_18 var2_24&quot;
global VAR3 &quot;var3_12 var3_18 var3_24&quot;
global VAR4 &quot;var4_12 var4_18 var4_24&quot;


****************************************************
* CNORM RANGE - OPTIMIZED ON EMPIRICAL DATA
* Expanded outward to ensure numerical stability and avoid artificial clipping
****************************************************
global MIN1 -8
global MAX1  6
global MIN2 -9
global MAX2 13
global MIN3 -15
global MAX3  11
global MIN4 -3
global MAX4  9

****************************************************
* SEARCH
****************************************************
global MAXK       2
global MAXORDER   1
global STARTORDER 1
global NREFIT     5

****************************************************
* ADEQUACY THRESHOLDS
****************************************************
global THR_MINP        0.05   // minimum assigned proportion of the group
global THR_MINTOTPROB  0.05   // minimum estimated proportion from posterior probabilities
global THR_MINAPP      0.70   // minimum average posterior probability
global THR_MINOCC      5      // minimum OCC_pp
global THR_MAXMIS      0.05   // maximum mismatch

global DELTABIC    2

****************************************************
* PROGRAM: create times
****************************************************
capture program drop make_time
program define make_time
    capture drop t1 t2 t3
    gen t1 = 0
    gen t2 = 1
    gen t3 = 2
end

****************************************************
* PROGRAM: post-traj statistics
****************************************************
capture program drop gbtm_stats
program define gbtm_stats, rclass
    syntax , K(integer)

    capture drop Mp countG counter APP p n d OCC TotProb mismatch d_pp OCC_pp SD_post __sdtmp

    gen double Mp = 0
    foreach pr of varlist _traj_ProbG* {
        replace Mp = `pr' if `pr' &gt; Mp
    }

    sort _traj_Group
    by _traj_Group: gen countG  = _N
    by _traj_Group: gen counter = _n
    by _traj_Group: egen double APP = mean(Mp)

    gen double p = countG/_N

    gen double TotProb = .
    forvalues gg = 1/`k' {
        quietly summarize _traj_ProbG`gg', meanonly
        replace TotProb = r(mean) if _traj_Group == `gg'
    }

    gen double mismatch = abs(TotProb - p)

    gen double OCC = .
    gen double OCC_pp = .
    if `k' &gt; 1 {
        gen double n = APP/(1-APP)
        gen double d = p/(1-p)
        replace OCC = n/d
        gen double d_pp = TotProb/(1-TotProb)
        replace OCC_pp = n/d_pp
    }
    else {
        replace OCC    = 999
        replace OCC_pp = 999
    }

    * SAFEGUARD: prevents Stata from failing if a group contains a single record (SD cannot be computed)
    gen double SD_post = .
    forvalues gg = 1/`k' {
        capture by _traj_Group: egen double __sdtmp = sd(_traj_ProbG`gg') if _traj_Group == `gg'
        if !_rc {
            replace SD_post = __sdtmp if _traj_Group == `gg'
            drop __sdtmp
        }
    }

    * Relative entropy (0-1)
    tempvar __hsum __plnp
    local entropy = 1
    if `k' &gt; 1 {
        gen double `__hsum' = 0
        forvalues gg = 1/`k' {
            gen double `__plnp' = cond(_traj_ProbG`gg' &gt; 0, _traj_ProbG`gg' * ln(_traj_ProbG`gg'), 0)
            replace `__hsum' = `__hsum' + `__plnp'
            drop `__plnp'
        }
        quietly summarize `__hsum', meanonly
        local entropy = 1 + (r(sum) / (_N * ln(`k')))
    }

    preserve
        keep if counter == 1
        quietly summarize APP, meanonly
        local minAPP  = r(min)
        local meanAPP = r(mean)
        quietly summarize p, meanonly
        local minP = r(min)
        quietly summarize TotProb, meanonly
        local minTotProb = r(min)
        quietly summarize mismatch, meanonly
        local maxMismatch = r(max)
        quietly summarize OCC, meanonly
        local minOCC = r(min)
        quietly summarize OCC_pp, meanonly
        local minOCCpp = r(min)
    restore

    local pass = (`minP' &gt;= $THR_MINP) &amp; ///
                 (`minTotProb' &gt;= $THR_MINTOTPROB) &amp; ///
                 (`minAPP' &gt;= $THR_MINAPP) &amp; ///
                 (`minOCCpp' &gt;= $THR_MINOCC) &amp; ///
                 (`maxMismatch' &lt;= $THR_MAXMIS)

    return scalar minAPP      = `minAPP'
    return scalar meanAPP     = `meanAPP'
    return scalar minP        = `minP'
    return scalar minTotProb  = `minTotProb'
    return scalar maxMismatch = `maxMismatch'
    return scalar entropy     = `entropy'
    return scalar minOCC      = `minOCC'
    return scalar minOCCpp    = `minOCCpp'
    return scalar pass        = `pass'
end
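
* ---------------------------------------------------------------
* Worked example of two diagnostics computed in gbtm_stats above.
* The numbers are hypothetical, chosen only to illustrate the formulas;
* they are not results from the data.
* ---------------------------------------------------------------
* OCC_pp for a group with APP = 0.90 and estimated share TotProb = 0.30:
*   (0.90/0.10) / (0.30/0.70) = 21, which is above THR_MINOCC = 5
di &quot;OCC_pp example  = &quot; (0.90/(1 - 0.90)) / (0.30/(1 - 0.30))
* Relative entropy for K = 2 if every subject had posteriors (0.90, 0.10):
*   1 + (0.90*ln(0.90) + 0.10*ln(0.10)) / ln(2), roughly 0.53
di &quot;entropy example = &quot; 1 + (0.90*ln(0.90) + 0.10*ln(0.10)) / ln(2)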

****************************************************
* TEMPORARY FILES
****************************************************
tempfile phase0tmp step1tmp step2tmp finalists4 step2ranked

****************************************************
* PHASE 0: PRELIMINARY UNIVARIATE EXPLORATION
****************************************************
tempname h0
capture postclose `h0'
postfile `h0' str8 outcome int K str20 orders ///
    double ll aic bic minAPP minOCC minOCCpp minP minTotProb maxMismatch entropy pass ///
    using `phase0tmp', replace

forvalues vv = 1/4 {
    forvalues k = 1/$MAXK {
        use &quot;$DATAFILE&quot;, clear
        sort $IDVAR, stable
        make_time
        local indep t1 t2 t3

        local oo &quot;&quot;
        forvalues g = 1/`k' {
            local oo &quot;`oo' $STARTORDER&quot;
        }
        local oo : list retok oo

        quietly capture traj, ///
            var(${VAR`vv'}) indep(`indep') order(`oo') model(cnorm) min(${MIN`vv'}) max(${MAX`vv'})
        if _rc continue

        quietly gbtm_stats, k(`k')
        post `h0' (&quot;VAR`vv'&quot;) (`k') (&quot;`oo'&quot;) (e(ll)) (e(AIC)) (e(BIC_n_subjects)) ///
            (r(minAPP)) (r(minOCC)) (r(minOCCpp)) ///
            (r(minP)) (r(minTotProb)) (r(maxMismatch)) (r(entropy)) (r(pass))
    }
}
postclose `h0'

use `phase0tmp', clear
save phase0_univariate_scan_4var.dta, replace

****************************************************
* STEP 1: choice of K in multi-outcome
****************************************************
tempname h1
capture postclose `h1'
postfile `h1' ///
    str5 stage int K str20 o1 str20 o2 str20 o3 str20 o4 ///
    int group nG ///
    double p TotProb APP OCC OCC_pp mismatch SD_post ///
    double ll aic bic minAPP meanAPP minOCC minOCCpp minP minTotProb maxMismatch entropy pass ///
    using `step1tmp', replace

forvalues k = 1/$MAXK {
    use &quot;$DATAFILE&quot;, clear
    sort $IDVAR, stable
    make_time
    local indep t1 t2 t3

    local o1 &quot;&quot;
    local o2 &quot;&quot;
    local o3 &quot;&quot;
    local o4 &quot;&quot;
    forvalues g = 1/`k' {
        local o1 &quot;`o1' $STARTORDER&quot;
        local o2 &quot;`o2' $STARTORDER&quot;
        local o3 &quot;`o3' $STARTORDER&quot;
        local o4 &quot;`o4' $STARTORDER&quot;
    }
    local o1 : list retok o1
    local o2 : list retok o2
    local o3 : list retok o3
    local o4 : list retok o4

    quietly capture traj, multgroups(`k') ///
        var1($VAR1) indep1(`indep') order1(`o1') model1(cnorm) min1($MIN1) max1($MAX1) ///
        var2($VAR2) indep2(`indep') order2(`o2') model2(cnorm) min2($MIN2) max2($MAX2) ///
        var3($VAR3) indep3(`indep') order3(`o3') model3(cnorm) min3($MIN3) max3($MAX3) ///
        var4($VAR4) indep4(`indep') order4(`o4') model4(cnorm) min4($MIN4) max4($MAX4)
    if _rc continue

    quietly gbtm_stats, k(`k')
    local ll  = e(ll)
    local aic = e(AIC)
    local bic = e(BIC_n_subjects)
    local minAPP      = r(minAPP)
    local meanAPP     = r(meanAPP)
    local minOCC      = r(minOCC)
    local minOCCpp    = r(minOCCpp)
    local maxMismatch = r(maxMismatch)
    local entropy     = r(entropy)
    local minP        = r(minP)
    local minTotProb  = r(minTotProb)
    local pass        = r(pass)

    forvalues gg = 1/`k' {
        quietly summarize countG if _traj_Group == `gg', meanonly
        local nG = r(mean)
        quietly summarize p if _traj_Group == `gg', meanonly
        local pg = r(mean)
        quietly summarize TotProb if _traj_Group == `gg', meanonly
        local tpg = r(mean)
        quietly summarize APP if _traj_Group == `gg', meanonly
        local appg = r(mean)
        quietly summarize OCC if _traj_Group == `gg', meanonly
        local occg = r(mean)
        quietly summarize OCC_pp if _traj_Group == `gg', meanonly
        local occppg = r(mean)
        quietly summarize mismatch if _traj_Group == `gg', meanonly
        local misg = r(mean)
        
        local sdg = .
        quietly count if _traj_Group == `gg'
        if r(N) &gt; 1 {
            quietly summarize SD_post if _traj_Group == `gg', meanonly
            local sdg = r(mean)
        }

        post `h1' (&quot;STEP1&quot;) (`k') (&quot;`o1'&quot;) (&quot;`o2'&quot;) (&quot;`o3'&quot;) (&quot;`o4'&quot;) ///
            (`gg') (`nG') (`pg') (`tpg') (`appg') (`occg') (`occppg') (`misg') (`sdg') ///
            (`ll') (`aic') (`bic') (`minAPP') (`meanAPP') (`minOCC') (`minOCCpp') ///
            (`minP') (`minTotProb') (`maxMismatch') (`entropy') (`pass')
    }
}
postclose `h1'

use `step1tmp', clear
egen byte tagmodel = tag(K o1 o2 o3 o4)
keep if tagmodel
drop tagmodel
save step1_kselection_4var.dta, replace

gsort -pass -bic
count if pass == 1
if r(N) &gt; 0 {
    keep if pass == 1
    gsort -bic
}
else {
    gsort -bic
}
quietly summarize K in 1, meanonly
local BESTK = r(min)
di as result &quot;Selected K = `BESTK'&quot;

****************************************************
* STEP 2: STRUCTURED SEARCH (Safe Combinatorial Logic)
****************************************************
tempname h2
capture postclose `h2'
postfile `h2' ///
    str5 stage int K str20 o1 str20 o2 str20 o3 str20 o4 ///
    int group nG ///
    double p TotProb APP OCC OCC_pp mismatch SD_post ///
    double ll aic bic minAPP meanAPP minOCC minOCCpp minP minTotProb maxMismatch entropy pass ///
    using `step2tmp', replace

local k = `BESTK'
local base = $MAXORDER + 1
local ncomb = `base'^`k'
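
* Illustration of the index-to-orders decoding used in the nested loops below
* (a sketch; the demo_* names are hypothetical and not used elsewhere).
* Each index from 1 to ncomb is read as a k-digit number in base `base',
* one digit per group. For example, with base = 2 and k = 2 the indices
* 1, 2, 3, 4 decode to the order lists &quot;0 0&quot;, &quot;0 1&quot;, &quot;1 0&quot;, &quot;1 1&quot;.
forvalues demo_i = 1/`ncomb' {
    local demo_o &quot;&quot;
    forvalues g = 1/`k' {
        local demo_div   = `base'^(`k' - `g')
        local demo_digit = mod(int((`demo_i' - 1)/`demo_div'), `base')
        local demo_o &quot;`demo_o' `demo_digit'&quot;
    }
    local demo_o : list retok demo_o
    di as text &quot;index `demo_i' gives orders: `demo_o'&quot;
}
* Four outcomes are crossed, so the search below fits ncomb^4 models
di as text &quot;models to fit in STEP 2 = &quot; `ncomb'^4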

forvalues i1 = 1/`ncomb' {
    local o1 &quot;&quot;
    forvalues g = 1/`k' {
        local div   = `base'^(`k' - `g')
        local digit = mod(int((`i1' - 1)/`div'), `base')
        local o1 &quot;`o1' `digit'&quot;
    }
    local o1 : list retok o1

    forvalues i2 = 1/`ncomb' {
        local o2 &quot;&quot;
        forvalues g = 1/`k' {
            local div   = `base'^(`k' - `g')
            local digit = mod(int((`i2' - 1)/`div'), `base')
            local o2 &quot;`o2' `digit'&quot;
        }
        local o2 : list retok o2

        forvalues i3 = 1/`ncomb' {
            local o3 &quot;&quot;
            forvalues g = 1/`k' {
                local div   = `base'^(`k' - `g')
                local digit = mod(int((`i3' - 1)/`div'), `base')
                local o3 &quot;`o3' `digit'&quot;
            }
            local o3 : list retok o3

            forvalues i4 = 1/`ncomb' {
                local o4 &quot;&quot;
                forvalues g = 1/`k' {
                    local div   = `base'^(`k' - `g')
                    local digit = mod(int((`i4' - 1)/`div'), `base')
                    local o4 &quot;`o4' `digit'&quot;
                }
                local o4 : list retok o4

                use &quot;$DATAFILE&quot;, clear
                sort $IDVAR, stable
                make_time
                local indep t1 t2 t3

                quietly capture traj, multgroups(`k') ///
                    var1($VAR1) indep1(`indep') order1(`o1') model1(cnorm) min1($MIN1) max1($MAX1) ///
                    var2($VAR2) indep2(`indep') order2(`o2') model2(cnorm) min2($MIN2) max2($MAX2) ///
                    var3($VAR3) indep3(`indep') order3(`o3') model3(cnorm) min3($MIN3) max3($MAX3) ///
                    var4($VAR4) indep4(`indep') order4(`o4') model4(cnorm) min4($MIN4) max4($MAX4)
                if _rc continue

                quietly gbtm_stats, k(`k')
                local ll  = e(ll)
                local aic = e(AIC)
                local bic = e(BIC_n_subjects)
                local minAPP      = r(minAPP)
                local meanAPP     = r(meanAPP)
                local minOCC      = r(minOCC)
                local minOCCpp    = r(minOCCpp)
                local maxMismatch = r(maxMismatch)
                local entropy     = r(entropy)
                local minP        = r(minP)
                local minTotProb  = r(minTotProb)
                local pass        = r(pass)

                forvalues gg = 1/`k' {
                    quietly summarize countG if _traj_Group == `gg', meanonly
                    local nG = r(mean)
                    quietly summarize p if _traj_Group == `gg', meanonly
                    local pg = r(mean)
                    quietly summarize TotProb if _traj_Group == `gg', meanonly
                    local tpg = r(mean)
                    quietly summarize APP if _traj_Group == `gg', meanonly
                    local appg = r(mean)
                    quietly summarize OCC if _traj_Group == `gg', meanonly
                    local occg = r(mean)
                    quietly summarize OCC_pp if _traj_Group == `gg', meanonly
                    local occppg = r(mean)
                    quietly summarize mismatch if _traj_Group == `gg', meanonly
                    local misg = r(mean)
                    
                    local sdg = .
                    quietly count if _traj_Group == `gg'
                    if r(N) &gt; 1 {
                        quietly summarize SD_post if _traj_Group == `gg', meanonly
                        local sdg = r(mean)
                    }

                    post `h2' (&quot;STEP2&quot;) (`k') (&quot;`o1'&quot;) (&quot;`o2'&quot;) (&quot;`o3'&quot;) (&quot;`o4'&quot;) ///
                        (`gg') (`nG') (`pg') (`tpg') (`appg') (`occg') (`occppg') (`misg') (`sdg') ///
                        (`ll') (`aic') (`bic') (`minAPP') (`meanAPP') (`minOCC') (`minOCCpp') ///
                        (`minP') (`minTotProb') (`maxMismatch') (`entropy') (`pass')
                }
            }
        }
    }
}
postclose `h2'

use `step2tmp', clear
save step2_models_4var.dta, replace

egen byte tagmodel = tag(K o1 o2 o3 o4)
keep if tagmodel
drop tagmodel
keep if K == `BESTK'

count if pass == 1
if r(N) &gt; 0 {
    keep if pass == 1
}

gsort -bic
quietly summarize bic, meanonly
local bestbic = r(max)
keep if bic &gt;= (`bestbic' - $DELTABIC)
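
* Note on the sort direction (an assumption worth checking against your own
* traj output): the selection above treats a LARGER BIC as better. That is
* consistent with the convention BIC = ll - 0.5*p*ln(N), under which BIC is
* negative and values closer to zero are preferred. Hypothetical illustration
* with ll = -1000, p = 10 parameters and N = 106 subjects:
di &quot;illustrative BIC = &quot; -1000 - 0.5*10*ln(106)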

gsort -bic -minAPP -minOCCpp maxMismatch
gen rank_finalista = _n
save `step2ranked', replace
save finalists_step2_4var.dta, replace

count
local NFINAL = r(N)
di as result &quot;Number of finalist models within DeltaBIC = `NFINAL'&quot;
list rank_finalista K o1 o2 o3 o4 bic minAPP minOCCpp maxMismatch entropy minP minTotProb pass, noobs

****************************************************
* STEP 2B: refit of the finalist models
****************************************************
local NINSPECT = cond(`NFINAL' &lt; $NREFIT, `NFINAL', $NREFIT)
forvalues i = 1/`NINSPECT' {
    use finalists_step2_4var.dta, clear
    local CK  = K[`i']
    local CO1 = o1[`i']
    local CO2 = o2[`i']
    local CO3 = o3[`i']
    local CO4 = o4[`i']

    capture log close candlog
    log using &quot;candidate4_`i'_K`CK'.smcl&quot;, replace name(candlog)

    use &quot;$DATAFILE&quot;, clear
    sort $IDVAR, stable
    make_time
    local indep t1 t2 t3

    traj, multgroups(`CK') ///
        var1($VAR1) indep1(`indep') order1(`CO1') model1(cnorm) min1($MIN1) max1($MAX1) ///
        var2($VAR2) indep2(`indep') order2(`CO2') model2(cnorm) min2($MIN2) max2($MAX2) ///
        var3($VAR3) indep3(`indep') order3(`CO3') model3(cnorm) min3($MIN3) max3($MAX3) ///
        var4($VAR4) indep4(`indep') order4(`CO4') model4(cnorm) min4($MIN4) max4($MAX4)

    di as result &quot;BIC = &quot; e(BIC_n_subjects)
    di as result &quot;AIC = &quot; e(AIC)
    di as result &quot;LL  = &quot; e(ll)

    log close candlog
}

**************************************************** 
 * LEAD CANDIDATE ACCORDING TO PRE-SPECIFIED CRITERIA 
****************************************************
use `step2ranked', clear
gen byte _pick = (_n == 1)
quietly summarize K if _pick, meanonly
local FK = r(min)
levelsof o1 if _pick, local(FO1) clean
levelsof o2 if _pick, local(FO2) clean
levelsof o3 if _pick, local(FO3) clean
levelsof o4 if _pick, local(FO4) clean
drop _pick

di as result &quot;LEAD CANDIDATE:&quot;
di as result &quot;K      = `FK'&quot;
di as result &quot;order1 = `FO1'&quot;
di as result &quot;order2 = `FO2'&quot;
di as result &quot;order3 = `FO3'&quot;
di as result &quot;order4 = `FO4'&quot;

use &quot;$DATAFILE&quot;, clear
sort $IDVAR, stable
make_time
local indep t1 t2 t3

traj, multgroups(`FK') ///
    var1($VAR1) indep1(`indep') order1(`FO1') model1(cnorm) min1($MIN1) max1($MAX1) ///
    var2($VAR2) indep2(`indep') order2(`FO2') model2(cnorm) min2($MIN2) max2($MAX2) ///
    var3($VAR3) indep3(`indep') order3(`FO3') model3(cnorm) min3($MIN3) max3($MAX3) ///
    var4($VAR4) indep4(`indep') order4(`FO4') model4(cnorm) min4($MIN4) max4($MAX4)

di as result &quot;Final BIC = &quot; e(BIC_n_subjects)
di as result &quot;Final AIC = &quot; e(AIC)
di as result &quot;Final LL  = &quot; e(ll)

**************************************************** 
 * FINAL STATISTICS OF THE SELECTED MODEL 
****************************************************
quietly gbtm_stats, k(`FK')

di as result &quot;Final minAPP      = &quot; r(minAPP)
di as result &quot;Final meanAPP     = &quot; r(meanAPP)
di as result &quot;Final minP        = &quot; r(minP)
di as result &quot;Final minTotProb  = &quot; r(minTotProb)
di as result &quot;Final minOCC      = &quot; r(minOCC)
di as result &quot;Final minOCCpp    = &quot; r(minOCCpp)
di as result &quot;Final maxMismatch = &quot; r(maxMismatch)
di as result &quot;Final entropy     = &quot; r(entropy)
di as result &quot;Final pass        = &quot; r(pass)</pre>
</div>]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Tommaso Salvitti</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786375-syntax-model-gbtm</guid>
		</item>
		<item>
			<title><![CDATA[Error "break" without pressing break]]></title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786372-error-break-without-pressing-break</link>
			<pubDate>Sun, 21 Jun 2026 20:41:37 GMT</pubDate>
			<description><![CDATA[Why does the following syntax stop with --Break-- r(1) although I don't press Break: 
 
clear  
cap frame drop original  
frame create original ...]]></description>
			<content:encoded><![CDATA[Why does the following code stop with <span style="color:#FF0000"><span style="font-family:courier new">--Break--</span></span><span style="color:#0000FF"><span style="font-family:courier new"> r(1)</span></span> even though I did not press Break?<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">clear 
cap frame drop original 
frame create original 
frame change original 
clear 
input int(vp fp fn vn) 
 41  51 43 276 
 31  60 49 269 
 35  70  9 187 
 30  71 30 179 
 10  15 21 236 
185  27 63  75 
 29  53 53 683 
 27  43  4  99 
213 881 32 600 
 32  54 62 581 
end 
scalar nstudies = _N 
 
frame change default 
forvalues i = 1/`=scalar(nstudies)' { 
   frame original: scalar vp = vp[`i'] 
   frame original: scalar fp = fp[`i'] 
   frame original: scalar fn = fn[`i'] 
   frame original: scalar vn = vn[`i'] 
 
   clear 
   input x1 x2 freq 
      1 1 . 
      1 0 . 
      0 1 . 
      0 0 . 
   end 
   replace freq = scalar(vp) if _n==1 
   replace freq = scalar(fp) if _n==2 
   replace freq = scalar(fn) if _n==3 
   replace freq = scalar(vn) if _n==4 
    
   tab2 x1 x2 [fw=freq] 
}</pre>
</div>Here is part of the output: 
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">. forvalues i = 1/`=scalar(nstudies)' {
  2.    frame original: scalar vp = vp[`i']
  3.    frame original: scalar fp = fp[`i']
  4.    frame original: scalar fn = fn[`i']
  5.    frame original: scalar vn = vn[`i']
  6. 
.    clear
  7.    input x1 x2 freq
  8.       1 1 .
  9.       1 0 .
 10.       0 1 .
 11.       0 0 .
 12.    end
<span style="color:#FF0000">--Break--</span>
<span style="color:#0000FF">r(1)</span>;</pre>
</div>If I move the part starting with -input- outside of the -<b><span style="font-family:courier new">forvalues</span></b>- loop, the code runs without a break (but then I can't loop through the data of the studies in the frame &quot;original&quot;):<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">frame change default 
forvalues i = 1/`=scalar(nstudies)' { 
   frame original: scalar vp = vp[`i'] 
   frame original: scalar fp = fp[`i'] 
   frame original: scalar fn = fn[`i'] 
   frame original: scalar vn = vn[`i'] 
 
   clear 
}    
   input x1 x2 freq 
      1 1 . 
      1 0 . 
      0 1 . 
      0 0 . 
   end 
   replace freq = scalar(vp) if _n==1 
   replace freq = scalar(fp) if _n==2 
   replace freq = scalar(fn) if _n==3 
   replace freq = scalar(vn) if _n==4 
    
   tab2 x1 x2 [fw=freq]</pre>
</div><br />
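A possible workaround, sketched on the assumption that the break occurs because <span style="font-family:courier new">input</span> cannot read its inline data lines from inside a loop body: build the four-row dataset with <span style="font-family:courier new">set obs</span> and <span style="font-family:courier new">generate</span> instead, so that everything stays inside <span style="font-family:courier new">forvalues</span>. To keep the sketch self-contained, the setup repeats only the first three studies:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">clear
cap frame drop original
frame create original
frame change original
input int(vp fp fn vn)
 41  51 43 276
 31  60 49 269
 35  70  9 187
end
scalar nstudies = _N

frame change default
forvalues i = 1/`=scalar(nstudies)' {
   frame original: scalar vp = vp[`i']
   frame original: scalar fp = fp[`i']
   frame original: scalar fn = fn[`i']
   frame original: scalar vn = vn[`i']

   clear
   set obs 4                             // one row per cell of the 2x2 table
   generate byte x1 = inlist(_n, 1, 2)   // rows 1-2: x1 = 1; rows 3-4: x1 = 0
   generate byte x2 = inlist(_n, 1, 3)   // rows 1 and 3: x2 = 1; rows 2 and 4: x2 = 0
   generate long freq = .
   replace freq = scalar(vp) if _n==1
   replace freq = scalar(fp) if _n==2
   replace freq = scalar(fn) if _n==3
   replace freq = scalar(vn) if _n==4

   tab2 x1 x2 [fw=freq]
}</pre>
</div>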
 ]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Dirk Enzmann</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786372-error-break-without-pressing-break</guid>
		</item>
		<item>
			<title>CSDID Omitted results.</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786363-csdid-omitted-results</link>
			<pubDate>Thu, 18 Jun 2026 10:40:45 GMT</pubDate>
			<description><![CDATA[Hello, 
 
I am trying to use the csdid command (Callaway &amp; Sant'Anna 2021) with a monthly panel dataset but all ATT(g,t) coefficients are returned as...]]></description>
			<content:encoded><![CDATA[Hello,<br />
<br />
I am trying to use the csdid command (Callaway &amp; Sant'Anna 2021) with a monthly panel dataset but all ATT(g,t) coefficients are returned as zero (omitted) and I cannot identify the cause.<br />
<br />
**Setup**<br />
- Stata 16.1<br />
- csdid and drdid installed from SSC<br />
- Panel data: borrower-level monthly observations<br />
- 6 treated cohorts entering the program in Jan–Jun 2025 (cohort_num 1–6)<br />
- 1 never-treated control group (cohort_num 0)<br />
- Time variable: sequential integer months (2024m7 = 1, 2024m8 = 2, ... 2025m11 = 17)<br />
- Data spans 2024m7 to 2025m11 (17 months total)<br />
<br />
**Variable construction**<br />
<br />
gen int ym_csdid = int(ym) - 773     // sequential: 2024m7=1 to 2025m11=17<br />
gen int entry_ym_csdid = 0<br />
replace entry_ym_csdid = int(ym(2025, cohort_num)) - 773 if cohort_num &gt;= 1 &amp; cohort_num &lt;= 6<br />
// Results in: cohort1=7, cohort2=8, cohort3=9, cohort4=10, cohort5=11, cohort6=12<br />
<br />
xtset panel_id ym_csdid<br />
// panel variable: panel_id (unbalanced)<br />
// time variable: ym_csdid, 1 to 17, delta 1 unit<br />
<br />
**tab ym_csdid entry_ym_csdid (0.1% sample by cohort and control, ~5,000 obs)**<br />
<br />
Sequential |<br />
    month: |<br />
 2024m7=1, |<br />
 2024m8=2, |                          entry_ym_csdid<br />
       ... |         0          7          8          9         10         11 |     Total<br />
-----------+------------------------------------------------------------------+----------<br />
         1 |       191         24         29         22         16         12 |       304 <br />
         2 |       191         24         29         22         16         12 |       304 <br />
         3 |       191         24         29         22         16         12 |       304 <br />
         4 |       191         24         29         22         16         12 |       304 <br />
         5 |       190         24         29         22         16         12 |       303 <br />
         6 |       188         24         29         22         16         12 |       301 <br />
         7 |       184         24         29         22         16         12 |       297 <br />
         8 |       182         24         29         22         16         12 |       295 <br />
         9 |       182         24         29         22         16         12 |       295 <br />
        10 |       179         24         29         22         16         12 |       292 <br />
        11 |       178         24         29         22         16         12 |       291 <br />
        12 |       176         24         29         22         16         12 |       289 <br />
        13 |       176         23         29         22         16         12 |       288 <br />
        14 |       175         23         29         22         16         12 |       287 <br />
        15 |       175         23         29         22         16         12 |       287 <br />
        16 |       172         23         29         22         16         12 |       284 <br />
        17 |       172         23         29         22         16         12 |       284 <br />
-----------+------------------------------------------------------------------+----------<br />
     Total |     3,093        403        493        374        272        204 |     5,009 <br />
<br />
<br />
Sequential |<br />
    month: |<br />
 2024m7=1, | entry_ym_c<br />
 2024m8=2, |    sdid<br />
       ... |        12 |     Total<br />
-----------+-----------+----------<br />
         1 |        10 |       304 <br />
         2 |        10 |       304 <br />
         3 |        10 |       304 <br />
         4 |        10 |       304 <br />
         5 |        10 |       303 <br />
         6 |        10 |       301 <br />
         7 |        10 |       297 <br />
         8 |        10 |       295 <br />
         9 |        10 |       295 <br />
        10 |        10 |       292 <br />
        11 |        10 |       291 <br />
        12 |        10 |       289 <br />
        13 |        10 |       288 <br />
        14 |        10 |       287 <br />
        15 |        10 |       287 <br />
        16 |        10 |       284 <br />
        17 |        10 |       284 <br />
-----------+-----------+----------<br />
     Total |       170 |     5,009 <br />
<br />
**Command used**<br />
<br />
csdid repayment, ivar(panel_id) time(ym_csdid) gvar(entry_ym_csdid) method(dripw) notyet<br />
<br />
**Result**<br />
<br />
All ATT(g,t) coefficients are 0 (omitted) across all 6 groups and all time periods. Number of obs = ~4,600. No x marks (estimation did not fail), but no estimates either.<br />
<br />
                                                Number of obs     =      4,694<br />
Outcome model  : least squares<br />
Treatment model: inverse probability<br />
------------------------------------------------------------------------------<br />
             |      Coef.   Std. Err.      z    P&gt;|z|     [95% Conf. Interval]<br />
-------------+----------------------------------------------------------------<br />
g7           |<br />
       t_1_2 |          0  (omitted)<br />
       t_2_3 |          0  (omitted)<br />
       t_3_4 |          0  (omitted)<br />
       t_4_5 |          0  (omitted)<br />
       t_5_6 |          0  (omitted)<br />
       t_6_7 |          0  (omitted)<br />
       t_6_8 |          0  (omitted)<br />
       t_6_9 |          0  (omitted)<br />
      t_6_10 |          0  (omitted)<br />
      t_6_11 |          0  (omitted)<br />
      t_6_12 |          0  (omitted)<br />
      t_6_13 |          0  (omitted)<br />
      t_6_14 |          0  (omitted)<br />
      t_6_15 |          0  (omitted)<br />
      t_6_16 |          0  (omitted)<br />
      t_6_17 |          0  (omitted)<br />
-------------+----------------------------------------------------------------<br />
(and so on for the remaining groups)<br />
<br />
I have also tried:<br />
- never instead of notyet<br />
- Running without any covariates<br />
- Converting time and gvar to integer storage type (was float before)<br />
- Using the original %tm integer values (774–790) instead of sequential 1–17<br />
- Reducing to 0.1% stratified sample by cohort<br />
<br />
None of these resolved the issue.<br />
<br />
For reference, I confirmed csdid works correctly on the mpdta example dataset on a separate machine (Stata 14), producing real estimates with no omissions.<br />
<br />
The tab pattern looks similar to the working mpdta example. I cannot identify what structural difference in my data is causing all cells to be omitted.<br />
<br />
Any help would be greatly appreciated.<br />
<br />
Thank you.]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Passakorn Tapasanan</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786363-csdid-omitted-results</guid>
		</item>
		<item>
			<title>medsem with gsem</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786360-medsem-with-gsem</link>
			<pubDate>Thu, 18 Jun 2026 08:08:13 GMT</pubDate>
			<description>Hello, I want to run a mediation analysis, but I need to use sampling weights in my estimation. SEM does not accept weights, and GSEM seems not to...</description>
			<content:encoded>Hello, I want to run a mediation analysis, but I need to use sampling weights in my estimation. SEM does not accept weights, and GSEM seems not to allow mediation. Any hints? Thank you.</content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Ylenia Curci</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786360-medsem-with-gsem</guid>
		</item>
		<item>
			<title>New addlegend package available from SSC</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786358-new-addlegend-package-available-from-ssc</link>
			<pubDate>Thu, 18 Jun 2026 07:43:09 GMT</pubDate>
			<description>Thanks to Kit Baum, a new package called addlegend is available from SSC. To install, type: 
 
 
. ssc install addlegend, replace 
 
addlegend is a...</description>
			<content:encoded><![CDATA[Thanks to Kit Baum, a new package called <span style="font-family:courier new">addlegend </span>is available from SSC. To install, type:<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">. ssc install addlegend, replace</pre>
</div><span style="font-family:courier new">addlegend</span> is a utility to create a do-it-yourself legend and add it to a twoway graph. In contrast to Stata's <span style="font-family:courier new">legend()</span> option, <span style="font-family:courier new">addlegend</span> can combine multiple symbols in a single legend key. Here's an example:<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">sysuse auto
twoway (sc mpg turn, msize(large) ms(Oh)) ///
       (sc mpg turn, msize(large) ms(X) pstyle(p1)) ///
       (lfit mpg turn, pstyle(p2))
addlegend, y(50) x(105) margin(r=40): ///
       (Oh X) &quot;Mileage (mpg)&quot;, msize(large) ///
    || (line) &quot;Fitted values&quot;</pre>
</div><a href="filedata/fetch?filedataid=1786359">Attached image</a><br />
<br />
<br />
See <a href="https://github.com/benjann/addlegend/" target="_blank">github.com/benjann/addlegend</a> for some further examples.<br />
ben<br />
 ]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Ben Jann</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786358-new-addlegend-package-available-from-ssc</guid>
		</item>
		<item>
			<title>Old grammar mac def</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786357-old-grammer-mac-def</link>
			<pubDate>Thu, 18 Jun 2026 07:17:23 GMT</pubDate>
			<description><![CDATA[Dear Stata users, 
 
I have checked an old and worn user-written command, the -unitroot- produced in 1992 or even earlier. There's a piece of code...]]></description>
			<content:encoded><![CDATA[Dear Stata users,<br />
<br />
I have been examining an old user-written command, -unitroot-, written in 1992 or even earlier. One piece of its code raises an error; I report the code and the -set trace on- output below. What is mac def, and how can I get around this error in versions later than Stata 10? Thank you very much.<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">    local rc=_rc
        quietly use `tmpfile', clear
    capture erase `tmpfile'
        mac def S_FN &quot;`dsn'&quot;
    error `rc'
/*    Rest of program, SRB 6/17/92    */
    local j=0
    while (`j'&lt;=`lags') {
        local j=`j'+1
        mac def S_`j' = `tau`j''
    }</pre>
</div>
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">  - local rc=_rc
  - quietly use `tmpfile', clear
  = quietly use ......\Temp\ST_e28_000004.tmp, clear
  - capture erase `tmpfile'
  = capture erase ......\Temp\ST_e28_000004.tmp
  - mac def S_FN &quot;`dsn'&quot;
  = mac def S_FN &quot;&quot;
  - error `rc'
  = error 198</pre>
</div>]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Chen Samulsion</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786357-old-grammer-mac-def</guid>
		</item>
		<item>
			<title>geoplot: select or clip</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786354-geoplot-select-or-clip</link>
			<pubDate>Thu, 18 Jun 2026 07:14:37 GMT</pubDate>
			<description><![CDATA[I received the following private message on Statalist: 
 
 
 
Here's my answer: 
 
To plot just one state, simply use the if qualifier in geoplot....]]></description>
			<content:encoded><![CDATA[I received the following private message on Statalist:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_quote">
		<div class="quote_container">
			<div class="bbcode_quote_container vb-icon vb-icon-quote-large"></div>
			
				In geoplot, I have a map of Brazil (from the IBGE) with 26 states and 5500 municipalities. [...] I want to do my analysis for one state at a time. When I load the map, I have the whole country, but I want eliminate (clip) all the states except the one I'm working on. How can I do that?
			
		</div>
	</div>
</div>Here's my answer:<br />
<br />
To plot just one state, simply use the if qualifier in geoplot. Example:<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">local url http://fmwww.bc.edu/repec/bocode/i/
geoframe create regions `url'Italy-RegionsData.dta, id(id) coord(xcoord ycoord) shp(Italy-RegionsCoordinates.dta)
geoplot (area regions if region==&quot;Umbria&quot;, fcolor(AntiqueWhite)) ///
        (label regions region if region==&quot;Umbria&quot;) ///
        , tight</pre>
</div><a href="filedata/fetch?filedataid=1786355">Attached image</a><br />
<br />
Alternatively, you can also use geoframe select to create a frame that contains the data of that state only and then use this frame in geoplot. Example:<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">frame regions: geoframe select if region==&quot;Umbria&quot;, into(Umbria)
geoplot (area Umbria, fcolor(AntiqueWhite)) ///
        (label Umbria region) ///
        , tight</pre>
</div>(same result as above)<br />
<br />
If you want to generate a plot that clips the surrounding states, rather than omitting them, you can use geoframe rclip to create a frame with the clipped data. Example:<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">frame regions: geoframe query bbox if region==&quot;Umbria&quot;, pad(30)
frame regions: geoframe rclip r(limits), into(Umbria2)
geoplot (area Umbria2, fcolor(AntiqueWhite*.5)) ///
        (area Umbria2 if region==&quot;Umbria&quot;, fcolor(AntiqueWhite)) ///
        (label Umbria2 region if region==&quot;Umbria&quot;, psty(p2)) ///
        , tight background(water)</pre>
</div><a href="filedata/fetch?filedataid=1786356">Attached image</a><br />
ben]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Ben Jann</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786354-geoplot-select-or-clip</guid>
		</item>
		<item>
			<title>Update of nb_adjust (SSC)</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786351-update-of-nb_adjust-ssc</link>
			<pubDate>Wed, 17 Jun 2026 20:15:01 GMT</pubDate>
			<description>Thanks to Kit Baum an update of nb_adjust (version 2.13) is available on SSC. 
 
nb_adjust identifies and adjusts (or removes) outliers in a count...</description>
			<content:encoded><![CDATA[Thanks to Kit Baum an update of <b><span style="font-family:courier new">nb_adjust</span></b> (version 2.13) is available on SSC.<br />
<br />
<b><span style="font-family:courier new">nb_adjust</span></b> identifies and adjusts (or removes) outliers in a count variable, assuming that the values follow a negative binomial distribution. For more information, see the corresponding help file.<br />
<br />
In the previous version, specifying a seed did not guarantee identical results, because the same seed also has to be passed to <b><span style="font-family:courier new">set sortseed</span></b>. This has been fixed. In addition, a reference to an example of using <b><span style="font-family:courier new">nb_adjust</span></b> has been added to the help file.<br />
<br />
By the way: It would be helpful if a note were added to the Stata documentation for <b><span style="font-family:courier new">set seed</span></b> and <b><span style="font-family:courier new">mata: rseed()</span></b> stating that there may be situations in which <b><span style="font-family:courier new">set sortseed</span></b> is also required to ensure reproducible results. See also the posts by <a href="https://www.statalist.org/forums/member/22-brendan-halpin" class="b-bbcode-user js-bbcode-user" data-userid="22">Brendan Halpin</a> &quot;<a href="https://www.statalist.org/forums/forum/general-stata-discussion/general/1424763" target="_blank">Setting random seed is not enough?</a>&quot;<br />
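<br />
A minimal sketch of what this means in practice, assuming an arbitrary seed value and using the auto data purely for illustration: set both seeds before any step that sorts on a variable with ties or draws random numbers.<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">* Assumption: the seed value is arbitrary; any fixed integer will do
set seed 20260617
set sortseed 20260617

sysuse auto, clear
sort rep78                      // rep78 has ties; their order is not fixed by the sort key
generate double u = runiform()  // random draws then depend on that order
list make rep78 u in 1/5</pre>
</div>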
<br />
 ]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Dirk Enzmann</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786351-update-of-nb_adjust-ssc</guid>
		</item>
		<item>
			<title>Frames with collapse</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786346-frames-with-collapse</link>
			<pubDate>Wed, 17 Jun 2026 13:59:42 GMT</pubDate>
			<description><![CDATA[I'm new to v19 but used Stata Collapse for years (and also SAS proc something output to create a new separate dataset with the summary values). 
I...]]></description>
			<content:encoded><![CDATA[I'm new to v19 but used Stata Collapse for years (and also SAS proc something output to create a new separate dataset with the summary values).<br />
I know Frames will do this (very excited), but I can't find the perfect self-help training tutorial or video.<br />
<br />
Q1) Can anyone recommend a frames tutorial or video with examples like this (similar to proc freq output=..., or collapse to a new dataset)?<br />
Q2) If anyone is feeling generous, would you offer a simple code example of using collapse to create a summary dataset, but then keep the original data in memory and the summary set in a new frame?<br />
I'll 'recognize' the code when I see it and be very grateful.  <br />
Sue]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Susan Bondy</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786346-frames-with-collapse</guid>
		</item>
		<item>
			<title>SPSIV - Synthetic Instrumental Variables for Spatial Regression without External Instruments</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786342-spsiv-synthetic-instrumental-variables-for-spatial-regression-without-external-instruments</link>
			<pubDate>Wed, 17 Jun 2026 03:02:41 GMT</pubDate>
			<description>Dear Statalist members, 
 
I would like to introduce spsiv, a new Stata command for generating synthetic instrument variables (SIV) used in spatial...</description>
			<content:encoded><![CDATA[Dear Statalist members,<br />
<br />
I would like to introduce <b>spsiv</b>, a new Stata command for generating synthetic instrumental variables (SIV) for use in spatial regression models with endogenous variables.<br />
The command implements the aggregated IV method of Le Gallo &amp; Páez (2013) and Fingleton (2023), which yields instruments that are strongly correlated with the endogenous regressors while still meeting the standard IV requirements. <b>spsiv</b> supports both cross-sectional and panel data and can be used in conjunction with commands such as <b>spivreg</b>, <b>spivregress</b>, <b>xtdpd</b>, and <b>xtabond2</b>.<br />
SIV can also be used in conventional regressions with endogenous regressors, provided the endogenous variable follows some spatial correlation scheme. This specification was also used in Fingleton (2023).<br />
<br />
Thanks to Prof. Kit Baum, the command is already available on SSC and can be installed by running the command:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">ssc install spsiv</pre>
</div>All comments, suggestions, and bug reports are welcome.<br />
<br />
References:<ol class="decimal"><li>Fingleton, B. (2023). Estimating dynamic spatial panel data models with endogenous regressors using synthetic instruments. <i>Journal of Geographical Systems, 25</i>, Article 1. <a href="https://doi.org/10.1007/s10109-022-00397-3" target="_blank">https://doi.org/10.1007/s10109-022-00397-3</a></li>
<li>Le Gallo, J., &amp; Páez, A. (2013). Using synthetic variables in instrumental variable estimation of spatial series models. <i>Environment and Planning A</i>, <i>45</i>(9), 2227-2242.</li>
</ol>Here are some examples:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">    * Cross-sectional data
        copy https://www.stata-press.com/data/r19/homicide1990.dta ., replace
        copy https://www.stata-press.com/data/r19/homicide1990_shp.dta ., replace
        use homicide1990, clear
        spset
        spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
        spsiv ln_population ln_pdensity gini, m(m) a(0.1)

    * Panel data
        copy https://www.stata-press.com/data/r19/homicide_1960_1990.dta ., replace
        copy https://www.stata-press.com/data/r19/homicide_1960_1990_shp.dta . , replace
        use homicide_1960_1990, clear
        xtset _ID year
        spset
        preserve
        keep if year==1990
        spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
        restore
        spsiv ln_population ln_pdensity gini if year==1990, m(m) a(0.1)
        spsiv ln_population ln_pdensity gini, m(m) a(0.1)</pre>
</div>And the results:<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">. use homicide1990, clear 
(S.Messner et al.(2000), U.S southern county homicide rates in 1990)

. spset 

      Sp dataset: homicide1990.dta
Linked shapefile: homicide1990_shp.dta
            Data: Cross sectional
 Spatial-unit ID: _ID
     Coordinates: _CX, _CY (planar)

. spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace 

. spsiv ln_population ln_pdensity gini, m(m) a(0.1) 
(S.Messner et al.(2000), U.S southern county homicide rates in 1990)

Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)ln_populationln_pdensity     gini
------------------------------------------------
Correlation    0.7498      0.7833      0.7985
------------------------------------------------

. copy https://www.stata-press.com/data/r19/homicide_1960_1990.dta ., replace
(file homicide_1960_1990.dta not found)

. copy https://www.stata-press.com/data/r19/homicide_1960_1990_shp.dta . , replace
(file homicide_1960_1990_shp.dta not found)

. use homicide_1960_1990, clear 
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)

. xtset _ID year 

Panel variable: _ID (strongly balanced)
 Time variable: year, 1960 to 1990, but with gaps
         Delta: 1 unit

. spset 

      Sp dataset: homicide_1960_1990.dta
Linked shapefile: homicide_1960_1990_shp.dta
            Data: Panel
 Spatial-unit ID: _ID
         Time ID: year (see xtset)
     Coordinates: _CX, _CY (planar)

. preserve 

. keep if year==1990 
(4,236 observations deleted)

. spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace 

. restore 

. spsiv ln_population ln_pdensity gini if year==1990, m(m) a(0.1) 
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)

Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)ln_populationln_pdensity     gini
------------------------------------------------
Correlation    0.7498      0.7833      0.7985
------------------------------------------------

. spsiv ln_population ln_pdensity gini, m(m) a(0.1) 
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)

Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)ln_populationln_pdensity     gini
------------------------------------------------
Correlation    0.7315      0.7789      0.8418
------------------------------------------------</pre>
</div><br />
 ]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Manh Hoang Ba</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786342-spsiv-synthetic-instrumental-variables-for-spatial-regression-without-external-instruments</guid>
		</item>
		<item>
			<title>How to obtain the sample size used by csdid2?</title>
			<link>https://www.statalist.org/forums/forum/general-stata-discussion/general/1786341-how-to-obtain-the-sample-size-used-by-csdid2</link>
			<pubDate>Wed, 17 Jun 2026 02:23:04 GMT</pubDate>
			<description><![CDATA[Dear all: 
 
I am using csdid2 (net install csdid2, from(&quot;https://friosavila.github.io/stpackages&quot;)) and would like to know how to obtain the sample...]]></description>
			<content:encoded><![CDATA[Dear all:<br />
<br />
I am using csdid2 (net install csdid2, from(&quot;https://friosavila.github.io/stpackages&quot;)) and would like to know how to obtain the sample size used in the estimation. Is there an option or stored result that reports the number of observations actually used by csdid2? Thank you.<br />
<br />

<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">*ssc install csdid
*ssc install drdid
*net install csdid2, from(&quot;https://friosavila.github.io/stpackages&quot;)

*ssc install frause

frause mpdta, clear

csdid lemp, ivar(countyreal) time(year) gvar(first)
estat event
estat simple

csdid2 lemp, ivar(countyreal) tvar(year) gvar(first)
estat event
estat simple</pre>
</div>]]></content:encoded>
			<category domain="https://www.statalist.org/forums/forum/general-stata-discussion/general">General</category>
			<dc:creator>Frank Huang</dc:creator>
			<guid isPermaLink="true">https://www.statalist.org/forums/forum/general-stata-discussion/general/1786341-how-to-obtain-the-sample-size-used-by-csdid2</guid>
		</item>
	</channel>
</rss>
