This is my first post so I apologize in advance if I missed anything in the protocol for posting.

I am estimating a binary patient-level outcome variable, *CR*, in an interrupted time series that will be interpreted as a hospital-level effect. I use an indicator variable for policy implementation, *HRRP*, with patient data nested within hospitals (patient i at hospital j), and the following Stata code:

*glm cr i.hrrp c.elapsed_qtr i.hrrpXtime2 $season $patient $hospital $market , vce(cl new_hosp_num) family(binomial binomial_denominator_variable) link(logit)*

where *elapsed_qtr* and the interaction term are for the post-period time series, *$patient* are patient-level characteristics, *$hospital* and *$market* are hospital-level characteristics, and *new_hosp_num* is the identifier for hospital j.

My post is about this *binomial_denominator_variable*. The Stata manual says this about the *binomial_denominator_variable*, which it calls *varnameN*:

"The binomial distribution can be specified as 1) family(binomial), 2) family(binomial #N ), or 3) family(binomial varnameN ). In case 2, #N is the value of the binomial denominator N, the number of trials. Specifying family(binomial 1) is the same as specifying family(binomial).

**In case 3, varnameN is the variable containing the binomial denominator, allowing the number of trials to vary across observations**."

From two earlier posts, including one by Clyde Schechter on 25 May 2016 ("GLM and blogit for proportion variable:different results") and one by Nick Cox on 19 Aug 2014 ("Entropy measure DV in panel data: Best regression technique?"), I infer that in case 3, varnameN should be the denominator of the dependent variable.
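To make my reading of case 3 concrete, here is the grouped binomial log-likelihood as I understand it. The symbols are my own notation and not taken from the manual excerpt: $y_i$ is the dependent variable (a count of successes) in observation $i$, $n_i$ is the value of varnameN in that observation, and $p_i$ is the success probability implied by the logit link.

\[
\ell(\beta) = \sum_i \left[ \log\binom{n_i}{y_i} + y_i \log p_i + (n_i - y_i)\log(1 - p_i) \right],
\qquad
p_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}
\]

With $n_i = 1$ for every observation this reduces to the ordinary Bernoulli (logit) likelihood, which matches the manual's statement that family(binomial 1) is the same as family(binomial). If this reading is right, varnameN describes the number of trials behind each row's value of the dependent variable, so $y_i$ can never exceed $n_i$.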

**First, operationally:**

1. What is Stata actually doing when varnameN is included in this manner? For example, is it taking the average across each varnameN group?

**Second, in my example:**

2. I have the numerator, CR{1}; the denominator, CR{0,1}, for the total number of patients; and the frequency of CR at each hospital (so I don't need to use fracreg or betareg where the denominator isn't available). I would like Stata to take the hospital clusters into account in my output. Note that I am including standard errors clustered at the hospital level.

3. To use this option correctly, would I use:

varnameN=total number of hospitals in the cohort over all time periods?

varnameN=total number of patients at each hospital over all time periods?

varnameN=total number of hospitals for each quarter?

varnameN=total number of patients at each hospital for each quarter?

(Some of the results change substantially depending on which is used, mostly for the time series and hospital-level variables.)

4. Would it be more appropriate to leave varnameN blank and interpret the final patient-level coefficients for a given hospital with a certain number of patients?

5. I have not tried a mixed-effects approach yet, as it was not something my advisers were keen on, but would that be a useful approach given the multilevel nature of the data?
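To show the two data layouts I am trying to choose between, here is a minimal sketch, not my actual specification. It assumes one row per patient with cr coded 0/1, and it assumes hrrp and hrrpXtime2 are constant within each hospital-quarter. The names cr_count and n_patients are hypothetical variables created only for this illustration, and the $season, $patient, $hospital and $market covariates are left out to keep it short.

```stata
* Sketch only: assumes one row per patient and cr coded 0/1.
* cr_count and n_patients are hypothetical names created here.
preserve

* Collapse to one row per hospital-quarter:
*   cr_count   = number of patients with cr == 1
*   n_patients = number of patients with nonmissing cr
collapse (sum) cr_count = cr (count) n_patients = cr, ///
    by(new_hosp_num elapsed_qtr hrrp hrrpXtime2)

* Grouped layout: cr_count successes out of n_patients trials per row
glm cr_count i.hrrp c.elapsed_qtr i.hrrpXtime2, ///
    family(binomial n_patients) link(logit) vce(cluster new_hosp_num)

restore

* Patient-level layout: each row is a single 0/1 trial,
* so no denominator variable is supplied
glm cr i.hrrp c.elapsed_qtr i.hrrpXtime2, ///
    family(binomial) link(logit) vce(cluster new_hosp_num)
```

In the grouped layout the dependent variable is a count and the denominator is the number of patients behind that count in the same row. In the patient-level layout the dependent variable is the 0/1 outcome itself. Patient-level covariates can only enter the second layout, since they are lost in the collapse.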

Thank you!

