  • Coding of fixed effects estimator when the response is an ordinal variable (Blow-Up and Cluster)

    In my project, I use a recently developed estimator ("Blow-Up and Cluster") that makes it possible to control for fixed effects when the response is an ordinal variable (life satisfaction in my case).
    My problem is that I have two alternative codings of it, and they yield (sometimes very) different results and drop different numbers of observations. Moreover, I cannot trace why so many more observations are dropped than with the linear fixed effects transformation.

    The idea of the estimator is to copy each observation K-1 times (K is the number of categories in the ordinal variable), apply clogit to all possible dichotomizations of the response simultaneously on the expanded sample, and then cluster the standard errors at the individual level to correct for the multiple entries of each individual in the likelihood function. Full description (page 11)
    The authors provide the following code for the estimator:

    Code:
    * ivar is the individual identifier,
    * yvar is the ordered dependent variable, and
    * xvars is the list of explanatory variables.

    capture program drop feologit_buc
    program feologit_buc, eclass
    version 10
    gettoken gid 0: 0
    gettoken y x: 0
    tempvar iid id cid gidcid dk
    qui sum `y'
    local lk= r(min)
    local hk= r(max)
    bys `gid': gen `iid'=_n
    gen long `id'=`gid'*100+`iid'
    expand `=`hk'-`lk''
    bys `id': gen `cid'=_n
    qui gen long `gidcid'= `gid'*100+`cid'
    qui gen `dk'= `y'>=`cid'+1
    clogit `dk' `x', group(`gidcid') cluster(`gid')
    end

    feologit_buc ivar yvar xvars
    However, Dickerson et al. note that this code drops observations in the British Household Panel Survey (which I am using) because the new id variable is stored as a long, which has a maximum of 2,147,483,620, and the ids in the BHPS are larger than this. They provide an alternative coding:

    Code:
    capture program drop bucologit
    program bucologit
    version 11.2
    syntax varlist [if] [in], Id(varname)
    preserve
    * mark the estimation sample, excluding missing values of the id
    marksample touse
    markout `touse' `id'
    gettoken yraw x : varlist
    tempvar y
    * recode the response to consecutive integers 1, 2, ...
    qui egen int `y' = group(`yraw')
    qui keep `y' `x' `id' `touse'
    qui keep if `touse'
    qui sum `y'
    local ymax = r(max)
    * one binary indicator per cutoff
    forvalues i = 2(1)`ymax' {
    qui gen byte `yraw'`i' = `y' >= `i'
    }
    drop `y'
    tempvar n cut newid
    qui gen long `n' = _n
    * stack the indicators: one copy of each observation per cutoff
    qui reshape long `yraw', i(`n') j(`cut')
    qui egen long `newid' = group(`id' `cut')
    sort `newid'
    clogit `yraw' `x', group(`newid') cluster(`id')
    restore
    end
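
    Whether the overflow Dickerson et al. describe applies to a given identifier can be checked before running feologit_buc. The lines below are only a rough sketch: pid is a hypothetical name standing in for the individual identifier (ivar above), and 99 is an assumed upper bound on the number of rows per individual, chosen because the program builds its new id as `gid'*100+`iid'.

    ```stata
    * pid is a hypothetical variable name for the individual identifier.
    * 2147483620 is the largest value a Stata long can hold.
    quietly summarize pid
    display "Largest composite id: " %15.0f (r(max)*100 + 99)

    * Number of observations whose composite id would not fit in a long
    count if !missing(pid) & pid*100 + 99 > 2147483620
    ```

    A count of zero would suggest that the long storage type is not the reason observations are dropped.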
    I recoded the identifier variable in the BHPS so that it does not exceed the maximum, and ran feologit_buc and bucologit on the same specifications. I get the following results for an example specification:






    Could anyone spot what exactly is causing the differences?

    As for my second problem, the inconsistency with linear fixed effects, I do not understand why xtreg, fe retains so many more observations. Results below:


    I know that clogit will drop all observations for individuals whose life satisfaction does not vary, whereas xtreg, fe does not. However, I find that in this sample only 20 observations meet this criterion (I did not check this directly, but included lfsato as a regressor in a separate xtreg, fe and the sample was reduced by 20).
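
    A more direct count than the auxiliary regression might look like the sketch below. It rests on several assumptions: pid is a hypothetical name for the individual identifier, lfsato takes integer values, and the data in memory have already been restricted to the estimation sample. Because the estimator works on dichotomized copies of the response, the sketch also counts, for each cutoff k, the observations of individuals whose indicator lfsato >= k actually varies.

    ```stata
    * pid is a hypothetical identifier name; lfsato is assumed to be integer-valued.
    preserve
    keep if !missing(pid, lfsato)
    bysort pid: egen ymin = min(lfsato)
    bysort pid: egen ymax = max(lfsato)

    * Observations of individuals whose response never changes
    count if ymin == ymax

    * Observations with within-person variation at each cutoff k
    quietly summarize lfsato
    local lo = r(min) + 1
    local hi = r(max)
    forvalues k = `lo'/`hi' {
        quietly count if ymin < `k' & ymax >= `k'
        display "cutoff " `k' ": " r(N) " observations with within-person variation"
    }
    restore
    ```

    The per-cutoff counts refer to the expanded data, so they are not directly comparable to the number of observations reported by xtreg, fe.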

    Can anybody suggest why the number of observations is so different from the above results?

  • #2
    Duplicate post. See http://www.statalist.org/forums/foru...up-and-cluster
