
  • Coding a fixed effects estimator where response is an ordinal variable (Blow-Up and Cluster)

    In my project, I use a recently developed estimator ("Blow-Up and Cluster") which makes it possible to control for fixed effects when the response is an ordinal variable (life satisfaction in my case).
    My problem is that I have two alternative codings of it, and they yield (sometimes very) different results and drop different numbers of observations. Moreover, I cannot trace why so many more observations are dropped compared with the linear fixed effects transformation.

    The idea of the estimator is to copy each observation K-1 times (K is the number of categories in the ordinal variable), dichotomize each copy at a different cutoff, run clogit on the whole expanded sample, and then cluster the standard errors at the individual level to correct for the multiple entries in the likelihood function. Full description (page 11)
    The authors provide a code for the estimator which is the following:

    Code:
    ivar is the individual identifier,
    yvar is the ordered dependent variable, and
    xvars is the list of explanatory variables.
    
    capture program drop feologit_buc
    program feologit_buc, eclass
    version 10
    gettoken gid 0: 0
    gettoken y x: 0
    tempvar iid id cid gidcid dk
    qui sum ‘y’
    local lk= r(min)
    local hk= r(max)
    bys ‘gid’: gen ‘iid’=_n
    gen long ‘id’=‘gid’*100+‘iid’
    expand ‘=‘hk’-‘lk’’
    bys ‘id’: gen ‘cid’=_n
    qui gen long ‘gidcid’= ‘gid’*100+‘cid’
    qui gen ‘dk’= ‘y’>=‘cid’+1
    clogit ‘dk’ ‘x’, group(‘gidcid’) cluster(‘gid’)
    end
    
    feologit_buc ivar yvar xvars
    However, Dickerson et al. note that this code drops observations in the British Household Panel Survey (which I am using) because the new id variable is stored as a long, which has a maximum of 2,147,483,620, and the ids of the BHPS are larger than this. They provide an alternative coding:

    Code:
    capture program drop bucologit
    program bucologit
    version 11.2
    syntax varlist [if] [in], Id(varname)
    preserve
    marksample touse
    markout ‘touse’ ‘id’
    gettoken yraw x : varlist
    tempvar y
    qui egen int ‘y’ = group(‘yraw’)
    qui keep ‘y’ ‘x’ ‘id’ ‘touse’
    qui keep if ‘touse’
    qui sum ‘y’
    local ymax = r(max)
    forvalues i = 2(1)‘ymax’ {
    qui gen byte ‘yraw’‘i’ = ‘y’ >= ‘i’
    }
    drop ‘y’
    tempvar n cut newid
    qui gen long ‘n’ = _n
    qui reshape long ‘yraw’, i(‘n’) j(‘cut’)
    qui egen long ‘newid’ = group(‘id’ ‘cut’)
    sort ‘newid’
    clogit ‘yraw’ ‘x’, group(‘newid’) cluster(‘id’)
    restore
    end
    I recoded the identifier variable in the BHPS so that it doesn't exceed the maximum, ran feologit_buc and bucologit on the same specifications, and got the following results for an example specification:

    [Screenshot of results: Screenshot 2014-11-26 21.45.32.png]

    [Screenshot of results: Screenshot 2014-11-26 21.45.43.png]


    Could anyone spot what exactly is causing the differences?

    As for my second problem, the inconsistency with linear fixed effects, I do not understand why xtreg, fe has so many more observations. Results below:

    [Screenshot of results: Screenshot 2014-11-26 22.02.36.png]

    I know that clogit will drop all observations for individuals whose life satisfaction does not vary, whereas xtreg, fe does not. However, I find that in this sample only 20 observations meet this criterion (I did not check this directly, but I included lfsato as a regressor in a separate xtreg, fe and the sample was reduced by 20).

    Can anybody suggest why the number of observations is so different from the above results?

  • #2
    Sorry, duplicated thread. Please click here



    • #3
      Liucija

      I am posting an answer here as the tables are missing in the other thread (it’s best to post results as text wrapped within code tags).

      It’s difficult to say what’s causing the difference in the results on the basis of the information you have provided, but bucologit and feologit_buc should produce identical results when there are no issues with the ID variable. Some suggestions:

      - Make sure that you use the correct type of quotation marks around local macro names – in the code you have posted the quotation marks are incorrect. I’ve posted the corrected code below my signature.

      - You don’t say how you generated pidnew. I would recommend using the following code:
      Code:
      egen pidnew = group(pid)
      - Regarding the linear fixed effects results, I would create a variable identifying respondents with no variation over time in the dependent variable and include an if statement with xtreg:
      Code:
      bysort pid (year): egen sd_lfsato = sd(lfsato)
      xtreg lfsato ... if sd_lfsato!=0 & sd_lfsato <. & female==0, fe ...
      - Also note that feologit_buc does not restore the data to its original state after "blowing it up". So you need to include preserve and restore statements in the do-file yourself.
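
      On the observation counts: clogit drops every group in which the outcome does not vary, and in the blown-up data the groups are individual-by-cutoff combinations, not individuals. A respondent whose life satisfaction does vary can therefore still lose rows at cutoffs they never cross. The toy sketch below (made-up data, hypothetical variable names id, t, y, cut, dk) counts the blown-up rows that belong to groups with variation. It only illustrates the mechanism and is not a diagnosis of the BHPS results.

```stata
clear
input id t y
1 1 1
1 2 2
1 3 3
2 1 2
2 2 2
2 3 3
end

* Blow-up: one copy of each observation per cutoff 2..K
quietly summarize y
local copies = r(max) - 1
expand `copies'
bysort id t: gen cut = _n + 1
gen byte dk = y >= cut

* Flag id-by-cutoff groups in which the dichotomized outcome varies
bysort id cut: egen byte dmin = min(dk)
bysort id cut: egen byte dmax = max(dk)
gen byte varies = dmin != dmax

* Rows that would survive clogit's "no variation within group" rule
count if varies
```

      Here both respondents vary in y, so an if condition based on sd(y) keeps all 6 original observations, yet respondent 2 never falls below category 2 and their 3 rows at cutoff 2 drop out of the 12 blown-up rows. Note also that the N reported by clogit refers to rows of the expanded data, so it is not directly comparable with the N from xtreg, fe.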

      Arne

      Code:
      capture program drop feologit_buc
      program feologit_buc, eclass
      version 10
      gettoken gid 0: 0
      gettoken y x: 0
      tempvar iid id cid gidcid dk
      qui sum `y'
      local lk= r(min)
      local hk= r(max)
      bys `gid': gen `iid'=_n
      gen long `id'=`gid'*100+`iid'
      expand `=`hk'-`lk''
      bys `id': gen `cid'=_n
      qui gen long `gidcid'= `gid'*100+`cid'
      qui gen `dk'= `y'>=`cid'+1
      clogit `dk' `x', group(`gidcid') cluster(`gid')
      end
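
      A note on the identifier in feologit_buc: the program stores gid*100+cid as a long, so any identifier above roughly 21.47 million (2,147,483,620 / 100) overflows and is stored as missing, and clogit then drops those rows. The sketch below uses made-up identifiers (not BHPS values) and fixes cid at 1 purely for illustration. It also shows that consecutive identifiers from egen group() stay within range.

```stata
clear
set obs 3

* Hypothetical identifiers; the third exceeds the safe range once multiplied by 100
gen long gid = 1000 in 1
replace gid = 20000000 in 2
replace gid = 30000000 in 3

* Same construction as in feologit_buc, with cid fixed at 1 for illustration
gen long gidcid = gid*100 + 1
list gid gidcid
count if missing(gidcid)

* Consecutive identifiers keep the product well inside the range of a long
egen long gid2 = group(gid)
gen long gidcid2 = gid2*100 + 1
list gid2 gidcid2
```

      If count if missing() returns anything other than zero on the real data after the identifier has been recoded, the recoding is still producing values that are too large for this construction.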
      Code:
      capture program drop bucologit
      program bucologit
          version 11.2
          syntax varlist [if] [in], Id(varname)
      
          preserve
      
          marksample touse
          markout `touse' `id'
          
          gettoken yraw x : varlist
          tempvar y
          qui egen int `y' = group(`yraw')
      
          qui keep `y' `x' `id' `touse'
          qui keep if `touse'
      
          qui sum `y'
          local ymax = r(max)
          forvalues i = 2(1)`ymax' {
              qui gen byte `yraw'`i' = `y' >= `i'
          }
          drop `y'
          
          tempvar n cut newid
          qui gen long `n' = _n
          qui reshape long `yraw', i(`n') j(`cut')
          qui egen long `newid' = group(`id' `cut')
          sort `newid'
          clogit `yraw' `x', group(`newid') cluster(`id')  
      
          restore            
      end
      Last edited by Arne Risa Hole; 28 Nov 2014, 03:03.
