I know that xi is an outdated command, but since some important user-written commands don't allow factor variables, creating dummies remains important. I've noticed that xi, whether used as a command or as a prefix for estimation, is very slow, and that "qui: tab var, gen(new_)" appears to be dramatically faster.
xi takes more than 104 seconds, while tab, gen() takes less than 7! Why is xi so inefficient?
I'm using the NBER patent data as an example:
http://www.nber.org/~jbessen/pat76_06_assg.dta.zip
The data has a time variable called appyear. Generating dummies of appyear with xi and with tab, gen() shows the remarkable difference (Stata 12 IC):
Code:
. summ appyear

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     appyear |   3279509     1992.13    8.474844       1901       2006
r; t=0.24 9:35:04

. xi i.appyear
i.appyear         _Iappyear_1901-2006 (naturally coded; _Iappyear_1901 omitted)
r; t=104.10 9:36:48

. qui tab appyear, gen(new_year)
r; t=6.49 9:36:55
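For anyone who wants to reproduce the comparison without downloading the patent file, here is a rough sketch that times both approaches on simulated data. The seed, the observation count and the 1976-2006 range are my own assumptions and not taken from the NBER file, so the absolute timings will differ from those in the log above.

```stata
* Sketch only: simulated stand-in for appyear, not the NBER patent data
clear
set seed 12345
set obs 1000000
* assumed range: integer years 1976-2006, drawn uniformly
generate int appyear = 1976 + floor(31 * runiform())

timer clear

* timer 1: dummies via xi (creates _Iappyear_*)
timer on 1
xi i.appyear
timer off 1

* timer 2: dummies via tabulate, generate() (creates new_year1, new_year2, ...)
timer on 2
quietly tabulate appyear, generate(new_year)
timer off 2

* report elapsed seconds for both timers
timer list
```

Using timer instead of set rmsg on keeps the two measurements together in one listing, which makes the comparison easier to repeat with a different number of observations or categories.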