Hi all,
I am trying to estimate the simple regression below:
The outcome is a binary indicator for whether the company's client is female. The independent variable is the age of the employee who is assigned to work with the client. For fixed effects, I use the interaction of the employee's office, the year, and d25, a binary variable (=1 if the client's age is above 25).
Code:
logit female age i.office_id#i.year#i.d25, cluster(employee_id)
I also cluster the standard errors at the employee level.
Now I have two questions:
1. Is this the correct way to include my fixed effects in a logit regression?
2. Why does it take so long to run?
I have about 1,400,000 clients, 3,362 unique employees, 190 offices (unique office ids), and 12 years.
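To give a sense of scale, here is a rough sketch of how the number of fixed-effect cells could be counted. The product is only an upper bound, since it assumes every office appears in every year and in both d25 groups, which may not hold in my data. The variable name cell is made up for this illustration.
Code:
* Upper bound on office-by-year-by-d25 cells: 190 offices x 12 years x 2 groups
display 190 * 12 * 2
* Number of cells that actually occur in the data
egen long cell = group(office_id year d25)
summarize cell, meanonly
display r(max)
The first display returns 4560, so the interaction could add up to several thousand indicator terms to the model.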
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(female employee_id office_id year age age_client d25)
1 111 40 2002 24 45 1
0 111 40 2002 24 29 1
1 112 41 2002 36 32 1
1 112 41 2003 37 23 0
1 112 41 2004 38 22 0
0 112 41 2004 38 23 0
0 113 41 2002 40 40 1
1 114 42 2006 20 37 1
0 114 42 2007 21 36 1
1 114 42 2007 21 19 0
0 115 43 2006 42 26 1
0 115 43 2006 42 29 1
1 116 41 2006 23 34 1
1 116 41 2007 24 42 1
end