Dear all,
I have panel data covering 2000 to 2018 with approximately 300,000 observations. My task is to come up with a model to see which factors lead a startup company to end up as a scaleup. I define my binary target variable "scaleup" as a company that has higher growth in revenue than in costs every year over a 5-year period. My independent variables are both time-variant and time-invariant and include firm information such as accounting data, company form, average education among employees/board members, age, number of employees, industry and many other variables.
My problem is with the modelling part.
1. Which models are appropriate for finding a relationship between scaleups and firm characteristics? I have been looking into the xtreg command until now.
2. How should I code my "scaleup" target variable? Until now I have coded it as a 1 in year 5 if the company satisfies my definition of a scaleup.
My data set looks something like the one below. (NB: this is not the full dataset; it is fake and does not include all the variables and years.)
firm id | Year | avg_age_employees | avg_education_board_members | number_of_female_employees | scaleup |
1 | 2000 | 44,00 | 4,50 | 10 | 0 |
1 | 2001 | 43,00 | 4,60 | 12 | 0 |
1 | 2002 | 42,00 | 4,30 | 12 | 0 |
1 | 2003 | 45,00 | 4,50 | 13 | 0 |
1 | 2004 | 46,00 | 5,00 | 13 | 0 |
1 | 2005 | 47,00 | 5,10 | 13 | 1 |
2 | 2000 | 45,00 | 2,30 | 2 | 0 |
2 | 2001 | 45,00 | 2,40 | 2 | 0 |
2 | 2002 | 43,00 | 2,40 | 3 | 0 |
2 | 2003 | 47,00 | 2,20 | 4 | 0 |
2 | 2004 | 48,00 | 2,20 | 2 | 0 |
2 | 2005 | 44,00 | 2,10 | 1 | 0 |
3 | 2000 | 33,00 | 3,00 | 2 | 0 |
3 | 2001 | 34,00 | 4,00 | 3 | 0 |
3 | 2002 | 32,00 | 5,00 | 2 | 0 |
3 | 2003 | 30,00 | 5,00 | 4 | 0 |
3 | 2004 | 30,00 | 6,00 | 2 | 0 |
3 | 2005 | 30,00 | 6,00 | 2 | 0 |
4 | 2000 | 50,00 | 4,10 | 4 | 0 |
4 | 2001 | 50,00 | 4,10 | 4 | 0 |
4 | 2002 | 50,00 | 4,20 | 3 | 0 |
4 | 2003 | 56,00 | 4,30 | 6 | 0 |
4 | 2004 | 67,00 | 4,40 | 6 | 0 |
4 | 2005 | 67,00 | 4,50 | 6 | 1 |
5 | 2000 | 25,00 | 2,30 | 4 | 0 |
5 | 2001 | 26,00 | 2,30 | 2 | 0 |
5 | 2002 | 34,00 | 2,40 | 4 | 0 |
5 | 2003 | 29,00 | 2,50 | 3 | 0 |
5 | 2004 | 29,00 | 2,50 | 3 | 0 |
5 | 2005 | 29,00 | 2,20 | 3 | 0 |
6 | 2000 | 25,00 | 2,30 | 5 | 0 |
6 | 2001 | 26,00 | 2,30 | 4 | 0 |
6 | 2002 | 34,00 | 2,40 | 3 | 0 |
6 | 2003 | 29,00 | 2,30 | 2 | 0 |
6 | 2004 | 29,00 | 2,30 | 2 | 0 |
6 | 2005 | 29,00 | 2,20 | 3 | 0 |
7 | 2000 | 50,00 | 4,30 | 7 | 0 |
7 | 2001 | 50,00 | 4,20 | 6 | 0 |
7 | 2002 | 50,00 | 4,20 | 6 | 0 |
7 | 2003 | 56,00 | 4,40 | 8 | 0 |
7 | 2004 | 67,00 | 4,50 | 8 | 0 |
7 | 2005 | 67,00 | 5,60 | 12 | 1 |
8 | 2000 | 26,00 | 2,30 | 3 | 0 |
8 | 2001 | 27,00 | 2,30 | 3 | 0 |
8 | 2002 | 27,00 | 2,40 | 3 | 0 |
8 | 2003 | 27,00 | 2,30 | 3 | 0 |
8 | 2004 | 28,00 | 2,30 | 3 | 0 |
8 | 2005 | 29,00 | 2,20 | 3 | 0 |
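To make my current coding of the target concrete, here is a minimal Python sketch of how the definition could be turned into the flag. The revenue and cost figures, the function names and the two example firms are made up for illustration and are not from my data. I am also assuming that "a 5-year period" means five consecutive year-over-year comparisons (so six years of data, as in the 2000 to 2005 example above), and that the flag is set to 1 in any year that closes a run of at least five such years.

```python
def growth(prev, curr):
    """Relative growth from one year to the next."""
    return (curr - prev) / prev


def flag_scaleup(rows, window=5):
    """Return {year: 0 or 1} for one firm.

    rows: list of (year, revenue, cost) tuples sorted by year.
    A year gets 1 when revenue growth has exceeded cost growth in each
    of the last `window` consecutive years (assumed definition).
    """
    flags = {}
    streak = 0  # consecutive years with revenue growth > cost growth
    for i, (year, revenue, cost) in enumerate(rows):
        if i == 0:
            flags[year] = 0  # no previous year to compare against
            continue
        prev_year, prev_revenue, prev_cost = rows[i - 1]
        comparable = (
            year == prev_year + 1      # no gap in the panel
            and prev_revenue > 0
            and prev_cost > 0
        )
        if comparable and growth(prev_revenue, revenue) > growth(prev_cost, cost):
            streak += 1
        else:
            streak = 0
        flags[year] = 1 if streak >= window else 0
    return flags


# Hypothetical firms: (year, revenue, cost)
firm_a = [(2000, 100, 80), (2001, 120, 88), (2002, 150, 97),
          (2003, 190, 107), (2004, 240, 118), (2005, 300, 130)]
firm_b = [(2000, 100, 90), (2001, 110, 100), (2002, 121, 112),
          (2003, 133, 125), (2004, 146, 140), (2005, 160, 157)]

flags_a = flag_scaleup(firm_a)
flags_b = flag_scaleup(firm_b)
print(flags_a)
print(flags_b)
```

With this coding, a firm that never meets the definition has 0 in every year, and a firm that does meet it has 0 in all years before the one that completes the run.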
Thank you.
Kind regards,
Ole Karlsen