  • Panel data: binary dependent variable with time-invariant and time-varying independent variables

    Dear all,

    I have panel data covering 2000 to 2018 with approximately 300,000 observations. My task is to build a model that identifies which factors lead a startup company to become a scaleup. I define my binary target variable "scaleup" as a company whose revenue growth exceeds its cost growth in every year of a 5-year period. My independent variables are both time-varying and time-invariant and include firm information such as accounting data, company form, average education among employees/board members, age, number of employees, industry, and many other variables.

    My problem is with the modelling part.
    1. Which models are appropriate for finding a relationship between scaleups and firm characteristics? So far I have been looking into the xtreg command.
    2. How should I code my "scaleup" target variable? So far I have coded it as 1 in year 5 if the company satisfies my definition of a scaleup.


    My data set looks something like the one below. (NB: this is not the full dataset; it is fake and does not include all the variables and years.)
    firm_id year avg_age_employees avg_education_board_members number_of_female_employees scaleup
    1 2000 44,00 4,50 10 0
    1 2001 43,00 4,60 12 0
    1 2002 42,00 4,30 12 0
    1 2003 45,00 4,50 13 0
    1 2004 46,00 5,00 13 0
    1 2005 47,00 5,10 13 1
    2 2000 45,00 2,30 2 0
    2 2001 45,00 2,40 2 0
    2 2002 43,00 2,40 3 0
    2 2003 47,00 2,20 4 0
    2 2004 48,00 2,20 2 0
    2 2005 44,00 2,10 1 0
    3 2000 33,00 3,00 2 0
    3 2001 34,00 4,00 3 0
    3 2002 32,00 5,00 2 0
    3 2003 30,00 5,00 4 0
    3 2004 30,00 6,00 2 0
    3 2005 30,00 6,00 2 0
    4 2000 50,00 4,10 4 0
    4 2001 50,00 4,10 4 0
    4 2002 50,00 4,20 3 0
    4 2003 56,00 4,30 6 0
    4 2004 67,00 4,40 6 0
    4 2005 67,00 4,50 6 1
    5 2000 25,00 2,30 4 0
    5 2001 26,00 2,30 2 0
    5 2002 34,00 2,40 4 0
    5 2003 29,00 2,50 3 0
    5 2004 29,00 2,50 3 0
    5 2005 29,00 2,20 3 0
    6 2000 25,00 2,30 5 0
    6 2001 26,00 2,30 4 0
    6 2002 34,00 2,40 3 0
    6 2003 29,00 2,30 2 0
    6 2004 29,00 2,30 2 0
    6 2005 29,00 2,20 3 0
    7 2000 50,00 4,30 7 0
    7 2001 50,00 4,20 6 0
    7 2002 50,00 4,20 6 0
    7 2003 56,00 4,40 8 0
    7 2004 67,00 4,50 8 0
    7 2005 67,00 5,60 12 1
    8 2000 26,00 2,30 3 0
    8 2001 27,00 2,30 3 0
    8 2002 27,00 2,40 3 0
    8 2003 27,00 2,30 3 0
    8 2004 28,00 2,30 3 0
    8 2005 29,00 2,20 3 0



    Thank you.
    Kind regards,
    Ole Karlsen

  • #2
    Ole:
    welcome to this forum.
    1) -xtlogit- is the way to go if your regressand is binary;
    2) I think that you should code -scaleup- as 1 in each year in which the -scaleup- requirement is satisfied.
    Kind regards,
    Carlo
    (Stata 16.0 SE)
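
    The two suggestions above can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions, not a definitive specification. The variable names revenue, costs and industry are hypothetical, since the example data do not show them. The new indicator is named scaleup5 (also hypothetical) so that it does not clash with the existing scaleup column. "Each year the requirement is satisfied" is read here as "each year that closes a run of five consecutive years in which revenue growth exceeded cost growth", which is one possible interpretation.

```stata
* Declare the panel structure so that the time-series operators (L., D.) work.
xtset firm_id year

* Year-on-year growth rates. revenue and costs are assumed variable names.
generate double rev_growth  = D.revenue / L.revenue
generate double cost_growth = D.costs   / L.costs

* 1 if revenue grew faster than costs in that year; missing if either growth rate is missing.
generate byte outgrew = rev_growth > cost_growth if !missing(rev_growth, cost_growth)

* 1 in every year that closes five consecutive "outgrew" years.
* Years without five years of history are coded 0, because a missing lag never equals 1.
generate byte scaleup5 = outgrew == 1 & L1.outgrew == 1 & L2.outgrew == 1 ///
    & L3.outgrew == 1 & L4.outgrew == 1

* Random-effects logit: time-invariant regressors such as industry stay in the model.
xtlogit scaleup5 avg_age_employees avg_education_board_members ///
    number_of_female_employees i.industry, re

* Conditional fixed-effects logit: time-invariant regressors are not identified,
* and firms whose outcome never changes drop out of the estimation sample.
xtlogit scaleup5 avg_age_employees avg_education_board_members ///
    number_of_female_employees, fe
```

    The choice between -re- and -fe- matters for the time-invariant variables mentioned in the question: their coefficients can only be estimated with the random-effects version, at the cost of assuming that the firm-level effect is uncorrelated with the regressors.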


    • #3
      Many thanks Lazzaro.
