
  • creating variable for an incumbent firm and a new entrant firm

    I have a firm-level data set, an unbalanced panel from 1999 to 2016. My aim is to understand the effect of small-scale product de-reservation on product innovation. I consider a firm to be an incumbent if it was producing a small-scale product up to the year of de-reservation. After de-reservation, many other firms start to enter that product segment, so I define a second category, new entrant, for firms that entered the small-scale segment only after de-reservation.
    I created the incumbent dummy with these commands:
    Code:
    gen incumbent_dummy = .
    replace incumbent_dummy = 1 if year <= year_dereservation & year_dereservation != .
    How can I create a new-entrant dummy that ignores any factory_id associated with an incumbent? That is, I need the set of firms that became small-scale producers only after the year of de-reservation. The variable SSI_FACTORY_DUMMY = 1 indicates that the firm produced a small-scale product either before or after the year of de-reservation.
    I attach the following example data set:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(fact_id year) int year_dereservation float(new_product_dummy ownershipcode log_age factory_size factory_size_squared log_working_capital log_welfare_expenses sector_dummy_new incumbent_dummy SSI_FACTORY_DUMMY)
     1 2000    . 0 1  1.609438  2.564949  6.578965 13.417416  8.980172 18 . .
     1 2002    . 1 1   1.94591 1.3862944  1.921812  12.54669         . 17 . .
     2 2010    . 0 1   2.70805 2.6390574  6.964624 13.806715   9.95342 29 . .
     3 2001 2004 0 1  2.772589  3.218876 10.361162  15.09514  11.90896 28 1 1
     6 2011    . 0 1  2.833213 1.3862944  1.921812 16.931124  11.37226 27 . .
     7 2005 2007 1 1  3.583519  2.833213  8.027098 12.402966 10.217605 26 1 1
     7 1999 2007 0 1 3.4011974 2.3025851  5.301898 13.370925         . 26 1 1
     7 2001 2007 0 1  2.833213  2.564949  6.578965         .  9.540148 26 1 1
     7 2000 2007 0 1  2.564949  2.397895  5.749902 13.876872  10.14655 26 1 1
     7 2002 2007 0 1  3.496508 2.1972246  4.827796         . 10.473478 26 1 1
     7 2010 2007 1 1  3.713572 3.0910425  9.554543  13.41069 12.595788 23 . 1
     8 2004    . 1 1 2.1972246  2.944439  8.669721 13.977489         . 17 . .
     8 2011    . 1 1  2.484907  2.833213  8.027098 15.517985 9.1736765 13 . .
     8 2003    . 1 1   1.94591  2.564949  6.578965  13.96584         . 17 . .
     8 2000    . 0 1 1.3862944  2.833213  8.027098 13.969795         . 17 . .
     9 2015    . 0 1  2.397895  3.496508 12.225566         .  11.65472 29 . .
     9 2014    . 1 1 2.3025851  3.367296  11.33868 16.130667  12.19795 29 . .
    10 2006    . 0 1 2.1972246   2.70805  7.333536  12.24688         . 16 . .
    10 2015    . 0 1  3.433987  2.484907  6.174761  13.75274         . 12 . .
    10 2013    . 1 1  3.367296   1.94591  3.786566 12.848382         . 12 . .
    11 2013    . 0 1 1.3862944  2.944439  8.669721 16.368998         . 10 . .
    13 2000    . 0 1  2.833213  2.564949  6.578965 13.261424 10.695756 22 . .
    13 2005    . 1 1   1.94591  2.944439  8.669721 15.013954 11.860026 21 . .
    14 2003    . 0 1 3.0910425  2.833213  8.027098         . 10.630094 26 . .
    14 2015    . 1 1 3.5263605   1.94591  3.786566 14.136666  9.283219 23 . .
    14 2009    . 1 1  3.433987 2.3025851  5.301898 13.378538  8.660081 23 . .
    14 2010    . 0 1  3.465736 2.0794415  4.324077 12.958135  8.718009 23 . .
    16 2009    . 1 1   1.94591   4.85203   23.5422 15.197473 10.829867 32 . .
    16 2011    . 1 1 2.1972246 4.4998097  20.24829         . 11.818275 32 . .
    16 2010    . 1 1 2.0794415   4.59512  21.11513 15.029297 10.814404 32 . .
    16 2015    . 0 1  2.564949  5.123964 26.255005 14.762444 13.568634 32 . .
    16 2016    . 1 0 2.6390574  5.283204  27.91224 16.979624 13.110736 32 . .
    16 2013    . 1 1  2.397895 3.6888795 13.607832 14.666152 12.920453 32 . .
    16 2008    . 1 1 1.7917595  5.117994  26.19386  15.16214  8.003029 24 . .
    16 2014    . 0 1  2.484907  4.820282 23.235113         . 12.382204 32 . .
    17 2009    . 1 1 3.5263605   5.53339   30.6184  14.75193         . 12 . .
    17 2006    . 0 1  3.433987 4.4886365  20.14786  14.20142         . 16 . .
    17 2012    . 0 1  2.397895  5.375278  28.89362 15.552017         . 12 . .
    17 2011    . 1 1  3.583519  5.852202 34.248272 15.647636         . 12 . .
    17 2010    . 0 1  3.555348   4.94876 24.490227  15.17169         . 12 . .
    17 2007    . 0 1  3.465736  5.556828  30.87834  15.25402         . 16 . .
    17 2008    . 0 1  3.496508  5.497168 30.218857 14.856136  10.91509 16 . .
    17 2013    . 0 1  2.484907 1.3862944  1.921812 15.907572         . 12 . .
    18 2005 2015 0 1  2.995732   2.70805  7.333536 15.180366         . 28 1 1
    18 2013 2015 1 1 3.3322046  2.995732  8.974412 15.891048   9.23552 25 1 1
    18 2003 2015 0 1  2.890372  2.890372  8.354249 14.785128         . 28 1 1
    19 2004    . 0 1  2.890372  4.969813 24.699045 13.889056 11.900804 24 . .
    20 2014    . 0 1 1.0986123 3.6888795 13.607832 19.382566         . 24 . .
    20 2015    . 0 1  2.564949 4.2341065 17.927658  19.65147         . 24 . .
    23 2005 2007 0 1  3.367296  2.484907  6.174761 14.416636  7.677864 26 1 1
    23 2009 2007 1 1  3.295837 1.7917595  3.210402 13.178997  8.926784 23 . 1
    23 2011 2007 1 1  3.555348 2.0794415  4.324077  14.53971    8.2943 23 . 1
    25 2008    . 0 1 2.0794415 2.6390574  6.964624  15.05743 10.176754 24 . .
    25 2007    . 0 1   1.94591  2.564949  6.578965 14.786767 10.094769 24 . .
    25 2014    . 1 1  2.833213 2.6390574  6.964624 17.242157 10.927682 20 . .
    26 2011    . 1 1 3.4011974 1.0986123  1.206949   9.46312         . 12 . .
    26 2007    . 0 1  3.367296 1.7917595  3.210402  8.977525         . 16 . .
    26 2012    . 0 1  3.433987 1.0986123  1.206949  9.521861         . 12 . .
    27 2005    . 1 1  2.772589  2.772589  7.687248 16.365507  7.903965 24 . .
    27 2004    . 0 1   2.70805   2.70805  7.333536 16.354773  7.242083 24 . .
    30 2006 2007 0 1 3.2580965  1.609438 2.5902905         .  7.635304 26 1 1
    30 2011 2007 1 1   3.73767  3.970292 15.763217  14.10942 12.244967 23 . 1
    30 2013 2007 1 1 3.9512436 2.1972246  4.827796 12.920233   8.69165 23 . 1
    32 2008    . 1 1 2.0794415  2.944439  8.669721         .  7.790696 25 . .
    32 2007    . 0 1   1.94591  2.564949  6.578965         .  9.177507 25 . .
    33 2006    . 0 1  3.465736  4.477337 20.046545 14.201146         . 16 . .
    33 2010    . 1 1  3.555348 4.4998097  20.24829         .         . 12 . .
    33 2016    . 1 0   2.70805  2.397895  5.749902 15.503132         . 12 . .
    34 2004    . 0 1   1.94591  2.833213  8.027098 11.554893  7.809947 29 . .
    34 2005    . 1 1  2.397895 2.0794415  4.324077         .  8.410943 29 . .
    35 2011    . 0 1  2.564949  2.397895  5.749902 16.026924         . 24 . .
    37 2003    . 1 1 3.5263605  1.609438 2.5902905 11.346848  8.258681 29 . .
    37 1999    . 0 1 3.5263605 2.0794415  4.324077  13.22428         . 29 . .
    37 2007    . 0 1  3.637586 1.7917595  3.210402 14.103975         . 29 . .
    37 2001    . 1 1  3.465736  1.609438 2.5902905 13.340822   8.56121 29 . .
    39 2001    . 0 0  3.178054  2.564949  6.578965 13.266263         . 22 . .
    39 2013    . 0 0  3.583519 1.3862944  1.921812 12.376183         . 18 . .
    39 2011    . 1 0 3.5263605   1.94591  3.786566 14.167113         . 18 . .
    40 2004    . 0 1   1.94591  2.944439  8.669721 14.657135         . 20 . .
    40 2000    . 0 1 1.3862944 2.0794415  4.324077  14.27706  9.219894 20 . .
    40 2015    . 1 1  2.944439  2.944439  8.669721 16.117624   10.1105 16 . .
    43 2004    . 0 1 1.7917595 2.1972246  4.827796         .  9.932318 24 . .
    43 2013    . 1 1   2.70805 2.6390574  6.964624  16.54192 11.846415 21 . .
    44 2011    . 0 1 1.3862944  2.484907  6.174761         .         . 10 . .
    45 2015    . 0 1  2.995732  2.397895  5.749902 15.899123 10.617662 27 . .
    47 2006    . 0 1  3.496508 2.3025851  5.301898  14.10745   7.20786 26 . .
    47 2009    . 0 1  2.995732 2.3025851  5.301898 13.191768  6.514713 23 . .
    47 2011    . 1 1 3.4011974 1.0986123  1.206949         .         . 23 . .
    47 2003    . 0 1 3.4011974 2.1972246  4.827796 12.983995         . 26 . .
    49 2008    . 1 1  2.890372   2.70805  7.333536         .  11.14829 26 . .
    49 2001    . 0 1  2.397895  2.995732  8.974412 11.006324 10.256923 26 . .
    49 2000    . 0 1 3.0445225  2.944439  8.669721 12.398155  9.433484 26 . .
    51 2006    . 0 1   2.70805  2.564949  6.578965 14.529168         . 20 . .
    54 2007    . 1 1  3.367296   2.70805  7.333536 14.884635         . 28 . .
    54 2004    . 0 1 3.2580965  2.397895  5.749902   14.7036         . 28 . .
    55 2011    . 0 1 1.3862944   1.94591  3.786566 11.706994         . 10 . .
    56 2003 2007 1 1   1.94591  2.484907  6.174761  10.26399         . 27 1 1
    56 2002 2007 0 1 1.7917595  2.397895  5.749902 10.208395         . 27 1 1
    56 2004 2007 1 1 2.0794415  .6931472   .480453         .         . 27 1 1
    57 2015    . 0 1 1.3862944 2.0794415  4.324077 14.600776  7.673223 23 . .
    end
    Last edited by George Paily; 09 Jun 2022, 00:59.

  • #2
    I don't understand how this can be done from the data you show. Perhaps I just don't understand what some of the variables mean. If you tried to create your new variable by hand, which variable(s) would you use to identify the year in which a company "entered into the small scale segment," and how would you use it(them) to do so?



    • #3
      Thank you very much for the reply, and sorry for the confusion in my question. Let me try to make it clearer. My aim is to understand how small-scale de-reservation has affected product innovation of small firms compared with firms that started producing those products after de-reservation. After de-reservation any firm can make these products; that is the policy change. In the data, year_dereservation is the year in which the product corresponding to that firm was de-reserved. Some firms were producing these products before de-reservation. These firms can be identified by looking at the observations with year <= year_dereservation. Now, after the policy change, how can I identify the firms that were not in the data set before year_dereservation but appear only after year_dereservation?

      thanks a lot for your help



      • #4
        So, if I am understanding correctly, the code would be:
        Code:
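        replace incumbent_dummy = 0 if missing(incumbent_dummy)   // recode 1/missing as 1/0

        * new_entry = 1 only if every observation of the firm comes after de-reservation
        by fact_id (year), sort: egen byte new_entry = min(year > year_dereservation) ///
            if !missing(year_dereservation)

        * safeguard: an observation flagged as incumbent is never a new entry
        replace new_entry = 0 if incumbent_dummy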
        The already existing variable incumbent_dummy is coded as 1/missing. This is a bad practice in Stata that leads to problems. It is better to code dichotomous variables in Stata as 1 = yes, 0 = no, with missing values used only for observations where it is indeterminate whether yes or no is applicable. Hence the first line in the above code.

        It appears that in the example data, there are no new entry firms. All of the fact_id's that have an associated dereservation year also have observations that precede that year. Hopefully this is not the case in your entire data set. Alternatively, I may still be misunderstanding what is wanted and the code would be wrong.
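
To see what the code in #4 does when a new entrant is actually present, here is a minimal sketch on made-up data. The three firms below are hypothetical and are not taken from the example posted in #1; the variable firm_tag is also introduced only for this illustration. The incumbent dummy is created directly as 1/0 here instead of being recoded.

```stata
* Toy data (hypothetical): firm 1 is observed before and after its
* de-reservation year, firm 2 only afterwards, firm 3 has no such year.
clear
input float(fact_id year) int year_dereservation
1 2005 2007
1 2010 2007
2 2009 2007
2 2011 2007
3 2004 .
end

* Observation-level incumbent flag, coded 1/0 rather than 1/missing
gen byte incumbent_dummy = year <= year_dereservation & !missing(year_dereservation)

* new_entry = 1 only if every observation of the firm falls after de-reservation
by fact_id (year), sort: egen byte new_entry = min(year > year_dereservation) ///
    if !missing(year_dereservation)
replace new_entry = 0 if incumbent_dummy

* Count firms (not observations) flagged as new entrants
egen byte firm_tag = tag(fact_id)
list fact_id year year_dereservation incumbent_dummy new_entry, sepby(fact_id)
count if firm_tag & new_entry == 1
```

On this toy data only firm 2 ends up with new_entry equal to 1. Firm 1 gets 0 because it has an observation before 2007, and firm 3 stays missing because it has no de-reservation year. Running the same count on the full data set would show whether any new entrants exist at all.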



        • #5
          Thank you very much, Clyde. As I mentioned above, the policy change happened in different years for different firms. In this context, I was wondering whether I could examine the impact of de-reservation using a difference-in-differences design with multiple time periods, so I was trying to use the csdid and drdid packages. But I am stuck on how to create the treatment variable. Let me explain more clearly. If a firm's year_dereservation is 2000, that firm should be considered treated in all subsequent years in which it appears. Likewise, if a firm first got its treatment in 2001, it should be considered treated in all subsequent years in which it appears. So my control group will be the 'not yet treated' firms. It would be greatly appreciated if you could tell me how to do this. Thanks a lot.
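
The rule described above can be sketched on made-up data as follows. The data and the names treated and first_treat are hypothetical. The sketch assumes that treatment starts in the calendar year after year_dereservation, and that csdid wants a cohort variable holding the first treated year, with 0 for units that are never treated. Both assumptions should be checked against the package's help file before use.

```stata
* Toy data (hypothetical): firm 1 de-reserved in 2000, firm 2 in 2001,
* firm 3 never (missing year_dereservation).
clear
input float(fact_id year) int year_dereservation
1 1999 2000
1 2001 2000
1 2002 2000
2 2000 2001
2 2002 2001
3 2001 .
end

* Observation-level indicator: treated in every year after de-reservation.
* Missing compares as larger than any number, so firm 3 is 0 throughout.
gen byte treated = year > year_dereservation

* Cohort variable: first treated calendar year, 0 for never-treated firms.
* ASSUMPTION: treatment begins in the year after year_dereservation.
gen int first_treat = cond(missing(year_dereservation), 0, year_dereservation + 1)

list, sepby(fact_id)

* Hypothetical call, not run here (csdid has to be installed from SSC first):
* csdid new_product_dummy, ivar(fact_id) time(year) gvar(first_treat) notyet
```

The cohort variable is constant within a firm, whereas treated switches from 0 to 1 over time, which is the difference between what a staggered-adoption estimator is usually given and what a plain two-way fixed-effects regression uses.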



          • #6
            Well, there are two things that seem to get in the way of this approach. One is that you have some firms for which no year_dereservation value is specified. What is their story?

            The other is that if, as I think you are saying in #5, every firm is eventually treated, then there is no real control group for a DID design. What you would have is called a stepped wedge design, and it would require including in the regression not just the already_treated vs not_yet_treated variable but also fixed effects for firm and year. As a way of identifying causal effects it is a somewhat weaker design than a DID. A (generalized) DID can have different units initiate treatment at different times, but it is not doable when there are no never-treated units.



            • #7
              Thank you so much. Yes, for some firms year_dereservation is missing because they are large firms and not subject to the policy change. I was not sure whether they would be an adequate control group: although they are never treated, the pre-existing characteristics might differ between small and large firms.

              Therefore I removed these large firms from the data and kept only small firms that got the treatment in some year. Once they get the treatment they must be considered treated. I was thinking that a good control group would be the firms that are not yet treated, since they might have more or less the same pre-treatment characteristics.

              If large firms could be a proper control group, I could retain them in the sample and use the never-treated firms as controls.

              thanks a lot



              • #8
                Well, if you want to proceed with the stepped-wedge design, the code to create the variable reflecting treatment status, after you remove all the large firms from the data set, is:
                Code:
                assert !missing(year)
                label define treatment_status   0   "Not yet treated"   1   "Treated"
                gen byte treatment_status:treatment_status = year > year_dereservation
                And remember that if you go this route, the regression must also include both firm and year fixed effects to give valid results.
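
As a rough illustration of a regression with both sets of fixed effects, here is a sketch on simulated data. Everything in it is hypothetical: the outcome y, the effect size, the number of firms, and the choice of xtreg with firm-clustered standard errors are assumptions made for the example, not something specified in this thread.

```stata
* Simulated panel (hypothetical numbers, for illustration only)
clear
set seed 12345
set obs 40
gen fact_id = _n
gen int year_dereservation = 2003 + mod(_n, 5)   // staggered: 2003 to 2007
expand 10
bysort fact_id: gen int year = 1999 + _n         // each firm seen 2000 to 2009

* Same rule as above: treated in every year after de-reservation
gen byte treatment_status = year > year_dereservation

* Made-up outcome with a treatment effect of 0.5 and a common time trend
gen y = 0.5*treatment_status + 0.1*(year - 2000) + rnormal()

xtset fact_id year
* Firm fixed effects come from -fe-, year fixed effects from i.year;
* standard errors are clustered on firm
xtreg y i.treatment_status i.year, fe vce(cluster fact_id)
```

With the real data, y would be replaced by the outcome of interest (for example new_product_dummy) and any covariates would be added to the variable list.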

                As for whether you can use large firms as a control group, I cannot advise you. I don't work in finance/economics, and I have no idea whether there are relevant differences between large and small firms that would make them unsuitable. Perhaps somebody who works in your field is following this thread and will chime in. If not, you will need to get advice from a colleague, or a forum on this topic. It is not a statistical or Stata issue; it is a substantive question in your discipline.



                • #9
                  Thank you so much. It was of great help.
