  • counting without dropping

    Hi dear Profs and colleagues,

    I have three variables: the firm ID (NPC_FIC), year, and the firm's age (firm_age). I want to obtain firm age by firm i and year t (firm_age i,t) for 2010-2011 while keeping all observations.
    (The full period covers 12 years; the example shows only the first 2.)
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long NPC_FIC byte firm_age int year
    500000002 25 2010
    500000002 25 2010
    500000002 25 2010
    500000002 25 2010
    500000002 25 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000119 22 2010
    500000346 18 2010
    500000346 18 2010
    500000346 18 2010
    500000346 18 2010
    500000346 18 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000376 64 2010
    500000395 12 2010
    500000395 12 2010
    500000395 12 2010
    500000395 12 2010
    500000395 12 2010
    500000395 12 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000856 25 2010
    500000002 26 2011
    500000002 26 2011
    500000002 26 2011
    500000002 26 2011
    500000002 26 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000119 23 2011
    500000346 19 2011
    500000346 19 2011
    500000346 19 2011
    500000346 19 2011
    500000346 19 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000376 65 2011
    500000395 13 2011
    500000395 13 2011
    500000395 13 2011
    500000395 13 2011
    500000395 13 2011
    500000395 13 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    500000856 26 2011
    end

    If I do this:

    Code:
    bysort NPC_FIC : keep if _n == _N
    only one observation per firm is left, which is not what I am looking for. Is there any way to compute firms' age for i and t without dropping observations?

    All my thanks.

  • #2
    In the example you provide, firm age is constant within each firm and year:

    Code:
    . bysort NPC_FIC year: assert firm_age[1]==firm_age[_N]
    
    .
    so the assertion holds. It is therefore not clear what you want to compute: in your example, firm age is already computed.
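The assertion above compares only the first and last row of each firm-year group, in whatever order the rows happen to be stored. That is enough for the example, but a slightly stricter variant can be sketched as follows. The mini-sample is an assumption made for illustration: it reuses a few rows from the -dataex- listing in #1 so that the block runs on its own.

```stata
* Hypothetical mini-sample: a few rows copied from the -dataex- listing in #1.
clear
input long NPC_FIC byte firm_age int year
500000002 25 2010
500000002 25 2010
500000119 22 2010
500000119 22 2010
500000002 26 2011
500000119 23 2011
end

* Sorting firm_age within each firm-year group (the variable in parentheses)
* puts the smallest value first and the largest last, so comparing the two
* covers every row of the group, whatever order the data arrived in.
bysort NPC_FIC year (firm_age): assert firm_age[1] == firm_age[_N]
```

If firm_age differed anywhere within a firm-year group, the assert would stop with an error instead of passing silently.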



    • #3
      Thank you for getting back to me.
      You are right: firms' age has already been computed. What I need is each firm's age once per firm and year. Firm IDs are repeated in the data, and so, naturally, are firm ages. So I need to keep or count one observation per firm and year (firm age i,t) without dropping the others.



      • #4
        I still do not understand...

        Do you want to tag only one observation per firm/year group? If so

        Code:
        egen mytag = tag(NPC_FIC year)
        Then, for any computation that should use only one observation per firm and year, add the qualifier if mytag. For example:

        Code:
        . tabstat firm_age if mytag, by(year) stats(mean count)
        
        Summary for variables: firm_age
        Group variable: year 
        
            year |      Mean         N
        ---------+--------------------
            2010 |  27.66667         6
            2011 |  28.66667         6
        ---------+--------------------
           Total |  28.16667        12
        ------------------------------
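As a sketch of how tagging differs from the keep if _n == _N attempt in #1, the same by-group logic can create an indicator instead of dropping rows. The names lastobs and nfirms are hypothetical, and the mini-sample is an assumption that reuses a few rows from the listing in #1.

```stata
* Hypothetical mini-sample: a few rows copied from the -dataex- listing in #1.
clear
input long NPC_FIC byte firm_age int year
500000002 25 2010
500000002 25 2010
500000119 22 2010
500000119 22 2010
500000119 22 2010
500000002 26 2011
500000002 26 2011
500000119 23 2011
end

* Flag the last row of each firm-year group instead of dropping the others.
bysort NPC_FIC year: generate byte lastobs = (_n == _N)

* egen's tag() also marks exactly one row per firm-year group.
egen byte mytag = tag(NPC_FIC year)

* Number of distinct firms in each year, stored on every row of that year.
egen nfirms = total(mytag), by(year)

* Every row is still in memory; the summary uses one row per firm-year.
count
summarize firm_age if lastobs
```

Either indicator can then be used in an if qualifier, so the full data set stays available for anything that needs all establishments.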



        • #5
          Let me put it this way. Please ignore #1.
          I will start from the beginning.
          I need to add a control variable, namely firms' age.
          Some firms have several establishments, so they appear more than once in the data.
          Question: is it correct to enter firms' age as a control variable under these conditions (duplicated values of firms' age and firm IDs)?
          The explanatory variable has 84 observations, while firm age has several thousand (the number of establishments in the data).
          Thank you.
          Last edited by Paris Rira; 25 Jan 2023, 15:06.



          • #6
            Now I think I understand :P.

            1. If your dependent variable and your main explanatory variable vary at the establishment level, it is correct to include the repeated age. You do not need to do anything; just run the regression on your raw data. In this case you are fundamentally conducting the analysis at the establishment level, not at the firm level.

            2. If your dependent variable and your explanatory variable vary only at the firm level, that is, your regression is fundamentally a firm-level regression and not an establishment-level regression, then you need to do what I showed you in #4: tag one establishment per firm and year, and run the regression only on the tagged observations.
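The two cases above can be sketched in Stata as follows. Everything here except NPC_FIC, year, firm_age, and the tag from #4 is an assumption: the data are simulated, and y_est, x_est, y_firm, and x_firm are hypothetical stand-ins for outcome and explanatory variables that the thread never names.

```stata
* Simulated, purely illustrative data: 5 firms, 2 years, 4 establishments each.
clear
set seed 12345
set obs 40
generate long NPC_FIC = 500000000 + ceil(_n/8)
bysort NPC_FIC: generate int year = 2010 + mod(_n - 1, 2)
generate byte firm_age = 10 + 3*(NPC_FIC - 500000000) + (year - 2010)

* Case 1: hypothetical variables that vary across establishments.
generate double x_est = rnormal()
generate double y_est = 1 + 0.5*x_est + 0.1*firm_age + rnormal()

* Case 2: hypothetical variables that are constant within a firm-year.
bysort NPC_FIC year: generate double x_firm = rnormal() if _n == 1
by NPC_FIC year: replace x_firm = x_firm[1]
by NPC_FIC year: generate double y_firm = 1 + 0.5*x_firm + 0.1*firm_age + rnormal() if _n == 1
by NPC_FIC year: replace y_firm = y_firm[1]

* Case 1: establishment-level regression on all rows, repeated firm_age included.
regress y_est x_est firm_age

* Case 2: firm-level regression on one tagged row per firm and year, as in #4.
egen byte mytag = tag(NPC_FIC year)
regress y_firm x_firm firm_age if mytag
```

The first regression uses all 40 simulated rows, while the second uses only the 10 tagged firm-year rows, which is the practical difference between the two cases.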



            • #7
              Dear Kolev, thank you so much for the comprehensive explanation. The econometric model is a bit complex, though:
              [Image: equation.PNG]

              i: firms
              s: sectors
              d: districts (7 districts)
              t: 2010-2021

              The explanatory variable of interest, Sdt, has 7 districts * 12 years = 84 observations.
              Xit: firms' age (control variable)

              The explanatory variable and the control variable do not have the same dimension. Is the second option in #6 still valid?

