Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Data manipulation in unbalanced panel data

    Dear Statalist Community,

    I'm working on an unbalanced panel dataset, a toy example is presented below
    Input Data
    Year Firm Supplier
    2010 A XX
    2010 A XY
    2010 A XZ
    2010 B ZZ
    2010 B XX
    2011 A XX
    2011 A XY
    2011 B XY
    2011 B ZZ
    2012 A XX
    2012 A XY
    2012 A ZZ
    2012 A XZ
    2012 B ZZ
    My task is to identify those suppliers who have been with the firm for atlas 3 "consecutive" years and categorize them as belonging to a particular class. Specifically, my ideal output table would be as follows
    Output Status
    Year Firm Supplier Category
    2012 A XX 1
    2012 A XY 1
    2012 A ZZ 0
    2012 A XZ 0
    2012 B ZZ 1
    Any help in solving this is much appreciated.
    Many thanks!

  • #2
    I don't quite understand what you're trying to do here. The description seems clear enough, but your ideal output table doesn't seem consistent with it. First, there is no apparent reason why you have a Year variable in it if the purpose is just to describe which Firm Supplier pairs have had any 3 consecutive years of relationship. It is also clear why you don't show Firm B Supplier XX and Firm B Supplier XY with Category = 0.

    Anyway, I've gone with what your words said, rather than your table, except that I do include a year variable, namely the last year in which that supplier and firm were associated. Hopefully you can figure out how to modify the code to get what you actually want.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year str1 firm str2 supplier
    2010 "A" "XX"
    2010 "A" "XY"
    2010 "A" "XZ"
    2010 "B" "ZZ"
    2010 "B" "XX"
    2011 "A" "XX"
    2011 "A" "XY"
    2011 "B" "XY"
    2011 "B" "ZZ"
    2012 "A" "XX"
    2012 "A" "XY"
    2012 "A" "ZZ"
    2012 "A" "XZ"
    2012 "B" "ZZ"
    end
    
    // IDENTIFY SPELLS OF CONSECUTIVE YEARS A SUPPLIER HAS WITH A FIRM
    by supplier firm (year), sort: gen spell = sum(year != year[_n-1] + 1)
    //    GET SPELL LENGTH
    by supplier firm spell (year), sort: gen length = _N
    
    //    NOW IDENTIFY IF A FIRM AND SUPPLIER HAVE HAD
    //    ANY 3 YEAR SPELLS
    by supplier firm: egen category = max(length)
    replace category = (category >= 3)
    
    //    SAVE OBSERVATION FROM LAST YEAR IN FIRM-SUPPLIER PAIR
    by firm supplier (year), sort: keep if _n == _N
    In the future, please post example data using the -dataex- command, as I have done in this response. The kind of table you showed can be difficult to import into Stata (it wasn't in this instance, but they often are). Run -ssc install dataex- to install the -dataex- command and then run -help dataex- to read the simple instructions for using it. This is the most helpful way to give example data because it enables those who want to help you to quickly and easily create a 100% faithful replica of your Stata data example with a simple copy/paste operation.

    Comment


    • #3
      Thank you very much Clyde! It solves my problem and I agree the output table was not consistent. I only needed to identify if the suppliers in 2012 (for all firms in the same) had a relationship for more than 3 years with the firms. I think I should have been more clear on that. I will also use dataex for examples here on.

      Comment

      Working...
      X