  • Replacing variable's value given conditions in other observations.

    Hello everyone,

    At the moment I am trying to create a variable that captures the code of the industry sector with the most jobs in a city, for each year. I am using a dataset of company-level information: the city the company is located in, the year, the number of jobs, the industry the company operates in, and the corresponding industry code. I have created a sample of the dataset below.
    I already created the variable MostJobsSector, which gives the highest number of jobs accounted for by a single industry, per city, per year. The following code was used:

    Code:
    egen MostJobsSector= max(TotalJobSector), by (city year)
    Now I want to create a variable (IndustryMostJobs) that captures the code of the industry with the most jobs in each city, per year. The values of this variable should be the same for all companies within the same city and year. I tried to achieve this with the following code, but Stata responded "invalid syntax" at the third line.

    Code:
    gen IndustryMostJobs = 0
    replace industryMostJobs = IndustryCode if TotalJobSector == MostJobSector
    replace IndustryMostJobs =  max(IndustryCode), by(year plaats) if IndustryMostJobs == 0
    company city year IndustryCode JobsCompany TotalJobSector MostJobsSector IndustryMostJobs
    1 Chicago 1997 01 5 8 8 01
    2 Chicago 1997 01 3 8 8 01
    3 Chicago 1997 02 2 3 8 01
    4 Chicago 1997 02 1 3 8 01
    5 Chicago 1998 01 4 6 9 02
    6 Chicago 1998 01 2 6 9 02
    7 Chicago 1998 02 9 9 9 02
    8 Chicago 1998 03 7 7 9 02
    9 Miami 1997 01 3 11 11 01
    10 Miami 1997 01 6 11 11 01
    11 Miami 1997 01 2 11 11 01
    12 Miami 1997 02 8 8 11 01
    13 Miami 1998 01 1 1 13 03
    14 Miami 1998 02 5 5 13 03
    15 Miami 1998 03 4 13 13 03
    16 Miami 1998 03 4 13 13 03
    17 Miami 1998 03 5 13 13 03
    Any input will be highly appreciated. Please post any questions in case I have not been fully clear.

    Yours Sincerely,

    Patrick Johnson


  • #2
    Patrick:
    a tentative answer might be (warning: variable names are in lower case):
    Code:
    bysort city year: egen industrymostjobs=max(industrycode)
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Sir Lazzaro,

      This code worked perfectly. Thank you very much.

      Patrick



      • #4
        Patrick:
        you're welcome.
        Please call me Carlo, as all on (and many more off) the list do!
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Patrick,

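          1. Carlo's solution at #2 is incorrect: max(industrycode) returns the largest industry code within each city and year, not the code of the industry with the most jobs (for Chicago in 1997 it would return 02 rather than 01). Better code, which seems to give the right output for your example, might be: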
          Code:
          bys city year (TotalJobSector): gen IndustryMostJobs= IndustryCode[_N]
          2. However, in your actual data there might be ties, i.e. two or more industries sharing the maximum TotalJobSector within a city and year. It is unclear which IndustryCode you want to pick as IndustryMostJobs in that situation.

          3. In future posts, please use -dataex- (as instructed in https://www.statalist.org/forums/help, section 12.2) to give a small data example, which makes the discussion more effective and convenient.
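
          Points 1 and 2 can be tried out on the example data with a minimal sketch like the one below. It is only an illustration under assumptions: IndustryCode is taken to be numeric (displayed with a leading zero), ties are broken in favour of the highest IndustryCode, which is an arbitrary choice, and the helper variables tagsector and NSectorsAtMax are hypothetical names that do not appear in the original post.

```stata
clear
input company str7 city year IndustryCode JobsCompany
 1 "Chicago" 1997 1 5
 2 "Chicago" 1997 1 3
 3 "Chicago" 1997 2 2
 4 "Chicago" 1997 2 1
 5 "Chicago" 1998 1 4
 6 "Chicago" 1998 1 2
 7 "Chicago" 1998 2 9
 8 "Chicago" 1998 3 7
 9 "Miami"   1997 1 3
10 "Miami"   1997 1 6
11 "Miami"   1997 1 2
12 "Miami"   1997 2 8
13 "Miami"   1998 1 1
14 "Miami"   1998 2 5
15 "Miami"   1998 3 4
16 "Miami"   1998 3 4
17 "Miami"   1998 3 5
end
format IndustryCode %02.0f

* Jobs per industry within each city and year
egen TotalJobSector = total(JobsCompany), by(city year IndustryCode)

* Largest sector total within each city and year
egen MostJobsSector = max(TotalJobSector), by(city year)

* Sort so the largest sector comes last within each city-year.
* Assumption: ties are broken by taking the highest IndustryCode.
bysort city year (TotalJobSector IndustryCode): gen IndustryMostJobs = IndustryCode[_N]
format IndustryMostJobs %02.0f

* Count how many distinct industries reach the maximum in each city-year
* (tagsector and NSectorsAtMax are hypothetical helper names)
egen byte tagsector = tag(city year IndustryCode)
egen NSectorsAtMax = total(tagsector & (TotalJobSector == MostJobsSector)), by(city year)

* City-years with a tie, if any
list city year IndustryCode TotalJobSector if NSectorsAtMax > 1 & tagsector
```

          On the example data the final list shows nothing, because no city-year has a tie. In real data, any rows it lists are the cases where the tie-breaking rule matters and should be decided deliberately.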
