Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Collapse dyads data

    I am willing to collapse the LEHD Origin-Destination Employment Statistics (LODES) in order to get overall informations regarding the employment in the block.
    The dataset connects a home block with a work block and gives one count of jobs for each home-work block dyad and for each year.
    Each obs of my dataset contains a pair of h-w block (h_geocode and w_geocode) and the number of workers (s00) for each dyad and year.
    I would like to compute the total numbers of workers that live in block H and the share of workers that commute to another block to work (h_geocode!=w_geocode).
    Moreover I would like to collect the same observation at the Census Tract level (w_trc and h_trc).
    I have some doubts\trouble with collapsed data. Do you have any suggestion on how to deal with it?
    Thank you!


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float year double(w_geocode h_geocode) str11(w_trc h_trc) int s000 str5 county
    2005 280010006001010 280010001001000 "28001000600" "28001000100" 1 "28001"
    2005 280010005003018 280010001001000 "28001000500" "28001000100" 1 "28001"
    2005 280010009002001 280010001001000 "28001000900" "28001000100" 1 "28001"
    2005 280679510004019 280010001001002 "28067951000" "28001000100" 1 "28001"
    2005 280639502001009 280010001001002 "28063950200" "28001000100" 1 "28001"
    2005 280010005004012 280010001001002 "28001000500" "28001000100" 1 "28001"
    2005 280010007002001 280010001001003 "28001000700" "28001000100" 1 "28001"
    2005 280010006001091 280010001001003 "28001000600" "28001000100" 1 "28001"
    2005 280010008003008 280010001001003 "28001000800" "28001000100" 1 "28001"
    2005 280010006001067 280010001001003 "28001000600" "28001000100" 1 "28001"
    2005 280010004002027 280010001001003 "28001000400" "28001000100" 1 "28001"
    2005 280010007002020 280010001001003 "28001000700" "28001000100" 1 "28001"
    2005 280010007002003 280010001001003 "28001000700" "28001000100" 1 "28001"
    2005 280010006001091 280010001001004 "28001000600" "28001000100" 1 "28001"
    2005 280490033005005 280010001001004 "28049003300" "28001000100" 1 "28001"
    2005 280219502001041 280010001001004 "28021950200" "28001000100" 1 "28001"
    2005 280679506002071 280010001001004 "28067950600" "28001000100" 1 "28001"
    2005 281499506002001 280010001001004 "28149950600" "28001000100" 1 "28001"
    2005 280890301051006 280010001001004 "28089030105" "28001000100" 1 "28001"
    2005 281599503002060 280010001001004 "28159950300" "28001000100" 1 "28001"
    2005 280010005005030 280010001001005 "28001000500" "28001000100" 1 "28001"
    2005 280010009003002 280010001001005 "28001000900" "28001000100" 1 "28001"
    2005 280730205002072 280010001001005 "28073020500" "28001000100" 1 "28001"
    2005 280010004003012 280010001001005 "28001000400" "28001000100" 1 "28001"
    2005 280010008003012 280010001001005 "28001000800" "28001000100" 1 "28001"
    2005 280839506004000 280010001001005 "28083950600" "28001000100" 1 "28001"
    2005 280010007002094 280010001001005 "28001000700" "28001000100" 1 "28001"
    2005 280010007002097 280010001001005 "28001000700" "28001000100" 2 "28001"
    2005 280010008003042 280010001001005 "28001000800" "28001000100" 2 "28001"
    2005 280010007002018 280010001001005 "28001000700" "28001000100" 2 "28001"
    2005 280750102014119 280010001001005 "28075010201" "28001000100" 1 "28001"
    2005 280979503002110 280010001001005 "28097950300" "28001000100" 1 "28001"
    2005 280010006001091 280010001001005 "28001000600" "28001000100" 2 "28001"
    2005 281139503005031 280010001001005 "28113950300" "28001000100" 1 "28001"
    2005 280010004003041 280010001001005 "28001000400" "28001000100" 1 "28001"
    2005 280010007002056 280010001001005 "28001000700" "28001000100" 1 "28001"
    2005 280010009002045 280010001001005 "28001000900" "28001000100" 1 "28001"
    2005 280639502001011 280010001001005 "28063950200" "28001000100" 1 "28001"
    2005 280859505005017 280010001001005 "28085950500" "28001000100" 1 "28001"
    2005 280490035003000 280010001001005 "28049003500" "28001000100" 1 "28001"
    2005 280010006001010 280010001001005 "28001000600" "28001000100" 1 "28001"
    2005 281210206003015 280010001001005 "28121020600" "28001000100" 1 "28001"
    2005 280219502001041 280010001001006 "28021950200" "28001000100" 1 "28001"
    2005 280010008003008 280010001001006 "28001000800" "28001000100" 1 "28001"
    2005 280010008003001 280010001001006 "28001000800" "28001000100" 1 "28001"
    2005 280010005004007 280010001001006 "28001000500" "28001000100" 1 "28001"
    2005 280010006001091 280010001001006 "28001000600" "28001000100" 2 "28001"
    2005 280679509001001 280010001001006 "28067950900" "28001000100" 1 "28001"
    2005 280010008003042 280010001001006 "28001000800" "28001000100" 1 "28001"
    2005 280010008003005 280010001001007 "28001000800" "28001000100" 1 "28001"
    2005 280219502001042 280010001001008 "28021950200" "28001000100" 1 "28001"
    2005 280010007002003 280010001001008 "28001000700" "28001000100" 1 "28001"
    2005 280010007002083 280010001001008 "28001000700" "28001000100" 1 "28001"
    2005 280010008001019 280010001001008 "28001000800" "28001000100" 2 "28001"
    2005 280010007002018 280010001001008 "28001000700" "28001000100" 1 "28001"
    2005 280639502001009 280010001001008 "28063950200" "28001000100" 1 "28001"
    2005 280219502001004 280010001001008 "28021950200" "28001000100" 1 "28001"
    2005 280010009002001 280010001001008 "28001000900" "28001000100" 1 "28001"
    2005 280859506004012 280010001001014 "28085950600" "28001000100" 1 "28001"
    2005 281059506011052 280010001001014 "28105950601" "28001000100" 1 "28001"
    2005 280010009003006 280010001001014 "28001000900" "28001000100" 1 "28001"
    2005 280219503001011 280010001001014 "28021950300" "28001000100" 1 "28001"
    2005 280010009001042 280010001001014 "28001000900" "28001000100" 3 "28001"
    2005 280639502001009 280010001001014 "28063950200" "28001000100" 1 "28001"
    2005 280010008001016 280010001001014 "28001000800" "28001000100" 3 "28001"
    2005 280010004002028 280010001001014 "28001000400" "28001000100" 1 "28001"
    2005 281210207031002 280010001001014 "28121020703" "28001000100" 1 "28001"
    2005 280919503003020 280010001001014 "28091950300" "28001000100" 1 "28001"
    2005 280010009001050 280010001001014 "28001000900" "28001000100" 1 "28001"
    2005 280010007002065 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010008003042 280010001001014 "28001000800" "28001000100" 4 "28001"
    2005 280010006001010 280010001001014 "28001000600" "28001000100" 3 "28001"
    2005 280010007002094 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010008001023 280010001001014 "28001000800" "28001000100" 1 "28001"
    2005 280010007002095 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010007002014 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010009001064 280010001001014 "28001000900" "28001000100" 1 "28001"
    2005 280010008001012 280010001001014 "28001000800" "28001000100" 1 "28001"
    2005 280010001002065 280010001001014 "28001000100" "28001000100" 2 "28001"
    2005 280010007002097 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010008003012 280010001001014 "28001000800" "28001000100" 1 "28001"
    2005 280010007001025 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010007002023 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010006001091 280010001001014 "28001000600" "28001000100" 4 "28001"
    2005 281510003002013 280010001001014 "28151000300" "28001000100" 1 "28001"
    2005 280010009001066 280010001001014 "28001000900" "28001000100" 2 "28001"
    2005 280010007002083 280010001001014 "28001000700" "28001000100" 1 "28001"
    2005 280010009001065 280010001001014 "28001000900" "28001000100" 1 "28001"
    2005 280639501002042 280010001001014 "28063950100" "28001000100" 1 "28001"
    2005 280010004001026 280010001001014 "28001000400" "28001000100" 1 "28001"
    2005 280010009001003 280010001001015 "28001000900" "28001000100" 1 "28001"
    2005 281310201001041 280010001001015 "28131020100" "28001000100" 1 "28001"
    2005 280010009001042 280010001001015 "28001000900" "28001000100" 1 "28001"
    2005 280010008001012 280010001001015 "28001000800" "28001000100" 2 "28001"
    2005 280470006001020 280010001001015 "28047000600" "28001000100" 1 "28001"
    2005 281210204011064 280010001001015 "28121020401" "28001000100" 1 "28001"
    2005 280450301001034 280010001001015 "28045030100" "28001000100" 1 "28001"
    2005 280490112024007 280010001001015 "28049011202" "28001000100" 1 "28001"
    2005 280010008003042 280010001001015 "28001000800" "28001000100" 3 "28001"
    2005 280010007002072 280010001001015 "28001000700" "28001000100" 1 "28001"
    end

  • #2
    Because you are trying to generate results at two different levels of aggregation (block and tract) I don't think -collapse- is the way to go here. Try this:

    Code:
    isid h_geocode year w_geocode, sort
    gen byte same_block = w_geocode == h_geocode
    gen byte same_tract = w_trc == h_trc
    by h_geocode year: egen block_workers = total(s000)
    by h_geocode year (same_block), sort: gen prop_work_same_block = s000[_N]/block_workers
    by h_trc year (same_tract), sort: egen tract_workers = total(s000)
    by h_trc year (same_tract): egen prop_work_same_tract = total(cond(same_tract, s000/tract_workers, .))

    Comment


    • #3
      That is a great help Clyde, it worked smoothly.
      In order to calculate the % of people that commute ousitde home block I can simply use:

      gen comm=1- prop_work_same_block

      But I am wondering how to measure the % of commuters only to certain block (eg. if w_geocode_characteristic==1).

      by h_geocode year: egen commuters = total(s000) if same_block==0&w_geocode_characteristic==1

      compute what I want.
      Of course I can use this code

      drop if same_block==1
      by h_geocode year: egen commuters= total(s000)

      and then merge this variable with the old dataset, but since it is quite time consuming (the dataset is quite big and merge operations take a lot of time) I hope there is a smarter alternative.





      Comment


      • #4
        But I am wondering how to measure the % of commuters only to certain block (eg. if w_geocode_characteristic==1).
        There is no variable named w_geocode_characteristic in the example data. I'll assume you just mean w_geocode. Anyway, it's
        Code:
        by h_geocode year, sort: egen numerator = total(cond(w_geocode == 1, s00, .))
        gen wanted = numerator/block_workers

        Comment


        • #5
          Thank you a lot, it really helps me!

          Comment

          Working...
          X