  • bar graph by category and then by total of that category

    Hi everyone,
    I would like to create a bar graph in which I plot a variable (median_age) by region and time, and also for all regions combined. So far I have only managed to graph by region and time, but I would like a final (new) set of bars for all the regions combined. Any help is highly appreciated. I couldn't find this issue addressed anywhere else, even after a thorough search.

    Data and code to replicate my graph
    -----------------------------------------------------
    webuse emd, clear
    gen region = "Region 4" if fips > 400000
    replace region = "Region 3" if fips > 300000 & region == ""
    replace region = "Region 2" if fips > 200000 & region ==""
    replace region = "Region 1" if region ==""

    gen time = runiformint(0, 1)

    label define time 1 "Time1" 0 "Time0", replace
    label values time time

    graph bar median_age, over(time) over(region)
    Last edited by Frank Odhiambo; 01 Aug 2022, 06:17.

  • #2
    https://www.stata-journal.com/articl...article=gr0058 addresses this kind of problem.


    • #3
      Thanks Nick, that was very helpful. The downside is that my data already had 12 million observations, so expanding slowed my code down considerably. But it still worked, thanks. For anyone else who finds this thread, the solution based on Nick's recommendation is below:

      Using this data
      ------------------------------------
      webuse emd, clear
      gen region = 1 if fips > 400000
      replace region = 2 if fips > 300000 & region == .
      replace region = 3 if fips > 200000 & region ==.
      replace region = 4 if region ==.

      gen time = runiformint(0, 1)

      label define time 1 "Time1" 0 "Time0", replace
      label values time time
      ------------------------------------

      I expanded the data, generated a new variable "new_region", and then used that to plot the graph (as below):


      expand 2 // the copies are appended after the original observations
      // the data already has 4 regions, so the copies become a fifth category: all regions combined
      generate new_region = cond(_n <= _N/2, region, 5)
      graph bar median_age, over(time) over(new_region)
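
      Since expanding doubles the data, a possible alternative for large datasets is to collapse to the group means first and plot those. This is only a sketch, not the method from the article Nick linked: it assumes the data set-up above is in memory (numeric region coded 1 to 4, plus time), and the tempfile name "allregions" and the "All regions" label are my own illustrative choices. graph bar plots means by default, so the bars should match the expand approach.

```stata
* Sketch: assumes the data set-up above (median_age, numeric region 1-4, time) is in memory
tempfile allregions          // hypothetical name for the "all regions" means

* Step 1: mean of median_age by time, pooled over all regions
preserve
collapse (mean) median_age, by(time)
generate region = 5          // fifth category, after the existing regions 1-4
save `allregions'
restore

* Step 2: mean by region and time, then append the pooled means
preserve
collapse (mean) median_age, by(region time)
append using `allregions'
label define region 5 "All regions"   // illustrative label; regions 1-4 show as numbers
label values region region

* (asis) plots the stored means directly: one observation per bar
graph bar (asis) median_age, over(time) over(region)
restore
```

      The two collapse calls each work on a handful of observations after the first pass, so nothing is duplicated, and the final restore leaves the original data untouched.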
      Last edited by Frank Odhiambo; 01 Aug 2022, 08:39.
