  • Trouble with Stata outputs: graphing multiple catvars in a stacked bar format in Stata, help with using floatplot


    I’m not a data scientist so my jargon is going to be really off here, but let me try to break this down. I’m using Stata to encode qualitative data for research. I have 5 categorical variables whodunit_1, whodunit_2, whodunit_3, whodunit_4, whodunit_5 that are all encoded on the same 4-point ordinal scale (0, 1, 2, 3, plus missing). I want to stack them as percent shares of their respective totals (similar to pie charts, but as a stacked floatplot) but can’t get all the variables onto the same graph. This is my initial code, which works fine for one catvar, but how can I add all of them to the same graph? My code right now is
    1. graph bar (percent), over(whodunit_1) stack asyvars
    To add further categorical variables to this code I tried:
    2. graph bar (percent), over(whodunit_1) over(whodunit_2) stack asyvars
    but this just added them on top of one another rather than separating them as catvar1 and catvar2. So I was wondering if anyone here knew how to get them in separate bars. Each bar should represent one distinct categorical variable. I can’t share a lot of information as the data is sensitive. -- I also thought that a floatplot would be helpful, so I installed it using "ssc install floatplot" and ran the code
    floatplot whodunit_1, center(3) fcolors(red*0.6 red*0.2 blue*0.2 blue*0.6 blue) lcolors(red red blue blue blue)
    but I wanted to make it look similar to how Nick Cox has written about it in the past. I want it to look something like this. Thanks in advance!

  • #2
    Compare your posting at https://www.reddit.com/r/stata/comme...tiple_stacked/ which led to the suggestion by @random_stata_user there to use floatplot.

    Please note also our policy on cross-posting which is that you tell us about it. https://www.statalist.org/forums/help#crossposting

    Please note also, as explained alongside, the remedy for sensitive data, which is to provide fake data instead. https://www.statalist.org/forums/help#stata

    If you want to apply floatplot then using graph bar is moving in the wrong direction.

    The help for floatplot has an example aimed fortuitously but fortunately right in your direction with the advice

    To compare several variables, reshape long and apply floatplot, over():
    Here is a fake example aimed at your sketch. Missing values will just be ignored, and so are immaterial, unless missing really means higher than 3 or lower than 0, in which case you should recode missing values to an integer. I am guessing that missing means just missing.

    Code:
    * fake data 
    clear 
    set obs 100 
    set seed 2803 
    
    
    forval j = 1/5 { 
        gen whodunit_`j' = runiformint(0, 3)
        replace whodunit_`j' = max(0, whodunit_`j' - 2) if runiform() < (`j' * 0.1) 
    }
    
    * you start here 
    
    preserve 
    gen id = _n 
    reshape long whodunit_, i(id) j(which)
    set scheme s1color 
    
    floatplot whodunit_, highneg(1) over(which) fcolors(red*0.6 red*0.2 blue*0.2 blue*0.6) vertical ytitle(needs a sensible title)
    [Image: floatplot.png]
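
    If missing did carry meaning, the recoding mentioned above might look something like the following. This is only a sketch on fake data: the code 4 is a hypothetical choice standing for "higher than 3", and a lower-than-0 meaning would call for a negative integer instead.

    ```stata
    * sketch only: fake data; 4 is a hypothetical integer code for missing
    clear
    set obs 6

    forval j = 1/5 {
        * values 0, 1, 2, 3 followed by two missings
        gen whodunit_`j' = cond(_n <= 4, _n - 1, .)
        * give missing its own integer category so that it is not ignored
        replace whodunit_`j' = 4 if missing(whodunit_`j')
    }

    tabulate whodunit_1
    ```

    With a fifth category in play, options such as fcolors() would presumably need a fifth colour too.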



    • #3
      Hi Nick,

      Thanks for this. It really helps. I am pretty new to Stata and wasn't aware of posting rules in either forum.

      When starting out with Stata it is hard to understand a lot of this code. Do you have a recommendation for where I can go in the future to read more about code and building graphs like yours? I have tried commands such as help bar, which naturally only gets me so far. For example, where would I find information about reshape long and how it can be useful in the future? Any advice you can provide would be more than appreciated.

      The code you made with fake data is brilliant, thank you.



      • #4
        Max Hammond It is good that you found #2 helpful.

        It is evident that Reddit's r/stata has four rules on the home page, as it were. It does use that word rules.

        I know much more about Statalist, where at https://www.statalist.org/ we say



        Please do read the Statalist FAQ for crucial advice before you try to post a message to Statalist. Knowledge of the FAQ will greatly improve the chance your question will be answered as you wish.
        and where when you open https://www.statalist.org/forums/new-content/51 you get a reminder and a link

        First read Advice on Posting.
        so both places post advice up front where it should be visible to anyone, new or not. As the present coordinator of the Statalist FAQ I try to avoid the word "rules"; the stance is that we give advice based on nearly 30 years of collective experience and make requests based on what tends to work best. If anyone thinks that the FAQ Advice is too long, I agree: it's too long for you right now because parts don't apply right now, but we don't have a way to know which parts.

        On the more interesting and more important question of how to learn Stata for what you need to do, the answer is just to ask a specific question and then if it's good enough for anyone to answer one thing leads to another. In your case, seeing the example graph you posted was enough for @random_stata_user on Reddit to think floatplot, and clearly I agree. By the way, describing the graph as a stacked bar chart is not quite specific enough and indeed would suggest to most experienced users graph bar, which I don't think is the answer, although as it turns out there is a command slideplot -- on SSC since 2003 -- based on graph bar. As I wrote slideplot and floatplot I am allowed to give the opinion that floatplot generally works better -- and it is based on graph twoway but no-one new to Stata can be expected to know that immediately, or even quickly.

        Nor should it be obvious to any new user that the data layout you started with is not quite right for what you want. But once someone's mentioned reshape the sequence to follow is (1) help reshape (2) read the linked manual entry (3) search reshape for more resources, bailing out as soon as you find out what you want or coming back here with a question. reshape is widely considered confusing and complicated, except by people who read the documentation and use it enough to find it familiar. So, what else is new?
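
        To make the change of layout concrete, here is a toy sketch with made-up values (three variables, two observations). Nothing here comes from the real data; it only shows what reshape long does to a wide layout like the one in #1.

        ```stata
        * toy example: wide layout, one column per whodunit_ variable
        clear
        input id whodunit_1 whodunit_2 whodunit_3
        1 0 1 3
        2 2 2 0
        end
        list

        * long layout: one row per id and variable; which records the suffix
        reshape long whodunit_, i(id) j(which)
        list, sepby(id)
        ```

        After the reshape there is a single variable whodunit_ holding all the values, and which identifies the original variable. That is the structure that over(which) relies on in #2.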

        In short, you asked a fairly esoteric question, but the answer is more or less three lines of code, or four if I add the belated suggestion that


        Code:
        restore
        gets you back to the original data.
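
        A small round trip, again on fake data, shows the effect. This is a sketch, not part of the code in #2.

        ```stata
        * fake data in wide layout: 3 observations
        clear
        set obs 3
        gen id = _n
        gen whodunit_1 = _n - 1
        gen whodunit_2 = 3 - _n

        preserve
        reshape long whodunit_, i(id) j(which)
        display _N    // 6 observations while the data are long
        restore
        display _N    // 3 observations again, wide layout back
        ```

        Inside a do-file, Stata also restores preserved data automatically when the do-file finishes, so an explicit restore matters most when you want the original data back before that point or are working interactively.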


