  • Simple question about box plot

    Dear Statalist,
    I am embarrassed to ask such a simple question, but I don't know how to phrase it briefly enough to search for it. I want to draw a box plot of a categorical variable for one of my two groups of subjects, so I use -if- to restrict the sample. The problem is that the subjects in the other group all have a value that does not exist in the plotted group: the values for the plotted group are 1, 2, 3, ..., but the other group is coded 88. As a result, 88 always shows up in the bar graph, even though it has no observations under the -if- condition.
    How can I omit it when plotting the graph for only one group? Thank you!
    Yue

  • #2
    There should not be a problem doing something like


    Code:
    graph box x y if x < 3
    as that places no constraints on the values of y shown.


    • #3
      In reply to Nick Cox (#2):
      Thank you, Nick. I'm sorry that I mistakenly typed "box plot" here (I sometimes mix the names up). It's actually a bar plot, and I plot it based on percentages.

      My code is as follows:

      graph bar (percent) if var1==1, over(catvar2) by(var3)

      var1 and var3 are dichotomous variables.

      Thank you, and sorry for asking the question again.


      • #4
        I thought I understood the question but now I don't think I do. I think you may need to give a data example that makes your question concrete.


        • #5
          Hi, Nick,
          Sorry for my unclear and long description. Here is an example dataset:
          Variable: cancer stage (categorical, with three categories: stages 1, 2, and 3)
          Group: 1 "case", 2 "control" (with sex as a subgroup: 1 "female", 2 "male")
          So the data would look like this:
          id group sex can_stage
          001 1 1 3
          002 2 1 1
          003 1 1 1
          004 2 2 3
          005 1 2 2
          006 2 1 1
          007 1 2 1
          008 2 2 3

          What I would like to do is draw a bar plot of the percentage distribution of cancer stage for the case group only, by sex (male, female). The control group has no cancer stage information, so when entering the data we gave control subjects the code 88, meaning "not applicable".

          Code:
          graph bar (percent) if group==1 & can_stage<88, over(can_stage) by(sex)

          However, because the value 88 exists alongside 1, 2, and 3 in can_stage, the graph also displays the percentage for 88, which I don't need and which will always be 0% for the case group.

          So I'm wondering whether there is a way to avoid displaying the category 88. Thank you.
          Last edited by Yue YY; 30 Oct 2018, 09:18.


          • #6
            I think I made a mistake in the sample data I gave in #5. The value of can_stage for the control group should be 88.
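
            For reference, the corrected example could be entered as below. This is only a sketch: the values are those listed in #5, with can_stage set to 88 for every control (group 2) as the correction describes.

            ```stata
            * Sketch of the corrected example data: controls (group 2) carry the code 88
            clear
            input str3 id group sex can_stage
            "001" 1 1 3
            "002" 2 1 88
            "003" 1 1 1
            "004" 2 2 88
            "005" 1 2 2
            "006" 2 1 88
            "007" 1 2 1
            "008" 2 2 88
            end

            * Check which stage codes occur in each group
            tabulate can_stage group
            ```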


            • #7
              Your example data aside, your code excludes values of 88 so I don't think I understand at all. However, replacing 88 by missing may solve your problem.

              Code:
              help mvdecode 
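
              One way that suggestion might look, assuming the variable names from #5 and that 88 is used only as the "inapplicable" code in can_stage:

              ```stata
              * Assumption: 88 in can_stage means "inapplicable" and nothing else
              mvdecode can_stage, mv(88)

              * over() leaves out missing categories unless the missing option is given
              graph bar (percent) if group == 1, over(can_stage) by(sex)
              ```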


              • #8
                Thank you very much, Nick. And yes, I found the mistake in my example data, but it is too late to correct it in the post. I'm sorry.
                In short, everyone in group 2 has the value 88, and everyone in group 1 has the value 1, 2, or 3. I just want to plot the frequency bars for group 1, but the value 88 is still marked on the x-axis.
                [Attached image: 1.jpg]

                -mvdecode- is an option for me. In this case, the variable has no missing values in either group. I'm just thinking that for other variables that already contain genuine missing values, using -mvdecode- may be somewhat inappropriate.
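
                If the concern is mixing the "inapplicable" code with genuine missing values, Stata's extended missing values (.a to .z) can keep the two apart. A sketch, again assuming the variable names from #5:

                ```stata
                * Recode 88 to the extended missing value .a rather than to system missing (.)
                mvdecode can_stage, mv(88 = .a)

                * Tabulate ordinary values together with each kind of missing value
                tabulate can_stage, missing

                * The recoding can be reversed later if the numeric code is needed again
                * mvencode can_stage, mv(.a = 88)
                ```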
                Last edited by Yue YY; 30 Oct 2018, 14:14.


                • #9
                  I'm not at my computer and can't test any code, but I think you can add the -nofill- option to your command to get what you want. Also, a suggestion: please post the code you actually use so that people here can check it.
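
                  The suggestion might be written as follows. It is untested here, and the variable names are taken from #5:

                  ```stata
                  * nofill asks graph bar to omit missing subcategories (empty bars)
                  graph bar (percent) if group == 1, over(can_stage) by(sex) nofill
                  ```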


                  • #10
                    In reply to Chen Samulsion (#9):
                    Thank you, Chen! I think that is what I want.
                    And I'm sorry about not posting the code. I will do so next time.
