
  • How to stratify results in a forest plot by two variables?

    Dear Statalist,

    I am trying to deal with the following problem:

    I want to plot several regression coefficients, estimated across quite a few outcomes, in a single figure. For each outcome I have 5 sub-populations, and within each sub-population 3 groups (so 2 coefficients to plot, as the third group is the reference). Therefore, I need to plot two coefficients for each sub-population. I am using the metan package because it lets me plot coefficients and CIs that have already been calculated (which is my case). To my knowledge, other programs (e.g. ipdmetan) require the actual dataset to work properly.


    At the moment I have used the following command:



    metan coef lci uci, wgt(weight) nooverall null(0) ///
    nobox lcols(sub_population group) tests(190) force ///
    by(outcome) xlabel (-1.5,0,1)


    coef = regression coefficient
    lci uci = CIs
    weight = weight (equal to 1, as it is not a meta-analysis)

    The code works just fine but it has two main problems:

    - The outcome labels are displayed in the right position but not in bold.
    - Using lcols, I have tried to work around the issue by displaying two columns: the first one on the left shows the sub-population, and the second one shows the group. This is clear but not aesthetically pleasing, as it produces something like this:

    Blood pressure
    Subpopulation 1 || Group1 (Group 3 as ref)
    Subpopulation 1 || Group 2 (Group 3 as ref)
    Subpopulation 2 || Group 1 (Group 3 as ref)
    Subpopulation 2 || Group 2 (Group 3 as ref)

    while I would like something like this:

    Blood pressure

    Subpopulation 1 (Group 3 as ref)
    Group 1
    Group 2

    Subpopulation 2 (Group 3 as ref)
    Group 1
    Group 2


    Is there a way to overcome this issue using metan? To my understanding, the 'by' option supports only one variable. Alternatively, can I use a different package? It would need to let me plot coefficients that have already been calculated.


    Thanks in advance.

  • #2
    Anybody who can help? Thanks in advance!



    • #3
      I don't quite follow what you have been trying to accomplish. However, here is my opinion:

      1. If you are trying to present subgroups, you only need to reorganize your data accordingly and use the option label(namevar=variable).
      2. Copy and paste the code below in order to see if it addresses your problem.


      */ --------------- start--------------
      clear
      input coef lci uci str20 group str20 subpopulation
      .1266349 -.3146492 .5679189 Group1 Sub1
      -.3509529 -1.023146 .3212402 Group2 Sub1
      .08614 -.2777374 .4500174 Group1 Sub2
      -.0951817 -.358211 .1678475 Group2 Sub2
      -.0707719 -.4005753 .2590315 Group1 Sub3
      -.1216107 -.3963165 .1530952 Group2 Sub3
      .1412438 -.3248311 .6073186 Group1 Sub4
      -.260193 -.5850677 .0646817 Group2 Sub4
      -.1227866 -.5573016 .3117283 Group1 Sub5
      -.0232415 -.2844573 .2379743 Group2 Sub5
      end
      metan coef lci uci , label(namevar=group) by(subpopulation) nooverall classic nosubgroup nowt xlabel(-2,-1,1,2) astext(60)
      */----------------end-------------------


      All the best,

      Tiago



      • #4
        Your query may have remained unanswered for some time mainly because it lacks the information requested in the FAQ.

        Taking the data presented in #3 , let's try to help you.

        By the way, I got an error message ("type mismath", r(109)) after typing Tiago's command.

        Below, my suggestion:

        Code:
        label define refsubpop 1 "Sub3" 2 "Sub1" 3 "Sub2" 4 "Sub4" 5 "Sub5"
        * label() makes encode use the codes defined above, so Sub3 is coded 1
        encode subpopulation, gen(subpop) label(refsubpop)
        codebook subpop
        metan coef lci uci, lcols(subpop) by(group)

        [Image: image_6894.png]




        Hopefully that helps!
        Last edited by Marcos Almeida; 19 Feb 2017, 04:20.
        Best regards,

        Marcos



        • #5
          */ --------------- start--------------
          clear
          input coef lci uci str20 group str20 subpopulation
          .1266349 -.3146492 .5679189 Group1 Sub1
          -.3509529 -1.023146 .3212402 Group2 Sub1
          .08614 -.2777374 .4500174 Group1 Sub2
          -.0951817 -.358211 .1678475 Group2 Sub2
          -.0707719 -.4005753 .2590315 Group1 Sub3
          -.1216107 -.3963165 .1530952 Group2 Sub3
          .1412438 -.3248311 .6073186 Group1 Sub4
          -.260193 -.5850677 .0646817 Group2 Sub4
          -.1227866 -.5573016 .3117283 Group1 Sub5
          -.0232415 -.2844573 .2379743 Group2 Sub5
          end
          metan coef lci uci , label(namevar=group) by(subpopulation) nooverall classic nograph notable
          metan coef lci uci , label(namevar=group) by(subpopulation) nooverall classic nosubgroup nowt xlabel(-2,-1,1,2) astext(60)
          */----------------end-------------------

          should work. I have no idea why.



          • #6
            Two further comments:

            In #4, I meant to write "type mismatch" (instead of "mismath").

            In future posts, I kindly recommend providing commands and data either within CODE delimiters or by using dataex (from SSC).
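A minimal sketch of the dataex route, using two rows of the example data from #3. It assumes dataex either is already available in your Stata installation or can be installed from SSC; the output is an input block that can be pasted between CODE delimiters.

```stata
* Install dataex from SSC only if Stata cannot find it
capture which dataex
if _rc {
    ssc install dataex
}

* Two rows of the example data from #3
clear
input coef lci uci str20 group str20 subpopulation
.1266349 -.3146492 .5679189 Group1 Sub1
-.3509529 -1.023146 .3212402 Group2 Sub1
end

* Print a copy-and-paste-ready data example for the listed variables
dataex coef lci uci group subpopulation
```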

            Thanks.
            Best regards,

            Marcos



            • #7
              Thanks both. The suggested example works perfectly. I was wondering whether it would be possible to have a further level of stratification (e.g. Group, then Subgroup1 and Subgroup2).
              Last edited by Eduardo Torre; 20 Feb 2017, 10:35.



              • #8
                Hello Eduardo ,

                Unfortunately, I do not see the point of having so many strata with just a few studies.

                That said, yes, you may fiddle with the commands shared in #4, plus the use of the if clause, for example.
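A minimal sketch of the if-clause idea, assuming metan is installed from SSC and using four rows of the example data from #3. The loop and the local macro names (groups, g) are illustrative only. It draws one forest plot per level of group, with the subpopulation shown in a left-hand column.

```stata
clear
input coef lci uci str20 group str20 subpopulation
.1266349 -.3146492 .5679189 Group1 Sub1
-.3509529 -1.023146 .3212402 Group2 Sub1
.08614 -.2777374 .4500174 Group1 Sub2
-.0951817 -.358211 .1678475 Group2 Sub2
end

* Collect the distinct values of group, then plot each one separately
levelsof group, local(groups)
foreach g of local groups {
    display as text "Forest plot for `g'"
    metan coef lci uci if group == "`g'", lcols(subpopulation) nooverall
}
```

Each pass through the loop replaces the previous graph, so save or export the graph inside the loop if you need to keep every plot.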

                The user-written metan is really excellent and I recommend that you take a look at its help files.

                Surely you will find inspiring examples there!



                Best regards,

                Marcos



                • #9
                  Dear Eduardo,

                  Apologies for the late response.

                  The above advice is, of course, excellent. However, if you already have coefficients and confidence intervals and specifically want extra details such as a third subgroup level and bolded headings, may I suggest using forestplot (part of the ipdmetan package). forestplot doesn't perform any analyses, but simply plots the data in memory (including, crucially, line breaks, spaces, formats, etc.) as a forest plot.

                  Let's start by generating a test dataset, based on the data you provide:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input str14 outcome str8 subpopulation str13 group float(coef lci uci) byte weight
                  "Blood pressure" "Subpop 1" "Group 1"  -.5 -.75 -.25 1
                  "Blood pressure" "Subpop 1" "Group 2" -.25   -1   .5 1
                  "Blood pressure" "Subpop 1" "Group 3"    0    0    0 1
                  "Blood pressure" "Subpop 2" "Group 1"  -.5 -.75 -.25 1
                  "Blood pressure" "Subpop 2" "Group 2" -.25   -1   .5 1
                  "Blood pressure" "Subpop 2" "Group 3"    0    0    0 1
                  "Second outcome" "Subpop 1" "Group 1"  -.5 -.75 -.25 1
                  "Second outcome" "Subpop 1" "Group 2" -.25   -1   .5 1
                  "Second outcome" "Subpop 1" "Group 3"    0    0    0 1
                  "Second outcome" "Subpop 2" "Group 1"  -.5 -.75 -.25 1
                  "Second outcome" "Subpop 2" "Group 2" -.25   -1   .5 1
                  "Second outcome" "Subpop 2" "Group 3"    0    0    0 1
                  end
                  Now, we're basically going to manually generate the spacing and labelling usually performed by metan, but to our own specifications.

                  Let's start with the headings and groupings:

                  Code:
                  gen int obs = _n
                  gen byte expand = 2*(outcome!=outcome[_n-1])
                  expand expand
                  bysort obs : gen byte _USE = cond(expand, _n>1, 1)
                  drop expand
                  
                  replace subpop = "" if _USE==0
                  replace group = "" if _USE==0
                  replace coef = . if _USE==0
                  replace lci = . if _USE==0
                  replace uci = . if _USE==0
                  replace weight = . if _USE==0
                  
                  // Bold-face heading using SMCL; see -help smcl- for details
                  gen labels = `"{bf:"'+outcome+`"}"' if _USE==0
                  replace labels = subpop if subpop!=subpop[_n-1] & missing(labels)
                  label var labels "Outcome and subpopulation"
                  label var group "Group"
                  At this point, we can produce a basic forest plot:

                  Code:
                  forestplot coef lci uci, lcols(labels group) nowt
                  Now let's enhance it further:

                  Code:
                  // line breaks between subpops
                  gen byte expand = 2*(_USE==1 & group=="Group 3")
                  expand expand
                  bysort obs (_USE) : replace _USE=0 if expand==2 &_n>1
                  replace group = "" if _USE==0
                  drop expand
                  
                  // replace "effect size" text for reference category
                  gen effect = string(coef, "%5.2f") + " (" + string(lci, "%5.2f") + ", " + string(uci, "%5.2f") + ")" if _USE==1 & group!="Group 3"
                  replace effect = "(reference)" if _USE==1 & group=="Group 3"
                  label var effect "Effect (95% CI)"
                  
                  // left-justify left-hand columns ("describe" first to see the current display format; then simply negate the width value)
                  describe labels group subpop
                  format labels %-19s
                  format group %-13s
                  format subpop %-9s
                  Now we have a pretty good result:

                  Code:
                  forestplot coef lci uci, lcols(labels group) rcols(effect) nowt nostats

                  If you're not confident with Stata code, most of this manipulation could be done in Excel and copied/pasted into Stata. After running the code fragments above, try viewing the result in Stata and looking at the data structure. The crucial elements are:

                  - The data itself (effect size and confidence limits). forestplot will plot this, and will also automatically create a right-hand column to display the numbers as formatted text. However, in our second ("enhanced") plot, we over-rode this in order to display the reference category correctly (I plan to make this easier in a future version of forestplot). We manually generated our right-hand column and specified it in the rcols() option.

                  - Labels: We've taken the "outcome" (with bold formatting) and "subpopulation" to create our first left-hand column, "labels"; we then requested that "group" be an additional left-hand column using the lcols() option.

                  - Spaces and "_USE": the data will be plotted in the row order you provide, honouring any empty (in terms of effect-size data) rows. forestplot automatically looks for a variable named _USE which tells it what sort of data is in each row. Here, it's pretty simple: either data (_USE==1) or empty rows (_USE==0); but it can be more complicated (e.g. diamonds for pooled effects).

                  - Left-justification: Stata automatically right-justifies all its data, including strings. Therefore, we need to left-justify our left-hand columns before plotting. Unfortunately, there is no easy way of just saying "left-justify my data" (as far as I'm aware); you have to use Stata's format command, which is highly specific. Example code is given above. This is one step that cannot be done in Excel, as Stata will not honour the Excel justification when copying/pasting (as far as I'm aware).
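The left-justification step can be sketched without looking up each width by hand: read the current display format of each string variable and insert a minus sign after the percent sign. The two-row dataset below is made up for illustration; only the variable names and widths mirror the example above.

```stata
* Toy data with the same string widths as the example above
clear
input str19 labels str13 group
"Subpop 1" "Group 1"
"Subpop 1" "Group 2"
end

* Turn e.g. %19s into %-19s for each listed string variable
foreach v of varlist labels group {
    local fmt : format `v'
    if substr("`fmt'", 2, 1) != "-" {
        local newfmt = "%-" + substr("`fmt'", 2, .)
        format `v' `newfmt'
    }
}

describe labels group
```

Variables that are already left-justified are skipped, so the loop can be re-run safely.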


                  I hope this is useful; please let me know if you have any questions.

                  Thanks,

                  David.



                  • #10
                    Dear Statalisters,

                    I would like to revisit this topic as I have a similar query to the original one by Eduardo Torre. For what it's worth, I understand this thread is two years old now, but the ipdover command is now available for showing sub-group regression results in forest plots.

                    My query, however, relates to using metan to perform meta-analyses and plot them on a single plot across two strata or 'layers'. For example, I am comparing outcomes in trials that diagnose disease using different methodologies (the first 'layer' or stratum), and then within each group of studies that use the same methodology there are different comparators to the trial drug of interest (the second 'layer' or stratum).

                    Using the useful plot by Marcos Almeida, is it possible to get metan to plot what you have coded, except that each subgroup would represent a meta-analysis in itself?

                    Very crude, but I have attached a sketch to help trigger what I have in mind.

                    Many thanks,
                    Alexander
                    (Stata v14.2 IC for Mac)



                    • #11
                      Dear Alexander,

                      Leaving aside the text indentation, this looks similar to the original example except that you have four subgroups (albeit in sets of two) rather than two. Is that correct? If so, the only difficulty would seem to be your wish to have the subgroup diamonds above, rather than below, the individual trials.

                      At first glance, this looks do-able with forestplot.

                      Thanks,

                      David.



                      • #12
                        Hi David Fisher
                        Apologies for the late response. Not strictly four subgroups, as you say, two sets of two subgroups. You mention that it is doable with forestplot. Are you able to provide a brief worked example to show this? I suspect my issue could be with the way I have coded the data. At the moment, I have two separate forest plots for Strata 1 and Strata 2 in my sketch above.

                        Code:
                        metan event_treat no_event_treat event_comparator no_event_comparator if strata==1, rr by(sub_strata) label(namevar=study) counts nooverall
                        and then

                        Code:
                        metan event_treat no_event_treat event_comparator no_event_comparator if strata==2, rr by(sub_strata) label(namevar=study) counts nooverall
                        I guess I wished metan had a way to incorporate two by options in the plot. In my example strata represents methodology and sub_strata represents drug comparator (placebo or active).
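One possible workaround, sketched under assumptions: combine the two stratification variables into a single by-variable with egen's group() function, so that metan sees one subgroup per strata/sub_strata combination. The trial names, counts, and value labels below are invented purely for illustration, and the variable name combined is hypothetical. This gives four flat subgroups rather than a nested layout, and it assumes metan is installed from SSC.

```stata
* Hypothetical data: 8 made-up trials, 2 strata x 2 sub-strata
clear
input str8 study byte(strata sub_strata) int(event_treat no_event_treat event_comparator no_event_comparator)
"Trial A" 1 1 12 88 20 80
"Trial B" 1 1 15 85 22 78
"Trial C" 1 2 10 90 12 88
"Trial D" 1 2  8 92 11 89
"Trial E" 2 1 30 70 41 59
"Trial F" 2 1 25 75 35 65
"Trial G" 2 2 18 82 20 80
"Trial H" 2 2 21 79 19 81
end

* Illustrative value labels for the two layers
label define strata_lbl 1 "Method 1" 2 "Method 2"
label define sub_lbl 1 "Placebo" 2 "Active"
label values strata strata_lbl
label values sub_strata sub_lbl

* One level per strata/sub_strata combination, labelled from both variables
egen combined = group(strata sub_strata), label

* Same metan call as above, but with the combined by-variable and no if clause
metan event_treat no_event_treat event_comparator no_event_comparator, ///
    rr by(combined) label(namevar=study) counts nooverall
```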
                        Many thanks,
                        Alexander
                        (Stata v14.2 IC for Mac)

