
  • Multiple graph matrix


    Dear Stata experts,

    I came across the graph below (from a PubMed article) and am wondering whether there is any way to produce it in Stata. I know the "graph matrix" command can do the scatterplot part, but I'm struggling to add the linear fit line and kernel density estimate to the same graph. I would really appreciate your guidance!
    [Attached image: 1.png]



  • #2
    Perhaps this was done by 100 separate graph commands, with the saved graphs then combined.
    Code:
    help graph combine



    • #3
      Thank you, William! Using the "combineplot" command, I was able to produce the graph below with scatterplots and linear fit lines, which is quite close to the one I'm looking for, but I still can't make the kernel density estimates appear in the same graph.

      I have seen similar graphs in multiple articles, so I hope there is a way to make one directly rather than by combining separate graphs.
      [Attached image: 1.png]



      • #4
        I am way over my head here - treat what follows as an educated guess.

        I assume the graphs on the diagonal in post #1 are the kernel density graphs you are striving for.

        I am guessing your code is something like
        Code:
        foreach v1 of varlist var1-var8 {
            foreach v2 of varlist var1-var8 {
                twoway scatter `v1' `v2', ...
            }
        }
        and if so, then perhaps what you want is
        Code:
        foreach v1 of varlist var1-var8 {
            foreach v2 of varlist var1-var8 {
                if "`v1'"=="`v2'" {
                    twoway kdensity `v1', ...
                }
                else {
                    twoway scatter `v1' `v2', ...
                }
            }
        }
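
For what it's worth, here is a minimal runnable sketch of that idea, to be run from a do-file. It is a guess at a workable approach, not the method used for the graph in post #1. It assumes the shipped auto dataset and three of its variables as stand-ins for var1-var8; the graph names (g_...) and the styling options are illustrative choices.

```stata
sysuse auto, clear
local vars price mpg weight      // stand-ins for the real variable list
local graphs                     // will collect the graph names, row by row

foreach v1 of local vars {
    foreach v2 of local vars {
        if "`v1'" == "`v2'" {
            // diagonal cell: kernel density estimate of the variable
            twoway kdensity `v1', name(g_`v1'_`v2', replace) nodraw ///
                xtitle("") ytitle("") title("`v1'", size(medium))
        }
        else {
            // off-diagonal cell: scatter plot with a linear fit overlaid
            twoway (scatter `v1' `v2', msize(small)) (lfit `v1' `v2'), ///
                name(g_`v1'_`v2', replace) nodraw legend(off) ///
                xtitle("") ytitle("")
        }
        local graphs `graphs' g_`v1'_`v2'
    }
}

// graph combine fills the grid row by row, so cols() must equal
// the number of variables in the list
graph combine `graphs', cols(3)
```

Each cell is held in memory under its own name and drawn only once, by graph combine. With many variables the individual cells get small, so it may be worth suppressing axis labels as well as titles.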



        • #5
          It's a detail but in the original, I note, or rather infer, that wind direction is one of the variables included. But smoothing directions -- or with respect to directions -- requires realising that 0 deg and 360 deg are one and the same (presumably, North (*)) and that 10 deg (say) is just 20 deg away from 350 deg, and so forth. So smoothings -- whether kernel density estimation or some kind of scatter plot smoothing -- should wrap around at North.
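
One way to get such wrap-around with the official kdensity command, sketched under stated assumptions: stack copies of the data shifted by -360 and +360 degrees, estimate the density of the stacked variable on a grid over 0 to 360 only, and multiply by 3 to restore unit area. The simulated directions, the variable names, and the bandwidth of 20 degrees are all hypothetical choices for illustration.

```stata
clear
set seed 12345
set obs 500
// hypothetical wind directions in degrees, concentrated around North
generate double dir = mod(rnormal(0, 40), 360)

// three copies per observation, shifted by -360, 0 and +360 degrees,
// so that the kernel sees neighbours on both sides of North
generate long id = _n
expand 3
bysort id: generate double dir3 = dir + 360 * (_n - 2)

// evaluation grid covering one full circle only
generate double grid = (_n - 1) in 1/361

// bwidth(20) is an arbitrary illustrative choice; the default bandwidth
// would be based on the inflated spread of the stacked variable
kdensity dir3, at(grid) generate(dens3) bwidth(20) nograph

// the stacked sample carries three times the mass, so rescale
generate double dens = 3 * dens3

line dens grid, xlabel(0(90)360) xtitle("Direction (degrees)") ytitle("Density")
```

The estimate at 0 should then match the estimate at 360, which is the point of the exercise. The trick relies on the kernel's reach being much smaller than 360 degrees.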

          I would be really, really impressed if the authors' software allowed a switch to treat circular variables differently.

          Circular variables can be thought of as including compass direction, clock (time of day) and calendar (time of year) as leading special cases. The co-occurrence of "c" words here is a curious coincidence, but one I find personally compelling and congenial.

          A meme in graphic discussions is that Aha! I see interesting details in that graph! is always preferable to Wow! How did you do that? I add that Huh? What I am supposed to think about this? is even less desirable.

          The obvious but crucial trade-off here is between presenting a great deal of information (good) and presenting so much detail that the reader is not inclined to, or not able to, start trying to work out what needs to be thought about (not so good).

          (*) Privileging North as a reference direction is, naturally, hemispherism. Be aware that unthinking hemispherism can appear obnoxious or at least insensitive to other-directed people.



          • #6
            Originally posted by Nick Cox
            I add that Huh? What I am supposed to think about this? is even less desirable.
            In particular, that was my reaction to the kernel density estimate of "week" in the matrix in post #1. Because the underlying data presumably have a fixed number of observations per week, the density should be a flat line, but instead we see a spurious "tailing off" at the boundaries.

            Now, I tried to demonstrate this, which the following code does.
            Code:
            set obs 100
            generate t = 1000+_n
            expand 10
            kdensity t
            I believed that
            Code:
            kdensity t, boundary
            would solve the problem, but it throws an error message that option boundary is not allowed. Huh? I really don't have enough experience with smoothing in Stata to carry this exploration farther.

            I'll add that a close look at the kernel density of rain in post #1 seems to show the same problem - an artificial lowering of the density in the neighborhood of the lower boundary at 0.

            Added in edit:
            Code:
            twoway kdensity t, boundary
            did not throw the error message, but also did not solve the "tailing off" problem.
            Last edited by William Lisowski; 19 May 2020, 07:35.



            • #7
              I had not noticed the boundary option before but I can't see that it could help here.

              Bounded responses need special treatment so that probability mass is not smoothed away.

              Whether week in #1 is week of year or a weekly date, the distribution is likely to be, in principle, approximately uniform, but we see in #1 artefacts of the smoothing method.

              My guess is that the graph in #1 was produced in R but the kernel density estimation routines in Stata also lack options to implement standard suggestions such as those detailed in 2004 within https://www.stata-journal.com/articl...article=gr0003 (pp.76-78).
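
As a rough illustration of keeping probability mass inside known bounds, here is a sketch of reflection at the boundaries applied to the example in #6. Reflection is one widely used correction and is not necessarily the approach described in the article cited above. The bounds of 1000.5 and 1100.5, the bandwidth of 5, and the variable names are assumptions made for illustration.

```stata
clear
set obs 100
generate t = 1000 + _n
expand 10

// bounds of the support, treated here as known (an assumption)
local lo = 1000.5
local hi = 1100.5

// three copies per observation: reflected about the lower bound,
// unchanged, and reflected about the upper bound
generate long id = _n
expand 3
bysort id: generate double tr = ///
    cond(_n == 1, 2*`lo' - t, cond(_n == 2, t, 2*`hi' - t))

// evaluation grid restricted to the support
generate double grid = `lo' + (_n - 1) * (`hi' - `lo') / 200 in 1/201

// bwidth(5) is an arbitrary illustrative choice
kdensity tr, at(grid) generate(dens3) bwidth(5) nograph

// the augmented sample carries three times the mass, so rescale
generate double dens = 3 * dens3

line dens grid, yscale(range(0)) xtitle("t") ytitle("Density")
```

For these data the rescaled estimate should stay roughly flat all the way to both bounds, rather than tailing off as the plain kdensity estimate does.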

