
  • local polynomial smoothed graph

    I am trying to create a graph of kernel-weighted local polynomial smoothed values:

    1.) at the mean value of y at each x, by group.

    An example graph is attached below the dataex output.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float y byte(x group)
    -1.48  1 1
      .04  8 2
      .04  8 2
     -.92  9 1
    -1.66  1 1
      .78  4 2
      1.3  5 2
      1.9  3 2
    -2.14  4 1
    -2.21  4 1
    -1.71  8 1
     -.79  8 1
      .39  7 1
      .24  4 1
      .31  7 2
     2.88  6 2
      .53  3 2
     -.89  2 2
    -1.57  3 1
    -1.28  3 1
     -.39  4 1
     -.53  9 1
    -1.13  1 1
    -1.78  3 2
      .32  4 2
     -.07  5 2
    -1.56  2 2
      .03  5 2
     -.69  5 1
     1.74 10 1
     1.39  6 1
      .78  7 1
      .51  1 1
      .53 10 1
     -.32 10 1
    -1.28  9 2
    -1.82  3 2
      .72  5 2
      .28 10 2
     -.72  5 2
     -.93  2 1
     3.97  1 1
    -4.17  4 1
      .85  3 2
     -.47  5 1
    -1.38  3 1
    -1.63  2 1
     -1.2  8 1
     -.79  3 1
     -.23 10 1
     1.29  3 2
     1.94  6 1
      .12  3 1
     -.56  3 1
      .71  1 1
      .76  8 1
       .9  5 1
     -.43  4 1
     1.63  3 1
     1.19  7 1
     5.11  5 1
     1.22  9 1
       .8  2 1
     -.75  7 2
     -1.3  3 2
    -1.77  5 1
     -.75  3 1
     -.29 10 1
    -3.15  2 1
     -.63 10 2
     -.85  5 1
      .26  2 1
     -.67  2 1
     1.62  2 1
     -.92  4 1
     1.83  6 1
     -.61  7 1
     -.15  4 1
    -1.42 10 1
      .02  9 1
      .22  5 1
    -1.17 10 1
     -.63  6 1
     -.12  3 1
       .6  8 1
      .37  9 1
     -.04 10 2
     1.01  2 1
     1.35  7 1
    -1.93  3 1
    -2.21  9 1
     -.56  2 1
     1.52  2 1
     1.05  2 1
      .27  7 2
    -1.46  3 1
      .48  6 1
      .46  1 2
    -2.89  3 1
      -.7  9 1
      .82  8 1
     1.05  8 1
     1.29  9 1
     1.26  6 2
     1.92  1 1
     1.58  6 1
     1.39  3 1
     -.04  1 1
     1.66  7 1
     1.93  9 1
       -1  3 1
     -.21  3 1
      .18  3 1
     2.31  4 1
     2.36 10 1
      1.6  2 1
     -.09 10 1
     -.28  2 2
     -.17  1 1
      .67  7 2
     -2.2  5 1
     -.48  5 1
     1.35  9 2
     3.25  9 2
     1.85  5 1
      .34  6 1
      .16  6 1
      .09  8 1
    -1.87  7 1
      .66  8 1
     1.24  6 1
      .31  3 1
      .53  7 2
      .69  6 1
     -.66  7 1
      .03  3 1
     1.45  6 1
    -1.05 10 1
    -2.14  4 1
      .38  4 2
     -.48  8 1
     -.43  3 1
    -2.01  7 2
    -2.19 10 1
     -2.9  3 1
      .07  7 2
     1.73  8 1
      -.6  7 1
     -.38  9 1
      .91  4 2
    -2.28  7 1
       .3  8 1
      .51  8 1
    -1.57  2 1
    -1.07  1 1
      .83  9 1
    -3.01  3 1
      .54 10 1
    -1.44  2 1
      .58  9 1
     -.24  6 1
     -.36  5 1
     -.94  5 1
     2.24  7 1
     -.65  7 1
     -.02  1 1
    -2.35  1 1
    -1.81  8 1
      1.1  8 1
      .13  4 1
     -.26 10 1
     -.03  2 1
      -.3  8 1
     -.22  9 1
     2.25  6 2
     -1.2  8 2
      .44 10 2
      .83  7 1
     -.66  7 2
     -.78  9 1
    -1.21  5 1
    -1.59  5 1
     1.09  4 1
     1.48 10 1
    -2.96  1 1
     -.06  3 1
      .06  2 1
      .27  6 1
    -2.24  1 1
     -.81  4 2
     -.54  7 1
     -.77  7 1
     -1.1  4 1
    -1.52  9 1
    -2.87  5 1
    -1.54 10 1
     -.81 10 1
    -1.64  9 1
      .73 10 1
     1.17  4 1
    end
    [Attachment: Screenshot 2025-06-20 130944.png — example graph]
    Best regards,
    Mukesh

  • #2
    Code:
    twoway (lpoly y x if group == 1) (lpoly y x if group == 2)
    If you want the raw values on top, you can also add a scatterplot with
    Code:
    scatter y x if group == 1
    Best wishes

    Stata 18.0 MP | ORCID | Google Scholar



    • #3
      Thank you, Felix Bittmann, for your response.
      I want the smoothing applied to the mean value of y at each x.
      Best regards,
      Mukesh



      • #4
        So, calculate the means first using egen or collapse and then fire up a smoother.
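
        For completeness, a sketch of the egen route (the variable name ymean is my own choice), which keeps the original data in memory, unlike collapse:

        Code:
        egen ymean = mean(y), by(x group)
        twoway (lpoly ymean x if group == 1) (lpoly ymean x if group == 2)

        Because each observation in a cell carries the same mean, the smoother here implicitly weights each (x, group) cell by its number of observations.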



        • #5
          Following Nick Cox 's suggestion, the command then becomes:

          Code:
          collapse (mean) y, by(x group)
          twoway (lpoly y x if group == 1) (lpoly y x if group == 2)
          Keep in mind that lpoly also has the degree option.

          HTML Code:
          degree(#) specifies the degree of the polynomial to be used in the
                  smoothing.  The default is degree(0), meaning local-mean smoothing.
          I was not sure whether your initial request was about this type of smoothing or the explicit two-step version just shown.
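
          For example, a sketch of local-linear rather than the default local-mean smoothing (the degree value here is purely illustrative):

          Code:
          twoway (lpoly y x if group == 1, degree(1)) (lpoly y x if group == 2, degree(1))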

          Best wishes

          Stata 18.0 MP | ORCID | Google Scholar



          • #6
            This statement relates to #1: "First, we computed the mean of y by x and graphed the variable and its smoothed values (using the kernel-weighted local polynomial smoothing algorithm) by x."

            If I understood properly, the statement implies:

            Code:
            collapse (mean) y, by(x group)
            twoway (lpoly y x if group == 1) (lpoly y x if group == 2) (scatter y x if group == 2) (scatter y x if group == 1)
            Dear Felix Bittmann, are there any specific ways or rules for choosing the degree or bandwidth? The study mentioned in #1 says nothing about either.

            I am asking because my real data have 200,000 observations on y, ranging from -6 to 6, over 60 discrete x values. Approximately 30% of the y values are below -2.

            Greatly thankful for the responses so far.
            Last edited by Mukesh Punia; 20 Jun 2025, 04:20.
            Best regards,
            Mukesh



            • #7
              I don't think there is an easy answer to your questions. The Stata manual provides some guidance and references: https://www.stata.com/manuals13/rlpoly.pdf
              Personally, I would play around with the options and see how this changes your graph. As long as you report transparently what you are doing you should be fine.
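
              As a sketch of such experimentation (the bandwidth values 1 and 3 are arbitrary, purely for illustration), one could overlay two bandwidths for the same group and compare:

              Code:
              twoway (lpoly y x if group == 1, bwidth(1)) (lpoly y x if group == 1, bwidth(3))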
              Last edited by Felix Bittmann; 20 Jun 2025, 05:53.
              Best wishes

              Stata 18.0 MP | ORCID | Google Scholar



              • #8
                Dear Felix Bittmann your responses are greatly appreciated!

                I was looking at your paper on BMI and happiness. I find it interesting both conceptually and methodologically, particularly the robustness part.
                I am trying to establish relationships between childhood (mal)nutrition and well-being in adolescence in LMICs. I am about to finish my PhD. May I send you an e-mail, if needed, for help, discussion, or collaboration?
                Thank you
                Last edited by Mukesh Punia; 20 Jun 2025, 05:37.
                Best regards,
                Mukesh



                • #9
                  To me, reduction to means before smoothing adds an extra and arbitrary step and raises a variety of issues, such as: why not medians, or trimmed means? And how many observations go into each mean (or other summary), and should that be taken into account?

                  More generally, whatever works well for your data and purpose can well be the prime consideration, but there are others, such as whether you want a smooth curve that is a weighted moving average or one that is a local linear regression.

                  You have given us no information that I can see on what your variables are. Sometimes that does guide what you're seeking, depending on the nature of the generating process, and whether you expect kinks or even jumps in the relationship.
                  Last edited by Nick Cox; 20 Jun 2025, 06:38.



                  • #10
                    Dear Nick Cox, in #6 I described my variables and n. More precisely, y is the standardised height-for-age z-score of children under five, as per the WHO 2006 growth standards, and x is age in completed months (0-59). If you wish to see the graph, I will post it after some time; I am a bit away from my PC. I am doing this because Jeff Leroy at IFPRI, in his two papers in 2014-15, used the height-for-age deficiency (HAD) metric to assess how stunting cumulates at different ages. Sharing the full data here is, I think, not possible because of the 2,000,000 observations.

                    Thank you - Mukesh
                    Best regards,
                    Mukesh



                    • #11
                      You'd need a subject-matter expert to say more. I'd expect that while individuals may have slightly irregular growth curves, that would be averaged out over such a large sample.



                      • #12
                        True!
                        Based on 161,160 observations over 59 months and 5 wealth index quintiles, the graph for the highest vs lowest quintiles is below:

                        Code:
                        collapse (mean) HAZ, by(age wquintile)
                        twoway (lpoly HAZ age if wquintile == 1) (lpoly HAZ age if wquintile == 5) (scatter HAZ age if wquintile == 1) (scatter HAZ age if wquintile == 5)

                        with other options at their defaults.

                        [Attachment: Graph.png — smoothed HAZ by age, lowest vs highest wealth quintiles]



                        Best regards,
                        Mukesh



                        • #13
                          These children are presumably born at different times of year, so what's the story there?



                          • #14
                            Thank you, dear Nick, for an important question. I will post a detailed comment in the coming days; I think your concern is seasonality.
                            A quick response is that, at each point, the children compared are of the same age (e.g. 3 months old) but come from the two (lowest and highest) economic strata.
                            Best regards,
                            Mukesh
