  • Problem with twoway scatter: Why do the values remain continuously at y=0 on the graph?

    I'd like to plot a variable on the x-axis, ranging from 0 to 90000. The values are fixed but not necessarily consecutive (they start at 0, continue with 1, then jump to a non-sequential value, etc.). I want a graph in which y shows the number of observations per id for each value of the power_p variable (0, 1, ...).

    Next, I'd like to plot this graph by product, i.e. the categorical variable "product_classification2" (1,2,3,4). This variable is nominal, not cardinal. The "order" is not important.

    The "total3" variable is a count of the ids, by "power_p". In the example data, "power_p3" equals 2 if "power_p" is 0, 3 if it is 1, 463 if it is 2, and so on.
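
    A count like "total3" could be built along the following lines. This is only a minimal sketch on toy data: the ids and values are hypothetical, and the names n_obs, n_ids and first are illustrative. In the example data below, total3 matches the number of observations for each value of power_p (for instance, 21 rows have power_p equal to 0), which is what n_obs computes. n_ids counts distinct ids instead, in case that is what is wanted.

```stata
* Toy data (hypothetical values, not taken from the thread)
clear
input long(id power_p)
101 0
101 0
102 0
201 1
202 1
301 4
end

* Number of observations sharing each value of power_p
bysort power_p: gen long n_obs = _N

* Number of distinct ids within each value of power_p:
* tag() marks one observation per (power_p, id) pair, then the tags are summed
egen byte first = tag(power_p id)
bysort power_p: egen long n_ids = total(first)

list, sepby(power_p)
```

    The two counts differ whenever an id appears more than once for the same power_p, as several ids do in the example data.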

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id_numerical long(power_p power_p3) float total3 long product_classification2
     167391 0   2 21 1
     606274 0   2 21 2
     463980 0   2 21 1
      12089 0   2 21 4
     463980 0   2 21 1
     332979 0   2 21 1
     606274 0   2 21 2
     196748 0   2 21 1
     573193 0   2 21 2
     727027 0   2 21 1
      12089 0   2 21 4
     547007 0   2 21 2
     332979 0   2 21 1
      37068 0   2 21 2
      37068 0   2 21 2
     196748 0   2 21 1
     606274 0   2 21 2
     467938 0   2 21 2
      37068 0   2 21 2
     606274 0   2 21 2
      37068 0   2 21 2
     261637 1   3 50 2
     373646 1   3 50 1
    1244020 1   3 50 2
     363069 1   3 50 2
      66665 1   3 50 2
     775905 1   3 50 2
     369693 1   3 50 2
     131820 1   3 50 2
     878192 1   3 50 2
     304150 1   3 50 1
    1095955 1   3 50 1
     595716 1   3 50 2
     347416 1   3 50 1
     785459 1   3 50 2
     410882 1   3 50 2
    1064514 1   3 50 2
     976872 1   3 50 2
     410882 1   3 50 2
      88765 1   3 50 1
     223320 1   3 50 2
     304150 1   3 50 1
    1236813 1   3 50 2
     347416 1   3 50 1
     347416 1   3 50 1
    1095955 1   3 50 1
     604293 1   3 50 2
     870793 1   3 50 2
     347416 1   3 50 1
    1076215 1   3 50 2
    1192145 1   3 50 1
    1085856 1   3 50 2
     922230 1   3 50 1
    1095955 1   3 50 1
     870891 1   3 50 2
     803598 1   3 50 1
     347416 1   3 50 1
    1095955 1   3 50 1
    1008453 1   3 50 2
     853336 1   3 50 1
    1095955 1   3 50 1
     151802 1   3 50 2
     853336 1   3 50 1
       6117 1   3 50 2
     630905 1   3 50 2
     542584 1   3 50 2
     846853 1   3 50 2
     527149 1   3 50 2
     803598 1   3 50 1
     655872 1   3 50 2
     373646 1   3 50 1
     602225 2 463  2 1
     602225 2 463  2 1
     924145 3 722 13 1
     754578 3 722 13 1
     754578 3 722 13 1
      88766 3 722 13 1
      22159 3 722 13 1
    1074655 3 722 13 1
     832257 3 722 13 1
     832257 3 722 13 1
     313726 3 722 13 1
     924145 3 722 13 1
      88766 3 722 13 1
    1074655 3 722 13 1
     313726 3 722 13 1
     231538 4 899 14 2
       6946 4 899 14 1
     317723 4 899 14 2
     694927 4 899 14 1
    1070575 4 899 14 1
     317723 4 899 14 2
    1093714 4 899 14 2
    1093714 4 899 14 2
    1070575 4 899 14 1
     231538 4 899 14 2
       6946 4 899 14 1
     816495 4 899 14 1
     694927 4 899 14 1
     816495 4 899 14 1
    end
    label values power_p3 power_p3
    label def power_p3 2 "0", modify
    label def power_p3 3 "1", modify
    label def power_p3 463 "2", modify
    label def power_p3 722 "3", modify
    label def power_p3 899 "4", modify
    label values product_classification2 product_classification2
    label def product_classification2 1 "Clasico", modify
    label def product_classification2 2 "Indexado", modify
    label def product_classification2 4 "Tarifa Justa", modify

    Here is the graph that I obtained:

    [Image: Fail.png]


    Best,

    Michael

  • #2
    Please show the command you used to get the graph in #1.



    • #3
      This is the most basic one that I used:

      graph twoway scatter total3 power_p3

      I then added the by() option but obtained strange results.


      Basically, this is the kind of graph that I want, drawn separately for each category of product_classification2 (coded 1, 2, 3, 4):

      [Image: Graph.png]



      Thank you for your help, Nick Cox.
      Last edited by Michael Duarte Goncalves; 15 Sep 2023, 07:17.



      • #4
        Thanks for further detail. I can't follow all that you want here, but

        * I wouldn't use a scatter plot to show frequencies

        * I don't gather that your category scale makes smoothing helpful, or even valid

        * You seem surprised, yet at the same time expectant, that a few categories are common but many are not, and that this will show up on a graph.

        * The value label definitions seem backward to me.

        All that said, this may help. Perhaps the major point is use of a square root scale to dampen the contrast of common and rare categories.

        At the same time, I fear that the full version of this with all the data will still be a mess and that you might be better off with focus on the most common categories, whatever they are.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float id_numerical long(power_p power_p3) float total3 long product_classification2
         167391 0   2 21 1
         606274 0   2 21 2
         463980 0   2 21 1
          12089 0   2 21 4
         463980 0   2 21 1
         332979 0   2 21 1
         606274 0   2 21 2
         196748 0   2 21 1
         573193 0   2 21 2
         727027 0   2 21 1
          12089 0   2 21 4
         547007 0   2 21 2
         332979 0   2 21 1
          37068 0   2 21 2
          37068 0   2 21 2
         196748 0   2 21 1
         606274 0   2 21 2
         467938 0   2 21 2
          37068 0   2 21 2
         606274 0   2 21 2
          37068 0   2 21 2
         261637 1   3 50 2
         373646 1   3 50 1
        1244020 1   3 50 2
         363069 1   3 50 2
          66665 1   3 50 2
         775905 1   3 50 2
         369693 1   3 50 2
         131820 1   3 50 2
         878192 1   3 50 2
         304150 1   3 50 1
        1095955 1   3 50 1
         595716 1   3 50 2
         347416 1   3 50 1
         785459 1   3 50 2
         410882 1   3 50 2
        1064514 1   3 50 2
         976872 1   3 50 2
         410882 1   3 50 2
          88765 1   3 50 1
         223320 1   3 50 2
         304150 1   3 50 1
        1236813 1   3 50 2
         347416 1   3 50 1
         347416 1   3 50 1
        1095955 1   3 50 1
         604293 1   3 50 2
         870793 1   3 50 2
         347416 1   3 50 1
        1076215 1   3 50 2
        1192145 1   3 50 1
        1085856 1   3 50 2
         922230 1   3 50 1
        1095955 1   3 50 1
         870891 1   3 50 2
         803598 1   3 50 1
         347416 1   3 50 1
        1095955 1   3 50 1
        1008453 1   3 50 2
         853336 1   3 50 1
        1095955 1   3 50 1
         151802 1   3 50 2
         853336 1   3 50 1
           6117 1   3 50 2
         630905 1   3 50 2
         542584 1   3 50 2
         846853 1   3 50 2
         527149 1   3 50 2
         803598 1   3 50 1
         655872 1   3 50 2
         373646 1   3 50 1
         602225 2 463  2 1
         602225 2 463  2 1
         924145 3 722 13 1
         754578 3 722 13 1
         754578 3 722 13 1
          88766 3 722 13 1
          22159 3 722 13 1
        1074655 3 722 13 1
         832257 3 722 13 1
         832257 3 722 13 1
         313726 3 722 13 1
         924145 3 722 13 1
          88766 3 722 13 1
        1074655 3 722 13 1
         313726 3 722 13 1
         231538 4 899 14 2
           6946 4 899 14 1
         317723 4 899 14 2
         694927 4 899 14 1
        1070575 4 899 14 1
         317723 4 899 14 2
        1093714 4 899 14 2
        1093714 4 899 14 2
        1070575 4 899 14 1
         231538 4 899 14 2
           6946 4 899 14 1
         816495 4 899 14 1
         694927 4 899 14 1
         816495 4 899 14 1
        end

        label values power_p3 power_p3
        label def power_p3 2 "0", modify
        label def power_p3 3 "1", modify
        label def power_p3 463 "2", modify
        label def power_p3 722 "3", modify
        label def power_p3 899 "4", modify
        label values product_classification2 product_classification2
        label def product_classification2 1 "Clasico", modify
        label def product_classification2 2 "Indexado", modify
        label def product_classification2 4 "Tarifa Justa", modify
        
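        * distinct values of power_p, used as x-axis label positions
        levelsof power_p, local(xvals)

        * label each power_p position with the corresponding power_p3 value
        foreach x of local xvals {
            su power_p3 if power_p == `x', meanonly
            local xla `xla' `x' "`r(mean)'"
        }

        * install from Stata Journal
        * labels show 0, 25, ... but are placed at their square roots
        mylabels 0 25 100 225 400 625 900 1225, myscale(sqrt(@)) local(yla)

        * product abbreviates product_classification2
        spikeplot power_p [fw=total3], lw(thick) root xla(`xla') yla(`yla') ytitle(Frequency (root scale)) by(product, note(""))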
        [Image: spikeplot.png]
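
        To make the square root scale concrete: the mylabels call above produces axis labels that show the original frequencies but sit at their square roots. For readers who do not have mylabels installed, much the same label list can be sketched with a plain loop. This is an illustration, not a replacement for the code above; the macro name yla mirrors that code, and pos is an illustrative name.

```stata
* Build y-axis labels for a square-root scale by hand:
* each label is positioned at sqrt(value) but displays the value itself
local yla
foreach v of numlist 0 25 100 225 400 625 900 1225 {
    local pos = sqrt(`v')
    local yla `yla' `pos' "`v'"
}

* Show the resulting position/text pairs, ready for yla(`yla')
display `"`yla'"'
```

        Because the chosen values are perfect squares, the labels fall at evenly spaced positions (0, 5, 10, ..., 35), which keeps the axis readable while compressing the gap between common and rare categories.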



        • #5
          Good afternoon Nick Cox,

          Thank you so much for the tips provided.
          You are totally right.


          All the best,

          Michael
