  • How to plot a scatter plot for the dataset given below

    Dear All,

    I would like to seek your help in plotting a scatter plot where the X-axis is the timeline (timeframe), the Y-axis is the risk score (riskscore, 1-3), and the dots should be the trend description.

    Your help would be highly appreciated.

    dataex signal timeframe riskscore

    copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str77 signal str11 timeframe byte riskscore
    "Fall in global and emerging market venture capital/private equity investments"    "0-3 years "    1
    "DFIs increasing financing to marginalised entrepreneurs"    "3-7 Years "    2
    "Increasing investment in nature-based solutions"    "0-3 Years "    2
    "Increasing investment in nature-based solutions"    "0-3 Years "    2
    "Investments in hydrocarbon production increasing"    "0-3 Years "    2
    "Investments in hydrocarbon production increasing"    "0-3 years "    2
    "Fossil fuel subsidies persist"    "0-3 Years "    3
    "Fossil fuel subsidies persist"    "3-7 Years "    3
    "Developing economies increasingly disrupted by geopolitical tensions"    "3-7 Years "    3
    "Increase in youth unemployment in developing countries"    "7-10 Years "    3
    "Increase in youth unemployment in developing countries"    "0-3 Years "    3
    "Increase in green tax incentives in developed countries"    "7-10 years "    1
    "Increase in green tax incentives in developed countries"    "0-3 years "    1
    "New wave of debt swaps for climate and nature"    "3-7 Years "    1
    "New wave of debt swaps for climate and nature"    "0-3 years "    1
    "Increasing consumer demand for sustainability"    "0-3 Years "    1
    "Increasing consumer demand for sustainability"    "0-3 years "    1
    "Gender backlash"    "3-7 Years "    1
    "ESG regulation in the spotlight"    "0-3 years "    1
    "ESG regulation in the spotlight"    "3-7 years "    2
    "More tech innovations for philanthropy"    "3-7 Years "    2
    "More tech innovations for philanthropy"    "3-7 Years "    2
    "Growing number of countries in debt distress"    "3-7 Years "    2
    "Increased use of weather modifying technology"    "0-3 Years "    2
    "Increased use of weather modifying technology"    "3-7 Years "    2
    "Climate activism expanding"    "3-7 Years "    3
    "Climate activism expanding"    "3-7 Years "    3
    "Ever-stronger AI"    "7-10 Years "    3
    "Increasingly digital lives"    "3-7 Years "    3
    "Growth in metaverse users"    "0-3 Years "    3
    "Boom in breakthrough technology"    "7-10 Years "    1
    "Rise in social unrest"    "0-3 Years "    1
    "Rise in social unrest"    "3-7 Years "    1
    "Growing demand for new forms of governance"    "3-7 Years "    1
    "Growing demand for new forms of governance"    "3-7 Years "    1
    "Climate shocks - more intense, more frequent"    "0-3 Years "    2
    "New alliances for the global commons"    "0-3 Years "    2
    "New alliances for the global commons"    "3-7 Years "    2
    "Distributed energy solutions proliferate"    "0-3 years "    2
    "Distributed energy solutions proliferate"    "3-7 Years "    2
    "Inequitable distribution of technology persists"    "0-3 Years "    2
    "Growing concern for privacy"    "0-3 Years "    3
    "Race for scarce resources"    "0-3 Years "    3
    "Race for scarce resources"    "0-3 years "    3
    "Growing youth bulge in many developing countries"    "0-3 Years "    3
    "Growing youth bulge in many developing countries"    "3-7 Years "    3
    "Developing countries assert themselves"    "3-7 Years "    1
    "Increasing polarization"    "7-10 Years "    1
    "Growing concern for future generations"    "0-3 Years "    1
    "Democratic backsliding"    "7-10 years "    1
    "Democratic backsliding"    "0-3 years "    1
    "Shifting nature of work"    "3-7 Years "    2
    "Mental health under stress"    "0-3 years "    2
    "You don't speak for me"    "0-3 Years "    2
    "You don't speak for me"    "0-3 years "    2
    "Growing inequalities"    "3-7 Years "    2
    "Social contracts under pressure"    "0-3 years "    2
    "Multilateral fragmentation"    "3-7 years "    3
    "Multilateral fragmentation"    "3-7 Years "    3
    "Increase in green tax incentives in developed countries"    "3-7 Years "    3
    "Democratic backsliding"    "3-7 Years "    3
    "You don't speak for me"    "0-3 Years "    3
    "Distributed energy solutions proliferate"    "3-7 Years "    1
    "Growing inequalities"    "3-7 Years "    1
    end
    copy up to and including the previous line ------------------

    Listed 64 out of 64 observations

    Thank you in advance.




  • #2
    It's hard for me to follow exactly what you want here and whether it will work.

    The first detail is that 64 observations yield 39 different signals, but you don't mention putting those in a plot, so leave that on one side.
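
The count of distinct signals can be checked directly. A minimal sketch, using four rows copied from the example data rather than all 64; the variable name tagged is an illustrative choice, not something from the original post:

```stata
* Toy subset of the example data: 4 observations, 3 distinct signals
clear
input str40 signal
"Fossil fuel subsidies persist"
"Fossil fuel subsidies persist"
"Gender backlash"
"Ever-stronger AI"
end

* tag() marks exactly one observation in each group of identical values
egen byte tagged = tag(signal)

* the number of tagged observations is the number of distinct signals
count if tagged
```

Run on the full data example instead of the toy subset, the same two commands should reproduce the count of distinct signals quoted above.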

    timeframe is a string variable with inconsistent capitalization, so that should be cleaned up.


    Code:
    . tab timeframe
    
      timeframe |      Freq.     Percent        Cum.
    ------------+-----------------------------------
     0-3 Years  |         18       28.12       28.12
     0-3 years  |         12       18.75       46.88
     3-7 Years  |         26       40.62       87.50
     3-7 years  |          2        3.12       90.62
    7-10 Years  |          4        6.25       96.88
    7-10 years  |          2        3.12      100.00
    ------------+-----------------------------------
          Total |         64      100.00
    
    . replace timeframe = lower(timeframe)
    (48 real changes made)

    As the leading characters 0, 3 and 7 already sort in the desired order, the cleaned string variable can be used as it stands in any graph command that supports a string variable on one axis. Alternatively, encode would yield a numeric version with value labels if that were needed.

    Code:
    . encode timeframe, gen(time)
    
    . tab time
    
           time |      Freq.     Percent        Cum.
    ------------+-----------------------------------
     0-3 years  |         30       46.88       46.88
     3-7 years  |         28       43.75       90.62
    7-10 years  |          6        9.38      100.00
    ------------+-----------------------------------
          Total |         64      100.00
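
The values in the data example also carry a trailing blank (for instance "0-3 years "), which lower() leaves in place. A minimal sketch of removing it before encoding, on five rows copied from the example data; strtrim() assumes Stata 14 or later (older versions have trim()):

```stata
* Toy subset of the example data (timeframe and riskscore only)
clear
input str11 timeframe byte riskscore
"0-3 years "  1
"3-7 Years "  2
"0-3 Years "  2
"7-10 Years " 3
"7-10 years " 1
end

* lower-case first, then strip leading and trailing blanks
replace timeframe = strtrim(lower(timeframe))

* encode numbers the distinct strings in alphabetical order,
* so "0-3 years" becomes 1, "3-7 years" 2 and "7-10 years" 3
encode timeframe, gen(time)
tab time
```

Trimming does not change the sort order here; it only tidies the strings and the value labels that encode creates from them.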

    Now we have 3 distinct time frames and 3 distinct values of risk score. The problem with a standard scatter plot will be over-plotting of identical values. Some people would shake them apart using jittering, that is, adding random noise, but in my experience that rarely works well for this kind of data. I would try a bar chart instead. For example, I reached for tabplot from the Stata Journal.

    Code:
     
    tabplot riskscore timeframe, showval yasis separate(riskscore) bar1(color(navy)) bar2(color(blue)) bar3(color(red))

    [Image: risk.png]


    There are some guesses here. I am supposing that risk 3 is higher than risk 2 and so on. If not, omit the yasis option. Whether to use different colours, and if so which, is clearly a matter of choice.

    The objection that you might well use a table is likely to seem cogent. The point is also that a scatter plot seems unlikely to work better.
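
For comparison, both alternatives mentioned in this post, a jittered scatter plot and a plain table, can be tried in a few lines. A sketch on six rows copied from the example data; the jitter amount and seed are arbitrary choices for illustration:

```stata
* Toy subset of the example data
clear
input str11 timeframe byte riskscore
"0-3 years "  1
"3-7 Years "  2
"0-3 Years "  2
"0-3 Years "  3
"3-7 Years "  3
"7-10 Years " 3
end

* clean the string variable and make a numeric, labelled version
replace timeframe = strtrim(lower(timeframe))
encode timeframe, gen(time)

* jittered scatter plot: random noise shakes apart identical points
scatter riskscore time, jitter(5) jitterseed(12345) xlabel(1/3, valuelabel) ylabel(1/3)

* the same information as a two-way table of frequencies
tabulate riskscore time
```

With only 3 by 3 possible combinations, the table shows every frequency exactly, which the jittered plot can only suggest.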
