  • Creating a dot plot with horizontal spacing depending on a variable

    Dear Statalisters,

    Please consider this dataset. I've hidden sensitive data so it may not make sense here.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte rank str4 year str107 name byte section str10 id
    24 "1986" "reorieofoefi" 1 "ID1"
    28 "1986" "reor"         1 "ID2"
    19 "1992" "kdsfsof"      1 "ID2"
    52 "1992" "peproz"       1 "ID3"
    58 "1992" "ofkoof"       1 "ID4"
     1 "1998" "fpdsofps"     0 "ID5"
     9 "1998" "zrpzorz"      1 "ID6"
    16 "1998" "mflmsf"       0 "ID2"
     9 "2004" "opzrpoz"      1 "ID7"
    18 "2004" "spfopdsof"    1 "ID2"
    24 "2004" "spffods"      1 "ID8"
    25 "2004" "psdlmv"       0 "ID9"
    38 "2004" "pfo^zfz"      1 "ID10"
    30 "2010" "kvxlmmkldg"   1 "ID11"
    33 "2014" "vpssklgs"     1 "ID12"
    45 "2014" "sokozkv"      0 "ID13"
    46 "2014" "vklsklkzg"    0 "ID14"
    14 "2017" "kvskkozkgo"   0 "ID15"
    end
    I want every variable in this toy example to appear in a dot plot, but I don't know how to do this as I'm new to Stata.

    year: it should be my X-axis variable
    rank: it should be my Y-axis variable
    section: it is a binary variable. Within each year, I want the horizontal position to depend on whether section == 0 or section == 1. All the section == 0 observations should be to the left of the year, and all the section == 1 observations should be to the right of the year.
    id: I want dots sharing the same id to have the same color.
    name: I want the values contained in name to be shown alongside the dot.

    At the end of the day, it should look like this (forgive my Paint skills!)
    [Image: idea.png]

    As you can see, whatever1 in 1986 became whatever2 in 1992, and it also changed position in the order. But they are the same color because they would have the same ID. Please also note that the year-rank combination is a unique identifier, so there will never be two points sharing the same Y in the same year. I would also like the Y scale to be reversed so that the dot plot truly looks like a small questionnaire with the order of questions!

    Please help me!

    Best regards.
    Last edited by Valentine Laurent; 10 Oct 2023, 00:58.

  • #2
    Code:
    clear
    input byte rank int year str12 name byte section str4 id
    24 1986 "reorieofoefi" 1 "ID1"
    28 1986 "reor"         1 "ID2"
    19 1992 "kdsfsof"      1 "ID2"
    52 1992 "peproz"       1 "ID3"
    58 1992 "ofkoof"       1 "ID4"
     1 1998 "fpdsofps"     0 "ID5"
     9 1998 "zrpzorz"      1 "ID6"
    16 1998 "mflmsf"       0 "ID2"
     9 2004 "opzrpoz"      1 "ID7"
    18 2004 "spfopdsof"    1 "ID2"
    24 2004 "spffods"      1 "ID8"
    25 2004 "psdlmv"       0 "ID9"
    38 2004 "pfo^zfz"      1 "ID10"
    30 2010 "kvxlmmkldg"   1 "ID11"
    33 2014 "vpssklgs"     1 "ID12"
    45 2014 "sokozkv"      0 "ID13"
    46 2014 "vklsklkzg"    0 "ID14"
    14 2017 "kvskkozkgo"   0 "ID15"
    end
    
    // start with a scatter plot
    scatter rank year
    
    // now we want to move section = 0 to the left
    // and section = 1 to the right
    gen x = year - 0.5 if section == 0
    replace x = year + 0.5 if section == 1
    label var x "year"
    
    scatter rank x
    // Well, that does what you asked for, but I don't know whether it
    // works as well as you hoped
    
    // same color for same id
    separate rank , by(id) veryshortlabel
    scatter rank? rank?? x
    // again it does what you asked for, but ...
    
    // name the dots
    local gr "twoway "
    forvalues i = 1/15 {
        local gr "`gr' scatter rank`i' x, mlabel(name) || "
    }
    `gr'
    // again it does what you asked for, but ...
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Thank you, Maarten! I would never have thought of creating a variable x like that!

      I understand the graph must look a bit odd, especially with my data example, but with my real data it makes more sense. I'll reduce the total number of IDs and shorten the names so that it's readable, but at least now I know how to produce the shape I want.

      One last question: Is there a way to only show the ticks of my year variable? Because the graph has a constant scale from 1980 to 2020, but I would only like ticks at my year values, even if it is not a constant scale. Thank you very much!



      • #4

        Code:
        levelsof year, clean local(levels) 
        .... xla(`levels')
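
        For readers who want to see where this goes, here is a minimal sketch of how the xlabel() option might slot into the graph from #2. It is an illustration under assumptions, not the only way to do it: it uses only a few rows of the example data, the macro names newvars and plots are made up for this sketch, and yscale(reverse) is added because the original question asked for a reversed Y axis.

        ```stata
        clear
        input byte rank int year str12 name byte section str4 id
        24 1986 "reorieofoefi" 1 "ID1"
        28 1986 "reor"         1 "ID2"
        19 1992 "kdsfsof"      1 "ID2"
         1 1998 "fpdsofps"     0 "ID5"
        16 1998 "mflmsf"       0 "ID2"
        end

        // shift section == 0 to the left and section == 1 to the right, as in #2
        gen x = year - 0.5 if section == 0
        replace x = year + 0.5 if section == 1

        // one y variable per id, so that each id gets its own color
        separate rank, by(id) veryshortlabel
        local newvars `r(varlist)'   // save before levelsof overwrites r()

        // collect the distinct years to use as tick positions
        levelsof year, clean local(levels)

        // one labelled scatter per id, built up in a local macro
        local plots
        foreach v of local newvars {
            local plots "`plots' (scatter `v' x, mlabel(name))"
        }

        // ticks only at the observed years; reversed Y axis
        twoway `plots', xlabel(`levels') yscale(reverse) ///
            xtitle("year") ytitle("rank") legend(off)
        ```

        Because x is offset by 0.5 on either side of year, the ticks at the values of year fall between the two groups of dots.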
