  • Using string variable as x-axis in a plot

    My data consists of an x-variable (year) and a y-variable (y1). I want to plot a line graph of (y1, year). "year" ranges from 1923-1933, but the issue is that 1930 is split into two parts: 1930a and 1930b, denoting the first and second half of 1930, respectively. Because of this, the "year" variable has to be in string format. How do I plot a line graph using a string x-axis?

    Example:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str5 year float y1
    "1923"  112
    "1924"  112
    "1925"  117
    "1926"  115
    "1927"  128
    "1928"  132
    "1929"  141
    "1930a" 123
    "1930b" 158
    "1931"  170
    "1932"  171
    "1933"  163
    end

  • #2
    You can treat each label as a time interval and plot it at the interval's midpoint. For example, the first half of 1930 runs from 1930 to 1930.5, so its midpoint is 1930.25, and the second half runs from 1930.5 to 1931, so its midpoint is 1930.75. There is nothing wrong with using fractional years. Even if you label the axis with strings, they are mapped to real numbers when the scale is constructed.
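
    A minimal sketch of the midpoint idea, using the data from #1. The variable name x and the choice to place whole years at their own midpoint (for example 1923.5) are assumptions made for illustration, not part of the original reply.

    ```stata
    clear
    input str5 year float y1
    "1923"  112
    "1924"  112
    "1925"  117
    "1926"  115
    "1927"  128
    "1928"  132
    "1929"  141
    "1930a" 123
    "1930b" 158
    "1931"  170
    "1932"  171
    "1933"  163
    end

    * numeric year from the first four characters, placed at mid-year (assumption)
    gen double x = real(substr(year, 1, 4)) + 0.5

    * half-years go at the midpoints of their own intervals: 1930.25 and 1930.75
    replace x = x - 0.25 if substr(year, 5, 1) == "a"
    replace x = x + 0.25 if substr(year, 5, 1) == "b"

    twoway line y1 x, sort xtitle("Year (interval midpoint)")
    ```

    With this scale the two halves of 1930 sit half a year apart, while neighbouring whole years sit a full year apart.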


    • #3
      Andrew Musau I was able to do it this way:

      Code:
      gen t = .
      replace t = 1 if year == "1923"
      replace t = 2 if year == "1924"
      replace t = 3 if year == "1925"
      replace t = 4 if year == "1926"
      replace t = 5 if year == "1927"
      replace t = 6 if year == "1928"
      replace t = 7 if year == "1928"
      replace t = 8 if year == "1929"
      replace t = 9 if year == "1930a"
      replace t = 10 if year == "1930b"
      replace t = 11 if year == "1931"
      replace t = 12 if year == "1932"
      replace t = 13 if year == "1933"
      
      labmask t, values(year)
      levelsof t, local(tvalues)
      graph twoway line y1 t, xlabel(`tvalues', valuelabels)
      But a strange gap appears on the x-axis between 1927 and 1928 when I plot -- the distance between them on the x-axis is much wider than between other points. How should I fix this?
      Last edited by Saunok Chakrabarty; 18 May 2024, 11:44. Reason: Edit: I renamed my y variable as y1 in the edited answer.
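
      On the gap itself: the listing above contains two replace lines for "1928" (t = 6, then t = 7), so 1928 ends up at 7, no observation has t = 6, and an empty slot is left between 1927 and 1928. One way to avoid hand-typed codes, sketched here on the assumption that equally spaced points are what is wanted, is encode. It numbers the distinct strings 1, 2, ... in alphabetical order, which happens to match chronological order for these labels, and attaches the strings as value labels.

      ```stata
      clear
      input str5 year float y1
      "1923"  112
      "1924"  112
      "1925"  117
      "1926"  115
      "1927"  128
      "1928"  132
      "1929"  141
      "1930a" 123
      "1930b" 158
      "1931"  170
      "1932"  171
      "1933"  163
      end

      * consecutive integer codes 1..12, with the year strings as value labels
      encode year, generate(t)

      * label every tick with its value label instead of the number
      twoway line y1 t, sort xlabel(1(1)12, valuelabel)
      ```

      This removes the gap but keeps the equal spacing of the manual approach, so the two halves of 1930 are still drawn as far apart as two full years.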


      • #4
        That's just a manual encode, and the scale is misleading: it implies that the gap between 1923 and 1924 is the same as the gap between the first and second halves of 1930. Using 4 units per year works well with your data, so that half a year is 2 units. Below, I use labmask from the Stata Journal.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str5 year float y1 val
        "23"  112 2
        "24"  112 6
        "25"  117 10
        "26"  115 14
        "27"  128 18
        "28"  132 22
        "29"  141 26
        "30a" 123 30
        "30b" 158 32
        "31"  170 34
        "32"  171 38
        "33"  163 42
        end
        *search labmask
        labmask val, values(year)
        twoway line y1 val, xlabel(2 (4) 30 32 34 (4) 42, valuelabels) xtitle(1900s) ytitle(Whatever)
        [Image: Graph.png]
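
        The val column in the code above is typed by hand. As a rough sketch, the same numbers can be computed from the four-digit strings in #1, assuming 4 units per year, an offset of 2, and the "b" half placed 2 units after the "a" half. The names yr and lab are hypothetical helpers introduced only for this illustration.

        ```stata
        clear
        input str5 year float y1
        "1923"  112
        "1924"  112
        "1925"  117
        "1926"  115
        "1927"  128
        "1928"  132
        "1929"  141
        "1930a" 123
        "1930b" 158
        "1931"  170
        "1932"  171
        "1933"  163
        end

        * numeric year from the first four characters
        gen int yr = real(substr(year, 1, 4))

        * 4 units per year, starting at 2 for 1923
        gen val = 4 * (yr - 1923) + 2

        * second half of a split year sits half a year (2 units) later
        replace val = val + 2 if substr(year, 5, 1) == "b"

        * two-digit labels such as "23" and "30a", as in the reply above
        gen str3 lab = substr(year, 3, .)

        * labmask is community-contributed; type "search labmask" to install it
        labmask val, values(lab)
        levelsof val, local(ticks)
        twoway line y1 val, sort xlabel(`ticks', valuelabel) xtitle(1900s)
        ```

        For these data the computed values match the hand-entered ones: 2 for 1923, 30 and 32 for the two halves of 1930, and 42 for 1933.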


        • #5
          Andrew Musau Thank you! That works.
