  • Bug in graph combine?

    I am trying to use graph combine in Stata 18, but it fails as soon as I include one specific variable.

    I have identified the variable "age" as the apparent cause of the error. A simple demonstration with two histograms:

    Code:
    . hist age, name(g1, replace)
    (bin=37, start=34, width=.89189189)
    
    . hist age, name(g2, replace)
    (bin=37, start=34, width=.89189189)
    
    . graph combine g1 g2
    too few quotes
    r(132);

    I can't see anything wrong with the variable age.

    The problem is that -graph combine- works fine with the data produced by -dataex- below.

    Code:
    . des age
    
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ---------------------------------------------------------------------------------------------------------------------------------------------
    age             byte    %11.0f                Hva er din alder?
    
    .
    end of do-file
    
    . do /var/folders/33/21gh5ql91ts2sfl0y9d_l1g80000gn/T/StataRun1684930937536.do
    
    . dataex age
    
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte age
     .
     .
    34
    40
    39
    36
    40
    44
    42
    36
    37
    41
    43
    40
    37
    38
    43
    37
    43
    36
    39
    42
    42
    42
    44
    39
    38
    43
    40
    43
    39
    35
    41
    42
    36
    36
    39
    40
    43
    39
    37
    42
    36
    41
    41
    39
    36
    36
    35
    37
    41
    37
    36
    37
    44
    41
    39
    36
    42
    44
    39
    40
    39
    42
    37
    35
    38
    42
    41
    39
    42
    42
    36
    35
    35
    40
    36
    39
    35
    40
    40
    35
    37
    44
    36
    41
    35
    43
    42
    40
    38
    43
    35
    41
    44
    44
    36
    44
    43
    41
    end

    So there is something odd with this particular variable.

    Code:
    . codebook age
    
    ---------------------------------------------------------------------------------------------------------------------------------------------
    age                                                                                                                        Hva er din alder?
    
    ---------------------------------------------------------------------------------------------------------------------------------------------
    
                      Type: Numeric (byte)
    
                     Range: [34,67]                       Units: 1
             Unique values: 34                        Missing .: 2/6,054
    
                      Mean: 50.5345
                 Std. dev.: 8.45551
    
               Percentiles:     10%       25%       50%       75%       90%
                                 38        44        51        57        62
    
    
    
    . tab age, missing
    
     Hva er din |
        alder?
     |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             34 |          1        0.02        0.02
             35 |        152        2.51        2.53
             36 |        170        2.81        5.34
             37 |        159        2.63        7.96
             38 |        168        2.78       10.74
             39 |        160        2.64       13.38
             40 |        170        2.81       16.19
             41 |        164        2.71       18.90
             42 |        167        2.76       21.66
             43 |        158        2.61       24.26
             44 |        167        2.76       27.02
             45 |        173        2.86       29.88
             46 |        193        3.19       33.07
             47 |        192        3.17       36.24
             48 |        194        3.20       39.44
             49 |        217        3.58       43.03
             50 |        256        4.23       47.26
             51 |        235        3.88       51.14
             52 |        268        4.43       55.57
             53 |        293        4.84       60.41
             54 |        292        4.82       65.23
             55 |        245        4.05       69.28
             56 |        218        3.60       72.88
             57 |        228        3.77       76.64
             58 |        229        3.78       80.43
             59 |        218        3.60       84.03
             60 |        157        2.59       86.62
             61 |        167        2.76       89.38
             62 |        122        2.02       91.39
             63 |        157        2.59       93.99
             64 |        110        1.82       95.80
             65 |        112        1.85       97.65
             66 |         61        1.01       98.66
             67 |         79        1.30       99.97
              . |          2        0.03      100.00
    ------------+-----------------------------------
          Total |      6,054      100.00

    PS. If StataCorp would like the data, feel free to contact me.
    Last edited by Christopher Bratt; 24 May 2023, 07:21.

  • #2
    I cannot reproduce this with a simulated dataset. Would you please either post the dataset here or send it to tech support? Please also include the output from

    Code:
    about
    query graphics
    and run:

    Code:
    hist age
    graph save hist.gph
    Then post hist.gph here or send it to us as well. Thanks.


    • #3
      This works for me in Stata 18. Note the extra option discrete, which is needed here but unrelated to the problem in this thread.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte age int freq
      34   1
      35 152
      36 170
      37 159
      38 168
      39 160
      40 170
      41 164
      42 167
      43 158
      44 167
      45 173
      46 193
      47 192
      48 194
      49 217
      50 256
      51 235
      52 268
      53 293
      54 292
      55 245
      56 218
      57 228
      58 229
      59 218
      60 157
      61 167
      62 122
      63 157
      64 110
      65 112
      66  61
      67  79
       .   2
      end
      
      
      expand freq 
      
      label var age "Hva er din alder?"
      
      hist age, name(G1, replace) discrete 
      
      graph combine G1 G1


      • #4
        Code:
        . keep dv age
        
        . save df_statalist, replace
        file df_statalist.dta saved
        
        . lowess dv age, jitter(10) name(g_1, replace)
        . lowess dv age, jitter(10) name(g_2, replace)
        
        . graph combine g_1 g_2
        too few quotes
        r(132);
        Attached Files


        • #5
          This problem I can reproduce. Remove the variable label and it goes away. But it's not the question mark that is the problem, and I can't see any exotic characters in the label.


          • #6
            Thanks, I can reproduce the issue. Interestingly, the error happens in Stata 17 and 16 as well (I did not go back further).


            • #7
              OK, I know what is going on. For some reason, there is a newline character (ASCII char 10) at the end of the variable label for age in your dataset, i.e., the label is actually:

              Code:
               "Hva er din alder?\n"
              which causes graph combine to break. To verify:

              Code:
              . local name : var label age
              
              . di strlen("`name'")
              18
              
              . di strlen("Hva er din alder")
              16
               
              . di strpos(`"`name'"', char(10))
              18
              
              . di tobytes("`name'")
              \d072\d118\d097\d032\d101\d114\d032\d100\d105\d110\d032\d097\d108\d100\d101\d114\d063\d010
              Relabeling the age variable without the newline character at the end fixes the issue. BTW, the newline also explains the broken header in the output of tab age, missing in the first post.
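
              If the original label text should be kept rather than replaced, one option is to strip the control characters and reapply the label. The following is only a minimal sketch, not an official fix: it assumes the stray characters are a line feed (char(10)) or a carriage return (char(13)), and the macro name lbl is arbitrary.

              ```stata
              * Sketch: remove line feeds and carriage returns from the label of age
              * (assumes these are the only offending characters)
              local lbl : variable label age

              * replace every occurrence; "." as the last argument means "all"
              local lbl = subinstr(`"`lbl'"', char(10), "", .)
              local lbl = subinstr(`"`lbl'"', char(13), "", .)

              * drop any leading or trailing blanks left behind
              local lbl = strtrim(`"`lbl'"')

              * reapply the cleaned label
              label variable age `"`lbl'"'
              ```

              Afterwards, the strpos() check shown above should return 0 for char(10).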
              Last edited by Hua Peng (StataCorp); 24 May 2023, 10:40.


              • #8
                Good catch!


                • #9
                  Thanks a lot to both of you.

                  I can confirm that
                  Code:
                  label variable age "Age"
                  solves the problem. -graph combine- works as expected.

                  There are often problems with data delivered by others, but this was a new one (and the data are from a well-established polling organisation).


                  Actually, there was a giveaway in my -tab age- (see post #1 in this thread): The line break is clearly visible. But I wouldn't have understood the significance of that anyway.
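
                  For datasets delivered by others, the same check could be run over every variable label before graphing. This is a rough sketch under the same assumption as the diagnosis in #7, namely that the culprit is a line feed (char(10)) or carriage return (char(13)); the macro names v and lbl are arbitrary, and value labels are not checked.

                  ```stata
                  * Sketch: report and clean variable labels containing line breaks
                  foreach v of varlist _all {
                      local lbl : variable label `v'

                      * strpos() returns 0 when the character is absent
                      if strpos(`"`lbl'"', char(10)) | strpos(`"`lbl'"', char(13)) {
                          display as text "`v': line break found in variable label"

                          * replace each line break with a blank, then trim the ends
                          local lbl = subinstr(`"`lbl'"', char(10), " ", .)
                          local lbl = subinstr(`"`lbl'"', char(13), " ", .)
                          local lbl = strtrim(`"`lbl'"')

                          label variable `v' `"`lbl'"'
                      }
                  }
                  ```

                  Line breaks are replaced with a blank rather than deleted so that words separated only by a line break in the middle of a label are not run together.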
                  Last edited by Christopher Bratt; 24 May 2023, 10:57.
