  • Exporting Graphs Which Contain Unicode

    Here's an example where I have a beautiful Unicode string in a graph title that gets mangled when I export to PS:

    Code:
    sysuse auto, clear
    reg price mpg
    margins, dydx(*)
    marginsplot, title(`=ustrunescape("\u03B2\u0303")')
    graph export betahat.ps, logo(off) orientation(landscape) mag(175) replace
    graph export betahat.pdf, mag(175) replace
    The export to PDF works nicely, but that is only available on a Mac.

    The PS file (which can be turned into a PDF with ps2pdf on Linux) has "??" instead of the $\hat \beta$.

    Similarly, the user-written -graphexportpdf- has the same problem.

    Any suggestions on how this can be fixed?

  • #2
    From -help whatsnew13to14-:

    Export graphs or output containing Unicode using PDF instead of PostScript (PS) or Encapsulated PostScript (EPS). PS and EPS do not support Unicode. In some cases, you can use PS and EPS because Stata converts accented Latin characters to the Extended ASCII characters that PS and EPS expect.
    Unicode characters that cannot be converted to Extended ASCII characters in Latin1 encoding are replaced with question marks.
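
    The "??" in the PS file is consistent with that rule. As a rough illustration only (this is Python's own Latin-1 codec, not Stata's export code, and `to_latin1` is a hypothetical helper name), a conversion with replacement behaves like this:

```python
# Illustration only: mimics a Latin-1 conversion with replacement.
# This is not Stata's PS/EPS export code.

# Greek small beta followed by a combining tilde, as in the marginsplot title.
title = "\u03B2\u0303"


def to_latin1(text):
    """Encode to Latin-1, replacing each unrepresentable code point with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


print(to_latin1(title))        # neither code point exists in Latin-1
print(to_latin1("caf\u00e9"))  # an accented Latin letter survives the conversion
```

    The title consists of two code points, neither of which exists in Latin-1, so each becomes its own question mark, while an accented Latin letter such as é passes through unchanged.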

    Stata 13 could only export PDF files on Windows and Mac. Stata 14 can now export PDF files on all platforms, including both X-Windows and console under Linux. Unicode is also supported when exporting to PDF on all platforms in Stata 14. There are, however, some differences between platforms, which I will try to shed some light on below.

    To export a PDF on all platforms:

    Code:
    . sysuse auto
    . scatter mpg weight
    . graph export xyz.pdf
    A quick note: Dimitriy V. Masterov used the -mag()- option when exporting a PDF. This has an adverse effect on Unix and Windows and probably should not be allowed. We will look into this further.

    Here is the platform-dependent information that I promised...

    Stata on Mac uses Apple's built-in PDF engine, which is very full-featured. In most cases the exported PDF file should look like what you see on the screen. I will not talk about the Mac (GUI version) any further.

    Stata on Windows, Unix, and Mac (console) uses a PDF library to export a PDF. Exporting Unicode to a PDF is dependent upon embedding the correct font in the PDF so that the Unicode characters may be viewed correctly by a PDF viewer at a later time.

    Stata for Windows and Stata for Unix (GUI) automatically embed the font used by the Graph window as long as it is a TrueType font. Typically this should yield a reasonable result, although you are not guaranteed to get exactly what you see on the screen. Here is why:

    Modern operating systems use font substitution. When a character needs to be rendered to the screen, there may not be a glyph which represents that character in the currently selected font. When this happens, another font is secretly chosen so the character can be rendered to the screen.
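
    A toy model of that fallback may make the consequence for embedding clearer. Everything here is hypothetical: the font names, the coverage sets, and the helper functions are made up for illustration and do not reflect how any operating system or Stata actually picks fonts.

```python
# Toy model of font substitution. Font names and coverage sets are hypothetical.
FONT_COVERAGE = {
    "PrimaryFont": {"a", "b", "c", "\u03b2"},
    "FallbackFont": {"\u4e2d", "\u6587"},
}


def pick_font(ch, preferred, fallbacks):
    """Return the first font in the chain that has a glyph for ch, else None."""
    for name in [preferred, *fallbacks]:
        if ch in FONT_COVERAGE[name]:
            return name
    return None


def fonts_needed(text, preferred, fallbacks):
    """Map each character to the font that would end up rendering it."""
    return {ch: pick_font(ch, preferred, fallbacks) for ch in text}


print(fonts_needed("ab\u4e2d", "PrimaryFont", ["FallbackFont"]))
```

    In this sketch the third character is drawn from the fallback font even though only the primary font was selected, so a PDF that embeds just the selected font would lack a glyph for it.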

    Another twist is that Unix systems use a lot of Type 1 fonts. We are not currently able to embed Type 1 fonts. If you choose a Type 1 font for the screen, your characters might look a little different in the exported PDF. We hope to add Type 1 support in the future.

    Sometimes it may be necessary to have multiple TrueType fonts embedded in a PDF, for example, if you have a graph that has multiple languages that are not represented by just one font. Stata provides a facility in the -translator set- command for handling that situation. Here is an example:

    Code:
    . translator set Graph2pdf addfonts "Batang, SimSun"
    Unix and Mac console do not have a graph window so there is not any TrueType font embedded by default. If you need Unicode-PDF support in the console version, you must use -translator set gph2pdf addfonts-. That is similar to the setting shown above used to tell Stata which fonts to embed. Note that for console, -gph2pdf- is used instead of -Graph2pdf-. Unfortunately -gph2pdf- is not working at the time of this post, but we will be releasing an update shortly to address the issue.

    Dimitriy V. Masterov may run into trouble with the particular example he has chosen...

    Code:
    . marginsplot, title(`=ustrunescape("\u03B2\u0303")')
    ... because the placement of the tilde right above the Beta symbol depends on the combination of the font and the layout engine (screen or PDF) being used, i.e., a different font and/or a different display engine may not place the tilde at the same spot. Our experiments show that the default font on Windows (Arial) does a good job, but on Linux the tilde was usually rendered overlapping with the Beta symbol when in PDF form. If Dimitriy has access to Arial on his Linux system he may have better luck.
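
    One way to see why placement is left to the font and layout engine: the title is a base letter plus a separate combining mark, and Unicode has no single precomposed character for the pair. The sketch below uses Python's standard `unicodedata` module; `describe` is a hypothetical helper name, and the comparison with "n" plus tilde is added purely for contrast.

```python
import unicodedata


def describe(text):
    """Return (code point, name, combining class) per code point after NFC."""
    composed = unicodedata.normalize("NFC", text)
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.combining(c))
        for c in composed
    ]


beta_tilde = "\u03B2\u0303"  # the string used in the marginsplot title
n_tilde = "n\u0303"          # for contrast: this pair has a precomposed form

for row in describe(beta_tilde):
    print(row)
for row in describe(n_tilde):
    print(row)
```

    After normalization the beta and the tilde remain two code points, so the renderer has to position the mark itself, whereas "n" plus tilde collapses to a single character whose shape is drawn as one glyph.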

    To summarize, Stata 14 has fuller PDF export support across all platforms, including the ability to embed fonts supporting Unicode. In some cases, you need to tell Stata which fonts to embed, particularly if you are using Type 1 fonts on Unix GUI rather than TrueType fonts, or if you are using Unix console or Mac console. Windows and Mac GUI versions in general will choose the appropriate font(s) to embed in PDF files based on the fonts used in the Graph window.
    Last edited by James Hassell (StataCorp); 04 May 2015, 13:27.


    • #3
      What is the best way to import a graph exported as PDF into MS Word? Take the example below.
      Code:
      sysuse auto
      scatter mpg weight
      graph export test.pdf
      graph export test.eps
      I have Windows 7 and imported the EPS file into Word 2010 with Insert - Picture and the PDF file with Insert - Object - Adobe Acrobat Document. Unlike the high-quality EPS graph, the PDF graph looks highly pixelated and blurry when printed and is unusable for a publication.
      Last edited by Friedrich Huebler; 04 May 2015, 15:45.


      • #4
        The eps and pdf routes seem to have the same issues on Linux.

        Also, the documentation should be amended. -help graph export- reads "pdf is available only for Stata for Windows and Stata for Mac."


        • #5
          The attached pdf was obtained on a Linux machine (CentOS 5) by first changing the default graph font to Arial (Edit->Preferences->Graph->Font), then running the following:

          Code:
          sysuse auto, clear
          reg price mpg
          margins, dydx(*)
          marginsplot, title(`=ustrunescape("\u03B2\u0303")')
          graph export beta.pdf, replace
          Attached Files


          • #6
            This worked very nicely!

            I had to install the Arial font first on Ubuntu with

            Code:
             apt-get install ttf-mscorefonts-installer
