Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Import Excel problem with Excel 2016 (Version 1708) Stata 14.2

    I am not able to import an Excel file (in Stata 14.2) when it was saved with Excel 2016 (Version 1708 - Build 8431.2079). Does someone else experience the same behaviour?
    Stata's error message is: Unable to load file. Error: "Element not found"

    I use this code for import:
    Code:
    import excel "cordis\cordis-fp6projects.xlsx", sheet("projects") firstrow clear
    The data is downloaded from https://data.europa.eu/euodp/en/data...disfp6projects

  • #2
    I'm unable to reproduce your difficult with either Stata 14.2 or Stata 15.0. The only error message I get is a warning that there is no sheet in that file called "projects". Visual inspection of the file confirms that is true, but removing -sheet("projects")- from the command leads to uneventful loading of the file in both versions of Stata. (There is only one sheet in the file, but it's name isn't "projects".)

    Perhaps your version of the file has gotten corrupted somehow? Have you tried re-downloading the file and trying again?

    Comment


    • #3
      Hi Clyde, thanks for investigating it with me. Please excuse my sloppy description: I opened the original file in Excel 2016 - changed the sheet name and saved it.
      After this I get:

      import excel "cordis-fp6projects.xlsx", sheet("projects") firstrow
      file cordis-fp6projects.xlsx could not be loaded
      r(603);

      The error message reported in my first post appears when I use the GUI.

      Clyde, on a side note - but that is what brought me to this issue is - that I am not able to -destring- the variables totalCost ecMaxContribution. My guess is that the original file is already corrupt.
      The solution I have is to load the file into LibreOffice save it as csv - and import the csv-file into Excel. In this process I can convert both columns into integers and then save it to xlsx-file format again.

      And following this, Excel is not the problem - the original file is...

      PS: BTW, I am unable to load the original file into LibreOffice (5.4.2.1) ...

      Comment


      • #4
        As for the problem with -destring-, the problem is seen to lie with the data itself:

        Code:
        . tab totalCost if missing(real(totalCost))
        
            totalCost |      Freq.     Percent        Cum.
        --------------+-----------------------------------
        21 404 453,87 |          1      100.00      100.00
        --------------+-----------------------------------
                Total |          1      100.00
        
        . tab ecMaxContribution if missing(real(ecMaxContribution))
        
        ecMaxContribution |      Freq.     Percent        Cum.
        ------------------+-----------------------------------
              13 999 999, |          1      100.00      100.00
        ------------------+-----------------------------------
                    Total |          1      100.00
        Those observations are not valid string representations of numbers, both due to embedded blanks, and the comma. (If you are using comma as a decimal point then you need to specify the -dpcomma- option in -destring-; but the blanks will bite even so.)

        Comment


        • #5
          Hi Clyde,
          thanks a lot for sharing your code to identify problematic cases to -destring-. I was not aware of such an easy solution to figure it out.
          I looked at -charlist- (SSC) but could not identify the space as problematic.

          Comment


          • #6
            See also https://www.statalist.org/forums/for...iable-problems for a program pushing the missing(real()) idea further.

            Comment


            • #7
              Originally posted by Marc Kaulisch View Post
              I looked at -charlist- (SSC) but could not identify the space as problematic.
              charlist is most useful to identify characters that appear to be spaces (ASCII code 32) to the eye, but are actually something else. Here, the spaces are problematic per se, regardless of the characters that produce them. Look at destring's ignore() option to deal with this.

              Best
              Daniel

              Comment


              • #8
                Dear Nick, dear Daniel, thanks for your input.
                Nick, the -problemstring- programme helped me just a few seconds ago for the first time and hopefully it will do far more often.

                A solution that I use due to the decimal comma problem in the data is
                Code:
                ren ecContribution pro_costs
                split pro_costs, p(,) gen(ecContribution_)
                destring ecContribution_1, gen(ecContribution)
                It looks like that the -dpcomma- option and/or the excel import have problems when cases vary in their number of decimal places (some have one others two) - so I chose the split-command.

                Comment

                Working...
                X