  • Problem unzipping files in Stata: I/O error on write, could not perform unzip

    I apologize for any typos; English is not my mother tongue.

    I've been working on a loop to download and unzip files from a website, but I keep getting the same response when I run the commands:

    file
    603-Modulo01/ENAHO_Definici�nValoresMonetariosBasedeDato
    > s.pdf I/O error on write
    could not perform unzip

    program error: code follows on the same line as close brace
    r(198);

    In addition, the last two lines report a problem with a closing brace. I've checked the braces and they appear to be correctly placed, but the error still pops up (I would really appreciate it if someone could point out any mistake).
    I am using Stata/MP 14 (64-bit), and the code is as follows:

    #delimit;
    foreach i in 603 634{;
    cd "$directorio";
    mkdir "$directorio/`i'";
    cd "$directorio/`i'";

    #delimit;
    foreach j of numlist 1/5 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo0`j'.zip `i'-Modulo0`j'.zip ;
    unzipfile `i'-Modulo0`j'.zip, replace;
    erase `i'-Modulo0`j'.zip;
    };
    #delimit cr

    #delimit;
    foreach k of numlist 7/9 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo0`k'.zip `i'-Modulo0`k'.zip ;
    unzipfile `i'-Modulo0`k'.zip, replace;
    erase `i'-Modulo0`k'.zip;
    };
    #delimit cr

    #delimit;
    foreach y of numlist 10/13 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`y'.zip `i'-Modulo`y'.zip ;
    unzipfile `i'-Modulo`y'.zip, replace;
    erase `i'-Modulo`y'.zip;
    };
    #delimit cr

    #delimit;
    foreach x of numlist 15/18 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`x'.zip `i'-Modulo`x'.zip ;
    unzipfile `i'-Modulo`x'.zip, replace;
    erase `i'-Modulo`x'.zip;
    };
    #delimit cr

    #delimit;
    foreach p of numlist 22/28 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`p'.zip `i'-Modulo`p'.zip ;
    unzipfile `i'-Modulo`p'.zip, replace;
    erase `i'-Modulo`p'.zip;
    };
    #delimit cr

    #delimit;
    foreach t in 34 37 77 78 84 85 {;
    copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`t'.zip `i'-Modulo`t'.zip ;
    unzipfile `i'-Modulo`t'.zip, replace;
    erase `i'-Modulo`t'.zip;
    };
    #delimit cr
    };
    #delimit cr

    Thanks in advance for your responses and possible solutions.

  • #2
    For starters, consider this:
    1) All those -delimit- commands are unnecessary, and repeating them might cause problems. See below for a cleaned-up copy of your code without any of them. You might try running it and see what happens. Always putting a space before a "{" is a good idea, as I've done below. At the least, someone else here will be able to read the code more easily, and finding the error might be easier. Also, see the FAQ on nicer ways to display code on this forum.

    3) It's hard to tell where the problem is occurring. Take a look at -help set trace-, which would show exactly where the error is occurring.

    4) When trying to solve a problem like this, it's good to try something simpler first, see if that works, and then try something more complicated. I would try just one of the inner loops in your code and see if that works, and then start adding other ones. Can you download and unzip at least *one* file?

    5) I see some PDF file referenced above. I can't think offhand of why that would be relevant in a Stata program.

    6) The error on "write" suggests that you might be trying to copy a file to a directory where you don't have permission. You might check on that.

    Code:
    foreach i in 603 634 {
        cd "$directorio"
        mkdir "$directorio/`i'"
        cd "$directorio/`i'"
        foreach j of numlist 1/5  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo0`j'.zip `i'-Modulo0`j'.zip
            unzipfile `i'-Modulo0`j'.zip, replace
            erase `i'-Modulo0`j'.zip
        }
    
        foreach k of numlist 7/9  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo0`k'.zip `i'-Modulo0`k'.zip
            unzipfile `i'-Modulo0`k'.zip, replace
            erase `i'-Modulo0`k'.zip
        }
    
        foreach y of numlist 10/13  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`y'.zip `i'-Modulo`y'.zip
            unzipfile `i'-Modulo`y'.zip, replace
            erase `i'-Modulo`y'.zip
        }
    
        foreach x of numlist 15/18  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`x'.zip `i'-Modulo`x'.zip
            unzipfile `i'-Modulo`x'.zip, replace
            erase `i'-Modulo`x'.zip
        }
    
        foreach p of numlist 22/28  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`p'.zip `i'-Modulo`p'.zip
            unzipfile `i'-Modulo`p'.zip, replace
            erase `i'-Modulo`p'.zip
        }
    
        foreach t in 34 37 77 78 84 85  {
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`t'.zip `i'-Modulo`t'.zip
            unzipfile `i'-Modulo`t'.zip, replace
            erase `i'-Modulo`t'.zip
        }
    }
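
    The six inner loops differ only in the module numbers, so they could be folded into one loop. This is a minimal sketch, assuming the same $directorio global and URL pattern as above. The macro names m and mm are illustrative, and -capture mkdir- is an addition so that a rerun does not stop when the folder already exists. It only tidies the code; it is not expected to cure the unzip error by itself.

    ```stata
    foreach i in 603 634 {
        * ignore the error that mkdir raises if the folder already exists
        capture mkdir "$directorio/`i'"
        cd "$directorio/`i'"
        foreach m of numlist 1/5 7/13 15/18 22/28 34 37 77 78 84 85 {
            * zero-pad single-digit module numbers (1 becomes 01, 10 stays 10)
            local mm : display %02.0f `m'
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo`mm'.zip `i'-Modulo`mm'.zip, replace
            unzipfile `i'-Modulo`mm'.zip, replace
            erase `i'-Modulo`mm'.zip
        }
    }
    ```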


    • #3
      1. Thank you for your response and advice.
      3. I tried it; the error occurs only on that line (as originally posted).
      4. Each of the inner loops works fine when run separately.
      5. The PDF is one of the files inside the archive being unzipped.
      6. Each of the archives I've tried to unzip contains 5 or 6 files, of which only 1 or 2 are PDFs.

      I've also run the cleaned-up copy of the code without the -delimit- commands, but the same problem remains.


      • #4
        "... error occurs only on that line ... ."

        That's the question: Exactly which line is that? Perhaps you know, but it's not obvious to me, although it might be to someone else. If you don't know exactly which line it is, I have a simpler suggestion than -set trace on-: Just put a bunch of "display" lines into your code, i.e., something like -display "here1"- , -display "here2"-, ... etc. When you have used that to narrow down the problem part of the code, you can surround it with -set trace on- and -set trace off- to get a more detailed display of what's going on.
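
        For one of the inner loops, that could look roughly like the sketch below. The "here" labels are arbitrary markers, and -local i 603- simply fixes the outer loop value for the test. If -unzipfile- stops with an error, tracing stays on until you type -set trace off-.

        ```stata
        local i 603
        foreach j of numlist 1/5 {
            display "here1: downloading `i'-Modulo0`j'.zip"
            copy http://iinei.inei.gob.pe/iinei/srienaho/descarga/SPSS/`i'-Modulo0`j'.zip `i'-Modulo0`j'.zip, replace

            display "here2: unzipping `i'-Modulo0`j'.zip"
            * trace only the step that is suspected of failing
            set trace on
            unzipfile `i'-Modulo0`j'.zip, replace
            set trace off

            display "here3: erasing `i'-Modulo0`j'.zip"
            erase `i'-Modulo0`j'.zip
        }
        ```

        The last "here" label shown before the error message tells you which command failed and for which file.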


        • #5
          Following the advice

          4) When trying to solve a problem like this, it's good to try something simpler first, see if that works, and then try something more complicated.
          is important.

          You report
          file
          603-Modulo01/ENAHO_Definici�nValoresMonetariosBasedeDato
          > s.pdf I/O error on write
          could not perform unzip
          First, confirm that the archive can be opened with some other zip program;
          then test unzipping that archive on its own, without any looping:
          Code:
          set trace on
          
          unzipfile 603-Modulo01.zip 
          
          set trace off 
          And report exactly what was run and all results including error messages.



          The "�" in your filename may indicate an encoding issue, possibly inside the zip archive itself.

          Using Stata/MP 16.0 for Windows (64-bit x86-64) Revision 08 Jan 2020:
          Code:
          * use some external zip program 
          . 
          . shell "C:\Program Files\7-Zip\7z.exe" e 603-Modulo01.zip -oNOT_USING_Stata_ZIP
          
          . dir NOT_USING_Stata_ZIP\*
            <dir>   1/26/20 18:32  .                 
            <dir>   1/26/20 18:32  ..                
            <dir>   7/16/19 15:17  603-Modulo01      
           605.3k   4/23/18  0:54  CED-01-100 2017.pdf.pdf
           335.0k   4/23/18  0:34  CodigoConglomerado_6_digitos.pdf
             0.1k   4/23/18  0:40  ConglomeA6digitos.sps
          9692.0k   4/24/18 15:44  Diccionario_2017.pdf
            58.6M   7/11/19 23:58  Enaho01-2017-100.sav
            56.2k   4/23/18  0:28  ENAHO_DefiniciónValoresMonetariosBasedeDatos.pdf
            87.7k   4/23/18  0:32  ENAHO_Estratificación del Marco.pdf
           162.7k   5/26/18  2:10  FICHA TECNICA_PUNTOS GPS AÑO 2017.pdf
           893.7k   4/24/18 16:33  FichaTecnica_2017.pdf
          Run Stata unzipfile command:
          Code:
          . unzipfile 603-Modulo01.zip 
            ---------------------------------------------------------------------------- begin unzipfile ---
            - version 11.0
            - syntax anything(everything id=zipfile) [, replace]
            - gettoken ZipFileName rest : anything
            - if (`"`rest'"' != "") {
            = if (`""' != "") {
              di as error "invalid syntax"
              exit 198
              }
            - if (c(userversion) < 15.1) {
              if (`"`replace'"' != "") {
              local overwrite "overwrite"
              }
              mata : zipfile_cmd()
              }
            - else {
            - mata : unzipfile_cmd()
              inflating: 603-Modulo01/CED-01-100 2017.pdf.pdf
              inflating: 603-Modulo01/CodigoConglomerado_6_digitos.pdf
              inflating: 603-Modulo01/ConglomeA6digitos.sps
              inflating: 603-Modulo01/Diccionario_2017.pdf
              inflating: 603-Modulo01/Enaho01-2017-100.sav
          java.lang.reflect.InvocationTargetException
                  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
                  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
                  at java.base/java.lang.reflect.Method.invoke(Unknown Source)
                  at com.stata.Javacall.load(Javacall.java:130)
                  at com.stata.Javacall.load(Javacall.java:90)
          Caused by: java.lang.IllegalArgumentException: malformed input off : 759, length : 1
                  at java.base/java.lang.StringCoding.throwMalformed(Unknown Source)
                  at java.base/java.lang.StringCoding.decodeUTF8_0(Unknown Source)
                  at java.base/java.lang.StringCoding.newStringUTF8NoRepl(Unknown Source)
                  at java.base/java.lang.System$2.newStringUTF8NoRepl(Unknown Source)
                  at java.base/java.util.zip.ZipCoder$UTF8.toString(Unknown Source)
                  at java.base/java.util.zip.ZipFile.getZipEntry(Unknown Source)
                  at java.base/java.util.zip.ZipFile$ZipEntryIterator.next(Unknown Source)
                  at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(Unknown Source)
                  at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(Unknown Source)
                  at com.stata.plugins.zip.StUnzipfile.unzipfile(StUnzipfile.java:67)
                  ... 6 more
          Caused by: java.nio.charset.MalformedInputException: Input length = 1
                  ... 16 more
              }
            ------------------------------------------------------------------------------ end unzipfile ---
          r(5100);
          
          end of do-file
          
          r(5100);


          • #6
            The problem is caused by the accent marks ("tildes" in Spanish) in the names of the PDF documents.


            • #7
              The problem, if it still exists, is due to how the underlying Java zip library handles non-UTF-8 characters in filenames.
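
              If that is the cause, one possible workaround is to fall back to an external unzip program whenever -unzipfile- fails, since the 7-Zip run in #5 handled the same archive. This is a sketch only, assuming Windows with 7-Zip installed at the path used in #5; the macro name zip is illustrative.

              ```stata
              local zip 603-Modulo01.zip

              * try Stata's own command first, showing any error it produces
              capture noisily unzipfile `zip', replace

              if _rc {
                  display as error "unzipfile failed with r(" _rc "); trying 7-Zip instead"
                  * x = extract with full paths, -y = answer yes to overwrite prompts
                  shell "C:\Program Files\7-Zip\7z.exe" x `zip' -y
              }
              ```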


              • #8
                Hi,

                I found a solution using the -capture- command. Here is the code for the first part:

                Code:
                capture unzipfile "Modulo_`var'.zip"
                My code for downloading (only for "Modulo 1") is:

                Code:
                local mods "737 687 634 603 546 498"
                foreach var of local mods {
                    copy "http://iinei.inei.gob.pe/iinei/srienaho/descarga/STATA/`var'-Modulo01.zip" ///
                         "Modulo_`var'.zip", replace
                    capture unzipfile "Modulo_`var'.zip"
                    erase "Modulo_`var'.zip"
                }
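
                Keep in mind that -capture- only suppresses the error so the loop can continue; it does not make -unzipfile- extract anything more. In the trace in #5, extraction stopped at the first problematic file name, so some files may be missing from the output folder. A variant sketch that records which archives failed and keeps them for extraction with another tool (the macro name failed is illustrative):

                ```stata
                local mods "737 687 634 603 546 498"
                local failed
                foreach var of local mods {
                    copy "http://iinei.inei.gob.pe/iinei/srienaho/descarga/STATA/`var'-Modulo01.zip" ///
                         "Modulo_`var'.zip", replace
                    capture unzipfile "Modulo_`var'.zip", replace
                    if _rc {
                        * keep the archive and remember its code
                        local failed `failed' `var'
                    }
                    else {
                        erase "Modulo_`var'.zip"
                    }
                }
                display "archives that unzipfile could not fully extract: `failed'"
                ```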
                Good luck
                by Coper

