  • Import multiple XML files and search for specific text within them

    Hello,

    I downloaded the NSF awards made in 2015 from http://www.nsf.gov/awardsearch/download.jsp. The download is a zip file containing more than 9000 XML files, one per NSF award. My goal is to search each XML file and count how many awards meet a certain condition, e.g., the principal investigator is located in California or at a particular university. Can this be done in Stata? Could anyone please help me?

    I tried xmluse to load a single XML file, but got an error message saying "unrecognizable XML doctype."

    Many thanks!
    Ji
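
For readers with the same goal, here is a minimal sketch of the counting step done outside Stata, in Python, since xmluse rejects these files. It assumes the unzipped files sit in one folder and that the state is stored in an element named `StateCode` without an XML namespace; both the folder name and the tag name are assumptions, so open one file by hand to confirm them first.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_matching(folder, tag, value):
    """Count XML files in `folder` that contain an element `tag` whose text equals `value`."""
    matches = 0
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # iter(tag) walks the whole tree, so the element may sit at any depth
        if any((el.text or "").strip() == value for el in root.iter(tag)):
            matches += 1
    return matches
```

A call such as `count_matching("awards_2015", "StateCode", "CA")` would return the number of files with at least one matching element. The folder name `awards_2015` is hypothetical.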

  • #2
    You might need an intermediate step: convert each file to an Excel-type XML file, then use xmluse with the doctype(excel) option. But I'm a newb, so I probably shouldn't really comment on this.

    • #3
      Thanks for the suggestion. I am a newb too. Do you know how to convert them to Excel-type XML all at once? I have 9000 XML files in one folder.

      • #4
        This discussion on Statalist has information related to your problem, but no solution: http://www.statalist.org/forums/foru...le-xml-doctype. I am afraid the tool mentioned in post #21 of the thread won't help you because it can only convert one file at a time.

        • #5
          Thank you Friedrich! After reading that post, I found additional software that can do batch conversion. It costs a few dollars, though. Anyone interested can check here: http://batch-xml-to-csv-converter.en.informer.com/
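
As a free alternative to a paid converter, batch conversion to CSV can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `xml_folder_to_csv` is made up for this example, the tags you pass in (for instance `AwardID` and `StateCode`) must match what is actually in the files, and only the first occurrence of each tag per file is kept. The resulting CSV should then be readable in Stata with import delimited.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def xml_folder_to_csv(folder, out_csv, tags):
    """Write one CSV row per XML file, holding the first text found for each tag."""
    rows_written = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file"] + list(tags))  # header row
        for path in sorted(Path(folder).glob("*.xml")):
            try:
                root = ET.parse(path).getroot()
            except ET.ParseError:
                continue  # skip files that are not well-formed XML
            row = [path.name]
            for tag in tags:
                # first element with this tag anywhere in the tree, or None
                el = next(root.iter(tag), None)
                row.append((el.text or "").strip() if el is not None else "")
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

The function returns the number of files written, which makes it easy to check that none of the 9000 files were silently skipped as malformed.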

          • #6
            It's not clear from your post which software you are referring to. In any case, here are some words of caution about Software Informer: https://www.mywot.com/en/forum/6231-...e-informer-com
