  • Importing csv file

    Hello, I am very new to Stata and I have a problem importing a CSV file.
    The file contains a dataset of approximately 100,000 values in 20 variables. The values are separated by semicolons, which should allocate them to the different variables. With the command "import delimited csv.txt, delimiter(";")", all but 15 values are imported correctly. For these 15 values the semicolon is not treated as a separator; instead, all the information, including the semicolons, is imported into one variable.
    For example, a;b;c does not become | a | b | c |, but | a;b;c |  |  |.
    It would be great if somebody had an idea how to correct this. Thank you very much.

  • #2
    Without seeing a sample of the actual data in the text file and exactly what Stata did with it, I can't be sure. But I have a guess about what's going on here. My guess is that the a;b;c that comes in as a single variable is, in fact, a single variable that is a string enclosed within quotes in the text file. When import delimited encounters a delimiter character that is within quotes, it treats it as a literal character, not as a delimiter.
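
    To make that guess concrete, here is a minimal sketch that builds a hypothetical two-line file (the variable names x, y, z and the contents are invented for illustration, not taken from the original data) in which one field is wrapped in double quotes, and then imports it the same way as in the question:

    ```stata
    * Hypothetical demo file: a header line plus one line with a quoted field
    tempfile demo
    tempname fh
    file open `fh' using "`demo'", write text replace
    file write `fh' "x;y;z" _n
    * Compound quotes write the literal text "a;b;c";; to the file
    file write `fh' `""a;b;c";;"' _n
    file close `fh'

    * Same import as in the question, with the first row read as variable names
    import delimited using "`demo'", delimiter(";") varnames(1) clear
    list
    ```

    If the guess is right, the semicolons inside the quotes should end up in the first variable rather than splitting the line into three fields.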

    Depending on what the entire data set looks like there are a few ways around this:

    1. Use -filefilter- to eliminate all of the quotes from the text file before importing it. This is probably the simplest approach, provided that there are no semicolons in the text file that really should be imported as literal semicolon characters, and provided that the quotation marks themselves are not needed in the imported data.

    2. If #1 isn't suitable, you can just import the data the way you have already done, and then use the -split- command to break up the a;b;c variable into three separate variables.
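
    The two options above could be sketched roughly as follows. This assumes the quotes are ordinary double quotes; csv_noquotes.txt is a hypothetical name for the filtered copy, and v1 is an assumed name for the string variable that received a;b;c, so substitute whatever the variable is called in your data:

    ```stata
    * Option 1: write a copy of the file with all double quotes removed, then import it.
    * In -filefilter- patterns, \Q stands for the double-quote character.
    filefilter csv.txt csv_noquotes.txt, from("\Q") to("") replace
    import delimited using csv_noquotes.txt, delimiter(";") clear

    * Option 2: import as before, then break up the affected string variable.
    * v1 is an assumed variable name; part1, part2, ... are the new variables.
    import delimited using csv.txt, delimiter(";") clear
    split v1, parse(";") generate(part)
    ```

    With option 2, -split- only creates new variables (part1, part2, and so on); for the 15 affected observations you would presumably still need to copy those pieces into the variables where they belong.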


    • #3
      Thank you very much, the split command worked very well.