Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Appending multiple data files that contain string and numeric data types

    Hello, I'm quite new to Stata and need help with some code. The idea is that I would like to create a panel data set from an excel workbook that contains numerous sheets (161 sheets). These sheets each contain company data from the Nigerian stock exchange and has identical headings of 19 variables (Date, Price, Market cap...e.t.c). I wish to, therefore, create a panel data set in Stata that combines all these companies.

    I have code that creates 161 data files, the problem comes when I wish to append all these data files into one, Stata gives the following error "variable Price is double in master but str17 in using data. You could specify append's force option to ignore this numeric/string mismatch. The using variable would then be treated as if it contained numeric missing value."

    I tried force appending but that was not a good idea, I, therefore, request help in modifying my code to convert the variables to numeric before appending.

    The excel file is quite large so here is a google drive link in case testing is required. https://drive.google.com/file/d/1JVY...ew?usp=sharing

    Your help would be greatly appreciated.

    Please find my code below:

    clear all
    set more off
    set excelxlsxlargefile on
    *================================================= ========
    *import all stocks data from each sheet using a loop
    *================================================= ========
    forvalues i=1/161 { //put the last number of the last sheet. here there were 3 only
    import excel using project_data.xls, firstrow sheet(`i') clear //import each sheet in turn
    tempfile john`i' //create a temporary file to save imported from each sheet
    save john`i', replace //save the temporary file
    }

    use "john1" , clear //create a temporary master file from temporary file above
    set more off

    forvalues i=2/161 { //replace 3 with the highest number in your sheet
    append using "john`i'"
    }

  • #2
    Sorry, but for the kinds of reasons explained in the Statalist FAQ I have no interest in grappling with a large Excel file.

    There can't be good advice for dealing with string and numeric mixes without knowing why they arise. Read in the first problematic worksheet by itself and show us the result of

    Code:
    tab Price if missing(real(Price))
    That might make it obvious how to destring Price in such instances.

    Comment


    • #3
      I have run the code and it gives an error: "type mismatch r(109);"


      To add more colour, I've attached an image showing the description of the variables.

      Click image for larger version

Name:	Des-Stata.JPG
Views:	1
Size:	45.6 KB
ID:	1521270




      Comment


      • #4
        Not what I asked. The dataset you are showing has Price as double. What I want to see are results for a dataset with Price as string,

        And although your image is readable, screenshots in general are deprecated as we do explain at https://www.statalist.org/forums/help#stata

        Your code has created files john1 to john163 (separate issue: note that the tempfile command did nothing useful as you didn't use the files created; the save command was quite separate). So when you go


        Code:
        use "john1" , clear //create a temporary master file from temporary file above
        set more off
        
        forvalues i=2/161 {
             append using "john`i'"
        }
        what is the first file that fails? If need be, use this so you can see where you got to.

        Code:
        use "john1" , clear //create a temporary master file from temporary file above
        set more off
        
        forvalues i=2/161 { 
             display `i' 
             append using "john`i'"
        }

        Comment


        • #5
          Okay, thank you. A colleague helped me find the problem, it was actually a problem with the dataset. Thank you for your help

          Comment

          Working...
          X