Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to stop Stata from splitting cell that span across multiple columns when importing tables from excel & csv?

    I am trying to import a table from Excel into Stata, but it always split the cells that span across multiple columns. For example, my data looks like this in Excel
    55 a1 0 0
    56 a2 b2 0 1
    57 a3 1 0
    but after importing it to Stata, it becomes this
    55 a1 0 0
    56 a2 b2 0 1
    57 a3 1 0
    After importing, Stata split the cells that span across multiple columns. Only the first columns contain the original value and other columns are converted to missing. Therefore I can't tell the difference between row 1 and 3 and make my data cleaning task harder.

    I know Stata doesn't support individual cells that span across columns but is there a way to let Stata leave some marks when splitting those cells, such as letting other columns contain extended missing values like .s? Couldn't find any solution online
    Last edited by Avery Chen; 09 Aug 2021, 13:21.

  • #2
    What you show looks the same to me.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      What you show looks the same to me.
      In table 1, the second cell of the first row span across two columns, and there are only 3 cells in the first row.

      In table 2, that cell is split into two cells, one contains the original value and one empty. So there are 4 cells in the first row.

      Comment


      • #4
        I see now, thanks, but I still don’t understand what you expect or want Stata to do. Nothing in Stata corresponds to what you show as the original.

        Comment


        • #5
          Originally posted by Nick Cox View Post
          I see now, thanks, but I still don’t understand what you expect or want Stata to do. Nothing in Stata corresponds to what you show as the original.
          Yeah, after posting the thread I realize Stata doesn't support that, but can't change the title or delete and repost. Despite that, I think it's possible for Stata to give the newly split cells the same values as the original one, or extended missing values like .s so I can deal with them later? There isn't any options in "import" to do that so I don't know how

          For example, the original table is
          55 a1 0 0
          ... ... ... ... ...
          and I want to convert it into
          55 a1 a1 0 0
          or
          55 a1 .s 0 0
          . Rather than
          55 a1 0 0
          Thx!

          Comment


          • #6
            I think you're at a loss here. Even Excel cannot accomplish what you want when it's asked to save a worksheet as a CSV file. Excel does exactly what Stata does, placing the value from a merged cell in the leftmost of the cells, and leaving the rightmost cell empty.

            Casual exploration of Excel did not uncover any functions that would, when given the address of the rightmost cell of a merged cell, return the value present in the "merged cell". So even working purely in Excel, it seems impossible to automate the task of distinguishing merged cells from individual cells and replicating the value across unmerged cells. I wouldn't expect Stata to attempt to solve a problem of poor data representation that even Excel itself apparently cannot solve.

            Comment


            • #7
              Without resorting to VBA, this method should allow you to at least mark and unmerge them in Excel, and then you can sort them accordingly and revise the second column.

              1) Follow this link to first search and highlight all merged cells: https://support.microsoft.com/en-us/...d-b3f233daca2c
              2) Once they are highlight, first change those cells' font color so that it's different than the rest. Then, go to Home and under merge & center, choose Unmerge Cells.
              3) Now they should all be unmerged, but you'll need to duplicate the column B content into column C. Follow this link to sort column B by color: https://support.microsoft.com/en-us/...e-2e545c4a4654. Then you can copy and paste the contents over more easily.

              Save a copy, and try to import that into Stata again.

              And vow to never merge cell in Excel again. It may look nice for presentation, but never good for the data.

              Comment

              Working...
              X