Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Erase lines

    Hi all !

    I am working on this data base.

    Sometimes, there is no data, or a 0, or a "#N/A" in the columns "CrsRu" and "CrsOil".

    I would like to erase the lines, where there are a 0 or no data , or a "#N/A".
    I need the entire line to be erased, other ways it will be a problem for the date. Cause each value is at the same date, both CrsRu and CrsOil, so if one data alone disappear, without the other, it will bias it.

    Thank you for your time !!!!


    Attached Files

  • #2
    Hi Morad,

    In Excel you can do this with a simple IF function. Since this is a Stata forum, I will elaborate on how to do it in Stata.

    drop if missing(CrsRU)
    drop if missing(CrsOil)

    Hope it helps.

    Kind regards,

    Wesley

    Comment


    • #3
      In a previous post, you indicated that the following worked:

      Code:
      destring var1, ignore("#N/A") replace
      drop if var1==.
      Once you destring your variable, it is numeric. Therefore, there will be "." instead of blanks for missing values. The code drop if var1==. will remove those (for all variables, or "the entire line" in your words). To drop those with zeros, use the following after you destring:
      Code:
      drop if var1==0
      Stata/MP 14.1 (64-bit x86-64)
      Revision 19 May 2016
      Win 8.1

      Comment


      • #4
        Hello Morad,

        You may refer to "observations" and "variables" in Stata, instead of (Excel's) lines and columns, respectively.

        I suggest you try the - drop if - command so as to provide the solution you wish.

        Please keep in mind that the command for string variables should have the "values" under quotes.

        With regards to the dataset, please prefer to use .dta files (instead of Excel, for example), as recommended in the FAQ.

        Hopefgully that helps.

        Best,

        Marcos

        P.S: sorry for the somewhat redundant post. When I started writing a reply to #1, Carole's message wasn't yet published.
        Last edited by Marcos Almeida; 20 Apr 2016, 09:08.
        Best regards,

        Marcos

        Comment


        • #5
          Originally posted by Wesley Mokkink View Post

          In Excel you can do this with a simple IF function.
          Hi Wesley, yes I tried with IF function. But when it's about erasing the line, it requires VBA. And I don't know how to do it.

          Originally posted by Carole J. Wilson View Post
          In a previous post, you indicated that the following worked
          Hi Carole,
          It works but it doesn't erase both the #N/A and the other value.
          I need the entire line to disappear ! =)

          Comment


          • #6
            Originally posted by Marcos Almeida View Post
            I suggest you try the - drop if - command so as to provide the solution you wish.
            Hi Marcos,

            Maybe I did the wrong code, but I think this command only drop the #N/A and not the entire line ...

            Sorry for my English, I am French and not used to these technical words.

            Thanks for your time ! =)

            Comment


            • #7
              Hi Wesley, yes I tried with IF function. But when it's about erasing the line, it requires VBA. And I don't know how to do it.
              In this case, it does not require VBA. Insert the following function into Excel, drag it down, filter the column on "wrong", and delete the rows.

              HTML Code:
              =IFNA(C2,"wrong")

              Comment


              • #8
                Originally posted by Wesley Mokkink View Post

                In this case, it does not require VBA. Insert the following function into Excel, drag it down, filter the column on "wrong", and delete the rows.

                HTML Code:
                =IFNA(C2,"wrong")

                It doesn't works. It says there is an error in the function.

                Also, there is more than 4000 lines, if I have to do it by hand, best scenario it will take me a lot of time. Worst, I will forget Data ...

                I precise that this is only one data base, I have 17 like that to treat ... that's why I am posting here. I hope I will not need to do it all by hand .. :/

                Comment


                • #9
                  Morad Bali There are lots of ways in which you can do this. You could just simply import the Excel file into Stata. Stata will recognize the #N/A values as missing values. Then, performing
                  HTML Code:
                  drop if missing(varname)
                  , the rows which have a missing value are deleted.
                  ​If in the rare occasion Stata doesn't recognize the #N/A as missing, you can simply make them empty in Excel using the following IF function:
                  PHP Code:
                  If(C2="#N/A","",C2
                  Last edited by Wesley Mokkink; 20 Apr 2016, 09:26.

                  Comment


                  • #10
                    I have the following dataset (copy and paste the code into a new instance of Stata):

                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input str3(var1 var2)
                    "#NA" "#NA"
                    "1"   "6"  
                    "2"   "5"  
                    "3"   "."  
                    "."   "4"  
                    "0"   "1"  
                    "#NA" "3"  
                    end
                    It looks like this :

                    Code:
                     list
                    
                         +-------------+
                         | var1   var2 |
                         |-------------|
                      1. |  #NA    #NA |
                      2. |    1      6 |
                      3. |    2      5 |
                      4. |    3      . |
                      5. |    .      4 |
                         |-------------|
                      6. |    0      1 |
                      7. |  #NA      3 |
                         +-------------+

                    I destring variables var1 and var2 and look at the data:

                    Code:
                    destring var1 , replace ignore(#NA)
                    destring var2 , replace ignore(#NA)
                    
                    
                    . list
                    
                         +-------------+
                         | var1   var2 |
                         |-------------|
                      1. |    .      . |
                      2. |    1      6 |
                      3. |    2      5 |
                      4. |    3      . |
                      5. |    .      4 |
                         |-------------|
                      6. |    0      1 |
                      7. |    .      3 |
                         +-------------+
                    Now I drop if missing and look at the data:

                    Code:
                    drop if var1==.
                    
                    
                     list
                    
                         +-------------+
                         | var1   var2 |
                         |-------------|
                      1. |    1      6 |
                      2. |    2      5 |
                      3. |    3      . |
                      4. |    0      1 |
                         +-------------+
                    Indeed, the entire line is erased if i drop var1 (there are now only 4 observations instead of 7)

                    You can continue with the following to convince yourself that entire lines are being erased:

                    Code:
                    drop if var2==.
                    drop if var1==0
                    drop if var2==0
                    list

                    Stata/MP 14.1 (64-bit x86-64)
                    Revision 19 May 2016
                    Win 8.1

                    Comment


                    • #11
                      Hello Morad,

                      Perhaps you could try it again, after taking a look at the examples in the Stata Manual.

                      However, there are several ways to arrive to the same point.

                      Some lack elegance, but do the work. For example, cumbersome as it may, you can "sort" the specific variable by typing - sort var - , and (let's consider the "real" value for the variable is beyond 29) create an "id" - gen id = _n - , then - list id var if var < 29 - to confirm the "faulty" ids , and finally type - drop if id <= 120 - so as to get rid of the unwanted observations. With a bonus: an "id", which may be important, more so when dealing with large samples.
                      Best regards,

                      Marcos

                      Comment


                      • #12
                        Wesley Mokkink , Carole J. Wilson, Marcos Almeida :

                        Thank you very much for your time and comments !!!!!

                        It will all help me. I am recording it all and keep it very preciously. It's always nice to have several options.

                        Thank you again guys !!!

                        Comment

                        Working...
                        X