
  • Problem with 'Destring'

    Dear all,
    I am trying to destring CAE_COD, but Stata reports "CAE_COD: contains nonnumeric characters; no replace", even though all the values appear to be numeric. What I want is the data without labels. Any ideas appreciated.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6 CAE_COD
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    "01100"
    end
    
    destring CAE_COD , replace
    CAE_COD: contains nonnumeric characters; no replace

  • #2
    With a default limit of the first hundred observations, -dataex- isn't reliably capable of showing the offending values, which appear to occur further on. In order to find them, try something like the following.
    Code:
    contract CAE_COD
    list if missing(real(CAE_COD)), noobs

    • #3
      Thank you for getting back to me.
      Code:
      . contract CAE_COD
      
      . 
      . list if missing(real(CAE_COD)), noobs
      
        +-----------------+
        | CAE_COD   _freq |
        |-----------------|
        | 19_205      254 |
        +-----------------+

      • #4
        So what does 19_205 mean, and why do you seek a numeric equivalent? It is perhaps more likely that you need to use encode.

        A non-destructive way to check what is problematic is to go -- with the original data


        Code:
        tab CAE_COD if missing(real(CAE_COD))
        Last edited by Nick Cox; 26 Jan 2023, 06:28.
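
        As a minimal sketch of the encode route mentioned above (the new variable name CAE_num is hypothetical, not from the thread): encode maps each distinct string value to an integer code and attaches the original strings as value labels, so a value such as 19_205 stays readable. Note that the integer codes are assigned in sorted order of the strings; they are not the numbers the strings spell out, so "01100" does not become 1100.
        Code:
        * Numeric copy of CAE_COD with the strings kept as value labels
        encode CAE_COD, generate(CAE_num)
        * Show which integer code was assigned to which string
        label list CAE_num
        * Tabulate the underlying integer codes without labels
        tab CAE_num, nolabel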

        • #5
          That solved it. Thank you, Nick and Joseph.

          • #6
            What about non-numeric variables? I would like to see their original values (without labels).

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str2 PAIS
            "AE"
            "AO"
            "AT"
            "AU"
            "BE"
            "BG"
            "BM"
            "BR"
            "BZ"
            "CA"
            "CH"
            "CN"
            "CV"
            "CW"
            "CY"
            "CZ"
            "DE"
            "DK"
            "EG"
            "ES"
            "FI"
            "FR"
            "GB"
            "GG"
            "GI"
            "GR"
            "HK"
            "HU"
            "IE"
            "IL"
            "IM"
            "IN"
            "IS"
            "IT"
            "JE"
            "JP"
            "KR"
            "KW"
            "KY"
            "LI"
            "LU"
            "MA"
            "MO"
            "MT"
            "MU"
            "MX"
            "MZ"
            "NG"
            "NL"
            "NO"
            "NZ"
            "PA"
            "PL"
            "PT"
            "SA"
            "SC"
            "SE"
            "SG"
            "SI"
            "SK"
            "ST"
            "TC"
            "TN"
            "TR"
            "TW"
            "US"
            "VG"
            "VI"
            "WS"
            "ZA"
            end

            • #7
              What about non-numeric variables do you want to know? If you need a numeric country identifier given a string variable, use encode.

              • #8
                They are labeled, i.e. 1-AE, 2-AO, ...
                I need them shown as they are: AE, AO, ... That is, I want to be able to write
                Code:
                g foreign_aff= 0
                replace foreign_aff= 1 if pais=='AE'
                
                NOT this code
                replace foreign_aff= 1 if pais==1

                • #9
                  g foreign_aff= 0
                  replace foreign_aff= 1 if pais=='AE'
                  Using the original string variables, not the -encode-d numeric version, the code you show will work correctly if you change the single quote characters (') to double quotes (").

                  Also, although it is common to see people coding
                  Code:
                  gen logical_variable = 0
                  replace logical_variable = 1 if logical_expression
                  in Stata it is quicker and more transparent to do this as
                  Code:
                  gen logical_variable = logical_expression
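
                  Applied to the example quoted at the top of this post, and assuming the original string variable PAIS from #6 rather than an encoded copy, a minimal sketch would be:
                  Code:
                  * 1 where PAIS is "AE", 0 otherwise, in a single step
                  gen foreign_aff = PAIS == "AE"
                  * Quick check of the resulting 0/1 counts
                  tab foreign_aff
                  The expression PAIS == "AE" evaluates to 1 when true and 0 when false, so no separate replace step is needed.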

                  • #10
                    Those look like country identifiers.

                    Another approach is:

                    Code:
                    egen cid = group(PAIS)

                    • #11
                      Originally posted by Clyde Schechter

                      Using the original string variables, not the -encode-d numeric version, the code you show will work correctly if you change the single quote characters (') to double quotes (").
                      It worked:
                      g foreign_aff= 0
                      replace foreign_aff=1 if PAIS=="PT"
                      replace foreign_aff=0 if PAIS~="PT"


                      Originally posted by Clyde Schechter
                      in Stata it is quicker and more transparent to do this as


                      gen logical_variable = logical_expression
                      I don't quite get it. Could you please adapt it to my code above? Thanks.

                      • #12
                        Originally posted by George Ford
                        Those look like country identifiers.
                        Actually, they are abbreviations. This code shows the labels, not the abbreviations I am looking for.
                        Code:
                        . egen cid = group(pais)
                        
                        
                        . list cid in 1/10
                        
                             +-----+
                             | cid |
                             |-----|
                          1. |  57 |
                          2. |  50 |
                          3. |  32 |
                          4. |  48 |
                          5. |  50 |
                             |-----|
                          6. |  48 |
                          7. |   7 |
                          8. |  26 |
                          9. |  50 |
                         10. | 169 |
                             +-----+

                        • #13
                          They aren't labels, they are numeric values. PAIS already gives you the identifier name, so it's not clear to me what you are trying to produce or need to do with the new variable.

                          list PAIS cid

                          • #14
                            You can assign labels to cid equal to PAIS
                            HTML Code:
                            https://stats.oarc.ucla.edu/stata/faq/how-do-i-assign-the-values-of-one-variable-as-the-value-labels-for-another-variable/
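
                            As a minimal sketch of one alternative to the approach at that link, assuming PAIS is the string variable from #6: the label option of egen's group() function attaches the values of PAIS as value labels on the new numeric variable. The name cid2 is hypothetical, chosen only to avoid clashing with the existing cid.
                            Code:
                            * Numeric group identifier that displays the PAIS abbreviations
                            egen cid2 = group(PAIS), label
                            * With labels shown, cid2 reads AE, AO, ...
                            list PAIS cid2 in 1/10
                            * With nolabel, the underlying integers are shown instead
                            list PAIS cid2 in 1/10, nolabel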

                            • #15
                              Originally posted by George Ford
                              They aren't labels, they are numeric values.
                              I don't like these numeric values; they always confuse me. When the values themselves are shown in the data, I can easily figure out which is which.
                              All I need is to create a dummy variable equal to 1 when pais is PT. (PAIS means "country" in Portuguese.)
