  • Problems using the command keep if

    I have a dataset with 150 countries. I would like to keep the data for only about 35 of them. Each country is identified by its isocode in the dataset.
    I tried the following command:

    keep if isocode = "AUT"

    This one worked. The problem is that I want to keep 34 other countries as well.
    I tried to use the operator & ( keep if isocode = "AUT" & "AUS" &......), but it didn't work.

    Does anyone know a better way to deal with this problem? I would be extremely thankful!

    Best
    Elio

  • #2
    Hello Elio,

    You may want to take a look at this FAQ: http://www.stata.com/support/faqs/da...-observations/

    Hopefully that helps.

    Best,


    Marcos


    • #3
      One way is:
      Code:
      clear
      set more off
      
      *----- example data -----
      
      input ///
      str3 country
      ABW
      AFG
      AGO
      GIB
      GIN
      GLP
      NLD
      NOR
      NPL
      end
      
      list
      
      *----- what you want -----
      
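      // -inlist()- accepts far more numeric arguments than string arguments, so encode first
      encode country, gen(country2)
      
      // define countries to keep (-encode- numbers the codes alphabetically: 1 = ABW, 5 = GIN, 6 = GLP)
      local tokeep "1, 5, 6"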
      
      // keep
      keep if inlist(country2, `tokeep')
      
      list
      You need the logical operator OR (|), not AND (&), and the original syntax is incorrect anyway. It should be more like:

      Code:
      keep if country == "ABW" | country == "GIN" | country == "GLP"
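      If the codes are compared as strings, inlist() can also be used without encode. This is a minimal sketch, assuming a string variable named country as in the example data; the codes are illustrative only. inlist() accepts fewer string arguments than numeric ones (see help inlist for the limits), so a long list may need several calls joined by |.

      ```stata
      clear
      input str3 country
      "ABW"
      "AFG"
      "GIN"
      "GLP"
      "NOR"
      end

      * keep observations whose code matches any of the listed strings
      keep if inlist(country, "ABW", "GIN", "GLP")

      list
      ```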
      You should:

      1. Read the FAQ carefully.

      2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

      3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

      4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.



      • #4
        Thank you very much for your help, Marcos Almeida and Roberto Ferrer. It worked out perfectly! I will definitely have a much closer look at the FAQ next time, sorry about that.
        Have a nice day!

        Best
        Elio



        • #5
          There are problems here at several levels. Roberto has focused on the most important: "or" logic is needed here, because "and" logic does not apply. Stata evaluates true-or-false conditions within observations, not across them, and a single observation cannot belong to two or more countries at once.
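          The difference can be checked on a toy dataset. This is only a sketch, with three made-up observations and a string variable named isocode as in the original question.

          ```stata
          clear
          input str3 isocode
          "AUT"
          "AUS"
          "BEL"
          end

          * "and": no single observation holds both codes, so nothing matches
          count if isocode == "AUT" & isocode == "AUS"

          * "or": every observation holding either code matches
          count if isocode == "AUT" | isocode == "AUS"
          ```

          The first count reports 0 and the second reports 2, so keep if with & would drop every observation.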

          Further,

          Code:
           
          keep if isocode = "AUT"
          would not have worked because it is illegal. The code should have included == not =. That in itself is trivial as most Stata users should be able to spot the typo, but an alarmingly large proportion of the threads here move more slowly than they should because posters do not post the exact code they used.

          Note that even something like

          Code:
           
          keep if isocode == "AUT" | "AUS"
          would not have worked as this is parsed as if

          Code:
           
          (isocode == "AUT") | "AUS"
          and the text "AUS" cannot be interpreted as numeric and so cannot be evaluated as true or false.




          • #6
            Good morning everybody,

            I'm happy to have found this thread rather than opening a new one. I may be able to resolve my question myself, but at the moment I'm struggling to find the answer.

            I have the following string variable named Indicator.

            dataex Indicator in 153/157

            input str213 Indicator
            "Domestic extraction (Metal ores) (Tonnes)"
            "Indicator name: Domestic extraction (Non-metallic minerals) Data Source: UNEP(http://unep.org/)"
            "Description: The amount of non-metallic minerals extracted from the natural environment used in the economy."
            "INDICATOR NAME (unit)"
            "Domestic extraction (Non-metallic minerals) (Tonnes)"
            end

            Now I want to keep only the observations that hold the actual variables and values. So I want to drop, for example, every observation containing "Indicator name" or "Description", since those only entered the dataset when the raw data were imported.

            How can I tell Stata to drop any observation where the value of the Indicator variable includes something like "description" or "indicator name"?

            Thank you so much for your support, and best regards
            Tobi



            • #7
              Perhaps

              Code:
              drop if strpos(lower(indicator), "indicator") | strpos(lower(indicator), "description")



              • #8
                Hey Nick,

                Thank you very much. In the end I typed:

                drop if strpos(lower(Indicator), "indicator") | strpos(lower(Indicator), "description")

                (With the variable name written in lowercase, Stata was not able to find it.)

                It worked perfectly.
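                Variable names in Stata are case-sensitive, which is why the name has to be typed as Indicator, while lower() only changes the contents being searched. Below is a small sketch of how the condition behaves. The three observations are shortened, illustrative versions of the data in #6, and flag is a hypothetical helper variable.

                ```stata
                clear
                input str80 Indicator
                "Domestic extraction (Metal ores) (Tonnes)"
                "Description: The amount of non-metallic minerals extracted."
                "INDICATOR NAME (unit)"
                end

                * strpos() returns the position of the substring, or 0 if it is absent;
                * any nonzero result counts as true in a logical expression
                generate byte flag = strpos(lower(Indicator), "indicator") | strpos(lower(Indicator), "description")

                list

                * drop the flagged header and description rows
                drop if flag
                ```

                Listing flag before dropping makes it easy to check that only the unwanted rows are marked.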

                Thanks and best regards
                Tobi
