Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dropping missing observations under several conditions

    Hej Statalist,

    I have a problem with dropping some observations. I analyse a trade data set and have a lot of missing data for the trade value, as I have observations for every combination of reporter partner commodity and year. My data consists of 686,760 observation and 664,327 of these are missing. In the displayed data set for all observations trade is zero, furthermore I only have two partner countries (BEN, GIN), but all other countries as reporters.
    Code:
    year reporter partner commodity                                        tradevalueus
    1992 "VNM" "BEN" "Albuminoids, modified starches, glues, enzymes"         .
    1992 "CYP" "BEN" "Wool, animal hair, horsehair yarn and fabric thereof"   .
    1992 "NOR" "BEN" "Glass and glassware"                                    .
    1992 "MAR" "BEN" "Furskins and artificial fur, manufactures thereof"      .
    1992 "GIN" "BEN" "Optical, photo, technical, medical, etc apparatus"      .
    1992 "MAC" "BEN" "Manufactures of plaiting material, basketwork, etc."    .
    1992 "BEN" "BEN" "Residues, wastes of food industry, animal fodder"       .
    1992 "BLR" "BEN" "Residues, wastes of food industry, animal fodder"       .
    1992 "BEN" "BEN" "Nickel and articles thereof"                            .
    1992 "COL" "BEN" "Fertilizers"                                            .
    1992 "ITA" "BEN" "Other made textile articles, sets, worn clothing etc"   .
    1992 "MOZ" "BEN" "Arms and ammunition, parts and accessories thereof"     .
    1992 "MOZ" "BEN" "Copper and articles thereof"                            .
    1992 "MOZ" "BEN" "Cork and articles of cork"                              .
    1992 "MSR" "BEN" "Miscellaneous edible preparations"                      .
    1992 "NER" "BEN" "Nuclear reactors, boilers, machinery, etc"              .
    1992 "KHM" "BEN" "Oil seed, oleagic fruits, grain, seed, fruit, etc, ne"  .
    1992 "GUY" "BEN" "Tobacco and manufactured tobacco substitutes"           .
    1992 "SWE" "BEN" "Tanning, dyeing extracts, tannins, derivs,pigments et"  .
    1992 "BMU" "BEN" "Fish, crustaceans, molluscs, aquatic invertebrates ne"  .
    1992 "MKD" "BEN" "Meat and edible meat offal"                             .
    1992 "SYC" "BEN" "Fish, crustaceans, molluscs, aquatic invertebrates ne"  .
    1992 "BMU" "BEN" "Meat, fish and seafood food preparations nes"           .
    1992 "HUN" "BEN" "Products of animal origin, nes"                         .
    1992 "PSE" "BEN" "Ores, slag and ash"                                     .
    1992 "SRB" "BEN" "Carpets and other textile floor coverings"              .
    1992 "AUT" "BEN" "Inorganic chemicals, precious metal compound, isotope"  .
    1992 "LKA" "BEN" "Cork and articles of cork"                              .
    1992 "CPV" "BEN" "Vegetable, fruit, nut, etc food preparations"           .
    1992 "MKD" "BEN" "Copper and articles thereof"                            .
    1992 "AUS" "BEN" "Nickel and articles thereof"                            .
    1992 "LBY" "BEN" "Plastics and articles thereof"                          .
    1992 "COG" "BEN" "Mineral fuels, oils, distillation products, etc"        .
    1992 "BRA" "BEN" "Live animals"                                           .           .
    I want to drop all the observations where there was no trade in a commodity between the partner and reporter in the last 5 years.
    As I googled, people suggested using foreach, but only for one condition, I couldn't figure out how to do it under several coniditions.
    So I would be really pleased if someone could help me out. Thank you very much in advance for your help.

    Martin

  • #2
    So there are two steps to this. The first is to calculate for each observation in your data set the maximum tradevalue for all observations with the same reporter, partner, and commodity and a year within the last five years. Then if that is zero or missing value, the observation should be dropped.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year str3(reporter partner) str53 commodity byte tradevalueus
    1992 "VNM" "BEN" "Albuminoids, modified starches, glues, enzymes"        .
    1992 "CYP" "BEN" "Wool, animal hair, horsehair yarn and fabric thereof"  .
    1992 "NOR" "BEN" "Glass and glassware"                                   .
    1992 "MAR" "BEN" "Furskins and artificial fur, manufactures thereof"     .
    1992 "GIN" "BEN" "Optical, photo, technical, medical, etc apparatus"     .
    1992 "MAC" "BEN" "Manufactures of plaiting material, basketwork, etc."   .
    1992 "BEN" "BEN" "Residues, wastes of food industry, animal fodder"      .
    1992 "BLR" "BEN" "Residues, wastes of food industry, animal fodder"      .
    1992 "BEN" "BEN" "Nickel and articles thereof"                           .
    1992 "COL" "BEN" "Fertilizers"                                           .
    1992 "ITA" "BEN" "Other made textile articles, sets, worn clothing etc"  .
    1992 "MOZ" "BEN" "Arms and ammunition, parts and accessories thereof"    .
    1992 "MOZ" "BEN" "Copper and articles thereof"                           .
    1992 "MOZ" "BEN" "Cork and articles of cork"                             .
    1992 "MSR" "BEN" "Miscellaneous edible preparations"                     .
    1992 "NER" "BEN" "Nuclear reactors, boilers, machinery, etc"             .
    1992 "KHM" "BEN" "Oil seed, oleagic fruits, grain, seed, fruit, etc, ne" .
    1992 "GUY" "BEN" "Tobacco and manufactured tobacco substitutes"          .
    1992 "SWE" "BEN" "Tanning, dyeing extracts, tannins, derivs,pigments et" .
    1992 "BMU" "BEN" "Fish, crustaceans, molluscs, aquatic invertebrates ne" .
    1992 "MKD" "BEN" "Meat and edible meat offal"                            .
    1992 "SYC" "BEN" "Fish, crustaceans, molluscs, aquatic invertebrates ne" .
    1992 "BMU" "BEN" "Meat, fish and seafood food preparations nes"          .
    1992 "HUN" "BEN" "Products of animal origin, nes"                        .
    1992 "PSE" "BEN" "Ores, slag and ash"                                    .
    1992 "SRB" "BEN" "Carpets and other textile floor coverings"             .
    1992 "AUT" "BEN" "Inorganic chemicals, precious metal compound, isotope" .
    1992 "LKA" "BEN" "Cork and articles of cork"                             .
    1992 "CPV" "BEN" "Vegetable, fruit, nut, etc food preparations"          .
    1992 "MKD" "BEN" "Copper and articles thereof"                           .
    1992 "AUS" "BEN" "Nickel and articles thereof"                           .
    1992 "LBY" "BEN" "Plastics and articles thereof"                         .
    1992 "COG" "BEN" "Mineral fuels, oils, distillation products, etc"       .
    1992 "BRA" "BEN" "Live animals"                                          .
    end
    
    rangestat (max) tradevalueus, by(partner reporter commodity) interval(year -4 0)
    drop if missing(tradevalueus_max) | tradevalueus_max == 0
    To use this code you need to install the -rangestat- command, written by Robert Picard, Nick Cox, and Roberto Ferrer, and available from SSC.

    In this code, I have assumed that "the last five years" means the current year and the four years preceding. If you meant the five preceding years (excluding the current year) change -4 0 to -5 -1.

    In your example data, we cannot really illustrate the calculation because none of the observations have any trade with anybody shown, so every observation gets dropped. Presumably in your real data set, this is not the case. In the future, when posting example data, try to show an example that will better illustrate the desired results, and, in a situation like this, would include some that would be dropped and some that would not.

    Also, when posting example data, please use the -dataex- command, as I have done in this response. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Hej Clyde,

      Thanks for the reply and for the code. It works perfectly. Thank you very much.

      Comment

      Working...
      X