
  • Select a certain number of variables from .dat file

    Hello everyone,

    I have a big .dat file that makes my laptop freeze every time I try to open it. Since I am not going to use all the variables in the dataset for my analysis, is there a way to pre-select only the ones I need? Here is my thinking. Assume that I want to pre-select variables x, y, and z, and that the .dat file is named Big_dataset:
    Code:
    local varstokeep x y z                       // variables to keep
    use "Big_dataset"
    if ("`varstokeep'" != "") {
        local tokeep                             // start with an empty keep list
        foreach var of local varstokeep {        // loop over all selected variables
            capture confirm variable `var'       // check whether variable exists in current wave
            if (!_rc) local tokeep `tokeep' `var'    // mark for inclusion if variable exists
        }
        keep `tokeep'                            // keep selected variables
    }
    Am I on the right track? Is there a quicker way?

    Thank you for your help.

  • #2
    I don't believe your code will help much, since all the data is read into memory before the excess variables are dropped. However, -use- allows a variable list:

    Code:
    use x y z using Big_dataset
    will do what you want. -use- also allows selection by row with the -if- qualifier. There is some further advice on this topic at https://www.nber.org/stata/efficient...selection.html
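
To illustrate the row selection mentioned above, here is a minimal sketch using the auto dataset that ships with Stata. The temporary file and the condition foreign == 1 are only for illustration and do not come from the posts in this thread. foreign is included in the variable list so that the condition refers to a variable that is actually loaded.

```stata
* Save a copy of the shipped auto data so there is a file to read from.
sysuse auto, clear
tempfile autofile
save "`autofile'"

* Read only four variables, and only the rows where foreign == 1.
use make price mpg foreign if foreign == 1 using "`autofile'", clear
describe, short
count
```

With the file from post #1, the same pattern would look something like -use x y z if x > 0 using Big_dataset-, where x > 0 is a placeholder condition.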
    Last edited by Daniel Feenberg; 10 Oct 2022, 07:45.



    • #3
      Implicit in your code is the idea that not all variables are present in every wave. The following code applies the concepts from post #2 to the problem this presents.
      Code:
      local vartokeep make price mpg speed
      describe using "~/Downloads/autos", short varlist
      local allvar `r(varlist)'
      local keepvar : list allvar & vartokeep
      macro list _vartokeep _allvar _keepvar
      use `keepvar' using "~/Downloads/autos", clear
      describe
      Code:
      . local vartokeep make price mpg speed
      
      . describe using "~/Downloads/autos", short varlist
      
      Contains data                                 1978 automobile data
       Observations:            74                  10 Oct 2022 10:06
          Variables:            12                  
      Sorted by: foreign  
      
      . local allvar `r(varlist)'
      
      . local keepvar : list allvar & vartokeep
      
      . macro list _vartokeep _allvar _keepvar
      _vartokeep:     make price mpg speed
      _allvar:        make price mpg rep78 headroom trunk weight length turn displacement gear_ratio
                      foreign
      _keepvar:       make price mpg
      
      . use `keepvar' using "~/Downloads/autos", clear
      (1978 automobile data)
      
      . describe
      
      Contains data from ~/Downloads/autos.dta
       Observations:            74                  1978 automobile data
          Variables:             3                  10 Oct 2022 10:06
                                                    (_dta has notes)
      ------------------------------------------------------------------------------------------------
      Variable      Storage   Display    Value
          name         type    format    label      Variable label
      ------------------------------------------------------------------------------------------------
      make            str18   %-18s                 Make and model
      price           int     %8.0gc                Price
      mpg             int     %8.0g                 Mileage (mpg)
      ------------------------------------------------------------------------------------------------
      Sorted by: 
      
      .
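
One case the code above does not cover is a wave in which none of the requested variables exist. As far as I know, -use- with an empty variable list reads the whole file, which is exactly what the original poster wants to avoid. The following is a sketch of one possible guard, reusing the macro names and file path from this post; the macro notfound, the messages, and the choice of return code 111 are my own assumptions.

```stata
local vartokeep make price mpg speed
describe using "~/Downloads/autos", short varlist
local allvar `r(varlist)'

* Requested variables that exist in the file.
local keepvar : list allvar & vartokeep
* Requested variables that are not in the file (hypothetical helper macro).
local notfound : list vartokeep - allvar

if "`notfound'" != "" display as text "not in file: `notfound'"

* Stop rather than load everything when nothing matched.
if "`keepvar'" == "" {
    display as error "none of the requested variables are in the file"
    exit 111
}

use `keepvar' using "~/Downloads/autos", clear
```

Run from a do-file, this stops with an error when nothing matches instead of silently loading all variables.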
