  • Drop command

    Clyde Schechter

    I would like to create a dataset from existing data that contains only individuals aged 16-65 who are employed, in the labor force, have income above zero, and have working hours above zero or missing.

    I have four variables: empstat, labforce, incwage, and uhrswork.

    empstat has the categories (1) N/A, (2) employed, (3) unemployed, and (4) not in labor force.

    labforce has the categories (1) N/A, (2) no, not in the labor force, and (3) in the labor force.

    Will the following code be right?

    Code:
    drop if age <16
    Code:
    drop if age >65

    Code:
    drop if empstat ==2| age <16
    Code:
    drop if empstat ==2| age >65
    Code:
    drop if empstat==0| age <16

    Code:
    drop if empstat==0| age >65


    Code:
    drop if empstat==3| age <16
    Code:
    drop if empstat==3| age >65
    *Dropping labour force data

    Code:
    drop if labforce ==1| age <16
    Code:
    drop if labforce ==1| age >65


    Code:
    drop if incwage ==0 |age <16
    Code:
    drop if incwage ==0 |age >65
    Code:
    drop if uhrswork==0 |age <16
    Code:
    drop if uhrswork==0 |age >65

    Clyde Schechter

  • #2
    No, this isn't right. You can simplify the logic by applying the inclusion criteria one step at a time, since every criterion must hold for an observation to be kept.

    For example
    Code:
    * Restrict to working ages (both bounds must hold, so use & rather than |)
    keep if age >= 16 & age <= 65
    * Further restrict to the employed
    keep if empstat == 2
    * Further restrict to those in the labour force
    keep if labforce == 3
    * Restrict to those with incomes above zero
    keep if incwage > 0 & !missing(incwage)
    * Restrict to those with positive or missing hours of work
    keep if uhrswork > 0 | missing(uhrswork)
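    (Technically, the condition on missing hours worked isn't necessary, since Stata treats every missing value as larger than any non-missing value, so a comparison such as x > 0 is already true when x is missing. That is also why the income condition has to exclude missing values explicitly. See a discussion about this at https://www.stata.com/support/faqs/data-management/logical-expressions-and-missing-values/)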
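
To make the point about missing values concrete, here is a minimal sketch using the auto dataset that ships with Stata, where rep78 is missing for 5 of the 74 cars. It is only an illustration of the comparison rule, not part of the original answer; the counts in the comments follow from those 5 missing values.

```stata
* Illustration only: how "> 0" treats missing values
sysuse auto, clear

* rep78 is missing in 5 of the 74 observations
count if missing(rep78)

* Missing values compare as larger than any number,
* so this counts all 74 observations, including the 5 missing ones
count if rep78 > 0

* Excluding missing values explicitly leaves the 69 observed values
count if rep78 > 0 & !missing(rep78)
```

So a bare "> 0" condition keeps missing values, which is what is wanted for hours worked here but not for income.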



    • #3
      Thanks Leonardo Guizzetti



      • #4
        The commands -drop- and -keep- are quite flexible.

        You can get the result you want in a single line.

        Please take a look at the example below:

        Code:
         . sysuse auto
        (1978 Automobile Data)
        
        . sum price mpg rep78
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
               price |         74    6165.257    2949.496       3291      15906
                 mpg |         74     21.2973    5.785503         12         41
               rep78 |         69    3.405797    .9899323          1          5
        
        . keep if price < 8000 & mpg > 16 & (rep78 ==3 | rep78==5)
        (43 observations deleted)
        
        . sum price mpg rep78
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
               price |         31    4628.355    764.2687       3291       6295
                 mpg |         31    23.54839    6.323705         17         41
               rep78 |         31    3.580645    .9228288          3          5
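
Applied to the question in #1, the same single-line approach might look like the sketch below. The toy data are made up for illustration, and the category codes (empstat == 2 for employed, labforce == 3 for in the labor force) are taken from the description in #1, so please check them against your own value labels. The /// line continuation works in a do-file, not in the Command window.

```stata
* Hypothetical toy data, for illustration only
clear
input age empstat labforce incwage uhrswork
30 2 3 25000 40
15 2 3  5000 10
70 2 3 30000 35
40 3 3     0  0
45 4 2     0  0
50 2 3 40000  .
28 2 3     . 40
35 2 3     0 20
end

* One keep statement: every criterion must hold at once, so join them with &
* inrange(age, 16, 65) is false when age is missing
keep if inrange(age, 16, 65) & empstat == 2 & labforce == 3 ///
    & incwage > 0 & !missing(incwage)                       ///
    & (uhrswork > 0 | missing(uhrswork))

list
```

Only the first row and the row with missing hours survive; the row with missing income is dropped because of the explicit !missing(incwage).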
        Hopefully that helps.
        Last edited by Marcos Almeida; 18 Mar 2019, 05:03.
        Best regards,

        Marcos
