
  • Explicit subscripting []

    The more I use Stata, the more I realize that I don't understand explicit subscripting when the subscript contains variables.
    For example, the following code is very simple: it would generate the first difference (between adjacent, not necessarily consecutive times), for each person, in ln_wage:

    Code:
    webuse nlswork, clear
    bysort idcode: gen d_lnwage = ln_wage[_n] - ln_wage[_n-1]
    However, I sometimes see code such as:
    Code:
    by serial: replace famunt2=famunit[HHhead]
    And I don't understand what it's supposed to even do. The help file for subscripting doesn't contain any examples or explanations for something like this. Any guidance or explanation would be extremely helpful.

  • #2
    There's no example here, but first Stata will evaluate the subscript expression. Just as

    Code:
    _n - 1
    is evaluated as the current observation number MINUS 1, so also

    Code:
    HHhead
    is evaluated as the value of HHhead in the present observation. But then by: implies that (e.g.) 1 will mean the first observation in the by: group (here, by context, a set of observations for a given family). So, you can work it out that with data like

    Code:
    clear 
    input serial HHhead 
    1       1
    1       1   
    1       1  
    2       1  
    2       1   
    2       1
    2       1   
    end
    famunit[HHhead] becomes famunit[1], which under by serial: will mean a cross-reference to observations 1 and 4 for serial 1 and 2 respectively, as they are the first observations within each group.
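
    To see the same logic when the head is not the first person, here is a minimal sketch. The variables pernum and famunit and all their values are made up for illustration, and generate is used in place of replace only because famunit2 does not exist yet in this toy dataset.

```stata
clear
* Hypothetical data: pernum fixes the within-family order,
* HHhead holds the within-family position of the household head
input serial pernum HHhead famunit
1 1 2 10
1 2 2 20
1 3 2 30
2 1 1 40
2 2 1 50
2 3 1 60
2 4 1 70
end

* Fix the order within each family so that subscripts are well defined
sort serial pernum

* Under by:, famunit[HHhead] is the value of famunit in observation
* number HHhead, counted within the current serial group
by serial: generate famunit2 = famunit[HHhead]

list, sepby(serial)
```

    Here famunit2 is 20 throughout serial 1 (the head is the second observation of that family) and 40 throughout serial 2 (the head is the first). If the subscript points outside the group, or is missing, the result is missing.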

    A teacher's tip: I only thought I understood by: fairly thoroughly once I'd written a paper explaining it!
