  • Replacing missing values with already existing match

    Hello All,

    I have a dataset where I have two variables, id number and ownership structure. ID number is the unique identifier. Each ID number has one corresponding ownership structure. The same ID repeats several times in the dataset, but in certain cases it does not have the corresponding ownership structure. I run the code
    Code:
    bysort ownershipstructure ( id_n ) : replace ownershipstructure = ownershipstructure [_n-1] if missing( ownershipstructure )
    but Stata reports either "0 real changes made" or "weights not allowed". I use the same command in other cases, and sometimes it runs fine. Please see my data extract below:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id_n str86 ownershipstructure
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 ""               
    1161 "publicly traded"
    1161 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1300 "publicly traded"
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 ""               
    1327 "publicly traded"
    1327 "publicly traded"
    1327 "publicly traded"
    1327 "publicly traded"
    1388 ""               
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1388 "publicly traded"
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    1704 ""               
    end

    I would highly appreciate it if you could advise me on how I can improve my command.

    Thank you in advance,
    Nick

  • #2
    Code:
    bysort id_n (ownershipstructure): replace ownershipstructure = ownershipstructure[_N] if missing(ownershipstructure)
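
    Copying the last value within each ID assumes that an ID never carries two different non-empty values. A minimal sketch of a check you could run before the replace, assuming the variable names from the dataex example above:

    ```stata
    * Within each ID, empty strings sort first and non-empty values last.
    * The assert stops with an error if any non-empty value in an ID
    * differs from the last value in that ID.
    bysort id_n (ownershipstructure): assert ownershipstructure == ownershipstructure[_N] if !missing(ownershipstructure)
    ```

    IDs with no non-empty value at all (1704 in the example) stay empty after the replace, because there is nothing to copy.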


    • #3
      Thank you very much! Could you please explain the difference, and when we need to use _n-1 and when _N? I'd highly appreciate it!


      • #4
        Originally posted by Nick Baradar
        Thank you very much! Could you please explain the difference, and when we need to use _n-1 and when _N? I'd highly appreciate it!
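        1. The first problem is the bysort prefix. bysort x (y) means "by x, sort y": the data are sorted on x and then y, but the groups are formed from x alone. Your command grouped on ownershipstructure, so all the empty values sat in a group of their own with nothing to copy from, which is why you saw "0 real changes made". The ID has to be the grouping variable, outside the parentheses.
        2. "_n-1" would work if the category were a numeric variable, because Stata sorts numeric missing values to the bottom. For strings, the empty string sorts to the top, so instead of the previous observation my suggested command uses the last observation within each ID, which is [_N].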
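
        To make the two cases concrete, here is a small sketch on made-up data. The variables own and own_num are hypothetical stand-ins for a string and a numeric version of the category.

        ```stata
        clear
        input long id_n str20 own byte own_num
        1161 ""                .
        1161 "publicly traded" 1
        1161 ""                .
        1704 ""                .
        1704 ""                .
        end

        * String: "" sorts first within each ID, so the non-empty value is last.
        bysort id_n (own): replace own = own[_N] if missing(own)

        * Numeric: . sorts last within each ID, so the non-missing value is first.
        * replace works through the observations in order, so each value that
        * has just been filled in is available to the next observation.
        bysort id_n (own_num): replace own_num = own_num[_n-1] if missing(own_num)

        list, sepby(id_n)
        ```

        ID 1704 has no non-missing value in either variable, so it stays missing in both.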


        • #5
          The "weights not allowed" error message is likely to arise when there is a space, as in

          Code:
          ownershipstructure [_n-1]
          because
          Code:
          replace ownershipstructure = ownershipstructure
          is by itself perfectly legal, so that Stata is puzzled by
          Code:
          [_n-1]
          by itself and is guessing that it can only be an attempt to specify weights, which are not allowed at that point. Naturally
          Code:
          replace ownershipstructure = ownershipstructure
          is not what you want to do but Stata is not judging on intent, only content.
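
          A minimal sketch for seeing the difference side by side, on made-up data with a hypothetical variable own. I would expect the first replace to fail with the weights message, but check the output on your own installation.

          ```stata
          clear
          input long id_n str20 own
          1161 ""
          1161 "publicly traded"
          end
          sort id_n own

          * With a space, the expression ends at the variable name and [_n-1]
          * is left over. capture noisily shows any error message without
          * stopping a do-file, and _rc holds the return code.
          capture noisily replace own = own [_n-1] if missing(own)
          display "return code: " _rc

          * Without the space, [_n-1] is read as a subscript on own. The command
          * is legal, but it changes nothing here because the empty string is
          * sorted first and has no previous observation to copy from.
          replace own = own[_n-1] if missing(own)
          ```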
