  • How to generate missing data under a proportion

    I want to simulate missing values in my own dataset, but I don't know how to randomly replace a variable's values with missing at a given proportion. Are there any functions or commands for this? Thanks.

  • #2
    Assuming that "under a certain proportion" means the rate at which you want the variable to be replaced with missing, then
    Code:
    quietly replace variable = . if runiform() < 0.25
    where the "certain proportion" is 25% in this example.

    Make sure that your original dataset has been saved somewhere (or else work with a frame copy), and set the seed beforehand if you're interested in reproducibility.
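A fuller sketch of the advice above, for illustration only. It assumes Stata's bundled auto dataset and its price variable as stand-ins for your own data; the seed value and the names price_miss, price_exact, u, and obsno are arbitrary placeholders. Note that the runiform() condition gives 25% missing in expectation, not exactly; the second part shows one possible way to blank out a fixed share of observations if that is what you need.

```stata
* Assumption: the shipped auto dataset stands in for your own data
sysuse auto, clear

* Arbitrary seed so the same observations are chosen on every run
set seed 12345

* One uniform(0,1) draw per observation
generate double u = runiform()

* Work on a copy so the original variable stays intact
clonevar price_miss = price

* About 25% missing: each observation is blanked with probability 0.25
replace price_miss = . if u < 0.25
count if missing(price_miss)

* Fixed share instead: order by the random draw, blank the first 25% of rows
clonevar price_exact = price
generate long obsno = _n
sort u
replace price_exact = . if _n <= 0.25 * _N
sort obsno                      // restore the original order
count if missing(price_exact)
```

Using clonevar leaves the original variable untouched, which serves the same purpose as saving the dataset first or working in a frame copy.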



    • #3
      Originally posted by Joseph Coveney
      Assuming that "under a certain proportion" means the rate at which you want the variable to be replaced with missing, then
      Code:
      quietly replace variable = . if runiform() < 0.25
      where the "certain proportion" is 25% in this example.

      Make sure that your original dataset has been saved somewhere (or else work with a frame copy), and set the seed beforehand if you're interested in reproducibility.

      Thanks, Joseph. It works well!

      Best regards
      Chang
