
  • Support: most efficient way to enter data

    Hi everyone,

    I am a total newbie to Stata, so perhaps some of you can help me with this issue:

    For my master's thesis, I will analyze a TV quiz show. The data for this show do not exist yet, so I will create them directly in Stata. My question is how best to create the variables without causing problems with empty fields.

    This is the format of the quiz show: in each episode, one candidate has to answer ten questions (= ten rounds). If needed, the candidate can ask a member of the audience for help with a question. The candidate and the audience member then negotiate how much the audience member will receive for answering the question (e.g. the question is worth 2000 euros, and the negotiated split is 1500 euros for the candidate and 500 euros for the audience member).

    Issue:
    When the candidate answers the question with the help of someone from the audience, I need to record information about that person (name, age, etc.), as well as the order and amounts of the offers in the negotiation (e.g. first offer by the candidate: 500 euros, then offer by the other person: 1500 euros, then offer by the candidate: 700 euros ...).

    Idea:
    My idea was to create a variable for all of these aspects (e.g. question4firstoffer, question4secondoffer, ...) and leave the observations empty in case the candidate answers the question on his own. However, I am unsure whether this will cause problems later when I analyze the data as some of the variables will then always be empty.



    I hope this was somewhat understandable, please let me know in case it is unclear.
    Thank you so much for giving me your thoughts on this issue.

    Very best,
    Alexandra

  • #2
    Welcome to the Stata Forum / Statalist.

    To start, I personally never use Stata (or any other statistics package) "to create data directly"; I use a spreadsheet (usually Excel). Entering data by hand is feasible in a statistical package, but I believe it is not the main purpose of one.

    With regard to the variables: generally speaking, if you have no value of a variable for some observations, you should leave the cell empty. However, depending on the question, you may wish to record the reason for the missing data, such as "not applicable" or "refused to answer". In that case, you can use a numeric code for each reason, such as 999 for "not applicable" and 99 for "refused to answer". Once the data are imported into Stata, you can use -mvdecode- to tell Stata that these codes represent missing values (-mvencode- does the reverse, turning missing values into numeric codes). Should you decide to enter the data directly in Stata, you can use "." for unspecified missing data and, say, ".a" for "not applicable" and ".b" for "refused to answer".
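
    As a minimal sketch, assuming placeholder codes of 999 ("not applicable") and 99 ("refused to answer"), the conversion to extended missing values after import could look like the following. Note that the command that turns numeric codes into missing values is -mvdecode-. The variable names (episode, question, offer1, offer2) and the data values are hypothetical and only for illustration; in real data, pick codes that cannot occur as genuine values.

```stata
* Hypothetical toy data standing in for an imported spreadsheet
clear
input episode question offer1 offer2
1 4 500 1500
1 5 999  999
2 4 700   99
end

* Turn the numeric placeholder codes into extended missing values
mvdecode offer1 offer2, mv(999 = .a \ 99 = .b)

* Document what each extended missing value means
label define offerlbl .a "not applicable" .b "refused to answer"
label values offer1 offer2 offerlbl

* Show the distribution, including the missing categories
tabulate offer1, missing
```

    Stata's estimation commands and most summary statistics skip missing values automatically, so variables that are empty for candidates who answered alone should not cause trouble by themselves.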

    Hopefully that helps.
    Best regards,

    Marcos
