  • New to STATA and could use some help

    This is only our second class, and our professor has thrown this at us even though we have never used STATA before:


    Playing with data: Load the UNRATE data set into STATA. (UNRATE can be found in the in-class exercises in blackboard.)
    Create a time variable called t that starts at 1 and counts up by 1 every period. You can use the “gen” command to create variables and “_n” will refer to the row number. The full command will be “gen t = _n”. Is t in months, quarters or years?
    Show me a scatter plot of your data over time t. Print out the picture.
    Show me a regression with time as the independent variable and unemployment as the dependent variable. Print the results here.
    Analyze your results. Is time a significant predictor of unemployment? What direction does it go in?
    What’s the R^2 of your result?
    Create an indicator variable called “trump_term” that has the value 1 since Trump took office and 0 otherwise. (He took office in Jan 20, 2017.) Include it in your regression. Print your results.
    Analyze your results. Is “trump_term” significant/positive/negative?
    What direction does it go in?
    What happens to the time variable after adding trump_term?
    What’s the R^2 of your result? How did it change from the previous estimation?
    Print all of your effective code for this exercise.


    Can anyone help me with this over private message, so I can understand at least where to begin? Thank you.

  • #2
    If your professor hasn't even given an introduction to Stata, that is extremely surprising, and potentially very sloppy considering the disparity in skill that students exhibit. I would ask your professor or your teaching assistant for guidance. It is very hard to help without knowing what you know and what the average student in your class is expected to know. In general, the FAQ strongly discourages us from helping with homework questions, although people seem to think that trying to nudge people in the right direction is acceptable. In that vein, here are some nudges:

    You have to import the data into Stata. The easiest way is to open Stata, then go to the File menu -> Import -> Excel data.

    You can examine your data in Excel, by the way.
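    The same import can also be done from the command line. This is only a sketch: the file name UNRATE.xls is assumed from the thread, and the layout of the sheet (a header row of variable names at the top) is a guess, so check the result before going on.

    Code:
    * Assumption: UNRATE.xls is in the current working directory
    * and its first row holds the variable names.
    import excel using "UNRATE.xls", firstrow clear

    * Check what was actually imported: variable names, types, first rows.
    describe
    list in 1/5
    If the sheet has descriptive rows above the header, the cellrange() option of import excel can be used to skip them.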

    For scatterplots, you can use the command line in the bottom of the screen, and you would type something like

    Code:
    scatter y t
    Where y is whatever the dependent variable is. Many posters are not willing to download your data due to security reasons, but presumably you have multiple dependent variables, and the question is pretty vague. Try picking one. For that matter, you can use the drop down menus to do everything; if you have zero programming experience, you can start there, but we would recommend learning command line syntax in the long run.

    Regression commands are also going to look something like

    Code:
    regress y x
    where x is the independent variable.
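    Put together for this exercise, the sequence might look like the sketch below. The variable name unrate is an assumption; use whatever name describe reports for the unemployment rate in your data.

    Code:
    * Time index: 1 for the first row, 2 for the second, and so on.
    generate t = _n

    * Scatter plot of the unemployment rate (assumed name: unrate) over time.
    scatter unrate t

    * Regression with unemployment as the dependent variable and time as the independent variable.
    regress unrate t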

    If you can't even do this, then you need to approach your professor or teaching assistant. If your professor's expectations are unreasonable, you should let him/her know - and if they are, you won't be the only person. On the other hand, if the problem is that you haven't been to class or office hours, then rectify that. (I'm not accusing you of this; I just don't know).
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.


    • #3
      Hello Steven. Please see point #4 here: https://www.statalist.org/forums/help#adviceextras

      In this forum, and many others, students struggling with homework are much more likely to get help if:
      1. They acknowledge that it is homework.
      2. They show what they have tried so far, and where (specifically) they are getting stuck.
      You pass on point #1, but not on #2. You appear not to have tried anything yet.

      You've asked how to start. The first instruction tells you to load the UNRATE dataset into Stata. The file extension is .xls. That means it's an Excel file. (If you didn't know that already, you could have Googled <.xls file extension>.) I suggest you try Googling with search terms like <Stata import excel>.

      Good luck!
      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 19.5 (Windows)



      • #4
        Originally posted by steven smith
        ...

        I have gotten to this part but I am stumped on how to continue with the next part

        ⦁ Create an indicator variable called “trump_term” that has the value 1 since Trump took office and 0 otherwise. (He took office in Jan 20, 2017.) Include it in your regression. Print your results.
        OK, you remember that you generated a variable called t, which counts up from 1. I imagine that your data are the unemployment rate observed at various time periods. You have to figure out which time period corresponds to inauguration day. Then, you could do something like

        Code:
        generate trump_term = t >= 10
        Change the value of 10 as appropriate. Browse the data in Excel or in Stata's own browser if you need to. I'm sorry, I can't be more specific, and if you need more specific advice, chances are you need to approach your teaching assistant or professor (as I said earlier).
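        If the data also contain a date variable, Stata can find the cutoff value of t for you. This is a sketch under assumptions: the variables date (a Stata daily date) and unrate are hypothetical names, so substitute your own.

        Code:
        * Smallest t on or after inauguration day (assumed daily-date variable: date).
        summarize t if date >= td(20jan2017), meanonly

        * Indicator is 1 from that period onward and 0 before it.
        generate byte trump_term = t >= r(min)

        * Add the indicator to the regression (assumed outcome name: unrate).
        regress unrate t trump_term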


        • #5
          Hello again Steven. If you use CODE indicators (the # button on the toolbar), as noted in the FAQ, your code and output will be more readable.

          Re your question about creating an indicator variable, I Googled again with <Stata indicator variable> as the search term, and the first hit was a Stata FAQ. If you study the examples in its "Using generate to create dummy variables" section, you'll see that in general terms, this is how you can create an indicator (or dummy) variable using -generate-:

          Code:
          generate indicator = {some expression involving one or more variables} if {none of those variables are missing}
          Because the only possible values for indicator will be 1 (if the expression is true), 0 (if the expression is false) or missing (if any of the variables used in the expression are missing), I would modify that to include byte:

          Code:
          generate byte indicator = {some expression involving one or more variables} if {none of those variables are missing}
          In your case, you have been told to name the indicator variable trump_term. And presumably you have some kind of date variable. You also know the date when Trump's term began. So...

          Code:
          generate byte trump_term = {expression involving your date variable and date Trump term began} if {your date variable is not missing}
          What expression involving your date variable and the date the Trump term began will be true after the Trump term began, but false before it began? See if you can take it from there.
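          As one possible way to fill in that template, here is a sketch that assumes the date variable is named date and is stored as a Stata daily date. Both are assumptions about your data.

          Code:
          * 1 on or after 20 Jan 2017, 0 before, missing if date is missing.
          generate byte trump_term = date >= td(20jan2017) if !missing(date)

          * Check the split between 0, 1 and missing.
          tabulate trump_term, missing
          With monthly data dated on the first of each month, this marks February 2017 as the first period with trump_term equal to 1.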
          Last edited by Bruce Weaver; 15 Feb 2019, 14:33. Reason: Crossed with Weiwen Ng's post (#5).


          • #6
            The exercise is quite surprising. It's only "playing with data", but usually one tries to do it basically right. This is a time series with heavy autocorrelation (that is, data at time t is correlated with data at time t-1, t-2, etc.). And anyway, a quick look at the scatter plot would tell you it's really not a good idea to do a linear regression of rate on time: the relation is obviously not linear. You might do a regression only on the data after January 2010, as the data points are roughly linear there, but the rest of the plot shows that it's only a local relationship. It looks like the point is only to find a distorted way to show a positive effect of Trump's term on unemployment (negative coefficient). There may or may not be one, but this exercise doesn't help to figure that out.
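            To check these points on your own data, something like the following sketch could be used. It assumes the variables t, unrate and a daily-date variable date, all of which are hypothetical names here.

            Code:
            * Declare the data as a time series indexed by t.
            tsset t

            * Linear trend regression, then a Breusch-Godfrey test for serial correlation.
            regress unrate t
            estat bgodfrey, lags(1)

            * The same regression restricted to observations from January 2010 onward.
            regress unrate t if date >= td(01jan2010)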
