  • Find the right model

    Hello guys,

    A few weeks ago I started my first econometrics project at my university, in which I want to measure "The (medium-term) Effect of Smoking Cessation on Bodyweight/BMI" using a panel data set.
    My biggest problem right now is that I am not sure how to identify the effect I am interested in, even though I have spent a lot of time researching and looking for a suitable model on the internet.
    The data set runs from 2002 to 2012, and people answer the questions about smoking and bodyweight every two years. Of course, only a small share of the participants take part in all years.

    First I tried the model that my professor suggested:
    Code:
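    * weight in the current wave (assumes xtset id year, so l2. is the previous wave)
    reg weight smoking $vars if l2.smoking == 1, vce(cluster id)
    * weight two years later, written with the lead operator
    reg f2.weight smoking $vars if l2.smoking == 1, vce(cluster id)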
    and so on...
    But the estimated effect seemed too small and wrong: at t+4 the value decreased, which shouldn't happen, I guess. Moreover, I am not sure how the model deals with people who started smoking again in the years between l2.smoking and t+4.

    Then I tried two models with a) a dummy variable for the last year in which people smoked or b) a dummy variable for the first year of cessation. With these dummies I no longer had the concern from the first model, where I was not sure whether people had started smoking again.
    Code:
    reg weight dummy $vars, vce(cluster id)
    reg f2.weight dummy $vars, vce(cluster id)
    * but also
    reg dummy $vars, vce(cluster id)
    I also tried including i.year and even tried fixed effects, but once again the values were small and sometimes far from significant.

    My last idea was the difference-in-differences method, with always-smokers as the control group and people who stopped smoking in 2006 or 2004 as the treatment group. I am aware that the "treatment" is not exogenous, but the results looked good.
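A minimal sketch of how such a difference-in-differences comparison could be set up in Stata. The variable names (id, year, weight, smoking coded 0/1) are assumptions, and only the 2006 quitters are used as the treatment group: smokers in 2002 and 2004 who report not smoking in every wave from 2006 on. It is restricted to people observed in all six waves.

```stata
* Assumed variables: id, year (2002, 2004, ..., 2012), weight, smoking (0/1)
* Count, per person, the waves answered and the waves smoked before and from 2006 on
bysort id: egen n_waves    = count(smoking)
bysort id: egen smoke_pre  = total(smoking == 1 & year <  2006)
bysort id: egen smoke_post = total(smoking == 1 & year >= 2006)

* treated = 1 for 2006 quitters, 0 for always-smokers, missing for everyone else
generate byte treated = .
replace treated = 0 if n_waves == 6 & smoke_pre == 2 & smoke_post == 4
replace treated = 1 if n_waves == 6 & smoke_pre == 2 & smoke_post == 0

generate byte post = year >= 2006

* The coefficient on 1.treated#1.post is the difference-in-differences estimate
regress weight i.treated##i.post, vce(cluster id)
```

Observations with a missing treated value are dropped automatically, so the regression compares only the two groups. Since quitting is a choice, the estimate is descriptive and not a clean causal effect.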

    Since I often find myself on this website, especially because I had never touched Stata before, I would like to ask the users here whether they have an idea for me. I already asked a similar question on another statistics forum, but they suggested a very different, non-regression approach that I know nothing about. Moreover, the models do not have to be "perfect", because this is my first project in this field.

    Thanks !

    [Attached image: all4kg.png]
    Last edited by Jean Kellerberg; 11 Jan 2018, 10:29. Reason: spelling

  • #2
    Jean:
    first of all, I'm wondering why you stick with -regress- when you can use -xtreg- to analyze a panel dataset.
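As a rough illustration of the -xtreg- route, assuming the variables are called id, year, weight and smoking (names taken from the first post, not confirmed), a fixed-effects version might look like this:

```stata
* Assumed variables: id (person), year (survey wave), weight, smoking (0/1)
xtset id year
* Fixed-effects regression with wave dummies and standard errors clustered on id
xtreg weight i.smoking i.year, fe vce(cluster id)
```

With biennial waves and a plain -xtset id year-, L2. and F2. refer to the previous and the next wave. If the panel is declared with the delta(2) option instead, L1. and F1. do.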
    Some comments about your post:
    - the reduction of -weight- in the last wave of data might be due to a monotone pattern of missing data (see the -mi- glossary in the Stata .pdf manual). In light of that, you should probably devote some hours to dealing with missing data (again, the -mi- entries, including their reference sections, in the Stata .pdf manual are a good place to start);
    - whenever you set out to analyze something, it's very likely that somebody else has dealt with the same research topic in the past and has published something. Skimming through the literature in your research field can give you some clues/hints about a regression model that has a good chance of being accepted by reviewers/teachers/supervisors (by the way: what's your econometrics professor's opinion about the whole matter?);
    - preferring "the most" statistically significant model over the one that gives a fair and true view of the data generating process is not an approach that I would advise.
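Before turning to -mi-, the missing-data point in the first comment can be explored descriptively. The sketch below assumes the variable names id, year, weight and smoking. It only describes participation and missingness and does not impute anything.

```stata
* Assumed variables: id, year, weight, smoking
xtset id year
* Participation pattern across waves: which sequences of waves occur, and how often
xtdescribe
* Number of missing values in the key variables
misstable summarize weight smoking
* Share of observed person-waves with a non-missing weight, by wave
generate byte has_weight = !missing(weight)
tabulate year has_weight, row
```

If people who drop out have no rows at all in later waves, -xtdescribe- is the more informative of these, because the tabulation only sees the rows that exist.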
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hey Carlo and other people,

      I had already read about 20-30 papers, but luckily I found a very interesting and "easy" model this weekend.

      So I am using the balanced panel data from 2002 to 2012, at two-year intervals, with two groups of people: people who smoked in all years and people who smoked from 2002 to 2006 and then stopped.
      Let's just assume that the data are right and that people didn't smoke between the survey years; I am not sure how the smoking question was worded.

      I run 3 regressions:
      1) only with years 2006 and 2012
      2) only with years 2006 and 2010
      3) only with years 2006 and 2008
      All other years are dropped in the three regressions.
      The paper used only one regression, while I am, as I said in my first post, interested in the t+2 and t+4 effects.
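The three regressions listed above could also be run in a loop without dropping any data, by restricting each estimation sample with -if-. This is only a sketch: it assumes the data are already -xtset- and that the global macro with the paper's control variables has been defined.

```stata
* Assumes the data are xtset and the global macro with the controls is defined
foreach y in 2008 2010 2012 {
    * use 2006 and one later wave via -if-, leaving the data in memory intact
    xtreg weight smoking $variables_like_in_the_paper if inlist(year, 2006, `y'), fe
    estimates store fe_`y'
}
* Put the smoking coefficients and standard errors side by side
estimates table fe_2008 fe_2010 fe_2012, keep(smoking) b se
```

Using -if- instead of -drop- means the full panel stays available for the other models and for checking how the sample changes between regressions.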

      Now I run simple fixed effects (first differences gives the same results), random effects, and then the Hausman test.
      The coefficient on smoking should show me the effect of cessation, since we compare people who smoked and became non-smokers with always-smokers.
      Code:
      xtreg weight smoking $variables_like_in_the_paper, fe
      estimates store fixed1
      xtreg weight smoking $variables_like_in_the_paper, re
      estimates store random1
      hausman fixed1 random1
      All coefficients for smoking are highly significant and in the range I expected.
      The only thing that bothers me a little is the small overall R^2 = 0.0013.

      What do you think about it? I guess this could be the right model I was looking for.

      The paper is called
      Smoking Cessation and Changes in Body Mass Index Among Middle Aged and Older Adults, 2016 by Andy Sharma


      • #4
        Jean:
        methodologically speaking, I would not support deleting years, in that you may end up with a (biased) sample that is only tenuously representative of the starting one.
        That said, I would consider dealing with missing data (especially if the missingness is informative) instead of analyzing the complete cases only.
        As far as the supposedly low overall R-sq is concerned, please note that the -fe- specification is expected to maximize the within R-sq, whereas the -re- specification is expected to maximize the between R-sq.
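To look at the three R-sq values separately, the stored results of -xtreg, fe- can be displayed after estimation. The sketch reuses the model from post #3 and assumes the same variable names and global macro:

```stata
* Re-run the fixed-effects model from post #3
xtreg weight smoking $variables_like_in_the_paper, fe
* -xtreg, fe- stores the within, between and overall R-sq as scalars
display "within  R-sq = " %6.4f e(r2_w)
display "between R-sq = " %6.4f e(r2_b)
display "overall R-sq = " %6.4f e(r2_o)
```

For a fixed-effects model the within R-sq is the one most directly tied to the estimator, so a small overall value on its own says little about the fit.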
        Kind regards,
        Carlo
        (Stata 19.0)
