  • Panel regression with dates before 1960

    Hi,

    I am working with a panel dataset identified by country (id) and a monthly date in %tm format (my dates start in 1955). I am trying to run a regression with fixed effects for id and date, like this:

    Code:
    xtset id date
    reg var1 var2#var3 i.id i.date
    but Stata says that "date: factor variables may not contain negative values". I know this occurs because I have dates before 1960 but I don't know how to solve this. I tried to create a numeric variable for each month using this loop:

    Code:
    gen date_num=0
    local i=0
    forval x=-60(1)38{
    replace date_num=`i' if date==`x'
    local i=`i'+1
    }
    but the regression gives me unexpected results.


    Thank you
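
    A quick check of the monthly values involved may help explain the unexpected results. This is only a sketch, assuming the data run past March 1963 (the post does not state the end date). %tm dates count months from 1960m1, so the loop covers only 1955m1 through 1963m3; any later month would keep the initial date_num of 0 and be pooled with 1955m1.

    Code:
    * -60, the loop's starting value
    display ym(1955, 1)
    * 1963m3, the loop's last value
    display %tm 38
    * months outside the loop's range that still have date_num == 0
    count if date_num == 0 & date != -60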

  • #2
    Maybe something like the following. (Begin at the "Begin here" comment; the first part just creates a dummy dataset for illustration.)

    . 
    . version 16.1

    . 
    . clear *

    . 
    . set seed `=strreverse("1553579")'

    . quietly set obs 50

    . 
    . generate byte pid = _n

    . generate double pid_u = rnormal()

    . 
    . quietly expand `=2019 - 1955'

    . bysort pid: generate int year = 1954 + _n

    . 
    . quietly expand 12

    . bysort pid year: generate byte month = _n

    . 
    . generate int mon = ym(year, month)

    . drop year month

    . format mon %tm

    . 
    . generate double out = pid_u + rnormal()

    . 
    . *
    . * Begin here
    . *
    . summarize mon, meanonly

    . generate int mo0 = mon - r(min)

    . 
    . quietly xtreg out i.mo0, i(pid) fe

    . 
    . quietly testparm i.mo0

    . display in smcl as text "F(" r(df) ", " r(df_r) ") = " as result %04.2f r(F)
    F(767, 37583) = 1.00

    . display in smcl as text "Prob > F = " as result %04.2f r(p)
    Prob > F = 0.49

    . 
    . exit

    end of do-file



    • #3
      A similar issue occurring with categorical variables containing negative values was raised a while ago:
      https://www.stata.com/statalist/arch.../msg00656.html

      The question remains, however: if rebasing the time series or recoding the categorical values doesn't change the results of the method, why is it a requirement that negative values not be present?

      Why doesn't Stata automatically replace them with other codes, then? Is this a precaution along the lines of "You have a negative value, it may be a missing-value code, so deal with it"? If so, why doesn't it apply to, e.g., regress or graph pie?



      • #4
        The question remains, however: if rebasing the time series or recoding the categorical values doesn't change the results of the method, why is it a requirement that negative values not be present?
        See also https://www.statalist.org/forums/for...gative-numbers for a more recent related question. One obvious reason, to me, is the naming convention for matrix rows and columns in Stata. See the following example:

        Code:
        sysuse auto, clear
        regress mpg weight i.rep78
        mat l e(b)
        Res.:

        Code:
        . mat l e(b)
        
        e(b)[1,7]
                                1b.          2.          3.          4.          5.            
                weight       rep78       rep78       rep78       rep78       rep78       _cons
        y1  -.00550304           0  -.47860434  -.47156229  -.59903186   2.0862757   38.059415
        Now, we would expect that a category coded -3 will have the column name -3.rep78 in matrix e(b). Let us try this out.

        Code:
        sysuse auto, clear
        regress mpg weight i.rep78
        mat b= e(b)
        mat colnames b=  weight 1b.rep78 2.rep78 -3.rep78 4.rep78 5.rep78 _cons
        Res.:

        Code:
        . mat colnames b=  weight 1b.rep78 2.rep78 -3.rep78 4.rep78 5.rep78 _cons
        -3:  operator invalid
        r(198);



        • #5
          It's perhaps a side issue, but no one should be distracted by the loop in #1. Addition of a constant could be done without a loop by

          Code:
          gen date_num  = date + 60
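
          A variant of the same idea that does not hard-code the offset, sketched with the variable names from #1 (id, date, var1, var2, var3). It assumes date is the %tm variable and reuses the summarize/r(min) approach from #2; date0 and date_grp are hypothetical names chosen for illustration. egen's group() function is an alternative that numbers the distinct months 1, 2, ... even when there are gaps.

          Code:
          * shift the monthly date so that the earliest month is coded 0
          summarize date, meanonly
          generate int date0 = date - r(min)
          * alternative: consecutive codes 1, 2, ... for the distinct months
          egen int date_grp = group(date)
          * the regression from #1 with non-negative time codes
          regress var1 var2#var3 i.id i.date0

          Either recoding only relabels the months, so the set of time dummies, and hence the fit, should be the same as it would be with the original dates.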



          • #6
            Originally posted by Andrew Musau
            One obvious reason for me is the naming convention of matrices in Stata.
            Matrix naming conventions might have something to do with it, but naming conventions are probably not the full story, as any of

            Code:
            matrix colnames b = weight 1b.rep78 2.rep78 -3rep78 4.rep78 5.rep78 _cons
            matrix colnames b = weight 1b.rep78 2.rep78 -3_rep78 4.rep78 5.rep78 _cons
            matrix colnames b = weight 1b.rep78 2.rep78 "-3 rep78" 4.rep78 5.rep78 _cons
            matrix colnames b = weight 1b.rep78 2.rep78 -3:rep78 4.rep78 5.rep78 _cons
            work just fine with the example in #4. Instead, the example appears to be subject to the same questionable rule that factor variables cannot contain negative numbers. Note that the error message refers to an operator (meaning Stata parses #.name as factor-variable notation); the error does not refer to an invalid name.

            Best
            Daniel



            • #7
              That's my point, Daniel. None of your examples produces "-3.rep78" as a column name. I agree that this may not be the main issue and that it may be a very easy fix, but I was just pointing out that, since matrices existed before factor variables, addressing this issue would have been an additional complication had the developers allowed negative values in factor variables.

              Code:
              sysuse auto, clear
              version 8
              regress mpg weight i.rep78
              matrix input mymat = (1,2, 3\3,4, 5)
              mat colnames mymat = 1.c 2.c 3.c
              mat colnames mymat = 1.c -2.c 3.c
              Res.:

              Code:
              . version 8
              
              . 
              . regress mpg weight i.rep78
              factor-variable operators not allowed
              r(101);
              
              . 
              . matrix input mymat = (1,2, 3\3,4, 5)
              
              . 
              . mat colnames mymat = 1.c 2.c 3.c
              
              . 
              . mat colnames mymat = 1.c -2.c 3.c
              -2:  operator invalid
              r(198);
              Last edited by Andrew Musau; 17 May 2020, 03:46.
