  • Descriptive analysis with panel data

    Hello everyone,

    I need to work with panel data.
    At the moment I am doing the descriptive analysis, and I am not sure whether I can use the "normal" commands:
    Code:
    tabstat AGE INCOME SEX DEPOSIT, statistics( mean sd min p25 p50 p75 p90 max)
    pwcorr DEPOSIT AGE INCOME SEX , star(.1)
    It would be great to get a response!
    Thank you.
    Kind regards,
    Lisa

  • #2
    Much depends on what you want precisely. tabstat certainly permits by: or by() to allow panelwise summaries.
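
    As a rough illustration of panelwise summaries, here is a minimal sketch using the grunfeld data shipped with Stata rather than the original poster's variables, which are not shown. The variables invest, mvalue, kstock and the panel identifier company belong to that dataset and are stand-ins only.

    ```stata
    webuse grunfeld, clear

    * pooled summary over all company-years
    tabstat invest mvalue kstock, statistics(mean sd min p25 p50 p75 p90 max) columns(statistics)

    * the same statistics computed separately for each panel (company)
    tabstat invest mvalue kstock, by(company) statistics(mean sd min p25 p50 p75 p90 max) columns(statistics)
    ```

    Comparing the two tables gives a quick impression of whether the pooled figures hide large differences between panels.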

    For another example, pooling panels in a correlation might make sense scientifically, but the panel and dependence structure will usually make nonsense of P-values.

    Conversely, pooling panels with very different magnitudes will often produce pure garbage.

    It's hard to generalise. Here's one arbitrary example.

    Code:
    webuse grunfeld, clear
    
    * assumes sepscatter is installed: ssc install sepscatter
    
    sepscatter invest mvalue, sep(company) ysc(log) xsc(log) ///
        legend(pos(3) col(1)) xla(50 100 200 500 1000 2000 5000) ///
        yla(1000 500 200 100 50 20 10 5 2, ang(h))
    [Figure: sepscatter.png, scatter plot of invest against mvalue on logarithmic scales, with companies shown separately]


    The pooled pattern seems to make some sense to me, but the structure of between- and within-panel variation is essential to interpretation. pwcorr will not adjust P-values for the panel structure (you show interest in the horrible practice of starring, so N.B.) and, as said, the P-values can't be trusted, but the correlations are still summary measures.
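
    One possible way to look at between- and within-panel variation separately is sketched below, again with the grunfeld data. The new variable names (mean_*, within_*) are made up for this illustration, and this is only one of several reasonable decompositions.

    ```stata
    webuse grunfeld, clear
    xtset company year

    * overall, between-panel and within-panel standard deviations
    xtsum invest mvalue

    * within-panel correlation: subtract each company's own mean first
    foreach v of varlist invest mvalue {
        bysort company: egen double mean_`v' = mean(`v')
        generate double within_`v' = `v' - mean_`v'
    }
    correlate within_invest within_mvalue

    * between-panel correlation: one observation (the mean) per company
    preserve
    collapse (mean) invest mvalue, by(company)
    correlate invest mvalue
    restore
    ```

    If the within and between correlations differ markedly from the pooled one, the pooled figure is largely a product of aggregation.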


    • #3
      But following your example, is it allowed to use
      Code:
      pwcorr invest mvalue kstock
      And can I still interpret it normally? Or should I leave it out altogether?


      • #4
        If "normally" means take the P-values literally, or even seriously, then absolutely not, as I flagged.

        Correlations obey the Cauchy-Schwarz inequality regardless of dependence structure, so magnitudes mean just about what they usually mean, with the caveat of the example that "overall correlation" could just be an artefact of aggregation.
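
        To get a feel for how much the dependence within panels can matter for P-values, one rough check is to compare a conventional test with one that clusters on the panel identifier. This sketch again uses grunfeld. With a single predictor, the default P-value for the slope is the same test that pwcorr reports for that pair of variables. With only 10 companies the clustered results are themselves shaky, so treat this as an illustration and not as a recommended fix.

        ```stata
        webuse grunfeld, clear

        * conventional standard errors: treats all 200 company-years as independent
        regress invest mvalue

        * standard errors allowing arbitrary dependence within each company
        regress invest mvalue, vce(cluster company)
        ```

        The slope and the implied correlation are identical in both runs; only the standard errors and P-values change.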


        • #5
          ok, now I understand it.
          But is it possible to solve the problem if I use pwcorr for just one period (so that I no longer have different points in time)?
          And what should I do about the descriptive statistics (mean sd min p25 p50 p75 p90 max) of my panel data, which I do need for the analysis?
          thanks in advance
          lisa


          • #6
            Sorry, I don't understand what you are asking about tabstat that I haven't explained. You may need to show an example.

            A cross-section is a cross-section. What it means depends on the panel design. There is no problem of time dependence for a single time. There might be other kinds of dependence, e.g. spatial.
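
            A minimal sketch of restricting the summaries to a single period, using grunfeld once more. The year 1954 is an arbitrary choice for illustration (it is assumed here to be the last year in that dataset); the appropriate period in the original poster's data is not known.

            ```stata
            webuse grunfeld, clear

            * one year only: each company contributes exactly one observation
            tabstat invest mvalue kstock if year == 1954, statistics(mean sd min p25 p50 p75 p90 max) columns(statistics)

            * pairwise correlations for that cross-section, with P-values and counts
            pwcorr invest mvalue kstock if year == 1954, sig obs
            ```

            This removes dependence over time, but not other kinds of dependence between units, and it leaves very few observations when the number of panels is small.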


            • #7
              OK, sorry, I am so stressed that I overlooked your explanation of tabstat.
              I will try it!
              Thank you very much, Nick!
              This is such a big help for sure!
