
  • Text analysis - amount of words + stop words removal

    I would like to analyse several texts. One step is stop word removal, so I need to count the total number of words and the number of stop words. I know that stop word removal is possible with the command txttool, but I do not understand how to use it or which stop words it removes. The texts are in four languages: English, French, Spanish and German. Does Stata provide this option, or is there better software for text analysis? I have figured out how to get the text into a variable using Wordstat: it creates one variable with the file name and one variable "Document" with the file text. My idea is to use this variable containing the whole text and to analyse it with commands that create further variables.

    Thank you

  • #2
    Just to underline, txttool is a user-written program.

    That said, as the help files show, the stopwords can be defined by you.
    Best regards,

    Marcos
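
    Since the question also asks about other software, here is a minimal sketch in Python of the counting step with user-defined stop words. It is not txttool and does not reproduce its behaviour. The function name count_words and the four tiny stop word lists are hypothetical placeholders for illustration only; a real analysis would load a full list per language. Tokens are taken to be runs of letters, so a French form such as "l'analyse" is split into "l" and "analyse".

```python
import re

# Hypothetical, deliberately tiny stop word lists (assumption, not from txttool).
# Replace each set with your own list for that language.
STOPWORDS = {
    "en": {"the", "is", "a", "of", "and"},
    "fr": {"le", "la", "les", "et", "de"},
    "es": {"el", "la", "los", "y", "de"},
    "de": {"der", "die", "das", "und", "ist"},
}


def count_words(text, language):
    """Count all words and stop words in text; return the text without stop words."""
    # Lowercase, then keep runs of Unicode letters (no digits, no underscore).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    stop = STOPWORDS[language]
    n_stop = sum(1 for t in tokens if t in stop)
    kept = [t for t in tokens if t not in stop]
    return {
        "total": len(tokens),
        "stopwords": n_stop,
        "remaining": " ".join(kept),
    }


if __name__ == "__main__":
    print(count_words("The cat is on the mat", "en"))
    print(count_words("Der Hund ist groß", "de"))
```

    The same idea carries over to one row per document: apply the function to the "Document" text and store the three results as new variables.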


    • #3
      Thank you, Marcos.
