Hi All,
I have a variable that contains a 200-500 word description of each individual's perspective on leadership. Many observations contain unwanted characters (e.g., # or %) and proper nouns. I'm using this data in a correlated topic model and would like to remove both the unwanted characters (anything that is not a letter) and all proper nouns. Any direction would be greatly appreciated!