Say I want to extract this table on tourism stats.
I want flag to count up to 7 starting at each of its occurrences (that is, obs 1/7 should count 1 - 7, obs 10/16 should count 1 - 7, and so on), so that I can keep only the observations where flag isn't missing. How would I do this? The strange data structure comes from a test I ran, inspired by this post: I was curious what other applications "fileread" might have, and I figured Wikipedia web scraping would make a great test case.
Code:
clear *
set obs 1
gen s = fileread("https://en.wikipedia.org/wiki/Tourism_in_China")
export delimited using myfile3.txt, replace
import delim myfile3.txt, clear
keep v1-v4
g flag = 1 if strpos(v1, "flagicon")
keep in 133/283
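For reference, here is a minimal sketch of the kind of counter I have in mind. It runs on a toy dataset, not the scraped page: the 20 observations and the flag positions (obs 1 and 10) are made up to mirror the example above, and the variable names obsno, block, pos and count7 are hypothetical. The idea is to number the blocks with a running sum of the flag and then count observations within each block.

```stata
* Toy data (assumption): 20 rows, flag set at obs 1 and obs 10
clear
set obs 20
gen byte flag = 1 if inlist(_n, 1, 10)

* Remember the original row order so the sort below is reproducible
gen long obsno = _n

* Running block id: goes up by 1 at every flagged row (0 before the first flag)
gen long block = sum(flag == 1)

* Position of each row within its block, in the original order
bysort block (obsno): gen long pos = _n

* Count 1-7 from each flagged row; everything else stays missing
gen byte count7 = pos if block > 0 & pos <= 7

* Keep only the seven rows that follow (and include) each flag
keep if !missing(count7)
list obsno flag block count7, sepby(block)
```

On the real data the first three lines would be dropped and the rest run right after keep in 133/283. I haven't checked whether every country row on the page actually spans exactly seven lines, so the cutoff of 7 may need adjusting.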