Say I want to extract this table of tourism statistics.
I want flag to count up to 7 from each of its occurrences (that is, obs 1-7 should count 1-7, obs 10-16 should count 1-7, and so on), so that I can keep only the observations where flag isn't missing. How would I do this? The strange data structure comes from a test I ran, inspired by this post: I was curious what other applications fileread() might have, and I figured scraping Wikipedia would make a great test case.
Code:
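clear *
set obs 1
* read the raw HTML of the page into a single string
gen s = fileread("https://en.wikipedia.org/wiki/Tourism_in_China")
* write it out and read it back so the HTML is split into rows and columns
export delimited using myfile3.txt, replace
import delim myfile3.txt, clear
keep v1-v4
* mark the rows that start a table entry (each one contains a flag icon)
g flag = 1 if strpos(v1, "flagicon")
* keep only the rows that hold the table
keep in 133/283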
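
One possible way to get such a counter, offered as a sketch rather than a tested answer. It assumes flag is 1 on the first row of each block and missing everywhere else, as in the code above, and that the rows are still in their original order. The names obsno, block, and counter are hypothetical, introduced only for this illustration.

```stata
* remember the current row order so it survives the sort below
gen long obsno = _n

* running count of flagged rows: every row gets the number of the
* most recent flagged row at or above it (0 before the first flag)
gen long block = sum(flag == 1)

* number the rows within each block, starting at 1 on the flagged row
bysort block (obsno): gen long counter = _n if block > 0

* keep the first 7 rows of each block; missing counter values sort
* above any number in Stata, so unflagged leading rows are dropped too
keep if counter <= 7
```

The counter goes into a separate variable instead of overwriting flag, so the original marker stays available for checking the result with something like list obsno flag block counter.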
