Hello dear internauts!
I want to extract the country distribution from: https://www.etf.com/URTH - "URTH Top 10 Countries" halfway down the page. I intend to expand to other ETFs thereafter: So far, the only things I can imagine are ugly:
=IF(NOT(ISBLANK($A$1)),(importxml(concatenate("https://www.etf.com/",$A$1), "(//label[normalize-space()='United States']//following::span[@id=''][1])")),"")
=IF(NOT(ISBLANK($A$1)),index(IMPORTXML(concatenate("https://www.etf.com/",$A$1,"#overview"), "html"),1,2))
Other things I've tried: Copying the xpath / full xpath (with and without javascript disabled). As well as trying to create my own xpath (been a few hours :D)
One added hurdled I see is that, with javascript off, there aren't many IDs, and there are other lists nearby with the same @class structure.
Desired result: United States 68.34% Japan 6.63% United Kingdom 4.29% etc. Each country in a single or two cells (country name / percentage).
Some insight in how you proceeded to build the functioning query would be really welcome. If there is a completely different approach, I'm all ears!
Note: A1=URTH
use:
=ARRAYFORMULA(TRIM(QUERY(IMPORTXML("https://www.etf.com/"&A1,
"//div[@class='rowText col-md-12 col-sm-12 col-xs-12']"),
"limit 10", 0)))

if you want to work with those numbers:
=INDEX(QUERY(IMPORTXML("https://www.etf.com/"&A1,
"//div[@class='rowText col-md-12 col-sm-12 col-xs-12']"),
"limit 10", 0))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With