When using the Invoke-WebRequest cmdlet against a website with non-English characters, I see no way of defining the encoding of the response / page content.
I use a simple GET on http://colours.cz/ucinkujici/ and the names of the artists come out corrupted. You can try it with this simple line:
Invoke-WebRequest http://colours.cz/ucinkujici
Is this caused by the design of the cmdlet? Can I specify the encoding somewhere, somehow? Is there any workaround to get a properly parsed response?
It seems to me you are correct :/
Here is one way to get the content right: save the response to a file first, then read it into a variable with the correct encoding. Note, however, that you are then not dealing with an HtmlWebResponseObject:
Invoke-WebRequest http://colours.cz/ucinkujici -outfile .\colours.cz.txt
$content = gc .\colours.cz.txt -Encoding utf8 -raw
This will get you equally far:
[net.httpwebrequest]$httpwebrequest = [net.webrequest]::create('http://colours.cz/ucinkujici/')
[net.httpWebResponse]$httpwebresponse = $httpwebrequest.getResponse()
$reader = new-object IO.StreamReader($httpwebresponse.getResponseStream())
$content = $reader.ReadToEnd()
$reader.Close()
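The snippet above relies on StreamReader decoding as UTF-8 when no encoding is given. As a minimal sketch, the encoding can also be passed explicitly. Here an in-memory stream stands in for the response stream, and the two sample bytes are an assumption chosen purely for illustration:

```powershell
$utf8 = [Text.Encoding]::UTF8

# Stand-in for $httpwebresponse.GetResponseStream():
# the two bytes are the UTF-8 encoding of U+010D (c with caron)
$bytes  = [byte[]](0xC4, 0x8D)
$stream = New-Object IO.MemoryStream (,$bytes)

# Pass the encoding explicitly instead of relying on the default
$reader  = New-Object IO.StreamReader($stream, $utf8)
$content = $reader.ReadToEnd()
$reader.Close()

$content -eq [string][char]0x010D   # True
```

With a real response, the same second constructor argument would be applied to the stream returned by getResponseStream(); which encoding to pass depends on what the server actually sends.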
Should you really want such an HtmlWebResponseObject, here is a way to make e.g. values from ParsedHtml more or less "readable" with Invoke-WebRequest ($bad vs. $better):
Invoke-WebRequest http://colours.cz/ucinkujici -outvariable htmlwebresponse
$bad = $htmlwebresponse.parsedhtml.title
$better = [text.encoding]::utf8.getstring([text.encoding]::default.GetBytes($bad))
$bad = $htmlwebresponse.links[7].outerhtml
$better = [text.encoding]::utf8.getstring([text.encoding]::default.GetBytes($bad))
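Why this re-encoding helps: the page's UTF-8 bytes were decoded with a single-byte encoding, so turning the string back into bytes with that same encoding recovers the original bytes, which can then be decoded as UTF-8. The following offline sketch reproduces the effect without a web request. It assumes ISO-8859-1 as the wrong encoding (on your machine [text.encoding]::default may be a different code page), and the sample name is made up for illustration:

```powershell
$utf8   = [Text.Encoding]::UTF8
$latin1 = [Text.Encoding]::GetEncoding('iso-8859-1')   # assumed "wrong" encoding

# Sample text built from code points so the script file's own encoding does not matter
$original = 'Dvo' + [char]0x0159 + [char]0x00E1 + 'k'

# Simulate the corruption: UTF-8 bytes decoded as ISO-8859-1
$bad = $latin1.GetString($utf8.GetBytes($original))

# Undo it: recover the original bytes, then decode them as UTF-8
$better = $utf8.GetString($latin1.GetBytes($bad))

$bad -eq $original      # False
$better -eq $original   # True
```

The round trip is only lossless when the wrong encoding maps every byte to a distinct character, as ISO-8859-1 does; with other code pages some characters may not survive.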
Update: Here is a new take on this, knowing you want to work with ParsedHtml.
Once you have your content (see the first two-line snippet, which 1) saves the response to a file and then 2) reads the file content with the correct encoding), you can do this:
$ParsedHtml = New-Object -com "HTMLFILE"
$ParsedHtml.IHTMLDocument2_write($content)
$ParsedHtml.Close()
Et voilà :] E.g. $ParsedHtml.title now shows correctly; I am guessing the rest will be OK as well…