I tried to read a REST API, which is gzip encoded. To be exact, I tried to read the StackExchange API.
I already found the question Automatically Decode GZIP In TRESTResponse?, but that answer doesn't solve my issue for some reason.
Test setup
In XE5, I added a TRESTClient, a TRESTRequest and a TRESTResponse with the following relevant properties. I set the BaseURL of the client, the resource and parameters of the request, and I set AcceptEncoding of the request to 'gzip, deflate', which should make it automatically decode gzipped responses.
object RESTClient1: TRESTClient
  BaseURL = 'https://api.stackexchange.com/2.2'
end
object RESTRequest1: TRESTRequest
  AcceptEncoding = 'gzip, deflate'
  Client = RESTClient1
  Params = <
    item
      Kind = pkURLSEGMENT
      name = 'id'
      Options = [poAutoCreated]
      Value = '511529'
    end
    item
      name = 'site'
      Value = 'stackoverflow'
    end>
  Resource = 'users/{id}'
  Response = RESTResponse1
end
object RESTResponse1: TRESTResponse
end
This results in the url:
https://api.stackexchange.com/2.2/users/511529?site=stackoverflow
I invoke the request like this, with two message boxes to show the url and the outcome of the request:
ShowMessage(RESTRequest1.GetFullRequestURL());
RESTRequest1.Execute; // Actual call
ShowMessage(RESTResponse1.Content);
If I call that url in a browser, I get a proper result, which is a json object with some of my user information in it.
Problem
However, in Delphi, I don't get the JSON response. Instead, I get a bunch of bytes which seem to be a mangled gzip response. I tried to decompress it with TIdCompressorZLib.DecompressGZipStream(), but it fails with a ZLib Error (-3). When I inspect the bytes of the response myself, I see it starts with #1F#3F#08. This is especially weird, since the gzip header should be #1F#8B#08, so #8B is transformed into #3F, which is a question mark.
So it seems to me that the RESTClient has attempted to decode the gzip stream as if it were a UTF-8 response, and has replaced invalid sequences (a lone #8B byte is not a valid UTF-8 sequence) with a question mark.
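To make that byte inspection repeatable, a small check along these lines could be used. This is only a sketch: both function names are hypothetical helpers (they are not part of Indy or the REST library), and the second one tests for nothing more than the exact #1F#3F#08 pattern described above.

```delphi
uses
  System.SysUtils;

// Hypothetical diagnostic helpers, not part of Indy or the REST library.

// True when the buffer starts with the gzip magic number 1F 8B.
function StartsWithGZipMagic(const AData: TBytes): Boolean;
begin
  Result := (Length(AData) >= 2) and (AData[0] = $1F) and (AData[1] = $8B);
end;

// True when the second magic byte has been replaced by '?' ($3F), i.e. the
// pattern seen when a gzip body has gone through a text decoding step.
function LooksLikeMangledGZip(const AData: TBytes): Boolean;
begin
  Result := (Length(AData) >= 3) and (AData[0] = $1F) and (AData[1] = $3F)
    and (AData[2] = $08);
end;
```

For the bytes quoted above, StartsWithGZipMagic(TBytes.Create($1F, $3F, $08)) is False and LooksLikeMangledGZip returns True, while an intact header such as TBytes.Create($1F, $8B, $08) gives the opposite results.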
Attempts (superficial)
I've done quite some experimenting, like
Unfortunately it still doesn't work and I still get a mangled response.
Attempts (digging into the VCL)
Eventually, I dug a little deeper, and dove into TRestRequest.Execute. I won't paste all the code here, but eventually it performs the request by calling
FClient.HTTPClient.Get(LURL, LResponseStream);
FClient is the TRESTClient that is linked to the request and LResponseStream is a TMemoryStream. I added LResponseStream.SaveToFile('...') to the watches, so it would save this unprocessed result, et voilà, it gave me a valid gz file, which I could decompress to get my JSON.
A bug in the work-around?
But then, a couple of lines down, I see this piece of code:
if FClient.HTTPClient.Response.CharSet > '' then
begin
  LResponseStream.Position := 0;
  S := FClient.HTTPClient.ReadStringAsCharset(LResponseStream, FClient.HTTPClient.Response.CharSet);
  LResponseStream.Free;
  LResponseStream := TStringStream.Create(S);
end;
According to the comment above this block, this is done because the contents of the memory stream are "NOT encoded accordingly to a possibly present Encoding or Content-Type Charset parameter", which is considered a bug in Indy by the writer of this VCL code.
So basically, what happens here: the raw response is treated as a string and converted to the 'right' encoding. FClient.HTTPClient.Response.CharSet is 'UTF-8', which is indeed the encoding of the JSON, but unfortunately, this conversion should only be done after decompressing the stream, which isn't done yet. So this is considered a bug by me. ;)
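The order I would have expected (decompress first, decode the charset second) can be sketched as follows. This is not what the REST library does; GZipBodyToString is a hypothetical helper, and it assumes that the raw stream holds a complete gzip body, that the decompressed payload is UTF-8, and that Indy's TIdCompressorZLib is available.

```delphi
uses
  System.Classes, System.SysUtils, IdCompressorZLib;

// Hypothetical helper, not part of Indy or the REST library.
// Inflates a raw gzip body and only then decodes the bytes as UTF-8 text.
function GZipBodyToString(ARaw: TStream): string;
var
  Inflater: TIdCompressorZLib;
  Plain: TMemoryStream;
  Bytes: TBytes;
begin
  Inflater := TIdCompressorZLib.Create(nil);
  try
    Plain := TMemoryStream.Create;
    try
      // Step 1: decompress the untouched response bytes.
      ARaw.Position := 0;
      Inflater.DecompressGZipStream(ARaw, Plain);

      // Step 2: copy the inflated bytes out of the stream.
      SetLength(Bytes, Plain.Size);
      Plain.Position := 0;
      if Length(Bytes) > 0 then
        Plain.ReadBuffer(Bytes[0], Length(Bytes));

      // Step 3: charset decoding happens last (UTF-8 is an assumption here).
      Result := TEncoding.UTF8.GetString(Bytes);
    finally
      Plain.Free;
    end;
  finally
    Inflater.Free;
  end;
end;
```

Fed with a stream of the unprocessed response, such as the file saved from the watch above, this should return the JSON text. Fed with the already charset-converted bytes, the decompression step is where I would expect the ZLib error to show up.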
I tried to dig deeper, but I couldn't find the place where this decompression should have taken place. The actual request is performed by an IIPHTTP instance, which lives in IPPeerAPI.dcu, for which I don't have the source.
So...
So my question is twofold:
My setup: VCL Forms application, Windows 8.1, Delphi XE5 professional Update 2.
Update
Remy Lebeau's input in his answer to this question as well as his comment to the answer in the question Automatically Decode GZIP In TRESTResponse? put me on the right track.
Like he said, setting AcceptEncoding doesn't suffice, because the TIdHTTP that performs the actual request doesn't have a decompressor attached, so it can't decompress the gzip response. Based on the sparse resources, I got the idea that setting AcceptEncoding would automatically decompress the response too, but that idea was wrong.
Still, leaving AcceptEncoding empty doesn't work either in this case, since the response of the API this is all about, the StackExchange API, is always compressed, regardless of whether you specify that you accept gzip or not.
So the combination of a) an always compressed response, b) an HTTP client that cannot decompress and c) a TRESTRequest object that incorrectly assumes that the response is already properly decompressed led to this situation.
I see only two solutions, the first being to discard TRESTClient altogether and just perform the request with a plain TIdHTTP. A pity, since my goal was to explore the possibilities of the new REST components to see how they can make life easier.
So the other solution is to assign a compressor to the TIdHTTP that is used internally.
I managed to succeed, although unfortunately it undoes a lot of the abstraction that the TREST components are trying to introduce. This is the code that solves it:
// Requires IdHTTP and IdCompressorZLib in the uses clause.
var
  Http: TIdCustomHTTP;
begin
  // Get the TIdHTTP that performs the request.
  Http := (RESTRequest1 // The TRESTRequest object
    .Client             // The TRESTClient
    .HTTPClient         // A TRESTHTTP object that wraps HTTP communication
    .Peer               // An IIPHTTP interface which is obtained through PeerFactory.CreatePeer
    .GetObject          // A method to get the object instance of the interface
    as TIdCustomHTTP    // The object instance, which is a TIdCustomHTTP.
  );
  // Attach a gzip decompressor to it. Http owns the compressor and frees it.
  Http.Compressor := TIdCompressorZLib.Create(Http);
end;
After this, I can use the RESTRequest1 component to successfully fetch the JSON response (at least as text).
AcceptEncoding = 'gzip, deflate'
This is the root of your problem. You are manually telling the server that the response is allowed to be gzip encoded, but as far as I can see in the REST source code, the underlying TIdHTTP object that TRESTClient uses internally does not have a gzip decompressor assigned to it (even if it had one, assigning AcceptEncoding manually would still be wrong, because TIdHTTP sets up its own Accept-Encoding header if a decompressor is assigned). I commented on that in the other question you linked to. So TIdHTTP ends up returning the raw gzip bytes without decoding them, and then TRESTClient converts them as-is to a charset-decoded UnicodeString (since you are reading the Content property). That is why you are seeing the bytes getting messed up.
You need to get rid of the AcceptEncoding assignment.
Why does this happen?
Because TRESTClient does not assign a gzip decompressor to its internal TIdHTTP object, but you are tricking the server into thinking it did.
should automatically decode the gzip stream when you set AcceptEncoding to 'gzip, deflate'
No, because there is no decompressor assigned.
Update: that being said, I would probably just drop TRESTClient and use TIdHTTP directly. The following works for me when I try it:
// Requires IdHTTP and IdCompressorZLib (plus IdSSLOpenSSL if the IOHandler
// below has to be assigned explicitly) in the uses clause.
var
  HTTP: TIdHTTP;
  JSON: string;
begin
  HTTP := TIdHTTP.Create;
  try
    // HTTP owns the compressor, so it is freed together with HTTP.
    HTTP.Compressor := TIdCompressorZLib.Create(HTTP);
    // starting with SVN rev 5224, the TIdHTTP.IOHandler property no longer
    // needs to be explicitly set in order to request HTTPS urls. TIdHTTP
    // now creates a default SSLIOHandler internally if needed. But if you
    // are using an older release, you will have to assign the IOHandler...
    //
    // HTTP.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(HTTP);
    //
    JSON := HTTP.Get('https://api.stackexchange.com/2.2/users/511529?site=stackoverflow');
  finally
    HTTP.Free;
  end;
  ShowMessage(JSON);
end;
Displays:
{"items":[{"badge_counts":{"bronze":96,"silver":53,"gold":4},"account_id":240984,"is_employee":false,"last_modified_date":1419235802,"last_access_date":1419293282,"reputation_change_year":15259,"reputation_change_quarter":2983,"reputation_change_month":1301,"reputation_change_week":123,"reputation_change_day":0,"reputation":61014,"creation_date":1290042241,"user_type":"registered","user_id":511529,"accept_rate":100,"location":"Netherlands","website_url":"http://www.eftepedia.nl","link":"https://stackoverflow.com/users/511529/goleztrol","display_name":"GolezTrol","profile_image":"https://www.gravatar.com/avatar/b07c67edfcc5d1496365503712de5c2a?s=128&d=identicon&r=PG"}],"has_more":false,"quota_max":300,"quota_remaining":295}