
Randomly appearing gzip headers

I have a long-running script in a shared hosting environment that outputs a bunch of XML.

Sometimes (only sometimes) a random GZIP header will appear in my output, and the output will be terminated.

For instance:

0000000: 3c44 4553 435f 4c4f 4e47 3e3c 215b 4344  <DESC_LONG><![CD
0000010: 4154 415b 1fc2 8b08 0000 0000 0000 03c3  ATA[............
0000020: b3c3 8b57 c388 c38c 2b28 2d51 48c3 8bc3  ...W....+(-QH...
0000030: 8c49 5528 2e48 4dc3 8e4c c38b 4c4d c391  .IU(.HM..L..LM..
0000040: c3a3 0200 c291 4464 c383 1900 0000 0d0a  ......Dd........

or

0000000: 3c2f 5052 4f44 5543 543e 0d0a 1fc2 8b08  </PRODUCT>......
0000010: 0000 0000 0000 03c3 b3c3 8b57 c388 c38c  ...........W....
0000020: 2b28 2d51 48c3 8bc3 8c49 5528 2e48 4dc3  +(-QH....IU(.HM.
0000030: 8e4c c38b 4c4d c391 c3a3 0200 c291 4464  .L..LM........Dd
0000040: c383 1900 0000 0d0a                      ........

or

0000000: 3c4d 4544 4941 5f55 524c 3e2f 696d 6167  <MEDIA_URL>/imag
0000010: 6573 2f69 6d70 6f72 7465 642f 7374 6f63  es/imported/stoc
0000020: 6b5f 7072 6f64 3235 3339 365f 696d 6167  k_prod25396_imag
0000030: 655f 3531 3737 3439 3436 302e 6a70 673c  e_517749460.jpg<
0000040: 2f4d 4544 4941 5f55 1fc2 8b08 0000 0000  /MEDIA_U........
0000050: 0000 03c3 b3c3 8b57 c388 c38c 2b28 2d51  .......W....+(-Q
0000060: 48c3 8bc3 8c49 5528 2e48 4dc3 8e4c c38b  H....IU(.HM..L..
0000070: 4c4d c391 c3a3 0200 c291 4464 c383 1900  LM........Dd....
0000080: 0000 0d0a                                ....

The switch to GZIP does not seem to hit at any particular time or byte count; it can be after 1MB of data or after 15MB.

The corresponding lines of the compiled Blade template are as follows:

<DESC_LONG><![CDATA[<?php echo $product->display_name; ?>]]></DESC_LONG>

-

</PRICES>
</PRODUCT>
<?php foreach($product->models()->get() as $model): ?>

-

<MEDIA_URL>/images/imported/<?php echo $picture->local_name; ?></MEDIA_URL>

I am at my wits' end. I have tried the following:

  • Disable gzip on the server.
  • Run while(ob_get_level()){ ob_end_clean(); } before running the script.
  • In .htaccess I have tried SetEnv no-gzip 1, SetEnv no-gzip dont-vary and various permutations thereof.

When I visit other pages, no gzip encoding or headers appear, so I'm thinking this is something with the output size or output buffer.

Kristoffer Sall-Storgaard asked Feb 04 '14 09:02


3 Answers

Did you finally find out where these headers come from? I mean, Apache or PHP?

You can simulate the XML generator script with something like:

echo file_get_contents('your_good_test.xml');

If you don't see any gzip headers, I suggest debugging your XML generator. You can try calling header_remove(); before output.

If you do see the headers, you have to debug your web server. Try disabling gzip in Apache with a rewrite rule:

`RewriteRule . - [E=no-gzip:1]`

Whenever there is a proxy or load balancer (nginx, Squid, HAProxy) in front of the server, you automatically get one more place where this can go wrong.
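To help tell the PHP side from the web-server side, a small diagnostic along these lines could be logged at the start of the export. This is only a sketch: the function name compressionDiagnostics is made up for illustration, and it only reports what PHP itself knows (ini settings, active output-buffer handlers, headers PHP has queued), not what Apache or a proxy does afterwards.

```php
<?php
// Hypothetical helper (not from the question): gathers the PHP-side
// settings that can cause output to be gzip-compressed.
function compressionDiagnostics()
{
    // Keep only queued response headers that announce an encoding.
    $encodingHeaders = array_values(array_filter(
        headers_list(),
        function ($header) {
            return stripos($header, 'Content-Encoding:') === 0;
        }
    ));

    return array(
        // "1"/"On" (or a buffer size) means PHP compresses output itself.
        'zlib.output_compression' => ini_get('zlib.output_compression'),
        // A value such as "ob_gzhandler" here also enables compression.
        'output_handler'          => ini_get('output_handler'),
        // Handlers of all currently active output buffers.
        'active_ob_handlers'      => ob_list_handlers(),
        'encoding_headers'        => $encodingHeaders,
    );
}

// Write the result to the error log so it does not pollute the XML output.
error_log(json_encode(compressionDiagnostics()));
```

If all of these come back empty or off and the gzip bytes still show up, that points away from PHP's own output compression and towards the web server, a proxy, or the data itself.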

Ostin answered Nov 15 '22 19:11


Your gzipping is not related to the server output that returns your main XML body; otherwise the whole XML would be compressed.

These calls sometimes return gzip data because the source they fetch their items from is set up to support gzip and is not being queried properly:

$product->display_name
$product->models()->get()
$picture->local_name

Look inside these:

  • Check web calls for all places where headers are set.
  • Temporarily disable compression for the database connection, if any.

Add CDATA tags in all places where binary data could be returned, so that it does not terminate the building of the main XML body. Then wait for an XML file that contains binary data, save the binary data, unzip it and look at what is inside. :-)
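As a rough sketch of the "save it, unzip it, look inside" step, the bytes from the first dump in the question can be unpacked directly. Two things here are my reading of the dump rather than something stated in the question: the gzip magic number appears as 1f c2 8b instead of 1f 8b, which is what binary data looks like after a Latin-1 to UTF-8 conversion, so that conversion is reversed first; and the trailing 0d 0a is treated as a line ending, not as part of the gzip stream. The sketch assumes the mbstring and zlib extensions are available.

```php
<?php
// Bytes copied from the first hex dump in the question, starting at the
// 1f right after "<![CDATA[" and stopping before the final 0d 0a.
$hex = '1fc28b0800000000000003c3'
     . 'b3c38b57c388c38c2b282d5148c38bc3'
     . '8c4955282e484dc38e4cc38b4c4dc391'
     . 'c3a30200c2914464c38319000000';

$mangled = hex2bin($hex);

// Undo the UTF-8 encoding of the binary data (c2 8b -> 8b, c3 b3 -> f3, ...).
$gz = mb_convert_encoding($mangled, 'ISO-8859-1', 'UTF-8');

// A gzip member with no optional fields is: 10-byte header, raw deflate
// data, 8-byte trailer (CRC32 + uncompressed size). Inflate the middle part.
$text = gzinflate(substr($gz, 10, -8));

echo bin2hex(substr($gz, 0, 3)), "\n"; // magic number plus compression method
var_dump($text);
```

For the bytes as transcribed above, the inflated payload is the 25-byte string "No input file specified." followed by a newline, which matches the size field (19 00 00 00) in the trailer. That reads like a server-side error message rather than product data, so it may be worth checking what on the host emits that message.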

Konstantin Ivanov answered Nov 15 '22 19:11


This is more of a set of comments, but it is too long for the comment box.

First, this is very likely NOT an output buffer issue. Even though <![CDATA[ and ]]> are not within PHP tags, that doesn't mean they don't pass through PHP's output buffer. To be clear, anything within a .php file will be placed in the PHP output buffer. The content of a .php file (including static content) is buffered outside of Apache and then passed back to Apache through this buffer when the script is finished. This means that your problem must lie within the code itself, which is a shot in the dark to solve without viewing the code.

My suggestions:

1) Do a search within the script to find any instances of gz functions (gzcompress, gzdeflate, gzdecode, etc.). I have seen scripts compress content if it was greater than a specific size and then decompress the content on the fly when it was taken from the DB. If that is the case, you are likely dealing with a faulty comparison operation. In short, the logic within the compression and decompression conditions is slightly off, so it fails to decompress SOME of the content.

2) Do a search within the script to see how this data is fetched. Is it all from a database? Does any of it come from a stream? Is any of it fetched remotely? These questions might not directly lead to an answer, but they are vital. It can safely be assumed that these variables are being set with data that is already compressed when it shouldn't be. You need to know where, why and how the compression is taking place in order to answer why it is not being decompressed.

3) It matters greatly that it is working as expected on one system but not on the other. The only times I have seen this happen, it was always due to differences in configuration. What operating system is your local machine using? What is the difference in the local database (if any)? What extensions might be missing or present on one machine or the other, possibly causing a function to fall back on a different procedure on the two machines?
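One way to test the assumption in 1) and 2), that a variable already holds compressed data, is to check each value for the gzip magic number before it is echoed into the template. This is a sketch only: looksGzipped and describeSuspectValue are hypothetical names, and where you call them in the export loop is up to you. The second prefix (1f c2 8b) covers the UTF-8-converted form that shows up in the question's dumps.

```php
<?php
// Hypothetical guard: true if the value starts with the gzip magic number,
// either raw (1f 8b) or after a Latin-1 to UTF-8 conversion (1f c2 8b).
function looksGzipped($value)
{
    return strncmp($value, "\x1f\x8b", 2) === 0
        || strncmp($value, "\x1f\xc2\x8b", 3) === 0;
}

// Hypothetical reporter: returns a log line for suspicious values,
// or null when the value looks like ordinary text.
function describeSuspectValue($label, $value)
{
    if (!looksGzipped($value)) {
        return null;
    }
    return sprintf(
        '%s looks gzip-compressed (%d bytes), starts with %s',
        $label,
        strlen($value),
        bin2hex(substr($value, 0, 16))
    );
}

// Example with stand-in values; in the real export these would be things
// like $product->display_name or $picture->local_name.
$samples = array(
    'plain'      => 'Some product name',
    'compressed' => gzencode('Some product name'),
);
foreach ($samples as $label => $value) {
    $message = describeSuspectValue($label, $value);
    if ($message !== null) {
        error_log($message);
    }
}
```

If nothing is ever logged while the gzip bytes still appear in the output, the compressed block is probably not coming from these variables at all.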

EDIT: Also, and this is a small chance, but are you dealing with data that originated from an SQL dump from a different server? You said it works on your local host but not on a different host, so we know you're dealing with two machines. Was there a third at some point? If so, the data might have been compressed using a mismatched version or form of compression, or there might be an issue with encoding.

JSON answered Nov 15 '22 20:11