
GZIPInputStream to String

I am attempting to convert the gzipped body of an HTTP response to plaintext. I've taken the byte array of this response and wrapped it in a ByteArrayInputStream, which I've then wrapped in a GZIPInputStream. I now want to read the GZIPInputStream and store the final decompressed HTTP response body as a plaintext String.

This code will store the final decompressed contents in an OutputStream, but I want to store the contents as a String:

    public static int sChunk = 8192;

    ByteArrayInputStream bais = new ByteArrayInputStream(responseBytes);
    GZIPInputStream gzis = new GZIPInputStream(bais);
    byte[] buffer = new byte[sChunk];
    int length;
    while ((length = gzis.read(buffer, 0, sChunk)) != -1) {
        out.write(buffer, 0, length);
    }
Matt asked Sep 02 '10 13:09




2 Answers

To decode bytes from an InputStream, you can use an InputStreamReader. Then, a BufferedReader will allow you to read your stream line by line.

Your code will look like:

    ByteArrayInputStream bais = new ByteArrayInputStream(responseBytes);
    GZIPInputStream gzis = new GZIPInputStream(bais);
    InputStreamReader reader = new InputStreamReader(gzis);
    BufferedReader in = new BufferedReader(reader);

    String readed;
    while ((readed = in.readLine()) != null) {
        System.out.println(readed);
    }
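
The loop above prints each line rather than keeping it, and readLine() drops the line terminators. A minimal sketch of collecting the lines into a single String could look like this. The class name GzipLines and the helpers gzip and gunzipLines are made up for illustration, UTF-8 is an assumption about the response encoding, and joining with "\n" is a choice, not something the original response dictates.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipLines {
    // Hypothetical helper: gzips a string so the example has some input.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        // The gzip stream is closed here, so the trailer has been written.
        return bytes.toByteArray();
    }

    // Decompresses the bytes and joins the decoded lines with '\n'.
    // Assumes the content is UTF-8 text, not binary data.
    static String gunzipLines(byte[] responseBytes) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(responseBytes)),
                StandardCharsets.UTF_8))) {
            return in.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(gunzipLines(gzip("first line\r\nsecond line\n")));
    }
}
```

Because the lines are re-joined, the original "\r\n" and the trailing newline are not preserved. If the exact body matters, reading characters or bytes in chunks avoids that loss.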
Vivien Barousse answered Oct 02 '22 23:10



It is better to obtain the response as an InputStream rather than as a byte[]. You can then decompress it with GZIPInputStream, decode it as character data with InputStreamReader, and finally collect the characters into a String with StringWriter.

    String body = null;
    String charset = "UTF-8"; // You should determine it based on response header.

    try (
        InputStream gzippedResponse = response.getInputStream();
        InputStream ungzippedResponse = new GZIPInputStream(gzippedResponse);
        Reader reader = new InputStreamReader(ungzippedResponse, charset);
        Writer writer = new StringWriter();
    ) {
        char[] buffer = new char[10240];
        for (int length = 0; (length = reader.read(buffer)) > 0;) {
            writer.write(buffer, 0, length);
        }
        body = writer.toString();
    }

    // ...
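
If you are stuck with a byte[] as in the question, a rough equivalent is to keep the question's read loop, point it at a ByteArrayOutputStream, and decode the result once at the end. This is only a sketch: the class GzipBody and the methods gzip and gunzip are hypothetical names, and the charset still has to come from somewhere such as the response header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipBody {
    static final int CHUNK = 8192;

    // Hypothetical helper: gzips a string so the example has some input.
    static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(charset));
        }
        return bytes.toByteArray();
    }

    // Decompresses all bytes first, then decodes them in one step.
    // Decoding once at the end avoids splitting a multi-byte character
    // across two chunks.
    static String gunzip(byte[] responseBytes, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzis =
                new GZIPInputStream(new ByteArrayInputStream(responseBytes))) {
            byte[] buffer = new byte[CHUNK];
            int length;
            while ((length = gzis.read(buffer, 0, CHUNK)) != -1) {
                out.write(buffer, 0, length);
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("h\u00e9llo\r\nw\u00f6rld\n", StandardCharsets.UTF_8);
        System.out.print(gunzip(compressed, StandardCharsets.UTF_8));
    }
}
```

Unlike a line-based reader, this keeps line terminators exactly as they were in the response body.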

See also:

  • Java IO tutorial
  • How to use URLConnection to fire/handle HTTP requests

If your final intent is to parse the response as HTML, then I strongly recommend using an HTML parser such as Jsoup. It's then as easy as:

String html = Jsoup.connect("http://google.com").get().html(); 
BalusC answered Oct 03 '22 00:10
