I've just started doing exercises with gzip in Python.
    import gzip
    f=gzip.open('Onlyfinnaly.log.gz','rb')
    file_content=f.read()
    print file_content
I get no output on the screen. As a Python beginner, I'm wondering what I should do to read the content of the file inside the gzip archive. Thank you.
For binary modes, gzip.open(filename, mode) is effectively shorthand for gzip.GzipFile(filename, mode). I prefer the former, as it looks similar to the with open(...) as f: idiom used for opening uncompressed files. To read the entire file, simply use f.read().
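As a minimal sketch of the with-statement form described above, written for Python 3: the file name and its contents here are made up for illustration, and the file is created first only so the example runs on its own.

```python
import gzip
import os
import tempfile

# Hypothetical file name and contents, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.log.gz")

# Create a small gzip file so the example is self-contained.
with gzip.open(path, "wb") as f:
    f.write(b"first line\nsecond line\n")

# Read the whole file back. In "rb" mode, read() returns bytes.
with gzip.open(path, "rb") as f:
    file_content = f.read()

# Decode the bytes to get text (assuming the data is UTF-8).
text = file_content.decode("utf-8")
print(text)
```

The with block closes the file automatically, so no explicit f.close() is needed.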
The Spark documentation clearly specifies that gz files can be read automatically: All of Spark's file-based input methods, including textFile, support running on directories, compressed files, and wildcards as well. For example, you can use textFile("/my/directory"), textFile("/my/directory/*.txt"), and textFile("/my/directory/.
Try gzipping some data through the gzip library like this...
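    import gzip
    # A bytes literal works in both Python 2.6+ and Python 3,
    # where a file opened with 'wb' only accepts bytes.
    content = b"Lots of content here"
    f = gzip.open('Onlyfinnaly.log.gz', 'wb')
    f.write(content)
    f.close()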
... then run your code as posted ...
    import gzip
    f=gzip.open('Onlyfinnaly.log.gz','rb')
    file_content=f.read()
    print file_content
This method worked for me as for some reason the gzip library fails to read some files.
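For Python 3, the write-then-read check suggested earlier might look like the sketch below. It uses text mode ("wt" and "rt") so that read() returns a string directly. The helper name read_gzip_text and the temporary path are hypothetical, introduced only for this example. Treating an empty result as the reason for seeing no output is one possible explanation, not a confirmed diagnosis of the original file.

```python
import gzip
import os
import tempfile


def read_gzip_text(path, encoding="utf-8"):
    """Return the decompressed contents of a gzip file as a string."""
    # "rt" opens the file in text mode, so read() returns str, not bytes.
    with gzip.open(path, "rt", encoding=encoding) as f:
        return f.read()


# Hypothetical path, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.log.gz")

# Write some known content through the gzip library first.
with gzip.open(path, "wt", encoding="utf-8") as f:
    f.write("Lots of content here")

content = read_gzip_text(path)
if content:
    print(content)
else:
    # An empty string means the archive decompressed to nothing,
    # which would produce no visible output when printed.
    print("The file decompressed to an empty string.")
```

If the round trip prints the expected text but the original file still prints nothing, the original archive is worth inspecting on its own, for example by checking its size on disk.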