
Rendering very large HTML file in-browser?

Tags: html, browser

I'm trying to learn Python by working on a fun project: a Facebook message analyzer. I've downloaded my data from Facebook, which includes a set of HTML files. One of these, messages.htm, contains all of my messages. My goal is to parse this HTML file and output fun statistics such as the most common word, the number of messages, etc.
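For the parsing goal itself, the file never has to be rendered or even loaded whole. Below is a minimal sketch, assuming the file is UTF-8 and that counting words across all text nodes is a good enough first pass. The names `TextWordCounter` and `count_words` are made up for illustration, and since the real layout of messages.htm isn't shown here, the sketch makes no attempt to separate threads, senders, or timestamps. It feeds the standard library's `html.parser.HTMLParser` one chunk at a time, so memory use stays small.

```python
from collections import Counter
from html.parser import HTMLParser
import re

WORD_RE = re.compile(r"[a-z']+")


class TextWordCounter(HTMLParser):
    """Counts words in the text between tags; tag names and attributes are ignored."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.words = Counter()
        self._buf = []  # pieces of the current text node

    def handle_data(self, data):
        # A text node can arrive in several pieces when the input is fed
        # in chunks, so buffer it until the next tag.
        self._buf.append(data)

    def _flush(self):
        if self._buf:
            text = "".join(self._buf)
            self._buf.clear()
            self.words.update(WORD_RE.findall(text.lower()))

    def handle_starttag(self, tag, attrs):
        self._flush()

    def handle_endtag(self, tag):
        self._flush()

    def close(self):
        super().close()
        self._flush()


def count_words(path, chunk_size=1 << 20, encoding="utf-8"):
    """Stream the file through the parser, chunk_size characters at a time."""
    parser = TextWordCounter()
    with open(path, encoding=encoding, errors="replace") as f:
        for chunk in iter(lambda: f.read(chunk_size), ""):
            parser.feed(chunk)
    parser.close()
    return parser.words
```

Calling `count_words("messages.htm").most_common(20)` would then list the twenty most frequent words. Note that any text inside `<style>` or `<script>` elements is counted too, so a real analyzer would probably want to skip those.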

The problem is that my messages.htm file is 270 MB. I can inspect it fine in vim, but there are interesting patterns in the file, and I'd like to see the HTML source side by side with how a browser renders it to get a better sense of what's going on. When I try to open the file in Firefox, it crashes. Chrome does open it, but it just keeps loading messages: about 10 minutes in, it still hasn't finished loading a single message thread, and the scroll bar keeps shrinking. So this isn't feasible.

Is it even possible to fully render such a large and long HTML file?
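If the aim is only to compare source and rendering, one workaround is to render a small slice of the file instead of all 270 MB. The sketch below copies roughly the first couple of megabytes into a new file and cuts at the last `>` so that no tag (or multi-byte character) is split. The function name `write_head_sample`, the output filename, and the 2 MB default are illustrative assumptions. The slice will have unclosed elements, but browsers' HTML parsers are error-tolerant and will close them implicitly, so it should still render.

```python
def write_head_sample(src, dst, max_bytes=2_000_000):
    """Copy about the first max_bytes of src to dst, ending on a tag boundary.

    Returns the number of bytes written.
    """
    with open(src, "rb") as f:
        head = f.read(max_bytes)

    # Trim back to the last '>' so the sample does not end mid-tag.
    # '>' is a single ASCII byte, so this also avoids splitting a
    # multi-byte UTF-8 character.
    cut = head.rfind(b">")
    if cut != -1:
        head = head[: cut + 1]

    with open(dst, "wb") as out:
        out.write(head)
    return len(head)
```

For example, `write_head_sample("messages.htm", "messages_sample.htm")` produces a file small enough to open in Firefox or Chrome next to the same region in vim.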

ShaneOH asked Jul 06 '15 08:07



1 Answer

You can use lynx, a text-based browser, to view a large HTML file. I have a 139 MB HTML file and was able to view it very easily with lynx. It divides the entire document into pages and can load any given page very quickly. It also supports hyperlinks, so navigating within the HTML document (which was my use case) worked like a charm.

ignite answered Sep 23 '22 06:09
