
Is there a way to create out of DOM elements in Web Worker?

Context: I have a web application that processes and shows huge log files. They're usually only about 100k lines long, but they can be up to 4 million lines or more. To be able to scroll through a log file (both user-initiated and via JavaScript) and filter the lines with decent performance, I create a DOM element for each line as soon as the data arrives (as JSON via ajax). I found this better for performance than constructing the HTML at the back-end. Afterwards I save the elements in an array and only show the lines that are visible.

For up to 100k lines this takes only a few seconds, but larger files take much longer: up to one minute for 500k lines (not including the download). I wanted to improve the performance even more, so I tried using HTML5 Web Workers. The problem now is that I can't create elements in a Web Worker, not even outside the DOM. So I ended up doing only the JSON-to-HTML conversion in the Web Workers and sending the result to the main thread, where the elements are created and stored in an array. Unfortunately this worsened the performance, and it now takes at least 30 seconds more.

Question: So is there any way, that I'm not aware of, to create DOM elements, outside the DOM tree, in a Web Worker? If not, why not? It seems to me that this can't create concurrency problems, as creating the elements could happen in parallel without problems.

Joren Van Severen asked Aug 05 '13 11:08


7 Answers

Alright, I did some more research with the information @Bergi provided and found the following discussion on the W3C mailing list:

http://w3-org.9356.n7.nabble.com/Limited-DOM-in-Web-Workers-td44284.html

And the excerpt that answers why there is no access to the XML parser or DOM parser in the Web Worker:

You're assuming that none of the DOM implementation code uses any sort of non-DOM objects, ever, or that if it does those objects are fully threadsafe. That's just not the case, at least in Gecko.

The issue in this case is not the same DOM object being touched on multiple threads. The issue is two DOM objects on different threads both touching some global third object.

For example, the XML parser has to do some things that in Gecko can only be done on the main thread (DTD loading, offhand; there are a few others that I've seen before but don't recall offhand).

The discussion also mentions a workaround: using a third-party implementation of the parsers, of which jsdom is an example. With this you even have access to your own separate Document.

Joren Van Severen answered Oct 23 '22 10:10


So is there any way, that I'm not aware of, to create DOM elements, outside the DOM tree, in a Web Worker?

No.

Why not? It seems to me that this can't create concurrency problems, as creating the elements could happen in parallel without problems.

Not for creating them, you're right. But for appending them to the main document - they would need to be sent to a different memory (like it's possible for blobs) so that they're inaccessible from the worker thereafter. However, there's absolutely no Document handling available in WebWorkers.
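
The "different memory" idea can be seen with the object types that do support it. The following is a minimal sketch, assuming a runtime that provides the global `structuredClone` (modern browsers, Node 17+). It transfers an `ArrayBuffer`, which is what `postMessage(data, [buffer])` does for transferable objects: the receiver gets the memory and the sender's handle is detached. DOM nodes are not among the transferable types, so nothing comparable exists for elements.

```javascript
// Transfer an ArrayBuffer the way postMessage(data, [buffer]) would.
const original = new ArrayBuffer(16);
new Uint8Array(original)[0] = 42;

// The `transfer` option moves ownership instead of copying the bytes.
const moved = structuredClone(original, { transfer: [original] });

// After the transfer the sender's buffer is detached and reports length 0.
const senderBytes = original.byteLength;
const receiverFirstByte = new Uint8Array(moved)[0];
console.log(senderBytes, moved.byteLength, receiverFirstByte);
```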

I create a DOM element for each line as soon as the data arrives (in JSON via ajax). Afterwards I save the elements in an array and I only show the lines that are visible.

Constructing over 500k DOM elements is the heavy task. Try to create DOM elements only for the lines that are visible. To improve performance and show the first few lines faster, you might also chunk their processing into smaller units and use timeouts in between. See "How to stop intense Javascript loop from freezing the browser".
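
A rough sketch of the chunking idea, not a definitive implementation: the function name `processInChunks` and its parameters are made up for illustration. It handles a slice of items, then yields with `setTimeout` so the browser can paint and react to input before the next slice. The `schedule` parameter exists only so the yielding strategy can be swapped out.

```javascript
// Process `items` in slices so the event loop can run between slices.
// `schedule` is injectable (hypothetical parameter); by default it yields
// to the browser with a zero-delay timeout.
function processInChunks(items, chunkSize, processItem, onDone,
                         schedule = (fn) => setTimeout(fn, 0)) {
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      processItem(items[index], index);
    }
    if (index < items.length) {
      schedule(runChunk); // yield, then continue with the next slice
    } else if (onDone) {
      onDone();
    }
  }

  runChunk();
}
```

With a chunk size of around a thousand lines, the first screen of the log can be shown after the first slice while the rest is processed in the background.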

Bergi answered Oct 23 '22 09:10


You have to understand the nature of a webworker. Programming with threads is hard, especially if you're sharing memory; weird things can happen. JavaScript is not equipped to deal with any kind of thread-like interleaving.

The approach of webworkers is that there is no shared memory. This obviously leads to the conclusion that you can't access the DOM.
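
To make "no shared memory" concrete, here is a small sketch, assuming the global `structuredClone` is available (modern browsers, Node 17+). `postMessage` serializes its argument with the structured clone algorithm, and `structuredClone` exposes that same algorithm directly, so it can stand in for a message round trip. The message shape is hypothetical.

```javascript
// What the receiving side of postMessage gets is a deep copy.
const message = { line: 1, text: "boot ok", tags: ["info"] };
const received = structuredClone(message);

received.tags.push("seen"); // the receiver mutates only its own copy

const senderTagCount = message.tags.length;
const receiverTagCount = received.tags.length;

// Values the algorithm cannot serialize are rejected with a DataCloneError.
// Functions are one example; in browsers, DOM nodes are rejected the same way.
let rejected = false;
try {
  structuredClone({ callback() {} });
} catch (err) {
  rejected = err.name === "DataCloneError";
}
console.log(senderTagCount, receiverTagCount, rejected);
```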

Halcyon answered Oct 23 '22 09:10


There is no direct way to access the DOM through Web Workers. I recently released @cycle/sandbox. It is still a work in progress, but it shows that with the Cycle JS architecture it is fairly straightforward to declare UI behaviour in the Web Worker. The actual DOM is only touched in the main thread, but event listeners and DOM updates are declared indirectly in the worker, and a synthesized event object is sent when something happens on those listeners. Furthermore, it is straightforward to mount these sandboxed Cycle Components side by side with regular Cycle Components.

http://github.com/aronallen/-cycle-sandbox/

Aron Allen answered Oct 23 '22 10:10


I don't see any reason why you can't construct html strings using web-workers. But I also don't think there would be much of a performance boost.
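
As a sketch of what that could look like: building HTML strings needs no DOM access, so it runs fine in a worker. The entry shape `{ level, text }`, the class names, and the function names below are assumptions for illustration only.

```javascript
// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Turn parsed log entries into one HTML string (hypothetical entry shape).
function linesToHtml(entries) {
  return entries
    .map((e) => `<div class="line ${escapeHtml(e.level)}">${escapeHtml(e.text)}</div>`)
    .join("");
}
```

A worker could post the result of `linesToHtml(entries)` back to the page, which still has to parse the string into elements on the main thread. That parsing step is where most of the cost remains.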

This isn't related to Web Workers, but it relates to the problem you're trying to solve. Here are some things that might help speed things up:

  1. Use DocumentFragments. Add elements to them as the data comes in, and add the fragments to the DOM at an interval (like once a second). This way you don't have to touch the DOM (and incur a redraw) every time a line of text is loaded.

  2. Do loading in the background, and only parse the lines as the user hits the bottom of the scroll area.
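
The first suggestion could be sketched roughly as follows. `appendBatch` is a hypothetical helper; the `doc` parameter defaults to the page's `document` and is there only so the function can be tried with a stand-in object outside a browser.

```javascript
// Append a batch of lines with a single insertion into the live document.
function appendBatch(container, lines, doc = document) {
  const fragment = doc.createDocumentFragment();
  for (const text of lines) {
    const row = doc.createElement("div");
    row.textContent = text; // textContent avoids interpreting the line as HTML
    fragment.appendChild(row);
  }
  container.appendChild(fragment); // one insertion for the whole batch
  return lines.length;
}
```

Calling this from a timer with whatever lines have been buffered since the last call keeps the number of insertions into the live document low.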

posit labs answered Oct 23 '22 08:10


According to https://developer.mozilla.org/en-US/docs/Web/Guide/Performance/Using_web_workers there's no access to the DOM from a web worker unfortunately.

Strille answered Oct 23 '22 09:10


You have a couple of anti-patterns in your design:

  1. Creating a DOM object has considerable overhead, and you are creating potentially millions of them at once.
  2. Trying to get a web worker to manage the DOM is exactly what web workers are not for. They do everything else so the DOM event loop stays responsive.

You can use a cursor pattern to scroll through arbitrarily large sets of data.

  1. The main (DOM) thread posts a message to the worker with the start position and the number of lines requested (the cursor).
  2. The web worker does a random access into the logs and posts back the fetched lines (the cursor data).
  3. The main thread updates an element when the async cursor response event arrives.

This way, the heavy lifting is done by the worker, whose event loop is blocked during the fetch instead of the DOM, resulting in happy non-blocked users marvelling at how smooth all your animations are.
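
The worker side of such a cursor could look roughly like this. It is a sketch under assumptions: the message shape `{ start, count }`, the reply shape, and the name `readCursor` are invented for illustration, and the log is assumed to be already parsed into an array inside the worker.

```javascript
// Return the requested slice of the log, clamped to the available range.
function readCursor(allLines, request) {
  const start = Math.max(0, Math.min(request.start, allLines.length));
  const end = Math.min(start + Math.max(0, request.count), allLines.length);
  return { start, total: allLines.length, lines: allLines.slice(start, end) };
}

// Only wire up the handler when actually running inside a worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  const logLines = []; // filled elsewhere, e.g. after fetching and parsing the log
  self.onmessage = (event) => {
    self.postMessage(readCursor(logLines, event.data));
  };
}
```

On the page, the scroll handler would post `{ start, count }` for the rows currently in view and render only the lines that come back, so the number of DOM elements stays proportional to the viewport rather than to the size of the log.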

Dominic Cerisano answered Oct 23 '22 09:10