
How to cancel a wasm process from within a webworker

Tags: c++, javascript, web-worker, webassembly, emscripten

I have a wasm process (compiled from c++) that processes data inside a web application. Let's say the necessary code looks like this:


std::vector<JSONObject> data;
for (size_t i = 0; i < data.size(); i++)
{
    process_data(data[i]);

    if (i % 1000 == 0) {
        bool is_cancelled = check_if_cancelled();
        if (is_cancelled) {
            break;
        }
    }

}

This code basically "runs/processes a query", similar to a SQL query interface.

However, queries may take several minutes to run/process, and the user may cancel their query at any time. The cancellation would occur in the normal javascript/web application, outside of the web worker running the wasm.

My question then is: what would be an example of how we could know that the user has clicked the 'cancel' button, and communicate that to the wasm process so that it knows the query has been cancelled and can exit? Using worker.terminate() is not an option, as we need to keep all the loaded data for that worker and cannot just kill it (it needs to stay alive with its stored data, so another query can be run...).

What would be an example way to communicate here between the javascript and worker/wasm/c++ application so that we can know when to exit, and how to do it properly?

Additionally, let us suppose a typical query takes 60s to run and processes 500MB of data in-browser using cpp/wasm.

Update: I think the following solutions are possible, based on some research (and the initial answers/comments below), with some feedback on each:

1. Use two workers, with one worker storing the data and another worker processing the data. In this way the processing worker can be terminated, and the data will always remain. Feasible? Not really, as it would take way too much time to copy over ~500MB of data to the web worker whenever it starts. This could have been done (previously) using SharedArrayBuffer, but its support is now quite limited/nonexistent due to some security concerns. Too bad, as this seems like by far the best solution if it were supported...
2. Use a single worker using Emterpreter and emscripten_sleep_with_yield. Feasible? No, using Emterpreter destroys performance (as mentioned in its docs) and slows down all queries by about 4-6x.
3. Always run a second worker and in the UI just display the most recent. Feasible? No, it would probably run into quite a few OOM errors if the data is not a shared data structure and its size is 500MB x 2 = 1GB (500MB seems to be a large though acceptable size when running in a modern desktop browser/computer).
4. Use an API call to a server to store the status and check whether the query is cancelled or not. Feasible? Yes, though it seems quite heavy-handed to long-poll with network requests every second from every running query.
5. Use an incremental-parsing approach where only a row at a time is parsed. Feasible? Yes, but it would also require a tremendous amount of rewriting so that every parsing function supports this (the actual data parsing is handled in several functions: filter, search, calculate, group by, sort, etc.).
6. Use IndexedDB and store the state in javascript. Allocate a chunk of memory in WASM, then return its pointer to JavaScript. Then read the database there and fill the pointer. Then process your data in C++. Feasible? Not sure, though this seems like the best solution if it can be implemented.
7. [Anything else?]

For the bounty, then, I am wondering three things:

1. Do the above six analyses seem generally valid?
2. Are there other (perhaps better) approaches I'm missing?
3. Would anyone be able to show a very basic example of doing #6? It seems like that would be the best solution if it's possible and works cross-browser.


asked Aug 05 '19 20:08

David542

2 Answers

For Chrome (only) you may use shared memory (a shared buffer as the wasm memory) and raise a flag in that memory when you want to halt. I'm not a big fan of this solution (it is complex and is supported only in Chrome). It also depends on how your query works, and on whether there are places where the lengthy query can check the flag.
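The flag idea can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not tested against a real wasm build: `requestCancel`, `clearCancel`, `isCancelled` and `runQuery` are hypothetical names, and the loop is written in JavaScript for brevity where the real one would be the C++/wasm loop reading the same shared memory. It assumes the `Int32Array` is handed to the worker once via `postMessage` (a `SharedArrayBuffer` is shared rather than copied), and that the page is allowed to use `SharedArrayBuffer` at all, which current browsers generally tie to cross-origin isolation.

```javascript
// Main-thread side: raise the flag from the cancel button handler.
function requestCancel(flag) {
  Atomics.store(flag, 0, 1);
}

// Either side: clear the flag before starting the next query.
function clearCancel(flag) {
  Atomics.store(flag, 0, 0);
}

// Worker side: plays the role of check_if_cancelled() in the question.
function isCancelled(flag) {
  return Atomics.load(flag, 0) === 1;
}

// Worker side: same shape as the loop in the question, polling every 1000 rows.
// Returns how many rows were processed before finishing or being cancelled.
function runQuery(rows, flag, processRow) {
  let processed = 0;
  for (let i = 0; i < rows.length; i++) {
    if (i % 1000 === 0 && isCancelled(flag)) {
      break;
    }
    processRow(rows[i]);
    processed++;
  }
  return processed;
}

// One 32-bit slot of shared memory is enough for the flag.
const cancelFlag = new Int32Array(new SharedArrayBuffer(4));
```

On the main thread this would be wired up with something like `worker.postMessage({ cancelFlag })` at startup and `requestCancel(cancelFlag)` in the click handler. The worker never has to return to its event loop to notice the cancellation, which is the main attraction of this variant.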

Instead, you should probably call the C++ function multiple times (e.g. once for each query) and check whether you should halt after each call (just send a message to the worker to halt).

What I mean by multiple times is to run the query in stages (multiple function calls for a single query). It may not be applicable in your case.

Regardless, AFAIK there is no way to send a signal to a running WebAssembly execution (as with kill on Linux). Therefore, you'll have to wait for the current operation to finish in order to complete the cancellation.

I'm attaching a code snippet that may explain this idea.


worker.js:

// ... init webassembly

onmessage = function (e) {
	// Query received from the main thread; the payload is in e.data.
	// callWebassembly is a placeholder for the actual webassembly call.
	const result = callWebassembly(e.data);
	postMessage(result);
};

main.js:

const worker = new Worker("worker.js");
// Both flags are reassigned below, so they must be let rather than const.
let cancel = false;
let processing = false;

worker.onmessage = function (e) {
	// Fires when the worker has finished processing the query.
	// e.data is the result of the processing.
	processing = false;

	if (cancel === true) {
		// Processing is done, but the result is not required.
		// Instead of showing the results, report that the query was canceled.
		cancel = false;
		// ... update UI "canceled".
		return;
	}

	// ... update UI "results e.data".
};

function onCancel() {
	// Occurs when the user clicks on the cancel button.
	if (cancel) {
		// sanity test - prevent this in UI.
		throw new Error("already cancelling");
	}

	cancel = true;

	// ... update UI "canceling".
}

function onQuery(q) {
	if (processing === true) {
		// sanity test - prevent this in UI.
		throw new Error("already processing");
	}

	processing = true;
	// Send the query to the worker.
	// When the worker receives the message it will process the query via webassembly.
	worker.postMessage(q);
}

An idea from a user-experience perspective: you could create two workers. This takes twice the memory, but allows you to "cancel" "immediately" once. (In the background, the 2nd worker runs the next query while the 1st finishes the cancelled one; once the 1st has finished, cancellation becomes immediate again.)
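A rough sketch of that bookkeeping, under loud assumptions: `QueryDispatcher` and its methods are hypothetical, each worker is anything with a `postMessage` method that already holds its own copy of the data, and the caller invokes `handleResult` from that worker's `onmessage` handler. A cancelled query keeps running in its worker; its result is simply dropped when it arrives.

```javascript
class QueryDispatcher {
  constructor(workers) {
    // One slot per worker: busy while a query runs, discard if it was cancelled.
    this.slots = workers.map((worker) => ({ worker, busy: false, discard: false }));
    this.active = null; // slot whose result the UI is currently waiting for
  }

  // Starts a query on an idle worker. Returns false if a query is already
  // active or if every worker is still busy finishing a cancelled query.
  query(q) {
    if (this.active) return false;
    const slot = this.slots.find((s) => !s.busy);
    if (!slot) return false;
    slot.busy = true;
    slot.discard = false;
    this.active = slot;
    slot.worker.postMessage(q);
    return true;
  }

  // "Cancels" the active query: the UI is released at once, the worker is not.
  cancel() {
    if (this.active) {
      this.active.discard = true;
      this.active = null;
    }
  }

  // Call from worker.onmessage. Returns the result to display,
  // or null if the query that produced it had been cancelled.
  handleResult(worker, result) {
    const slot = this.slots.find((s) => s.worker === worker);
    slot.busy = false;
    if (slot.discard) return null;
    if (this.active === slot) this.active = null;
    return result;
  }
}
```

With two workers, one cancellation is absorbed immediately; a second cancellation in quick succession leaves no idle worker, so `query` returns false until one of them finishes.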


answered Oct 26 '22 22:10

Tomer

Shared Thread

Since the worker and the C++ function it calls share the same thread, the worker is blocked until the C++ loop has finished, and cannot handle any incoming messages in the meantime. I think a solid option would be to minimize the time the thread is blocked by instead initiating one iteration at a time from the main application.

It would look something like this.


main.js  ->  worker.js  ->  C++ function  ->  worker.js  ->  main.js

Breaking up the Loop

Below, C++ has a variable initialized at 0, which is incremented at each loop iteration and stored in memory. The C++ function then performs one iteration of the loop, increments the variable to keep track of the loop position, and immediately breaks.


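// Counter initialized at 0. It lives outside the function so that its
// value persists between calls.
size_t x = 0;

std::vector<JSONObject> data;

// process_next is a hypothetical name for the function the worker calls.
void process_next()
{
    for (size_t i = x; i < data.size(); i++)
    {
        process_data(data[i]);

        x++;   // increment counter
        break; // stop function until told to iterate again starting at x
    }
}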

Once the function returns, the web worker can post a message to main.js to say that the thread is no longer blocked.

Canceling the Operation

From this point, main.js knows that the web worker thread is no longer blocked, and can decide whether or not to tell the web worker to execute the C++ function again (with the C++ variable keeping track of the loop increment in memory.)


let continueOperation = true
// here you can set to false at any time since the thread is not blocked here

// A Worker object has no callable methods of its own, so this wrapper stands in
// for worker.expensiveThreadBlockingFunction(); 'iterate' is a placeholder message.
function expensiveThreadBlockingFunction() {
    worker.postMessage('iterate')
}

expensiveThreadBlockingFunction()
// results in one iteration of the loop being run; the reply arrives below

worker.onmessage = function(e) {
    if (continueOperation) {
        expensiveThreadBlockingFunction()
        // execute worker function again, ultimately continuing the increment in C++
    } else {
        return false
        // or send message to worker to reset C++ counter to prepare for next execution
    }
}

Continuing the Operation

Assuming all is well, and the user has not cancelled the operation, the loop should continue until finished. Keep in mind you should also send a distinct message for whether the loop has completed, or needs to continue, so you don't keep blocking the worker thread.
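One way to picture the worker side of this, as a sketch only: `createChunkedRunner` and `drive` are hypothetical helpers, written in JavaScript for brevity where the real loop would live in C++. The runner processes a chunk of rows per call rather than a single row, a deviation from the single-iteration version above that is assumed here to keep the number of message round trips down. Each call to `step()` returns a distinct status, so the caller can tell "continue" from "done".

```javascript
function createChunkedRunner(rows, processRow, chunkSize) {
  let next = 0; // loop position, kept between calls like the C++ counter

  return {
    // Processes up to chunkSize rows, then returns so the thread is unblocked.
    step() {
      const end = Math.min(next + chunkSize, rows.length);
      for (; next < end; next++) {
        processRow(rows[next]);
      }
      return { status: next >= rows.length ? "done" : "continue", next };
    },

    // Rewinds the counter, e.g. after a cancellation, ready for the next query.
    reset() {
      next = 0;
    },
  };
}

// Stand-in for the main.js <-> worker.js ping-pong, run synchronously here.
// shouldContinue plays the role of continueOperation in main.js.
function drive(runner, shouldContinue) {
  let result = runner.step();
  while (result.status === "continue" && shouldContinue()) {
    result = runner.step();
  }
  if (result.status === "continue") {
    // The caller declined to continue: treat it as a cancellation.
    runner.reset();
    return { status: "cancelled", next: result.next };
  }
  return result;
}
```

In a real worker.js, each incoming message would call `step()` once and `postMessage` the returned object, and main.js would reply only while the status is "continue" and the user has not cancelled. The chunk size trades cancellation latency against messaging overhead and would need tuning for the actual workload.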

answered Oct 26 '22 20:10

Nathan Fries
