
Efficient string manipulation in Javascript

I have a string (HTML content) and an array of position (index) objects. The string length is about 1.6 million characters and there are about 700 position objects.

For example:

var content = '<html><body><div class="c1">this is some text</div>....'
var positions = [{start: 20, end: 25}, {start: 35, end: 37}....]

I have to insert an opening span tag at every start position in the string and a closing span tag at every end position.

What is the most efficient way to do this?

So far I have tried sorting the positions array in reverse order, then looping through it and using slice / substring to insert the tags, e.g.:

content = content.slice(0, endPosition) + "</span>" + content.substring(endPosition);
content = content.slice(0, startPosition) + "<span>" + content.slice(startPosition);

(Notice how I have started the loop from the end in order to avoid messing up the start/end positions).

But this takes about 3 seconds, which seems slow and inefficient to me.

What is a more efficient way to do this?

joshua miller asked Nov 13 '18 10:11




3 Answers

Instead of modifying the big string each time, try accumulating processed "chunks" in a new buffer:

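// positions must be sorted by start and must not overlap
const content = '0123456789'
const positions = [
  [1, 3],
  [5, 7]
]

const buf = []
let lastPos = 0

for (const [s, e] of positions) {
  buf.push(
    content.slice(lastPos, s),  // untouched text before the range
    '<SPAN>',
    content.slice(s, e),        // text inside the range
    '</SPAN>'
  )
  lastPos = e
}

// whatever is left after the last range
buf.push(content.slice(lastPos))

const res = buf.join('')
console.log(res)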
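
The loop above relies on the positions being sorted in ascending order and not overlapping. As a minimal sketch, assuming the `{start, end}` objects from the question may arrive in any order, the same idea can be wrapped in a function that sorts a copy first and rejects overlapping ranges. The name `wrapRanges` and its `open` / `close` parameters are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: wraps each [start, end) range of `content` in a tag pair.
function wrapRanges(content, positions, open = '<span>', close = '</span>') {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...positions].sort((a, b) => a.start - b.start);

  const buf = [];
  let lastPos = 0;
  for (const { start, end } of sorted) {
    // A range that begins before the previous one ended would corrupt the output.
    if (start < lastPos || end < start) {
      throw new RangeError(`overlapping or inverted range: ${start}-${end}`);
    }
    buf.push(content.slice(lastPos, start), open, content.slice(start, end), close);
    lastPos = end;
  }
  buf.push(content.slice(lastPos)); // text after the last range
  return buf.join('');
}

console.log(wrapRanges('0123456789', [{ start: 5, end: 7 }, { start: 1, end: 3 }]));
// prints 0<span>12</span>34<span>56</span>789
```

Sorting 700 items is cheap next to the string work, so the extra step should not change the overall cost much.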
georg answered Oct 07 '22 20:10


We can split the content into an array of characters, insert the <span> and </span> tags in a single loop, and then join the array back into a string:

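var content = '<html><body><div class="c1">this is some text</div>....';
var positions = [{start: 20, end: 25}, {start: 35, end: 37}];
var arr = content.split('');

// Sets give constant-time lookups; indexOf on an array would rescan every position for each character
var arrPositions = {
  starts: new Set(positions.map(p => p.start)),
  ends: new Set(positions.map(p => p.end))
}

var result = arr.map((char, i) => {
  // the closing tag goes first in case one range ends exactly where the next begins
  var prefix = '';
  if (arrPositions.ends.has(i)) prefix += '</span>';
  if (arrPositions.starts.has(i)) prefix += '<span>';
  return prefix + char;
}).join('')

// map never visits index content.length, so close a range that ends at the very end of the string
if (arrPositions.ends.has(arr.length)) result += '</span>';

console.log(result)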
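
Splitting 1.6 million characters into an array creates one element per character. A possible alternative, sketched here under the assumption that every range has `start < end`, is to turn the ranges into a sorted list of tag insertions and slice the string only between them. The function name `insertTags` and the `order` field are hypothetical:

```javascript
// Hypothetical helper: turns ranges into a sorted list of tag insertions.
function insertTags(content, positions) {
  const events = [];
  for (const { start, end } of positions) {
    events.push({ index: start, tag: '<span>', order: 1 });
    events.push({ index: end, tag: '</span>', order: 0 });
  }
  // Sort by index; at equal indexes closing tags (order 0) come before opening tags.
  events.sort((a, b) => a.index - b.index || a.order - b.order);

  const buf = [];
  let lastPos = 0;
  for (const { index, tag } of events) {
    // Copy the text since the previous insertion point, then the tag itself.
    buf.push(content.slice(lastPos, index), tag);
    lastPos = index;
  }
  buf.push(content.slice(lastPos)); // text after the last tag
  return buf.join('');
}

console.log(insertTags('0123456789', [{ start: 1, end: 3 }, { start: 3, end: 6 }]));
// prints 0<span>12</span><span>345</span>6789
```

With 700 ranges this builds about 1,400 events, so the work is driven by the number of ranges rather than the number of characters.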
qiAlex answered Oct 07 '22 21:10


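You can do it in one pass with Array.prototype.reduce(), collecting the pieces in an array and joining them at the end: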

const content = '<div class="c1">It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using Content here, content here, making it look like readable English. Many desktop publishing packages and web page editors now use Lorem Ipsum as their default model text, and a search for lorem ipsum will uncover many web sites still in their infancy. Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like).</div>';
const positions = [{start: 24,end: 40}, {start: 160,end: 202}];
const { array, lastPosition } = positions
  .reduce((a, c) => {
    // text before the range, then the wrapped range itself
    a.array.push(
      content.slice(a.lastPosition, c.start), '<span class="blue">', content.slice(c.start, c.end), '</span>'
    );

    a.lastPosition = c.end;

    return a;
  }, {array: [], lastPosition: 0});

// append the tail after the loop, so an empty positions array still yields the full content
array.push(content.slice(lastPosition));
const result = array.join('');

document.write(result);

CSS:

.blue {color: blue;}
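
To compare an approach like this with the repeated slicing described in the question, a rough timing sketch may help. It uses synthetic data sized like the question's input (an assumption, not the real HTML), and the helper names `repeatedSlicing`, `chunkAndJoin` and `time` are hypothetical. Timings depend heavily on the engine and machine, so none are quoted here:

```javascript
// Synthetic data roughly matching the sizes in the question (assumed, not the real input).
const content = 'x'.repeat(1600000);
const positions = Array.from({ length: 700 }, (_, i) => ({
  start: i * 2000,
  end: i * 2000 + 1000,
}));

// Baseline from the question: insert tags into the full string, last range first.
function repeatedSlicing(text, ranges) {
  for (let i = ranges.length - 1; i >= 0; i--) {
    const { start, end } = ranges[i];
    text = text.slice(0, end) + '</span>' + text.slice(end);
    text = text.slice(0, start) + '<span>' + text.slice(start);
  }
  return text;
}

// Chunk approach from the answers: collect the pieces once, join once.
function chunkAndJoin(text, ranges) {
  const buf = [];
  let lastPos = 0;
  for (const { start, end } of ranges) {
    buf.push(text.slice(lastPos, start), '<span>', text.slice(start, end), '</span>');
    lastPos = end;
  }
  buf.push(text.slice(lastPos));
  return buf.join('');
}

// Hypothetical timing helper; logs elapsed wall-clock milliseconds.
function time(label, fn) {
  const t0 = Date.now();
  const out = fn();
  console.log(label, Date.now() - t0, 'ms');
  return out;
}

const a = time('repeated slicing', () => repeatedSlicing(content, positions));
const b = time('chunk and join', () => chunkAndJoin(content, positions));
console.log('same output:', a === b);
```

Checking that both functions return the same string guards against measuring a faster but incorrect variant.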
Yosvel Quintero Arguelles answered Oct 07 '22 20:10