
Get HTML attribute value as is via JavaScript

I have a website where I feed information to an analytics engine via a meta tag, like so:

<meta property="analytics-track" content="Hey&nbsp;There!">

I am trying to write a script in plain JavaScript (no libraries) that reads the content attribute and returns its value as is. In essence, the result should include the HTML entity rather than transform or strip it.

The reason is that I am using PhantomJS to find pages that have HTML entities in their meta data and remove them, as they screw up my analytics data. For example, I'll have entries for both Hey There! and Hey&nbsp;There! when in fact they are the same page and should not be two separate data points.

The simplest JS I have is this:

document.getElementsByTagName('meta')[4].getAttribute("content")

When I run it in the console, it returns the text in the following format:

"Hey There!"

What I would like it to return is:

"Hey&nbsp;There!"

How can I ensure that the returned data keeps the HTML entity? If that's not possible, is there a way to detect an HTML entity via JavaScript? I tried:

document.getElementsByTagName('meta')[4].getAttribute("content").includes('&nbsp;')

But it returns false.

Adib asked Mar 15 '23 04:03



1 Answer

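Use querySelector to select the element whose property attribute is "analytics-track", outerHTML to get the element as a string, and match with a regex to pull out the content attribute as it appears in that markup. Note that outerHTML re-serializes the element rather than returning the original source. It works for &nbsp; because the serializer writes the non-breaking space (U+00A0) back out as &nbsp;, but most other entities (for example &eacute;) come back as the literal character.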

document.querySelector('meta[property="analytics-track"]').outerHTML.match(/content="([^"]*)"/)[1];
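
If you need this for more than one attribute, the same idea can be wrapped in a small helper. This is only a sketch: the name extractAttribute and the sample markup (including the extra data-x attribute) are made up for illustration, and it assumes the attribute is double-quoted, which is how outerHTML serializes it. It matches [^"]* so the capture stops at the first closing quote even when other quoted attributes follow content.

```javascript
// Hypothetical helper: pull one double-quoted attribute value out of serialized markup.
function extractAttribute(markup, name) {
  // Escape regex metacharacters in the attribute name before building the pattern.
  var safeName = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // [^"]* stops at the first closing quote instead of running to the last one.
  var pattern = new RegExp('\\s' + safeName + '="([^"]*)"');
  var match = markup.match(pattern);
  return match ? match[1] : null;
}

// In a browser or PhantomJS the markup would come from something like
// document.querySelector('meta[property="analytics-track"]').outerHTML.
// Here a literal string stands in for it (data-x is an invented extra attribute).
var sample = '<meta property="analytics-track" content="Hey&nbsp;There!" data-x="1">';
var raw = extractAttribute(sample, 'content');
console.log(raw); // Hey&nbsp;There!
```

The helper returns null when the attribute is missing, so the caller can skip such pages instead of hitting a TypeError on [1].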

See http://jsfiddle.net/sjmcpherso/mz63fnjg/
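
As for the fallback part of the question (detecting the entity): getAttribute returns the decoded value, so by the time your script sees it, &nbsp; has already become the single character U+00A0. That is why includes('&nbsp;') is false. You can test for the character itself instead. A minimal sketch, where hasNbsp and normalizeSpaces are hypothetical names and the decoded string stands in for what getAttribute("content") would return:

```javascript
// U+00A0 is the character that &nbsp; decodes to.
var NBSP = '\u00A0';

// Hypothetical helper: true when a decoded attribute value contains a non-breaking space.
// indexOf is used so this also runs on older engines without String.prototype.includes.
function hasNbsp(value) {
  return value.indexOf(NBSP) !== -1;
}

// Hypothetical helper: replace every non-breaking space with an ordinary space.
function normalizeSpaces(value) {
  return value.replace(/\u00A0/g, ' ');
}

// Stand-in for getAttribute("content") on content="Hey&nbsp;There!".
var decoded = 'Hey\u00A0There!';

console.log(hasNbsp(decoded));                           // true
console.log(decoded.indexOf('&nbsp;') !== -1);           // false, the entity text is gone
console.log(normalizeSpaces(decoded) === 'Hey There!');  // true
```

This only covers &nbsp;. Other entities decode to their own characters, so you would have to check for each character you care about.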

sjm answered Mar 23 '23 01:03
