 

Parsing XML / RSS from URL using JavaScript


Hi, I want to parse XML/RSS from a live URL like http://rss.news.yahoo.com/rss/entertainment using pure JavaScript (not jQuery). I have googled a lot, but nothing worked for me. Can anyone help with a working piece of code?

Arun Kumar Munusamy asked Nov 23 '11 06:11



2 Answers

(You cannot have googled a lot.) Once you have worked around the Same Origin Policy, and if the resource is served with an XML MIME type (which it is in this case, text/xml), you can do the following:

var x = new XMLHttpRequest();
x.open("GET", "http://feed.example/", true);
x.onreadystatechange = function () {
  if (x.readyState == 4 && x.status == 200)
  {
    var doc = x.responseXML;
    // …
  }
};
x.send(null);

(See also AJAX, and the XMLHttpRequest Level 2 specification [Working Draft] for other event-handler properties.)
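If you need the request in more than one place, the snippet above can be wrapped in a small function that also reports failures. This is only a sketch: fetchXml, onSuccess and onError are names made up for illustration, and it still assumes the Same Origin Policy has been dealt with.

```javascript
// Hypothetical helper: fetchXml and its callback parameters are illustrative names.
function fetchXml(url, onSuccess, onError) {
  var x = new XMLHttpRequest();
  x.open("GET", url, true);
  x.onreadystatechange = function () {
    if (x.readyState != 4) {
      return; // request has not finished yet
    }
    if (x.status == 200 && x.responseXML) {
      onSuccess(x.responseXML); // already a parsed XML Document
    } else {
      onError(x.status); // HTTP error, or the body was not recognized as XML
    }
  };
  x.send(null);
}

// Usage (in a browser):
// fetchXml("http://feed.example/", function (doc) { /* … */ }, function (status) { /* … */ });
```

Checking responseXML as well as the status covers the case where the server answers 200 but the body is not served or recognized as XML.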

In essence: No parsing necessary. If you then want to access the XML data, use the standard DOM Level 2+ Core or DOM Level 3 XPath methods, e.g.

/* DOM Level 2 Core */
var title = doc.getElementsByTagName("channel")[0].getElementsByTagName("title")[0].firstChild.nodeValue;

/* DOM Level 3 Core */
var title = doc.getElementsByTagName("channel")[0].getElementsByTagName("title")[0].textContent;

/* DOM Level 3 XPath (not using namespaces); 0 is XPathResult.ANY_TYPE, and iterateNext() returns the Text node */
var title = doc.evaluate('//channel/title/text()', doc, null, 0, null).iterateNext().nodeValue;

/* DOM Level 3 XPath (using namespaces) */
var namespaceResolver = (function () {
  var prefixMap = {
    media: "http://search.yahoo.com/mrss/",
    ynews: "http://news.yahoo.com/rss/"
  };

  return function (prefix) {
    return prefixMap[prefix] || null;
  };
}());

/* iterateNext() returns the Attr node here; .value is its string value */
var url = doc.evaluate('//media:content/@url', doc, namespaceResolver, 0, null).iterateNext().value;

(See also JSX:xpath.js for a convenient, namespace-aware DOM 3 XPath wrapper that does not use jQuery.)
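The examples above read a single value. For a feed you usually want every entry, so here is a rough sketch using only DOM Level 2 Core methods. It assumes the usual RSS 2.0 layout (item elements with title and link children); textOf and listItems are names made up for illustration.

```javascript
// Hypothetical helper: text content of the first <tagName> descendant, or "" if absent or empty.
function textOf(parent, tagName) {
  var el = parent.getElementsByTagName(tagName)[0];
  return el && el.firstChild ? el.firstChild.nodeValue : "";
}

// Hypothetical helper: collect title and link of every <item> in an RSS document.
function listItems(doc) {
  var items = doc.getElementsByTagName("item");
  var result = [];
  for (var i = 0; i < items.length; i++) {
    result.push({
      title: textOf(items[i], "title"),
      link: textOf(items[i], "link")
    });
  }
  return result;
}
```

Call listItems(x.responseXML) inside the readystatechange handler. The guard in textOf avoids a TypeError when an item lacks one of the elements.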

However, if for some (wrong) reason the MIME type is not an XML MIME type, or if it is not recognized by the DOM implementation as such, you can use one of the parsers built into recent browsers to parse the responseText property value. See pradeek's answer for a solution that works in IE/MSXML. The following should work everywhere else:

var parser = new DOMParser();
var doc = parser.parseFromString(x.responseText, "text/xml");

Proceed as described above.

Use feature tests at runtime to determine the correct code branch for a given implementation. The simplest way is:

if (typeof DOMParser != "undefined")
{
  var parser = new DOMParser();
  // …
}
else if (typeof ActiveXObject != "undefined")
{
  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  // …
}

See also DOMParser and HTML5: DOM Parsing and Serialization (Working Draft).
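Filled in, the two branches above might look like the following sketch. parseXml is a made-up name. The parsererror check reflects how the DOMParser implementations I know of report malformed XML, and the loadXML return value is MSXML behavior; treat both as assumptions to verify in your target browsers.

```javascript
// Hypothetical helper combining both feature-test branches.
function parseXml(text) {
  var doc;
  if (typeof DOMParser != "undefined") {
    doc = new DOMParser().parseFromString(text, "text/xml");
    // Malformed XML is usually reported through a <parsererror> element, not an exception.
    if (doc.getElementsByTagName("parsererror").length > 0) {
      throw new Error("XML could not be parsed");
    }
    return doc;
  }
  if (typeof ActiveXObject != "undefined") {
    doc = new ActiveXObject("Microsoft.XMLDOM");
    doc.async = false; // make loadXML() finish before returning
    if (!doc.loadXML(text)) {
      throw new Error("XML could not be parsed");
    }
    return doc;
  }
  throw new Error("No XML parser available");
}
```

Then var doc = parseXml(x.responseText); and proceed with the DOM methods as before.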

PointedEars answered Oct 05 '22 23:10


One big problem you might run into is that, generally, you cannot fetch data across domains. This is a big issue with most RSS feeds.

The common way to deal with loading data cross-domain in JavaScript is called JSONP. Basically, this means that the data you are retrieving is wrapped in a call to a JavaScript callback function. You load the URL with a script tag, and you define the function in your code. So when the script loads, it executes the function and passes the data to it as an argument.
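Without a library, the client side of that could be sketched roughly like this. The function names are made up, and the query parameter name "callback" is an assumption; check what your data source actually expects.

```javascript
// Hypothetical helper: append the callback name to the URL.
// "callback" as the parameter name is an assumption about the service.
function buildJsonpUrl(baseUrl, callbackName) {
  var separator = baseUrl.indexOf("?") == -1 ? "?" : "&";
  return baseUrl + separator + "callback=" + encodeURIComponent(callbackName);
}

// Hypothetical helper: define the global callback, then load the URL with a script tag.
function loadJsonp(baseUrl, callbackName, onData) {
  window[callbackName] = function (data) {
    window[callbackName] = undefined; // one-shot: clear the global again
    onData(data);
  };
  var script = document.createElement("script");
  script.src = buildJsonpUrl(baseUrl, callbackName);
  document.getElementsByTagName("head")[0].appendChild(script);
}

// Usage (in a browser):
// loadJsonp("http://feed.example/data", "handleFeed", function (data) { /* … */ });
```

The server is expected to answer with something like handleFeed({...}); which the browser then runs as ordinary script.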

The problem with most xml/rss feeds is that services that only provide xml tend not to provide JSONP wrapping capability.

Before you go any further, check to see if your data source provides a JSON format and JSONP functionality. That will make this a lot easier.

Now, if your data source doesn't provide json and jsonp functionality, you have to get creative.

One relatively easy way to handle this is to use a proxy server. Your proxy runs somewhere under your control and acts as a middleman to get your data. The server loads the XML, and your JavaScript sends its requests to the proxy instead. If the proxy runs on the same domain as your page, you can just use standard XHR (Ajax) requests and you don't have to worry about cross-domain restrictions.

Alternatively, your proxy server can wrap the data in a jsonp callback and you can use the method mentioned above.
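If your proxy happens to be written in JavaScript too, the wrapping step is tiny. A sketch, assuming the proxy has already turned the feed into a plain object (or just a string holding the XML); wrapAsJsonp is a made-up name.

```javascript
// Hypothetical server-side helper: wrap data in a call to the requested callback.
function wrapAsJsonp(callbackName, data) {
  // Only allow a plain identifier, so the request cannot inject arbitrary script.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("Invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}
```

The proxy sends the returned string with a JavaScript MIME type, and the page loads it with a script tag as described above.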

If you are using jQuery, XHR and JSONP requests are built in, which makes the coding very easy. Other common JS libraries should also support these. If you are coding all of this from scratch, it's a little more work but not terribly difficult.

Now, once you get your data, hopefully it's just JSON. Then there's no parsing needed.

However, if you end up having to stick with an XML/RSS version, and if you're using jQuery, you can simply use jQuery.parseXML: http://api.jquery.com/jQuery.parseXML/

Geuis answered Oct 06 '22 01:10