How can I parse a namespace using the SAX parser?

Using a Twitter search URL, e.g. http://search.twitter.com/search.rss?q=android, returns RSS that has an item that looks like:

<item>
      <title>@UberTwiter still waiting for @ubertwitter  android app!!!</title>
      <link>http://twitter.com/meals69/statuses/21158076391</link>
      <description>still waiting for an app!!!</description>
      <pubDate>Sat, 14 Aug 2010 15:33:44 +0000</pubDate>
      <guid>http://twitter.com/meals69/statuses/21158076391</guid>
      <author>Some Twitter User</author>
      <media:content type="image/jpg" height="48" width="48" url="http://a1.twimg.com/profile_images/756343289/me2_normal.jpg"/>
      <google:image_link>http://a1.twimg.com/profile_images/756343289/me2_normal.jpg</google:image_link>
      <twitter:metadata>
        <twitter:result_type>recent</twitter:result_type>
</twitter:metadata>
</item>

Pretty simple. My code parses out everything (title, link, description, pubDate, etc.) without any problems. However, I'm getting null on:

<google:image_link>

I'm using Java to parse the RSS feed. Do I have to handle compound local names differently than I would a simpler local name?

This is the bit of code that parses out Link, Description, pubDate, etc:

@Override
    public void endElement(String uri, String localName, String name)
            throws SAXException {
        super.endElement(uri, localName, name);
        if (this.currentMessage != null){
            if (localName.equalsIgnoreCase(TITLE)){
                currentMessage.setTitle(builder.toString());
            } else if (localName.equalsIgnoreCase(LINK)){
                currentMessage.setLink(builder.toString());
            } else if (localName.equalsIgnoreCase(DESCRIPTION)){
                currentMessage.setDescription(builder.toString());
            } else if (localName.equalsIgnoreCase(PUB_DATE)){
                currentMessage.setDate(builder.toString());
            } else if (localName.equalsIgnoreCase(GUID)){
                currentMessage.setGuid(builder.toString());
            } else if (uri.equalsIgnoreCase(AVATAR)){
                currentMessage.setAvatar(builder.toString());
            } else if (localName.equalsIgnoreCase(ITEM)){
                messages.add(currentMessage);
            } 
            builder.setLength(0);   
        }
    }

startDocument looks like:

@Override
    public void startDocument() throws SAXException {
        super.startDocument();
        messages = new ArrayList<Message>();
        builder = new StringBuilder();

    }

startElement looks like:

@Override
    public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {
        super.startElement(uri, localName, name, attributes);
        if (localName.equalsIgnoreCase(ITEM)){
            this.currentMessage = new Message();
        } 
    }

Tony

asked Aug 14 '10 15:08 by Silvestri


1 Answer

An element like <google:image_link> has the local name image_link and belongs to the namespace bound to the google prefix. You need to make sure that the XML parser is namespace-aware, and then match this element by its namespace URI together with its local name.

For example, several SAX1 interfaces in package org.xml.sax have been deprecated and replaced by SAX2 counterparts that include namespace support (e.g. the SAX1 Parser is deprecated and replaced by the SAX2 XMLReader). Consult the documentation on how to specify the namespace URI or the qualified (prefixed) qName.
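As a rough illustration of the idea, here is a minimal sketch using the standard JAXP SAX API with namespace awareness switched on. The class name NamespaceAwareSaxDemo, the method extractImageLinks, and the GOOGLE_NS value are assumptions made for this example and are not taken from the question; the real URI is whatever the feed's root element declares in its xmlns:google attribute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwareSaxDemo {

    // Assumption: the URI bound to the "google" prefix. Check the
    // xmlns:google declaration on the feed's root element for the real value.
    static final String GOOGLE_NS = "http://base.google.com/ns/1.0";

    // Collects the text of every image_link element in the GOOGLE_NS namespace.
    static List<String> extractImageLinks(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Without this, JAXP parsers may leave uri and localName empty.
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();

        final List<String> links = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Start collecting text afresh for each element.
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Match on namespace URI plus local name, not on the prefix.
                if (GOOGLE_NS.equals(uri) && "image_link".equals(localName)) {
                    links.add(text.toString().trim());
                }
            }
        });
        return links;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss xmlns:google=\"" + GOOGLE_NS + "\"><channel><item>"
                + "<title>still waiting</title>"
                + "<google:image_link>http://a1.twimg.com/profile_images/756343289/me2_normal.jpg</google:image_link>"
                + "</item></channel></rss>";
        System.out.println(extractImageLinks(xml));
    }
}
```

Matching on the URI rather than the google prefix keeps the handler working even if a feed binds the same namespace to a different prefix. If the parser cannot be made namespace-aware, comparing the qName argument against "google:image_link" is a possible fallback, though it depends on the prefix the feed happens to use.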

See also

  • Wikipedia/XML namespace
  • package org.xml.sax
  • saxproject.org - Namespaces

answered Sep 24 '22 10:09 by polygenelubricants