Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to query XML data in database column using Linq to SQL and Linq to XML?

XML:

<root>
    <item>
        <href>http://myurl</href>
    </item>
    <item>
        <href>http://myurl2</href>
    </item>
</root>

The XML data is stored in a database table.

Can I build a Linq query which selects the rows, extracts the XML and then, for example, extracts all the href tags? The end result would then be a list of all URLS for all selected rows.

This is my attempt, but it is not giving me what I want - which would be a list of all hrefs for all selected users. I just get a list of empty IEnumerations.

var all = from bm in MYTABLE
        select new { name=bm.SPP_USER_ID, xml=(string) bm.SPP_BOOKMARKS_XML};

var docs = from x in all
        select XDocument.Parse(x.xml);
var href = from h in docs
        select h.Descendants("href");


Solution

There were 2 problems.

-- I guess that the query is only executed when a result is actually demanded. As I progressed from an SQL to an XML query, the resulting query became a mix of SQL and XML and thereby unexecutable. My solution was to force a result by converting the SQL query to a result list. This separated the linq-sql from linq-xml.

var docs = from x in all
        select XDocument.Parse(x.xml);
var docs2 = docs.ToList();  // force result

-- The second problem was that I forgot to add the namespace to my XML query. Once I did that I got the required result

XNamespace ns = "http://acme/bookmarks";
var href = from h in docs2
    select h.Descendants(ns + "href");

Thanks for all your help!

like image 904
paul Avatar asked Jun 14 '11 12:06

paul


1 Answers

Here are the couple of things I tried.

  1. Setup: I created a database with a table MYTABLE, containing 2 columns. Column 1 (SPP_USER_ID) is varchar(255), column 2 (SPP_BOOKMARKS_XML) is xml data type. I added 2 rows, mirroring those that mellamokb used. I created a LINQ to SQL design surface and dragged in this table. My app is just a console app that loads the LINQ to SQL data context and then calls your commands.
  2. I queried directly against the data table, as follows:

            var all = from bm in context.MYTABLEs
                      select new { name = bm.SPP_USER_ID, xml = (string)bm.SPP_BOOKMARKS_XML };
    
            var docs = from x in all
                       select XDocument.Parse(x.xml);
    
            var href = from h in docs
                       select h.Descendants("href");
    

    The result was a collection of IEnumerables. I.e. for each row in the database, I got an IEnumerable containing all of the "href" descendants. So in this scenario, I got two results; the first result was an IEnumerable that contained 2 elements, and the second result was an IEnumerable that contained 1 element. It seems like, from your description, this is at least close to what you want. However, you maybe aren't seeing what you expect just because of the way your query is structured.

    Note that it is rather difficult sometimes to get Visual Studio to actually show you the results of a query like this. I've found that I frequently have to "watch" the item twice before it bothers to fire the result enumeration.

  3. I tried to duplicate the code written in your OP as follows:

            var MYTABLE = (from bm in context.MYTABLEs select bm).ToList();
    
            var all = from bm in MYTABLE
                      select new { name = bm.SPP_USER_ID, xml = (string)bm.SPP_BOOKMARKS_XML };
    
            var docs = from x in all
                       select XDocument.Parse(x.xml);
    
            var href = from h in docs
                       select h.Descendants("href");
    

    When I did this, it resulted in the "xml" value being trashed because it was already an XDocument (?) so casting it to string resulted in all of the tags being stripped out. So, the next two statements failed with an XML parse exception.

like image 171
GalacticCowboy Avatar answered Nov 15 '22 12:11

GalacticCowboy