
What is the fastest way to search through XML?

Tags: c#, search, xml

Suppose I have an XML file that I use as a local database, like this:

<root>
 <address>
  <firstName></firstName>
  <lastName></lastName>
  <phone></phone>
 </address>
</root>

I have a couple of questions:
1. What is the fastest way to find the address (or addresses) in the XML where firstName contains 'er', for example?
2. Is it possible to do this without loading the whole XML file into memory?

P.S. I am not looking for alternatives to an XML file. Ideally I need a search whose cost does not depend on the number of addresses in the XML file. But I am a realist, and it seems to me that this is not possible.

Update: I am using .NET 4.
Thanks for the suggestions, but this is more of a scientific task than a practical one. I am probably looking for ways that are faster than LINQ and XmlTextReader.

Andrew Orsich asked Dec 03 '22 03:12


1 Answer

LINQ to XML works well for this:

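// Requires: using System.Linq; using System.Xml.Linq;
XDocument doc = XDocument.Load("myfile.xml");
// The (string) cast yields null for a missing <firstName> instead of throwing.
var addresses = from address in doc.Root.Elements("address")
                let firstName = (string)address.Element("firstName")
                where firstName != null && firstName.Contains("er")
                select address;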
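
On the second question (avoiding loading the whole file): the query above builds the full document in memory first. A minimal sketch of a streaming variant follows, assuming the <root>/<address>/<firstName> layout from the question. The names AddressSearch, StreamAddresses and FindByFirstName are made up for illustration. It combines XmlReader with XNode.ReadFrom so that only one <address> element is materialized at a time. The search is still linear in the number of addresses; only the memory use changes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class AddressSearch
{
    // Yields <address> elements one at a time (hypothetical helper, not from the answer).
    public static IEnumerable<XElement> StreamAddresses(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "address")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so Read() must not be called here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Filters the streamed elements by a substring of <firstName>.
    public static IEnumerable<XElement> FindByFirstName(TextReader input, string fragment)
    {
        foreach (XElement address in StreamAddresses(input))
        {
            // The cast returns null when <firstName> is missing.
            string firstName = (string)address.Element("firstName");
            if (firstName != null && firstName.Contains(fragment))
                yield return address;
        }
    }
}
```

It could be used as `using (var file = File.OpenText("myfile.xml")) foreach (var a in AddressSearch.FindByFirstName(file, "er")) Console.WriteLine(a);`. Because the result is a lazy iterator, the file has to stay open while the matches are enumerated.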

UPDATE: Take a look at this question on Stack Overflow: "Best way to search data in xml files?".

Marc Gravell's accepted answer there suggests indexing the data in SQL:

First: how big are the xml files? XmlDocument doesn't scale to "huge"... but can handle "large" OK.

Second: can you perhaps put the data into a regular database structure (perhaps SQL Server Express Edition), index it, and access via regular TSQL? That will usually out-perform an xpath search. Equally, if it is structured, SQL Server 2005 and above supports the xml data-type, which shreds data - this allows you to index and query xml data in the database without having the entire DOM in memory (it translates xpath into relational queries).
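
For comparison, the in-memory XPath search that the quoted answer measures against might look roughly like this with XmlDocument. This is only a sketch, not code from that answer: the class name XPathSearch and the /root/address path are assumptions based on the sample file in the question.

```csharp
using System;
using System.Xml;

static class XPathSearch
{
    // Returns every <address> whose first <firstName> contains the fragment.
    public static XmlNodeList FindByFirstName(XmlDocument doc, string fragment)
    {
        // Assumption: fragment contains no single quote, because it is
        // embedded directly in the XPath string literal.
        string xpath = "/root/address[contains(firstName, '" + fragment + "')]";
        return doc.SelectNodes(xpath);
    }
}
```

Typical use would be `var doc = new XmlDocument(); doc.Load("myfile.xml");` followed by `XPathSearch.FindByFirstName(doc, "er")`. The whole DOM is held in memory here, which is the scaling limit the quote refers to.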

UPDATE 2: See also another link, taken from that question, which explains how the structure of the XML affects performance: http://www.15seconds.com/issue/010410.htm

as-cii answered Dec 12 '22 23:12