Fastest way to parse XML files in C#?

Tags: c#, sql, xml, c#-4.0

I have to load many XML files from the internet. To speed up testing, I downloaded all of them (more than 500 files). They have the following format.

<player-profile>
  <personal-information>
    <id>36</id>
    <fullname>Adam Gilchrist</fullname>
    <majorteam>Australia</majorteam>
    <nickname>Gilchrist</nickname>
    <shortName>A Gilchrist</shortName>
    <dateofbirth>Nov 14, 1971</dateofbirth>
    <battingstyle>Left-hand bat</battingstyle>
    <bowlingstyle>Right-arm offbreak</bowlingstyle>
    <role>Wicket-Keeper</role>
    <teams-played-for>Western Australia, New South Wales, ICC World XI, Deccan Chargers, Australia</teams-played-for>
    <iplteam>Deccan Chargers</iplteam>
  </personal-information>
  <batting-statistics>
    <odi-stats>
      <matchtype>ODI</matchtype>
      <matches>287</matches>
      <innings>279</innings>
      <notouts>11</notouts>
      <runsscored>9619</runsscored>
      <highestscore>172</highestscore>
      <ballstaken>9922</ballstaken>
      <sixes>149</sixes>
      <fours>1000+</fours>
      <ducks>0</ducks>
      <fifties>55</fifties>
      <catches>417</catches>
      <stumpings>55</stumpings>
      <hundreds>16</hundreds>
      <strikerate>96.95</strikerate>
      <average>35.89</average>
    </odi-stats>
    <test-stats>
      .
      .
      .
    </test-stats>
    <t20-stats>
      .
      .
      .    
    </t20-stats>
    <ipl-stats>
      .
      .
      . 
    </ipl-stats>
  </batting-statistics>
  <bowling-statistics>
    <odi-stats>
      <matchtype>ODI</matchtype>
      <matches>378</matches>
      <ballsbowled>58</ballsbowled>
      <runsgiven>64</runsgiven>
      <wickets>3</wickets>
      <fourwicket>0</fourwicket>
      <fivewicket>0</fivewicket>
      <strikerate>19.33</strikerate>
      <economyrate>6.62</economyrate>
      <average>21.33</average>
    </odi-stats>
    <test-stats>
      .
      .
      . 
    </test-stats>
    <t20-stats>
      .
      .
      . 
    </t20-stats>
    <ipl-stats>
      .
      .
      . 
    </ipl-stats>
  </bowling-statistics>
</player-profile>

I am using

XmlNodeList list = _document.SelectNodes("/player-profile/batting-statistics/odi-stats");

And then I loop over this list with foreach:

foreach (XmlNode stats in list)
{
    _btMatchType = GetInnerString(stats, "matchtype"); // returns a null string if the node is not available
    .
    .
    .
    .
    _btAvg = Convert.ToDouble(stats["average"].InnerText);
}

Even though I am loading all the files offline, parsing is very slow. Is there a faster way to parse them? Or is the problem with SQL? I am saving all the data extracted from the XML to a database using DataSets and TableAdapters with an insert command.

EDIT: To use XmlReader, please give some XmlReader code for the above document. For now, I have done this:

void Load(string url) 
{
    _reader = XmlReader.Create(url); 
    while (_reader.Read()) 
    { 
    } 
} 

The available methods of XmlReader are confusing. What I need is to get the batting and bowling stats completely. The batting and bowling stats differ from each other, while the odi, t20, ipl, etc. elements are the same inside both bowling and batting.

SMUsamaShah asked Dec 01 '22 10:12


1 Answer

You can use an XmlReader for fast, forward-only reading.
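A minimal sketch of how that could look for the document in the question, assuming the stats always sit at the nesting depth shown there (player-profile, then batting-statistics or bowling-statistics, then odi-stats and friends, then the individual fields). The class name PlayerStatsReader, the Read method and the "section/format" key are made up for illustration; only XmlReader and its Read, NodeType, Depth, Name and Value members come from System.Xml. Values are kept as raw strings because some of them, such as 1000+ in fours, are not numeric.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class PlayerStatsReader
{
    // Hypothetical helper, not from the original post.
    // Returns a map keyed by "section/format", for example
    // "batting-statistics/odi-stats", whose value maps each
    // field name ("matches", "average", ...) to its raw text.
    public static Dictionary<string, Dictionary<string, string>> Read(XmlReader reader)
    {
        var stats = new Dictionary<string, Dictionary<string, string>>();
        string section = null;                     // current depth-1 stats element, if any
        Dictionary<string, string> fields = null;  // fields of the current depth-2 element
        string field = null;                       // name of the current depth-3 element

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    if (reader.Depth == 1)
                    {
                        // Only the two statistics sections are collected;
                        // personal-information is skipped.
                        bool isStats = reader.Name == "batting-statistics"
                                    || reader.Name == "bowling-statistics";
                        section = isStats ? reader.Name : null;
                        fields = null;
                    }
                    else if (reader.Depth == 2 && section != null)
                    {
                        // odi-stats, test-stats, t20-stats, ipl-stats
                        fields = new Dictionary<string, string>();
                        stats[section + "/" + reader.Name] = fields;
                    }
                    else if (reader.Depth == 3 && fields != null)
                    {
                        field = reader.Name;
                    }
                    break;

                case XmlNodeType.Text:
                    // Text inside a depth-3 element is reported at depth 4.
                    if (reader.Depth == 4 && fields != null && field != null)
                    {
                        fields[field] = reader.Value;
                    }
                    break;

                case XmlNodeType.EndElement:
                    if (reader.Depth == 3)
                    {
                        field = null;
                    }
                    break;
            }
        }

        return stats;
    }
}
```

With a reader created by XmlReader.Create(path) inside a using block, Read(reader)["batting-statistics/odi-stats"]["average"] would give the string 35.89 for the sample file. The loop only ever calls Read(), so it does not depend on whether the file is indented; mixing Read() with ReadElementContentAsString() in the same loop can silently skip elements when there is no whitespace between them.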

Carra answered Dec 07 '22 00:12