PHP: is DOMDocument or preg_match_all faster for parsing HTML?

Question

I am wondering which of the two is faster at processing.

Is DOMDocument or preg_match_all (combined with cURL) faster at parsing an HTML page? Also, does DOMDocument leave a trace on the remote server the way cURL does? For example, with cURL we set a user agent to identify who is accessing the page, but DOMDocument seems to have nothing like that.

Andy Lester · Accepted Answer

Does it matter which is faster if one gives you incorrect results?

Matching with regular expressions to get a single bit of data out of the document will be faster than parsing an entire HTML document. But regular expressions cannot parse HTML correctly in all cases.
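To make the correctness point concrete, here is a small sketch comparing the two approaches on the same input. The markup and the pattern are made up for illustration; a more careful pattern would handle this particular input, but then some other edge case would break it.

```php
<?php
// Hypothetical sample markup: one link uses single quotes,
// and one link is commented out.
$html = <<<HTML
<html><body>
<a href="/one">One</a>
<a class="x" href='/two'>Two</a>
<!-- <a href="/ghost">Ghost</a> -->
</body></html>
HTML;

// Naive regex: assumes double-quoted href values and knows nothing
// about comments.
preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $matches);
$regexLinks = $matches[1];

// DOM parser: builds the whole tree, then asks it for the <a> elements.
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$domLinks = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $domLinks[] = $anchor->getAttribute('href');
}

print_r($regexLinks);   // misses /two, picks up the commented-out /ghost
print_r($domLinks);     // /one and /two
```

The regex version is shorter and probably quicker, but it returns a link that is not part of the page and skips one that is.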

See http://htmlparsing.com/regexes.html, which I have started to address this common question. (And for the rest of you reading this, I can use help. The source is on GitHub, and I need examples for many different languages.)

Gordon · Answer

Regular expressions will likely be faster, but they are also likely the worse choice. Unless you have benchmarked and profiled your application and found nothing else to optimize, you should look into a proper existing parser.
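If you do want numbers for your own pages, a rough timing harness could look like the sketch below. The helper name `timeIt`, the sample markup and the iteration count are all assumptions for illustration, and a micro-benchmark like this says nothing about where your application actually spends its time.

```php
<?php
// Hypothetical helper: run a callable $iterations times and return
// the elapsed wall-clock time in seconds.
function timeIt(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Made-up input; substitute a page you actually need to parse.
$html = '<html><body>' . str_repeat('<p><a href="/x">x</a></p>', 200) . '</body></html>';

$regexSeconds = timeIt(function () use ($html) {
    preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $m);
});

$domSeconds = timeIt(function () use ($html) {
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $doc->getElementsByTagName('a');
    libxml_clear_errors();
});

printf("regex: %.4fs, DOM: %.4fs\n", $regexSeconds, $domSeconds);
```

Run it several times and on realistic input before drawing conclusions; the absolute figures will vary by machine and PHP version.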

While regular expressions can be used to match HTML, it takes a thorough effort to come up with a reliable parser. PHP offers a number of native extensions for working with XML (and HTML) reliably. There are also a number of third-party libraries. See my answer to

Best Methods to parse HTML

As for sending a custom user agent, this is possible with DOM too. You have to create a custom stream context and attach it with the underlying libxml functions. You can supply any of the available HTTP Stream context options this way. See my answer to

DOMDocument::validate() problem

for an example of how to supply a custom user agent.
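As a minimal sketch of that approach (not the code from the linked answer): create a stream context with the HTTP options you want and register it with `libxml_set_streams_context()`. The user agent string, the timeout and the `fetchDocument` helper are hypothetical. Note that loading a remote URL through DOMDocument is still an ordinary HTTP request, so the remote server sees and can log it just as it would a cURL request.

```php
<?php
// Hypothetical user agent and timeout; pick values that identify your client.
$context = stream_context_create([
    'http' => [
        'user_agent' => 'MyCrawler/1.0 (+https://example.com/bot)',
        'timeout'    => 10,
    ],
]);

// Every subsequent libxml load or write that goes through PHP streams
// will use this context.
libxml_set_streams_context($context);

// Hypothetical helper: fetch and parse a remote page using the
// context registered above.
function fetchDocument(string $url): DOMDocument
{
    libxml_use_internal_errors(true);   // real-world HTML is rarely clean
    $doc = new DOMDocument();
    $doc->loadHTMLFile($url);
    libxml_clear_errors();
    return $doc;
}

// Usage (performs a network request, so it is left commented out):
// $doc = fetchDocument('http://example.com/');
```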

Tags:

dom

php

mathew

2 Answers

Andy Lester

Gordon
