
C# HTML string -> get length without html

I have a string that contains HTML images, for example:

string str = "There is some nice <img alt='img1' src='img/img1.png' /> images in this <img alt='img2' src='img/img2.png' /> string. I would like to ask you <img alt='img3' src='img/img3.png' /> how Can I can I get the Lenght of the string?";

I would like to get the length of the string without the images, as well as the number of images. So the result should be:

int strLenght = 111;
int imagesCount= 3;

Can you show me the most efficient way to do this, please?

Thanks

Lubos Marek asked Feb 07 '23 03:02


1 Answer

I'd suggest using a real HTML parser, for example HtmlAgilityPack. Then it's simple:

string html = "There is some nice <img alt='img1' src='img/img1.png' /> images in this <img alt='img2' src='img/img2.png' /> string. I would like to ask you <img alt='img3' src='img/img3.png' /> how Can I can I get the Lenght of the string?";

// Requires the HtmlAgilityPack package and "using System.Linq;" for Count().
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
int length = doc.DocumentNode.InnerText.Length;               // 114
int imageCount = doc.DocumentNode.Descendants("img").Count(); // 3

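This is what DocumentNode.InnerText returns for your sample. The length is 114 rather than your expected 111 because the spaces on both sides of each image are kept, which leaves three double spaces: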

There is some nice  images in this  string. I would like to ask you  how Can I can I get the Lenght of the string?
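
To get the 111 that the question expects, one option is to collapse each run of whitespace in the extracted text into a single space before measuring it. This is only a minimal sketch, assuming that single-space collapsing is the normalization you want. The names TextLength and CollapseWhitespace are made up for illustration, and the input is the InnerText output shown above, hard-coded so the example runs without HtmlAgilityPack:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextLength
{
    // Hypothetical helper: replaces every run of whitespace with one space.
    public static string CollapseWhitespace(string text)
    {
        return Regex.Replace(text, @"\s+", " ");
    }

    public static void Main()
    {
        // The text InnerText returns for the sample (note the double spaces
        // left behind where the three <img> tags were).
        string innerText = "There is some nice  images in this  string. I would like to ask you  how Can I can I get the Lenght of the string?";

        string collapsed = CollapseWhitespace(innerText);

        Console.WriteLine(innerText.Length);  // 114
        Console.WriteLine(collapsed.Length);  // 111
    }
}
```

In real code you would pass doc.DocumentNode.InnerText to the helper instead of the hard-coded string. Note that this also collapses tabs and line breaks, which may or may not be what you want for other inputs.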
Tim Schmelter answered Feb 16 '23 01:02