How do I get the visible text of a web page with Selenium WebDriver, without the HTML tags?
I need something equivalent to HtmlPage.asText() from HtmlUnit.
It is not enough to fetch the markup with WebDriver.getPageSource() and parse it with jsoup, because the page may contain elements hidden by external CSS, and I am not interested in those.
To get the visible text of the page, we can use findElement(By.tagName("body")) to get hold of the body element, then call getText() on it to extract the text from the body tag. WebElement l = driver...
Doing By.tagName("body") (or some other selector to select the top element), then performing getText() on that element will return all of the visible text.
UiPath.Core.Activities.GetVisibleText: Extracts a string and its information from an indicated UI element using the Native screen scraping method. This activity can also be automatically generated when performing screen scraping, along with a...
We can extract text from a web page with Selenium WebDriver using the getText() method and then save it to a text file. getText() returns the text of an element only when that element is displayed (that is, not hidden by CSS).
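The save-to-file step can be sketched in Java, the language the question uses. This is only a minimal sketch using the standard library: the class name VisibleTextSaver, the file name visible-text.txt and the output folder are made up for illustration, and the page text is passed in as a plain String so the example runs without a browser. In a real test the string would come from the getText() call on the body element, as shown in the comment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;

public class VisibleTextSaver {

    // Appends pageText to <baseDir>/<yyyy-MM-dd>/visible-text.txt
    // (hypothetical file name) and returns the path of that file.
    public static Path saveVisibleText(Path baseDir, String pageText) throws IOException {
        // LocalDate.toString() yields an ISO date such as 2024-01-31,
        // which is safe to use as a folder name.
        Path dir = baseDir.resolve(LocalDate.now().toString());
        Files.createDirectories(dir); // no-op if the folder already exists
        Path file = dir.resolve("visible-text.txt");
        Files.write(file, pageText.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return file;
    }

    public static void main(String[] args) throws IOException {
        // With Selenium on the classpath, the text would come from:
        //   String pageText = driver.findElement(By.tagName("body")).getText();
        String pageText = "Example visible text\n"; // placeholder input
        Path saved = saveVisibleText(Paths.get("output"), pageText);
        System.out.println("Saved to " + saved.toAbsolutePath());
    }
}
```

Because the file is opened in append mode, running this several times on the same day accumulates text in one file; swap APPEND for TRUNCATE_EXISTING if each run should overwrite it.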
I can help you with C# Selenium.
With the code below you can grab all the text on a particular page and save it to a text file at a location of your choice.
Make sure you have these using directives:
    using System;
    using System.IO;
    using System.Text;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Support.UI;
After navigating to the page, try this code:
    // Read all visible text from the <body> element.
    IWebElement body = driver.FindElement(By.TagName("body"));
    var result = body.Text;

    // Folder location: one folder per day. A fixed date format avoids the
    // slashes that ToShortDateString() can produce in some cultures.
    var dir = Path.Combine(@"C:\Textfile", DateTime.Now.ToString("yyyy-MM-dd"));

    // If the folder doesn't exist, create it.
    if (!Directory.Exists(dir))
        Directory.CreateDirectory(dir);

    // Append the page text to Copiedtext.txt, creating the file if needed.
    File.AppendAllText(Path.Combine(dir, "Copiedtext.txt"), result);