Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to read table data using selenium python?

Following is the table HTML source which seems to be very complex for selenium to read its contents.. Can somebody help me, reading this data into python using selenium?

<div class="general_table">
    <div class="general_s">
        <div class="general_text1">Name</div>
        <div class="general_text2">Abhishek</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Last Name</div>
        <div class="general_text2">Kulkarni</div>
    </div>
    <div class="general_s">
        <div class="general_text1">Phone</div>
        <div class="general_text2"> 13613123</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Cell Phone</div>
        <div class="general_text2">82928091</div>
    </div>         
    <div class="general_s">
        <div class="general_text1">City</div>
        <div class="general_text2"></div>
    </div>
    <div class="general_m">
        <div class="general_text1">Model</div>
        <div class="general_text2"> DELL PERC H700</div>
    </div>
</div>
like image 693
Abhishek Kulkarni Avatar asked Feb 15 '26 02:02

Abhishek Kulkarni


2 Answers

To read this table using selenium webdriver, xpath seems to be the easy way -

I'm do not know python properly so the code might be wrong but the idea seems to be right -

To find out the number of div tags with in the general_table we use the xpath -

driver.find_elements_by_xpath(("//*[@class='general_table']/div") which will return a List with size - 6.

Then you can loop through each of the elements using a loop -

for(int i=1;i<=list.length;i++){
    String text1 = driver.find_element_by_xpath("//*[@class='general_table']/div["+i+"]/div[1]").text;
    String text2 = driver.find_element_by_xpath("//*[@class='general_table']/div["+i+"]/div[2]").text;
}

You can read all the tags in the table by this way.

like image 136
Hari Reddy Avatar answered Feb 16 '26 15:02

Hari Reddy


Use selenium to grab the page source (so you get the real content after all the js/ajax stuff) and something like BeautifulSoup to parse it.

from bs4 import BeautifulSoup

soup = BeautifulSoup("""<div class="general_table">
    <div class="general_s">
        <div class="general_text1">Name</div>
        <div class="general_text2">Abhishek</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Last Name</div>
        <div class="general_text2">Kulkarni</div>
    </div>
    <div class="general_s">
        <div class="general_text1">Phone</div>
        <div class="general_text2"> 13613123</div>
    </div>
    <div class="general_m">
        <div class="general_text1">Cell Phone</div>
        <div class="general_text2">82928091</div>
    </div>         
    <div class="general_s">
        <div class="general_text1">City</div>
        <div class="general_text2"></div>
    </div>
    <div class="general_m">
        <div class="general_text1">Model</div>
        <div class="general_text2"> DELL PERC H700</div>
    </div>
</div>""")

def tags(iterable):
    return filter(lambda x: not isinstance(x, basestring), iterable)

for table in soup.find_all('div', {'class': 'general_table'}):
    for line in tags(table.contents):
        for i, column in enumerate(tags(line.contents)):
            if column.string:
                print column.string.strip(),
            if i:
                print ',',
            else:
                print ':',
        print ''    

Result:

Name : Abhishek , 
Last Name : Kulkarni , 
Phone : 13613123 , 
Cell Phone : 82928091 , 
City : 
Model : DELL PERC H700 , 
like image 44
Paulo Scardine Avatar answered Feb 16 '26 15:02

Paulo Scardine



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!