
Scraping <td> values from a table generated by JavaScript into Python

I've run into a problem with my web app. Here's my code:

@app.route('/addrec', methods=['POST', 'GET'])
def addrec():
    if g.user:
        if request.method == 'POST':
            # UPPER PANE
            payor = request.form['payor']
            receiptno = request.form['OR']
            paymentmethod = request.form['paymentmethod']
            collectiondate = datetime.now()
            message = request.form['message']

            # LOWER PANE
            url_to_scrape = 'http://localhost:5000/form'
            r = requests.get(url_to_scrape)
            soup = BeautifulSoup(r.text, 'html.parser')
            nature_list = []
            for table_row in soup.select("table.inmatesList tr"):
                cells = table_row.findAll('td')
                if len(cells) > 0:
                    nature = cells[0].text.strip()
                    natureamt = cells[1].text.strip()
                    nature_list.append({'nature': nature, 'amount': natureamt})
            ent = Entry(receiptno, payor, officer, paymentmethod,
                        collectiondate, message, nature_list)
            add_entry(ent)
            actions = "Applied"

            return redirect(url_for('form'))

    return redirect(url_for('home'))

As you can see, I get each of the values from my form and scrape the values in my table using BeautifulSoup. However, after I click the submit button, the page loads forever. I get the values from the upper pane, but not from the table.

By the way, I generate my table cells from a JavaScript function on click, in case my JavaScript might be the problem. Or maybe there's an easy way to get these values from the JavaScript functions into Python. Here's my JavaScript code and HTML:

<script type="text/javascript">
    function deleteRow(o) {
        var p = o.parentNode.parentNode;
        p.parentNode.removeChild(p);
    }

    function addRow() {
        var table = document.getElementById("datatable"),
            newRow = table.insertRow(table.rows.length),
            cell1 = newRow.insertCell(0),
            cell2 = newRow.insertCell(1),
            cell3 = newRow.insertCell(2),
            name = document.getElementById("form").value,
            amount = document.getElementById("amount").value,
            delete1 = '<input type="button" class="btn btn-danger glyphicon glyphicon-trash" value="Delete" onclick="deleteRow(this)">';

        cell1.innerHTML = name;
        cell2.innerHTML = amount;
        cell3.innerHTML = delete1;

        findTotal();
    }

    function findTotal() {
        var arr = document.querySelectorAll("#datatable td:nth-child(2)");
        var tot = 0;

        for (var i = 0; i < arr.length; i++) {
            var enteredValue = Number(arr[i].textContent);
            if (enteredValue)
                tot += enteredValue;
        }
        document.getElementById('total').value = tot;
    }
</script>

HTML:

<form name="noc">
    <input class="form-control input-lg" id="form" list="languages" placeholder="Search" type="text" required>
    <br>
    <input class="form-control input-lg" id="amount" list="languages" placeholder="Amount" type="number" required>
    <br>
    <button onclick="addRow(); return false;">Add Item</button>
</form>

<table id="datatable" class="table table-striped table-bordered" cellspacing="0" width="100%">
    <thead>
        <tr>
            <th>Nature of Collection</th>
            <th>Amount</th>
            <th></th>
        </tr>
    </thead>
    <tbody>
    </tbody>
</table>

I expect these scraped values to be saved to my database, in a cell. If possible I would like the list to be inserted into a single column so I can retrieve it later.

Or is there a cleaner, better way to get these lists into my database? Any help is appreciated. Thank you!
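(For reference: a common pattern for that "cleaner way" is to skip scraping entirely and have the page serialize the table rows into a hidden form field before submit. The sketch below assumes a hypothetical hidden input named `rows` whose value the client sets with `JSON.stringify`; the field name and helper are illustrative, not part of the original code.)

```python
import json


def rows_from_form(form):
    """Parse the JSON-encoded table rows out of submitted form data.

    Assumes the client did something like:
        document.getElementById("rows").value = JSON.stringify(rowList);
    before submitting the form.
    """
    raw = form.get("rows", "[]")
    return json.loads(raw)


# In the Flask view, this would replace the requests/BeautifulSoup step:
# nature_list = rows_from_form(request.form)
```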

asked Dec 15 '17 by Brix


1 Answer

It looks like you're using requests to try to get data generated by JavaScript. That isn't going to work: requests only fetches the raw HTML the server sends, so the JavaScript never runs and the rows added by addRow() never exist in the response. You should be able to get the data by using Selenium or something similar to automate a real browser. Otherwise, I don't think you're going to be able to scrape it like this. But if someone knows a way to get JS-generated data with requests alone, please post it.
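A minimal sketch of that idea: let a browser execute the page's JavaScript, then hand the rendered HTML to BeautifulSoup. The Selenium lines are left commented because they assume a local chromedriver install; the parsing helper works on any HTML string with the question's `#datatable` markup.

```python
from bs4 import BeautifulSoup


def parse_rows(html):
    """Collect {'nature', 'amount'} dicts from the #datatable rows."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("#datatable tr"):
        cells = tr.find_all("td")
        if len(cells) >= 2:
            rows.append({"nature": cells[0].text.strip(),
                         "amount": cells[1].text.strip()})
    return rows


# Browser automation part (assumes selenium and chromedriver are installed):
# from selenium import webdriver
# driver = webdriver.Chrome()
# driver.get("http://localhost:5000/form")
# nature_list = parse_rows(driver.page_source)  # HTML *after* JS ran
# driver.quit()
```

The key difference from requests is `driver.page_source`: it returns the DOM after the browser has executed addRow(), so the `<td>` cells actually exist when BeautifulSoup parses them.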

answered Oct 18 '22 by SuperStew