Say I wish to scrape the products on this page (http://shop.coles.com.au/online/national/bread-bakery/fresh/bread#pageNumber=2&currentPageSize=20), but the products are loaded from a POST request. A lot of posts here suggest simulating the request to get dynamic content, but in my case the form data (e.g. catalogId, categoryId) is unknown to me.
I'm wondering, is it possible to get the response after the AJAX call is finished?
You can get the catalogId and the other parameter values needed to make the POST request from the form with id="search":
<form id="search" name="search" action="http://shop.coles.com.au/online/SearchDisplay?pageView=image&catalogId=10576&beginIndex=0&langId=-1&storeId=10601" method="get" role="search">
<input type="hidden" name="storeId" value="10601" id="WC_CachedHeaderDisplay_FormInput_storeId_In_CatalogSearchForm_1">
<input type="hidden" name="catalogId" value="10576" id="WC_CachedHeaderDisplay_FormInput_catalogId_In_CatalogSearchForm_1">
<input type="hidden" name="langId" value="-1" id="WC_CachedHeaderDisplay_FormInput_langId_In_CatalogSearchForm_1">
<input type="hidden" name="beginIndex" value="0" id="WC_CachedHeaderDisplay_FormInput_beginIndex_In_CatalogSearchForm_1">
<input type="hidden" name="browseView" value="false" id="WC_CachedHeaderDisplay_FormInput_browseView_In_CatalogSearchForm_1">
<input type="hidden" name="searchSource" value="Q" id="WC_CachedHeaderDisplay_FormInput_searchSource_In_CatalogSearchForm_1">
...
</form>
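As a rough illustration of reading those values, the sketch below uses only Python's standard library to collect the hidden inputs of a form selected by its id. The names HiddenInputCollector and hidden_fields, and the shortened SAMPLE markup, are made up for this example; inside a spider you would more likely use Scrapy's own selectors.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of hidden inputs inside one form, chosen by id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("type") == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def hidden_fields(html, form_id="search"):
    """Return a dict of the hidden inputs found in the form with the given id."""
    parser = HiddenInputCollector(form_id)
    parser.feed(html)
    return parser.fields


# Shortened copy of the form shown above (illustrative only).
SAMPLE = """
<form id="search" name="search" method="get" role="search">
<input type="hidden" name="storeId" value="10601">
<input type="hidden" name="catalogId" value="10576">
<input type="hidden" name="langId" value="-1">
</form>
"""

fields = hidden_fields(SAMPLE)
print(fields)             # the parameter values as a dict
print(urlencode(fields))  # the same values encoded as form data
```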
Use FormRequest to submit this form.
I'm wondering is it possible to get the response after the ajax call is finished?
Scrapy is not a browser: it does not make additional AJAX requests to load the page, and it has nothing built in to execute JavaScript. You may look into using a real browser and solving the problem at a higher level with the selenium package. There is also the related scrapy-splash project.