
Scraping Shopee with requests returns an error with action_type 2

I am trying to scrape Shopee product pages using requests. An example page is https://shopee.co.id/Paha-Fillet-Ayam-Organik-Lacto-Farm-500gr-Paha-Fillet-Segar-Ayam-Probiotik-Organik-Paha-Boneless-Ayam-MPASI-Ayam-Sehat-Ayam-Anti-Alergi-Daging-Ayam-MPASI-i.382368918.8835294847

I noticed that the page loads its data from an API.

My current code is as follows:

import requests

url = "https://shopee.co.id/api/v4/item/get?itemid=8835294847&shopid=382368918"
headers = {
    "x-api-source": "pc",
    "af-ac-enc-dat": "null",
}
response = requests.get(url, headers=headers, verify=True)

The JSON response I get is:

{'tracking_id': '396e3995-dff2-4813-82e7-f7326026d714',
 'action_type': 2,
 'error': 90309999,
 'is_customized': False,
 'is_login': False,
 'platform': 0,
 'report_extra_info': 'eyJlbmNyeXB0X2tleSI6Im.....}

The response headers are as follows:

 {'Server': 'SGW', 'Date': 'Sat, 14 Jan 2023 02:14:33 GMT',
 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked',
 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding',
 'cache-control': 'no-store, max-age=0', 'Content-Encoding': 'gzip'}

Can someone help me understand why the request does not return the expected item JSON?

asked Mar 20 '26 23:03 by Hal


1 Answer

The site does not return the expected JSON because the request is missing a Cross-Site Request Forgery (CSRF) protection token. You will need to add the X-CSRFToken header to the request. The token can usually be retrieved from one of:

the csrftoken cookie
the CSRF token meta tag in the HTML

Shopee has a CSRF token cookie, but at the moment I can't figure out how it gets set (usually the server sends it in a Set-Cookie response header, but Shopee doesn't do that).
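The two lookup places listed above can be sketched as a pair of small helpers. This is only an illustration under assumptions: the cookie name csrftoken and the header name X-CSRFToken come from this answer, while the meta tag name csrf-token is a common convention that has not been verified for Shopee. The helper names find_csrf_token and with_csrf_header are hypothetical, and nothing here confirms that Shopee will accept the resulting request.

```python
import re
from typing import Mapping, Optional

# Assumption: the meta tag is named "csrf-token" (a common convention,
# not verified for Shopee).
_META_RE = re.compile(
    r'<meta[^>]*\bname=["\']csrf-token["\'][^>]*\bcontent=["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_csrf_token(cookies: Mapping[str, str], html: str = "") -> Optional[str]:
    """Return the token from the csrftoken cookie, else from the meta tag, else None."""
    token = cookies.get("csrftoken")
    if token:
        return token
    match = _META_RE.search(html)
    return match.group(1) if match else None


def with_csrf_header(headers: Mapping[str, str], token: Optional[str]) -> dict:
    """Copy the headers and add X-CSRFToken only when a token was found."""
    merged = dict(headers)
    if token:
        merged["X-CSRFToken"] = token
    return merged
```

With requests, the cookie jar of a session can be passed directly, for example find_csrf_token(session.cookies, page.text) after loading the product page with session.get, and the merged headers can then be passed to the API call.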

Edit: I forgot that the site also sends the af-ac-enc-dat header with the https://shopee.co.id/api/v4/item/get?itemid=8835294847&shopid=382368918 request, and I have no idea how to generate it. So I wrote request-interception code using Selenium WebDriver to capture the response to this request, and it works!

Install Selenium Wire to capture requests:

pip install selenium-wire

Code:

from seleniumwire import webdriver
import zlib

site_url = "https://shopee.co.id/Paha-Fillet-Ayam-Organik-Lacto-Farm-500gr-Paha-Fillet-Segar-Ayam-Probiotik-Organik-Paha-Boneless-Ayam-MPASI-Ayam-Sehat-Ayam-Anti-Alergi-Daging-Ayam-MPASI-i.382368918.8835294847"

driver = webdriver.Chrome()
driver.maximize_window()

# define scope to capture only specified url
# raw string so that \? reaches the regex engine without an invalid-escape warning
driver.scopes = [r"https://shopee.co.id/api/v4/item/get\?.*"]

print("starting to capture")

driver.get(site_url)

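# wait (up to 30 s) for a request matching the item API URL instead of
# assuming it has already been captured when driver.get() returns
target_request = driver.wait_for_request(r"/api/v4/item/get\?", timeout=30)

target_response = target_request.response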
target_encoding = target_response.headers["content-encoding"]

target_response_data = target_response.body

if target_encoding:
    if target_encoding == "gzip":
        print("content is encoded")
        # from https://stackoverflow.com/a/2695575
        target_response_data = zlib.decompress(target_response_data, 16 + zlib.MAX_WBITS)
    else:
        raise ValueError("unsupported encoding")

print()
print("found data: ")
print(target_response_data)
print()

print("closing window")

# quit() closes the window and also ends the WebDriver session
driver.quit()

Output:

starting to capture
content is encoded

found data: 
b'{"error":null,"error_msg":null,"data":{"itemid":8835294847,"shopid":382368918,"userid":0,"price_max_before_discount":-1,"has_lowest_price_guarantee":false,"price_before_discount":0,"price_min_before_discount":-1,"exclusive_price_info":null,"hidden_price_display":null,"price_min":7500000000,"price_max":7500000000,"price":7500000000,"stock":50,"discount":null,"historical_sold":12,"sold":0,"show_discount":0,"raw_discount":0,"min_purchase_limit":0,"overall_purchase_limit":{"order_max_purchase_limit":0,"overall_purchase_limit":null,"item_overall_quota":null,"start_date":null,"end_date":null},"pack_size":null,"is_live_streaming_price":null,"show_free_return":null,"name":"Paha Fillet Ayam Organik Lacto Farm 500gr, Paha Fillet Segar, Ayam Probiotik Organik, Paha Boneless | Ayam MPASI | Ayam Sehat | Ayam Anti Alergi | Daging Ayam MPASI"........ cut line'

closing window
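The captured body is still raw bytes, so a natural next step is to decode it and turn it into a dictionary. The following is a minimal sketch using only the standard library. The helper names decode_body and parse_item are hypothetical, and the check for a blocked request simply assumes the two payload shapes shown on this page: a successful response has "error": null plus a "data" object, while the blocked one has a non-null "error" and no "data".

```python
import gzip
import json
import zlib
from typing import Optional


def decode_body(body: bytes, encoding: Optional[str]) -> bytes:
    """Undo the Content-Encoding of a captured response body."""
    if not encoding or encoding.lower() == "identity":
        return body
    encoding = encoding.lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # Handles zlib-wrapped deflate only; raw deflate streams are not covered.
        return zlib.decompress(body)
    raise ValueError(f"unsupported encoding: {encoding}")


def parse_item(raw: bytes) -> dict:
    """Parse the item JSON and return its "data" object, or raise on an error payload."""
    payload = json.loads(raw)
    if payload.get("error") is not None or payload.get("data") is None:
        raise ValueError(f"API returned an error payload: error={payload.get('error')}")
    return payload["data"]
```

In the script above this could be used as item = parse_item(decode_body(target_response.body, target_response.headers["content-encoding"])), after which fields such as item["name"], item["price"] and item["stock"] are available as ordinary Python values.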
answered Mar 22 '26 12:03 by Jurakin


