I am trying the following code:
import requests

headers = {
    'authority': 'www.nseindia.com',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'none',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'en-GB,en;q=0.9',
}

nse = requests.Session()
x = nse.get("https://www.nseindia.com/", headers=headers)
print(x.text)
This code works on my PC, but when I run it on AWS it does not respond.
I have also checked ping www.nseindia.com
and it works.
requests works for other sites such as Google, but not for this specific site on AWS.
On EC2:
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> headers = {
... 'authority': 'www.nseindia.com',
... 'upgrade-insecure-requests': '1',
... 'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
... 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
... 'sec-fetch-site': 'none',
... 'sec-fetch-mode': 'navigate',
... 'sec-fetch-user': '?1',
... 'sec-fetch-dest': 'document',
... 'accept-language': 'en-GB,en;q=0.9',
... }
>>> nse = requests.Session()
>>> nse.get("https://www.nseindia.com/", headers=headers)
The last line produces no output.
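A call that produces no output is often a request waiting indefinitely, because requests applies no timeout unless one is given. The sketch below is one way to make the failure visible; the helper name probe and the timeout values are illustrative assumptions, not part of the original code.

```python
import requests


def probe(session, url, headers=None, timeout=(5, 10)):
    """Return the HTTP status code, or a short label if the request fails.

    `probe` is a hypothetical helper; timeout is (connect, read) in seconds.
    """
    try:
        # Without an explicit timeout, requests may wait forever for a reply.
        response = session.get(url, headers=headers, timeout=timeout)
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.RequestException as exc:
        # Any other transport-level failure (DNS, TLS, connection reset, ...).
        return "error: " + type(exc).__name__
    return response.status_code


# Example call (performs a real network request):
# print(probe(requests.Session(), "https://www.nseindia.com/"))
```

If this returns "timeout" on EC2 but 200 on the PC, the server is most likely accepting the connection and then never answering.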
On my PC:
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> headers = {
... 'authority': 'www.nseindia.com',
... 'upgrade-insecure-requests': '1',
... 'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
... 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
... 'sec-fetch-site': 'none',
... 'sec-fetch-mode': 'navigate',
... 'sec-fetch-user': '?1',
... 'sec-fetch-dest': 'document',
... 'accept-language': 'en-GB,en;q=0.9',
... }
>>> nse = requests.Session()
>>> nse.get("https://www.nseindia.com/", headers=headers)
<Response [200]>
>>>
Problem found:
On EC2:
ping www.nseindia.com
PING www.nseindia.com (23.9.215.115) 56(84) bytes of data.
64 bytes from a23-9-215-115.deploy.static.akamaitechnologies.com (23.9.215.115): icmp_seq=1 ttl=51 time=1.07 ms
64 bytes from a23-9-215-115.deploy.static.akamaitechnologies.com (23.9.215.115): icmp_seq=2 ttl=51 time=1.09 ms
On PC:
ping www.nseindia.com
PING www.nseindia.com (23.35.32.140) 56(84) bytes of data.
64 bytes from a23-35-32-140.deploy.static.akamaitechnologies.com (23.35.32.140): icmp_seq=1 ttl=57 time=65.8 ms
64 bytes from a23-35-32-140.deploy.static.akamaitechnologies.com (23.35.32.140): icmp_seq=2 ttl=57 time=61.5 ms
64 bytes from a23-35-32-140.deploy.static.akamaitechnologies.com (23.35.32.140): icmp_seq=3 ttl=57 time=73.1 ms
The ping goes to a different IP on each machine.
You get a different IP from ping because www.nseindia.com is delivered through the Akamai CDN. So you are pinging a different edge location depending on whether you do this from home/work or from AWS servers.
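To compare what each machine resolves without relying on ping, a small standard-library sketch like the following can be run on both the PC and the EC2 instance. The function name resolve_ipv4 is an assumption made for illustration.

```python
import socket


def resolve_ipv4(hostname, port=443):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) tuple.
    return sorted({info[4][0] for info in infos})


# Example call (performs a real DNS lookup):
# print(resolve_ipv4("www.nseindia.com"))
```

Different results on the two machines are expected with a CDN and do not by themselves indicate a problem.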
What's more, the IP address ranges of AWS are publicly known. It is therefore not uncommon for websites to explicitly block AWS connections to protect against scraping, attacks, or other unwanted access. It seems that nseindia is blocking these AWS IP addresses. It is a known issue, as indicated here and here, for example.
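AWS publishes its address ranges as a JSON document, which is what makes this kind of blocking easy. As a rough sketch, the code below checks whether an address (for example, the EC2 instance's public IP) falls inside those ranges. The function names are assumptions; the URL and the prefixes/ip_prefix layout follow the file AWS publishes.

```python
import ipaddress
import json
import urllib.request

AWS_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def load_aws_ranges(url=AWS_RANGES_URL, timeout=10):
    """Download and parse the published AWS ranges (network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def matching_aws_prefixes(ip, ranges):
    """Return every published prefix in `ranges` that contains `ip`."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        key, field = "prefixes", "ip_prefix"
    else:
        key, field = "ipv6_prefixes", "ipv6_prefix"
    return [
        entry[field]
        for entry in ranges.get(key, [])
        if addr in ipaddress.ip_network(entry[field])
    ]


# Example call (downloads the range file first):
# print(matching_aws_prefixes("<your instance's public IP>", load_aws_ranges()))
```

A non-empty result only shows that the address is identifiable as AWS; whether a given site actually blocks it can only be confirmed by testing.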
The solution is not to use AWS or other popular cloud providers (nseindia also blocks others). You could try to proxy your AWS requests through a commercial VPN, your home/work network, or anything else that is not blacklisted. Sadly, this is a try-and-see approach. You also have to consider the potential legal/ethical issues of bypassing these restrictions.
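If you do have a proxy you are permitted to use, requests can route a session through it. This is only a minimal sketch: the helper name build_proxied_session and the proxy URL are placeholders, and nothing here guarantees that a given proxy is not blocked as well.

```python
import requests


def build_proxied_session(proxy_url):
    """Return a requests.Session that sends HTTP and HTTPS traffic via proxy_url."""
    session = requests.Session()
    # Session.proxies maps URL scheme to the proxy used for that scheme.
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    return session


# Example call (hypothetical proxy address; performs a real network request):
# nse = build_proxied_session("http://proxy.example.com:8080")
# print(nse.get("https://www.nseindia.com/", timeout=(5, 10)).status_code)
```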