
Selenium is not loading TikTok pages

I'm implementing a TikTok crawler using Selenium and Scrapy:

start_urls = ['https://www.tiktok.com/trending']
....
def parse(self, response):
    options = webdriver.ChromeOptions()
    from fake_useragent import UserAgent
    ua = UserAgent()
    user_agent = ua.random
    options.add_argument(f'user-agent={user_agent}')
    options.add_argument('--window-size=800,841')  # Chrome expects WIDTH,HEIGHT
    driver = webdriver.Chrome(options=options)  # chrome_options is deprecated
    driver.get(response.url)

The crawler opens Chrome, but the videos do not load.

The same problem also happens with Firefox: the page does not load.

It also happens with a simple standalone Selenium script:

from selenium import webdriver
import time


driver = webdriver.Firefox()
driver.get("https://www.tiktok.com/trending")
time.sleep(10)
driver.close()

driver = webdriver.Chrome()
driver.get("https://www.tiktok.com/trending")
time.sleep(10)
driver.close()
user12512567 asked Oct 16 '22 08:10


People also ask

Does TikTok detect Selenium?

Selenium doesn't work (gets detected) for sites like Instagram and TikTok.

Can sites block Selenium?

Yes. Websites can detect automation through the experimental JavaScript property navigator.webdriver on the navigator interface. If the page is loaded by an automation tool such as Selenium, the value of navigator…
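As a rough illustration of the check described above, the sketch below asks the loaded page for navigator.webdriver through Selenium's execute_script. The helper name looks_automated and the demo function are made up for this example; it assumes Selenium and a local Chrome are installed, and it only shows what a site's own JavaScript could read, not how TikTok actually detects automation.

```python
def looks_automated(driver):
    """Return True if the current page reports navigator.webdriver as truthy."""
    # execute_script runs the JavaScript in the page and returns its result.
    return bool(driver.execute_script("return navigator.webdriver"))


def demo():
    """Hypothetical usage; needs Selenium and a local Chrome installation."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("https://www.tiktok.com/trending")
        print("navigator.webdriver set:", looks_automated(driver))
    finally:
        driver.quit()
```

Calling demo() opens a browser and prints whether the flag is set; looks_automated itself works with any object that exposes execute_script.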

How can I use Selenium without being blocked?

The best way to avoid getting blocked is to use a proxy while running Selenium headlessly. Keep in mind that TikTok and Facebook know something is up the minute you go headless, because they can detect that. If you are not using a proxy, they generally allow a good flow of requests before slowing and eventually shutting down responses…

What is Selenium stealth?

selenium-stealth is a Python package that tries to make Selenium harder to detect. As of now it only supports Selenium with Chrome. According to its description, it can prevent almost all Selenium detection.


1 Answer

Did you try navigating further within the Selenium browser window? If a 404 error appears on the following pages, I have a solution that worked for me:

I simply changed my User-Agent to "Naverbot", which is "allowed" by TikTok's robots.txt file.

(Robots.txt)

After that change, all pages and videos loaded properly.

Other user-agents that are listed under the "allow" segment should work too, if you want to add a rotation.
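A minimal sketch of the rotation idea, assuming you filter candidate user agents against a robots.txt file with the standard-library urllib.robotparser. The robots.txt excerpt and the helper names allowed_agents and user_agent_argument are hypothetical and only illustrate the approach; the real file may list different agents, and being allowed by robots.txt does not guarantee that the site serves the page.

```python
import random
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excerpt for illustration; not TikTok's real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: Naverbot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, candidates, url):
    """Return the candidate user agents that robots_txt permits for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in candidates if parser.can_fetch(agent, url)]


def user_agent_argument(agents):
    """Pick one agent at random and format it as a Chrome command-line flag."""
    return f"--user-agent={random.choice(agents)}"
```

The returned string can be passed to options.add_argument(...) before creating the driver with webdriver.Chrome(options=options). To check the live file instead of the sample text, RobotFileParser also offers set_url() and read().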

zebo answered Oct 20 '22 10:10