This post shows how to set Selenium's pageLoadStrategy parameter.
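For a newly loaded DOM, the point at which the page starts accepting commands is determined by the page load strategy. In other words, by changing the page load strategy we can make the page accept our commands even while it is still loading, which solves the blocking problem of webdriver.get. Each type of WebDriver has a corresponding set of default capabilities stored in the DesiredCapabilities class, and changing the pageLoadStrategy entry there changes the WebDriver's page load strategy.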
When Selenium loads a page/URL, by default it uses a configuration with pageLoadStrategy set to normal. To make Selenium not wait for the full page load, we can configure the pageLoadStrategy. pageLoadStrategy supports 3 different values as follows:
- normal (full page load)
- eager (interactive)
- none
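The labels in parentheses correspond to the document readiness states in the W3C WebDriver table of page load strategies linked at the end of this post. As a small illustration, the mapping can be written out in plain Python; the names READINESS_BY_STRATEGY and readiness_for are hypothetical and are not part of Selenium:

```python
# Mapping from the W3C WebDriver "table of page load strategies".
# READINESS_BY_STRATEGY and readiness_for are illustrative names only,
# they do not exist in Selenium.
READINESS_BY_STRATEGY = {
    "none": None,            # no readiness state to wait for
    "eager": "interactive",  # wait until document.readyState is "interactive"
    "normal": "complete",    # wait until document.readyState is "complete"
}


def readiness_for(strategy):
    """Return the document.readyState a strategy waits for (None = no wait)."""
    try:
        return READINESS_BY_STRATEGY[strategy]
    except KeyError:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}") from None


for name in ("normal", "eager", "none"):
    print(name, "->", readiness_for(name))
```

This is why the configuration code in this post annotates "normal" with "complete" and "eager" with "interactive".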
Here is the code block to configure the pageLoadStrategy:
Firefox :
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities.FIREFOX.copy()  # copy so the shared class-level dict is not mutated
caps["pageLoadStrategy"] = "normal"  # complete
# caps["pageLoadStrategy"] = "eager"  # interactive
# caps["pageLoadStrategy"] = "none"
driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')
driver.get("http://google.com")
Chrome :
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities.CHROME.copy()  # copy so the shared class-level dict is not mutated
caps["pageLoadStrategy"] = "normal"  # complete
# caps["pageLoadStrategy"] = "eager"  # interactive
# caps["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("http://google.com")
Note: The pageLoadStrategy values normal, eager and none are required by the WebDriver W3C Editor’s Draft, but at the time that note was written the eager value was still a WIP (Work In Progress) within the ChromeDriver implementation. You can find a detailed discussion in “Eager” Page Load Strategy workaround for Chromedriver Selenium in Python.
———ChromeDriver 77.0.3865.10 (2019-08-06)———
Supports Chrome version 77
- Resolved issue 1902: Support eager page load strategy [Pri-2]
- Resolved issue 2809: ChromeDriver waits for all child frames to be ready when pageLoadStrategy != none [Pri-2]
ChromeDriver v77 and later can use the approach shown above to set the eager page load strategy.
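The Firefox and Chrome snippets earlier in this post pass desired_capabilities and executable_path, which belong to the Selenium 3 API. In Selenium 4 the same setting is exposed as the page_load_strategy attribute of the browser's Options object. The following is only a minimal sketch, assuming Selenium 4, Chrome, and a matching ChromeDriver (v77 or later) that Selenium can locate; the helper name start_chrome and the VALID_STRATEGIES tuple are hypothetical:

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def start_chrome(strategy="eager"):
    """Start Chrome with the given page load strategy (hypothetical helper)."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}")

    # Imported lazily so the validation above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.page_load_strategy = strategy  # Selenium 4 attribute on Options
    return webdriver.Chrome(options=options)


# Example usage (needs a local Chrome and ChromeDriver, so it is left commented out):
# driver = start_chrome("eager")
# try:
#     driver.get("http://google.com")  # with "eager", does not wait for images etc.
# finally:
#     driver.quit()
```

Firefox should work the same way with selenium.webdriver.firefox.options.Options and webdriver.Firefox(options=options).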
From the Webdriver specs:
For commands that cause a new document to load, the point at which the command returns is determined by the session’s page loading strategy.
When page loading takes too much time and you need to stop downloading additional subresources (images, CSS, JS, etc.), you can change the pageLoadStrategy through the webdriver.
As of this writing, pageLoadStrategy supports the following values:
- normal: This strategy causes Selenium to wait for the full page load (HTML content and subresources downloaded and parsed).
- eager: This strategy causes Selenium to wait for the DOMContentLoaded event (HTML content downloaded and parsed only).
- none: This strategy causes Selenium to return immediately after the initial page content is fully received (HTML content downloaded).
By default, when Selenium loads a page, it follows the normal pageLoadStrategy.
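PageLoadStrategy offers three options:
- (1) NONE: once the HTML has been downloaded, Selenium returns immediately without waiting for parsing to finish
- (2) EAGER: Selenium waits until the whole DOM tree has been built, i.e. until the DOMContentLoaded event has fired; only the HTML content is downloaded and parsed
- (3) NORMAL: the default case, in which Selenium waits for the whole page to finish loading (download and parsing of the HTML and of subresources such as JS files and images, not including AJAX requests)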
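With none or eager, driver.get may return before the page has reached the state your script needs, so the script has to do its own waiting. One possible way, sketched here under the assumption that polling document.readyState is good enough for your page, is a small helper; wait_for_ready_state is a hypothetical name, and driver is any object that provides Selenium's execute_script method:

```python
import time


def wait_for_ready_state(driver, target="complete", timeout=10.0, poll=0.1):
    """Poll document.readyState until it reaches `target` (hypothetical helper).

    `driver` is any object with Selenium's execute_script() method.
    """
    # document.readyState moves through these values in this order.
    order = ["loading", "interactive", "complete"]
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if order.index(state) >= order.index(target):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"readyState still {state!r} after {timeout}s")
        time.sleep(poll)
```

For example, after starting a driver with the none strategy, calling wait_for_ready_state(driver, target="interactive") approximates what eager would have waited for. Like NORMAL, this check does not cover content loaded later through AJAX.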
Related articles:
The Desired Capabilities implementation.
https://github.com/SeleniumHQ/selenium/blob/master/py/selenium/webdriver/common/desired_capabilities.py
table of page load strategies
https://www.w3.org/TR/webdriver/#dfn-table-of-page-load-strategies