Playwright for Python – 1.1 BrowserContexts

  • Post category:python



BrowserContexts

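A BrowserContext provides a way to operate multiple independent browser sessions. If a page opens another page, for example through a window.open call, the popup belongs to the browser context of the parent page.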

from playwright.sync_api import sync_playwright

with sync_playwright() as sp:
    browser = sp.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()

    page.goto('http://www.baidu.com')
    # Clicking a target="_blank" link opens a new page in the same context
    with context.expect_page() as cp:
        page.click('a[target="_blank"]')
    new_page = cp.value
    # Navigate the new page somewhere else
    new_page.goto('http://www.bilibili.com')
    # Both pages report the same BrowserContext
    if page.context is new_page.context:
        print("Same context!")

    context.close()
    browser.close()
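To make the "independent sessions" point concrete, the sketch below sets a cookie in one context and checks whether a second context from the same browser can see it. The cookie name session_id, the URL https://example.com, and the helper names cookie_names and demo_isolation are illustrative assumptions, not part of the original example.

```python
def cookie_names(context):
    """Return the set of cookie names currently stored in a browser context."""
    return {cookie["name"] for cookie in context.cookies()}


def demo_isolation():
    # Imported lazily so cookie_names() can be used without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as sp:
        browser = sp.chromium.launch()
        first = browser.new_context()
        second = browser.new_context()

        # Hypothetical cookie, added to the first context only.
        first.add_cookies([
            {"name": "session_id", "value": "abc", "url": "https://example.com"}
        ])

        print("first:", cookie_names(first))
        print("second:", cookie_names(second))

        first.close()
        second.close()
        browser.close()
```

Calling demo_isolation() should show the cookie in the first context and an empty set for the second, since contexts do not share cookies or storage.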


context.on('event',func)

Going by PyCharm's hints, these are roughly the available events:


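  • close: fired when any of the following happens:
    1. The browser context is closed, e.g. by context.close().
    2. The browser application is closed or crashes.
    3. The browser.close() method is called.

  • page: fired when a new page is created in the context, e.g. by context.new_page().

  • serviceworker: fired when a new service worker is created in the context. A service worker essentially acts as a proxy server between the web application, the browser, and the network (when available); it is commonly used to intercept requests and cache resource files.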
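A minimal sketch of registering listeners for these events. The EventLog class and its method names are hypothetical helpers for illustration; only context.on(event, handler) comes from Playwright.

```python
class EventLog:
    """Collects the names of context events in the order they fire."""

    def __init__(self):
        self.seen = []

    def attach(self, context):
        # Each handler receives the event payload; *args keeps this sketch
        # independent of what exactly is passed for each event.
        for name in ("page", "close", "serviceworker"):
            context.on(name, self._recorder(name))
        return self

    def _recorder(self, name):
        def handler(*args):
            self.seen.append(name)
        return handler


def demo_events():
    # Imported lazily so EventLog can be used without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as sp:
        browser = sp.chromium.launch()
        context = browser.new_context()
        log = EventLog().attach(context)
        context.new_page()   # should fire "page"
        context.close()      # should fire "close"
        browser.close()
        print(log.seen)
```

Keeping the handlers in one small object makes it easy to inspect afterwards which events actually fired.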


browser_context.expect_event(event, **kwargs)

: see context.on() for the event names.


Using the predicate parameter:

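from playwright.sync_api import sync_playwright


def func(page):
    # Predicate: called with the new Page; a truthy return value ends the wait
    print('A new page was created')
    if page:
        print(page)
        return True
    else:
        return None


with sync_playwright() as sp:
    browser = sp.chromium.launch()
    context = browser.new_context()
    print('Before requesting Baidu')
    # Trigger the event inside the with block so the wait can be resolved
    with context.expect_event("page", predicate=func):
        page = context.new_page()
    page.goto("http://www.baidu.com")
    # page.close()
    print('After the request')
    context.close()
    browser.close()
"""
Output:
Before requesting Baidu
A new page was created
<Page url='about:blank'>
After the request
"""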

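If the callback is passed positionally and returns a falsy value instead (the second positional argument of expect_event is still the predicate, and print() returns None), the wait is never satisfied and an error is reported once the context closes: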

from playwright.sync_api import sync_playwright

with sync_playwright() as sp:
    browser = sp.chromium.launch()
    context = browser.new_context()
    # The lambda returns None (falsy), so this wait never completes
    context.expect_event("page", lambda page: print('A new page was created'))
    print('Before requesting Baidu')
    page = context.new_page()
    page.goto("http://www.baidu.com")
    # page.close()
    print('After the request')
    context.close()
    browser.close()
"""
Output:
Before requesting Baidu
A new page was created
After the request
Future exception was never retrieved
future: <Future finished exception=Error('Target page, context or browser has been closed')>
playwright._impl._api_types.Error: Target page, context or browser has been closed
"""

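One way to keep the log message and still let the wait complete is a predicate that returns True explicitly. The sketch below assumes that approach; announce and demo_predicate are hypothetical helper names.

```python
def announce(message):
    """Build a predicate that prints a message and accepts every event."""
    def predicate(event_value):
        print(message, event_value)
        return True  # a truthy result tells expect_event to stop waiting
    return predicate


def demo_predicate():
    # Imported lazily so announce() can be used without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as sp:
        browser = sp.chromium.launch()
        context = browser.new_context()
        with context.expect_event("page", announce("New page:")) as info:
            context.new_page()
        # info.value is the event payload, here the Page that was created
        print(info.value.url)
        context.close()
        browser.close()
```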

context.expect_page()

: From the official documentation: Performs action and waits for a new Page to be created in the context. If predicate is provided, it passes Page value into the predicate function and waits for predicate(event) to return a truthy value. Will throw an error if the context closes before new Page is created.

In other words, you perform the action that opens a new page (for example page.click('link selector')) inside the expect_page() block, and the resulting object lets you drive the new window directly:

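from playwright.sync_api import sync_playwright

with sync_playwright() as sp:
    browser = sp.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto('http://www.baidu.com')

    # 1. By default, clicking a link that opens a new page leaves the main page untouched
    page.click('a[target="_blank"]')
    page.wait_for_timeout(2000)
    print(page.title())  # 百度一下,你就知道

    # 2. Take control of the new page. Popups work the same way; because a popup
    #    belongs to the page that opened it, page.expect_popup() can be used instead
    with context.expect_page() as cp:
        page.click('a[target="_blank"]')
    new_page = cp.value
    # Wait for the new page to finish loading
    new_page.wait_for_load_state()
    print(new_page.title())  # 百度新闻——海量中文资讯平台


    # 3. If the action that opens the new page is not known in advance, register a
    #    general handler; for popups use page.on('popup', func)
    def handle_new_page(opened_page):
        opened_page.wait_for_load_state()
        print(opened_page.title())


    # Every page created in this context from now on is passed to the handler
    context.on('page', handle_new_page)
    page.click('a[target="_blank"]')
    page.wait_for_timeout(2000)  # give the handler time to run

    context.close()
    browser.close()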
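The comments above mention page.expect_popup() for popups opened by a page. A minimal sketch of that pattern, wrapped in a hypothetical helper named click_and_get_popup; the selector is whatever element opens the popup on your page.

```python
def click_and_get_popup(page, selector):
    """Click `selector` on `page` and return the popup page it opens."""
    with page.expect_popup() as popup_info:
        page.click(selector)
    popup = popup_info.value
    popup.wait_for_load_state()  # wait until the popup has finished loading
    return popup
```

With the page from the example above, click_and_get_popup(page, 'a[target="_blank"]').title() would read the title of the newly opened tab.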


context.route()

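: Routing makes it possible to modify the network requests made by any page in the browser context. Once routing is enabled, every request that matches the URL pattern stalls until it is continued, fulfilled, or aborted.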

from playwright.sync_api import sync_playwright
import re

with sync_playwright() as sp:
    browser = sp.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    # Log every response the page receives
    page.on('response', lambda response: print('<!--Response', response.url, response.status))
    # route() intercepts requests whose URL matches the pattern and acts on them;
    # here every matching request is answered with a 404
    context.route(
        re.compile(".*bilibili.*"),
        lambda route: route.fulfill(status=404))
    page.goto("http://www.bilibili.com")

    context.close()
    browser.close()
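Besides fulfilling a request, a handler can also abort it or let it continue. The sketch below blocks image requests for every page in a context; block_images and demo_block_images are hypothetical names, and https://example.com is only a placeholder target.

```python
def block_images(route):
    """Abort image requests and let everything else through."""
    if route.request.resource_type == "image":
        route.abort()
    else:
        route.continue_()


def demo_block_images():
    # Imported lazily so block_images() can be used without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as sp:
        browser = sp.chromium.launch()
        context = browser.new_context()
        # "**/*" matches every URL, so the handler sees all requests.
        context.route("**/*", block_images)
        page = context.new_page()
        page.goto("https://example.com")
        context.close()
        browser.close()
```

Every routed request has to end in exactly one of continue_(), fulfill(), or abort(); a handler that does none of them leaves the request stalled.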


browser_context.unroute(url, **kwargs)

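: Removes a route registered with context.route(). If no handler is passed, all routes for that url are removed.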
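Since route() and unroute() come in pairs, one possible way to use them is a small context manager that guarantees the route is removed again. This is a sketch under that assumption; temporarily_routed is a hypothetical helper, not a Playwright API.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_routed(context, url, handler):
    """Register a route for the duration of a `with` block, then remove it."""
    context.route(url, handler)
    try:
        yield context
    finally:
        # Pass the same url and handler so only this route is removed.
        context.unroute(url, handler)
```

Inside a block such as with temporarily_routed(context, "**/*.png", lambda route: route.abort()), pages in the context have matching requests intercepted; once the block exits, requests pass through unmodified again.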