[Feature]: set_input_files fails when using a remote browser if the file is on the remote machine #39383

@yamanjain

Description

🚀 Feature Request

[Feature]: Support zero-round-trip file uploads for files already on the remote browser's filesystem

When connecting to a remote browser, set_input_files currently expects the file to reside on the machine executing the Playwright script. If the file is already located on the remote machine hosting the browser, there is no native way to instruct the remote browser to simply attach its own local file.

Currently, the only workaround within Playwright is to fetch the file from the remote machine to the script's machine, only to send it straight back to the remote browser over the CDP connection.
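
For illustration, the round trip described above might look roughly like the sketch below. `fetch_remote_bytes` is a hypothetical caller-supplied coroutine (for example an SFTP or HTTP download) and is not part of Playwright; the helper names are also made up for this sketch. The only Playwright call assumed is `set_input_files` with a name/mimeType/buffer payload.

```python
import mimetypes
from pathlib import PureWindowsPath
from typing import Awaitable, Callable


def build_file_payload(remote_path: str, data: bytes) -> dict:
    """Build an in-memory file payload (name, mimeType, buffer) for set_input_files."""
    # PureWindowsPath accepts both '/' and '\\' as separators, so .name works
    # for Windows-style and POSIX-style remote paths alike.
    name = PureWindowsPath(remote_path).name
    mime_type, _ = mimetypes.guess_type(name)
    return {
        "name": name,
        "mimeType": mime_type or "application/octet-stream",
        "buffer": data,
    }


async def upload_via_round_trip(
    locator,
    remote_path: str,
    fetch_remote_bytes: Callable[[str], Awaitable[bytes]],  # hypothetical helper
) -> None:
    # Hop 1: pull the file from the browser's machine to the script's machine.
    data = await fetch_remote_bytes(remote_path)
    # Hop 2: push the same bytes back to the browser through Playwright.
    await locator.set_input_files(build_file_payload(remote_path, data))
```

In this sketch the file content crosses the network twice, which is the cost this request is trying to avoid.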

I have attached proof-of-concept code below, in the form of a test script, to show that this is possible with raw CDP. It would be great to have this built into Playwright so that it benefits from Playwright's native efficiency.

"""
Test file upload via CDP: validates inject_local_file cross-platform.

Same pattern as test_cdp_download.py: connects to an existing Chrome via CDP,
runs tests that exercise the file upload pipeline, reports results.

Usage:
    python test_cdp_upload.py [cdp_url]
    Default: http://localhost:9228
"""

import asyncio
import logging
import re
import sys
import platform as _platform_mod

from playwright.async_api import Page, async_playwright

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")


async def inject_local_file(page: Page, selector: str, file_path: str, max_node_retries: int = 10):
    """
    Bypasses Playwright's virtual file system to inject a file path directly
    into a remote Chrome instance via raw CDP. file_path must exist on the
    machine where the browser runs, not on the machine running this script.

    Called AFTER Playwright has confirmed the element is attached (wait_for in smart_file_upload).
    Uses bounded retries for CDP nodeId resolution (race window between Playwright and CDP).

    Note: setFileInputFiles triggers Chrome's internal events, which frameworks
    like PrimeFaces detect and process immediately (auto-upload + clear input).
    We intentionally do NOT verify files.length afterwards: the input may
    already be cleared by the framework, but the upload has succeeded.
    """
    cdp = await page.context.new_cdp_session(page)
    
    try:
        # 1. Resolve the CDP nodeId with bounded retries (handles DOM mutation race)
        input_node_id = None
        for attempt in range(max_node_retries):
            try:
                doc = await cdp.send("DOM.getDocument")
                root_node_id = doc["root"]["nodeId"]

                node = await cdp.send("DOM.querySelector", {
                    "nodeId": root_node_id,
                    "selector": selector
                })
                
                input_node_id = node.get("nodeId", 0)
                # CDP returns nodeId=0 when the selector matches nothing
                if input_node_id != 0:
                    break
                    
                logging.debug(f"CDP nodeId not found for {selector}, attempt {attempt + 1}/{max_node_retries}")
                await asyncio.sleep(0.2)
                
            except Exception as e:
                error_msg = str(e).lower()
                # Catch specific CDP DOM mutation/stale node errors
                if "could not find node" in error_msg or "node is detached" in error_msg:
                    logging.debug(f"DOM mutation caught for {selector}. Retry {attempt + 1}/{max_node_retries} (Error: {error_msg})")
                    await asyncio.sleep(0.2)
                    continue
                raise
        else:
            raise TimeoutError(f"CDP could not find {selector} after {max_node_retries} retries")

        # 2. Inject the local file path directly into the Chrome process.
        #    setFileInputFiles triggers Chrome's internal input events, which
        #    PrimeFaces/JSF frameworks detect immediately (auto-upload + clear input).
        #    We do NOT check files.length: PrimeFaces may clear the input before
        #    we can read it, but the upload has already succeeded.
        await cdp.send("DOM.setFileInputFiles", {
            "nodeId": input_node_id,
            "files": [file_path]
        })
        logging.info(f"File injected via CDP: {file_path}")

        # Dispatch change event as a safety net for frameworks that don't
        # hook into Chrome's internal events (PrimeFaces usually doesn't need this,
        # but other JSF frameworks might).
        try:
            await page.locator(selector).evaluate("node => node.dispatchEvent(new Event('change', { bubbles: true }))")
        except Exception:
            pass  # Element may have been replaced by framework already

    finally:
        # Always detach the session to free up resources
        await cdp.detach()


def clean_path_by_remote_os(platform, file_path: str) -> str:
    # Normalize path separators for the remote browser's OS
    if platform and platform.lower().startswith("win"):
        file_path = file_path.replace('/', '\\')
        # Collapse double backslashes but preserve UNC prefix (\\server)
        if file_path.startswith('\\\\'):
            file_path = '\\\\' + file_path[2:].replace('\\\\', '\\')
        else:
            file_path = file_path.replace('\\\\', '\\')
    else:
        file_path = file_path.replace('\\', '/')
        # Collapse double slashes
        while '//' in file_path:
            file_path = file_path.replace('//', '/')
    return file_path


async def smart_file_upload(page: Page, selector: str, file_path: str, clean_file_path_by_remote_os: bool, use_set_input_files: bool):
    """
    Uploads a file handling both local and remote browser scenarios.
    If use_set_input_files is False, uses raw CDP injection so the browser reads file_path from its own machine.
    Otherwise uses standard Playwright set_input_files, which reads file_path on the machine running the script.
    """
    if clean_file_path_by_remote_os:
        platform = await page.evaluate("navigator.platform")
        file_path = clean_path_by_remote_os(platform, file_path) 
    if not use_set_input_files:
        # logging.info(f"Waiting for {selector} to attach to the DOM...")
        await page.locator(selector).wait_for(state="attached")
        await inject_local_file(page, selector, file_path)
    else:
        # Standard Playwright upload (the file is read on the script's machine)
        logging.info(f"Using standard Playwright upload for file: {file_path}")
        await page.locator(selector).set_input_files(file_path)


async def main():

    cdp_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost:9228"
    results = []


    async def run_test(name, coro, results_list):
        try:
            result = await coro()
            results_list.append({"test": name, **result})
            status = result.get("status", "?")
            detail = result.get("detail", "")
            print(f"\n{'='*60}\n{name}\n  {status}\n  {detail}\n")
        except Exception as exc:
            results_list.append({"test": name, "status": f"❌ {exc}"})
            print(f"\n{'='*60}\n{name}\n  ❌ EXCEPTION: {exc}\n")

    async with async_playwright() as pw:
        browser = await pw.chromium.connect_over_cdp(cdp_url)
        context = browser.contexts[0]
        page = context.pages[0] if context.pages else await context.new_page()

        # Detect platform and set file paths
        platform = await page.evaluate("navigator.platform")
        server_os = _platform_mod.system().lower()
        print (f"server_os={server_os}, remote_os={platform}")
        
        if platform.lower().startswith("win"):
            # Pick a file that definitely exists on the browser's machine
            # test_file = "C:\\Windows\\System32\\drivers\\etc\\hosts"
            # dirty_path = "C:/Windows/System32/drivers/etc/hosts"

            local_test_file_path = r"C:\localfolder\test_upload.txt"
            remote_test_file_path = r"c:\remotefolder\test_upload.txt"
            # Use a path with the WRONG separators to test normalization.
            # Give a Linux-style path; smart_file_upload should convert it to Windows.
            local_dirty_test_file_path = r"C:/localfolder/test_upload.txt"
            remote_dirty_test_file_path = r"c:/remotefolder/test_upload.txt"

            # Try to set a UNC path (it won't work unless the share exists, 
            # but we can verify the CDP call doesn't crash)
            local_unc_test_file_path = r"\\localnetwork\test\2748\testupload.pdf"
            remote_unc_test_file_path = r"\\remotenetwork\test\2748\testupload.pdf"
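        else:
            # Similar logic for Linux: /etc/hosts exists on the browser's machine
            test_file = "/etc/hosts"
            dirty_path = "\\etc\\hosts"
            # Define the names run_all_tests() expects so this branch does not
            # raise NameError; there is no UNC equivalent, so reuse test_file.
            local_test_file_path = remote_test_file_path = test_file
            local_dirty_test_file_path = remote_dirty_test_file_path = dirty_path
            local_unc_test_file_path = remote_unc_test_file_path = test_file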
        
        async def reset_page(page):
            # Create a simple HTML page with a file input
            await page.goto("about:blank")
            await page.set_content("""
                <html><body>
                    <h1>Upload Test</h1>
                    <input type="file" id="testInput" />
                    <div id="result">No file selected</div>
                    <script>
                        document.getElementById('testInput').addEventListener('change', function() {
                            var f = this.files[0];
                            document.getElementById('result').textContent = 
                                f ? 'File: ' + f.name + ' (' + f.size + ' bytes)' : 'No file selected';
                        });
                    </script>
                </body></html>
            """)

        async def get_result(page, test_file):
            await asyncio.sleep(0.5)
            result_text = await page.locator("#result").text_content()
            pattern = r"[1-9]\d* (bytes)"
            match = re.search(pattern, result_text)
            if match:
                return {
                    "status": f"✅ {result_text}",
                    "detail": f"file={test_file}"
                }
            else:
                return {
                    "status": f"❌ {result_text}",
                    "detail": f"file={test_file}"
                }


        async def test_pattern(page: Page, test_file, test_description,clean_path_by_remote_os,  use_set_input_files):
            await reset_page(page)
            try:
                await smart_file_upload(page, "#testInput", test_file, clean_file_path_by_remote_os = clean_path_by_remote_os, use_set_input_files=use_set_input_files)
            except Exception as e:
                print (f"Error in {test_description} {e}")
            
            return await get_result(page, test_description)
        
        async def run_all_tests():
            use_set_input_files = [True, False]
            files_to_test = [local_test_file_path, local_dirty_test_file_path, local_unc_test_file_path,  remote_test_file_path, remote_dirty_test_file_path, remote_unc_test_file_path]
            test_descriptions = ["local file path",  "local dirty path", "local UNC path", "remote file path", "remote dirty path", "remote UNC path"]
            for use_set_input_file in use_set_input_files:
                if use_set_input_file:
                    print ("***************Using Playwright built in set_input_files***************")
                else:
                    print ("***************Using CDP injection***************")
                for idx, (test_file, test_description) in enumerate(zip(files_to_test, test_descriptions)):
                    print(f"{idx+1} {test_description}")
                    result = await test_pattern(page, test_file, test_description, clean_path_by_remote_os=True, use_set_input_files=use_set_input_file)
                    logging.info(result)

        await run_all_tests()

if __name__ == "__main__":
    asyncio.run(main())
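
As a side note, the separator handling in clean_path_by_remote_os could probably also be done with the standard library's pure path classes. The sketch below is an alternative under that assumption, not the code used in the script above; `normalize_for_remote_os` is a hypothetical name. It is slightly more aggressive than the hand-rolled version, since pathlib also drops `.` components and trailing separators.

```python
from pathlib import PurePosixPath, PureWindowsPath


def normalize_for_remote_os(platform: str, file_path: str) -> str:
    """Return file_path written with the separators of the browser's OS.

    platform is the value of navigator.platform in the remote browser,
    e.g. "Win32" or "Linux x86_64".
    """
    if platform and platform.lower().startswith("win"):
        # PureWindowsPath accepts '/' and '\\', collapses repeated separators,
        # and keeps a leading UNC prefix (\\server\share) intact.
        return str(PureWindowsPath(file_path))
    # PurePosixPath treats '\\' as an ordinary character, so convert it first.
    return str(PurePosixPath(file_path.replace("\\", "/")))
```

Pure paths never touch the filesystem, so this works on the script's machine regardless of which OS the browser runs on.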

Example

await page.locator(selector).set_input_files(remote_file_path, resolve_remote_browser_path=True)
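
Until something like this exists, a caller-side helper can approximate the proposed behavior. The sketch below is only an illustration: `set_input_files_compat` is a hypothetical name, and `resolve_remote_browser_path` is the flag proposed in this issue, not an existing Playwright parameter. It assumes a Chromium browser and resolves the element through CDP `Runtime.evaluate`, passing the resulting `objectId` to `DOM.setFileInputFiles`. It only searches the main frame's document and does no retrying.

```python
import json


async def set_input_files_compat(
    page,
    selector: str,
    file_path: str,
    resolve_remote_browser_path: bool = False,  # proposed flag, emulated here
) -> None:
    if not resolve_remote_browser_path:
        # Default behavior: Playwright reads the file on the script's machine.
        await page.locator(selector).set_input_files(file_path)
        return

    cdp = await page.context.new_cdp_session(page)
    try:
        # json.dumps quotes the selector safely for use inside a JS expression.
        found = await cdp.send("Runtime.evaluate", {
            "expression": f"document.querySelector({json.dumps(selector)})",
        })
        # A null result carries no objectId, meaning nothing matched.
        object_id = found["result"].get("objectId")
        if object_id is None:
            raise LookupError(f"No element matches {selector!r}")
        # The browser reads file_path from its own filesystem.
        await cdp.send("DOM.setFileInputFiles", {
            "files": [file_path],
            "objectId": object_id,
        })
    finally:
        # Always detach the session to free up resources.
        await cdp.detach()
```

Addressing the element by `objectId` skips the `DOM.getDocument` / `DOM.querySelector` nodeId lookup used in the proof of concept, at the cost of not handling elements inside iframes.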

Motivation

I am currently trying to set up Playwright scripts that run remotely on Ubuntu (EC2 or any VPS) and control a browser on my local Windows desktop over CDP.

This helps me with daily tasks such as downloading reports from websites, with the flexibility to continue right where Playwright left off if something crashes.

It works well with workarounds for downloading and uploading files, but the efficiency of a native Playwright implementation is missing.
