
MagMueller
Collaborator

@MagMueller MagMueller commented Aug 30, 2025

Two critical fixes:

  • input_text did not set the code field inside dispatchKeyEvent
  • Labels were marked as interactive by default, which sometimes caused clicks on the wrong element, e.g. on apartments.com

- Added logging for new elements detected during actions in the Agent class.
- Implemented a human-like text field clearing method in DefaultActionWatchdog, utilizing Ctrl+A and Backspace.
- Improved focus handling for label elements, ensuring they are only interactive if they do not have a 'for' attribute.
- Updated clickable element detection logic to account for labels pointing to inputs.

These changes improve the robustness of user interactions and enhance debugging capabilities.
- Enhanced the tag check to include truly interactive elements.
- Removed special handling for 'label' elements, as they are now managed by other attribute checks to prevent interference with clickable elements.

These updates improve the accuracy of interactive element detection in the DOM serializer.
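
The label rule described above can be sketched roughly as follows. This is an illustration only, not the serializer's actual code: the function name, the plain-dict attributes, and every tag in `INTERACTIVE_TAGS` other than 'button' are hypothetical stand-ins for the real checks on `EnhancedDOMTreeNode`.

```python
# Hypothetical subset of tags; only 'button' is confirmed by the diff context.
INTERACTIVE_TAGS = {"button", "input", "select", "textarea", "a"}


def is_interactive(tag: str, attributes: dict) -> bool:
    """Return True if the element should be offered as a click target."""
    tag = tag.lower()
    if tag in INTERACTIVE_TAGS:
        return True
    if tag == "label":
        # A label with a 'for' attribute only forwards clicks to its input,
        # so the input itself should be the click target, not the label.
        return "for" not in attributes
    return False
```

With this rule, `is_interactive("label", {"for": "email"})` is false while a bare `<label>` wrapping its own control stays clickable.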
- Added type hint for CDPSession in the _focus_element_simple method.
- Enhanced logging for focus attempts, including exception details.
- Reduced sleep duration in scrollIntoViewIfNeeded for better performance.
- Updated text clearing logic to ensure it only occurs after successful focus.

These changes enhance the robustness of element interaction and improve debugging capabilities.

github-actions bot commented Aug 30, 2025

Agent Task Evaluation Results: 2/3 (67%)

View detailed results
| Task | Result | Reason |
| --- | --- | --- |
| amazon_laptop | ✅ Pass | The agent successfully navigated to amazon.com, performed a search for 'laptop', and returned the name and details of the first laptop result. The output includes the product title, price, rating, and number of reviews, fulfilling the task requirements. |
| browser_use_pip | ✅ Pass | The agent correctly identified and provided the pip installation command 'pip install browser-use' as requested. The output includes the exact command and additional relevant context, meeting the success criteria. |
| captcha_cloudflare | ❌ Fail | The agent attempted to solve the captcha on the specified page but was unable to successfully complete it and extract the 'hostname' value as requested. The extracted content only included example JSON responses and site information related to the captcha solving process, but no actual dictionary labeled 'Captcha is passed successfully!' with a hostname field was found. Additionally, the expected hostname value was 'example.com', which was not present in any of the extracted data. Therefore, the agent did not fulfill the task requirements. |

Check the evaluate-tasks job for detailed task execution logs.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 3 files

React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai to give feedback, ask questions, or re-run the review.


# ENHANCED TAG CHECK: Include truly interactive elements
# ENHANCED TAG CHECK: Include truly interactive elements
# Note: 'label' removed - labels are handled by other attribute checks below - other wise labels with "for" attribute can destry the real clickable element on appartments.com

Typos in the added comment reduce clarity: "other wise", "destry", and "appartments.com".

Prompt for AI agents
Address the following comment on browser_use/dom/serializer/clickable_elements.py at line 98:

<comment>Typos in the added comment reduce clarity: "other wise", "destry", and "appartments.com".</comment>

<file context>
@@ -94,14 +94,14 @@ def is_interactive(node: EnhancedDOMTreeNode) -> bool:
 
-		# ENHANCED TAG CHECK: Include truly interactive elements
+				# ENHANCED TAG CHECK: Include truly interactive elements
+		# Note: 'label' removed - labels are handled by other attribute checks below - other wise labels with "for" attribute can destry the real clickable element on appartments.com
 		interactive_tags = {
 			'button',
</file context>
Suggested change
# Note: 'label' removed - labels are handled by other attribute checks below - other wise labels with "for" attribute can destry the real clickable element on appartments.com
# Note: 'label' removed - labels are handled by other attribute checks below; otherwise labels with "for" attribute can destroy the real clickable element on apartments.com

'type': 'keyDown',
'key': 'a',
'code': 'KeyA',
'modifiers': 2, # Ctrl modifier

Use Meta (Cmd) on macOS for select-all; set modifiers to 4 on Darwin to ensure Ctrl/Cmd+A works cross-platform.

(This reflects your team's feedback about using tabs for indentation in fix suggestions.)
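
A minimal sketch of the platform switch this comment asks for. The bit values 1, 2, 4 and 8 are the `modifiers` flags of CDP's `Input.dispatchKeyEvent` (Alt, Ctrl, Meta/Command, Shift); the helper names are hypothetical, and the sketch assumes the browser runs on the same OS as the Python process.

```python
import sys

# CDP Input.dispatchKeyEvent modifier bit flags.
ALT, CTRL, META, SHIFT = 1, 2, 4, 8


def select_all_modifier(platform: str = sys.platform) -> int:
    """Pick the select-all modifier: Cmd on macOS, Ctrl elsewhere."""
    return META if platform == "darwin" else CTRL


def select_all_key_down(platform: str = sys.platform) -> dict:
    """Build params for the keyDown half of a select-all shortcut."""
    return {
        "type": "keyDown",
        "key": "a",
        "code": "KeyA",
        "modifiers": select_all_modifier(platform),
    }
```

If the browser is driven remotely, `sys.platform` describes the wrong machine, so the platform would have to come from the browser side instead.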

Prompt for AI agents
Address the following comment on browser_use/browser/watchdogs/default_action_watchdog.py at line 649:

<comment>Use Meta (Cmd) on macOS for select-all; set modifiers to 4 on Darwin to ensure Ctrl/Cmd+A works cross-platform.

(This reflects your team's feedback about using tabs for indentation in fix suggestions.)</comment>

<file context>
@@ -583,80 +584,184 @@ async def _type_to_page(self, text: str):
+					'type': 'keyDown',
+					'key': 'a',
+					'code': 'KeyA',
+					'modifiers': 2,  # Ctrl modifier
+					'windowsVirtualKeyCode': 65,
 				},
</file context>

'text': char,
'key': char,
'code': key_code,
'windowsVirtualKeyCode': ord(char.upper()) if char.isalpha() else ord(char),

Deriving windowsVirtualKeyCode from ASCII leads to wrong virtual key codes for punctuation; either omit this field for printable chars or map to correct VK codes to avoid mis-typed input.
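
To illustrate the problem, the sketch below lists a few ASCII punctuation values that coincide with unrelated Windows virtual-key codes. The table is a small hand-picked sample and the helper name is hypothetical.

```python
from typing import Optional

# A few Windows virtual-key codes whose values equal ASCII punctuation.
VK_NAMES = {
    0x21: "VK_PRIOR (Page Up)",  # same value as ord('!')
    0x25: "VK_LEFT",             # same value as ord('%')
    0x2D: "VK_INSERT",           # same value as ord('-')
    0x2E: "VK_DELETE",           # same value as ord('.')
}


def ascii_vk_collision(char: str) -> Optional[str]:
    """Return the key that ord(char) would be mistaken for, if in the sample."""
    return VK_NAMES.get(ord(char))
```

Under the `ord(char)` scheme, typing a period would therefore carry the virtual-key code for Delete, which is why omitting the field for printable characters is the safer default.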

Prompt for AI agents
Address the following comment on browser_use/browser/watchdogs/default_action_watchdog.py at line 846:

<comment>Deriving windowsVirtualKeyCode from ASCII leads to wrong virtual key codes for punctuation; either omit this field for printable chars or map to correct VK codes to avoid mis-typed input.</comment>

<file context>
@@ -698,145 +803,67 @@ async def _input_text_element_node_impl(self, element_node, text: str, clear_exi
 						'text': char,
 						'key': char,
+						'code': key_code,
+						'windowsVirtualKeyCode': ord(char.upper()) if char.isalpha() else ord(char),
 					},
 					session_id=cdp_session.session_id,
</file context>

'-': 'Minus',
'_': 'Underscore',
'@': 'At',
'!': 'Exclamation',

Using non-standard 'Exclamation' for 'code' is incorrect; map to the proper base key code (e.g., 'Digit1') and handle Shift via modifiers for reliable typing behavior.

(This reflects your team's feedback about using tabs for indentation in fix suggestions.)
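
One way to follow this suggestion is to map each character to its physical key plus a Shift flag. The sketch assumes a US keyboard layout; the table is deliberately partial and the names `US_LAYOUT` and `key_code_for` are hypothetical.

```python
SHIFT = 8  # CDP modifier bit for Shift

# US-layout assumption: character -> (KeyboardEvent.code, modifiers).
US_LAYOUT = {
    "-": ("Minus", 0),
    "_": ("Minus", SHIFT),
    "1": ("Digit1", 0),
    "!": ("Digit1", SHIFT),
    "2": ("Digit2", 0),
    "@": ("Digit2", SHIFT),
    "/": ("Slash", 0),
    "?": ("Slash", SHIFT),
    ";": ("Semicolon", 0),
    ":": ("Semicolon", SHIFT),
}


def key_code_for(char: str) -> tuple:
    """Return (code, modifiers) for one character on a US keyboard."""
    if char in US_LAYOUT:
        return US_LAYOUT[char]
    if char.isascii() and char.isalpha():
        # Letters share one physical key; uppercase adds Shift.
        return f"Key{char.upper()}", SHIFT if char.isupper() else 0
    raise KeyError(f"no mapping for {char!r}")
```

For example, `key_code_for("!")` yields `("Digit1", 8)` rather than the non-standard 'Exclamation'.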

Prompt for AI agents
Address the following comment on browser_use/browser/watchdogs/default_action_watchdog.py at line 597:

<comment>Using non-standard 'Exclamation' for 'code' is incorrect; map to the proper base key code (e.g., 'Digit1') and handle Shift via modifiers for reliable typing behavior.

(This reflects your team's feedback about using tabs for indentation in fix suggestions.)</comment>

<file context>
@@ -583,80 +584,184 @@ async def _type_to_page(self, text: str):
+			'-': 'Minus',
+			'_': 'Underscore',
+			'@': 'At',
+			'!': 'Exclamation',
+			'?': 'Question',
+			':': 'Colon',
</file context>

…ionWatchdog

- Updated key code mappings for special characters to reflect correct usage with modifiers.
- Enhanced text field clearing method to use platform-specific modifiers (Cmd for macOS, Ctrl for others) for a more human-like interaction.
- Removed unnecessary `windowsVirtualKeyCode` assignments for printable characters to prevent incorrect virtual key code usage.

These changes improve the accuracy of character input handling and enhance the robustness of text field interactions.
@MagMueller MagMueller merged commit e7a7a62 into main Aug 30, 2025
27 of 55 checks passed
@MagMueller MagMueller deleted the fix-typing branch August 30, 2025 01:51
await asyncio.sleep(0.01)

# Small delay between characters to look human (realistic typing speed)
await asyncio.sleep(0.001)

Bug: Text Input Sequence Fails CDP Standards

The _input_text_element_node_impl method's text input sequence deviates from standard CDP. It omits the crucial char event and incorrectly places the text parameter in keyDown events, which can lead to unreliable input on some sites. Furthermore, it fails to send necessary modifier keys (e.g., Shift) for special characters, potentially causing incorrect input for characters like _ or @.
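
The sequence this comment describes could look roughly like the sketch below: three `Input.dispatchKeyEvent` payloads per character, with `text` carried only by the `char` event. The function name is hypothetical, and whether the `char` event is strictly required is the reviewer's claim rather than something this sketch establishes.

```python
def key_events(char: str, code: str, modifiers: int = 0) -> list:
    """Build keyDown -> char -> keyUp params for one printable character."""
    base = {"key": char, "code": code, "modifiers": modifiers}
    return [
        {"type": "keyDown", **base},
        {"type": "char", "text": char, **base},  # only this event carries text
        {"type": "keyUp", **base},
    ]
```

Each dict would then be sent as the params of one `Input.dispatchKeyEvent` call, in order; for '@' on a US layout that would be `key_events("@", "Digit2", 8)`.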
