-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-133953: Add attach
command to pdb
#133954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
cc @godlygeek |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried running this yet, or looked closely at the tests, but here's some things that jump out to me from an initial code review
Lib/pdb.py
Outdated
@@ -709,6 +709,27 @@ def _get_asyncio_task(self): | |||
task = None | |||
return task | |||
|
|||
def _get_pid_from_process(self, process): | |||
"""process could be a subprocess.Popen, multiprocessing.Process or a pid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this seems misleading - process
is really a string that evaluates to a Popen
or a Process
or a pid.
Lib/pdb.py
Outdated
elif isinstance(process, int): | ||
return process | ||
|
||
self.error(f"Invalid process {process}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should use the repr here, instead:
self.error(f"Invalid process {process}") | |
self.error(f"Invalid process {process!r}") |
Lib/pdb.py
Outdated
# Error message is already displayed | ||
return None | ||
|
||
if isinstance(process, (Process, Popen)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth considering accepting any object with a pid
attribute... that'd save you the imports of multiprocessing
and subprocess
and would work on asyncio.subprocess.Process
, too. And potentially processes from 3rd party libraries, too
Lib/pdb.py
Outdated
return None | ||
|
||
if isinstance(process, (Process, Popen)): | ||
return process.pid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth ensuring that process.pid
is an int (it might not be due to monkeypatching, for instance).
Lib/pdb.py
Outdated
if isinstance(process, (Process, Popen)): | ||
return process.pid | ||
elif isinstance(process, int): | ||
return process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest considering:
if isinstance(process, (Process, Popen)): | |
return process.pid | |
elif isinstance(process, int): | |
return process | |
if isinstance(process, int): | |
return process | |
pid = getattr(process, "pid", None) | |
if isinstance(pid, int): | |
return pid |
@@ -2741,6 +2779,8 @@ def _ensure_valid_message(self, msg): | |||
# Due to aliases this list is not static, but the client | |||
# needs to know it for multi-line editing. | |||
pass | |||
case {"attach": int()}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth adding a comment explaining what this message type is used for, if only for consistency with the other message types.
case {"attach": int()}: | |
case {"attach": int()}: | |
# Have the client to attach to a new process. |
if pid is not None: | ||
self.message(f"Attaching to process {pid}") | ||
try: | ||
attach(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some assumptions in attach
that the _PdbClient
is being run from the main thread - the entire signal handling approach relies upon it, at the very least - I think it'll just wind up failing with an exception if run from a non-main thread, at the point where it tries to install the signal handler.
I think we should check if we're in the main thread explicitly and self.error()
if not.
Or, thinking outside of the box, we could spawn a new process to do the attaching from its main thread, and return control back to the parent process after it finishes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The great thing about attaching directly from the process is that there's the automatic parent-child relation. So even with the tracing restriction, you can still attach to your child processes - spawning a new process won't let you do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, and that's a nice advantage - but certainly not one big enough to live with Ctrl+C not working! And you only get that advantage when starting off in non-remote PDB - if you started off in remote PDB, the child process won't be a child of the client (which means it also won't work when you attach
to a child process and then attach
again to a grandchild process).
The most reasonable choice might just be to make it so that the pdb.attach
module level function raises an exception if it's called from any thread but the main thread. Then the attach
PDB command will work from any thread when running remote PDB, and from the main thread when running normal PDB, but will fail when when called on a non-main thread in non-remote PDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(which means it also won't work when you attach to a child process and then attach again to a grandchild process).
Is that true? I thought the trace_scope applies to descendants, not only direct children. So grandchildren should work? (I have a test that does that, if our linux buildbot has the restriction, it should prove the theory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most reasonable choice might just be to make it so that the pdb.attach module level function raises an exception if it's called from any thread but the main thread.
That seems a reasonable option. It won't work when _PdbClient
is not in the main thread. We can explore other options in the future, but for now, making what works work seems to be the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(which means it also won't work when you attach to a child process and then attach again to a grandchild process).
Is that true? I thought the trace_scope applies to descendants, not only direct children. So grandchildren should work?
Ah, yes, you're right.
This PR added a protocol from server to client:
{"attach": int(pid)}
, which tells the client to attach to another process. This is the best way I can think of to avoid multiple hops between processes, while keeping the synchronization. We use the natural stack on the pdb client to simulate the process hop.attach
command is added topdb
, which is not exactly the same thing - we don't need the protocol change if we don't need multiple hop. A single attach from the client to the server would just work.attach
command to pdb #133953