Arbitrary Syscall Invocation #235
base: dev
Build mistakes + doc mistakes resolution
…d exception in syscall handler callback
…s stopped inside a syscall
Finally works on i386 as well, now just AArch64 remaining
</div>
Additionally, when the syscall is a [`fork`](https://man7.org/linux/man-pages/man2/fork.2.html), [`vfork`](https://man7.org/linux/man-pages/man2/vfork.2.html) or [`clone`](https://man7.org/linux/man-pages/man2/clone.2.html), the function will also restore the state in the child process / thread. This is done by copying the registers from the parent process / thread to the child process / thread.

As you can see, registers values are restored after the syscall is executed to reduce the chances of the process crashing. However, be mindful that the syscall is indeed executed. Thus, the state of the process will have changed.
It may be worth explicitly clarifying that the memory is not restored, since this is intended.
@@ -385,6 +385,7 @@ def apply_on_thread(self: Amd64PtraceRegisterHolder, target: ThreadContext, targ

# setup generic syscall properties
target_class.syscall_number = _get_property_64("orig_rax")
target_class.syscall_num_register = _get_property_64("rax")
I think this is not really needed. Under arbitrary syscall calling, we control the on_enter status; hence, we can use `syscall_number` for this purpose.
I would still need a variable to tell the status handler which syscall I want to hijack. Also, to me this addition disambiguates the meaning of the `syscall_number` property (which will confuse people other than me when they try to set the value for some other weird use case).
You can just build the on_enter callback there, and pass it the syscall number:

```python
[...] # Rest of invoke_syscall
def on_enter_invoke(t, _):
    t.syscall_number = syscall
[...] # Rest of invoke_syscall
```

Doesn't this work? I agree that the weirdness with syscall_number has to be fixed, but I don't think that adding a new attribute is really the proper way to do it.
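To make the closure suggestion concrete, here is a minimal, self-contained sketch. `FakeThread` and `make_on_enter` are hypothetical names used only for illustration; they are not part of libdebug, and the real callback signature may differ.

```python
class FakeThread:
    """Hypothetical stand-in for a thread context (not libdebug's class)."""

    def __init__(self) -> None:
        self.syscall_number = -1


def make_on_enter(syscall: int):
    """Build an on_enter callback that captures the syscall number to inject."""

    def on_enter_invoke(t, _handler):
        # The closure carries `syscall`, so no extra attribute is needed
        # on the thread to remember which syscall should be hijacked.
        t.syscall_number = syscall

    return on_enter_invoke


thread = FakeThread()
callback = make_on_enter(39)  # 39 is getpid on x86-64 Linux
callback(thread, None)
```

The captured value lives only as long as the callback does, which is the point of the suggestion: the state belongs to one invocation, not to the thread.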
I guess I have to agree with Alessandro, this time
@@ -267,6 +267,7 @@ def apply_on_thread(self: Aarch64PtraceRegisterHolder, target: ThreadContext, ta
target_class.instruction_pointer = _get_property_64("pc")

# setup generic syscall properties
target_class.syscall_num_register = _get_property_64("x8")
see comment for amd64
@@ -108,6 +108,7 @@ def apply_on_thread(self: I386OverAMD64PtraceRegisterHolder, target: ThreadConte

# setup generic syscall properties
target_class.syscall_number = _get_property_32("orig_rax")
target_class.syscall_num_register = _get_property_32("rax")
see comment for amd64
@@ -164,6 +164,7 @@ def apply_on_thread(self: I386PtraceRegisterHolder, target: ThreadContext, targe

# setup generic syscall properties
target_class.syscall_number = _get_property_32("orig_eax")
target_class.syscall_num_register = _get_property_32("eax")
same here
if not self._is_in_background():
self.__polling_thread_command_queue.put((self.__threaded_cont_to_syscall, (thread,)))
self.__polling_thread_command_queue.put((self.__threaded_wait, ()))
If you do the wait, you should also do the join, no? It should work as it is, but I'm not sure; just double-check for races or edge cases.
is_cloning_event = syscall_name in ["fork", "vfork", "clone", "clone3"]

if is_cloning_event:
separate function, please
child.syscall_number = syscall_number

# - Restore registers
child.step()
This was fine for DEF CON, not for production.
if isinstance(getattr(thread.regs, reg_name), int | float) and reg_name != "_thread_id":
setattr(child.regs, reg_name, getattr(thread.regs, reg_name))
else:
# If the syscall is a fork, we need to fix the state of the new process
Shouldn't this say "is not a fork"???
if isinstance(getattr(thread.regs, reg_name), int | float) and reg_name != "_thread_id":
setattr(child.regs, reg_name, getattr(thread.regs, reg_name))
return retval
What about the base pointer, the stack pointer, and similar state for a new thread?
@@ -25,3 +25,8 @@ def __init__(self: Amd64ThreadContext, thread_id: int, registers: Amd64PtraceReg

# Register the thread properties
self._register_holder.apply_on_thread(self, Amd64ThreadContext)

@property
def num_syscall_args(self: Amd64ThreadContext) -> int:
I appreciate the abstract method and everything LoL, but for once I am going to have to say that I don't think we need this yet. There are only a couple of platforms where syscalls do not take 6 arguments, and we support none of them.
I'd rather have a concrete method that returns 6 in the abstract ThreadContext, and override that method in the concrete subclass only for the platforms where we need it (if we ever decide to support PowerPC 32, for example).
Ok. Btw, I think ARM32 (which is arguably more likely than PowerPC) is also 7.
Mhmm I should look into support for arm32 over aarch64 actually, I thought that they were incompatible.
In any case, I would just override that method for arm32. Less lines of code is better (in my opinion) ((in this case)).
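The "concrete default, override only where needed" design discussed above could look roughly like the sketch below. The class bodies are stripped down for illustration, `Arm32ThreadContext` is a hypothetical class that does not exist in the PR, and the value 7 simply follows the claim made earlier in this thread.

```python
class ThreadContext:
    """Simplified stand-in for the abstract base class."""

    @property
    def num_syscall_args(self) -> int:
        # Default shared by every platform that passes 6 syscall arguments.
        return 6


class Amd64ThreadContext(ThreadContext):
    """Inherits the default; no per-platform boilerplate required."""


class Arm32ThreadContext(ThreadContext):
    """Hypothetical platform, shown only to illustrate the override."""

    @property
    def num_syscall_args(self) -> int:
        return 7


amd64_args = Amd64ThreadContext().num_syscall_args
arm32_args = Arm32ThreadContext().num_syscall_args
```

With this layout, adding a platform that uses the common convention costs zero lines, and only the exceptions carry an override.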
reg_bit_count = get_platform_gp_register_size(self.arch) * 8
negative_threshold = 2 ** (reg_bit_count - 1)

if new_pid >= negative_threshold:
`pid_t` is a signed 32-bit integer on all platforms; you could just check that `new_pid.bit_length() <= 31`, and it's not a platform-dependent check.
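A quick sketch of the suggested check, assuming the syscall return value is read from the register as a non-negative Python integer (so a negative errno shows up as a very large number). The helper name `looks_like_pid` is made up for this example.

```python
def looks_like_pid(raw_return_value: int) -> bool:
    """Return True if the raw register value fits in a non-negative signed 32-bit pid_t."""
    # A negative errno stored in a 32- or 64-bit register has its top bit set,
    # so its bit length exceeds 31 regardless of the register width.
    return raw_return_value >= 0 and raw_return_value.bit_length() <= 31


valid = looks_like_pid(4242)
error_64 = looks_like_pid(2**64 - 1)   # -1 as seen in a 64-bit register
error_32 = looks_like_pid(2**32 - 1)   # -1 as seen in a 32-bit register
```

The same function works unchanged on every architecture, which is what removes the need for `get_platform_gp_register_size` in this check.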
# Invoke the syscall
if PLATFORM == "i386":
# On i386, the mmap syscall has a different signature: it takes a struct instead of the arguments directly
You can just call `mmap_pgoff` on i386 instead of `mmap`.
So you can avoid all of this
self.handled_syscalls[syscall_number].on_enter_user is None
and self.handled_syscalls[syscall_number].on_exit_user is None
):
if not self._is_in_background():
I think that this flow is not totally correct for cloning functions (and some others).
If I invoke a normal syscall, this happens:
process is in group stop -> thread does SYSCALL and hits `on_enter` -> thread does SYSCALL and hits `on_exit` -> end of `invoke_syscall`. All is fine, I think.
If I invoke fork/clone/whatever, I think that this is what happens:
process is in group stop -> thread does SYSCALL and hits `on_enter` -> thread does SYSCALL -> kernel notifies us of a new FORK/CLONE_EVENT -> `ptrace_status_handler` receives the event and then does a process-wide cont -> thread hits `on_exit` -> end of `invoke_syscall`. So this has unexpected side effects for multithreaded processes, I think, and I am also not sure how this doesn't break the flow for single-threaded processes, because the main thread gets cont'd too.
Now, what happens if I invoke the exit syscall from a thread? We should support thread-suicide I think, and I honestly don't know if and how this flow could handle that.
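The three cases above can be written down as a toy model. This is only a sketch of the stop sequences as described in this comment, not of real ptrace behaviour or of libdebug's status handler; `Stop` and `run_invocation` are invented names.

```python
from enum import Enum, auto


class Stop(Enum):
    SYSCALL_ENTER = auto()
    SYSCALL_EXIT = auto()
    CLONE_EVENT = auto()


def run_invocation(stops):
    """Consume stops for one injected syscall; return the events seen before its exit stop."""
    iterator = iter(stops)
    if next(iterator, None) is not Stop.SYSCALL_ENTER:
        raise ValueError("expected the syscall-enter stop first")
    in_between = []
    for stop in iterator:
        if stop is Stop.SYSCALL_EXIT:
            return in_between
        # Anything here arrives between on_enter and on_exit and needs handling.
        in_between.append(stop)
    raise RuntimeError("the syscall never reached its exit stop")


normal = run_invocation([Stop.SYSCALL_ENTER, Stop.SYSCALL_EXIT])
cloning = run_invocation([Stop.SYSCALL_ENTER, Stop.CLONE_EVENT, Stop.SYSCALL_EXIT])
```

In this model an invoked `exit` corresponds to a sequence with no exit stop at all, which is the case that raises `RuntimeError` here and the one the comment says the current flow may not handle.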
I was thinking about this, and I found some other syscalls that call for "special treatment":
- `seccomp`: I don't remember how we have it implemented now, but this is another syscall that (I suppose) generates a third event in-between the two SYSCALL stops.
- `execve`/`execveat`: these probably break libdebug in general, not only when injected, but we should probably add an error if the user attempts to invoke them, because we would definitely break.

Thinking about the whole syscall invocation thingy, I've actually been wondering if we should think of a non-blocking implementation too:

```python
r = d.run()
d.invoke_syscall("read", d.regs.rax, 0x10)
r.sendline(b"provola")
```

This makes no sense in a real script, I know, but it would deadlock everything because the read syscall will never terminate, waiting for the sendline right after. Am I hallucinating? Probably. Is there an actual sane case of something like this that could happen in a real script? I think so, actually.
So should `invoke_syscall` be non-blocking like `cont`? Should we have an optional non-blocking mode? Does this make sense? This whole syscall invocation thing has been like opening a whole can of worms, I think.
cc @io-no the "we should do it for consistency" API connoisseur, I think I gave him enough material to insult me for a week in this message.
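As an analogy for the deadlock described above, the sketch below uses a plain pipe and a worker thread instead of a debugger. Nothing here is libdebug API: the worker plays the role of a hypothetical non-blocking `invoke_syscall`, and the explicit `join` plays the role of a later wait.

```python
import os
import threading


def blocking_read(fd: int, result: list) -> None:
    """Block until data is available on fd, like the injected read syscall."""
    result.append(os.read(fd, 16))


read_fd, write_fd = os.pipe()
result = []

# Starting the read in the background returns control to the script at once.
worker = threading.Thread(target=blocking_read, args=(read_fd, result))
worker.start()

# Because the call above did not block, the data the read waits for can be sent.
os.write(write_fd, b"provola\n")

# Explicit wait for completion, with a timeout so the sketch cannot hang.
worker.join(timeout=5)
os.close(read_fd)
os.close(write_fd)
```

If `blocking_read` were called directly on the main thread before the write, the script would hang forever, which is the same shape as the `invoke_syscall("read", ...)` followed by `sendline` example.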
I think the third event is not an issue if it occurs within the context of a handled syscall, since any `continue` operation in `libdebug` becomes a `ptrace_syscall`. I could suggest maintaining a similar mechanism also for the invocation.

```cpp
void LibdebugPtraceInterface::cont_thread(Thread &t)
{
    if (ptrace(handle_syscall ? PTRACE_SYSCALL : PTRACE_CONT, t.tid, NULL, t.signal_to_forward) == -1) {
        throw std::runtime_error("ptrace cont failed");
    }
    t.signal_to_forward = 0;
}
```

The seccomp event will be managed internally during the `wait` loop. Therefore, it is sufficient to have a `wait` somewhere to prevent the event from causing issues.
Is this what you meant, or did I misunderstand?
That said, I agree that seccomp handling should be improved overall.
I might have to agree for the second time today regarding the API. It makes sense to have a non-blocking API; this would make it consistent with the rest of the API, not just `continue` (we still need to fix `step` and a few other APIs that are currently blocking).
The only intended way to wait for the program to stop should be through `d.wait`, in my opinion. Just my two cents.
At this point, maybe the entire mechanism of `invoke_syscall` could be managed through special handles.
An exit handle, transparent to the user, would manage restoring the original state of the process and simply stop the process at the point where the syscall was invoked.
> I think the third event is not an issue if it occurs within the context of a handled syscall, since any continue operation in libdebug becomes a ptrace_syscall. I could suggest maintaining a similar mechanism also for the invocation.

Yeah, it's not a problem if we do the handlers and everything; I was just saying that it is probably a problem in the current implementation of `invoke_syscall`, so the issue is not only with `clone` and `fork`.
elif PLATFORM == "i386":
ret = d.invoke_syscall("clone", clone_flags, stack_base, stack_base + 0x04, d.regs.gs, stack_base + 0x08)
elif PLATFORM == "aarch64":
# To retrieve the TLS base, we need to use the TPIDR_EL0 register
The `tls` parameter is nullable in the clone syscall.
You are not setting the CLONE_SETTLS flag, so I don't think that this is needed; just pass 0x0 and it will work.
I think this is the same for the other two parameters as well, `parent_tid` and `child_tid`.
if any(not isinstance(arg, int) for arg in args):
raise TypeError("All arguments must be integers.")

self._ensure_process_stopped()
This is not really needed.
Returns:
int: The return value of the syscall.
"""
# Initial checks to ensure the syscall can be invoked
I think we should check that the thread is on an executable page before injecting a syscall. What if we are on a RW page? Or if we are at the boundary of an executable page and we don't have enough space for a syscall instruction?
Yes, since we inject it before the current IP, we should.
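One way the proposed check could be sketched is against text in the format of `/proc/<pid>/maps`. The helper names, the sample mappings, and the 2-byte length (the size of the x86-64 `syscall` instruction) are assumptions for illustration; libdebug's own memory-map API is not used here.

```python
def parse_maps(text: str):
    """Parse /proc/<pid>/maps-style text into (start, end, perms) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        start_text, end_text = fields[0].split("-")
        regions.append((int(start_text, 16), int(end_text, 16), fields[1]))
    return regions


def can_inject(regions, address: int, length: int) -> bool:
    """True if [address, address + length) lies inside one executable mapping."""
    for start, end, perms in regions:
        if start <= address and address + length <= end:
            return "x" in perms
    # No single mapping contains the whole range (unmapped or crossing a boundary).
    return False


SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1234 /bin/demo
00600000-00601000 rw-p 00000000 08:01 1234 /bin/demo
"""

regions = parse_maps(SAMPLE_MAPS)
fits_at_end = can_inject(regions, 0x400FFE, 2)
crosses_boundary = can_inject(regions, 0x400FFF, 2)
on_rw_page = can_inject(regions, 0x600010, 2)
```

Since the instruction is injected before the current IP, the caller would pass `ip - length` as `address`. The sketch is deliberately conservative: it rejects a range that spans two adjacent executable mappings.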
This pull request implements the long-awaited arbitrary system call invocation.
API:
d.invoke_syscall("write", 1, d.regs.rbp, 0x10)
Addresses #169 #225