sijieaaa
Contributor

Hi,

I sometimes run into memory explosion issues when using LCB. Memory usage increases continuously until my server (max 1 TB RAM) becomes IDLE.

I made these two changes, and the memory explosion issues have not shown up since.

  1. Force-kill the process by PID.
  2. Set the default memory limit to 4 GB (4 * 1024 * 1024 * 1024 bytes), as stated in the original comments.
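
For illustration, the two changes can be sketched roughly as follows. This is a minimal sketch, not the code in this PR or in LCB: `run_limited` and `MEMORY_LIMIT_BYTES` are hypothetical names, and it assumes a Linux host where `resource.RLIMIT_AS` is enforced.

```python
import os
import resource
import signal
import subprocess
import sys

# 4 GB, the default proposed above (hypothetical constant name).
MEMORY_LIMIT_BYTES = 4 * 1024 * 1024 * 1024


def run_limited(code, timeout=5.0, limit_bytes=MEMORY_LIMIT_BYTES):
    """Run `code` in a child interpreter with an address-space cap.

    Returns (returncode, stdout, stderr). A negative returncode means the
    child was terminated by that signal number.
    """

    def set_limit():
        # Executed in the child between fork and exec, so the cap applies
        # only to the untrusted program, not to the grader itself.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        preexec_fn=set_limit,
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Force-kill by PID; SIGKILL cannot be caught or ignored.
        os.kill(proc.pid, signal.SIGKILL)
        out, err = proc.communicate()  # reap the child and drain the pipes
    return proc.returncode, out, err
```

With this sketch, a program that tries to allocate more than the cap fails inside the child (typically with `MemoryError`) instead of consuming the host's RAM, and a program that exceeds the timeout is killed outright.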

Hope this fix looks good to you. :)
Thanks.

@Naman-ntc
Contributor

Hi, thanks for the note. If you face memory issues, I would highly recommend reducing parallelization rather than restricting the memory limit, as a memory limit can lead to many false negatives in the grading.

@sijieaaa
Contributor Author

Thanks for your reply.

  1. The issue is not caused by parallelization. Even with 1 process, if the generated code contains any part that leaves zombie processes behind, the IDLE issue shows up, and LCB cannot kill such zombie processes cleanly.
  2. Besides the timeout, a memory limit is also an important criterion in most coding competitions. If the generated code needs unlimited memory to pass the tests, it may not be a correct answer.
  3. Below are the results for Qwen-2.5-Coder-7B-Instruct (codegen, 2025-01-01 to 2025-05-01) under different memory limits. No significant differences are shown.

| Pass@1 | Pass@5 | Memory Limit |
| --- | --- | --- |
| 19.8 | 23.2 | 4GB |
| 20.1 | 24.0 | 32GB |
| 20.2 | 24.3 | 128GB |

Based on the above, I still suggest adding a memory limit.
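
To make point 1 concrete, here is a hedged sketch of one common way to clean up stray descendants: start the generated program in its own session and kill the whole process group rather than a single PID. This is an illustration only, not necessarily what this PR does; `run_in_own_group` is a hypothetical helper, and it assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys


def run_in_own_group(code, timeout):
    """Run `code` in a new session and kill its whole process group afterwards.

    Returns (timed_out, stdout).
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # child becomes leader of a new process group
    )
    timed_out = False
    out = None
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True

    # The child is a session leader, so its process group ID equals its PID.
    # Killing the group also reaches processes the generated code spawned.
    try:
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # every process in the group has already exited

    if out is None:
        out, _ = proc.communicate()  # reap the child and collect its output
    return timed_out, out
```

Killing the group after every run, not only on timeout, is a deliberate choice in this sketch: a program can exit normally while leaving a background child alive.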

@Naman-ntc
Contributor

Naman-ntc commented Jun 19, 2025

> The issue is not caused by parallelization. Even with 1 process, if the generated codes contain any zombie part, the IDLE issue will show up, and LCB cannot kill such zombie processes clearly.

Thanks for following up. This is interesting. I think this behavior was not common for earlier models (since they used to perform poorly on hard problems) when I designed the autograder.

> Besides timeout, memory limit is also an important criteria in most coding competitions. If the generated code needs unlimited memory to pass tests, then the code may not be the correct answer.

Yes, but I am not entirely sure what a good memory limit is when computing the results, given the small variance across memory limits (particularly when you do parallel execution). I will investigate this. Thanks for your thoughts!
