When building cmark-gfm to asm.js with Emscripten, I found that the cmark library hits a JavaScript stack overflow while loading. It happens in both debug and release builds.
To reproduce:
- Install Emscripten SDK (reproduced with 3.1.50 and 3.1.59)
- Modify api_test/CMakeLists.txt to make Emscripten output an HTML test page (see attached CMakeLists.txt)
- Build using the attached batch file
- Go to api_test inside the build directory
- Double-click api_test.html
- Open Developer Tools in your web browser
- In the Console tab, there will be a message saying "Uncaught RangeError: Maximum call stack size exceeded"
CMakeLists.txt
build_cmarkgfm_js.bat.txt
It appears this error is emitted while loading the cmark-gfm library code, i.e. before any execution has happened.
I tracked it down to the very large switch statement in case_fold_switch.inc. If you open api_test.js (debug build, for readability) in an editor, search for cmark_utf8proc_case_fold, and scroll down, you will see that Emscripten has generated a very deeply nested set of {} blocks, over 1400 levels deep. This evidently exceeds the browser's JavaScript stack capacity:
function cmark_utf8proc_case_fold($0, $1, $2) {
$0 = $0 | 0;
$1 = $1 | 0;
$2 = $2 | 0;
var $5 = 0, $22 = 0, wasm2js_i32$0 = 0, wasm2js_i32$1 = 0;
$5 = __stack_pointer - 32 | 0;
__stack_pointer = $5;
HEAP32[($5 + 28 | 0) >> 2] = $0;
HEAP32[($5 + 24 | 0) >> 2] = $1;
HEAP32[($5 + 20 | 0) >> 2] = $2;
label$1 : {
label$2 : while (1) {
if (!((HEAP32[($5 + 20 | 0) >> 2] | 0 | 0) > (0 | 0) & 1 | 0)) {
break label$1
}
(wasm2js_i32$0 = $5, wasm2js_i32$1 = cmark_utf8proc_iterate(HEAP32[($5 + 24 | 0) >> 2] | 0 | 0, HEAP32[($5 + 20 | 0) >> 2] | 0 | 0, $5 + 16 | 0 | 0) | 0), HEAP32[(wasm2js_i32$0 + 12 | 0) >> 2] = wasm2js_i32$1;
label$3 : {
label$4 : {
if (!((HEAP32[($5 + 12 | 0) >> 2] | 0 | 0) >= (0 | 0) & 1 | 0)) {
break label$4
}
$22 = HEAP32[($5 + 16 | 0) >> 2] | 0;
label$5 : {
label$6 : {
label$7 : {
label$8 : {
label$9 : {
label$10 : {
label$11 : {
label$12 : {
label$13 : {
label$14 : {
label$15 : {
label$16 : {
label$17 : {
label$18 : {
label$19 : {
// ... the nesting goes on for many, many levels, up to label$1407
Evidently this is an Emscripten code generation issue that the huge switch statement provokes. I tried changing the compiler optimization setting (including -Os), but it didn't help.
I was able to work around it by rearranging the code in utf8.c and case_fold_switch.inc to use if statements instead of a switch. The code generated from that is much less deeply nested, and it loads and runs correctly (the console reports that all tests passed).
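For illustration, here is a minimal sketch of what an if-based version could look like. It is not the actual patch: the helper name fold_with_ifs and its output-array interface are hypothetical, and only three real case-folding mappings are included.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of cmark-gfm: writes the case-folded form of
 * code point c into out (at most 3 code points) and returns how many code
 * points were written. */
static size_t fold_with_ifs(uint32_t c, uint32_t out[3]) {
  /* Each test returns early, so the checks form a flat sequence of
   * statements instead of one very large switch. */
  if (c >= 0x0041 && c <= 0x005A) { /* A-Z fold to a-z */
    out[0] = c + 0x20;
    return 1;
  }
  if (c == 0x00B5) { /* MICRO SIGN folds to GREEK SMALL LETTER MU */
    out[0] = 0x03BC;
    return 1;
  }
  if (c == 0x00DF) { /* LATIN SMALL LETTER SHARP S folds to "ss" */
    out[0] = 0x0073;
    out[1] = 0x0073;
    return 2;
  }
  /* No mapping in this abbreviated sketch: the code point folds to itself. */
  out[0] = c;
  return 1;
}
```

Grouping contiguous ranges such as A-Z into a single test is an assumption of this sketch; it also keeps the number of comparisons well below one per table entry.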
Another alternative might be some sort of lookup table.
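A rough sketch of the lookup-table idea follows, assuming a sorted static array searched with the standard bsearch. The names fold_entry, fold_table and lookup_fold are hypothetical and the table holds only a handful of real mappings; a full version would be generated from the same data as case_fold_switch.inc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: one source code point and its folded form
 * (up to 3 code points). */
typedef struct {
  uint32_t from;
  uint32_t to[3];
  uint8_t len;
} fold_entry;

/* Abbreviated table; must stay sorted by `from` for bsearch to work. */
static const fold_entry fold_table[] = {
  {0x0041, {0x0061}, 1},         /* A -> a */
  {0x0042, {0x0062}, 1},         /* B -> b */
  {0x00B5, {0x03BC}, 1},         /* MICRO SIGN -> GREEK SMALL LETTER MU */
  {0x00DF, {0x0073, 0x0073}, 2}, /* SHARP S -> "ss" */
  {0x0130, {0x0069, 0x0307}, 2}, /* I WITH DOT ABOVE -> i + COMBINING DOT ABOVE */
};

/* bsearch comparator: key is a uint32_t code point, elem is a fold_entry. */
static int compare_fold_entry(const void *key, const void *elem) {
  uint32_t c = *(const uint32_t *)key;
  const fold_entry *e = (const fold_entry *)elem;
  return (c > e->from) - (c < e->from);
}

/* Returns the matching entry, or NULL if c has no mapping in the table
 * (the caller would then emit c unchanged). */
static const fold_entry *lookup_fold(uint32_t c) {
  return (const fold_entry *)bsearch(&c, fold_table,
                                     sizeof fold_table / sizeof fold_table[0],
                                     sizeof fold_table[0], compare_fold_entry);
}
```

Because the mappings live in data rather than in control flow, the generated JavaScript for the lookup should stay a small loop regardless of how many entries the table has.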