Do not use `Double` arithmetics in `Integer.parseInt()`. #5193

sjrd · 2025-06-06T14:58:05Z

Only use Int arithmetics. To detect overflow, we precompute a table of the maximum string length for each radix.

Int arithmetics is faster than Double arithmetics. The previous code used Doubles to have a concise way of detecting the overflow, which is not bad on JS engines. Given a cached overflow detection mechanism, resorting to Doubles is not necessary anymore.
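A minimal sketch of what such a per-radix table could look like. It assumes that "maximum string length" means the number of digits needed to write `Int.MaxValue` in a given radix; the names `MaxLengthTable`, `digitsOfMaxValue` and `maxLengths` are hypothetical and not taken from the PR.

```scala
object MaxLengthTable {
  /** Number of digits needed to write Int.MaxValue in the given radix. */
  def digitsOfMaxValue(radix: Int): Int = {
    var count = 0
    var rest = Int.MaxValue
    while (rest != 0) {
      rest /= radix
      count += 1
    }
    count
  }

  // Indexed by radix; entries below Character.MIN_RADIX are unused (0).
  lazy val maxLengths: Array[Int] =
    Array.tabulate(Character.MAX_RADIX + 1) { radix =>
      if (radix < Character.MIN_RADIX) 0 else digitsOfMaxValue(radix)
    }
}
```

With this table, any digit string (after leading zeros) that is longer than `maxLengths(radix)` can be rejected without doing any arithmetic on its digits.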

ekrich · 2025-06-06T17:25:41Z

I was wondering if you could take a peek and make any recommendations based on this work? https://github.com/scala-native/scala-native/blob/main/javalib/src/main/scala/java/lang/Integer.scala

sjrd · 2025-06-06T18:00:51Z

Those are fine implementations, but not as good as what's in this PR. In particular, caching the max is worthwhile, because divisions by non-constants are expensive. Otherwise it's pretty similar.

gzm0 · 2025-06-07T06:33:28Z

I'll review fully, but could you amend the description to include some information about why this is better?

sjrd · 2025-06-07T07:50:49Z

> I'll review fully, but could you amend the description to include some information about why this is better?

I added a paragraph in the PR description. I'll integrate it in the commit message next time I amend/rebase/etc. The short answer is "it's faster" ;)

Edit: done

gzm0 · 2025-06-08T08:36:25Z

javalib/src/main/scala/java/lang/Integer.scala

+      val digit = Character.digitWithValidRadix(s.charAt(safeLen), radix)
+      if (digit == -1 || (result ^ SignBit) > (radixInfo.overflowBarrier ^ SignBit))
+        fail()
+      result = result * radix + digit
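The comparison in the hunk above XORs both operands with `SignBit` before using a signed `>`. Assuming `SignBit` is the `Int` with only the top bit set (`Int.MinValue`; the hunk does not show its definition), this is the usual way to express an unsigned comparison with signed operators. A small sketch, with a hypothetical helper name:

```scala
object UnsignedCompare {
  // Assumption: SignBit has only the most significant bit set.
  final val SignBit = Int.MinValue

  /** Unsigned `a > b`, computed with a signed comparison.
   *  Flipping the top bit maps the unsigned order onto the signed order.
   */
  def unsignedGreater(a: Int, b: Int): Boolean =
    (a ^ SignBit) > (b ^ SignBit)
}
```

For any pair of `Int`s this should agree with `Integer.compareUnsigned(a, b) > 0`.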


Just to check my understanding: we need both maxLength and overflowBarrier because up to here, result could be 0, so we couldn't detect overflow as-we-go.

sjrd · 2025-06-08

result cannot be 0 here. We trim leading zeros beforehand. If safeLen != len, it means there were maxLength characters after the leading zeros. The main loop has executed exactly maxLength - 1 times, and during the first iteration digit was at least 1. Therefore at this point result is at least radix^(maxLength - 2).

The maxLength check ahead of time is so that we can avoid the overflow checks during the main iteration. The overflow barrier is required for that last overflow check. We could write the algorithm without maxLength at all, and check for overflow at every iteration instead.

gzm0 · 2025-06-15

I see. So if we were willing to pay the overflow check on every iteration, we would not need maxLength.

Do we have any idea of what this performance / code size / complexity trade-off looks like?

It feels to me like a lot of complexity to optimize something that we'd expect to be somewhat slow anyway.
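For reference, the two-phase scheme discussed in this thread can be sketched as follows. This is not the PR's code: it handles only non-negative inputs without a sign character, returns an `Option` instead of throwing, and recomputes `maxLength` and `overflowBarrier` on each call, whereas the PR caches them per radix. All names are hypothetical, and `overflowBarrier = Int.MaxValue / radix` is an assumption about what the barrier holds.

```scala
object TwoPhaseParse {
  /** Parses an unsigned digit string into a non-negative Int, or None. */
  def parseNonNegative(s: String, radix: Int): Option[Int] = {
    // Assumed definitions of the two per-radix values.
    val maxLength = Integer.toString(Int.MaxValue, radix).length
    val overflowBarrier = Int.MaxValue / radix

    val len = s.length
    var i = 0
    // Trim leading zeros, keeping at least one character.
    while (i < len - 1 && s.charAt(i) == '0') i += 1
    val significant = len - i
    if (len == 0 || significant > maxLength) return None

    // Phase 1: at most maxLength - 1 digits, which cannot overflow,
    // so the loop needs no overflow check.
    val safeLen = if (significant == maxLength) len - 1 else len
    var result = 0
    while (i < safeLen) {
      val digit = Character.digit(s.charAt(i), radix)
      if (digit == -1) return None
      result = result * radix + digit
      i += 1
    }

    // Phase 2: only the maxLength-th digit can overflow.
    if (safeLen != len) {
      val digit = Character.digit(s.charAt(safeLen), radix)
      // Both sides are non-negative here, so a signed comparison suffices.
      if (digit == -1 || result > overflowBarrier) return None
      result = result * radix + digit
      // result * radix fit, so adding the digit can at most wrap to negative.
      if (result < 0) return None
    }
    Some(result)
  }
}
```

The single checked step at the end is what the overflow barrier is needed for; dropping `maxLength` would move that check into the main loop.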

gzm0 · 2025-06-08T08:40:14Z

javalib/src/main/scala/java/lang/Integer.scala

+  private final class StringRadixInfo(val maxLength: Int, val overflowBarrier: Int)
+
+  /** Precomputed table for parseIntInternal. */
+  private lazy val StringRadixInfos: Array[StringRadixInfo] = {


Did you compare this in code size and speed with two Array[Int]s? Especially since lookup in overflowBarriers would be rare, this might be worth it.

sjrd · 2025-06-08

Code size is unfortunately not meaningfully different. The extra lazy val and the extra loop to compute it compensate for the removed class StringRadixInfo (which is actually very small). IMO the code is less clean if we split into two arrays (see diff), so I lean towards keeping the class StringRadixInfo.

gzm0 · 2025-06-15

I just realized: isn't it (relatively) easy to calculate the overflow barrier? So if we stick with the version that pre-calculates safeLen (which I'm not super convinced about TBH, see other comment), would it be a trade-off worth considering to calculate overflowBarrier again if we actually need it?

I'm getting the feeling that we are over-optimizing here :-/ Without clear target use cases for parseInt, I feel it might be very difficult to choose the "right" trade-off.

gzm0

The code for the approach in this PR LGTM. However, now that I understand it better, I'm not sure it is worth the complexity to keep the lookup table.

Please let me know what your thoughts are. It might very well be that I'm missing something related to performance that is obvious to you.

Only use `Int` arithmetics. To detect overflow, we compute an overflow barrier in a way that should typically be constant-folded.

`Int` arithmetics are faster than `Double` arithmetics. The previous code used `Double`s to have a concise way of detecting the overflow, which is not bad on JS engines. However, with some careful analysis of the possible overflows, we can do better.

We split the implementation of `parseInt` and `parseUnsignedInt`, since they have more differences than common parts at this point. Moreover, the justifications are quite different in each.

The new algorithms are also much more Wasm-friendly.

sjrd · 2025-06-19T14:29:50Z

So, after a day spent exploring many alternatives and benchmarking them both on JS and Wasm, eventually I found the best of all worlds: it's faster than before on both engines; it stays as small as it is today; and it doesn't use any lookup table.

The only downside is that we have two different implementations for parseInt and parseUnsignedInt. That does not increase code size, because previously the common implementation was inlined in each to constant-fold the signed parameter anyway.
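To illustrate the table-free direction, here is a minimal sketch that checks against an overflow barrier on every iteration. It is not the merged implementation: it covers only non-negative inputs without a sign character, returns an `Option` instead of throwing, and assumes the barrier is `Int.MaxValue / radix`. The names are hypothetical.

```scala
object BarrierParse {
  /** Parses an unsigned digit string into a non-negative Int, or None. */
  def parseNonNegative(s: String, radix: Int): Option[Int] = {
    // If `radix` is a known constant at an inlined call site, an optimizer
    // may be able to fold this division into a constant.
    val overflowBarrier = Int.MaxValue / radix

    val len = s.length
    var result = 0
    var i = 0
    var ok = len != 0
    while (ok && i < len) {
      val digit = Character.digit(s.charAt(i), radix)
      if (digit == -1 || result > overflowBarrier) {
        // Invalid digit, or result * radix would exceed Int.MaxValue.
        ok = false
      } else {
        result = result * radix + digit
        // The multiplication fit, so the addition can at most wrap to negative.
        ok = result >= 0
        i += 1
      }
    }
    if (ok) Some(result) else None
  }
}
```

Compared with a lookup table, this costs one extra comparison per digit but needs no precomputed state and no leading-zero trimming.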

gzm0

Nice!

sjrd requested a review from gzm0 June 6, 2025 14:58

sjrd force-pushed the parseint-without-doubles branch from 6d4a737 to c46f3a5 Compare June 6, 2025 15:19

ekrich mentioned this pull request Jun 6, 2025

Improve Integer.parseInt based on new Scala.js implementation scala-native/scala-native#4363

Open

sjrd force-pushed the parseint-without-doubles branch from c46f3a5 to 4f2dccf Compare June 7, 2025 16:04

gzm0 requested changes Jun 8, 2025

sjrd force-pushed the parseint-without-doubles branch from 4f2dccf to f340b41 Compare June 8, 2025 10:15

sjrd requested a review from gzm0 June 8, 2025 10:16

gzm0 requested changes Jun 15, 2025

sjrd force-pushed the parseint-without-doubles branch from f340b41 to e9bfb21 Compare June 19, 2025 14:27

sjrd requested a review from gzm0 June 19, 2025 14:29

gzm0 approved these changes Jun 21, 2025

gzm0 merged commit d345530 into scala-js:main Jun 21, 2025
3 checks passed

sjrd deleted the parseint-without-doubles branch June 21, 2025 20:40
