Use PPrint to handle printing of REPL output values #23849

lihaoyi · 2025-09-01T01:40:08Z

This PR demonstrates using the https://github.com/com-lihaoyi/PPrint library to handle pretty-printing of values in the REPL.

Visible improvements:

Data structures like sequences and case classes are now nicely formatted and indented, including deeply nested data structures, to make best use of the vertical and horizontal space available
Strings are consistently quoted in collections and case classes, rather than sometimes quoted and sometimes not.
The rendering of Seq("") and Seq() is no longer identical (sometimes!)
Unusual characters within strings are now properly quoted, rather than being butchered during the rendering
(not shown) Character literals like 'X' are properly pretty-printed with quotes
Adjustments to the syntax highlighting colour scheme, making it subjectively easier to read and converging it with the scheme used by pprint's own internal highlighter:
- Unified StringColor and LiteralColor as green rather than red. This should help avoid the red of literals being visually confused with the red of error messages when pretty printing code during compilation errors, which is something I have had problems with in the past
- Highlighted capitalized identifies like Foo or Seq or List, since the vast majority of these identifiers are likely to be the companion object of types, and highlighting them helps greatly in visually finding your way around pretty-printed data structures

Before:

After:

Notes:

This PR only uses PPrint for formatting and not coloring, relying on the existing REPL code that deals with syntax highlighting (with tweaks). Using PPrint's highlighter directly would require a larger refactor that can come in a follow up iff we decide to do so
We build pprint/fansi/sourcecode from source using sourceGenerators. This requires a bit of patching to work around -Xexplicit-nulls and -Xfatal-warnings, but otherwise is straightforward and means for all intents and purposes it's just part of the Dotty codebase. We mangle the package paths to make them dotty.shaded.* packages to avoid conflict with user code
The verbosity of PPrint can be configured, e.g. we can decide whether we want to print field names or not. By default it prints field names for any case class with more than 1 field

I set the default pprint dimensions to width=100 height=50, and added a import dotty.shaded.pprint.pprintln to the predef of every REPL so users have pprintln available in scope. Users who want to print more than 50 lines can call pprintln which prints up to 100x500 by default, and can take a custom height=9999 if they want to print more.

The numbers 100x50 and 100x500 are heuristics:

100x50 as the default for echo-ed values is a heuristic optimizing for terminal use, where width=100 approximates the common maximum width people tend to format their code to (typically 80-120), and 50 reflects about 0.5 to 1 vertical screenful of text so it doesn't kick previous terminal output off the top of your terminal
100x500 as the default for pprintln is a heuristic optimizing for non-terminal use: it's about 5-10 vertical screenfuls of text, and about the limit of what we expect people to usefully be able to skim through. Typically, in most cases when the output is larger than this, you'd want to cut it down by selecting a subset of the output programmatically.
If the user really wants to print everything, they can run pprintln(foo, height=99999) or similar

The heuristics can be tweaked, but they should provide a decent baseline for printing a useful amount of output to screen without flooding the user's terminal.

TODO/Future-Work:

Automatically select max height/width based on terminal size, and provide a helper (similar to Ammonite's show(...)) to bypass the max height. For now, it's fixed at the default width of 100 columns
We can use the same approach to make use of os-lib and other libraries within scala3-compiler by building them from source
Make use of fansi elsewhere in the dotty codebase. e.g. the highlighting of stack traces via the code syntax highlighter is super ugly and could be cleaned up:

lihaoyi · 2025-09-01T01:40:42Z

CC @odersky @hamzaremmal as we discussed this when I visited lausanne

Gedochao · 2025-09-01T08:58:46Z

We can't use os-lib or requests while still supporting Java 8, as they require Java 11, so for now I just use java.io/java.nio to do the same thing. It looks super ugly, but when we start requiring Java >=17 we can clean this up

@lihaoyi we already require Java >= 17, the main branch is already set to 3.8

Gedochao · 2025-09-01T08:56:46Z

project/Build.scala

+    val downloads = Seq(
+      "https://repo1.maven.org/maven2/com/lihaoyi/pprint_3/0.9.3/pprint_3-0.9.3-sources.jar",
+      "https://repo1.maven.org/maven2/com/lihaoyi/fansi_3/0.5.1/fansi_3-0.5.1-sources.jar",
+      "https://repo1.maven.org/maven2/com/lihaoyi/sourcecode_3/0.4.3-M5/sourcecode_3-0.4.3-M5-sources.jar",


...wait, is 0.4.3-M5 a stable version? 🤔

might want to bring it to stable before we depend on it in the compiler repo 😅

It's almost a year old, so I guess so haha. I can tag a stable version if you would like, but the contents of the sourcejar will be unchanged

if it's become stabilised, by all means. 👍
I'd rather avoid milestone versions here.

I agree with @Gedochao. A stable version should be used here. Also, @lihaoyi what are the versioning scheme these 3 libraries follow? I'm not a fan of cloning the sources and change the package name. I prefer to just have a dependency and use the actual library (which we do for jline and will soon do for asm too).

Also, others have raised concerns in the past about using Scala libraries in the compiler codebase affecting the bootstrapping process. By building from source, we treat it effectively as Dotty's own source files, removing any divergence in the code paths: they are treated identically to scala3's own sources. If scala3 can compile itself, it should be able to compile these sources without issue

Why has it been downloaded every time?

Seems no checkmd5?

Extract the common version to fields?

We should indeed compile these from sources. We can depend on binaries for Java libraries (hence jline and asm are fine), but not for Scala libraries.

We should indeed compile these from sources. We can depend on binaries for Java libraries (hence jline and asm are fine), but not for Scala libraries.

Could you explain why? I thought Scala is maintaining backwards binary/tasty compatibility. Doesn't that mean we shohld always be able to depend on older scala 3 jars in the scala3 compiler regardless of how kuch bootstrapping we do?

Because circular dependencies are evil. They're not too evil across time (Av2 -> Bv1 -> Av1), but they're still difficult to reason about.

And even though Scala 3 will forever be backward compat, an eventual Scala 4 wouldn't. We shouldn't paint our build into a corner. Scala 2 tried this several times over its lifetime, and rolled back every time. It's a massive pain every time it happens. There would need to be a huge upside to depending on a binary for that to be offset.

lihaoyi · 2025-09-01T09:00:36Z

@Gedochao the community_build_n jobs were failing when I was using Java 11 APIs such as the java.net.HttpClient (transitively via requests-scala)

tgodzik · 2025-09-08T09:30:43Z

Looks to me like

scala3/project/scripts/bootstrappedOnlyCmdTests

Line 48 in 7cbadac

./bin/scala -classpath "$OUT1" -M "$MAIN" > "$tmp"

is failing, though I have no idea about that tests. Something connected to tmp and the new changes.

WojciechMazur · 2025-09-08T09:57:24Z

This line of test outputs gives us a hint: https://github.com/scala/scala3/actions/runs/17377385477/job/49667379148?pr=23849#step:9:8944
test 'hello world' = 'Building Dotty...
Which is printed for

scala3/project/scripts/bootstrappedOnlyCmdTests

Line 32 in 59b3679

test "$EXPECTED_OUTPUT" = "$(cat "$tmp")"

I bealive the test assumed that ./bin/scala is ready to use and does not need to require building compiler/runner again.
However, in the logs we can see that it does actually bootstrap the new binary and finishes by printing hello world as expected.
It might be related to changes made to the build file and introduced source generator.
A workaround can trimming the output to the last lines, but it would not fix the underlying issue - need to build compiler on each usage from ./bin/scala

lihaoyi · 2025-09-08T10:43:10Z

Shouldn't using bin/scala build the compiler locally? I assumed that such a script is meant to give you the scala compiler/REPL representing the compiler sources currently checked out, and not some cached version of the compiler/REPL built from earlier sources. Or is that mistaken?

Gedochao · 2025-09-08T11:33:11Z

Shouldn't using bin/scala build the compiler locally? I assumed that such a script is meant to give you the scala compiler/REPL representing the compiler sources currently checked out, and not some cached version of the compiler/REPL built from earlier sources. Or is that mistaken?

bin/scala runs the official Scala runner... being Scala CLI 😅 So no, it doesn't build the compiler locally.

lihaoyi · 2025-09-09T01:33:46Z

Thanks @Gedochao ! Looks like adding a caching layer to the source downloader is enough to avoid the noisy logs and make the test pass

lihaoyi · 2025-09-09T08:07:32Z

@WojciechMazur Another argument for downloading and unpacking sourcejars on the fly v.s. vendoring it statically is that we already do that for a bunch of different source jars in https://github.com/scala/scala3/blob/main/project/Build.scala: mtags-shared in one case and scalajs-ir in three other cases. So for PPrint/Fansi/SourceCode to do the same would be in line with the current design patterns

lihaoyi · 2025-09-09T08:55:47Z

I've cut over the shadedSourceGenerator implementation to use SBT's IO.* APIs rather than raw java.net/java.io. Now it should look a lot more similar to the code that downloads/unpacks mtags-shared and scalajs-ir

project/Build.scala

tgodzik

Thanks for the great work! We discussed it during the core meeting today and decided that it should be good to merge. I just have two minor comments to improve the maintainability

project/Build.scala

lihaoyi · 2025-09-11T02:14:45Z

In the last commits I set the default pprint dimensions to width=100 height=50, and added a import dotty.shaded.pprint.pprintln to the predef of every REPL so users have pprintln available in scope. Users who want to print more than 50 lines can call pprintln which prints up to 100x500 by default, and can take a custom height=9999 if they want to print more.

The numbers 100x50 and 100x500 are heuristics:

100x50 as the default for echo-ed values is a heuristic optimizing for terminal use, where width=100 approximates the common maximum width people tend to format their code to (typically 80-120), and 50 reflects about 0.5 to 1 vertical screenful of text so it doesn't kick previous terminal output off the top of your terminal
100x500 as the default for pprintln is a heuristic optimizing for non-terminal use: it's about 5-10 vertical screenfuls of text, and about the limit of what we expect people to usefully be able to skim through. Typically, in most cases when the output is larger than this, you'd want to cut it down by selecting a subset of the output programmatically.
If the user really wants to print everything, they can run pprintln(foo, height=99999) or similar

The heuristics can be tweaked, but they should provide a decent baseline for printing a useful amount of output to screen without flooding the user's terminal.

Setting the default max width and height based on the users terminal is a TODO, I can submit a follow up PR to do so

tgodzik

Looks good from my side. That is amazing work and persistance @lihaoyi ! Thanks!

sjrd · 2025-09-15T11:59:36Z

Could we squash all those commits, or at least have reasonable commit messages, please?

lihaoyi · 2025-09-15T12:04:08Z

@sjrd I assumed you would use the Squash and Merge button on the Github UI, that will embed the PR description in the commit message which will make it easy to follow understand the commit when going through history later on

lihaoyi · 2025-09-15T12:15:00Z

If you want I can squash it manually too, if you don't want to use the squash button

He-Pin · 2025-09-15T12:16:47Z

he means the . , which will present after squashing merge

lihaoyi · 2025-09-15T12:17:48Z

There's a button to squash-and-use-PR-description-as-commit-message IIRC? That's what I've been using this whole time the last decade

tgodzik · 2025-09-15T12:43:56Z

Done, with so many commits, github took the PR description as the message

Thanks again @lihaoyi !

lihaoyi added 2 commits September 1, 2025 09:14

.

d03ac8f

.

690ba71

lihaoyi added 20 commits September 1, 2025 09:48

.

e143442

build libraries from source

cf81680

build libraries from source

1be063a

.

2562be1

.

96b6d03

.

8351dbe

.

4bcb2c5

.

85f73ce

.

8074f15

.

ce4e81d

.

f48a8d8

.

5ac49a6

.

10bb601

.

b79e803

.

f4b89d7

.

2a4a1c1

fix

cb90ed2

.

a4cc58d

.

c7d109b

.

8082a5d

Gedochao reviewed Sep 1, 2025

View reviewed changes

Gedochao requested review from odersky, tgodzik, WojciechMazur and hamzaremmal September 1, 2025 09:00

.

7118ce2

.

94e599a

sjrd reviewed Sep 9, 2025

View reviewed changes

project/Build.scala Outdated Show resolved Hide resolved

.

22aa06c

tgodzik requested changes Sep 10, 2025

View reviewed changes

project/Build.scala Outdated Show resolved Hide resolved

project/Build.scala Outdated Show resolved Hide resolved

lihaoyi added 3 commits September 11, 2025 10:02

.

f526951

.

2e7fef6

.

49f5726

lihaoyi added 4 commits September 11, 2025 10:15

.

2fda5e1

.

4d1139c

.

38831b7

.

347bdd9

tgodzik approved these changes Sep 12, 2025

View reviewed changes

Gedochao removed the stat:needs decision Some aspects of this issue need a decision from the maintainance team. label Sep 15, 2025

Gedochao approved these changes Sep 15, 2025

View reviewed changes

tgodzik merged commit cacd117 into scala:main Sep 15, 2025
44 checks passed

Use PPrint to handle printing of REPL output values #23849

Use PPrint to handle printing of REPL output values #23849

Uh oh!

Conversation

lihaoyi commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lihaoyi commented Sep 1, 2025

Uh oh!

Gedochao commented Sep 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lihaoyi Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lihaoyi commented Sep 1, 2025

Uh oh!

tgodzik commented Sep 8, 2025

Uh oh!

WojciechMazur commented Sep 8, 2025

Uh oh!

lihaoyi commented Sep 8, 2025

Uh oh!

Gedochao commented Sep 8, 2025

Uh oh!

lihaoyi commented Sep 9, 2025

Uh oh!

lihaoyi commented Sep 9, 2025

Uh oh!

lihaoyi commented Sep 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

tgodzik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lihaoyi commented Sep 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tgodzik left a comment

Choose a reason for hiding this comment

Uh oh!

sjrd commented Sep 15, 2025

Uh oh!

lihaoyi commented Sep 15, 2025

Uh oh!

lihaoyi commented Sep 15, 2025

Uh oh!

He-Pin commented Sep 15, 2025

Uh oh!

lihaoyi commented Sep 15, 2025

Uh oh!

Uh oh!

tgodzik commented Sep 15, 2025

Uh oh!

Uh oh!

lihaoyi commented Sep 1, 2025 •

edited

Loading

lihaoyi Sep 1, 2025 •

edited

Loading

lihaoyi commented Sep 9, 2025 •

edited

Loading

lihaoyi commented Sep 11, 2025 •

edited

Loading