Revive productElementName to extract case class field names #6951

Closed
wants to merge 1 commit into from

olafurpg
Contributor

This commit adds two methods to the `scala.Product` trait:
```scala
trait Product {
  /** Returns the field name of element at index n */
  def productElementName(n: Int): String
  /** Returns field names of this product. Must have same length as productIterator */
  def productElementNames: Iterator[String]
}
```

Both methods have a default implementation which returns the empty
string for all field names.

This commit then changes the code-generation for case classes to
synthesize a `productElementName` method with actual class field names.
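
To make the two pieces concrete, here is a minimal hand-written sketch of the default implementation and of an override resembling what the compiler might synthesize. `NamedProduct` and `UserLike` are hypothetical stand-ins (not part of the standard library or of this diff), used so the snippet compiles independently of the Scala version; the exact exception message is also an assumption.

```scala
// Hypothetical stand-in for scala.Product, for illustration only.
trait NamedProduct {
  def productArity: Int

  // Default: every valid index maps to the empty string.
  def productElementName(n: Int): String =
    if (n >= 0 && n < productArity) ""
    else throw new IndexOutOfBoundsException(n.toString)

  // Derived from productElementName, so it has productArity elements.
  def productElementNames: Iterator[String] =
    Iterator.range(0, productArity).map(productElementName)
}

// Hand-written equivalent of a synthesized override for
// `case class User(name: String, age: Int)`.
final class UserLike(val name: String, val age: Int) extends NamedProduct {
  def productArity: Int = 2

  override def productElementName(n: Int): String = n match {
    case 0 => "name"
    case 1 => "age"
    case _ => throw new IndexOutOfBoundsException(n.toString)
  }
}
```

With these definitions, `new UserLike("Susan", 42).productElementNames.toList` evaluates to `List("name", "age")`.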

The benefit of this change is that it becomes possible to pretty-print
case classes with field names, for example:
```scala
case class User(name: String, age: Int)

def toPrettyString(p: Product): String =
  p.productElementNames.zip(p.productIterator)
   .map { case (name, value) => s"$name=$value" }
   .mkString(p.productPrefix + "(", ", ", ")")

toPrettyString(User("Susan", 42))
// res0: String = User(name=Susan, age=42)
```

The downside of this change is that it produces more bytecode for each
case-class definition. Running `:javap -c` for a case class with three
fields yields the following results:
```scala
> case class A(a: Int, b: Int, c: Int)
> :javap -c A
  public java.lang.String productElementName(int);
    Code:
       0: iload_1
       1: istore_2
       2: iload_2
       3: tableswitch   { // 0 to 2
                     0: 28
                     1: 33
                     2: 38
               default: 43
          }
      28: ldc           78                 // String a
      30: goto          58
      33: ldc           79                 // String b
      35: goto          58
      38: ldc           80                 // String c
      40: goto          58
      43: new           67                 // class java/lang/IndexOutOfBoundsException
      46: dup
      47: iload_1
      48: invokestatic  65                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
      51: invokevirtual 70                 // Method java/lang/Object.toString:()Ljava/lang/String;
      54: invokespecial 73                 // Method java/lang/IndexOutOfBoundsException."<init>":(Ljava/lang/String;)V
      57: athrow
      58: areturn
```

Thanks to Adriaan's help, the estimated cost per `productElementName`
appears to be a fixed 56 bytes plus 10 bytes for each field, with
the following per-field breakdown:

* 3 bytes for the
  [string info](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.4.3)
  (the actual characters are already in the constant pool)
* 4 bytes for the tableswitch entry
* 2 bytes for the ldc to load the string
* 1 byte for areturn

In my opinion, the bytecode cost is acceptably low thanks to the fact
that field name literals are already available in the constant pool.
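
As a rough illustration of the estimate above, the arithmetic can be written out as a small helper. `BytecodeCostEstimate` and its members are hypothetical names introduced here; the constants are only the figures quoted in this description, not measured values.

```scala
// Hypothetical helper; the numbers are the estimates quoted in the description.
object BytecodeCostEstimate {
  val FixedBytes: Int = 56

  // string info + tableswitch entry + ldc + areturn
  val PerFieldBytes: Int = 3 + 4 + 2 + 1

  /** Estimated bytes added by productElementName for a case class with `fieldCount` fields. */
  def estimatedBytes(fieldCount: Int): Int = {
    require(fieldCount >= 0, "fieldCount must be non-negative")
    FixedBytes + PerFieldBytes * fieldCount
  }
}
```

Under this estimate, the three-field class `A` above would cost `BytecodeCostEstimate.estimatedBytes(3)`, that is 86 bytes.
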
@scala-jenkins scala-jenkins added this to the 2.13.0-M5 milestone Jul 18, 2018
@olafurpg
Contributor Author

This is an alternative solution for #6936. The difference is that this PR

  • does not change the behavior of toString, which minimizes the breaking change
  • makes it possible to implement custom pretty-printing formats for case classes by exposing productElementName to library developers

@olafurpg
Contributor Author

I haven't yet validated the exact bytecode overhead from this change. A good approach would be to compile larger projects and measure the total bytes added to the resulting packaged jar.

I opened this PR too soon: I just realized that I unfortunately won't have time to complete it, as I'm already quite occupied with getting a stable Scalafix release out. Feel free to pick this up if you are interested. No attribution needed; the diff is mostly just uncommenting existing code anyway.

@olafurpg olafurpg closed this Jul 18, 2018
@julienrf
Contributor

What was the reason for removing productElementName?

@olafurpg
Contributor Author

I don't know if productElementName ever made it into an official Scala release, but there was a comment saying it should be possible to accomplish the same functionality by inspecting the classfiles instead (I assume via runtime reflection, and it was removed to minimize generated bytecode).
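
For comparison, a minimal sketch of what the runtime-reflection route could look like, using only `java.lang.reflect`. `ReflectedNames` is a hypothetical helper, and the approach rests on assumptions that do not hold in general: the JVM does not guarantee the order of `getDeclaredFields`, `val`s declared in the class body also show up as fields, and symbolic names appear in their encoded form.

```scala
import java.lang.reflect.Modifier

// Hypothetical helper: a best-effort guess at field names, not a reliable API.
object ReflectedNames {
  def fieldNames(p: Product): List[String] =
    p.getClass.getDeclaredFields.toList
      // Skip compiler-generated and static fields.
      .filterNot(f => f.isSynthetic || Modifier.isStatic(f.getModifiers))
      .map(_.getName)
      // Heuristic: assume the constructor parameters come first.
      .take(p.productArity)
}

final case class User(name: String, age: Int)
```

For `User("Susan", 42)` this typically finds the fields `name` and `age`, but the caveats above are part of why a compiler-synthesized `productElementName` is attractive.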

@SethTisue
Member

> In my opinion, the bytecode cost is acceptably low

agree

@retronym
Member

retronym commented Jul 18, 2018

We could ratchet the fixed cost down a bit by adding a static factory method to scala.runtime for the IOOBE. Java 9 introduced a constructor overload that accepts an int; sadly, we can't use that yet.

We should test the interaction with case classes that define or inherit a non-default implementation of productElementName.
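
A minimal sketch of the static-factory idea, under the assumption that the helper simply builds and throws the exception so each synthesized method only needs a single call in its default branch. `ProductErrors.ioobe` and `SynthesizedLike` are hypothetical names for illustration, not the method proposed for scala.runtime.

```scala
// Hypothetical shared helper: one place that boxes/formats the index and throws.
object ProductErrors {
  def ioobe(n: Int): Nothing =
    throw new IndexOutOfBoundsException(String.valueOf(n))
}

// Hand-written stand-in for a synthesized method of
// `case class A(a: Int, b: Int, c: Int)`.
object SynthesizedLike {
  def productElementName(n: Int): String = n match {
    case 0 => "a"
    case 1 => "b"
    case 2 => "c"
    case _ => ProductErrors.ioobe(n) // replaces new/dup/box/toString/<init> at each call site
  }
}
```

The point of the design is that the allocation, boxing and constructor call live once in the helper instead of being repeated in every case class.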

@@ -48,4 +48,15 @@ trait Product extends Any with Equals {
* @return in the default implementation, the empty string
*/
def productPrefix = ""

def productElementName(n: Int): String =
if (n >= 0 && n < n) ""
Member

n < productArity?

Contributor Author

Yes, good catch.

.map { case (name, value) => s"$name=$value" }
.mkString(p.productPrefix + "(", ", ", ")")
println(pretty(User("Susan", 42)))
}
Member

We should also test with:

  • non-alphanumeric names (is the name.decoded done before code gen?)
  • case classes with non-public elements (just in case there are any problems in the current implementation related to these comments)
  • case classes with secondary parameter lists or with secondary constructors (again, to make sure we don't have latent bugs in product* code gen)
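
A sketch of what such test inputs could look like. It assumes a Scala version in which `Product.productElementNames` is available (2.13 or later); the object and case class names are hypothetical, and the printed names are deliberately not asserted here because they depend on how the code generation handles encoding.

```scala
// Assumes Scala 2.13+; all names below are made up for illustration.
object ProductNameCases {
  // Non-alphanumeric and keyword-like field names.
  final case class Symbolic(`type`: Int, `a-b`: String, `::`: Int)

  // A non-public element.
  final case class PrivateField(a: Int, private val b: Int)

  // A secondary parameter list: only the first list contributes to the product.
  final case class TwoLists(a: String, b: Int)(val c: Boolean)

  // A secondary constructor.
  final case class Secondary(a: Int, b: Int) {
    def this(a: Int) = this(a, 0)
  }

  def describe(p: Product): String =
    p.productElementNames.zip(p.productIterator)
      .map { case (name, value) => s"$name=$value" }
      .mkString(p.productPrefix + "(", ", ", ")")

  val cases: List[Product] = List(
    Symbolic(1, "x", 2),
    PrivateField(1, 2),
    TwoLists("a", 1)(true),
    new Secondary(1)
  )
}
```

A test could print `ProductNameCases.cases.map(ProductNameCases.describe)` and compare it against a check file, and at minimum verify that each case yields as many names as `productArity`.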

Contributor Author

Good idea

Contributor Author

@olafurpg olafurpg left a comment

Thank you for the review @retronym. I won't be able to address the comments (at least not until around September) but maybe someone from the contributors thread https://contributors.scala-lang.org/t/case-class-tostring-new-behavior-proposal-with-implementation/2056/40?u=olafurpg is motivated to pick this up.

@cb372
Contributor

cb372 commented Jul 25, 2018

I’d like to pick this up if nobody else has already. I’ll look at adding some of the tests suggested above.

@fommil
Contributor

fommil commented Jul 29, 2018

FYI scalaz-deriving's instance of Show is able to do this without touching the case class, supporting older versions of scala.

import scalaz.{deriving,Show}

@deriving(Show)
case class Foo(s: String)

etc, for arbitrary arity.

@SethTisue SethTisue removed this from the 2.13.0-M5 milestone Aug 22, 2018