Description
This is an issue to discuss adding yet another form of sum type to Go. This is not a full proposal, and not a language change that I expect will be adopted in the near future. I am filing this issue as a point of reference to discuss this particular approach.
A sum type, tagged union, variant, variant record, choice type, discriminated union, disjoint union, or coproduct is a type which can hold a value of one of a fixed set of types. (Thanks Wikipedia for the list.) Rust calls these enums, which means that when someone says Go should support "enums" they might be asking for a C-style enumeration of integers or a Rust-style enumeration of types.
I posit that most people who want this feature in Go are looking for something along the lines of a Rust enum: https://doc.rust-lang.org/book/ch06-01-defining-an-enum.html
To guide what follows, I'm going to also propose a few motivating examples which any design should support well.
- A Direction type holding one of four cardinal directions: North, South, East, and West.
- A Maybe type which either holds a value of some type or nothing.
- An IP type holding either a [4]byte IPv4 address or a [16]byte IPv6 address.
A digression: Can we just use interfaces?
Many proposals for sum types in Go aim to extend interface types to cover this case. Since interfaces already support listing a set of type variants for use in type constraints, it's tempting to say that we could just permit something like this:
type IPv4 [4]byte
type IPv6 [16]byte
type IP interface { IPv4 | IPv6 }
I think that while this is tempting, it's the wrong approach.
First, this doesn't support the Direction case well at all. Do we need to define types for North/South/East/West?
The zero value of an interface is nil. Do we require that sum types all have an additional zero-value case that users need to handle, even if they don't actually want a zero value?
You need to define top-level symbols (types) for every variant. That's not so bad in the IPv4/IPv6 case, but it's not difficult to imagine types where this is a problem.
The syntax doesn't scale to lots of variants.
I just don't like this approach; I think it's trying to force two similar but different concepts (interfaces and sum types) together in a way that doesn't work well.
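For comparison, here is a minimal sketch of how the IP example is often emulated in Go today, using an interface with an unexported marker method (sometimes called a "sealed" interface). This is not part of the proposal; the method name isIP and the helper describe are hypothetical. It illustrates two of the points above: the zero value is nil rather than any variant, and every variant needs its own top-level type.

```go
package main

import "fmt"

// IP is a "sealed" interface: only types in this package can implement it,
// because the marker method isIP is unexported. (Hypothetical name.)
type IP interface{ isIP() }

// Every variant needs its own top-level type.
type IPv4 [4]byte
type IPv6 [16]byte

func (IPv4) isIP() {}
func (IPv6) isIP() {}

// describe reports which variant ip holds. The compiler does not check
// that the type switch is exhaustive, and nil must be handled explicitly.
func describe(ip IP) string {
	switch ip.(type) {
	case IPv4:
		return "IPv4"
	case IPv6:
		return "IPv6"
	case nil:
		return "nil (no variant)"
	default:
		return "unknown"
	}
}

func main() {
	var zero IP // the zero value of an interface is nil, not IPv4
	fmt.Println(describe(zero))
	fmt.Println(describe(IPv4{127, 0, 0, 1}))
}
```

Callers have to handle the extra nil case whether or not they want a "no value" state.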
Digression over, proposal
Go structs are a "product type" in the terminology of type theory. Let's define a sum type using similar syntax to our product types. We'll call a sum type a "union" in the following, but I'm not wedded to the name:
type Direction union {
North, South, East, West atom
}
type Maybe[T any] union {
Unset atom
Set T
}
type IP union {
IPv4 [4]byte
IPv6 [16]byte
}
This is exactly the struct syntax, except instead of fields we'll have "variants".
I've also introduced a new predeclared type here, called "atom". (Again, I'm not wedded to the name. "unit" might also work.) An atom is a struct{}, just like an any is an interface{}:
type atom = struct{}
We use atoms to indicate variants that have no data. The new identifier is introduced because it looks weird and hacky to write struct{} all over the place and struct{}{} is worse.
A value of a union type always contains exactly one of the type variants. The zero value of a union is the zero value of the first variant, so for our above examples: North, Unset, and 0.0.0.0. Yes, this means the order of the variants in the declaration is important. So is the order of values if you use iota to define an enum with something like const ( North = Direction(iota); South; East; West ). It's fine.
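The iota comparison can be checked in today's Go. The sketch below spells out the one-line const declaration mentioned above and adds a String method purely for printing (the method is an addition for illustration, not part of the proposal). The zero value of Direction is whichever constant is declared first.

```go
package main

import "fmt"

// Direction is a conventional integer enum.
type Direction int

const (
	North = Direction(iota) // 0: also the zero value of Direction
	South
	East
	West
)

// String is added only so the values print readably.
func (d Direction) String() string {
	switch d {
	case North:
		return "North"
	case South:
		return "South"
	case East:
		return "East"
	case West:
		return "West"
	}
	return fmt.Sprintf("Direction(%d)", int(d))
}

func main() {
	var d Direction // zero value
	fmt.Println(d, d == North)
}
```

Reordering the constants changes which direction the zero value denotes, which is the same sensitivity to declaration order that the union design has.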
Selectors on variant types: T.Variant
Variants of a union may be named with a selector expression on the type like T.Variant. This is going to be ambiguous with method selectors, but selectors are already ambiguous on fields and methods so I think that's fine.
For a variant of type atom, the selector expression evaluates to a value of the union type with that variant. So for example:
d := Direction.North // d is a Direction with variant North
if d == Direction.North {
}
switch d {
case Direction.North:
case Direction.South:
case Direction.East:
case Direction.West:
}
For a non-atom variant, evaluating the selector expression is an error.
m = Maybe[int].Set // error: evaluating non-atom union selector
Selectors on variant values: v.Variant
As with structs, a selector expression may be used on a value of a union type.
In an assignment statement, a selector expression sets the value of a union to that variant.
// Set m to the "Set" variant with value 42.
var m Maybe[int]
m.Set = 42
m.Unset = atom{} // but m = Maybe.Unset is probably clearer
// Set ip to 127.0.0.1
var ip IP
ip.IPv4 = [4]byte{127, 0, 0, 1}
When evaluated, a selector expression on a non-atom variant yields two values: the value of the variant and an untyped bool indicating whether that is the currently set variant.
A selector expression on an atom variant yields only one value: An untyped bool indicating whether that is the currently set variant.
var m Maybe[int]
if v, ok := m.Set; ok {
// v is the value of the Maybe
}
if m.Unset {
// the Maybe is not set
}
To avoid accidents, it is an error to assign an atom variant to two values, or a non-atom variant to a single value.
v := m.Set // error: this must be v, ok := m.Set
v, ok := m.Unset // error: this must be just v := m.Unset
Composite literals
Composite literals of union type must contain at most one keyed element. They may not contain elements without a key.
d := Direction{North: atom{}} // same as d := Direction.North
m := Maybe[int]{Set: 42}
m := Maybe[int]{42} // error: elements must be keyed
ip := IP{IPv4: [4]byte{...}, IPv6: [16]byte{...}} // error: too many elements
Union switches
We add a new form of switch statement, the "union switch". This behaves much the same as a type switch, but over the variants of a union.
switch d.(union) {
case North:
case South:
case East:
case West:
}
var m Maybe[int]
switch v := m.(union) {
case Unset:
// v is not in scope here.
case Set:
// v has type int
}
Union switches must be exhaustive. You can include a default case if you want a non-exhaustive switch.
// Error: Failed to handle all cases.
switch d.(union) {
case North:
}
// This is fine.
switch d.(union) {
case North:
default:
}
Addressability
Union variants are not addressable. (But you can put a pointer to something addressable in a variant if you want.)
Comparability
Unions are comparable if all their variants are comparable.
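This mirrors the existing rule for structs, which are comparable if all their fields are. As a rough illustration in today's Go, the sketch below emulates the IP union as a tagged struct; the names ipKind, kindIPv4, kindIPv6, NewIPv4, and NewIPv6 are hypothetical and not part of the proposal. Because every field is comparable, the emulated union works with == and as a map key.

```go
package main

import "fmt"

// ipKind is a hypothetical tag recording which variant is set.
type ipKind uint8

const (
	kindIPv4 ipKind = iota // first variant, so it is the zero value
	kindIPv6
)

// IP emulates `union { IPv4 [4]byte; IPv6 [16]byte }` as a tagged struct.
// All fields are comparable, so IP itself is comparable.
type IP struct {
	kind ipKind
	v4   [4]byte
	v6   [16]byte
}

func NewIPv4(a [4]byte) IP  { return IP{kind: kindIPv4, v4: a} }
func NewIPv6(a [16]byte) IP { return IP{kind: kindIPv6, v6: a} }

func main() {
	a := NewIPv4([4]byte{127, 0, 0, 1})
	b := NewIPv4([4]byte{127, 0, 0, 1})
	fmt.Println(a == b)                   // same variant, same value
	fmt.Println(a == NewIPv6([16]byte{})) // different variants

	// Comparable values can be used as map keys.
	hits := map[IP]int{a: 1}
	hits[b]++
	fmt.Println(hits[a])
}
```

Unlike a real union, this emulation stores both payloads at once and nothing stops code in the package from setting the wrong field for a tag.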
FAQ
Q: Do we need this atom type? We could just omit the type from the union definition for the "has no value case".
That works, but it looks a lot like an embedded type:
type Maybe[T any] union {
Unset // looks like an embedded Unset type
Set T
}
Also, atoms are pretty nice for non-union uses, like map[string]atom or chan atom.
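Those non-union uses can be tried today by declaring the alias locally, since atom is not predeclared in current Go. A small sketch, where the helper newSet is a hypothetical name:

```go
package main

import "fmt"

// atom is declared locally here; the proposal would make it predeclared.
type atom = struct{}

// newSet builds a set of strings using atom as the empty value type.
func newSet(items ...string) map[string]atom {
	set := make(map[string]atom, len(items))
	for _, it := range items {
		set[it] = atom{}
	}
	return set
}

func main() {
	seen := newSet("north", "south")
	_, ok := seen["north"]
	fmt.Println(ok)

	// A chan atom carries a signal and no data.
	done := make(chan atom)
	go func() {
		close(done)
	}()
	<-done
	fmt.Println("done")
}
```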
Q: How about just Direction{North}? Direction{North: atom{}} is weird.
This works, but Direction{North} looks like an unkeyed composite literal and we're going to interpret North as a key. Maybe that's okay. But you can just write Direction.North, which is simpler.
Q: How about m := Maybe[int].Set(42), as opposed to Maybe[int]{Set: 42}?
I considered extending type conversion syntax for creating a union variant in this fashion. I don't like it, though, because it makes the variant look like a type. One of my notions in this design is that while in programming language theory each variant may be a type, expressing those types in the Go type system just adds confusion.
Q: What if I write x := someMaybe.Set?
Evaluating a selector for a non-atom variant always yields two values, and it is an error to assign the selector to a single value. This isn't like map access where the second ", ok" value is optional.
If we allowed a selector like this to evaluate to a single value, we'd have two choices:
It could evaluate to the value of the variant, and panic if that isn't the current variant. People hate panics. If you're really certain of the currently selected variant, you can write x, _ := someMaybe.Set and discard the bool.
Alternatively, it could evaluate to a bool indicating whether the variant is selected, so you could write if someMaybe.Set { … }. However, this invites incredibly confusing errors if the value of the variant is a bool:
m := Maybe[bool]{Set: false}
if m.Set {
// This executes, because Set is the selected variant,
// even though the *value* of Set is false...
}
So, no, selectors on non-atom variants always yield two values.
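The distinction between "the variant is set" and "the value of the variant" can be seen in an emulation written in today's Go. This is only a sketch under assumptions: the struct layout, the constructor Some, and the method Get are hypothetical stand-ins for the proposed Maybe[T]{Set: v} and v, ok := m.Set.

```go
package main

import "fmt"

// Maybe emulates the proposed union with a tag plus a payload.
// Its zero value is the "Unset" variant, matching the rule that
// the first variant is the zero value.
type Maybe[T any] struct {
	set   bool
	value T
}

// Some stands in for the composite literal Maybe[T]{Set: v}.
func Some[T any](v T) Maybe[T] { return Maybe[T]{set: true, value: v} }

// Get stands in for the two-value selector `v, ok := m.Set`.
func (m Maybe[T]) Get() (T, bool) { return m.value, m.set }

func main() {
	m := Some(false)
	v, ok := m.Get()
	fmt.Println(v, ok) // the value is false, but the variant is set

	var unset Maybe[bool]
	_, ok = unset.Get()
	fmt.Println(ok) // the zero value has no value set
}
```

Returning both results keeps the two bools apart, which a single-value form could not do for Maybe[bool].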
Q: I really don't want adding a new variant to my exported union to be a breaking API change, but the exhaustive union switch requirement makes it one.
Add a blank case to your union:
type Extendable union {
_ atom // default case
North, South, East, West atom
// new directions to be added as science discovers them
}
Now union switches all need to contain a "default:" clause. Also, your zero value is now distinct from any of the defined variants.
Q: Can I have methods on a union type?
Of course, it's just a type.
type IP union {
IPv4 [4]byte
IPv6 [16]byte
}
func (ip IP) String() string { ... }
Q: Do you think we should add this (or some form of sum type)?
I don't know.
The argument in favor is that this is one of the most common language feature requests (usually phrased as a request for "enums", which may indicate that this often comes from frustrated Rust programmers). Some form of union or sum type is a conventional language feature, and it's perhaps a bit strange that Go doesn't have one. I think the above design also fits fairly nicely into the language; I don't think we'd have regretted it if it had been present in Go 1.0.