Proposal: Punning in record type definitions #12484
Incidentally, the LexiFi compiler had this exact feature for a long time, but a few years ago we removed it in order to reduce the diff with upstream. Can't remember right now if it was ever proposed for upstream inclusion. |
What is the advantage of this punning? It seems like record type definition is not that common. |
This syntax change also allows the punning of inline record definitions:
The advantage is conciseness as well as similarity with the expressions. |
+1 for this proposal. There are countless times that I have written It's a quite common practice in my experience to create toplevel types for groups of fields that would otherwise go to a single record type. One question (about a potential enhancement and I'm certainly not suggesting it being a blocker):
module M = struct
type foo
type bar
end
type t = {
foo : M.foo;
bar : M.bar;
}
(correct me if I'm wrong but) I don't think this works currently:
but it's probably the most desirable solution. An alternative sane and useful (but maybe weirder) choice would be
type t = {
M.foo; M.bar;
} |
Indeed, this is not supported. This hack could work I guess
But that's not really nice. I think it'd make sense to support |
That's precisely why I dislike this proposal. I prefer it when the eye can easily spot which language (type or expression) a definition is in. This proposal muddles that at the expense of the person reading the code, because we lose an important contextual signifier (the When I read the examples here, my brain is constantly wondering whether I'm not reading a record value and expanding the definitions to normal record field declarations. I wish I could simply read that instead (yes I'm very lazy). Current punning notations make the reasonable assumption that variable names often coincide with record field names or variable names (for the horrendously obscure In my opinion that's a wrong and undesirable assumption. You should have fewer type names than record field names because types hint at the regularities of your data. Suppose I add a creation date and a modification date to your
or perhaps:
Personally I'd rather read:
This proposal increases the entropy in the language. But I guess it works if you want to give more work to people who devise code formatting tools. |
I find this to be a compelling argument against this feature. A different argument against is that conciseness of type declarations does not matter much because type declarations are only written once, as opposed to record expressions and patterns which are written repeatedly. |
I agree with @dbuenzli and have the same reaction when reading the code examples above. I see no purpose to this proposal. |
I'm biased towards this feature but
I agree with this point in general, but think it is more or less sacrificeable in this case since IIUC there is no intertwining of expression language and record declaration at the syntax level. Adding this feature does reduce the ability to infer syntactical context at a glance to a degree but does not create intersection between type expressions and value expressions.
I disagree. For "common types" like (edit (clarification): I do not disagree that type names coinciding with record field names is less frequent, but I disagree that it is so rare and uncommon that the convenience of this syntax should be disregarded. I think the opposite is true: when type names (especially long and descriptive ones) do coincide with record field names, this syntax is very convenient and useful. Although it's arguably not without drawbacks, it's a neat feature and worthy of consideration, and I am in favor of it as it will materially benefit the codebases that I work with day in and day out.) e.g. when we have the following "inner" types,
type student_data = {
home_class : string;
enrollment_year : int;
}
type faculty_data = {
office_room_number : string;
faculty_id : string;
}
type personal_info = {
first_name : string;
last_name : string;
}
it is very desirable to be able to write
type person =
| Faculty of { personal_info; faculty_data }
| Student of { personal_info; student_data }
instead of
type person =
| Faculty of {
personal_info : personal_info;
faculty_data : faculty_data;
}
| Student of {
personal_info : personal_info;
student_data : student_data;
}
In this case, this feature not only saves keystrokes, it also ensures one does not accidentally declare a field with a typo in its field name such as In other words, the usefulness of this syntax differs depending on the type of program. I'd like to argue it is unfair to assume one should not write programs in OCaml where this coding style is beneficial, and when this coding style suits best, this syntax offers great benefits. |
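A minimal sketch of how the `person` example above interacts with the punning OCaml already has, assuming the expanded (currently valid) declaration. The helper names `make_student` and `full_name` are hypothetical and only serve the illustration.

```ocaml
type student_data = { home_class : string; enrollment_year : int }
type faculty_data = { office_room_number : string; faculty_id : string }
type personal_info = { first_name : string; last_name : string }

(* Expanded form, accepted today. Under the proposal this could be written
   as [Faculty of { personal_info; faculty_data }] and so on. *)
type person =
  | Faculty of { personal_info : personal_info; faculty_data : faculty_data }
  | Student of { personal_info : personal_info; student_data : student_data }

(* Expression punning: the variables share their names with the fields. *)
let make_student personal_info student_data =
  Student { personal_info; student_data }

(* Pattern punning on the inline records of both constructors. *)
let full_name = function
  | Faculty { personal_info; _ } | Student { personal_info; _ } ->
    personal_info.first_name ^ " " ^ personal_info.last_name
```

When field, type, and variable names all coincide like this, the declaration is the only place where the name has to be written twice.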
I would also like to disagree with this. There are valid cases to have record types whose field name labels are used only once (or twice) in value expression contexts. One prominent example is when interfacing with external systems that talk in JSON. In this scenario it is not uncommon to declare a type and use it in the value expression context only once (or twice). |
Looks harmless to me. Besides once you factor in modularity and program evolution, I don't think it makes much sense. At a certain point you might want or need to factor out that I don't think there's much gain to link field names and type names; on the contrary. But even if you think there is, the notation only seems to bring tiny benefits in edge cases which you trade against decreased overall simplicity, clarity and regularity. |
IMHO, I find the proposal interesting. |
In a codebase where every other longish field name coincides with its type name, it defies expectation, and in most cases, it constitutes a genuine typo. Well, if the argument is that typos in variable / field / type names are harmless, that's another discussion. At least my opinion is that typos should be avoided, and typos in the inconsistency category are outright harmful as they are a breeding ground for genuine mistakes.
Well, this is an (intuitive) feature and nothing is taken away from the language. IMO you have nothing to lose except that someone may write code that may not fit everyone's preference. I don't see your point about modularity and program evolution. When the time comes, just desugar and do it the old way. If your points are true, then there is no utility value in the existing field punning syntax.
Well, whether these are edge cases or not really depends on what you use OCaml for. As OCaml is a general-purpose language, and since both (1) it is reasonable to use long and descriptive type names and (2) certain users find this feature useful, IMO it is unfair to dismiss those cases. In cases where this syntax is useful, the benefits are not tiny:
Again, I don't think it is fair to overlook the benefits just because your code base / preferred coding style would not benefit from it. Also IMO this syntax does not outrightly decrease simplicity or clarity. You can indeed argue the opposite:
The only thing it affects is the regularity of the language, but this is purely a problem of preference and taste. You can configure your formatter to enforce your preferred style in codebases that you control. (sorry for the many edits and my bad grammar..) |
I'm not sure exactly how you come to this conclusion. The existing field punning syntax makes sense to me because it links the same kind of objects, named values and I find this equality to be pervasive in the code. The punning of this PR makes less sense to me because it links named values to their type name which I see as less essential. In general types are here to sort your values, not necessarily match the name you give them (unless you are strong into hungarian notation :-).
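For reference, a minimal sketch of the existing field punning mentioned here, which links a field to a value of the same name. The `point` type and the function names are hypothetical.

```ocaml
type point = { x : int; y : int }

(* Expression punning: { x; y } abbreviates { x = x; y = y }. *)
let make x y = { x; y }

(* Pattern punning: { x; y } binds variables named after the fields. *)
let sum { x; y } = x + y
```

In both cases the pun relates a field name to a value name; the proposal under discussion would instead relate a field name to a type name.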
I'm not overlooking the benefits, I'm weighing them. Languages are a delicate thing and, as with any system you design, more features and choices do not necessarily translate into a better system.
I find this view of a language rather problematic. The goal of a programming language is to have a common notation and semantics to explain computational processes to other humans. If we all live in different dialects, the benefits are lost.
|
"No utility value" was definitely an overstatement, as the situations are indeed different from some perspectives. That was short-sighted of me. Although it is less pervasive and its utility probably won't be universally accepted, I'd like to argue that linking field names and type names is reasonable, and there exist programmers doing that from time to time (and for good reasons). My understanding is that this syntax makes as much sense as the existing field punning for such programmers (me included, obviously) in those situations. A syntactic feature does not have to be essential / super-widely applicable to be helpful. I'm not arguing that this feature is unconditionally a good addition to OCaml; I just want to point out that this feature would be very useful to us and to argue that the benefits outweigh the drawbacks. Therefore I'd like to see this proposal accepted.
It is unarguably the case where type names in general need not match the field (/ variable / method / ..) names. But my point is that there exist valid, non-edge, and reasonable cases where such matching makes sense and is extremely useful.
Naming things in code is hard but important, so avoiding coming up with new names (when it's reasonable to do so) is a good thing. (And no, I'm not into Hungarian notation (I do occasionally adopt a similar naming scheme locally and selectively when it helps code readability))
Completely agree. Yet my understanding was that you were saying the benefits of this syntax are "tiny" because the codebases you work with and/or your preferred coding style do not benefit much from it. Which is a fine comment to make, but IMO unfair to this discussion given that this syntax is helpful to others and (arguably) in principle.
My apologies: my bad wording could be understood as claiming that sacrificing the regularity of the language by adding a feature is purely a problem of preference and taste in the general sense. I do share the view of your counter-argument and think such a sacrifice should be taken seriously and thought through in general. What I was trying to say is that in this case (by introducing the proposed syntax), the reduction of regularity is limited and it causes only minor (?) frustrations to some programmers but not fundamental problems to all programmers (at least in my understanding and from what I read in the comments.) Thus it boils down to a problem of preference and can be solved (at least partially) by formatters. I'd also like to stress that this syntax is not an oddity on its own, as OCaml already has the field punning syntax: again, the reduction in linguistic regularity is limited (although from a different perspective than the discussion in the last paragraph.) |
Maybe I'm missing something and my question isn't so much related to the punning proposal but, @bluddy :
|
@xvw It's based on experience. Even if one defines many record types, it's a tiny percentage of the code compared to the number of times one builds records. The high churn of needing to create and update records is what makes a compelling argument for punning. I guess one could get into the habit of having every function return a record type of its return values, but the very act of defining a type creates a burden on the programmer that discourages this kind of behavior, making tuples a better choice. Is this what you do? Do you define a record type per function return value? Otherwise I can't really understand the need to make record definition more nimble than it already is. |
I think this assumption holds more or less true depending on the kind of code you are writing. Of course when you are writing libraries your types describe a generic kind of data (dates, formats, locale, etc.) or data-structures (lists, trees, etc.). But when you are writing an application your types describe specific kinds of data which are only to be found locally. In this case it makes sense to name the types the same way you'll name variables holding that data (and thus fields holding them). In this respect, the example in my initial message was not convincing. It should read something more like
Moreover, all this discussion around variable names made me reconsider something I had put aside in order to make the proposal smaller: Maybe type variables should be allowed in punned fields.
This would encourage the use of significant names for type variables. |
One thing I'd really like to see is named return values. I think we could use the syntax of record types (without defining a record type) to provide this.
And, just like inline records, you wouldn't be able to bind on the "pseudo-record": you would need to destruct it. Conceptually, the function doesn't return a value, it returns several (just like constructors with an inline record don't hold a value but rather several).
This is the counterpart to named parameters which are already very useful in making function signature self-documenting. Consider
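A minimal sketch of how labeled parameters already document the input side of a signature; the function `make_range` is hypothetical and not taken from the discussion.

```ocaml
(* The inferred type is  start:int -> stop:int -> int list :
   the labels name each argument in the signature itself. *)
let make_range ~start ~stop =
  List.init (max 0 (stop - start)) (fun i -> start + i)

(* Labeled arguments can be passed in any order at a full application. *)
let example = make_range ~stop:5 ~start:2
```

The result type `int list` carries no such names, which is the asymmetry that named return values would address.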
|
Although I could think of places where punning type variables may be helpful in saving keystrokes, it seems like an unnecessary feature to me while adding completely novel grammar. The expected behavior for punning a field name with a field type would be immediate (or at least make sense, I believe) for programmers who are familiar with the existing field name punning in value expressions, while punning type variables would be a completely new syntax and I bet we can't say the same. Also, I kind of doubt its utility value. As type variables are completely locally scoped, there are fewer incentives for punning IMO: you can just abbreviate! Punning a field name with a type name is different: both field names and type names are non-local (to the type definition) and they have to have the correct (and sometimes long) name. |
I’m not an expert on type systems, but it seems to me that the named return value proposal you gave would easily mess up type inference. And worse (?), if the record type is indeed private, then .. it would cause so many problems. You’d want the type to be structural, and not a private record type, to use the return values sanely. (At least if the goal here is to have “named return values” instead of some special kind of abstraction: some function you can only call as the expression being matched upon) You really want the return type to be structural. And you already have a good candidate: object types. A closer alternative would be module types (not sure whether you can write one as a literal) but you won’t get variance support (I think?) What’s closer to what you want is probably the private record type for polymorphic variant types. I remember I asked Jacques Garrigue why there’s no such feature already. I cannot remember his exact answer but it was along the lines of “could be done, but not easily, and with a lot of design considerations (e.g. subtyping) to think of”. I use a technique called Poor man’s record type: Anyway, this direction of discussion is definitely off-topic and should probably happen somewhere else if we want/need to continue. |
To be clear, I don't want a record type. I just want to reuse the record syntax (type and expression and pattern) but I don't want there to be a type. It's just like inline records in patterns |
Alas, I used terrible and misleading terminology. By type foo = A of { field1: unit; field2: unit } The type for the variant's argument is nominal but isn't addressable by the programmer. What I was trying to argue is that having non-programmer-addressable inline nominal types as function return values is probably not a good idea. A better alternative is structural record(-like) types, which can already be achieved using object types. If the focus is on syntax, I can see it being a desirable thing, but it definitely needs more discussion (which is probably going to be even more controversial than this PR..) (I think a new issue/discussion should be created for this conversation, but I'm answering here for easier cross-reference. It would be nice if the repo admin could move these comments) |
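A minimal sketch of the object-type alternative mentioned above, assuming a hypothetical function `div_mod` that returns two named results. No type declaration is needed because object types are structural.

```ocaml
(* Inferred type: int -> int -> < quotient : int; remainder : int > *)
let div_mod a b =
  object
    method quotient = a / b
    method remainder = a mod b
  end

(* The caller reads the results by name. *)
let describe a b =
  let r = div_mod a b in
  Printf.sprintf "%d r %d" r#quotient r#remainder
```

Unlike the pseudo-record idea, the result here is an ordinary value that can be bound and passed around, and there is no pattern syntax to destructure it.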
This is a proposal for an addition to the syntax of OCaml: punning in record type definitions.
This is fully backwards compatible in that existing programs are completely unaffected. (Unless I got something wrong?) It allows punning in record type definitions (and in in-line records in variant definitions).
E.g.,
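A minimal sketch of the idea, with hypothetical type names `name` and `age`. The punned form appears only in a comment because current OCaml does not accept it; the code below it is the ordinary declaration it would stand for.

```ocaml
(* Hypothetical types, for illustration only. *)
type name = string
type age = int

(* Proposed punned form (not accepted by current OCaml):
     type person = { name; age }
   It would be read as the ordinary declaration below. *)
type person = { name : name; age : age }

(* Expression-level punning already exists and builds the same record. *)
let make_person (name : name) (age : age) : person = { name; age }
```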
Note that I am unfamiliar with the structure of the compiler. As a result, the implementation of this MR may be somewhat naïve. I'd welcome guidance on how to implement this better.