
Simplify Records #174

Closed
eteeselink opened this issue Nov 1, 2014 · 26 comments

Comments

@eteeselink

Hi Lee (& contributors), as I wrote on Hacker News, I'd like to discuss some ideas about Records with you. I'm looking for your opinion most of all, at this point. If there is consensus, I'd be happy to help implementing things as well.

I use Records a lot in my codebases - usually I have Map<string, SomeRecordClass>, possibly nested in another map or two, all over the place.

I like the concept of generating a Record class at runtime. It doesn't mesh well with TypeScript, but it's very powerful and really leverages some of the things JavaScript can do better than many other languages.

What I'm not so sure about, is the very large set of methods a Record has. I don't often iterate over all the keys in a Record. I don't usually want to build a map of all fields in a record except those that are more than 5. Maybe, somehow, it's not very useful that Record derives from Map. I'm not even sure it needs to implement Iterable.

As long as you can convert it into something that does.

So, how about this:

  • Record does not implement Map or anything like that
  • Record has only two methods, set and toMap.
  • The rest of a Record's namespace is nicely saved for fields. It means you can easily access fields called anything other than 'set' and 'toMap'. You'd need to do record.toMap().get('set') if your Record has a field called 'set'. Feels controversial, but really in practice this seems like an edge case.
  • Note that you don't really need update: You can just do var newRecord = oldRecord.set('someSet', oldRecord.someSet.add(5)); which is hardly more verbose, and even shorter when you don't have TypeScript / ES6 lambda syntax.

This approach has a number of advantages. First of all, it becomes feasible to simply implement it with a vanilla JS object and Object.freeze. I'm not sure what the performance characteristics are, but I suspect that the structures underlying Map (and thus, the current Record) aren't that much more efficient than a simple shallow clone for very small amounts of fields (like is typical for records). set would just shallow-clone the object, overwrite one value (or more, maybe), then freeze the new object.
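A minimal sketch of the frozen-object idea described above, assuming a hypothetical `makeRecordType` helper and using the native `Map` for `toMap` (the proposal presumably means `Immutable.Map`). It is an illustration of the shallow-clone-then-freeze approach, not how Immutable.js implements Record:

```javascript
// Hypothetical helper: builds a record constructor from a set of default fields.
function makeRecordType(defaults) {
  // The only two methods live on the prototype, so own properties are all fields.
  const proto = {
    // Returns a new frozen record; the original is left untouched.
    set(key, value) {
      return create(Object.assign({}, this, { [key]: value }));
    },
    // Converts to something iterable (native Map here, as an assumption).
    toMap() {
      return new Map(Object.entries(this));
    },
  };

  function create(values) {
    // Shallow-clone defaults, apply overrides, then freeze the result.
    const rec = Object.assign(Object.create(proto), defaults, values);
    return Object.freeze(rec);
  }

  return create;
}

const Point = makeRecordType({ x: 0, y: 0 });
const p1 = Point({ x: 1 });
const p2 = p1.set("y", 5); // p1 is unchanged
```

Field access is plain property access (`p2.y`), which is what makes the shorthand work without `defineProperty`.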

That, in turn has a number of tangible advantages:

  • First, shorthand field access (record.fieldName) works on ES3 too (currently it is implemented with defineProperty so only ES5 browsers and up will manage)
  • Second, I suspect that there's more chances for leveraging TypeScript - Immutable.Record could simply be a base class (with set and toMap implementations) that people can derive their own classes from. I haven't completely worked this out yet, but I feel like there's ways here to at least make it somewhat better from the TypeScript perspective.

Finally, somewhat offtopic, but why did you choose obj.set('field', value) over obj.set({field: value})?

I know that all this totally breaks backward compatibility. At this stage I'm more interested in discussing the concept, and then if more people think it's a good idea, we can see if/how to implement it.

@eteeselink
Author

Slightly related, you can get TypeScript safety for updating record fields with a syntax like record.set(obj => obj.field = 6).

In this implementation, set() passes {} to obj in the callback, but in TypeScript-o-world we say that it's actually an instance of the record's class or interface. This way, typing record.set(obj => obj. in a fancy IDE gets you autocomplete with the available fields and all that. The return value of the callback will contain all fields that need to be updated.

Problem with this approach is that the syntax feels quite unnatural. A single set implementation can support both syntaxes of course, so it's no biggie, but still, a bit odd. Just wanted to share, in case this approach hadn't been considered yet.
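One way this dual syntax could look, as a rough sketch. `setFields` is a hypothetical standalone function (not a method on a real Record), and it reads the updates off the object passed to the callback:

```javascript
// Hypothetical: returns a copy of `record` with some fields changed.
// Accepts either (record, key, value) or (record, callback).
function setFields(record, keyOrCallback, value) {
  const changes = {};
  if (typeof keyOrCallback === "function") {
    // The callback receives an empty object and assigns the fields it wants changed.
    keyOrCallback(changes);
  } else {
    changes[keyOrCallback] = value;
  }
  // Shallow copy with the collected changes applied on top.
  return Object.assign({}, record, changes);
}

const original = { field: 1, other: "a" };
const viaCallback = setFields(original, (obj) => { obj.field = 6; });
const viaKey = setFields(original, "other", "b");
```

In TypeScript the callback parameter would be typed as the record's interface, which is where the autocomplete comes from.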

@abergs

abergs commented Nov 4, 2014

I'm interested in this conversation, commenting so that I get updates 👍

@pluma
Contributor

pluma commented Nov 4, 2014

@abergs You can subscribe to GitHub issues by clicking the subscribe button on the right. No need to comment.

@leebyron
Collaborator

leebyron commented Nov 5, 2014

I really agree with a lot of what you're saying here. Thanks for spending the time to write it up. I'm interested in devoting v4 to a better Record API.

First, I agree that Records implementing the Map spec doesn't really make sense. If you're iterating over your record as if it's homogeneous, then you're probably doing it wrong. This was my mistake.

I think the Record prototype API should be a little fuller than what you're proposing, but not by much. I think I would propose:

  • get – safely access any field, in case it collides with a prototype method. Also allows use with getIn called by a parent.
  • getIn — deep access
  • has — maybe even call this hasOwnProperty. Not sure.
  • set — persistent set
  • setIn — deep persistent set
  • update — has positive performance implications, allows point-free updater functions, and allows updateIn called from above.
  • updateIn — deep persistent update.
  • hashCode — treat Records as values
  • equals — treat Records as values
  • toObject — back to JavaScript land. Also, think of this as "toMutable"
  • toString — obvi.

I'm also considering having Record be a base prototype, not extending Object.

@abergs

abergs commented Nov 5, 2014

I really like that proposal @leebyron! Would be very nice with a sound Record API. :)

@leebyron
Collaborator

leebyron commented Nov 5, 2014

> This approach has a number of advantages. First of all, it becomes feasible to simply implement it with a vanilla JS object and Object.freeze. I'm not sure what the performance characteristics are, but I suspect that the structures underlying Map (and thus, the current Record) aren't that much more efficient than a simple shallow clone for very small amounts of fields (like is typical for records). set would just shallow-clone the object, overwrite one value (or more, maybe), then freeze the new object.

Unfortunately, Object.freeze carries a very significant performance hit, to the point where Map dramatically out-performs it. However, I do agree that a much simpler underlying (unfrozen) object likely wins the performance war. What I would love to do is benchmark the performance of gets and sets for these three cases (frozen obj, regular obj, Map). My expectation is that Map becomes more performant after some threshold of key size, and that we can internally intelligently switch to use Map only when copying huge objects becomes slower.

@leebyron
Collaborator

leebyron commented Nov 5, 2014

> First, shorthand field access (record.fieldName) works on ES3 too (currently it is implemented with defineProperty so only ES5 browsers and up will manage)

Unfortunately, there's no way to enforce immutability for ES3 with shorthand property access. Object.freeze is ES5 only, along with defineProperty.

Our only two options here are to drop support for ES3 (predominantly IE8) or to offer a non-shorthand API (get/set). That's ultimately why I decided to also expose get/set and not only property access. If you only care about ES5 browsers, you can use property access!

@leebyron
Collaborator

leebyron commented Nov 5, 2014

> Finally, somewhat offtopic, but why did you choose obj.set('field', value) over obj.set({field: value})?

Consistency with other uses of set both in this library and in ES6. Map and Set's set can't accept an object, because their keys can be anything, not just strings. Records must have strings, but keeping the API consistent was important. Map has merge which accepts an object. Maybe it makes sense to have something like Record.assign similar to Object.assign which conj's multiple records or objects together into a single Record which would support this case.

@eteeselink
Author

Woops, I forgot all the deep access things. I see why those are necessary. Not entirely sure about update: you'd not typically need to call it directly, and lacking it could probably be worked around from above in the object hierarchy. But even then, consistency is good, I guess. In general, great proposal.

One other strong advantage of backing Record by a plain ES3 object, is that you can simply pass a Record to any library that expects an object (including when it uses hasOwnProperty or Object.keys and so on), and it'll just work. Sure, toObject isn't far away, but it's nice if it doesn't matter.

One thing about freeze and enforcing immutability: this is JavaScript. Even if we try, it's a dynamically typed language that allows a lot of stuff, way too much probably. As such, I'm not entirely sure that it's necessary to enforce immutability in production code. Would it be interesting to ship two builds of the library, one for development, and one for production? Much like React, the development library would do all kinds of checks and Object.freeze and all that (skipping it if not available), and the production library would just back Record with a plain old object, end of story. I guess we can assume that even if Object.freeze does not work on every browser, it works on every developer's most commonly used browser.

If all development and automated testing uses a version that enforces immutability, then maybe it's no disaster if production code doesn't.

(I just realized that the React way might not be good enough; we'd want to be able to run automated tests against production code, but still have it fail if it breaks immutability. Users may expect harder guarantees here than f.ex. React's PropTypes).
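The two-build idea could be sketched roughly like this. `makeRecordFactory` and the `isDev` flag are made up for illustration; a real library would more likely decide this at build time:

```javascript
// Hypothetical factory: `isDev` stands in for whatever separates the two builds.
function makeRecordFactory(isDev) {
  // Skip freezing where Object.freeze is not available (pre-ES5 runtimes).
  const canFreeze = typeof Object.freeze === "function";
  return function createRecord(fields) {
    const rec = Object.assign({}, fields);
    // Development: enforce immutability. Production: plain object, no freeze cost.
    return isDev && canFreeze ? Object.freeze(rec) : rec;
  };
}

const devRecord = makeRecordFactory(true)({ a: 1 });
const prodRecord = makeRecordFactory(false)({ a: 1 });
```

The trade-off raised in the parenthetical above still applies: tests run against the production factory would not notice a stray mutation.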

Finally, good point with the signature of set. I like the idea of assign (as long as it lives on Record and not the prototype).

@leebyron
Collaborator

leebyron commented Nov 5, 2014

> One other strong advantage of backing Record by a plain ES3 object, is that you can simply pass a Record to any library that expects an object (including when it uses hasOwnProperty or Object.keys and so on), and it'll just work. Sure, toObject isn't far away, but it's nice if it doesn't matter.

Totally agree. It would be excellent to ensure these properties.

> One thing about freeze and enforcing immutability: this is JavaScript. Even if we try, it's a dynamically typed language that allows a lot of stuff, way too much probably. As such, I'm not entirely sure that it's necessary to enforce immutability in production code. Would it be interesting to ship two builds of the library, one for development, and one for production? Much like React, the development library would do all kinds of checks and Object.freeze and all that (skipping it if not available), and the production library would just back Record with a plain old object, end of story. I guess we can assume that even if Object.freeze does not work on every browser, it works on every developer's most commonly used browser.
>
> If all development and automated testing uses a version that enforces immutability, then maybe it's no disaster if production code doesn't.

Yes, we do this in React and also all over the place in FB production JavaScript. In dev mode, we are strict and throw exceptions loudly, and in production we're more likely to fail silently with the hope that nothing bad will happen that we didn't catch in dev.

In practice, this is pretty hard to manage outside of a larger software shop like Facebook. You really need significant coverage over your dev build to ensure all edge and corner cases are caught before production. This kind of discipline requires organizational work. Therefore, easy to fuck up. We even fuck it up once every few weeks, and need to hot-fix things that we should have caught in dev but didn't.

One step further into the future for Record is part of an ES7/ES8 spec that would enable VM optimizations and techniques for parallelization. One of the reasons to prefer immutable data in multi-threaded environments is you need no locks. JavaScript's WebWorker interface is simplistic to shield us from the perils of multi-threading, but a truly immutable value should be able to pass to and from WebWorkers without serialization overhead. I recognize that this is overkill when targeting ES3/5 browsers of today, but something I'm keeping top of mind.

@eteeselink
Author

I see your point. You can't expect all users of Immutable.js to have the same testing discipline.

However, in practice, what happens when code tries to mutate an immutable object?

  • Immutability enforced: throw error, execution stops. End-user experiences "it doesn't work when I click this button". (at Facebook, developers get a big red bar in their face and some styrofoam USB rocket launcher fires at the wrong guy)
  • Immutability not enforced: something weird goes wrong, errors later down the line. End-user experiences "it doesn't work when I click this button and then that one".

In my personal opinion, a warning with the Record documentation that the minified build does not by default enforce immutability, is good enough. The advantages are pretty big.

I don't know about how these ES7/8 VM tricks work, but I assume it'll mean telling the VM "we promise, this won't go wrong"? In that case, it's still a userland promise so there's no real difference to passing an Immutable.Map or a plain old JS object.

@leebyron
Collaborator

leebyron commented Nov 5, 2014

> in practice, what happens when code tries to mutate an immutable object?

IMHO immutability enforced, throw an error and stop execution.

I should clarify that the reason we don't use Object.freeze in production is purely because of the performance tax. I actually think Object.freeze's behavior of silently failing on mutation attempt is actually bad. It's really easy to think everything is working when it's subtly not working. Also, execution behavioral differences between development and production environments are often the source of great pain and frustration. "Bug repros in prod, but not in my dev environment" is a heart-sinking feeling where you know you're about to lose a day. I just want to avoid anything like that.

> I don't know about how these ES7/8 VM tricks work, but I assume it'll mean telling the VM "we promise, this won't go wrong"? In that case, it's still a userland promise so there's no real difference to passing an Immutable.Map or a plain old JS object.

If I can spec this, an ES7/8 environment could provide immutable Record as a VM promise, not a userland promise. That's the only way to let the VM optimize it (memory usage, access patterns, shared memory between threads). A polyfill of this for <= ES6 would offer immutability as a userland promise to provide the same developer API and behavior, but maybe not the same performance characteristics.

@leebyron
Collaborator

leebyron commented Nov 5, 2014

As an example of prod/dev behavior - at Facebook we use a lot of invariant() style assertions which check a condition and throw an error if it's broken. We don't remove these in production despite the fact that removing them could save some download bytes and condition checking. The value of having the same behavior in prod overrides that.
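For readers unfamiliar with the pattern, an invariant()-style assertion might look roughly like the sketch below. This is not Facebook's implementation, and `setField` is a made-up caller added only to show the check in use:

```javascript
// Throws when the condition is false; otherwise does nothing.
function invariant(condition, message) {
  if (!condition) {
    throw new Error("Invariant violation: " + message);
  }
}

// Hypothetical caller: refuses to set a field the record does not declare.
function setField(record, key, value) {
  invariant(
    Object.prototype.hasOwnProperty.call(record, key),
    'unknown field "' + key + '"'
  );
  return Object.assign({}, record, { [key]: value });
}

const updated = setField({ a: 1 }, "a", 2);

let invariantError = null;
try {
  setField({ a: 1 }, "b", 2);
} catch (err) {
  invariantError = err;
}
```

Keeping the check in both builds means the same call fails the same way in development and production.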

I think I've got us on a tangent though :), we both agreed above that expecting usage of a development build to catch issues could be problematic enough.

@plievone

plievone commented Nov 7, 2014

> I actually think Object.freeze's behavior of silently failing on mutation attempt is actually bad.

It throws in strict mode, but unfortunately depending on mutation attempt context, not on creation context, so it easily goes silent.
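A small demonstration of that point. The two writer functions are built with the `Function` constructor only so that one is sloppy-mode and the other strict, whatever mode the surrounding file runs in:

```javascript
const frozen = Object.freeze({ x: 1 });

// A Function-constructor body is sloppy-mode unless it opts in to strict mode itself.
const sloppyWrite = new Function("o", "o.x = 2;");
const strictWrite = new Function("o", '"use strict"; o.x = 2;');

// Sloppy-mode code: the write is silently ignored.
sloppyWrite(frozen);

// Strict-mode code: the same write throws a TypeError.
let strictError = null;
try {
  strictWrite(frozen);
} catch (err) {
  strictError = err;
}
```

The object is frozen once, but whether a mutation attempt is noticed depends on the mode of the code doing the writing.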

As an aside, immutable in React source tree uses Object.freeze followed by Object.seal for some reason.

@DavidTimms
Contributor

How about using a plain object to store the properties internally, then using getters to proxy from the record to the internal object? We could use setters to throw an error on a mutation attempt.

For browsers that don't support ES5, we can fall back to putting the properties directly on the record object, meaning they are mutable, but any attempt to mutate them would be caught by the developer because it would throw an error in their modern development browser.
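The ES5 half of this suggestion could be sketched as follows. `makeGuardedRecord` is a hypothetical name, and the ES3 fallback is left out:

```javascript
// Hypothetical factory: exposes `values` through getters and rejects writes.
function makeGuardedRecord(values) {
  const internal = Object.assign({}, values); // private copy holding the data
  const record = {};
  Object.keys(internal).forEach((key) => {
    Object.defineProperty(record, key, {
      enumerable: true,
      // Reads are proxied to the internal object.
      get() {
        return internal[key];
      },
      // Writes throw in both strict and sloppy code, unlike Object.freeze.
      set() {
        throw new Error('Cannot set "' + key + '" on an immutable record');
      },
    });
  });
  return record;
}

const user = makeGuardedRecord({ name: "Ada", age: 36 });

let mutationError = null;
try {
  user.age = 37;
} catch (err) {
  mutationError = err;
}
```

Because the properties are enumerable, `Object.keys(user)` still lists the fields, which helps when passing the record to code that expects a plain object.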

@leebyron
Collaborator

leebyron commented Nov 8, 2014

Thanks David, I think that's probably pretty close to what I'd like to do.

@Pauan

Pauan commented Nov 8, 2014

Ignoring whether this is a good idea or not, I would just like to point out that the performance of making a copy of an object is not as good as you might think.

Here is a benchmark I ran (using Node.js and Benchmark.js). It tests the performance of changing a key in an object that has 1 key:

Mutable object         x 12,002,295 ops/sec ±0.31% (98 runs sampled)
Mutable object copying x  7,729,702 ops/sec ±0.58% (102 runs sampled)
Frozen object copying  x    223,040 ops/sec ±1.52% (93 runs sampled)
Immutable-js Map       x  2,979,875 ops/sec ±0.50% (99 runs sampled)

As you can see, frozen objects take a huge performance hit in V8.

With 4 keys, the performance of Map started to be better than object copying:

Mutable object         x 12,002,520 ops/sec ±0.33% (102 runs sampled)
Mutable object copying x  1,464,477 ops/sec ±0.90% (98 runs sampled)
Frozen object copying  x    117,928 ops/sec ±1.61% (91 runs sampled)
Immutable-js Map       x  1,847,183 ops/sec ±1.46% (95 runs sampled)

And as the number of keys increased, the performance of object copying degraded. Here's 10 keys:

Mutable object         x 11,323,007 ops/sec ±1.51% (95 runs sampled)
Mutable object copying x    697,039 ops/sec ±1.35% (100 runs sampled)
Frozen object copying  x     58,053 ops/sec ±2.99% (96 runs sampled)
Immutable-js Map       x  1,883,083 ops/sec ±2.06% (89 runs sampled)

For any object that has more than 3 keys, it is clear that there is no performance benefit to object copying. So if you change Records, you should do so for reasons other than performance.
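For orientation, here is one plausible reading of the three plain-object rows in the tables above. This is not the benchmark script itself, the function names are made up, and no timings are implied:

```javascript
// "Mutable object": change the key in place.
function mutate(obj, key, value) {
  obj[key] = value;
  return obj;
}

// "Mutable object copying": shallow-copy every own key, then change one.
function copyAndSet(obj, key, value) {
  const next = {};
  for (const k in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, k)) {
      next[k] = obj[k];
    }
  }
  next[key] = value;
  return next;
}

// "Frozen object copying": same copy, followed by Object.freeze.
function copySetAndFreeze(obj, key, value) {
  return Object.freeze(copyAndSet(obj, key, value));
}

const source = { a: 1, b: 2 };
const copied = copyAndSet(source, "a", 10);
const frozenCopy = copySetAndFreeze(source, "a", 20);
const mutated = mutate({ a: 1, b: 2 }, "a", 30);
```

The copy loop touches every key, which is consistent with the copying rows getting slower as the key count grows.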

@DavidTimms
Contributor

Thanks for those numbers. I didn't realize Maps were quite that fast.

@eteeselink
Author

Nice numbers indeed! I had blatantly assumed that it wouldn't be so bad, so it's good to stand corrected.

My goal, however, wasn't performance, but simplicity. It looks like the solution we're converging to isn't much simpler - it appears that the current implementation (backed by a Map) is maybe just fine.

One goal I had was being able to pass a Record to any function that expects a plain old JS object, but it appears that that's nearly impossible to achieve.

Because of all this, I still stand by the API changes as @leebyron proposed (fewer methods, don't derive from Map), but not by any of the "how this could be done" ideas I proposed.

Btw, Lee, I do believe the proposal of methods lacks a toMap or toKeyedIterable or something like that. Immutable.Map(myRecord.toObject()) would do the trick, but feels like a waste.

@abergs

abergs commented Nov 13, 2014

If we do choose to leave Map as the backing implementation, do we still get the structural sharing in the new implementation of Record? Because that... would be nice:)

By the way, I was just thinking of ways to further use structural sharing. Would it be possible to have a data structure that would represent a Collection of Records, where all records have structural sharing between one another? Say I have a large Collection where a lot of data is similar. A record may have 10-15 properties and we have 1000 Records in our Collection.

Wouldn't it be very swell if we could share similar properties between Records? :)

I'm very new to the world of immutability so what I say might just be crazy talk. Would appreciate any feedback regardless :)

@leebyron
Collaborator

@abergs - if each record in the collection was a modification of a previous record then this works today! Just sub List for "Collection".

Structural sharing is the result of (Representation A) + (Operation X) = (Representation B), so if you can express your 1000 similar records in terms of "previous-record + operation" then absolutely.

Otherwise, it's actually pretty hard to do the reverse, taking B - A to get X. It's possible, but you're likely to trade a 2-3x improvement in memory footprint for a linear (O(N)) degradation in runtime performance.
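The "previous-record + operation" idea can be illustrated with plain frozen objects. This only shows the principle; Immutable.js uses its own internal structures, and the field names and values below are invented:

```javascript
// Representation A: a record with two nested values.
const recordA = Object.freeze({
  name: "first",
  address: Object.freeze({ city: "Utrecht", zip: "3500" }),
  tags: Object.freeze(["x", "y"]),
});

// Operation X: change one top-level field.
// Representation B reuses every untouched value from A by reference.
const recordB = Object.freeze(Object.assign({}, recordA, { name: "second" }));

const sharesAddress = recordB.address === recordA.address;
const sharesTags = recordB.tags === recordA.tags;
```

Two records built independently with equal contents would not share anything, which is the "B - A" case described above.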

@leebyron
Collaborator

@eteeselink good points, thanks again for your input. Conversion methods like toObject and friends will need to be there.

I will probably still implement by backing with Map.

> One goal I had was being able to pass a Record to any function that expects a plain old JS object, but it appears that that's nearly impossible to achieve.

This is indeed tricky. Record uses property accessor functions, which allow the myRecord.myField expression, but that requires an ES5 runtime that supports this feature (e.g. not IE8). So fairly modern browsers will do fine with treating Record as a simple read-only object, but broad browser support requires using get().

@leebyron
Collaborator

@Pauan mind sharing your perf script? It would be interesting to run it again since some perf improvements for small Maps have been added and I'm not sure if it was before or after your test.

@Pauan

Pauan commented Dec 21, 2014

@leebyron Sure: https://gist.github.com/Pauan/ea872c10d32d8d11ebd0

I just ran the benchmarks again with Immutable-js version 3.4.1 (installed using npm), and I got the same results.

@dminkovsky

I am ignorant as to the whole of the motivation for Record, so I'm probably way off target here... but I discovered Record and this thread while looking for some way to make something like an immutable monadic Bean, if you know what I am saying. I am backing a model object with a Map, and providing setters that return a new model object backed by a new map that is the result of set()ting on the original map. But what I really want is to not use Object objects at all.
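The pattern described here might look something like the sketch below. It uses the native `Map` as a stand-in (so each `set` copies the map, whereas an `Immutable.Map` would return a new map directly), and the `Model` class is hypothetical:

```javascript
// Hypothetical model object backed by a map; setters return a new model.
class Model {
  constructor(data) {
    this._data = data; // treated as read-only by convention
  }

  get(key) {
    return this._data.get(key);
  }

  // Builds a new backing map with the change applied and wraps it in a new model.
  set(key, value) {
    const next = new Map(this._data);
    next.set(key, value);
    return new Model(next);
  }
}

const m1 = new Model(new Map([["title", "draft"]]));
const m2 = m1.set("title", "final");
```

The original model is never modified, so callers holding `m1` keep seeing `"draft"`.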

@leebyron
Collaborator

Merging this into #286.


8 participants