[SR-3342] Investigate Different Bridged Collection Allocation Layouts


   |                  |                 |
   |------------------|-----------------|
   |Previous ID       | SR-3342      |
   |Radar             | None         |
   |Original Reporter | Gankro (JIRA User)      |
   |Type              | Improvement    |

   <details>
  <summary>Additional Detail from JIRA</summary>

   |                  |                 |
   |------------------|-----------------|
   |Votes             | 0         |
   |Component/s       | Standard Library    |
   |Labels            | Improvement, Performance, StarterBug        |
   |Assignee          | None      |
   |Priority          | Medium      |

   

   md5: 6a7560206774e41295068639d7aa9105

  </details>





**Issue Description:**


(Note: all of the arguments found below also apply to Dictionary and Set; they have the same design.)

The current design of an Array bridged to NSArray involves three potential allocations in the non-verbatim bridged case (e.g. `Array<Int> as NSArray`):

-   A: The Native Array storage (includes count/capacity metadata)

-   B: The Bridged Array storage

-   C: `_SwiftDeferredNSArray`, a class that just stores pointers to A and B

Today when we bridge an Array we end up with this layout:

design 1:

    C
    +--> A [...]
    +--> B [...]

On first access of C as an NSArray, B is allocated and populated with the bridged contents of A. The idea behind this design is that an API may request that an NSArray be constructed **and then never access it**. In this case, we can save a lot of work by deferring the construction of B.
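The deferral described above can be modeled in a few lines. This is a minimal, single-threaded sketch, not the standard library's implementation: `NativeStorage`, `Box`, `BridgedStorage`, and `DeferredBridgedArray` are hypothetical stand-ins for A, a bridged element, B, and C, and the real code would have to publish B atomically.

```swift
// Hypothetical model of design 1. None of these types exist in the stdlib.

/// A: native storage (the real one also carries count/capacity metadata).
final class NativeStorage {
    let elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

/// Stand-in for one bridged element (an object wrapping a native value).
final class Box {
    let value: Int
    init(_ value: Int) { self.value = value }
}

/// B: bridged storage, one object per native element.
final class BridgedStorage {
    let objects: [Box]
    init(_ objects: [Box]) { self.objects = objects }
}

/// C: the wrapper that points at A and, once it has been built, at B.
final class DeferredBridgedArray {
    let native: NativeStorage                   // pointer to A
    private(set) var bridged: BridgedStorage?   // pointer to B, nil until first access

    init(native: NativeStorage) { self.native = native }

    func object(at index: Int) -> Box {
        // Not thread-safe: this sketch ignores concurrent first accesses.
        if bridged == nil {
            bridged = BridgedStorage(native.elements.map(Box.init))
        }
        return bridged!.objects[index]
    }
}
```

If `object(at:)` is never called, B is never allocated, which is the saving design 1 is built around.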

That said: can we eliminate one of these allocations? I think we can all agree A and B should be separate to avoid massive space wastage on never-bridged arrays. The question is whether we should fold C into A or B, producing:

design 2:

    A [...]
    +--> B [...]

Or

design 3:

    B [...]
    +--> A [...]

The standard library team concluded design 2 is incredibly dubious, as it has many significant disadvantages:

-   Requires Array operations to get bogged down in atomically invalidating B.

-   Leaks B, since it lies hidden in every native Array regardless of how many "bridged" Arrays are left.

-   Exposes A to objc_setAssociatedObject, which means the compiler can't optimize deinits of Arrays in the same way (they can have arbitrary hidden side-effects).

Regardless, this design has the advantage that it makes Array *verbatim bridgeable*, so bridging an Array of Arrays is a no-op. Bridging is still *not* toll-free, because B must still be constructed and populated on first access.

Design 3, however, is legitimately interesting. It avoids an extra allocation at the cost of always making the (much larger) B allocation. The initialization of B can still be deferred by storing an atomic flag and running a CAS loop on it (currently the pointer to B serves as this flag). Basically this design could be worth it if it turns out most bridged Arrays are actually used, making deferred allocation a waste.
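A rough model of design 3 shows where the cost moves. This is a single-threaded sketch under stated assumptions: `NativeStorage`, `Box`, and `EagerBridgedStorage` are hypothetical names, the plain `Bool` stands in for the atomic flag that a real implementation would update with a CAS, and a Swift `Array` of slots stands in for B's inline buffer (so this models when the space is reserved, not a literal single allocation).

```swift
// Hypothetical model of design 3. None of these types exist in the stdlib.

/// A: native storage.
final class NativeStorage {
    let elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

/// Stand-in for one bridged element.
final class Box {
    let value: Int
    init(_ value: Int) { self.value = value }
}

/// B with C folded in: holds the pointer to A, the "populated" flag,
/// and slots that are reserved eagerly but filled lazily.
final class EagerBridgedStorage {
    let native: NativeStorage              // pointer to A
    private(set) var isPopulated = false   // assumption: atomic in a real implementation
    private var slots: [Box?]              // space for B is paid for at bridging time

    init(native: NativeStorage) {
        self.native = native
        self.slots = Array(repeating: nil, count: native.elements.count)
    }

    func object(at index: Int) -> Box {
        // Not thread-safe: a real version would use a CAS on the flag here.
        if !isPopulated {
            for (i, element) in native.elements.enumerated() {
                slots[i] = Box(element)
            }
            isPopulated = true
        }
        return slots[index]!
    }
}
```

The work of boxing each element is still deferred, but the slot storage is reserved even for a bridged array that is never accessed, which is the trade-off weighed above.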

I suspect this isn't true, and Design 1 is actually the best one. A Dictionary containing Arrays is a good example of something that produces lots of bridged arrays that will probably never be accessed.

As such the Swift team has no intent to work on this. But this is a great issue for a Swift community member to look into! I'm happy to mentor anyone who wants to investigate.


   

[SR-3342] Investigate Different Bridged Collection Allocation Layouts #45930

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development


Previous ID	SR-3342
Radar	None
Original Reporter	Gankro (JIRA User)
Type	Improvement


Votes	0
Component/s	Standard Library
Labels	Improvement, Performance, StarterBug
Assignee	None
Priority	Medium

[SR-3342] Investigate Different Bridged Collection Allocation Layouts #45930

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions