-
Notifications
You must be signed in to change notification settings - Fork 3.8k
[sgen] Fix xref computation with tarjan bridge #18239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
In order to reproduce this issue I used the test case from dotnet/android#2518. Issues with tarjan bridge are pretty popular and, even though I either didn't try or couldn't reproduce other issues, this PR might fix all other similar reported issues like : dotnet/android#1368, dotnet/android#2049, #14282, dotnet/android#3905 |
When using `MONO_GC_DEBUG=bridge=` debug option
Between C# and java (on android) there are objects that live on both worlds. This means that there exists a C# object with a corresponding java object. The relationship between them is strong, meaning if C# object is alive then java object must stay alive, and vice-versa. We keep java bridge objects always alive through a GCHandle (on the java gc). When doing a C# collection we select all bridge objects that appear to be dead on the C# side. These objects are candidates for collection, assuming the java side has nothing against it. Before triggering a collection on the java side (following the mono gc) we switch all the strong gchandles to the java objects to be weak (for these objects that are candidate to be collected) and we recreate the reference graph from the C# side to the java side (if C# Bridge1 can reference C# Bridge2 then, on the java side, we add a reference from Java Bridge1 to Java Bridge2; this is done by adding to an array of references inside Java Bridge1). In order to minimize the amount of work that needs to be done on the java side, we compute the minimal amount of references that need to be added, by computing the strongly connected components of the object graph. An optimized way to construct the SCCs and the xrefs is by using the tarjan algorithm (https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm). Our algorithm is non recursive (scan_stack emulates the recursive order of traversal in a dfs algorithm, while loop_stack is the stack used by the algorithm). A color represents an SCC. We have some optimizations in place where we might merge colors if they don't contain bridges, since the client only cares about SCCs containing bridge objects and the links between them. color_merge_array is used to keep track of all neighbors of a node until we are creating the scc for that node. It is populated when scanning all the refs inside an object (compute_low). All the colors in the color_merge_array will be cross references with that scc. Before this commit we were only clearing the color_merge_array when creating an SCC. This is problematic because we could end up with xrefs inside of an SCC that belong to another SCC. Consider the simple graph of nodes 0,1,2,3 where 0 <=> 2, 0->1, 2->3. Assume we start scanning with node 0. When creating the SCC for node 3 the followed path is 0 -> 2 -> 3, while the loop stack will contain (0,1,2,3). After creating SCC for node 3, we will finish scanning node 2 which would detect the xref to Bridge3, which would have been added to the color_merge_array. Because node 2 is not the root of the SCC it belongs to (its lowlink points towards node 0 which has a lower index), we are not creating an SCC with it, and the link to node 3 remains in color_merge_array. Because the next node from the scan_stack is node 1, which is also the root of the SCC that it belongs to, we will create an SCC for it and wrongly add the node 3 reference from color_merge_array to it. In order to fix this issue, we will always clear the color_merge_array once we finished scanning the xrefs for a node. If the node in question is not the root of an scc, then we will remember them as xrefs pointing out from this object. When we finally reach node 0 (which will be the root of the SCC containing nodes 0 and 2), we will then know that all xrefs for this color are the union of the xrefs of all objects belonging to this color (which represents the objects that we are popping from the loop_stack until we encounter the root node). Even though this change adds required bookeeping for xrefs, I didn't notice any change in performance on the bridge tests that we have in mono/tests.
When this optimization is enabled, the tarjan bridge will create more SCCs in order to reduce amount of xrefs in the graph. This would render the `bridge-compare-to` debug flag unusable with tarjan bridge.
* [sgen] Include also derived classes as bridges When using `MONO_GC_DEBUG=bridge=` debug option * [sgen] Fix xref computation with tarjan bridge Between C# and java (on android) there are objects that live on both worlds. This means that there exists a C# object with a corresponding java object. The relationship between them is strong, meaning if C# object is alive then java object must stay alive, and vice-versa. We keep java bridge objects always alive through a GCHandle (on the java gc). When doing a C# collection we select all bridge objects that appear to be dead on the C# side. These objects are candidates for collection, assuming the java side has nothing against it. Before triggering a collection on the java side (following the mono gc) we switch all the strong gchandles to the java objects to be weak (for these objects that are candidate to be collected) and we recreate the reference graph from the C# side to the java side (if C# Bridge1 can reference C# Bridge2 then, on the java side, we add a reference from Java Bridge1 to Java Bridge2; this is done by adding to an array of references inside Java Bridge1). In order to minimize the amount of work that needs to be done on the java side, we compute the minimal amount of references that need to be added, by computing the strongly connected components of the object graph. An optimized way to construct the SCCs and the xrefs is by using the tarjan algorithm (https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm). Our algorithm is non recursive (scan_stack emulates the recursive order of traversal in a dfs algorithm, while loop_stack is the stack used by the algorithm). A color represents an SCC. We have some optimizations in place where we might merge colors if they don't contain bridges, since the client only cares about SCCs containing bridge objects and the links between them. color_merge_array is used to keep track of all neighbors of a node until we are creating the scc for that node. It is populated when scanning all the refs inside an object (compute_low). All the colors in the color_merge_array will be cross references with that scc. Before this commit we were only clearing the color_merge_array when creating an SCC. This is problematic because we could end up with xrefs inside of an SCC that belong to another SCC. Consider the simple graph of nodes 0,1,2,3 where 0 <=> 2, 0->1, 2->3. Assume we start scanning with node 0. When creating the SCC for node 3 the followed path is 0 -> 2 -> 3, while the loop stack will contain (0,1,2,3). After creating SCC for node 3, we will finish scanning node 2 which would detect the xref to Bridge3, which would have been added to the color_merge_array. Because node 2 is not the root of the SCC it belongs to (its lowlink points towards node 0 which has a lower index), we are not creating an SCC with it, and the link to node 3 remains in color_merge_array. Because the next node from the scan_stack is node 1, which is also the root of the SCC that it belongs to, we will create an SCC for it and wrongly add the node 3 reference from color_merge_array to it. In order to fix this issue, we will always clear the color_merge_array once we finished scanning the xrefs for a node. If the node in question is not the root of an scc, then we will remember them as xrefs pointing out from this object. When we finally reach node 0 (which will be the root of the SCC containing nodes 0 and 2), we will then know that all xrefs for this color are the union of the xrefs of all objects belonging to this color (which represents the objects that we are popping from the loop_stack until we encounter the root node). Even though this change adds required bookeeping for xrefs, I didn't notice any change in performance on the bridge tests that we have in mono/tests. * [sgen] Some logging improvements in tarjan bridge * [sgen] Disable optimization when comparing bridge outputs When this optimization is enabled, the tarjan bridge will create more SCCs in order to reduce amount of xrefs in the graph. This would render the `bridge-compare-to` debug flag unusable with tarjan bridge.
|
@BrzVlad - do you think this should be back ported to 2019-12 & 2019-10 ? |
|
I'd rather not backport since it is risky |
* [sgen] Include also derived classes as bridges When using `MONO_GC_DEBUG=bridge=` debug option * [sgen] Fix xref computation with tarjan bridge Between C# and java (on android) there are objects that live on both worlds. This means that there exists a C# object with a corresponding java object. The relationship between them is strong, meaning if C# object is alive then java object must stay alive, and vice-versa. We keep java bridge objects always alive through a GCHandle (on the java gc). When doing a C# collection we select all bridge objects that appear to be dead on the C# side. These objects are candidates for collection, assuming the java side has nothing against it. Before triggering a collection on the java side (following the mono gc) we switch all the strong gchandles to the java objects to be weak (for these objects that are candidate to be collected) and we recreate the reference graph from the C# side to the java side (if C# Bridge1 can reference C# Bridge2 then, on the java side, we add a reference from Java Bridge1 to Java Bridge2; this is done by adding to an array of references inside Java Bridge1). In order to minimize the amount of work that needs to be done on the java side, we compute the minimal amount of references that need to be added, by computing the strongly connected components of the object graph. An optimized way to construct the SCCs and the xrefs is by using the tarjan algorithm (https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm). Our algorithm is non recursive (scan_stack emulates the recursive order of traversal in a dfs algorithm, while loop_stack is the stack used by the algorithm). A color represents an SCC. We have some optimizations in place where we might merge colors if they don't contain bridges, since the client only cares about SCCs containing bridge objects and the links between them. color_merge_array is used to keep track of all neighbors of a node until we are creating the scc for that node. It is populated when scanning all the refs inside an object (compute_low). All the colors in the color_merge_array will be cross references with that scc. Before this commit we were only clearing the color_merge_array when creating an SCC. This is problematic because we could end up with xrefs inside of an SCC that belong to another SCC. Consider the simple graph of nodes 0,1,2,3 where 0 <=> 2, 0->1, 2->3. Assume we start scanning with node 0. When creating the SCC for node 3 the followed path is 0 -> 2 -> 3, while the loop stack will contain (0,1,2,3). After creating SCC for node 3, we will finish scanning node 2 which would detect the xref to Bridge3, which would have been added to the color_merge_array. Because node 2 is not the root of the SCC it belongs to (its lowlink points towards node 0 which has a lower index), we are not creating an SCC with it, and the link to node 3 remains in color_merge_array. Because the next node from the scan_stack is node 1, which is also the root of the SCC that it belongs to, we will create an SCC for it and wrongly add the node 3 reference from color_merge_array to it. In order to fix this issue, we will always clear the color_merge_array once we finished scanning the xrefs for a node. If the node in question is not the root of an scc, then we will remember them as xrefs pointing out from this object. When we finally reach node 0 (which will be the root of the SCC containing nodes 0 and 2), we will then know that all xrefs for this color are the union of the xrefs of all objects belonging to this color (which represents the objects that we are popping from the loop_stack until we encounter the root node). Even though this change adds required bookeeping for xrefs, I didn't notice any change in performance on the bridge tests that we have in mono/tests. * [sgen] Some logging improvements in tarjan bridge * [sgen] Disable optimization when comparing bridge outputs When this optimization is enabled, the tarjan bridge will create more SCCs in order to reduce amount of xrefs in the graph. This would render the `bridge-compare-to` debug flag unusable with tarjan bridge. Commit migrated from mono/mono@376c46b
|
Release status update A new Preview version of Xamarin.Android has now been published that includes the fix from this item. The fix is not yet included in a Release version. I will update this item again when a Release version is available that includes the fix. Fix included in Xamarin.Android 10.3.0.33 Fix included on Windows in Visual Studio 2019 version 16.6 Preview 2. To try the Preview version that includes the fix, check for the latest updates in Visual Studio Preview. Fix included on macOS in Visual Studio 2019 for Mac version 8.6 Preview 1. To try the Preview version that includes the fix, check for the latest updates on the Preview updater channel. |
|
Release status update A new Release version of Xamarin.Android has now been published that includes the fix from this item. Fix included in Xamarin.Android 10.3.1.0. Fix included on Windows in Visual Studio 2019 version 16.6. To get the new version that includes the fix, check for the latest updates or install the latest version from https://visualstudio.microsoft.com/downloads/. Fix included on macOS in Visual Studio 2019 for Mac version 8.6. To get the new version that includes the fix, check for the latest updates on the Stable updater channel. |
…)" This reverts commit c6b3e5d.
…)" This reverts commit c6b3e5d.
Between C# and java (on android) there are objects that live on both worlds. This means that there exists a C# object with a corresponding java object. The relationship between them is strong, meaning if C# object is alive then java object must stay alive, and vice-versa. We keep java bridge objects always alive through a GCHandle (on the java gc). When doing a C# collection we select all bridge objects that appear to be dead on the C# side. These objects are candidates for collection, assuming the java side has nothing against it. Before triggering a collection on the java side (following the mono gc) we switch all the strong gchandles to the java objects to be weak (for these objects that are candidate to be collected) and we recreate the reference graph from the C# side to the java side (if C# Bridge1 can reference C# Bridge2 then, on the java side, we add a reference from Java Bridge1 to Java Bridge2; this is done by adding to an array of references inside Java Bridge1). In order to minimize the amount of work that needs to be done on the java side, we compute the minimal amount of references that need to be added, by computing the strongly connected components of the object graph.
An optimized way to construct the SCCs and the xrefs is by using the tarjan algorithm (https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm). Our algorithm is non recursive (scan_stack emulates the recursive order of traversal in a dfs algorithm, while loop_stack is the stack used by the algorithm). A color represents an SCC. We have some optimizations in place where we might merge colors if they don't contain bridges, since the client only cares about SCCs containing bridge objects and the links between them. color_merge_array is used to keep track of all neighbors of a node until we are creating the scc for that node. It is populated when scanning all the refs inside an object (compute_low). All the colors in the color_merge_array will be cross references with that scc.
Before this commit we were only clearing the color_merge_array when creating an SCC. This is problematic because we could end up with xrefs inside of an SCC that belong to another SCC. Consider the simple graph of nodes 0,1,2,3 where 0 <=> 2, 0->1, 2->3. Assume we start scanning with node 0. When creating the SCC for node 3 the followed path is 0 -> 2 -> 3, while the loop stack will contain (0,1,2,3). After creating SCC for node 3, we will finish scanning node 2 which would detect the xref to Bridge3, which would have been added to the color_merge_array. Because node 2 is not the root of the SCC it belongs to (its lowlink points towards node 0 which has a lower index), we are not creating an SCC with it, and the link to node 3 remains in color_merge_array. Because the next node from the scan_stack is node 1, which is also the root of the SCC that it belongs to, we will create an SCC for it and wrongly add the node 3 reference from color_merge_array to it. In order to fix this issue, we will always clear the color_merge_array once we finished scanning the xrefs for a node. If the node in question is not the root of an scc, then we will remember them as xrefs pointing out from this object. When we finally reach node 0 (which will be the root of the SCC containing nodes 0 and 2), we will then know that all xrefs for this color are the union of the xrefs of all objects belonging to this color (which represents the objects that we are popping from the loop_stack until we encounter the root node).
Even though this change adds required bookeeping for xrefs, I didn't notice any change in performance on the bridge tests that we have in mono/tests.