Simplify RID model and handle Linux libc flavors in orchestrated builds #113765

Merged
merged 17 commits into from
Apr 14, 2025

Conversation

jkoritzinsky
Member

This is needed to fix the linux-bionic builds in the VMR.

Contributes to dotnet/source-build#4955

@Copilot Copilot AI review requested due to automatic review settings March 21, 2025 16:10
Contributor

@Copilot Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • eng/DotNetBuild.props: Language not supported

Member

@ViktorHofer ViktorHofer left a comment

Doesn't this imply that due to L18, the passed in TargetOS value will never be used?

@jkoritzinsky
Member Author

Yes. The problem is that the runtime's build scripts process and change the passed in TargetOS when determining the RID. Because of our "nested inner build" logic, we effectively were stripping the "-bionic" part of "linux-bionic" from the inner build invocation and building the product in a weird blend of linux, linux-bionic, and android.

By letting this calculation happen again and overwriting the TargetOS based on the RID, we can restore the calculation of the TargetOS to the correct value by re-extracting it from the RID (which we calculate based on the originally-passed-in TargetOS).
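
As a rough illustration of re-extracting the OS from the RID, here is a minimal shell sketch. It assumes the RID has the form `<os>-<arch>` with the architecture as the last dash-separated segment. The helper name `os_from_rid` is hypothetical and is not part of the runtime build scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recover the OS portion of a RID by dropping the
# last dash-separated segment (assumed to be the architecture).
os_from_rid() {
    local rid="$1"
    printf '%s\n' "${rid%-*}"
}

os_from_rid "linux-bionic-arm64"   # prints linux-bionic
os_from_rid "linux-musl-x64"       # prints linux-musl
os_from_rid "linux-x64"            # prints linux
```

Because only the final segment is removed, a multi-part OS such as linux-bionic survives intact, which is the property the fix relies on.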

@tmds
Member

tmds commented Mar 24, 2025

My 2 cents: it may be interesting to globally introduce a property that is the "portable rid OS corresponding to the target". That one would be linux-bionic. dotnet/sdk#44800 is also about having such a property.

@am11
Member

am11 commented Mar 24, 2025

Because of our "nested inner build" logic, we effectively were stripping the "-bionic" part of "linux-bionic" from the inner build invocation and building the product in a weird blend of linux, linux-bionic, and android.

It sounds like we are missing handling of linux-bionic somewhere next to the existing handling of linux-musl, which should have the same behavior but doesn't (e.g. Alpine distro packages use source-build without issues). If we can find and fix that problem, I think that would be a better-targeted fix than exposing these props.

Ideally, eng/OSArch.props + eng/RuntimeIdentifier.props type of platform spec resolution should be shared via arcade/eng/common so the handling of supported platforms and effective platforms (build/host/target) is unified across the board; single source of truth.

@jkoritzinsky
Member Author

Alpine doesn't have a problem because musl and glibc builds are more similar.

@jkoritzinsky
Member Author

Coming back to this, here's the flow of what's happening in main and why musl doesn't have a problem here:

  1. When the VMR orchestrator passes -os linux-bionic to the runtime outer build script, we hit this block:

    runtime/eng/build.sh

    Lines 290 to 292 in 7efe7f7

    linux-bionic)
        os="linux"
        __PortableTargetOS=linux-bionic
  2. We pass the os variable as the TargetOS property to the build:
    arguments="$arguments /p:TargetOS=$os"
  3. We initialize the RID based on the os variable:

    runtime/eng/build.sh

    Lines 133 to 147 in 7efe7f7

    initDistroRid()
    {
        source "$scriptroot"/common/native/init-distro-rid.sh
        local passedRootfsDir=""
        local targetOs="$1"
        local targetArch="$2"
        local isCrossBuild="$3"
        # Only pass ROOTFS_DIR if __DoCrossArchBuild is specified and the current platform is not an Apple platform (that doesn't use rootfs)
        if [[ $isCrossBuild == 1 && "$targetOs" != "osx" && "$targetOs" != "android" && "$targetOs" != "ios" && "$targetOs" != "iossimulator" && "$targetOs" != "tvos" && "$targetOs" != "tvossimulator" && "$targetOs" != "maccatalyst" ]]; then
            passedRootfsDir=${ROOTFS_DIR}
        fi
        initDistroRidGlobal "${targetOs}" "${targetArch}" "${passedRootfsDir}"
    }
  4. The outer build tries to evaluate the TargetOS property based on the RID, but this is ignored:
    <TargetOS>$(TargetRid.Substring(0, $(_targetRidPlatformIndex)))</TargetOS>
  5. We pass the TargetOS property to the inner build:
    <InnerBuildArgs Condition="'$(DotNetBuildSourceOnly)' != 'true'">$(InnerBuildArgs) $(FlagParameterPrefix)os $(TargetOS)</InnerBuildArgs>
  6. The inner build sees the TargetOS property as linux instead of linux-bionic:

    runtime/eng/build.sh

    Lines 264 to 265 in 7efe7f7

    linux)
    os="linux" ;;
  7. Because linux-bionic is not a cross build but linux is, we end up without the -cross flag and the build fails.
  • linux-musl is a cross build like linux, so -cross will be present. Since we have the correct RID passed in, we end up building everything correctly even though we pass -os linux today.
  • If we pass the -cross flag for linux-bionic, we must pass a rootfs. If we pass a rootfs, we end up building the Android crypto stack instead of the OpenSSL one, because providing the rootfs causes us to miss setting "Force use OpenSSL Crypto".

This change lets the logic at step 4 correct the adjustment in step 2 using the RID that's passed in by the orchestrator.
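
The flow above can be modeled in a few lines of shell. This is a simplified sketch, not the real build.sh: `normalize_os` only imitates the collapsing of linux-bionic to linux from steps 1 and 6, `os_from_rid` is a hypothetical stand-in for the step 4 substring logic, and the RID value is an assumed example.

```shell
#!/usr/bin/env bash
# Simplified model of the -os handling discussed above; not the real build.sh.

# Imitates steps 1 and 6: linux-bionic collapses to plain linux.
normalize_os() {
    case "$1" in
        linux-bionic) echo "linux" ;;
        *)            echo "$1" ;;
    esac
}

# Hypothetical stand-in for step 4: take the OS back from the RID by
# dropping the last dash-separated segment (assumed to be the architecture).
os_from_rid() {
    printf '%s\n' "${1%-*}"
}

requested_os="linux-bionic"
target_rid="linux-bionic-arm64"   # assumed RID supplied by the orchestrator

# Before the change: the inner build only receives the collapsed value.
inner_os_before="$(normalize_os "$requested_os")"

# After the change: TargetOS is recomputed from the RID before it is forwarded.
inner_os_after="$(os_from_rid "$target_rid")"

echo "before: $inner_os_before"
echo "after:  $inner_os_after"
```

In this model the "before" value has lost the -bionic suffix, while the "after" value keeps it because the RID still carries the full OS name.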

I'd like to get in this change as-is. After we switch to flat code-flow to the VMR, we can revisit how we handle RIDs and platform specification and simplify it from the VMR where we can do all the changes in one PR.

@tmds
Member

tmds commented Mar 26, 2025

Note that in the first step, build.sh sets __PortableTargetOS=linux-bionic. dotnet/runtime then uses that to initialize _portableOS_. __PortableTargetOS/_portableOS_ gets lost when it goes from the vmr outer build to the runtime inner build (TargetOS=linux). For linux-musl the __PortableTargetOS gets "recovered" here.

@am11
Member

am11 commented Mar 26, 2025

Thanks for the explainer @jkoritzinsky and @tmds.

We were discussing whether it makes sense to maintain a standalone linux-bionic RID going forward: #106748 (comment). Telemetry could probably clarify how many non-Android installations are using linux-bionic in .NET 8 and 9, and whether it would make sense to provide it with feature switches on top of the android runtime. A separate RID implies that the maintainers of third-party nupkgs with native assets need to provide two same/similar/identical binaries with different names per arch; both require the Android NDK to be present on the build machine regardless of whether the code uses any auxiliary platform feature. If the use cases of linux-bionic are only dev-oriented (such as Termux, which is an emulator), then we can likely live with a single android runtime. For example, the Rosetta emulator is also supported by osx-* RIDs without a specialized package for it. The runtime just disables double mapping (DOTNET_EnableWriteXorExecute=0) among other features when an emulator is detected. Windows, OSX and FreeBSD C libs are also tied to the platform (as bionic-libc is tied to Android) with no specialized runtime packs.

__PortableTargetOS/_portableOS_ gets lost when it goes from the vmr outer build to the runtime inner build (TargetOS=linux).

Probably it would be a better fix here if PortableTargetOS is propagated separately from VMR to inner builds. That sort of subclass platform context may be useful in other places.

@tmds
Member

tmds commented Mar 26, 2025

Probably it would be a better fix here if PortableTargetOS is propagated separately from VMR to inner builds. That sort of subclass platform context may be useful in other places.

Yes, that was my suggestion in #113765 (comment).

@jkoritzinsky
Member Author

I don't want to propagate it separately as our inner build invocation should be as close to a standard build invocation as possible. I don't want to have to maintain two different ways of passing the OS.

I'll see if I can make this a little cleaner.

…props so we can reduce duplication and get rid of the TreatAsLocalProperty property. Remove ToolsRID
… our output rid, OutputRID, or for the host RID, NETCoreSdkRuntimeIdentifier.
@jkoritzinsky
Member Author

I've pulled apart the RID calculations again, significantly cleaned them up, and reduced the set of different RID concepts. Let's see how CI likes it.

@jkoritzinsky
Member Author

/azp run runtime, runtime-diagnostics, runtime-dev-innerloop, dotnet-linker-tests

Azure Pipelines successfully started running 4 pipeline(s).

@NikolaMilosavljevic
Member

I've verified these changes with a full source-build (Fedora x64). I did not notice any differences between artifacts produced in the test build and the baseline. My test might not have exercised the code paths fully. @jkoritzinsky let me know if there is a more appropriate source-build scenario.

@ViktorHofer
Member

@jkoritzinsky assuming that this PR will remove the remaining runtime patch in the VMR, we need to get this in before April 23rd so that we are patch free by that date. Just in case you didn't know about this deadline.

@jkoritzinsky
Member Author

jkoritzinsky commented Apr 10, 2025

Yeah I split off #114285 because of the deadline. I just need a code review and this will be ready to merge.

<InnerBuildArgs Condition="'$(_portableOS)' == 'win'">$(InnerBuildArgs) $(FlagParameterPrefix)os windows</InnerBuildArgs>
<InnerBuildArgs>$(InnerBuildArgs) $(FlagParameterPrefix)os $(_portableOS)</InnerBuildArgs>
<!-- Mobile builds are never "cross" builds as they don't have a rootfs-based filesystem build. -->
<InnerBuildArgs Condition="'$(CrossBuild)' == 'true' or ('$(TargetArchitecture)' != '$(BuildArchitecture)' and '$(TargetsMobile)' != 'true')">$(InnerBuildArgs) $(FlagParameterPrefix)cross</InnerBuildArgs>
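
To make the condition on the last line easier to read, here is the same decision written as a shell function. This is only an illustrative sketch: the function name `needs_cross_flag` and its positional arguments are hypothetical, chosen to mirror the CrossBuild, TargetArchitecture, BuildArchitecture, and TargetsMobile properties.

```shell
#!/usr/bin/env bash
# Hypothetical shell rendering of the MSBuild condition above.
# Arguments: CrossBuild, TargetArchitecture, BuildArchitecture, TargetsMobile.
# Returns 0 (true) when the -cross flag would be appended.
needs_cross_flag() {
    local cross_build="$1" target_arch="$2" build_arch="$3" targets_mobile="$4"
    if [[ "$cross_build" == "true" ]]; then
        return 0
    fi
    if [[ "$target_arch" != "$build_arch" && "$targets_mobile" != "true" ]]; then
        return 0
    fi
    return 1
}

needs_cross_flag false arm64 x64 false && echo "non-mobile arch mismatch: pass -cross"
needs_cross_flag false arm64 x64 true  || echo "mobile target: no -cross"
```

An explicit CrossBuild=true always wins; otherwise the flag is added only for an architecture mismatch on a non-mobile target.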
Member

@akoeplinger akoeplinger Apr 11, 2025

osx/win x64 host building for osx/win arm64 target is in the same situation and also doesn't use rootfs.

I know this is pre-existing, but maybe for your cleanup, instead of CrossBuild and the -cross flag, we should have BuildUsingCrossRootFs and only apply it where we actually use a rootfs?

Member Author

I'd like to do that in a future PR so we can get this in by the deadline.

Member

Yes, sorry if I wasn't clear, but I was suggesting to keep this in mind for a separate cleanup PR.

Member

@ViktorHofer ViktorHofer left a comment

Changes look good. Consider queuing runtime and VMR (source-build & unified-build) official builds for additional validation.

@jkoritzinsky jkoritzinsky changed the title Allow TargetOS and TargetRid to be overridden in DotNetBuild.props Simplify RID model and handle Linux libc flavors in orchestrated builds Apr 14, 2025
@jkoritzinsky jkoritzinsky merged commit 19b0dc9 into dotnet:main Apr 14, 2025
157 of 160 checks passed
@jkoritzinsky jkoritzinsky deleted the dotnetbuild-local-props branch April 14, 2025 17:13
MichalStrehovsky added a commit that referenced this pull request Apr 17, 2025
We don't have a need for host in native AOT but after #113765 the outerloop started failing with:

```
  Building native export: "/usr/local/bin/clang-20" -O2 -shared -fpic -D DNNE_ASSEMBLY_NAME=Microsoft.Interop.Tests.NativeExports -D DNNE_COMPILE_AS_SOURCE -I "/__w/1/s/.packages/dnne/2.0.5/tools/platform" -I "/__w/1/s/artifacts/bin/linux-arm64.Release/corehost" -o "/__w/1/s/artifacts/obj/NativeExports/Release/net10.0/linux-arm64/dnne/bin/Microsoft.Interop.Tests.NativeExportsNE.so" --target=aarch64-linux-gnu --gcc-toolchain=/crossrootfs/arm64/usr --sysroot=/crossrootfs/arm64  "/__w/1/s/artifacts/obj/NativeExports/Release/net10.0/linux-arm64/dnne/Microsoft.Interop.Tests.NativeExports.g.c" "/__w/1/s/.packages/dnne/2.0.5/tools/platform/platform.c" -lstdc++ "/__w/1/s/artifacts/bin/linux-arm64.Release/corehost/libnethost.a" --target=aarch64-linux-gnu --gcc-toolchain=/crossrootfs/arm64/usr --sysroot=/crossrootfs/arm64 -fuse-ld=lld  -Wl,--rpath-link=/crossrootfs/arm64/lib/aarch64-linux-gnu -Wl,--rpath-link=/crossrootfs/arm64/usr/lib/aarch64-linux-gnu 
clang-20 : error : no such file or directory: '/__w/1/s/artifacts/bin/linux-arm64.Release/corehost/libnethost.a' [/__w/1/s/src/libraries/System.Runtime.InteropServices/tests/TestAssets/NativeExports/NativeExports.csproj]
```

because some of the interop testing uses DNNE. This is a blind attempt to fix it.
jkotas pushed a commit to jkotas/runtime that referenced this pull request Apr 23, 2025
agocke pushed a commit that referenced this pull request Apr 30, 2025
7 participants