-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Avoiding loading and processing same cert file to improve cold start performance … #97267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @dotnet/area-system-security, @bartonjs, @vcsjones Issue DetailsThis PR is to improve cold start performance which occurs during certificate load #96740 . This happens across multiple distros(checked in AmazonLinux2023 and RHEL(9.3)), the same file (in this case /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem) is getting re-processed twice.
Tested the method with above fix I see the total time reduced from 103ms to 65ms..
|
The reason for this is that the file we get from OpenSSL (through
Nice! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't end up resolving the final path for links.
(And, if we use strings, we should specify the StringComparison
on the HashSet
)
Rather than storing the path strings, we can extend the TryStatFile
method that gets called above so it returns as an out
argument a tuple with the long Ino
and long Dev
from struct FileStatus
.
This tuple can then be stored in the processedFiles
HashSet
.
cc @adamsitnik |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the skipStat
argument from this method. We no longer want to ever skip the stat.
And, the changes to this functions can be something like:
if (!TryStatFile(file, out lastModified, out (long, long) filedId) ||
!processedFiles.Add(filedId))
{
return false;
}
We don't need to wait till the end of the function to add the file since next time we try to process it, we expect the same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tmds I am going to remove skiptest
args in next commit as suggested. But we can't return false
if processedFiles
already have an entry, this will trigger a false positive indicating directory reading is unsuccessful, which will cause to read all certs from /etc/ssl/certs
(which we don't want as I believe that's the fallback plan).
Rather, I will add the entry to processedFiles
if readdata
is true . Second time, if processedFiles
has an entry, will return true(basically letting caller know that it was a success) without re-processing it. Hope it clarifies, let me know if you think otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will trigger a false positive indicating directory reading is unsuccessful, which will cause to read all certs from /etc/ssl/certs (which we don't want as I believe that's the fallback plan).
That's right, if we return false
, it messes up `hasStoreData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid adding new functions, and keep using the return bool for success pattern. The caller can ignore the out
arguments (using out _
) he doesn't need.
We'd have:
private static bool TryStatFile(string path, out DateTime lastModified, out (long, long) fileId)
=> CallStat(...
private static bool TryStatDirectory(string path, out DateTime lastModified)
=> CallStat(...
private static bool CallStat(string path, out DateTime lastModified, out (long, long) fileId)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
working on it in the next commit.. thank you for providing feedback
fc32fee
to
8fa5d2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: keep the name TryStat
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: only keep 1 TryStatFile
and update the caller that ignores fileId
by passing it out _
as the last argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
c8ed96a
to
3f0980a
Compare
rebased and re-targeted to main.. |
We would like this to be in the .NET8.0 service release as it will help in all recent RHEL based distros and potentially more. |
@tmds @vcsjones @jeffhandley what else needs to be done to get this merged and back ported to .NET 8 ? |
// In order to maintain "finalization-free" the GetNativeCollections method would need to | ||
// DangerousAddRef, and the callers would need to DangerousRelease, adding more interlocked operations | ||
// on every call. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The blank line shouldn't have been removed, since the comment isn't about the following code, it's about code that's not present; making it a separate logical code paragraph. But it's not worth spinning on.
|
||
private static bool TryStatFile(string path, out DateTime lastModified) | ||
=> TryStat(path, Interop.Sys.FileTypes.S_IFREG, out lastModified); | ||
|
||
private static bool TryStatDirectory(string path, out DateTime lastModified) | ||
=> TryStat(path, Interop.Sys.FileTypes.S_IFDIR, out lastModified); | ||
=> TryStat(path, Interop.Sys.FileTypes.S_IFDIR, out lastModified, out _); | ||
|
||
private static bool TryStat(string path, int fileType, out DateTime lastModified) | ||
private static bool TryStatFile(string path, out DateTime lastModified, out (long, long) fileId) | ||
=> TryStat(path, Interop.Sys.FileTypes.S_IFREG, out lastModified, out fileId); | ||
|
||
private static bool TryStat(string path, int fileType, out DateTime lastModified, out (long, long) fileId) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally the TryStatFile/TryStatDirectory order would have been maintained so the diff showed that it was really just adding the out. But, again, it's not worth spinning on.
It won't be backported to .NET 8. Basically, the only things that get backported to LTS releases are:
The loader logic hasn't really changed in the last several releases, so I don't think this is addressing a regression. So while a 50% startup reduction for this scenario is good, it's not the sort of thing we do in servicing. If I'm wrong, and you have numbers that show that the same code was <=80ms to start in .NET 6, but is now over 100ms in .NET 8... then I can submit that as the justification, but there's no guarantee that the servicing review process would take it. |
@bartonjs I would request the team to consider as .NET 8 is going to stay for long. |
@birojnayak Our customers have a strong preference for stability and minimal churn in servicing. We have our servicing bar documented at https://dotnet.microsoft.com/en-us/platform/support/policy/dotnet-core#servicing. Our backport PRs require written justification that meets this bar. This issue has been present for number of years. In order to justify backport of the fix, we would need to explain what changed recently that this issue needs backporting now, when we have lived with it for many years. It is not a recent regression. If you can help us to write a valid justification, we can try to get it approved for backport. (".NET 8 is going to stay for long." is not valid justification for a backport.) |
The linked PR describes and implements a workaround in AWS lambda: aws/aws-lambda-dotnet#1661 |
@jkotas I will leave it you folks to decide if your team wants to backport to .NET 8.0 (as we have solved internally for AWS lambda via the solution I suggested to team). As long as it is coming in .NET9.0 , it's going to help all. |
this is awesome @sebastienros |
This PR is to improve cold start performance which occurs during certificate load #96740 . This happens across multiple distros(checked in AmazonLinux2023 and RHEL(9.3)), the same file (in this case /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem) is getting re-processed twice.
Tested the method with above fix I see the total time reduced from 103ms to 65ms..