Shared/Java: Add shared Guards library and switch Java to use it. #19573

Open: aschackmull wants to merge 22 commits into base: main

aschackmull (Contributor) commented May 23, 2025

This adds a new shared Guards library, which provides complex implication logic between guards. The implementation is heavily inspired by the corresponding Java and C# versions.
The Java Guards library is then switched to use this new library, which results in a number of precision improvements for the nullness and useless comparison test queries.

There's currently a known FP related to correlated conditions in assert statements that I've documented as a qltest. I plan to fix that in a follow-up PR.

Review of the shared implementation (the single file shared/controlflow/codeql/controlflow/Guards.qll) is likely best done by reading the final result, but the other changes can be reviewed commit-by-commit.

}

private module LogicInput_v2 implements GuardsImpl::LogicInputSig {
  private import semmle.code.java.dataflow.SSA as SSA

Check warning (Code scanning / CodeQL): Names only differing by case

SSA is only different by casing from Ssa that is used elsewhere for modules.

aschackmull force-pushed the guardslib branch 2 times, most recently from 969365f to 4abca27 (May 27, 2025 06:50)
aschackmull force-pushed the guardslib branch 3 times, most recently from f4e0076 to ed640ab (June 17, 2025 11:49)
aschackmull marked this pull request as ready for review (June 17, 2025 12:05)
aschackmull requested a review from a team as a code owner (June 17, 2025 12:05)
aschackmull requested a review from hvitved (June 17, 2025 12:06)

hvitved (Contributor) left a comment:

Looks really good, great to finally have a shared guards library 🎉

* controls the basic block `A`, in this case because the true branch dominates
* `A`, but more elaborate controls-relationships may also hold.
* For example, in
* ```

hvitved (Contributor):

nit: add `java` after the backticks

  predicate strictlyDominates(BasicBlock bb);
}

predicate dominatingEdge(BasicBlock bb1, BasicBlock bb2);

hvitved (Contributor):

Perhaps add QL doc describing how this differs from bb1.strictlyDominates(bb2)?

aschackmull (Contributor Author):

Copied qldoc from the shared BasicBlocks lib.
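
For readers without the shared BasicBlocks qldoc at hand, the distinction raised above can be sketched in Java. This is an illustration under an assumed definition, not code from the PR: the edge (bb1, bb2) is treated as dominating when bb2 becomes unreachable from the entry block once that single edge is removed, while bb1 strictly dominates bb2 when the two differ and bb2 becomes unreachable once bb1 is removed. The class `DominatingEdgeSketch`, its methods, and the string block names are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a tiny CFG as an adjacency map; not part of the CodeQL library. */
final class DominatingEdgeSketch {
    /** Blocks reachable from entry, optionally ignoring one edge and/or one block (null = ignore nothing). */
    static Set<String> reachable(Map<String, List<String>> cfg, String entry,
                                 String skipFrom, String skipTo, String skipBlock) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        if (!entry.equals(skipBlock)) {
            seen.add(entry);
            work.push(entry);
        }
        while (!work.isEmpty()) {
            String bb = work.pop();
            for (String succ : cfg.getOrDefault(bb, List.of())) {
                boolean skippedEdge = bb.equals(skipFrom) && succ.equals(skipTo);
                if (!skippedEdge && !succ.equals(skipBlock) && seen.add(succ)) {
                    work.push(succ);
                }
            }
        }
        return seen;
    }

    /** Assumed definition: bb1 != bb2 and removing bb1 makes bb2 unreachable. */
    static boolean strictlyDominates(Map<String, List<String>> cfg, String entry, String bb1, String bb2) {
        return !bb1.equals(bb2)
            && reachable(cfg, entry, null, null, null).contains(bb2)
            && !reachable(cfg, entry, null, null, bb1).contains(bb2);
    }

    /** Assumed definition: the edge exists and removing it makes bb2 unreachable. */
    static boolean dominatingEdge(Map<String, List<String>> cfg, String entry, String bb1, String bb2) {
        return cfg.getOrDefault(bb1, List.of()).contains(bb2)
            && reachable(cfg, entry, null, null, null).contains(bb2)
            && !reachable(cfg, entry, bb1, bb2, null).contains(bb2);
    }

    /** CFG for `if (c) { body }` followed by a join block. */
    static Map<String, List<String>> ifWithoutElse() {
        return Map.of(
            "entry", List.of("body", "join"),
            "body", List.of("join"),
            "join", List.of());
    }

    public static void main(String[] args) {
        Map<String, List<String>> cfg = ifWithoutElse();
        System.out.println(strictlyDominates(cfg, "entry", "entry", "join"));
        System.out.println(dominatingEdge(cfg, "entry", "entry", "join"));
        System.out.println(dominatingEdge(cfg, "entry", "entry", "body"));
    }
}
```

In this CFG the entry block strictly dominates the join block, yet the edge from entry to join is not dominating, because join is also reachable through body. The edge from entry to body is dominating, which is the kind of fact a guard needs in order to control a block.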

/** Gets the integer that this value represents, if any. */
int asIntValue() { this = TValue(TValueInt(result), true) }

/** Gets the boolean that this value represents, if any. */

hvitved (Contributor):

nit: Boolean

aschackmull (Contributor Author):

Why? We write e.g. "integer" with lowercase. It's not really referring to a specific class or anything, it's just the English language meaning of "boolean", right? And the ql type that's being returned is also the lowercased boolean.

v1.asBooleanValue() = true and
g2.(Case).getSwitchExpr() = switchExpr and
v2.asBooleanValue() = false and
g1 != g2

hvitved (Contributor):

Would it make sense to generalize this to say: if g1, v1 represents matching case i, then g2, v2 represents not matching any case j < i?

hvitved (Contributor):

On second thought, why is this implication even needed? I would have thought that dominance came for free from the CFG?

aschackmull (Contributor Author):

The implication is needed if the CFG uses a jump-table interpretation (which Java mostly does), because then non-matching a case isn't represented in a specific edge. And it's only the default case that tends to be relevant, since knowing e.g. not "foo" when you already know "bar" in case "foo": .. case "bar": .. tends to be pointless.
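
The point about the default case can be illustrated with a small Java switch. This is a sketch, not code from the PR or its tests; `SwitchDefaultSketch` and `classify` are hypothetical names. Assuming a jump-table style CFG, the switch has one outgoing edge per case label and no separate "did not match" edge, so the fact that no other case matched inside `default` has to come from an explicit implication rather than from dominance alone.

```java
/** Illustrative only; names are hypothetical and not taken from the PR. */
final class SwitchDefaultSketch {
    static int classify(String s) {
        // Assumes s is non-null: switching on a null String throws NullPointerException.
        switch (s) {
            case "foo":
                return 1;
            case "bar":
                return 2;
            default:
                // Under a jump-table CFG there is a single edge from the switch to
                // this label. Knowing that neither "foo" nor "bar" matched here
                // takes an implication of the form "default matches, so every
                // other case of the same switch does not match".
                if (s.equals("foo")) {
                    return -1; // never taken: "foo" was handled by its own case
                }
                return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("foo"));
        System.out.println(classify("bar"));
        System.out.println(classify("baz"));
    }
}
```

The comparison inside `default` is the kind of useless test that such an implication lets an analysis detect. The reverse direction, knowing "not foo" inside `case "bar"`, is rarely useful, as the reply notes.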

}

class SsaWriteDefinition extends SsaDefinition {
  Expr getDefinition();

hvitved (Contributor):

Perhaps just move this into SsaDefinition and then get rid of SsaWriteDefinition?

aschackmull (Contributor Author):

I'm trying to reflect the shared SSA api as much as possible here. Hence this choice.

* starting from a given set of base cases.
*/
cached
module ImpliesTC<baseGuardValueSig/2 baseGuardValue> {

hvitved (Contributor):

private?

aschackmull (Contributor Author):

Yes.
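
Judging only by its name and the doc fragment quoted above, `ImpliesTC` appears to compute a transitive closure of guard implications from a set of base cases. The Java sketch below shows that general idea with a worklist; it is an assumption about the technique, not a transcription of the QL module. The `GuardValue` record, the `closure` method, and the example implication steps are hypothetical, and records require Java 16 or later.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: transitive closure over (guard, value) implication steps. */
final class ImpliesClosureSketch {
    /** A guard expression paired with the boolean value it is known to evaluate to. */
    record GuardValue(String guard, boolean value) {}

    /** Returns every (guard, value) pair implied by the base cases via zero or more steps. */
    static Set<GuardValue> closure(Set<GuardValue> base, Map<GuardValue, List<GuardValue>> step) {
        Set<GuardValue> result = new LinkedHashSet<>(base);
        Deque<GuardValue> work = new ArrayDeque<>(base);
        while (!work.isEmpty()) {
            GuardValue gv = work.pop();
            for (GuardValue implied : step.getOrDefault(gv, List.of())) {
                if (result.add(implied)) {
                    work.push(implied); // newly derived fact: explore its consequences too
                }
            }
        }
        return result;
    }

    /** Hypothetical single-step implications for the expression `!(a && b)`. */
    static Map<GuardValue, List<GuardValue>> exampleSteps() {
        return Map.of(
            // `!(a && b)` being false implies `a && b` is true.
            new GuardValue("!(a && b)", false), List.of(new GuardValue("a && b", true)),
            // `a && b` being true implies both operands are true.
            new GuardValue("a && b", true), List.of(new GuardValue("a", true), new GuardValue("b", true)));
    }

    public static void main(String[] args) {
        Set<GuardValue> base = Set.of(new GuardValue("!(a && b)", false));
        System.out.println(closure(base, exampleSteps()));
    }
}
```

Starting from the single base case that `!(a && b)` is false, the closure also contains `a && b` true, `a` true, and `b` true.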

* control flow. It may also be a switch case, which as a guard is considered
* to evaluate to either true or false depending on whether the case matches.
*/
final class Guard extends PreGuard {

hvitved (Contributor):

Should this class have a charpred to restrict it somehow?

aschackmull (Contributor Author):

That's a good question and indeed a subtle change from the language-specific guards libraries. I think we can do it with guardControlsBranchEdge(this, _, _, _) if we change a bunch of Guard references to PreGuard. Again, this is something that would merit a dca rerun, so I'd like to investigate this option as a follow-up.

aschackmull (Contributor Author):

OTOH, I'm not sure that it would actually achieve much, but I do agree that it would make more sense to have it.

predicate parameterMatch(ParameterPosition ppos, ArgumentPosition apos);

/** A non-overridable method with a boolean return value. */
class BooleanMethod {

hvitved (Contributor):

Could all of this logic be generalized to methods that return constant values? E.g. like a Compare method that typically returns -1, 0, or 1?

aschackmull (Contributor Author):

I want to generalise it to at least null/not-null and exception/no-exception. But something based around such integers might make sense as well. I'm already working on a follow-up branch that touches on this.
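
To make the discussion concrete, here is a hedged Java sketch of the two kinds of wrapper method mentioned: a non-overridable boolean method that wraps a null check, and a compare-style method that returns one of a few constants. It is an illustration, not code from the PR or its tests; `GuardWrapperSketch`, `isPresent`, `lengthOrZero`, `compare`, and `sameValue` are hypothetical names.

```java
/** Illustrative only; names are hypothetical and not taken from the PR. */
final class GuardWrapperSketch {
    /** Non-overridable (private static) method whose boolean result wraps a null check. */
    private static boolean isPresent(String s) {
        return s != null;
    }

    static int lengthOrZero(String s) {
        if (isPresent(s)) {
            // isPresent(s) returning true implies s != null, so this dereference is safe.
            return s.length();
        }
        return 0;
    }

    /** Compare-style method that returns one of the constants -1, 0, or 1. */
    static int compare(int a, int b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    static boolean sameValue(int a, int b) {
        // compare(a, b) == 0 implies a == b; an integer-based generalization
        // would need to track this constant rather than a boolean.
        return compare(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(lengthOrZero(null));
        System.out.println(lengthOrZero("abc"));
        System.out.println(sameValue(2, 2));
    }
}
```

An analysis that understands `isPresent` as a guard wrapper can avoid reporting the dereference in `lengthOrZero` as a possible null dereference.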

* wrappers. This can be used to instantiate the `additionalImpliesStep`
* predicate.
*/
module CustomGuard<CustomGuardInputSig CustomGuardInput> {

hvitved (Contributor):

I think something like WrapperGuard would be a better name.

As we discussed offline, it might be better to remove the nesting of parameterized modules; I'll let you decide (perhaps follow-up).

aschackmull (Contributor Author):

Already looking at this in a follow-up branch.


class BasicBlock = J::BasicBlock;

predicate dominatingEdge(BasicBlock bb1, BasicBlock bb2) { J::dominatingEdge(bb1, bb2) }

hvitved (Contributor):

I'm surprised that dominatingEdge is exposed directly from the java.qll module.

aschackmull (Contributor Author):

Yeah, something to maybe fix in the glorious future. Historically the contents of Dominance.qll have been exposed by default for Java.

* with logical implications based on SSA.
*/
module Logic<LogicInputSig LogicInput> {
  private import LogicInput

Check warning (Code scanning / CodeQL): Redundant import

Redundant import, the module is already imported inside LogicInput.

2 participants