Validate sarif against schema before uploading #39

Merged
robertbrignull merged 7 commits into master from validate_sarif
Jun 15, 2020

robertbrignull
Contributor

Validates files to be uploaded against the SARIF 2.1.0 schema. This should stop invalid files from being uploaded, so the user gets an error immediately instead of having to wait to see whether their upload is processed correctly.

This also simplifies other code in the upload action, because we can assume the SARIF files are correctly formatted and can skip a lot of manual validation code, for example in #38.

Do we think this is the right approach for validation within the action? Is this the right schema to use? Will this be strict enough, or too strict? How much testing do we need in this to be confident we've hit a good compromise?

Merge / deployment checklist

  • Run test builds as necessary. Can be on this repository or elsewhere as needed in order to test the change - please include links to tests in other repos!
    • CodeQL using init/analyze actions
    • 3rd party tool using upload action
  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.

@marcogario
Contributor

Do we think this is the right approach for validation within the action?
How much testing do we need in this to be confident we've hit a good compromise?

I think this approach is simple and gets us a lot. There is very little else that we could validate at this stage, and it would mostly revolve around generated code and fingerprints (that are "considered" later in the toolchain).

Is this the right schema to use? Will this be strict enough, or too strict?

The schema is the correct one. However, on ingestion we are stricter in a few cases. For example, we require all results to have a message. We believe those requirements to be reasonable but we do not have enough data. Therefore, I would prefer to receive valid SARIF that does not satisfy our internal requirements, so that we can keep track of these occurrences and revisit this decision later.

Do you by any chance have an example of how a failure/error message would look in the action console?

@robertbrignull
Contributor Author

The format of the error messages is currently not very good, so I'll improve it so that it includes the line number and ideally a more complete message.

e.g.

Unable to upload "./src/testdata/invalid-sarif.sarif" as it is not valid SARIF:
is not of a type(s) array

And unfortunately the SARIF from shift-left currently fails the validation. I'll investigate what this is

Unable to upload "/home/runner/work/test-electron/test-electron/reports/source-js-report.sarif" as it is not valid SARIF:
contains duplicate item

@robertbrignull
Contributor Author

Can someone else check https://github.com/Anthophila/test-electron/runs/699818763?check_suite_focus=true and see if it looks correct? It looks like a false positive to me as the schema for that array is

"description": "An array of strings, containing in order the command line arguments passed to the tool from the operating system.",
"type": "array",
"minItems": 0,
"uniqueItems": false,
"items": {
  "type": "string"
}

so it shouldn't be warning about duplicate items. Is this a bug in the validator?

@marcogario
Contributor

marcogario commented May 22, 2020

I think you are right. It is a bug in the validator. I tried the following python -m jsonschema -i source-js-report.sarif sarif-schema-2.1.0.json and it works. Since jsonschema does not output a "success" message (simply says nothing), you can also try to change the snippet you identified to require uniqueItems and see how it fails.

(If you want some unfounded speculation, I think the validator is just checking if uniqueItems is in the dictionary, and assuming that false values are not serialized. Wouldn't be the first time I see something like that)
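For illustration, the kind of bug speculated about here might look like the sketch below. This is a guess at the shape of the problem, not the validator's actual source; `ArraySchema`, `hasDuplicates`, `checkUniqueItemsBuggy` and `checkUniqueItems` are hypothetical names made up for the example.

```typescript
// Illustrative only: a guess at the kind of bug discussed above,
// not the actual source of any validation library.
interface ArraySchema {
  type: "array";
  uniqueItems?: boolean;
}

// True when at least two items have the same JSON text.
// (Key order in objects is not normalised; good enough for a sketch.)
function hasDuplicates(items: unknown[]): boolean {
  const seen: { [key: string]: boolean } = {};
  for (const item of items) {
    const key = JSON.stringify(item);
    if (seen[key]) {
      return true;
    }
    seen[key] = true;
  }
  return false;
}

// Buggy: treats the mere presence of the keyword as "uniqueness required",
// so `uniqueItems: false` is enforced as if it were `true`.
function checkUniqueItemsBuggy(schema: ArraySchema, items: unknown[]): string[] {
  if ("uniqueItems" in schema && hasDuplicates(items)) {
    return ["contains duplicate item"];
  }
  return [];
}

// Correct: only enforces uniqueness when the keyword is exactly `true`.
function checkUniqueItems(schema: ArraySchema, items: unknown[]): string[] {
  if (schema.uniqueItems === true && hasDuplicates(items)) {
    return ["contains duplicate item"];
  }
  return [];
}

// Mirrors the schema snippet quoted earlier: duplicates are allowed.
const argumentsSchema: ArraySchema = { type: "array", uniqueItems: false };
const commandLineArgs = ["--format", "sarif", "--format", "json"];

const buggyErrors = checkUniqueItemsBuggy(argumentsSchema, commandLineArgs);
const correctErrors = checkUniqueItems(argumentsSchema, commandLineArgs);
```

With the schema above, the buggy check reports a duplicate while the corrected check reports nothing, which matches the false positive seen in the run.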

@robertbrignull
Contributor Author

robertbrignull commented May 22, 2020

Your hunch seems to be correct. Finding the site of the problem and fixing it was reasonably easy, so I've opened tdegrunt/jsonschema#301. In the meantime we could wait for that fix to propagate to a published version, or we could just remove the uniqueItems: false properties from our schema. I guess we should remove them for now, and perhaps we can add them back in once the library is updated.

@robertbrignull
Contributor Author

Also, https://github.com/Anthophila/test-electron/runs/699818763?check_suite_focus=true is an example of what a failure looks like.

@robertbrignull
Contributor Author

Shiftleft now succeeds: https://github.com/Anthophila/test-electron/actions/runs/112621402

Also I had to add in process.exitCode = 0; to the tests which feels like an enormous hack. This is necessary because validateSarifFileSchema calls core.setFailed and this sets the exit code. The alternative would be for validateSarifFileSchema to not call core.setFailed, but I like that all the knowledge of the validation library is contained in that method. Therefore I'm willing to accept some ugly hacks in the tests.
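For comparison, the alternative mentioned above (validation that does not call core.setFailed itself) could be sketched roughly as follows. This is not the code in this PR: `FailureReporter`, `findSarifErrors` and `validateSarif` are hypothetical names, and the `runs` check is only a stand-in for real schema validation.

```typescript
// Hypothetical sketch, not the action's actual code.
type FailureReporter = (message: string) => void;

// Stand-in for real schema validation: only checks the top-level shape.
function findSarifErrors(sarif: unknown): string[] {
  if (typeof sarif !== "object" || sarif === null) {
    return ["instance is not of a type(s) object"];
  }
  const runs = (sarif as { runs?: unknown }).runs;
  if (!Array.isArray(runs)) {
    return ["instance.runs is not of a type(s) array"];
  }
  return [];
}

// Reports failures through the injected callback instead of touching
// process.exitCode directly. Returns true when the file looks valid.
function validateSarif(
  path: string,
  sarif: unknown,
  reportFailure: FailureReporter
): boolean {
  const errors = findSarifErrors(sarif);
  if (errors.length > 0) {
    reportFailure(
      `Unable to upload "${path}" as it is not valid SARIF:\n${errors.join("\n")}`
    );
    return false;
  }
  return true;
}

// In a test, a collector replaces the real reporter, so no exit code is set.
const reported: string[] = [];
const invalidOk = validateSarif(
  "./src/testdata/invalid-sarif.sarif",
  { runs: {} },
  (message) => {
    reported.push(message);
  }
);
```

In the action itself the callback would presumably be core.setFailed. The trade-off is the one described above: the caller now has to know how failures are reported, whereas the approach in this PR keeps that knowledge inside one method at the cost of resetting process.exitCode in the tests.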

@robertbrignull
Contributor Author

Now rubocop is failing

{
    "property": "instance.runs[0].results[0].locations[0].physicalLocation.region.startColumn",
    "message": "must have a minimum value of 1",
    "schema": {
      "description": "The column number of the first character in the region.",
      "type": "integer",
      "minimum": 1
    },
    "name": "minimum",
    "argument": 1,
    "stack": "instance.runs[0].results[0].locations[0].physicalLocation.region.startColumn must have a minimum value of 1"
  }

@robertbrignull
Contributor Author

Seems like rubocop may be using 0-based indexing for its startColumn property, and this is indeed against the spec. Here's what the result looks like

{
  "ruleId": "Style/FrozenStringLiteralComment",
  "ruleIndex": 0,
  "message": {
    "text": "Style/FrozenStringLiteralComment: Missing frozen string literal comment."
  },
  "locations": [
    {
      "physicalLocation": {
        "artifactLocation": {
          "uri": "Gemfile",
          "uriBaseId": "%SRCROOT%",
          "index": 0
        },
        "region": {
          "startLine": 1,
          "startColumn": 0,
          "endColumn": 1
        }
      }
    }
  ],
  "partialFingerprints": {
  }
},
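The violation in this result can be reproduced with a small standalone check. This is a minimal sketch, not the schema validator used by the action: `Region`, `ONE_BASED_FIELDS` and `findRegionViolations` are hypothetical names, and only startColumn's minimum of 1 is confirmed by the error shown earlier; the same minimum for the other three fields is an assumption based on my reading of the SARIF 2.1.0 schema.

```typescript
// Subset of the SARIF region properties that appear in the result above.
interface Region {
  startLine?: number;
  startColumn?: number;
  endLine?: number;
  endColumn?: number;
}

// Fields treated as 1-based. startColumn is confirmed by the validator
// output in this thread; the others are assumed to share "minimum": 1.
const ONE_BASED_FIELDS: (keyof Region)[] = [
  "startLine",
  "startColumn",
  "endLine",
  "endColumn",
];

// Returns one message per field that is present but smaller than 1.
function findRegionViolations(region: Region): string[] {
  const violations: string[] = [];
  for (const field of ONE_BASED_FIELDS) {
    const value = region[field];
    if (value !== undefined && value < 1) {
      violations.push(`${field} must have a minimum value of 1`);
    }
  }
  return violations;
}

// The region from the rubocop result quoted above.
const rubocopRegion: Region = { startLine: 1, startColumn: 0, endColumn: 1 };
const regionViolations = findRegionViolations(rubocopRegion);
```

For the rubocop region, only startColumn is flagged, which is consistent with a formatter that counts columns from 0.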

@robertbrignull
Contributor Author

@arthurnn, did you write the CodeScanning::SarifFormatter for the rubocop workflow? See the above problem that the sarif is failing the validation.

@arthurnn
Member

arthurnn commented May 22, 2020

@arthurnn
Member

Have released version 0.3.0 for the gem that fixes the issue.

@robertbrignull
Contributor Author

@chrisgavin can you take on reviewing this PR? I think the only special action that needs to be taken before merge is to check who's using rubocop 0.2.0 and, if necessary, help them upgrade. I can do that if we're happy with the code.

As an explanation of what's happening here: the upload would already not succeed if the SARIF is invalid, but this would not be clear to the uploader. This check merely moves the error checking earlier in the process so the uploader can see the problem more easily and fix it.

@robertbrignull robertbrignull assigned chrisgavin and unassigned Daverlo Jun 8, 2020
Contributor

@chrisgavin chrisgavin left a comment

This looks like a good change to me. It's a shame the server-side validation was not enough to catch the RuboCop issue.

@robertbrignull robertbrignull merged commit 8a8a49d into master Jun 15, 2020
@robertbrignull robertbrignull deleted the validate_sarif branch June 15, 2020 14:09
@github-actions github-actions bot mentioned this pull request Jun 22, 2020
5 participants