Add a command to consolidate database against external sources #644

Open
amousset opened this issue Aug 15, 2022 · 15 comments

Comments

@amousset
Member

We could add a command to:

  • detect missing aliases/related IDs (based on CVE and/or GHSA data), and maybe open pull requests automatically to add them
  • detect advisories present as GHSA but not in advisory-db
@amousset
Member Author

A simple PoC using OSV data: https://gist.github.com/amousset/4585cf0d59d1f243536af70081fdd477

@amousset
Member Author

This produces an output looking like:

[...]
Missing alias GHSA-fg7r-2g4j-5cgr in RUSTSEC-2021-0124
Missing alias GHSA-fhvj-7f9p-w788 in RUSTSEC-2020-0034
Missing alias GHSA-wrvc-72w7-xpmj in RUSTSEC-2019-0026
Missing alias GHSA-pphf-f93w-gc84 in RUSTSEC-2020-0111
Missing alias GHSA-wgx2-6432-j3fw in RUSTSEC-2020-0025
Missing alias GHSA-5325-xw5m-phm3 in RUSTSEC-2021-0074
Missing alias CVE-2020-36511 in RUSTSEC-2020-0153
Missing alias CVE-2020-36514 in RUSTSEC-2020-0155
Missing advisory GHSA-pvh2-pj76-4m96 in advisory-db
Missing advisory GHSA-v935-pqmr-g8v9 in advisory-db

This could be used to automate alias updates (GHSA and CVE). Advisories that exist as GHSA- but not as RUSTSEC- are not immediately actionable, but it could be useful to be aware of them anyway.

@Shnatsel what is your opinion about this?
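
The report above could be produced by a comparison along these lines. This is a minimal sketch, not the code from the linked gist: the function name `consolidate` and the shape of its two input mappings are assumptions made for illustration, and the sample entries only reuse IDs from the output shown above.

```python
from typing import Dict, List, Set


def consolidate(
    rustsec_aliases: Dict[str, Set[str]],
    external_links: Dict[str, Set[str]],
) -> List[str]:
    """Compare advisory-db aliases against links found in external data.

    rustsec_aliases: RUSTSEC ID -> aliases currently recorded in advisory-db.
    external_links:  external ID (GHSA/CVE) -> RUSTSEC IDs it points to
                     (an empty set when it points to none).
    Both shapes are hypothetical and chosen for this sketch.
    """
    # Every external ID that advisory-db already records as an alias somewhere.
    known = set().union(*rustsec_aliases.values())

    messages = []
    for ext_id in sorted(external_links):
        linked = external_links[ext_id]
        if not linked and ext_id not in known:
            # No RUSTSEC advisory is linked from either side.
            messages.append(f"Missing advisory {ext_id} in advisory-db")
        for rustsec_id in sorted(linked):
            if ext_id not in rustsec_aliases.get(rustsec_id, set()):
                messages.append(f"Missing alias {ext_id} in {rustsec_id}")
    return messages


# Illustrative input only, not a snapshot of the real databases.
sample_rustsec = {"RUSTSEC-2021-0124": set()}
sample_external = {
    "GHSA-fg7r-2g4j-5cgr": {"RUSTSEC-2021-0124"},
    "GHSA-pvh2-pj76-4m96": set(),
}
for line in consolidate(sample_rustsec, sample_external):
    print(line)
```

Keeping the comparison separate from the data fetching means the same function could be fed from the OSV export or from the GHSA API.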

@Shnatsel
Member

We really need something like this! Thanks for writing it!

I have been experimenting with this as well, although I was querying the GHSA API, not OSV data. I was originally blocked by GHSA not including references to RustSec advisories, but it has been fixed since.

I am happy with detection via a regex on the "references" field, that's also how I implemented it. I'm afraid that fetching the entire ZIP file might get unwieldy over time, and querying an API for changes since the last update would incur less traffic, but that also adds a lot more complexity, so I'm fine with just fetching a ZIP for now.
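
A minimal sketch of that approach, assuming the ZIP contains one OSV-format JSON file per advisory and that a linked RustSec advisory shows up as a URL in the record's "references" list. The helper names `iter_records` and `linked_rustsec_ids` are hypothetical, and downloading the archive is left out.

```python
import io
import json
import re
import zipfile
from typing import Iterator, Set

# RustSec IDs have the form RUSTSEC-YYYY-NNNN.
RUSTSEC_ID = re.compile(r"RUSTSEC-\d{4}-\d{4}")


def iter_records(zip_bytes: bytes) -> Iterator[dict]:
    """Yield every JSON document stored in an in-memory ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                yield json.loads(archive.read(name))


def linked_rustsec_ids(record: dict) -> Set[str]:
    """Collect RustSec IDs mentioned in the record's reference URLs."""
    ids = set()
    for reference in record.get("references", []):
        ids.update(RUSTSEC_ID.findall(reference.get("url", "")))
    return ids
```

A record whose references mention no RustSec ID yields an empty set, which is how an advisory missing from advisory-db would surface.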

We will need both to import the IDs, and to import advisories that we don't presently carry. The GHSA export to OSV is rather poor right now, with a lot of data lost or exported incorrectly (especially version specifications), so I'm afraid we'll have to use the GHSA API for anything more than IDs.

@Shnatsel
Member

Example of poor version specification when exporting from GHSA to OSV:

GHSA-mjvm-mhgc-q4gp gets exported to https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2022/08/GHSA-mjvm-mhgc-q4gp/GHSA-mjvm-mhgc-q4gp.json, which omits the Git ranges. It then (rightfully) shows up on the OSV website as "no fix available", because the Git ranges are not exported and are replaced with a numeric version specification stating that everything from version 0 onwards is affected.
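
To make the symptom concrete, here is a small sketch that flags OSV ranges with no "fixed" event. The `affected` / `ranges` / `events` layout follows the OSV schema; the function name `ranges_without_fix` and the sample record are made up for illustration and do not reproduce the actual contents of the exported file.

```python
from typing import List, Tuple


def ranges_without_fix(record: dict) -> List[Tuple[str, str]]:
    """Return (package name, range type) for each range lacking a "fixed" event.

    Simplification: other ways of bounding a range are ignored here.
    """
    flagged = []
    for affected in record.get("affected", []):
        name = affected.get("package", {}).get("name", "")
        for version_range in affected.get("ranges", []):
            events = version_range.get("events", [])
            if not any("fixed" in event for event in events):
                flagged.append((name, version_range.get("type", "")))
    return flagged


# Hypothetical record: everything from version 0 onwards, no fix listed.
sample = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "affected": [
        {
            "package": {"ecosystem": "crates.io", "name": "example-crate"},
            "ranges": [{"type": "SEMVER", "events": [{"introduced": "0"}]}],
        }
    ],
}
print(ranges_without_fix(sample))
```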

@amousset
Member Author

> I'm afraid that fetching the entire ZIP file might get unwieldy over time, and querying an API since the last update would incur less traffic

The zip file is currently 875 kB so I think we have some margin, plus we don't need it to run too often.

> We will need to import advisories that we don't presently carry.

I don't remember if there was a conclusion regarding licenses and how we could use a cc-by-sa source. Do we consider it possible?

@Shnatsel
Member

GHSA is actually under CC-BY, and yes, that's possible. We would need to add a license field and a field for the attribution link, and also display those on the website, but nothing about that is infeasible.

We'll be either missing the GHSA advisories in cargo audit, or using OSV in cargo audit (which is not feasible because the GHSA export is incorrect at present), or importing the CC-BY data to RustSec DB. The latter sounds like the least bad option.

@amousset
Member Author

> which omits the Git ranges, then (rightfully) shows up on OSV website as "no fix available" because the Git ranges are not exported, and replaced with a numeric version specification stating that everything from version 0 onwards is affected.

In this case I wonder whether this isn't the right behavior given the provided version range. The GitHub security advisory targets the "Rust ecosystem" while the OSV export specifically targets "crates.io", where Git commits aren't really meaningful, and much the same applies to advisory-db. What do you think advisory-db should contain in this case?

@Shnatsel
Member

I don't think we can publish an advisory given only a Git range. But the problem is that we have no way of knowing which advisories were originally Git-only, and so no way of skipping them.

@tarcieri
Member

It would be helpful to have the source for the package as it appears in Cargo.lock

@Shnatsel
Member

Indeed, since the git commit hash is available to us through Cargo, we could extend the database to support Git version ranges as well.
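
As a rough illustration, the commit hash can be read from a package's `source` string in Cargo.lock, which for Git dependencies takes the form `git+<url>?<query>#<commit>`. The helper name `parse_git_source` and the example repository are hypothetical; this is a sketch of the parsing step only, not a proposal for how the database would store Git ranges.

```python
from typing import Optional, Tuple


def parse_git_source(source: str) -> Optional[Tuple[str, str]]:
    """Split a Cargo.lock `source` value into (repository URL, commit hash).

    Returns None for non-Git sources such as the crates.io registry.
    """
    prefix = "git+"
    if not source.startswith(prefix):
        return None
    location, separator, commit = source[len(prefix):].partition("#")
    if not separator or not commit:
        # No pinned commit recorded after the "#".
        return None
    # Drop the optional ?branch= / ?tag= / ?rev= query part.
    url = location.split("?", 1)[0]
    return url, commit


print(parse_git_source(
    "git+https://github.com/example/example-crate?branch=main"
    "#0123456789abcdef0123456789abcdef01234567"
))
```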

It was me who named the OSV ecosystem "crates.io", and perhaps that was a little short-sighted. I was trying to accommodate the potential emergence of other registries, but failed to account for the fact that Git repos would not fall under the "crates.io" label.

@Shnatsel
Member

Anyway, hypothetical support for Git repos is a ways off. Step 1 is importing IDs. Step 2 is importing the advisories we can import losslessly. I will bring up the issues with the OSV export with people from GitHub; they should be addressed eventually, although I don't have a timeframe. After that we may consider supporting Git-only advisories in the tooling.

@Shnatsel
Member

I have looked into GHSA's export to OSV a bit more, and discussed it with a GitHub engineer.

It appears that in the above instance a human has mapped the Git ranges to the crates.io package range, which is good for us because that's what we care about and perform matching on.

The not-so-great news is that GHSA currently uses a custom field for version specification instead of mapping the version ranges to OSV version ranges like we do. So parsing their advisories with OSV rules will silently return incorrect results, and cause numerous false positives. The OSV spec has been revised to accommodate what GHSA is doing, but GHSA has not switched to the standardized field yet. That switch is tracked in github/advisory-database#470

@amousset
Member Author

amousset commented Sep 5, 2022

If the github/advisory-database#470 switch gets done, do you see any other obstacles preventing us from importing GHSA from OSV?

@Shnatsel
Member

Shnatsel commented Sep 5, 2022

I am not aware of any blockers. I am a little wary of using something that's clearly not the native format and currently silently produces nonsensical results, but OTOH some people consume RustSec data via OSV, there's no separation between GitHub data and ours in the API, and if people encounter that we'll encounter it as well, so... good for eating our own dog food, I guess? Or rather, someone else's dog food that our users are fed whether we want it or not.

@kornelski
Contributor

Integration would be nice. I've noticed OSV has a bunch of advisories that aren't in the RustSec database. I'd prefer to just use the rustsec client to get all advisories.

4 participants