Added a KeywordDetector plugin#76
Conversation
Added _remove_password_secrets_if_line_reported_already
Because is often the case that e.g.
SUPER_SECRET_VALUE = 'c3VwZXIgbG9uZyBzdHJ'
is reported both by the PasswordDetector and another plugin.
To minimize diff size, we will simply not report findings from
the PasswordDetector if another plugin reports a secret on the
same line.
| @@ -0,0 +1,68 @@ | |||
| """ | |||
| This code was extracted in part from | |||
| https://github.com/PyCQA/bandit. Using similar heuristic logic, | |||
There was a problem hiding this comment.
Greetings @ericwb and @viraptor, you both seem like the primary maintainers of Bandit.
This code is largely borrowed from plugins/general_hardcoded_password.py and I've tried my best to reference it as such per the APACHE license of Bandit. Can you please give this your blessing, or let me know if it's not to your liking?
Cheers,
Kevin
There was a problem hiding this comment.
This looks 👌 to me. Thank you very much for checking!
| ) | ||
|
|
||
|
|
||
| class PasswordDetector(BasePlugin): |
|
|
||
|
|
||
| class PasswordDetector(BasePlugin): | ||
| """This checks for private keys by determining whether the blacklisted |
| self.data[filename].update(results) | ||
| self.data[filename].update(file_results) | ||
|
|
||
| def _remove_password_secrets_if_line_reported_already( |
There was a problem hiding this comment.
Instead of removing, why don't we change our baseline schema to display types in list form? This way, we can preserve all the intelligence that we gather on a particular flagged secret.
There was a problem hiding this comment.
Well, it would break all the existing baseline's we made (like when we changed the high entropy string types). I'm not sure how hard that would be to implement though, I can look into that.
There was a problem hiding this comment.
Hmm, good point. Though, arguably a step in the right direction?
Thankfully, we built things for doing better version bumps, since the last time. We would need to update the merge_baseline function, and the pre-commit hook should get all active clients slowly upgrading.
The only gotchas I see are:
- Server needs to be updated (and handle stale clients through detected baseline versions)
- We need to handle this case in audit.
Perhaps the fix is merely to implement the dedup logic in auditing (as it's the only place that I can find that uses the type output from the baseline)?
e.g.
def to_list(value):
try:
iter(value)
return value
except ValueError:
return [value]
...
plugin = initialize.from_secret_type(
to_list(secret['type']),
plugin_settings,
)
There was a problem hiding this comment.
re: Instead of removing, why don't we change our baseline schema to display types in list form? This is gonna lead to some super duper ugly code b/c a lot of the code was built around how a single plugin generates a PotentialSecret. I think it is okay to write, but I'm just mentioning b/c it'll get pretty ugly. 😁
e.g. we would still want to yield each individual plugin in _results_accumulator, and so after changing the type attribute of a PotentialSecret to be a list, I would do what I'm doing now, except in addition to removing, I would also add the type to the secret reported by another plugin:
for secret in file_results:
if secret.type == 'Password':
password_secrets.append(secret)
else:
line_numbers_of_other_plugins.add(secret.lineno)
for password_secret in password_secrets:
if password_secret.lineno in line_numbers_of_other_plugins:
del file_results[password_secret]
+ for secret in file_results:
+ if password_secret.lineno == secret.lineno:
+ secret.type.append('Password')If you can give that your blessing I'm happy to make the types into a list 🙏
There was a problem hiding this comment.
When you said Perhaps the fix is merely to implement the dedup logic in auditing, did you mean you don't mind having all the Password ones separate from the others in the baseline? We might double our baseline sizes but I think I'm leaning towards that a little :D
There was a problem hiding this comment.
(Stating for the record, discussed in person yesterday, that we will just take the baseline size increase, and punt of de-duping in the audit functionality until later.)
…lugins report that same line, change PotentialSecret .type back to a single string
|
Maybe add a note to the README that this at least attempts to block these now? Otherwise looks good to me. |
|
Good point! |
Clean up reporting docs
Our version of https://github.com/PyCQA/bandit/blob/master/bandit/plugins/general_hardcoded_password.py#L24