Case insensitive String comparisons #2450
…rings" perform string compares case-insensitively; Modified the Acceptance tests affected by the above 2 keywords; Added an Acceptance test to prove out the case-insensitive string comparison;
…rings" perform string compares case-insensitively; (#1) Modified the Acceptance tests affected by the above 2 keywords; Added an Acceptance test to prove out the case-insensitive string comparison;
…bles, such that returned messages match the currently expected output (e.g. "FAIL: 'test' != 'TEST1'" rather than "FAIL: 'test' != 'test1'" when the user passed in "TEST1"); Added Acceptance tests for all the keywords affected by the case-insensitivity changes;
…d not be changed when running
PR #2438 by @krizex already fixed the problem with settings preserving some data between runs that he had reported as #2437. This commit slightly cleans up the code and unit tests in the PR. A new unit/integration test that validates the real problem with --rerunfailed caused by this issue is added too.
Added Pygments styles and configured docutils to be compatible with them. Embedding Pygments styles to all Libdoc outputs may be a bit questionable, but they are small enough that I don't think it is a real problem. We may also later add syntax highlighting to Libdoc's default syntax too. Also made background color used by examples etc. in Libdoc outputs and also in logs and reports a little lighter. Syntax highlighting looks better on a little lighter background and this background color is used also in the User Guide.
No need to use an external Robot lexer anymore now that Pygments 2.1 supports all new features. Also made it an error to use an invalid syntax highlight language.
…rk into CaseInsensitive
This allows catching these errors using e.g. Run Keyword And Expect Error and also allows teardowns to continue after them.
I didn't yet look at tests, but code looks very good. Most important things to be fixed:
- Keywords that accept a container should be enhanced to convert items in the container to lower case.
- Docs and implementation with keywords that always expect arguments to be passed as strings can be simplified.
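The first point above, about lower-casing items inside containers, can be illustrated with a minimal sketch. The helper names `_lower_if_string` and `contains_ignoring_case` are hypothetical and are not part of the PR or of Robot Framework; the sketch only assumes that string items should be lower-cased and other items left untouched.

```python
def _lower_if_string(value):
    # Hypothetical helper: lower-case strings, leave other objects as they are.
    return value.lower() if isinstance(value, str) else value


def contains_ignoring_case(container, item):
    """Return True if ``container`` contains ``item``, ignoring case of strings."""
    item = _lower_if_string(item)
    if isinstance(container, str):
        # Substring check against a lower-cased copy of the string.
        return isinstance(item, str) and item in container.lower()
    # For lists, tuples, sets etc., lower-case every string item before comparing.
    return item in [_lower_if_string(i) for i in container]
```

With this approach `contains_ignoring_case(['Foo', 'BAR', 3], 'bar')` is true, while non-string items such as `3` are still compared as-is.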
``ignore_case`` is False by default. If True, it indicates that
``first`` and ``second`` should be compared case-insensitively,
provided that ``first`` and ``second`` are string types. See `Boolean
arguments` section for more details. (New in RF 3.0.1)
Very good docs and formatting. Two minor issues:
- `(New in RF 3.0.1)` is a bit inconsistent with how this kind of information is displayed elsewhere. Could you change it to, for example, `This option is new in Robot Framework 3.0.1.`?
- `ignore_case is False by default. If True, it indicates ...` is a bit redundant considering that the argument is `ignore_case=False`. The current wording is fine, but perhaps something like `If ignore_case is given a true value, it indicates ...` would be a little better.
If both arguments are multiline strings, the comparison is done using
`multiline string comparisons`.
"""
if is_truthy(ignore_case) and is_string(first) and is_string(second):
    first = first.lower()
    second = second.lower()
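As a self-contained illustration of the logic in this excerpt, here is a minimal sketch outside of Robot Framework. The `is_truthy` function below is a simplified stand-in for the framework's utility (an assumption, not its actual implementation), `isinstance(..., str)` replaces `is_string`, and the error message format is illustrative only.

```python
def is_truthy(value):
    # Simplified stand-in: strings such as 'False' or 'no' count as false,
    # other values follow normal Python truthiness.
    if isinstance(value, str):
        return value.upper() not in ('FALSE', 'NO', '')
    return bool(value)


def should_be_equal(first, second, ignore_case=False):
    a, b = first, second
    # Only lower-case when both values really are strings.
    if is_truthy(ignore_case) and isinstance(first, str) and isinstance(second, str):
        a, b = first.lower(), second.lower()
    if a != b:
        # Report the values the caller passed in, not the lower-cased copies.
        raise AssertionError('%s != %s' % (first, second))
```

Because the type check guards the `lower()` calls, non-string arguments such as integers are compared unchanged even when `ignore_case` is enabled.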
This code is great, but it is unfortunately subject to this annoying IronPython bug:
http://ironpython.codeplex.com/workitem/33133
We have a `utils.lower()` utility that could be used instead of `string.lower()`, or alternatively we could use `string.upper()` here. Or then we could argue that we don't need to care about this because the bug is fixed in the latest IronPython 2.7.x releases. What do you think?
I'd prefer to not add workarounds for Python engines that have bugs which need to be resolved.
Adding workarounds like that is annoying, but the framework working incorrectly on some platform is worse. It's especially bad if tests fail on that platform when run on CI. In this particular case we can argue that people ought to use newish IronPython versions where this problem is fixed. Like we do on our CI.
So, are you advocating that users of IronPython 2.7.x and below should upgrade to fix the bug with "string.lower()" in their interpreter, or that we should change all the "string.lower()" calls to "utils.lower(string)" to work around that bug?
My opinion is to leave the string conversions as they are and have the users upgrade, since bugs with "string.lower()" are likely causing problems elsewhere too.
``ignore_case`` is False by default. If True, it indicates that
``first`` and ``second`` should be compared case-insensitively,
provided that ``first`` and ``second`` are string types. See `Boolean
arguments` section for more details. (New in RF 3.0.1)
With this particular keyword, "provided that first and second are string types" is misleading. This keyword first converts inputs to strings, and thus values are always compared case-insensitively when `ignore_case` is enabled. I guess the whole "provided that ..." piece could just be removed.
Fair enough.
See `Should Be Equal` for an explanation on how to override the default
error message with ``msg`` and ``values``.
"""
self._log_types_at_info_if_different(first, second)
first, second = [self._convert_to_string(i) for i in (first, second)]
if is_truthy(ignore_case) and is_string(first) and is_string(second):
No need to use `is_string` here.
Good point.
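The reasoning behind this exchange is that once both values have been converted to strings, a separate string type check can never fail. A minimal sketch, using plain `str()` as an assumed stand-in for the library's own string conversion and a plain boolean in place of `is_truthy`:

```python
def should_be_equal_as_strings(first, second, ignore_case=False):
    # After this conversion both values are guaranteed to be strings,
    # so no additional type check is needed before lower-casing.
    str_first, str_second = str(first), str(second)
    a, b = str_first, str_second
    if ignore_case:
        a, b = a.lower(), b.lower()
    if a != b:
        raise AssertionError('%s != %s' % (str_first, str_second))
```

For example, `should_be_equal_as_strings(True, 'true', ignore_case=True)` passes because `str(True)` is `'True'`, which lower-cases to `'true'`.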
"""Fails if objects are unequal after converting them to strings.

See `Should Be Equal` for an explanation on how to override the default
error message with ``msg`` and ``values``.

``ignore_case`` is False by default. If True, it indicates that
``first`` and ``second`` should be compared case-insensitively,
provided that ``first`` and ``second`` are string types. See `Boolean
"provided that ... are string types" is wrong also here.
"""Fails if ``item1`` does not contain ``item2`` ``count`` times.

Works with strings, lists and all objects that `Get Count` works
with. The default error message can be overridden with ``msg`` and
the actual count is always logged.

``ignore_case`` is False by default. If True, it indicates that
``item1`` and ``item2`` should be compared case-insensitively,
provided that ``item1`` and ``item2`` are string types. See `Boolean
Same discussion about docs and implementation as with `should_not_contain` applies also to this keyword.
"""Fails if the given ``string`` matches the given ``pattern``.

Pattern matching is similar as matching files in a shell, and it is
always case-sensitive. In the pattern ``*`` matches to anything and
``?`` matches to any single character.

``ignore_case`` is False by default. If True, it indicates that
``first`` and ``second`` should be compared case-insensitively,
provided that ``first`` and ``second`` are string types. See `Boolean
Similarly as with other keywords that should always be used with strings, "provided that ... are string types" is unnecessary.
if is_truthy(ignore_case) and is_string(string) and is_string(pattern):
    prv_string = string.lower()
    prv_pattern = pattern.lower()
if self._matches(prv_string, prv_pattern):
No need to explicitly check whether `string` and `pattern` are strings or not. Simply using

    if is_truthy(ignore_case):
        string = string.lower()
        pattern = pattern.lower()

ought to be enough.
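To show the suggested simplification in a runnable form, here is a small sketch that uses `fnmatch.fnmatchcase` from the Python standard library as an assumed stand-in for the library's internal `_matches` method. The function name `matches` is hypothetical.

```python
import fnmatch


def matches(string, pattern, ignore_case=False):
    """Glob-style match where ``*`` matches anything and ``?`` one character."""
    if ignore_case:
        # Lower-casing both sides makes the otherwise case-sensitive
        # match behave case-insensitively.
        string = string.lower()
        pattern = pattern.lower()
    # fnmatchcase never normalizes case itself, regardless of platform.
    return fnmatch.fnmatchcase(string, pattern)
```

Lower-casing the pattern also lower-cases any character classes in it, which is consistent here because the string is lower-cased as well.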
"""Fails unless the given ``string`` matches the given ``pattern``.

Pattern matching is similar as matching files in a shell, and it is
always case-sensitive. In the pattern, ``*`` matches to anything and
``?`` matches to any single character.

``ignore_case`` is False by default. If True, it indicates that
``string`` and ``pattern`` should be compared case-insensitively,
provided that ``string`` and ``pattern`` are string types. See `Boolean
"provided that ... are string types" is unnecessary here too.
if is_truthy(ignore_case) and is_string(string) and is_string(pattern):
    prv_string = string.lower()
    prv_pattern = pattern.lower()
if not self._matches(prv_string, prv_pattern):
No need to check whether `string` and `pattern` are strings here either.
This is quite a big PR. It would be easier to review changes if each keyword and its "not" variant would be in a separate pull request. Simple cases like
As already discussed in code comments, some keywords currently show the lower cased strings in the error message and others show the original. Probably showing the original is better, but that needs to be done consistently. If we agree it is a good idea, we should enhance …

On the code level we have two general possibilities to preserve the original values. Either we can just keep a reference to the original values and use them in the error message like this:

    orig_first = first
    orig_second = second
    if is_truthy(case_insensitive):
        first = first.lower()
        second = second.lower()
    if not first.startswith(second):
        # raise exception using `orig_first` and `orig_second`

Alternatively we could do the actual validation differently depending on case-sensitivity:

    if not is_truthy(case_insensitive):
        passed = first.startswith(second)
    else:
        passed = first.lower().startswith(second.lower())
    if not passed:
        # raise exception using `first` and `second`

Probably the former is more explicit and easier to understand. In this case the latter is shorter, but if the validation part is more complex, duplicating it isn't a good idea.
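The first approach described above, keeping references to the original values, could look like this as a complete function. This is only a sketch: the function name `should_start_with`, the plain boolean flag and the exact error message text are assumptions made for illustration.

```python
def should_start_with(first, second, ignore_case=False):
    # Keep the values exactly as the caller passed them for error reporting.
    orig_first, orig_second = first, second
    if ignore_case:
        first = first.lower()
        second = second.lower()
    if not first.startswith(second):
        # The message uses the originals, so the user sees 'TEST1', not 'test1'.
        raise AssertionError(
            "'%s' does not start with '%s'" % (orig_first, orig_second)
        )
```

Calling `should_start_with('test', 'TEST1', ignore_case=True)` fails with a message that still contains `'TEST1'` as the user wrote it, which matches the behaviour the commit messages in this PR describe as expected.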
…ileLinksInOutput # Conflicts: # src/robot/libraries/BuiltIn.py
I just revisited all of the above, and I'm more convinced that your suggestion to handle this update as multiple bite-size PRs is a much wiser idea. I'll go ahead and close this PR and then reintroduce the changes one PR at a time, each focused on one logical grouping of keywords (i.e. "Should Be Equal"/"Should Not Be Equal"), keeping all your above feedback in mind of course :-)
Sounds great!
Added input parameter to all the keywords that commonly compare strings, and could use that feature;
Modified the function logic to trigger off the input parameter and compare strings case-insensitively;