Record validated tests duration #12638
Merged

Commits (21):
- d740c10 WIP add test duration (tiurin)
- 197f45f Truncate file before writing new data (tiurin)
- 9020ccd Add consistent order (tiurin)
- 1a9a5cf Remove unused code (tiurin)
- ab67b97 Add is_aws and exception check back (tiurin)
- 7ec5f65 Round durations to 2-digit precision (tiurin)
- 2483cdf Update validation timestamp only on successful test call (tiurin)
- 7d858b7 Move is_aws and exception check to the beginning (tiurin)
- 8ce5d11 Re-validate an existing snapshot test (tiurin)
- af49541 Delete dummy tests used for development (tiurin)
- 73b1aa4 Remove logging after development (tiurin)
- 25de430 Update file only on teardown phase (tiurin)
- d6d73aa Revert "Delete dummy tests used for development" (tiurin)
- 5bec439 Get test outcome only on call phase (tiurin)
- bab340d Break down by execution phase (tiurin)
- 80ee541 Use old-style hookwrapper for consistency (tiurin)
- c4f1150 Remove dummy development tests (tiurin)
- 1628cb0 Re-validate ESM CFN tests after hook refactoring (tiurin)
- 56fb4b0 Reorganize durations object (tiurin)
- b6cb8a6 Add note about duration recordings to parity testing readme (tiurin)
- 49adf9d Use public imports for TestReport and StashKey (tiurin)
```diff
@@ -9,17 +9,28 @@
 import json
 import os
 from pathlib import Path
-from typing import Optional
+from typing import Dict, Optional

-import pluggy
 import pytest
+from pluggy import Result
+from pytest import StashKey, TestReport

 from localstack.testing.aws.util import is_aws_cloud

+durations_key = StashKey[Dict[str, float]]()
+"""
+Stores phase durations on the test node between execution phases.
+See https://docs.pytest.org/en/latest/reference/reference.html#pytest.Stash
+"""
+test_failed_key = StashKey[bool]()
+"""
+Stores information from call execution phase about whether the test failed.
+"""
+

-def find_snapshot_for_item(item: pytest.Item) -> Optional[dict]:
+def find_validation_data_for_item(item: pytest.Item) -> Optional[dict]:
     base_path = os.path.join(item.fspath.dirname, item.fspath.purebasename)
-    snapshot_path = f"{base_path}.snapshot.json"
+    snapshot_path = f"{base_path}.validation.json"

     if not os.path.exists(snapshot_path):
         return None
```
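Aside (editor's sketch, not part of the diff): the plugin threads state between execution phases through pytest's typed per-item storage, `item.stash`, using the two `StashKey` handles declared above. A minimal illustration of the `Stash` semantics, with illustrative key names:

```python
from pytest import Stash, StashKey

# Each StashKey instance is a unique, typed handle into a Stash.
durations_key = StashKey[dict]()
failed_key = StashKey[bool]()

stash = Stash()  # pytest attaches one to every test item as `item.stash`
durations = stash.setdefault(durations_key, {})  # first access creates the dict
durations["setup"] = 0.54

# A later phase sees the same dict through the same key object.
assert stash[durations_key] == {"setup": 0.54}
# Absent keys raise KeyError via []; .get() with a default mirrors the
# plugin's `item.stash.get(test_failed_key, True)` pattern.
assert stash.get(failed_key, True) is True
```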
```diff
@@ -29,65 +40,76 @@ def find_snapshot_for_item(item: pytest.Item) -> Optional[dict]:
         return file_content.get(item.nodeid)


-def find_validation_data_for_item(item: pytest.Item) -> Optional[dict]:
-    base_path = os.path.join(item.fspath.dirname, item.fspath.purebasename)
-    snapshot_path = f"{base_path}.validation.json"
-
-    if not os.path.exists(snapshot_path):
-        return None
-
-    with open(snapshot_path, "r") as fd:
-        file_content = json.load(fd)
-        return file_content.get(item.nodeid)
+@pytest.hookimpl(hookwrapper=True)
+def pytest_runtest_makereport(item: pytest.Item, call: pytest.CallInfo):
+    """
+    This hook is called after each test execution phase (setup, call, teardown).
+    """
+    result: Result = yield
+    report: TestReport = result.get_result()
+
+    if call.when == "setup":
+        _makereport_setup(item, call)
+    elif call.when == "call":
+        _makereport_call(item, call)
+    elif call.when == "teardown":
+        _makereport_teardown(item, call)
+
+    return report


-def record_passed_validation(item: pytest.Item, timestamp: Optional[datetime.datetime] = None):
+def _stash_phase_duration(call, item):
+    durations_by_phase = item.stash.setdefault(durations_key, {})
+    durations_by_phase[call.when] = round(call.duration, 2)
+
+
+def _makereport_setup(item: pytest.Item, call: pytest.CallInfo):
+    _stash_phase_duration(call, item)
+
+
+def _makereport_call(item: pytest.Item, call: pytest.CallInfo):
+    _stash_phase_duration(call, item)
+    item.stash[test_failed_key] = call.excinfo is not None
+
+
+def _makereport_teardown(item: pytest.Item, call: pytest.CallInfo):
+    _stash_phase_duration(call, item)
+
+    # only update the file when running against AWS and the test finishes successfully
+    if not is_aws_cloud() or item.stash.get(test_failed_key, True):
+        return
+
     base_path = os.path.join(item.fspath.dirname, item.fspath.purebasename)
     file_path = Path(f"{base_path}.validation.json")
     file_path.touch()
     with file_path.open(mode="r+") as fd:
         # read existing state from file
         try:
             content = json.load(fd)
-        except json.JSONDecodeError:  # expected on first try (empty file)
+        except json.JSONDecodeError:  # expected on the first try (empty file)
             content = {}

-        # update for this pytest node
-        if not timestamp:
-            timestamp = datetime.datetime.now(tz=datetime.timezone.utc)
-        content[item.nodeid] = {"last_validated_date": timestamp.isoformat(timespec="seconds")}
+        test_execution_data = content.setdefault(item.nodeid, {})

-        # save updates
-        fd.seek(0)
-        json.dump(content, fd, indent=2, sort_keys=True)
-        fd.write("\n")  # add trailing newline for linter and Git compliance
+        timestamp = datetime.datetime.now(tz=datetime.timezone.utc)
+        test_execution_data["last_validated_date"] = timestamp.isoformat(timespec="seconds")

+        durations_by_phase = item.stash[durations_key]
+        test_execution_data["durations_in_seconds"] = durations_by_phase

-# TODO: we should skip if we're updating snapshots
+        total_duration = sum(durations_by_phase.values())
+        durations_by_phase["total"] = round(total_duration, 2)

-# make sure this is *AFTER* snapshot comparison => tryfirst=True
-@pytest.hookimpl(hookwrapper=True, tryfirst=True)
-def pytest_runtest_call(item: pytest.Item):
-    outcome: pluggy.Result = yield
-
-    # we only want to track passed runs against AWS
-    if not is_aws_cloud() or outcome.excinfo:
-        return
-
-    record_passed_validation(item)
-
-
-# this is a sort of utility used for retroactively creating validation files in accordance with existing snapshot files
-# it takes the recorded date from a snapshot and sets it to the last validated date
-# @pytest.hookimpl(trylast=True)
-# def pytest_collection_modifyitems(session: pytest.Session, config: pytest.Config, items: list[pytest.Item]):
-#     for item in items:
-#         snapshot_entry = find_snapshot_for_item(item)
-#         if not snapshot_entry:
-#             continue
-#
-#         snapshot_update_timestamp = datetime.datetime.strptime(snapshot_entry["recorded-date"], "%d-%m-%Y, %H:%M:%S").astimezone(tz=datetime.timezone.utc)
-#
-#         record_passed_validation(item, snapshot_update_timestamp)
+        # For json.dump sorted test entries enable consistent diffs.
+        # But test execution data is more readable in insert order for each step (setup, call, teardown).
+        # Hence, not using global sort_keys=True for json.dump but rather additionally sorting top-level dict only.
+        content = dict(sorted(content.items()))
+
+        # save updates
+        fd.truncate(0)  # clear existing content
+        fd.seek(0)
+        json.dump(content, fd, indent=2)
+        fd.write("\n")  # add trailing newline for linter and Git compliance


 @pytest.hookimpl
```

(Review comment on the sort-ordering block above: "praise: neat attention to detail ✨")
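The commit "Use old-style hookwrapper for consistency" refers to the two spellings pytest supports for wrapping a hook. A sketch of both for contrast (editor's illustration, not from the PR); only one of the two would exist in a real plugin, and the new-style wrapper assumes pytest >= 7.4:

```python
import pytest

# Old-style hookwrapper, as used in this PR: `yield` hands back a pluggy
# Result, and the underlying report must be unwrapped with get_result().
@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    result = yield
    report = result.get_result()
    # ... inspect `call.when`, stash durations, etc. ...


# New-style wrapper: `yield` evaluates directly to the report, and the
# wrapper's return value becomes the hook's result.
@pytest.hookimpl(wrapper=True)
def pytest_runtest_makereport(item, call):  # noqa: F811 (illustration only)
    report = yield
    # ... same logic ...
    return report
```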
tests/aws/services/lambda_/event_source_mapping/test_cfn_resource.snapshot.json (1 addition, 1 deletion)

tests/aws/services/lambda_/event_source_mapping/test_cfn_resource.validation.json (7 additions, 1 deletion):

```diff
@@ -1,5 +1,11 @@
 {
   "tests/aws/services/lambda_/event_source_mapping/test_cfn_resource.py::test_adding_tags": {
-    "last_validated_date": "2024-11-06T11:55:29+00:00"
+    "last_validated_date": "2025-05-19T09:33:12+00:00",
+    "durations_in_seconds": {
+      "setup": 0.54,
+      "call": 69.88,
+      "teardown": 54.76,
+      "total": 125.18
+    }
   }
 }
```
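With durations now stored next to `last_validated_date`, the validation files can be mined for slow tests. A hypothetical consumer (not part of this PR; `slowest_tests` and the search root are made up for illustration):

```python
# Hypothetical helper: summarize recorded durations across all
# *.validation.json files and list the slowest validated tests.
import json
from pathlib import Path


def slowest_tests(root: str, top: int = 10) -> list[tuple[str, float]]:
    totals = {}
    for path in Path(root).rglob("*.validation.json"):
        data = json.loads(path.read_text())
        for node_id, entry in data.items():
            durations = entry.get("durations_in_seconds")
            if durations:  # older entries may predate duration recording
                totals[node_id] = durations["total"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


if __name__ == "__main__":
    for node_id, total in slowest_tests("tests/aws"):
        print(f"{total:8.2f}s  {node_id}")
```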
Review discussion:

Comment: "Can't we track that anyway and capture it? That way we would avoid flipping between potentially minutes of setup time and only a few microseconds otherwise 🤔"

Reply: "One thing I thought about was to get the test collection name via `item.session.config.args`, e.g. `['aws/services/lambda_/event_source_mapping']`. It could be used as part of the key, or as a unique property, so that a test's duration is updated only if the test has been validated within the same test collection. However, that would mean recording durations for each new test collection, which is confusing. Or recording only for predefined collections, e.g. only individual runs or only class runs, which is also confusing and can be opaque ('why haven't my durations been updated?'). Also, test ordering might come into play for collections. Plus, the args would need sanitizing, as they may contain full paths and reveal local setup details. Quite hard to factor in, given the many unknown details and their unknown impact.

I'd bet on simplicity for now, see whether durations actually flip a lot (somewhat a good sign: it means tests are re-validated, hehe), and learn how we can adapt the format if needed."
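For concreteness, a hypothetical sketch of the `item.session.config.args` idea discussed above; `_collection_scope` and its sanitization are illustrative only, not something this PR adds:

```python
import pytest


def _collection_scope(item: pytest.Item) -> list[str]:
    """Return the invocation args with local absolute prefixes stripped."""
    root = str(item.session.config.rootpath)
    scope = []
    for arg in item.session.config.args:
        # sanitize: drop the machine-specific prefix so validation files
        # do not reveal local setup details
        scope.append(arg.removeprefix(root).lstrip("/") or ".")
    return scope


# A duration entry could then carry e.g.
#   "validated_in_scope": ["aws/services/lambda_/event_source_mapping"]
# and be refreshed only when re-validated under the same scope.
```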