This logic here:
Lines 1005 to 1031 in 7ba8b62
```python
for line in self.parse(report):
    if not line:
        continue
    try:
        value = self.parse_line(line, report)
        if value is None:
            continue
        elif type(value) is list or isinstance(value, types.GeneratorType):
            # filter out None
            events = list(filter(bool, value))
        else:
            events = [value]
    except Exception:
        self.logger.exception('Failed to parse line.')
        self.__failed.append((traceback.format_exc(), line))
    else:
        events_count += len(events)
        self.send_message(*events)

for exc, line in self.__failed:
    report_dump = report.copy()
    report_dump.change('raw', self.recover_line(line))
    if self.parameters.error_dump_message:
        self._dump_message(exc, report_dump)
    if self._Bot__destination_queues and '_on_error' in self._Bot__destination_queues:
        self.send_message(report_dump, path='_on_error')
```
does not work with all `recover_line_*` methods. Some methods use the parameter `line`, others use `self.current_line`. The overall logic is fine, but there is a major bug:
1. `process` collects all failures in the first loop (`for line in self.parse(report)`): `self.__failed` is appended with the failing `line`.
2. In the second loop (`for exc, line in self.__failed`), `recover_line` is called with `line`.
3. If a `recover_line_*` implementation accesses `self.current_line` instead, the data is wrong: by that point `self.current_line` is the last line of the report, not the failing one.
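A minimal, self-contained sketch of the failure mode (the class and its helpers are hypothetical stand-ins, not the actual IntelMQ code):

```python
# Hypothetical stand-in for a ParserBot subclass; not IntelMQ code.
class SketchParserBot:
    def parse_line(self, line):
        if 'bad' in line:
            raise ValueError('cannot parse')

    def recover_line(self, line):
        # Mimics the recover_line_* variants that ignore the `line`
        # parameter and read self.current_line instead.
        return self.current_line

    def process(self, lines):
        failed = []
        for line in lines:
            self.current_line = line   # updated on every iteration
            try:
                self.parse_line(line)
            except Exception:
                failed.append(line)    # the failing line is stored...

        # ...but by now self.current_line is the *last* line of the
        # report, so recovery returns the wrong data:
        return [self.recover_line(line) for line in failed]


bot = SketchParserBot()
print(bot.process(['bad entry', 'good entry']))  # ['good entry'], not ['bad entry']
```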
Unfortunately, simply fixing some `recover_line_*` functions is not enough; the logic in `self.process` needs to be thought through and possibly adapted as well (a sketch follows the list below). Concretely:
- `self.current_line` should be deleted after parsing ends, to prohibit this error in the future.
- `self.recover_line` behaviour should be harmonized, making it applicable for use both in `self.parse_line` and in `self.process`.
- `self.process` should be investigated.
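One possible direction, shown as a hedged sketch against the quoted `process` logic (not a final design): call `recover_line` eagerly inside the first loop, while `self.current_line` still refers to the failing line, and store the already-recovered raw data, so the second loop no longer depends on parser state.

```python
for line in self.parse(report):
    if not line:
        continue
    try:
        value = self.parse_line(line, report)
        ...  # unchanged handling of value/events
    except Exception:
        self.logger.exception('Failed to parse line.')
        # Recover here, while self.current_line still points at the
        # failing line, so both calling conventions yield correct data:
        self.__failed.append((traceback.format_exc(), self.recover_line(line)))
    else:
        ...  # unchanged counting and send_message

# Delete the attribute so any later access fails loudly instead of
# silently returning the report's last line:
del self.current_line

for exc, recovered_line in self.__failed:
    report_dump = report.copy()
    report_dump.change('raw', recovered_line)  # no recover_line call here
    ...  # unchanged dumping and _on_error routing
```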
In the future we also need better tests, but that's a bigger task and I'm afraid we can't tackle it in the short term. Unfortunately, the issue is important, as it leads to bogus (wrong or duplicated) data in the dumps and therefore to loss of data.
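As a starting point for such tests, a pytest-style regression sketch (building on the hypothetical `SketchParserBot` above) could pin down the expected behaviour: a bad line in the middle of a report must be recovered as exactly that line.

```python
# Pytest-style sketch; with the current behaviour this test fails,
# because recovery returns the report's last line instead.
def test_failed_line_is_recovered_correctly():
    bot = SketchParserBot()
    recovered = bot.process(['good 1', 'bad entry', 'good 2'])
    assert recovered == ['bad entry']  # currently: ['good 2']
```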