Overview
It's important to recognize what kinds of errors can occur during node execution. Some errors result from network problems, in which case the node should be able to recover and continue its job; others are fatal and recovery is not possible.
We need to design a unified way to categorize different types of errors and handle them appropriately. It is also important to correctly chain errors thrown from lower layers so that the final exception contains all relevant information. Unfortunately, the current code does not get even this basic exception handling right.
Design
Basic knowledge about exceptions
- Swallowing exceptions is strictly forbidden
- When an exception is caught, it must be re-thrown (as is or wrapped in another exception) or written to the log
- More info about chained exceptions:
https://docs.oracle.com/javase/7/docs/api/java/lang/Throwable.html
- Exceptions should NOT be used to control the flow of the program
- Exceptions in Cats Effect work differently from Java exceptions
- Catching all Throwable errors (rather than only NonFatal ones) is dangerous (see the item above)
- Transforming exceptions to `String` loses information and should only be done at the exit point, not inside domain logic
- Catching exceptions should be done at the latest possible point, not on every function call
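The chaining and "no swallowing" rules above can be illustrated with a minimal Java sketch. The names `BlockStoreException`, `readFromDisk`, `loadBlock` and `rootCause` are hypothetical and exist only for this example; they are not taken from the node's code base.

```java
import java.io.IOException;

// Hypothetical domain exception, used only for this illustration.
class BlockStoreException extends Exception {
    BlockStoreException(String message, Throwable cause) {
        super(message, cause); // keep the low-level error as the cause
    }
}

class ChainingSketch {
    // Hypothetical low-level operation that always fails with an I/O error.
    static byte[] readFromDisk(String blockHash) throws IOException {
        throw new IOException("disk read failed for " + blockHash);
    }

    // Wraps the low-level error instead of swallowing it or flattening it to a String.
    static byte[] loadBlock(String blockHash) throws BlockStoreException {
        try {
            return readFromDisk(blockHash);
        } catch (IOException e) {
            throw new BlockStoreException("cannot load block " + blockHash, e);
        }
    }

    // Walks the cause chain down to the original error.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            loadBlock("abc123");
        } catch (BlockStoreException e) {
            // Exit point: only here is the exception turned into text for the log.
            System.err.println(e.getMessage() + " (root cause: " + rootCause(e).getMessage() + ")");
        }
    }
}
```

Because the cause is preserved, the caller at the exit point can still inspect the original `IOException` rather than a message string.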
API
Transforming exceptions into another representation should be done only when the error is handled and the code continues normal operation, or when the error is handed to an external source or caller.
This is important for API implementations, which are the boundary where errors must be transformed according to the underlying protocol. We are all familiar with HTTP error codes and their meaning. gRPC works essentially the same way, so defining a custom ServiceError type in every response is clearly wrong.
rchain/models/src/main/protobuf/DeployServiceV1.proto
Lines 72 to 235 in 6c8dbce
```protobuf
message EventInfoResponse{
  oneof message{
    ServiceError error = 1;
    BlockEventInfo result = 2;
  }
}
message ExploratoryDeployResponse{
  oneof message{
    ServiceError error = 1;
    DataWithBlockInfo result = 2;
  }
}
// doDeploy
message DeployResponse {
  oneof message {
    ServiceError error = 1;
    string result = 2;
  }
}
// deployStatus
message DeployStatusResponse {
  oneof message {
    ServiceError error = 1;
    DeployExecStatus deployExecStatus = 2;
  }
}
message DeployExecStatus {
  oneof status {
    ProcessedWithSuccess processedWithSuccess = 1;
    ProcessedWithError processedWithError = 2;
    NotProcessed notProcessed = 3;
  }
}
message ProcessedWithSuccess {
  repeated Par deployResult = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
message ProcessedWithError {
  string deployError = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
message NotProcessed {
  string status = 1;
}
// getBlock
message BlockResponse {
  oneof message {
    ServiceError error = 1;
    BlockInfo blockInfo = 2;
  }
}
// visualizeDag
message VisualizeBlocksResponse {
  oneof message {
    ServiceError error = 1;
    string content = 2;
  }
}
// machineVerifiableDag
message MachineVerifyResponse {
  oneof message {
    ServiceError error = 1;
    string content = 2;
  }
}
// getBlocks
message BlockInfoResponse {
  oneof message {
    ServiceError error = 1;
    LightBlockInfo blockInfo = 2;
  }
}
// listenForDataAtName
message ListeningNameDataResponse {
  oneof message {
    ServiceError error = 1;
    ListeningNameDataPayload payload = 2;
  }
}
message ListeningNameDataPayload {
  repeated DataWithBlockInfo blockInfo = 1;
  int32 length = 2;
}
// listenForDataAtPar
message RhoDataResponse {
  oneof message {
    ServiceError error = 1;
    RhoDataPayload payload = 2;
  }
}
message RhoDataPayload {
  repeated Par par = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
// listenForContinuationAtName
message ContinuationAtNameResponse {
  oneof message {
    ServiceError error = 1;
    ContinuationAtNamePayload payload = 2;
  }
}
message ContinuationAtNamePayload {
  repeated ContinuationsWithBlockInfo blockResults = 1;
  int32 length = 2;
}
// findDeploy
message FindDeployResponse {
  oneof message {
    ServiceError error = 1;
    LightBlockInfo blockInfo = 2;
  }
}
message PrivateNamePreviewPayload {
  repeated bytes ids = 1; // a la GPrivate
}
// lastFinalizedBlock
message LastFinalizedBlockResponse {
  oneof message {
    ServiceError error = 1;
    BlockInfo blockInfo = 2;
  }
}
// isFinalized
message IsFinalizedResponse {
  oneof message {
    ServiceError error = 1;
    bool isFinalized = 2;
  }
}
message BondStatusResponse {
  oneof message {
    ServiceError error = 1;
    bool isBonded = 2;
  }
}
message StatusResponse {
  oneof message {
    ServiceError error = 1;
    Status status = 2;
  }
}
```
More information about gRPC error handling.
https://www.grpc.io/docs/guides/error/
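As a rough sketch of what transforming errors at the API boundary could look like, the Java example below maps exceptions to gRPC status codes in a single place. The `StatusCode` enum is a local stand-in so the example runs without the gRPC library (the names and numeric values follow the gRPC status code list); `BlockNotFoundException`, `InvalidDeployException` and `ApiBoundary` are hypothetical names, and the mapping itself is an assumption, not a proposal from this issue.

```java
import java.net.ConnectException;

// Local stand-in for the gRPC status codes used in this sketch.
enum StatusCode {
    INVALID_ARGUMENT(3), NOT_FOUND(5), INTERNAL(13), UNAVAILABLE(14);

    final int value;

    StatusCode(int value) {
        this.value = value;
    }
}

// Hypothetical domain errors thrown by lower layers.
class BlockNotFoundException extends RuntimeException {
    BlockNotFoundException(String message) {
        super(message);
    }
}

class InvalidDeployException extends RuntimeException {
    InvalidDeployException(String message) {
        super(message);
    }
}

class ApiBoundary {
    // The only place where an exception is translated to a protocol-level status.
    static StatusCode toStatus(Throwable error) {
        if (error instanceof BlockNotFoundException) return StatusCode.NOT_FOUND;
        if (error instanceof InvalidDeployException) return StatusCode.INVALID_ARGUMENT;
        if (error instanceof ConnectException) return StatusCode.UNAVAILABLE;
        return StatusCode.INTERNAL; // anything unexpected is reported as a server error
    }
}
```

With a mapping like this, a response message would only need to carry the successful result, and the error would travel in the gRPC status instead of a `ServiceError` field.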
Categorization of errors
Exceptions, as the name suggests, represent exceptional situations. When one occurs, it should provide enough information for a human (or a program) to understand the nature of the error or to search for more data based on the error message.
For these purposes, errors usually contain an error code, which is used to group similar types of errors or to identify the error directly.
For example, HTTP error 404 means that the requested resource was not found.
https://www.rfc-editor.org/rfc/rfc9110.html#name-404-not-found
Similarly, HTTP 500 represents a generic server error.
https://www.rfc-editor.org/rfc/rfc9110.html#name-500-internal-server-error
Error codes also give users an easier way to search for errors, and can even be used to generate links to documentation.
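One possible shape for such a catalogue is sketched below in Java. Everything in it is hypothetical: the code values (`E1001`, `E2001`), the `recoverable` flag (which mirrors the recoverable/fatal distinction from the overview) and the `https://example.com/errors/` base URL are placeholders chosen for illustration.

```java
// Hypothetical error catalogue; codes and URL are placeholders.
enum NodeErrorCode {
    NETWORK_TIMEOUT("E1001", true),     // the node may retry and continue
    STORAGE_CORRUPTED("E2001", false);  // fatal, recovery is not possible

    // Placeholder documentation base URL.
    private static final String DOCS_BASE = "https://example.com/errors/";

    final String code;
    final boolean recoverable;

    NodeErrorCode(String code, boolean recoverable) {
        this.code = code;
        this.recoverable = recoverable;
    }

    // Builds a documentation link from the error code.
    String docsUrl() {
        return DOCS_BASE + code;
    }
}

// Exception that carries a code in addition to the chained cause.
class NodeException extends RuntimeException {
    final NodeErrorCode errorCode;

    NodeException(NodeErrorCode errorCode, String message, Throwable cause) {
        super("[" + errorCode.code + "] " + message, cause);
        this.errorCode = errorCode;
    }
}
```

A handler could then decide whether to retry based on `errorCode.recoverable`, and log `errorCode.docsUrl()` next to the message so the error is easy to look up.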