Every day we deal with errors, warnings and messages while writing, debugging or reviewing code. In R, all three are types of conditions. You might hope to see as few of them as possible, but they are very helpful when they describe the problem concisely and point to its source. So if you write functions or code for yourself or others, it is good practice to spend more time writing descriptive conditions. I, personally, was not following this advice all the time, but as I got into the habit, I learnt more about the condition handling system in R and about the improvements rlang provides.
In this post I will highlight the basics of condition handling. Then we’ll see the benefits of custom conditions provided by rlang. By the end, we will be able to generate custom conditions and throw errors with more detail.
Conditions in base R
In R, conditions are regular objects, and they mainly include:
- error: signaled by stop()
- warning: generated by warning()
- message: generated by message()
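As a small illustration of the list above, the following sketch signals one condition of each type and captures its class vector. The helper name capture_classes() is made up for this example and is not part of base R.

```r
# Hypothetical helper: evaluate `expr` and, if any condition is signaled,
# return the class vector of that condition instead of letting it propagate.
capture_classes <- function(expr) {
  tryCatch(
    expr,
    condition = function(cnd) class(cnd)
  )
}

err_classes  <- capture_classes(stop("an error"))
warn_classes <- capture_classes(warning("a warning"))
msg_classes  <- capture_classes(message("a message"))

err_classes   # "simpleError"   "error"   "condition"
warn_classes  # "simpleWarning" "warning" "condition"
msg_classes   # "simpleMessage" "message" "condition"
```

All three share the base class condition, which is why a single handler registered for condition can catch any of them.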
Let’s see an example with a simple function, my_sqrt(), that raises an error when a negative number is passed to it.
## define my_sqrt() that rejects negative numbers
my_sqrt <- function(x) {
  if (x < 0) {
    stop("x must be positive")
  } else {
    sqrt(x)
  }
}
Now if you pass -1 to my_sqrt(), it will exit and show you the message which you specified inside stop().
## pass -ve number to my_sqrt()
my_sqrt(-1)
Error in my_sqrt(-1): x must be positive
But how can we handle conditions and decide what to do when they are generated?
Condition handling
tryCatch() is one of the ways to inspect condition objects and control what happens when a condition is signaled. For instance, we can define an error handler to decide what happens when my_sqrt() fails. Here, function(cnd) cnd, the error handler passed to the error argument of tryCatch(), says “catch the error object and return it”.
If you inspect the returned value sqrt_cnd, you can see a list with:
- message: the error message you defined in my_sqrt().
- call: the function call that raised this error.
## define an error handler to return the error object when an error is thrown
sqrt_cnd <- tryCatch(error = function(cnd) cnd, my_sqrt(-1))
str(sqrt_cnd)
List of 2
$ message: chr "x must be positive"
$ call : language my_sqrt(-1)
- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
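Besides str(), base R offers the accessor functions conditionMessage() and conditionCall() for reading the two fields described above. The sketch below repeats the definition of my_sqrt() so that it runs on its own; the variable names cnd_msg and cnd_call are only illustrative.

```r
my_sqrt <- function(x) {
  if (x < 0) {
    stop("x must be positive")
  } else {
    sqrt(x)
  }
}

# Catch the error object and return it, as in the example above
sqrt_cnd <- tryCatch(error = function(cnd) cnd, my_sqrt(-1))

# Base R accessors; they read the same data as sqrt_cnd$message and sqrt_cnd$call
cnd_msg  <- conditionMessage(sqrt_cnd)
cnd_call <- conditionCall(sqrt_cnd)

cnd_msg            # "x must be positive"
deparse(cnd_call)  # "my_sqrt(-1)"
```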
Since the error handler is a normal function, you can return something other than the condition object with its message and call. For instance, instead of catching the error and returning it, you can return a fallback value; here 0.
## define an error handler to return 0 when an error is thrown
sqrt_cnd <- tryCatch(error = function(cnd) 0, my_sqrt(-1))
str(sqrt_cnd)
num 0
But what if we had a chain of functions, where one function calls another?
Conditions with a chain of functions
In practice, we usually write functions that call other functions and we might get lost if we don’t have an easy way to find the source of the error or decide what to do when it is thrown.
To see this case, let’s define two functions for demo purposes:
get_val(): returns a random value if it is positive and raises an error if it is negative. (This simulates random inputs or fetching data from users, a database, etc.)
## define get_val() to simulate random input values
get_val <- function() {
  val <- runif(1, -10, 10)
  ## raise an error for negative values, as described above
  if (val < 0) {
    stop("Can't get val")
  } else {
    val
  }
}
mult_val(): calls get_val() and multiplies the returned value by 2.
## Note that `mult_val()` is not a very practical example,
## because the function doesn't do a single task related to its name,
## but I am just using it for demo purposes
mult_val <- function(mult_by = 2) {
  x <- get_val()
  x * mult_by
}
In case val is negative in get_val(), an error will be thrown as follows:
## in case val negative
get_val()
Error in get_val(): Can't get val
Similarly, when we call mult_val(), the error propagates up and we see the same error message.
## in case val negative
mult_val()
Error in get_val(): Can't get val
In both cases, we get the same error message and no information about the value of val that caused the error.
So is there a way to see more information about the error, like the exact value of val? Or could we write more detailed messages?
Conditions in rlang
In principle, it is possible to create custom condition objects to pass more metadata about the error. But in base R, doing so is rather confusing compared to what rlang provides. I had to look up the base R way and check some examples every time I wanted to handle such cases!
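For reference, one common base R pattern builds the condition object by hand with structure() and then passes it to stop(). The sketch below is only an illustration: the constructor get_val_error() and the function get_val_base() are hypothetical names, and the input is passed in explicitly instead of being drawn with runif() so that the result is reproducible.

```r
# Hypothetical constructor for a custom error condition in base R.
# A condition is a list with `message` and `call`, plus any extra fields,
# and a class vector that ends with "error" and "condition".
get_val_error <- function(val) {
  structure(
    class = c("get_val_error", "error", "condition"),
    list(message = "Can't get val", call = NULL, val = val)
  )
}

# Deterministic stand-in for get_val(): the value is an argument (assumption)
get_val_base <- function(val) {
  if (val < 0) {
    stop(get_val_error(val))  # stop() also accepts a condition object
  }
  val
}

base_cnd <- tryCatch(get_val_base(-3), error = function(cnd) cnd)

class(base_cnd)  # "get_val_error" "error" "condition"
base_cnd$val     # -3
```

It works, but the class vector and the list layout have to be remembered and written out each time, which is the part that is easy to get wrong.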
So rlang provides functions that correspond to the base R ones as follows:
rlang | base R |
---|---|
abort() | stop() |
warn() | warning() |
inform() | message() |
rlang functions are designed to deal with condition objects and create custom ones easily, unlike base R functions, which are focused on messages.
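To make the correspondence in the table concrete for the two non-error cases, here is a minimal sketch using warn() and inform() with a custom class. It assumes a recent rlang release, where the class argument is named class (the version used in this post calls it .subclass). The class names and the extra field old_name are invented for the example.

```r
# Signal a classed warning with an extra data field, then capture it
deprecated_cnd <- tryCatch(
  rlang::warn(
    "`old_arg` is deprecated",
    class = "deprecated_arg_warning",  # hypothetical subclass
    old_name = "old_arg"               # hypothetical metadata field
  ),
  warning = function(cnd) cnd
)

# Signal a classed message and capture it
info_cnd <- tryCatch(
  rlang::inform("Using default settings", class = "default_settings_message"),
  message = function(cnd) cnd
)

inherits(deprecated_cnd, "deprecated_arg_warning")  # TRUE
deprecated_cnd$old_name                             # "old_arg"
inherits(info_cnd, "default_settings_message")      # TRUE
```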
Custom conditions
abort() versus stop()
To clarify the difference, let’s modify get_val() and use abort() instead of stop(). Here you can see three arguments passed to abort():
- message: the error message, similar to the one passed to stop() in the previous example.
- .subclass: a subclass of the condition to differentiate errors.
- val: the particular value that caused the error.
You can pass more values to abort(), and it will return a custom error object with a list of all these values.
## define get_val() to simulate random input values
get_val <- function() {
  val <- runif(1, -10, 10)
  ## raise a custom error for negative values
  if (val < 0) {
    rlang::abort(message = "Can't get val",
                 .subclass = "get_val_error",
                 val = val)
  } else {
    val
  }
}
To inspect the custom error object returned by get_val(), you can use tryCatch() and assign the value to custom_cnd.
## define an error handler to return the custom error object
custom_cnd <- tryCatch(error = function(cnd) cnd, get_val())
Notice that:
- the error object has the main classes in addition to the defined subclass get_val_error.
- the value val which caused the error is available, and you can access it using custom_cnd$val.
## inspect custom_cnd
str(custom_cnd, max.level = 1)
List of 5
$ message: chr "Can't get val"
$ call : NULL
$ trace :List of 3
..- attr(*, "class")= chr "rlang_trace"
$ parent : NULL
$ val : num -2.86
- attr(*, "class")= chr [1:4] "get_val_error" "rlang_error" "error" "condition"
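Because the subclass is part of the class vector, tryCatch() can register a handler for get_val_error specifically and let other errors pass through. The following is a minimal sketch, assuming a recent rlang release where the class argument is named class instead of .subclass. get_val_demo() is a hypothetical, deterministic variant of get_val() that takes the value as an argument.

```r
# Deterministic stand-in for get_val() (assumption: value passed in explicitly)
get_val_demo <- function(val) {
  if (val < 0) {
    rlang::abort("Can't get val", class = "get_val_error", val = val)
  }
  val
}

# The handler is named after the subclass, so it only sees get_val_error
# conditions; the offending value is available as cnd$val.
recovered <- tryCatch(
  get_val_demo(-2.5),
  get_val_error = function(cnd) abs(cnd$val)
)
recovered  # 2.5

# An unrelated error is not caught by the subclass handler and reaches
# the outer, generic error handler instead.
other <- tryCatch(
  tryCatch(stop("something else"), get_val_error = function(cnd) "handled"),
  error = function(cnd) "not a get_val_error"
)
other  # "not a get_val_error"
```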
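Compared with the error object returned by stop() in the previous section, which carried only message and call, the one returned by rlang::abort() in this example also carries trace, parent and the custom field val, and its class vector starts with the subclass get_val_error.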
So now we have more metadata about the error and a specific subclass. How can we use this with chained functions?
Error messages (Catch, modify, rethrow)
Let’s say we want to get a more precise message when we call mult_val(), which calls get_val(). For instance, a message like:
“Can’t calculate value because get_val() raised an error as val was negative (-1.5648)”
We can define an error handler get_val_handler() to access the values returned in the custom error object thrown by get_val(), then build a message based on these values.
What get_val_handler() basically does is:
- define a basic error message, “Can’t calculate value”, that will be shown in any case.
- check the class of the error object thrown by get_val(). If it inherits from the specific subclass get_val_error, the message is extended to include the value of val.
- throw a new error with the final message and the subclass mult_val_error.
## define an error handler to modify the message
get_val_handler <- function(cnd) {
  msg <- "Can't calculate value"
  if (inherits(cnd, "get_val_error")) {
    msg <- paste0(msg, " as `val` passed to `get_val()` equals (", cnd$val, ")")
  }
  rlang::abort(msg, "mult_val_error")
}
So now if you use get_val_handler() with get_val() inside mult_val(), you basically say:
“If you catch an error from get_val(), get the value of val that caused the error and add it to the error message that will be thrown by mult_val()”
## use get_val_handler() with mult_val()
mult_val <- function(mult_by = 2) {
  x <- tryCatch(error = get_val_handler, get_val())
  x * mult_by
}
And here you can see an example of the modified error message including the value of val that caused the error, which you couldn’t access earlier with the default stop() function.
mult_val()
Error: Can't calculate value as `val` passed to `get_val()` equals (-2.8569)
Call `rlang::last_error()` to see a backtrace
If you want to inspect the error object returned by mult_val(), you can see the details including the new subclass mult_val_error, to which this error object belongs.
## define an error handler to return the error object
modified_cnd <- tryCatch(error = function(cnd) cnd, mult_val())
str(modified_cnd, max.level = 1)
List of 4
$ message: chr "Can't calculate value as `val` passed to `get_val()` equals (-2.8569)"
$ call : NULL
$ trace :List of 3
..- attr(*, "class")= chr "rlang_trace"
$ parent : NULL
- attr(*, "class")= chr [1:4] "mult_val_error" "rlang_error" "error" "condition"
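The parent field in the output above is NULL because the handler builds a fresh error. As an alternative to pasting the value into the message, abort() also accepts a parent argument that attaches the original condition to the new one. The sketch below is an assumption-laden illustration, not the approach used in this post: it assumes a recent rlang release (class instead of .subclass), and get_val_demo() and mult_val_demo() are hypothetical, deterministic variants of the functions above.

```r
# Deterministic stand-in for get_val() (assumption: value passed in explicitly)
get_val_demo <- function(val) {
  if (val < 0) {
    rlang::abort("Can't get val", class = "get_val_error", val = val)
  }
  val
}

# Rethrow with the original error attached as `parent`
mult_val_demo <- function(val, mult_by = 2) {
  x <- tryCatch(
    get_val_demo(val),
    get_val_error = function(cnd) {
      rlang::abort("Can't calculate value", class = "mult_val_error", parent = cnd)
    }
  )
  x * mult_by
}

chained_cnd <- tryCatch(mult_val_demo(-1), error = function(cnd) cnd)

inherits(chained_cnd, "mult_val_error")        # TRUE
inherits(chained_cnd$parent, "get_val_error")  # TRUE
chained_cnd$parent$val                         # -1
```

With this design the metadata stays on the original condition, so a caller further up can still read val without parsing the message text.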
Conclusion
Conditions can be our friends and guides through debugging and code review. How useful they are depends on how clear and concise the information they give is. rlang provides an easy way to deal with custom conditions. It allows us to pass metadata about the conditions, which helps in better reporting and handling. The previous examples showed the differences between rlang and base R conditions. They also demonstrated how to deal with custom conditions. Most importantly, we saw how to catch, modify and rethrow an error in chained functions. So what remains is to make use of this flexibility to handle conditions and write more informative messages.
Extra Resources
- The tidyverse style guide: Error messages, by Hadley Wickham
- Advanced R - Exceptions and debugging, by Hadley Wickham
- Reducing bewilderment by improving errors, by Lionel Henry
Notes
- The version of rlang used here is rlang_0.2.2.9001. I am not sure if everything works in the same way in earlier versions.
- The functions used in the examples are not perfect since they are not pure, they don’t perform a single clear task and their names do not reflect their purpose. However, they were just used for demo purposes.