Debugging code in any language can be tricky. Even with the many debugging tools available, I know most of us still tend to lean on the `print("Here")` trick. So I thought I’d share a little strategy that I like to use in R.
Particularly when you’re working on package development, certain bugs can be tough to reach. If you haven’t learned to use `browser()` in R, I highly recommend it! Even so, it can still be frustrating to use, especially when you’re dealing with vectorized code or loops. Stopping the code at the problematic element or iteration can be a pain. Sometimes you need to work with the data right at that spot and spend some time experimenting with different ways to fix the problem.
So here’s what I like to do. Let’s say you have a buggy function:
library(dplyr)

buggy_function <- function(x, y) {
  step1 <- mtcars %>%
    mutate(
      flag = if_else(gear > x && carb > y, TRUE, FALSE)
    )

  step1 %>%
    group_by(flag) %>%
    summarize(
      mean = mean(mpg),
      sd = sd(mpg)
    )
}
buggy_function(3, 3)
## # A tibble: 1 × 3
##   flag   mean    sd
##   <lgl> <dbl> <dbl>
## 1 TRUE   20.1  6.03
Hm. Not what I expected. If you dig into `mtcars`, you’ll see that there should be some `FALSE` records for the `flag` variable we created. The first step here would be digging into the `step1` dataset to see what’s happening. That’s easy enough using the `browser()` function.
buggy_function <- function(x, y) {
  step1 <- mtcars %>%
    mutate(
      flag = if_else(gear > x && carb > y, TRUE, FALSE)
    )

  browser()

  step1 %>%
    group_by(flag) %>%
    summarize(
      mean = mean(mpg),
      sd = sd(mpg)
    )
}
buggy_function(3, 3)
Called from: buggy_function(3, 3)
Browse[1]>
This lets us pause the function at this exact point and explore its environment. For this example that is pretty straightforward, but you will inevitably hit more complex scenarios. What if your issue happens on the 112th iteration of some loop or vectorized call? Here I’m using `mtcars`, but what if the starting dataset that `step1` is built from is dynamic? In those cases, testing out code while using `browser()` can be a pain. It’s easy to inadvertently exit the browser and lose the environment you were debugging in. At that point, you need to start it up again and navigate your way back to the problem section.
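For the 112th-iteration case, one option is the `expr` argument of `browser()`, which opens the debugger only when the expression evaluates to `TRUE`. The sketch below is a minimal illustration rather than part of the example above: `slow_step()`, `stop_at`, and the square-root work are hypothetical stand-ins, and the `interactive()` guard is my own addition so the script still runs unattended.

```r
# Hypothetical function: the loop body stands in for whatever work is failing.
slow_step <- function(values, stop_at = 112L) {
  results <- numeric(length(values))
  for (i in seq_along(values)) {
    # Pause only on the iteration of interest, and only in an interactive
    # session; everywhere else browser() returns immediately.
    browser(expr = interactive() && i == stop_at)
    results[i] <- sqrt(values[i])
  }
  results
}

out <- slow_step(1:200)
```

That gets you back to the right spot faster, but the environment still disappears as soon as you leave the browser.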
So here’s a pretty simple technique to work around this problem. Move the troublesome data into your global environment so you can experiment freely. How would we do that?
> buggy_function(3, 3)
Called from: buggy_function(3, 3)
Browse[1]> assign('step1', step1, envir=globalenv())
From within the browser, you can use the `assign()` function to copy an object from the executing environment into another one. Put simply, we’re grabbing that data so we can play with it outside of the browser. Here, I’m creating a variable named `step1` in the global environment from the object `step1` inside the function we’re debugging. (Keep in mind that this overwrites any existing `step1` in your global environment.) From there, I can experiment from outside the browser.
step1 %>% select(gear, carb, flag)
##   gear carb flag
## 1    4    4 TRUE
## 2    4    4 TRUE
## 3    4    1 TRUE
## 4    3    1 TRUE
## 5    3    2 TRUE
## 6    3    1 TRUE
With a little experimenting, I can find my issue. Within my `if_else()` call, the condition I’m using is written incorrectly:
step1$gear > 3 && step1$carb > 3
## [1] TRUE
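Whoops! It returns a single `TRUE` instead of one value per row. That’s because `&` and `&&` function differently (learn more here): `&` compares two vectors element by element, while `&&` is meant for single values and here only looked at the first element of each side. Note that R 4.3.0 and later raise an error when `&&` is given a vector longer than one, so on a recent version this bug fails loudly instead of silently flagging every row. Switching to `&`, I can see my mistake: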
step1$gear > 3 & step1$carb > 3
##  [1]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
## [14] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [27] FALSE FALSE  TRUE  TRUE  TRUE FALSE
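To see the difference in isolation, here is a small sketch on made-up vectors (the values are illustrative, not pulled from `mtcars`). The `tryCatch()` wrapper is my own addition: depending on your R version, calling `&&` on longer vectors either falls back to the first elements or stops with an error, and this way the snippet runs in both cases.

```r
# Illustrative values only.
gear <- c(4, 4, 3)
carb <- c(4, 1, 1)

# `&` works element by element and returns one logical per position.
vectorized <- gear > 3 & carb > 3
print(vectorized)  # TRUE FALSE FALSE

# `&&` is meant for length-one conditions, e.g. inside if ().
# Older R versions use only the first elements here; newer ones signal an
# error, which is captured and returned as its message.
scalar <- tryCatch(
  gear > 3 && carb > 3,
  error = function(e) conditionMessage(e)
)
print(scalar)
```

Either way, `&&` never produces the row-by-row result that `mutate()` needs.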
And finally I can fix my function!
buggy_function <- function(x, y) {
  step1 <- mtcars %>%
    mutate(
      flag = if_else(gear > x & carb > y, TRUE, FALSE)
    )

  step1 %>%
    group_by(flag) %>%
    summarize(
      mean = mean(mpg),
      sd = sd(mpg)
    )
}
buggy_function(3, 3)
## # A tibble: 2 × 3
##   flag   mean    sd
##   <lgl> <dbl> <dbl>
## 1 FALSE  20.5  6.67
## 2 TRUE   18.5  2.40
And I’m good to go! I’ve found this quite helpful and hope you do too. Do you have any favorite debugging techniques that you’d like to share?
You can check out the repository for this post right here.