Getting help in R: do as I say, not as I've done

Dec 17, 2019 rstats, help

“Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.” - Samuel Beckett

“Try not. Do. Or do not.” - Yoda

Why?

As promised on Twitter, this is my list of tips for getting help in R. My Twitter thread of tips was inspired by something I witnessed at the end of a webinar. The webinar presenter is the author of many popular R packages, and so the webinar drew a very large audience (hundreds of people). The webinar was great, and the host was effusive afterwards. Unfortunately, the praise went a bit off the rails, and included the host telling attendees that the presenter is “always on GitHub” and if they have additional questions to “post issues on the [*****] package repo” or email the presenter to get them answered. The host, who identified themselves as a student, obviously did not know any better, but as an experienced R user, code breaker,¹ and question asker, the host’s comments set off alarm bells in my head.

🔔 🔔 🔔

Why did this comment cause this reaction? Well, in my experience, when getting help in R, your absolute last resort should be to directly interact with the package author. Maintaining an R package takes a lot of work, and most R package writers maintain their package(s) in their spare time. They do not have time to answer every question.² Save yourself and R package authors time by using these tips to answer your own questions.

DIY

These tips are also important for becoming a more self-sufficient R user. Many people asking R questions are students of some kind. When I was a student, I was surrounded by people who were way better than me at this stuff, and I often went to them for help. Now that I’m not sharing office space with them, I have to fend more or less for myself. In my office, I’m now the expert in a lot of R things, so I can’t look to anyone else for help. If I have an error, I can’t just bring my computer to Eric and say “WTF” like I could when we were in school. By becoming self-sufficient, you are making an investment in your future. It’s harder than just asking [insert your favorite R package author here] what the answer is, but it’s better for you.³

The Tips

These tips are presented in a different order than in my Twitter thread. I reordered them after more careful thought and feedback from others in the thread. If you have an R question or problem, try these tips in the order presented to find your solution.

1. Read the documentation 📖

Did you forget an argument? Is something a factor when the function expects a character? If a function is giving you a weird error, read the documentation! If you don’t have the patience to read it all, run the examples in the documentation line by line first, then try to apply them to your context. R folks get famously testy online when you ask a question that is answered in the help files or vignettes.⁴
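Here is a rough sketch of what that looks like at the console. The sizes vector is made up for illustration; the point is only to show a help call, an examples call, and the classic factor-versus-character surprise that the documentation for factor warns about.

```r
# Open the help page for a function; ?factor is the shorthand form
help("factor")

# Run the examples from a help page so you can watch them work
example("nchar")

# Hypothetical data: numbers that were read in as a factor
sizes <- factor(c("10", "20", "5"))

# as.numeric() on a factor returns the internal level codes,
# not the numbers you see printed
codes <- as.numeric(sizes)

# Converting to character first recovers the values you meant
values <- as.numeric(as.character(sizes))

print(codes)
print(values)
```

If codes and values differ, the factor was the problem, and the help page would have told you so.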

2. Google the error 🔍

After reading the documentation, you should Google the error message. In your search, make sure you remove any object or function names that are unique to your environment. By using only the relevant words from the error message, you’re more likely to get a result on StackOverflow or GitHub where your problem has already been solved.

Here is a common example: Error in my_df$my_col : object of type 'closure' is not subsettable. Remove the reference to your data (here, my_df$my_col) and google away.
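One way to get just the searchable part of an error is to catch it and pull out the message. This is a minimal sketch, assuming a fresh session where df has not been assigned, so that df still refers to the built-in stats::df function; my_col is a made-up column name.

```r
# df here is the built-in stats::df function, not a data frame,
# so subsetting it with $ raises the "closure" error
msg <- tryCatch(
  df$my_col,
  error = function(e) conditionMessage(e)
)

# The message alone, without the "Error in ..." prefix that names your objects
print(msg)
```

The printed text is what goes into the search box; the object and column names stay out of it.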

3. Search smarter, not harder 🧠

Make your search smarter. Use Boolean searches on Google, GitHub, and/or StackOverflow with a few keywords. For example, say you have a question about plot margins and ggplot2. You might try typing ggplot2 AND margins into Google, or [ggplot2] margin on StackOverflow. For GitHub, go to the ggplot2 issues and search for is:issue margin.
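If you like to stay in the console, a search like the StackOverflow one can be built from R. The helper below, so_search_url, is hypothetical (it is not from any package), and the URL pattern is an assumption based on how the site's search page currently works.

```r
# Hypothetical helper: build a StackOverflow search URL from a tag and keywords
so_search_url <- function(tag, keywords) {
  # Square brackets restrict the search to questions with that tag
  query <- paste0("[", tag, "] ", keywords)
  # Percent-encode brackets and spaces so the query survives in a URL
  paste0(
    "https://stackoverflow.com/search?q=",
    utils::URLencode(query, reserved = TRUE)
  )
}

url <- so_search_url("ggplot2", "margin")
print(url)

# In an interactive session, this would open the search in your browser:
# utils::browseURL(url)
```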

4. Burn it all down 🔥

In the original thread, this was lower on the list at #8. That’s probably because it just didn’t occur to me earlier. I am pretty much constantly “burning it all down”: I never save my workspace, and every day and every switch between R projects begins with an empty workspace and no packages attached. By starting fresh every time, you are forcing yourself into better reproducibility. If you are afraid to clear your workspace out, that is a sign you need to be writing data to file more. (See e.g. readr::write_rds.) Before diving deeper into what’s going wrong for you, start over. This will often solve your problem.
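As a sketch of the "write it to file, then start fresh" habit: the example below uses base R's saveRDS and readRDS rather than readr::write_rds, just to avoid a package dependency. The cleaned data frame and the temporary file path are made up for illustration.

```r
# Hypothetical result of some slow cleaning step you don't want to redo
cleaned <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Write the object to disk (a temp file here; use a project path in real work)
rds_path <- file.path(tempdir(), "cleaned.rds")
saveRDS(cleaned, rds_path)

# ... restart R here so the workspace is empty and no packages are attached ...

# Read the object back in the fresh session and check it survived the trip
restored <- readRDS(rds_path)
print(all.equal(cleaned, restored))
```

Once the object is safely on disk, there is nothing in the workspace worth being afraid to lose.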

5. Make a “reprex” 🔁

This step involves making the problem as small as possible, which is known as making a “minimal reproducible example” or a “reprex.” In making a reprex, you should isolate what is happening, using as little information as possible. So if it’s a ggplot question, for example, make a toy data frame with the minimum number of rows and columns to reproduce the behavior. For more complete information on making a reprex, watch Jenny Bryan’s webinar on the reprex package. In my experience, if I’ve reached the point of making a reprex, creating one will help me figure out what’s going on and solve the problem without going elsewhere at least 75% of the time.
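Here is a small sketch of the toy-data idea. The toy data frame and its column names are invented for illustration, and the plotting call only runs if ggplot2 happens to be installed.

```r
# Toy data: the smallest data frame that still shows the behavior
toy <- data.frame(
  group = c("a", "b", "c"),
  value = c(1, 5, 3)
)

# dput() prints code that recreates the object, which others can paste and run
dput(toy)

# The smallest plotting call that reproduces the question (skipped without ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(toy, ggplot2::aes(x = group, y = value)) +
    ggplot2::geom_col()
}

# In an interactive session, copy the code above and run reprex::reprex()
# to render it together with its output, ready to paste into a question.
```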

6. Ask Twitter with #rstats 🐦

After combing the web and making a reprex, I usually turn to Twitter if I haven’t found an answer yet. Include the hashtag #rstats in your tweet so that it will be seen by more people in the R community. Twitter is best for open-ended or big picture questions that might not have a neat answer or may be hard to google. Here’s an example when I got some help from the Twitterverse. If your question requires a lot of code, make a GitHub gist containing your reprex and link to it in the tweet.

7. Phone a friend (or colleague) ☎️

Ask a colleague or a friend. By this point, your question will be very well-defined and if they know the answer and have a couple minutes, they’ll be able to get it to you quickly.

8. Sleep on it 😴

Sleep on it. This is especially helpful when you’ve been battling with an error for many hours or late into the night. Just stop. Then, try again the next day. If it’s not bedtime, find another project to work on that works a different part of your brain, like chores, reading a paper, or taking a walk. Let your subconscious mind work on it for a while, and you’ll find a solution when you return to the problem.

9. Ask your question on an online community 💬

Ask a question on StackOverflow or on RStudio Community. Don’t forget to use the reprex package to help others help you. If others can make the weird behavior/error happen on their systems, then they can actually help you! While you wait for an answer, repeat some previous steps to see if you can answer your own question. (If you can, make sure to post the solution for others in the future!) Wait at least 24 hours before going to the next step.

10. File an issue on GitHub 🙋

If you still haven’t figured it out, this is when you file a GitHub issue, and include the reprex with it. These days, many packages have an issue template. Make sure you follow it! While you wait, maybe repeat some previous steps to see if you can answer your own question. Be patient. Expect to wait a while for an answer if the package repo hasn’t been updated lately or if it’s incredibly popular.

The End?

The vast majority of the time, when I follow these steps in this order, I will get an answer to my question. I would say that since I’ve started thinking about getting R help in this way, 90% of my questions are answered after I’ve done step 5, and 98% are answered after I’ve done step 8. If you’ve made it all the way through #10, be prepared for the solution to be, “This cannot be done.” When this is the case, the more generous package maintainers will give you a reason why it is not possible, and may even invite you to implement the fix yourself and submit a pull request.

At this point, you’re on your own again. But luckily, you’ve developed some self-sufficiency and can begin hacking away if it’s an interesting and/or important enough problem to you. If not, just move on.

¹ Code breaker in the “OMG why doesn’t this code work” sense, not the kick-ass WWII female codebreakers way.
² I have one R package on CRAN and it’s very hard to motivate myself to work on it, since it has nothing to do with my work anymore and good alternatives exist.
³ I suspect my belief in self-sufficiency is also one of the reasons an intro stat student of mine once said I was “the worst professor ever of any class in high school or college.” (Hey, at least they gave me a degree & a promotion.) My students did not like it when I answered their questions with questions, Ron Swanson style.
⁴ Writing documentation is coding, too! If you don’t read it, you’re ignoring the authors’ hard work.

Sam Tyner-Monroe, Ph.D.
