-
Notifications
You must be signed in to change notification settings - Fork 13
Coding guidelines
Key principles to follow when producing code for ITHIM-R
Source: http://style.tidyverse.org/
Includes guidance on:
- Object names (variable names)
- Spacing
- Argument names (inputs to functions)
- Indentation (with some RStudio helpers add-ins)
- Long lines of text
- Comments
"There are only two hard things in Computer Science: cache invalidation and naming things."
--- Phil Karlton
Variable and function names should use only lowercase letters, numbers, and _
.
Use underscores (_
) (so called snake case) to separate words within a name.
# Good
day_one
day_1
# Bad
DayOne
dayone
Base R uses dots in function names (contrib.url()
) and class names
(data.frame
), but it's better to reserve dots exclusively for the S3 object
system. In S3, methods are given the name function.class
; if you also use
.
in function and class names, you end up with confusing methods like
as.data.frame.data.frame()
.
Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful (this is not easy!).
# Good
day_one
# Bad
first_day_of_the_month
djm1
Where possible, avoid re-using names of common functions and variables. This will cause confusion for the readers of your code.
# Bad
T <- FALSE
c <- 10
mean <- function(x) sum(x)
Put a space before and after =
when naming arguments in function calls.
Always put a space after a comma, never before, just like in regular English.
# Good
average <- mean(x, na.rm = TRUE)
# Bad
average<-mean(x,na.rm=TRUE)
average <- mean(x ,na.rm = TRUE)
Most infix operators (==
, +
, -
, <-
, etc.) should be surrounded by
spaces. The exception are those with relatively high precedence:
^
, :
, ::
, and :::
. To highlight operator precedence, use parentheses rather than irregular spacing.
# Good
height <- (feet * 12) + inches
sqrt(x^2 + y^2)
x <- 1:10
base::get
# Bad
height<-feet*12 + inches
sqrt(x ^ 2 + y ^ 2)
x <- 1 : 10
base :: get
Put spaces around ~
in double-sided formulas, but omit it in single-sided formulas.
# Good
y ~ x
tribble(
~col1, ~col2,
"a", "b"
)
# Bad
y~x
tribble(
~ col1, ~ col2,
"a", "b"
)
Place a space before (
when its used with a keyword like if
, for
or function
:
# Good
if (debug) show(x)
plot((x + y) ^ 2, (x - y) ^ 2)
# Bad
if(debug)show(x)
plot ( (x + y) ^ 2, (x - y) ^ 2)
Extra spacing (i.e., more than one space in a row) is ok if it improves
alignment of equal signs or assignments (<-
).
list(
total = a + b + c,
mean = (a + b + c) / n
)
Do not place spaces around code in parentheses or square brackets (unless there's a comma, in which case see above).
# Good
if (debug) do(x)
diamonds[5, ]
# Bad
if ( debug ) do(x) # No spaces around debug
x[1,] # Needs a space after the comma
x[1 ,] # Space goes after comma not before
Do not put a space after the tidy evaluation bang-bang (!!
) or bang-bang-bang (!!!
) operators.
# Good
call(!!xyz)
# Bad
call(!! xyz)
call( !! xyz)
call(! !xyz)
A function's arguments typically fall into two broad categories: one supplies the data to compute on; the other controls the details of computation. When you call a function, you typically omit the names of data arguments, because they are used so commonly. If you override the default value of an argument, use the full name:
# Good
mean(1:10, na.rm = TRUE)
# Bad
mean(x = 1:10, , FALSE)
mean(, TRUE, x = c(1:10, NA))
Avoid partial matching.
Curly braces, {}
, define the most important hierarchy of R code. To make this
hierarchy easy to see, always indent the code inside {}
by two spaces.
A symmetrical arrangement helps with finding related braces: the opening brace
is the last, the closing brace is the first non-whitespace character in a line.
Code that is related to a brace (e.g., an if
clause, a function declaration, a
trailing comma, ...) must be on the same line as the opening brace.
# Good
if (y < 0 && debug) {
message("y is negative")
}
if (y == 0) {
if (x > 0) {
log(x)
} else {
message("x is negative or zero")
}
} else {
y ^ x
}
test_that("call1 returns an ordered factor", {
expect_s3_class(call1(x, y), c("factor", "ordered"))
})
tryCatch(
{
x <- scan()
cat("Total: ", sum(x), "\n", sep = "")
},
interrupt = function(e) {
message("Aborted by user")
}
)
# Bad
if (y < 0 && debug)
message("Y is negative")
if (y == 0)
{
if (x > 0) {
log(x)
} else {
message("x is negative or zero")
}
} else { y ^ x }
It's ok to drop the curly braces if you have a very simple and short if
statement that fits on one line. If you have any doubt, it's better to use the
full form.
y <- 10
x <- if (y < 20) "Too low" else "Too high"
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
If a function call is too long to fit on a single line, use one line each for
the function name, each argument, and the closing )
.
This makes the code easier to read and to change later.
# Good
do_something_very_complicated(
something = "that",
requires = many,
arguments = "some of which may be long"
)
# Bad
do_something_very_complicated("that", requires, many, arguments,
"some of which may be long"
)
As described under [Argument names], you can omit the argument names for very common arguments (i.e. for arguments that are used in almost every invocation of the function). Short unnamed arguments can also go on the same line as the function name, even if the whole function call spans multiple lines.
map(x, f,
extra_argument_a = 10,
extra_argument_b = c(1, 43, 390, 210209)
)
You may also place several arguments on the same line if they are closely
related to each other, e.g., strings in calls to paste()
or stop()
. When
building strings, where possible match one line of code to one line of output.
# Good
paste0(
"Requirement: ", requires, "\n",
"Result: ", result, "\n"
)
# Bad
paste0(
"Requirement: ", requires,
"\n", "Result: ",
result, "\n")
Use <-
, not =
, for assignment.
# Good
x <- 5
# Bad
x = 5
Don't put ;
at the end of a line, and don't use ;
to put multiple commands
on one line.
Use "
, not '
, for quoting text. The only exception is when the text already
contains double quotes and no single quotes.
# Good
"Text"
'Text with "quotes"'
'<a href="http://style.tidyverse.org">A link</a>'
# Bad
'Text'
'Text with "double" and \'single\' quotes'
In data analysis code, use comments to record important findings and analysis decisions. If you need comments to explain what your code is doing, consider rewriting your code to be clearer. If you discover that you have more comments than code, considering switching to RMarkdown.
-
Overarching modules
-
Overarching project domains
- Coding guidelines
- Documentation
- Scientific publications
- Dissemination
- Case studies/applications