R
R is an easy-to-learn high-level programming language especially tailored to data science applications.
Pros:
- Easy to learn
- High-level
- Dynamically typed
- Dynamic evaluation, enabling powerful functions and more readable specification of formulas and models
- Very reliable ecosystem compared to Python: packages need to meet many requirements and demonstrate compatibility with their dependencies.
- High-quality data science packages
- Large community and help sources
- Rapid prototyping
- Excellent glue language
- Many packages interface with C or other fast languages to achieve performant code
- Interoperable with many other languages
- Reliable and standardized tooling
- Fluent interface: supports chaining functions, yielding much more readable analysis scripts.
- Excellent notebook support
- Native vector/matrix support in the language
- Good documentation: functions are generally well-documented and include code examples.
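Two of the points above, function chaining and native vector/matrix support, can be illustrated briefly. This is a minimal sketch that assumes R 4.1 or later for the native pipe `|>`; the variable names are hypothetical.

```r
values <- c(1, 4, 9)

# Chaining: each step feeds the next, read left to right.
# Equivalent to sum(sqrt(values)).
total <- values |> sqrt() |> sum()

# Vectorized arithmetic: the operation applies to every element, no loop needed.
doubled <- values * 2

# Matrices are built in, including matrix multiplication via %*%.
m <- matrix(1:4, nrow = 2)
product <- m %*% m
```

On older R versions the same chain is often written with the `%>%` operator from the magrittr package.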
Cons:
- Lack of proper namespaces: you are guaranteed to stumble into import order conflicts or naming conflicts.
- Cryptic error messages: R's standard error messages are very confusing. The inconsistent/unreadable stack trace does not help either.
- Confusing object-oriented systems: there are at least three systems for implementing object-oriented programming to some degree (S3, S4, R6). Enough said.
- Inconsistent standard library: Quite a learning hurdle to get over.
- Dots argument (...): increases the flexibility of functions, but documentation of the options it accepts is often severely lacking.
- Very slow: This is not a deal-breaker though for a glue/scripting language, and for prototyping purposes.
- Dynamic evaluation: a powerful feature, but very confusing to new users and difficult for advanced users to implement.
- Environments: generally confusing for users, and can lead to fun surprises (e.g., serializing a simple linear formula can produce a 10 GB file).
- Dynamically typed: great for rapid prototyping, not for large frameworks.
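The environment surprise mentioned in the list above can be reproduced on a small scale. This is a minimal sketch, assuming the formula is created inside a function that also holds a large object; `make_formula` and `big` are hypothetical names introduced for illustration.

```r
make_formula <- function() {
  big <- numeric(1e6)  # roughly 8 MB of data living in the function's environment
  y ~ x                # the formula keeps a reference to that environment
}

f <- make_formula()

# Serializing the formula also serializes its environment, including `big`.
size_with_env <- length(serialize(f, NULL))

# Pointing the formula at the global environment drops the hidden payload,
# because the global environment is stored as a reference only.
environment(f) <- globalenv()
size_global <- length(serialize(f, NULL))

cat("with captured environment:", size_with_env, "bytes\n")
cat("with global environment:  ", size_global, "bytes\n")
```

Resetting the environment is only safe when the formula does not rely on variables from the function that created it.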
Sourcing
Running an R script
| Action | Code | Details |
| --- | --- | --- |
| Run R script file | | |
| Get sourced file from within script | as.character(sys.call(1))[2] | Does not work during start-up! |
| Get sourced directory from within script | dirname(as.character(sys.call(1))[2]) | |
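The code cell for running a script file is empty in the table. One common way, shown here as a sketch rather than the original entry, is base R's `source()`. The temporary script and the names `script_path` and `script_env` are made up for the example.

```r
# Create a small throwaway script so the example is self-contained.
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "answer <- 6 * 7",
  "greeting <- 'hello from the script'"
), script_path)

# Evaluate the script in its own environment to avoid cluttering the workspace.
script_env <- new.env()
source(script_path, local = script_env)

print(script_env$answer)
print(script_env$greeting)

# From a shell, the same file could be run with: Rscript path/to/script.R
```

Note that `sys.call(1)` returns the unevaluated call, so the expressions in the table recover the path only when `source()` is the outermost call and receives a literal string rather than a variable.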
Options
| Action | Code | Details |
| --- | --- | --- |
| Enable debug (browse) on error | | |
| Treat warnings as errors | | |
| More concise (readable) traceback | options(error=function() traceback(max.lines=3)) | |
| Disable scientific notation | | |
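Several code cells in this table are empty. The sketch below shows base R options that are commonly used for these actions. Treating them as the intended entries is an assumption, and the variable names are hypothetical.

```r
# options() returns the previous values, which makes restoring them easy.
old_opts <- options(
  warn = 2,      # promote warnings to errors
  scipen = 999   # strongly prefer fixed over scientific notation
)

# With warn = 2, the coercion warning below surfaces as an error.
result <- tryCatch(
  as.integer("not a number"),
  error = function(e) "warning became an error"
)

plain <- format(1e10)  # fixed notation instead of 1e+10

options(old_opts)      # restore the earlier settings

# Entering the debugger on error only makes sense in an interactive session.
if (interactive()) {
  options(error = recover)
}
```

`options(error = NULL)` resets the error handler to the default behaviour.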
Version checks
| Action | Code | Details |
| --- | --- | --- |
| Run code only if the R version is at least the given version number | if (compareVersion(paste(version$major, version$minor, sep='.'), '3.6.0') >= 0) | |
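The same check can be written more compactly with `getRversion()`, which returns a version object that compares directly against a string. This is a small sketch comparing both forms; `min_version` is a hypothetical variable.

```r
min_version <- "3.6.0"

# Form used in the table: build the version string and compare it.
via_compare <- compareVersion(
  paste(version$major, version$minor, sep = "."),
  min_version
) >= 0

# Shorter alternative using the version object from base R.
via_getRversion <- getRversion() >= min_version

if (via_getRversion) {
  message("Running on R ", getRversion(), ", which is at least ", min_version)
}
```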
Random number generation
| Action | Code | Details |
| --- | --- | --- |
| Use legacy RNG (R<3.6) for newer R versions | if (compareVersion(paste(version$major, version$minor, sep='.'), '3.6.0') >= 0) { RNGkind(sample.kind='Rounding') } | A fix to reproduce RNG of R 3.5.0, for reproducing old results |
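When the legacy sampler is needed for only part of a script, the previous setting can be saved and restored afterwards. This is a sketch under that assumption; `previous` and `legacy_draw` are hypothetical names, and the seed is arbitrary.

```r
if (getRversion() >= "3.6.0") {
  previous <- RNGkind()  # c(kind, normal.kind, sample.kind)

  # Switching to "Rounding" emits a warning about the non-uniform sampler.
  suppressWarnings(RNGkind(sample.kind = "Rounding"))
  set.seed(1)
  legacy_draw <- sample(10)  # drawn with the pre-3.6 sampling algorithm

  # Restore the sampler that was active before.
  RNGkind(sample.kind = previous[3])
} else {
  # Older R versions already use the legacy sampler.
  set.seed(1)
  legacy_draw <- sample(10)
}

print(legacy_draw)
```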
Output
| Action | Code | Details |
| --- | --- | --- |
| Output object | | |
| Output string | | |
| Print object | | |
| Print string | | |
| Show message | | |
| Show warning | | |
| Show warning right now | warning('warning', immediate.=TRUE) | |
| Trigger and show error | | |
| Redirect output to file | | Use sink() to restore |
| Capture any output as string | txt = capture.output({...}) | |
| Suppress all output | | |
| Suppress automatic printing of object in interactive mode | | |
| Suppress all output | | '/dev/null' on linux? |
| Suppress messages | | |
| Suppress package messages on load | suppressPackageStartupMessages({...}) | |
| Suppress warnings | | |
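Many code cells in this table are empty. The sketch below walks through base R functions that are commonly used for these actions. Matching them to the rows is an assumption, and all variable names are hypothetical. `nullfile()` requires R 3.6.0 or later and resolves to the platform's null device.

```r
x <- c(a = 1, b = 2)

print(x)                   # print an object
cat("a plain string\n")    # print a string without quotes or an index prefix
message("progress note")   # diagnostic message, written to stderr
invisible(x)               # return a value without auto-printing it

# Capture printed output as a character vector, one element per line.
captured <- capture.output(print(x))

# Redirect output to a file; a bare sink() restores normal output.
log_file <- tempfile(fileext = ".log")
sink(log_file)
print(x)
sink()
logged <- readLines(log_file)

# Discard output entirely by sinking into the null device.
sink(nullfile())
print(x)
sink()

# Silence messages or warnings produced by a block of code.
quiet_value <- suppressMessages({
  message("hidden")
  1
})
coerced <- suppressWarnings(as.integer("x"))  # NA, without the coercion warning

# Trigger an error and handle it instead of aborting.
caught <- tryCatch(
  stop("something failed"),
  error = function(e) conditionMessage(e)
)
```

`sink()` affects regular output only; messages and warnings go to stderr and need the `suppress*` wrappers or `sink(type = "message")`.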