Datapasta provides RStudio addins and functions that give you complete freedom to copy-paste data to and from your source editor, formatted for immediate use. Note: repeated use has been known to cause titillation and giddiness.
Places I’ve found this power useful:
Writing `dplyr::filter( .. %in% ..)` and `c()` expressions with a LOT less typing and fiddling.

Typical usage takes full advantage of addins within RStudio, however `datapasta` can be used with any R editor, even just the terminal. The typical RStudio case is described in full detail below, followed by the fallback behaviour.
## tribble_paste()
You can copy this HTML table of Brisbane weather forecasts:
X | Location | Min | Max |
---|---|---|---|
Partly cloudy. | Brisbane | 19 | 29 |
Partly cloudy. | Brisbane Airport | 18 | 27 |
Possible shower. | Beaudesert | 15 | 30 |
Partly cloudy. | Chermside | 17 | 29 |
Shower or two. Possible storm. | Gatton | 15 | 32 |
Possible shower. | Ipswich | 15 | 30 |
Partly cloudy. | Logan Central | 18 | 29 |
Mostly sunny. | Manly | 20 | 26 |
Partly cloudy. | Mount Gravatt | 17 | 28 |
Possible shower. | Oxley | 17 | 30 |
Partly cloudy. | Redcliffe | 19 | 27 |
And make this appear at the current cursor:
tibble::tribble(
~X, ~Location, ~Min, ~Max,
"Partly cloudy.", "Brisbane", 19L, 29L,
"Partly cloudy.", "Brisbane Airport", 18L, 27L,
"Possible shower.", "Beaudesert", 15L, 30L,
"Partly cloudy.", "Chermside", 17L, 29L,
"Shower or two. Possible storm.", "Gatton", 15L, 32L,
"Possible shower.", "Ipswich", 15L, 30L,
"Partly cloudy.", "Logan Central", 18L, 29L,
"Mostly sunny.", "Manly", 20L, 26L,
"Partly cloudy.", "Mount Gravatt", 17L, 28L,
"Possible shower.", "Oxley", 17L, 30L,
"Partly cloudy.", "Redcliffe", 19L, 27L
)
`tibble::tribble()`, or ‘transposed tibble’, is a really neat function that allows a tibble to be written in human-readable format (thanks be to Hadley). To paste data as a `tribble()` call, just copy the table header and data rows, then paste into the source editor using the addin `Paste as tribble`. For best results, assign the addin to a memorable keyboard shortcut, e.g. `ctrl + shift + t`. See Customizing Keyboard Shortcuts.
`tribble_paste()` is a flexible function that guesses the separator and types of the data it pulls from the clipboard. Mostly this seems to work well. Occasionally it epic-fails. The supported separators are `|` (pipe), `\t` (tab), `,` (comma) and `;` (semicolon). Most data copied from the internet or spreadsheets will be tab delimited. It will also attempt to recognise a lack of a header row and create a default for you, although this is not always possible.
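
To give a feel for what separator guessing involves, here is a minimal base R sketch. It is not datapasta's actual implementation: the function name `guess_separator()` and the rule it uses (a separator must appear the same non-zero number of times on every line) are assumptions made for illustration.

```r
# Hypothetical helper: pick the candidate separator that splits every
# line into the same number of fields. Not datapasta's real algorithm.
guess_separator <- function(lines, candidates = c("|", "\t", ",", ";")) {
  if (length(lines) == 0) return(NA_character_)
  # Count occurrences of each candidate on each line.
  counts <- sapply(candidates, function(sep) {
    lengths(regmatches(lines, gregexpr(sep, lines, fixed = TRUE)))
  })
  counts <- matrix(counts, nrow = length(lines),
                   dimnames = list(NULL, candidates))
  # Keep candidates that occur on every line, equally often.
  consistent <- apply(counts, 2, function(n) n[1] > 0 && all(n == n[1]))
  if (!any(consistent)) return(NA_character_)
  # Among those, prefer the one that occurs most often per line.
  ok <- candidates[consistent]
  ok[which.max(counts[1, consistent])]
}

weather <- c("X\tLocation\tMin\tMax",
             "Partly cloudy.\tBrisbane\t19\t29",
             "Shower or two. Possible storm.\tGatton\t15\t32")
sep <- guess_separator(weather)
fields <- strsplit(weather, sep, fixed = TRUE)
```

A rule this simple fails on ragged rows or separators that also appear inside values, which is roughly the kind of input where guessing can go wrong.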
## vector_paste()
A list could be a row or column of a spreadsheet or intermediate output. With the `Paste as vector` addin you can go from something like:
Mint Fedora Debian Ubuntu OpenSUSE
or
Mint, Fedora, Debian, Ubuntu, OpenSUSE
or
Mint
Fedora
Debian
Ubuntu
OpenSUSE
to
c("Mint", "Fedora", "Debian", "Ubuntu", "OpenSUSE")
This is pasted into the source editor at the current cursor.
Just like `tribble_paste()`, `vector_paste()` has a flexible parser that can guess the type and separator of the data. The supported separators are `|` (pipe), `\t` (tab), `,` (comma), `;` (semicolon) and end of line. The recommended keyboard shortcut is `ctrl + alt + shift + v`.
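
The list-to-vector conversion can be sketched in a few lines of base R. This is an illustration only, not datapasta's parser: `parse_to_vector_call()` is a hypothetical name, and the fallback to whitespace splitting when none of the listed separators is present is an assumption.

```r
# Hypothetical helper: turn a delimited string into the text of a c() call.
parse_to_vector_call <- function(text) {
  # Split on pipe, tab, comma, semicolon or end of line.
  elems <- strsplit(text, "[|\t,;\n]")[[1]]
  elems <- trimws(elems)
  elems <- elems[nzchar(elems)]
  # Assumption: with no listed separator present, split on whitespace.
  if (length(elems) == 1) elems <- strsplit(elems, "\\s+")[[1]]
  # Leave elements bare if they all parse as numbers, otherwise quote them.
  nums <- suppressWarnings(as.numeric(elems))
  body <- if (!anyNA(nums)) elems else paste0('"', elems, '"')
  paste0("c(", paste(body, collapse = ", "), ")")
}

from_commas   <- parse_to_vector_call("Mint, Fedora, Debian, Ubuntu, OpenSUSE")
from_lines    <- parse_to_vector_call("Mint\nFedora\nDebian\nUbuntu\nOpenSUSE")
from_spaces   <- parse_to_vector_call("Mint Fedora Debian Ubuntu OpenSUSE")
from_numbers  <- parse_to_vector_call("1;2;3")
```

All three text inputs produce the same call, which is the behaviour the addin aims for with its separator guessing.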
## vector_paste_vertical()
Given the same types of list inputs as above, the `Paste as vector (vertical)` addin pastes the output with each element on its own line, e.g.:
c("Mint",
  "Fedora",
  "Debian",
  "Ubuntu",
  "OpenSUSE")
This is much nicer for long lists. I have found this is actually the version I use more often. I recommend using `ctrl + shift + v` as the keyboard shortcut.
## Pasting as a data.frame with df_paste()
The parser here is identical to `tribble_paste()` and has all the same type and separator guessing goodness. The difference is the output will be a formatted call to `base::data.frame()`. Some sensible line wrapping rules etc. are implemented. Useful for purists and educators alike. Special thanks to Jonathan Carroll for contributing this feature.
So the Brisbane weather table from above becomes:
data.frame(
X = c("Partly cloudy.", "Partly cloudy.", "Possible shower.",
"Partly cloudy.", "Shower or two. Possible storm.",
"Possible shower.", "Partly cloudy.", "Mostly sunny.", "Partly cloudy.",
"Possible shower.", "Partly cloudy."),
Location = c("Brisbane", "Brisbane Airport", "Beaudesert", "Chermside",
"Gatton", "Ipswich", "Logan Central", "Manly",
"Mount Gravatt", "Oxley", "Redcliffe"),
Min = c(19, 18, 15, 17, 15, 15, 18, 20, 17, 17, 19),
Max = c(29, 27, 30, 29, 32, 30, 29, 26, 28, 30, 27)
)
For a shortcut you could try `ctrl + shift + d`.
## dpasta()
All of the above addin functions can be called directly with an R object argument. When run, this will result in the object being output at the current cursor, usually on the next line. To make things more magical, there is a single function, `dpasta()`, that will match the argument with the appropriate `_paste()` function based on its class. This means:
dpasta(head(iris))
results in:
data.frame(
Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4),
Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9),
Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7),
Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.4),
Species = as.factor(c("setosa", "setosa", "setosa", "setosa", "setosa",
"setosa"))
)
while passing a tibble will give you a `tribble()` call, for example:
tibble::tribble(
~manufacturer, ~model, ~displ, ~year, ~cyl, ~trans, ~drv, ~cty, ~hwy, ~fl,
"audi", "a4", 1.8, 1999L, 4L, "auto(l5)", "f", 18L, 29L, "p",
"audi", "a4", 1.8, 1999L, 4L, "manual(m5)", "f", 21L, 29L, "p",
"audi", "a4", 2, 2008L, 4L, "manual(m6)", "f", 20L, 31L, "p",
"audi", "a4", 2, 2008L, 4L, "auto(av)", "f", 21L, 30L, "p",
"audi", "a4", 2.8, 1999L, 6L, "auto(l5)", "f", 16L, 26L, "p",
"audi", "a4", 2.8, 1999L, 6L, "manual(m5)", "f", 18L, 26L, "p"
)
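
The class-based matching can be pictured with a small base R sketch. It is a guess at the shape of the logic, not datapasta's source: `choose_paste_fn()` is a hypothetical helper, and the exact classes checked and their order are assumptions drawn from the two examples above.

```r
# Hypothetical dispatcher: return the name of the _paste() function
# that would suit an object, based on its class.
choose_paste_fn <- function(x) {
  if (inherits(x, "tbl_df")) {
    "tribble_paste"   # tibbles become tribble() calls
  } else if (is.data.frame(x)) {
    "df_paste"        # plain data frames become data.frame() calls
  } else if (is.atomic(x)) {
    "vector_paste"    # atomic vectors become c() calls
  } else {
    stop("No paste function for class: ",
         paste(class(x), collapse = "/"))
  }
}

# A stand-in for a tibble that avoids needing the tibble package.
fake_tibble <- structure(head(iris),
                         class = c("tbl_df", "tbl", "data.frame"))

for_df     <- choose_paste_fn(head(iris))
for_tibble <- choose_paste_fn(fake_tibble)
for_vector <- choose_paste_fn(c("Mint", "Fedora"))
```

Checking for the tibble class first matters because a tibble is also a data frame, so the more specific test has to win.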
There are two addins that operate on RStudio cursor selections to make your life easier:
`Fiddle Selection` is intended to remove some fiddly tasks from your workflow. It can turn raw data like `1 2 3` into `c(1,2,3)`, then pivot from that to:
c(1,
  2,
  3)
and back again to `c(1,2,3)`. The parser here is really flexible too. It will accept data delimited by any combination of spaces, commas, and newlines.

`Fiddle Selection` can also reflow messy `tribble()` and `data.frame()` expressions into neatly aligned ones, say after hand editing.

`Toggle Vector Quotes` will convert a selected expression like `c(a,b,c)` to a quoted version, i.e. `c("a","b","c")`. If it’s already quoted it will convert the other way to a bare version. All elements will be quoted if there’s a mixture. It also works with vertically aligned expressions.
With the combination of these two you can get really lazy e.g. go from:
some stuff I typed
#To
c("some",
"stuff",
"I",
"typed") # mostly
in a couple of keystrokes!
Try assigning these addins to `ctrl + shift + f` and `ctrl + shift + q` respectively.
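
The quote-toggling rule described above (bare to quoted, quoted to bare, a mixture becomes fully quoted) can be sketched in base R. This is an illustration, not the addin's code: `toggle_vector_quotes()` is a hypothetical name, and the sketch assumes a single-line `c(...)` expression whose elements contain no commas.

```r
# Hypothetical helper: toggle quoting of the elements in a c() expression.
toggle_vector_quotes <- function(expr) {
  # Strip the surrounding c( ... ) and split the elements on commas.
  inner <- sub("^\\s*c\\((.*)\\)\\s*$", "\\1", expr)
  elems <- trimws(strsplit(inner, ",")[[1]])
  quoted <- grepl('^".*"$', elems)
  if (all(quoted)) {
    # Everything is quoted already: convert to the bare version.
    elems <- gsub('^"|"$', "", elems)
  } else {
    # Bare or mixed: quote whatever is not yet quoted.
    elems[!quoted] <- paste0('"', elems[!quoted], '"')
  }
  paste0("c(", paste(elems, collapse = ","), ")")
}

quoted_version <- toggle_vector_quotes("c(a,b,c)")
bare_version   <- toggle_vector_quotes(quoted_version)
mixed_version  <- toggle_vector_quotes('c(a,"b",c)')
```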
## dmdclip()
`dmdclip()` can help you take the data to somewhere that uses markdown format, for example a Stack Overflow question or GitHub issue. This function will copy the resulting formatted data object call to the clipboard, inserting 4 spaces at the head of each line, which is markdown syntax for a pre-formatted block.
So:
dmdclip(head(iris))
will copy the following to the clipboard:
    data.frame(
    Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4),
    Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9),
    Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7),
    Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.4),
    Species = as.factor(c("setosa", "setosa", "setosa", "setosa", "setosa",
    "setosa"))
    )
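
The markdown step itself is small enough to sketch in base R. This is not how `dmdclip()` is implemented internally: `to_markdown_block()` is a hypothetical helper, and the clipboard write is attempted only if the `clipr` package is installed and reports a usable clipboard.

```r
# Hypothetical helper: prefix every line with 4 spaces so markdown
# renders the lines as a pre-formatted block.
to_markdown_block <- function(code_lines) {
  paste0("    ", code_lines)
}

call_lines <- c("data.frame(",
                "  x = c(1, 2, 3),",
                "  y = c(\"a\", \"b\", \"c\")",
                ")")
md_block <- to_markdown_block(call_lines)
cat(md_block, sep = "\n")

# Optional: put the block on the clipboard when that is possible.
if (requireNamespace("clipr", quietly = TRUE) && clipr::clipr_available()) {
  clipr::write_clip(md_block)
}
```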
The `rstudioapi` package enables the calling of addins and output to the cursor. If the API is not detected, all the `_paste()` functions, and `dpasta()`, will output their text to the console, ready for copying and pasting to an editor window.

In this scenario you may wish to avoid installation of the `rstudioapi` package dependency. Use `install.packages("datapasta", dependencies = "Depends")` to avoid API installation, but be sure to follow up with `install.packages(c("readr","clipr"))`.

Note: The `dpasta()` function can be used without `clipr` installed, but you’re missing out on a fair amount of awesomeness if you limit yourself to that.
Custom behaviour can be created by taking advantage of the `_construct()` variants of the `_paste()` functions, as these return their output as an R object which can then be written to an appropriate buffer or clipboard.

For example, if you copied the Brisbane weather forecast from above to the clipboard and then called:
trib_call <- tribble_construct()
`trib_call` now contains the tribble call as a character vector. You could then write this wherever you need it, for example with `cat()` or `writeLines()`.
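
As a hedged sketch of that workflow, the example below passes a small data frame to `tribble_construct()` instead of reading the clipboard, then writes the result to a temporary file. The `weather` data frame and the choice of a temp file as the destination are made up for illustration, and the example assumes `datapasta` is installed.

```r
library(datapasta)

# Illustrative stand-in for the copied weather table.
weather <- data.frame(
  Location = c("Brisbane", "Manly"),
  Min = c(19L, 20L),
  Max = c(29L, 26L),
  stringsAsFactors = FALSE
)

# Build the tribble call as text rather than pasting it at the cursor.
trib_call <- tribble_construct(weather)

# Write the text to a destination of your choosing, here a temp file.
out_file <- tempfile(fileext = ".R")
writeLines(trib_call, out_file)
```

The same text could just as well go to `cat()` for the console or to a clipboard helper.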
For your protection, `datapasta` will initially refuse to output R objects of 200 or more rows. Up the row limit for your specific scenario with `dp_set_max_rows(n)`. Large numbers of rows could take a long time to format. In extreme cases you could crash your R/RStudio session.

Use `dp_set_decimal_mark(",")` to handle numbers like `3,14`.
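
To see why the decimal mark matters, here is a small base R illustration of the same idea. It uses `type.convert()` rather than datapasta's own parser, so it shows the effect of the setting, not how `dp_set_decimal_mark()` works internally.

```r
raw <- c("3,14", "2,71", "1,41")

# With the default decimal mark (".") these values stay as text.
as_text <- type.convert(raw, as.is = TRUE)

# Declaring "," as the decimal mark lets them parse as numbers.
as_numbers <- type.convert(raw, dec = ",", as.is = TRUE)
```

Without the right decimal mark, a column like this would be treated as character data and come out quoted.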