Data Transformation with data.table :: CHEAT SHEET - BeOptimized
Basics
Manipulate columns with j
data.table is an extremely fast and memory-efficient package
for transforming data in R. It works by converting R's native
data frame objects into data.tables with new and enhanced
functionality. The basics of working with data.tables are:
EXTRACT
SUMMARIZE
dt[, .(x = sum(a))] - create a data.table with new
columns based on the summarized values of rows.
Summary functions like mean(), median(), min(),
max(), etc. may be used to summarize rows.
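As a minimal sketch of summarizing in j, the example below computes one and then several summary values. The data and the column names mean_a and max_b are made up for illustration.

```r
library(data.table)

# Hypothetical example data; columns a and b are illustrative.
dt <- data.table(a = c(1, 2, 6), b = c(10, 20, 30))

# One summary column: the result is a one-row data.table.
total <- dt[, .(x = sum(a))]

# Several summaries can be computed in the same call.
stats <- dt[, .(mean_a = mean(a), max_b = max(b))]
```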
COMPUTE COLUMNS*
setDT(df)* or as.data.table(df) - convert a data frame or a list to
a data.table.
dt[1:2, ] - subset rows based on row numbers.
dt[a > 5, ] - subset rows based on values in
one or more columns.
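A short sketch of row subsetting in i follows, on made-up data. Besides plain comparisons it uses %between%, %like%, and is.na(), which data.table accepts in i. Note that a logical NA in i is treated as FALSE, so rows where the condition is NA are dropped.

```r
library(data.table)

# Hypothetical example data with one missing value in a.
dt <- data.table(a = c(2, 6, 5, NA),
                 b = c("apple", "banana", "avocado", "cherry"))

first_two <- dt[1:2, ]                 # rows 1 and 2 by position
big       <- dt[a > 5, ]               # rows where a exceeds 5
in_range  <- dt[a %between% c(2, 5), ] # 2 <= a <= 5 (bounds inclusive)
starts_a  <- dt[b %like% "^a", ]       # b matches a regular expression
complete  <- dt[!is.na(a), ]           # rows where a is not missing
```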
dt[, j, keyby = .(a)] - group and
simultaneously sort rows according
to values in specified column(s).
COMMON GROUPED OPERATIONS
dt[, .(c = sum(b)), by = a] - summarize rows within groups.
Create a data.table
dt[, .(b, c)] - extract column(s) by name.
Subset rows using i
dt[, j, by = .(a)] - group rows by
values in specified column(s).
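The difference between by and keyby can be sketched on a small made-up table: by keeps groups in order of first appearance, while keyby also sorts the result by the grouping column and sets it as the key. The column name total is illustrative.

```r
library(data.table)

# Hypothetical example data; group 2 appears before group 1.
dt <- data.table(a = c(2, 1, 2, 1), b = c(1, 2, 3, 4))

# by: groups appear in the order they are first seen (2, then 1).
by_a <- dt[, .(total = sum(b)), by = .(a)]

# keyby: same grouping, but the result is sorted by a and keyed on it.
keyby_a <- dt[, .(total = sum(b)), keyby = .(a)]
```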
dt[, c(2)] - extract column(s) by number. Prefix
column numbers with "-" to drop.
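As an illustration of extracting columns in j, here is a minimal sketch on made-up data. In each case the result is itself a data.table.

```r
library(data.table)

# Hypothetical example data with three columns.
dt <- data.table(a = 1:3, b = c("x", "y", "z"), c = c(TRUE, FALSE, TRUE))

by_name   <- dt[, .(b, c)]  # columns b and c, selected by name
by_number <- dt[, c(2)]     # the second column (b), selected by position
dropped   <- dt[, -c(2)]    # every column except the second
```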
Take data.table dt,
subset rows using i,
and manipulate columns with j,
grouped according to by.
data.table(a = c(1, 2), b = c("a", "b")) - create a data.table from
scratch. Analogous to data.frame().
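The ways of creating a data.table can be sketched side by side. The data frame df below is a made-up example. as.data.table() returns a new object, whereas setDT() converts its argument in place.

```r
library(data.table)

# Built from scratch, in the same way as data.frame().
dt <- data.table(a = c(1, 2), b = c("a", "b"))

# Hypothetical existing data frame to convert.
df <- data.frame(a = c(1, 2), b = c("a", "b"))

dt_copy <- as.data.table(df)  # new data.table; df is still a plain data frame
was_plain <- !is.data.table(df)

setDT(df)                     # converts df itself, no assignment needed
```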
dt[i, j, by]
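A minimal sketch of the general form, using all three arguments at once on made-up data: rows are filtered by i, then j is evaluated within each group defined by by. The column name total is illustrative.

```r
library(data.table)

# Hypothetical example data.
dt <- data.table(a = c(1, 1, 2, 2, 3), b = c(10, 20, 30, 40, 50))

# i: keep rows with b > 10; j: sum b; by: one result row per value of a.
res <- dt[b > 10, .(total = sum(b)), by = a]
```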
data.tables are also data frames, so functions that work with data
frames also work with data.tables.
Group according to by
dt[, c := 1 + 2] - compute a column based on an
expression.
dt[a == 1, c := 1 + 2] - compute a column based
on an expression but only for a subset of rows.
dt[, `:=`(c = 1, d = 2)] - compute multiple
columns based on separate expressions.
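The three forms of := above can be sketched on made-up data. When := is combined with a row filter in i, rows outside the filter receive NA in the new column.

```r
library(data.table)

# Hypothetical example data; two separate tables keep the cases apart.
dt_all <- data.table(a = c(1, 2, 1))
dt_sub <- data.table(a = c(1, 2, 1))

dt_all[, c := 1 + 2]               # c is 3 in every row
dt_sub[a == 1, c := 1 + 2]         # c is 3 where a == 1, NA elsewhere
dt_all[, `:=`(d = 1, e = 2)]       # add columns d and e in one call
```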
dt[, .SD[1], by = a] - extract first row of groups.
dt[, .SD[.N], by = a] - extract last row of groups.
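As a sketch of these two idioms on made-up data: within each group, .SD holds that group's rows (without the grouping column) and .N is the number of rows in the group.

```r
library(data.table)

# Hypothetical example data: two rows in group 1, three in group 2.
dt <- data.table(a = c(1, 1, 2, 2, 2), b = c(10, 20, 30, 40, 50))

first_rows <- dt[, .SD[1], by = a]   # first row of each group
last_rows  <- dt[, .SD[.N], by = a]  # last row of each group
```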
Chaining
dt[...][...] - perform a sequence of data.table operations by
chaining multiple "[]".
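A minimal sketch of chaining on made-up data: the first "[]" summarizes by group and the second filters the summarized result. The column name total is illustrative.

```r
library(data.table)

# Hypothetical example data.
dt <- data.table(a = c(1, 1, 2, 2), b = c(1, 2, 3, 4))

# Step 1: sum b within each a. Step 2: keep groups whose total exceeds 3.
res <- dt[, .(total = sum(b)), by = a][total > 3]
```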
Functions for data.tables
REORDER
DELETE COLUMN
dt[, c := NULL] - delete a column.
dt[, c := sum(b), by = a] - create a new column and compute rows
within groups.
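The grouped assignment and the column deletion above can be sketched together on made-up data. With by, the group-level sum is repeated on every row of its group; assigning NULL then removes the column again.

```r
library(data.table)

# Hypothetical example data.
dt <- data.table(a = c(1, 1, 2), b = c(2, 2, 1))

dt[, c := sum(b), by = a]  # each row gets the sum of b for its group
grouped_c <- dt$c          # keep the values for inspection

dt[, c := NULL]            # drop column c
```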
setorder(dt, a, -b) - reorder a data.table
according to specified columns. Prefix
column names with "-" for descending
order.
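A short sketch of setorder() on made-up data. The table is reordered in place, so the result is not assigned.

```r
library(data.table)

# Hypothetical example data.
dt <- data.table(a = c(1, 2, 1), b = c(2, 2, 1))

# Sort by a ascending, then by b descending within equal values of a.
setorder(dt, a, -b)
```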
* SET FUNCTIONS AND :=
LOGICAL OPERATORS TO USE IN i
<  >  ==  is.na()  !is.na()  %in%  !  |  &  %like%  %between%
CONVERT COLUMN TYPE
dt[, b := as.integer(b)] - convert the type of a
column using as.integer(), as.numeric(),
as.character(), as.Date(), etc.
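As a sketch of type conversion on made-up data: assigning the converted values back with := replaces the column, and as.integer() truncates the fractional part.

```r
library(data.table)

# Hypothetical example data: b starts as a numeric (double) column.
dt <- data.table(b = c(1.5, 2.6))

dt[, b := as.integer(b)]  # b becomes an integer column
```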
data.table's functions prefixed with "set" and the operator ":="
work without " ................
................