listingtable
Synopsis
A listing table displays the raw data from one column of a source table, with optional summary sections interleaved between. The row and column structure of the listing table is defined by grouping columns from the source table. Each row of data has to have its own cell in the listing table, therefore the grouping applied along rows and columns must be exhaustive, i.e., no two rows may end up in the same group together.
Here is an example of a hypothetical clinical trial with drug concentration measurements of two participants with five time points each.
using DataFrames
using SummaryTables
using Statistics
data = DataFrame(
concentration = [1.2, 4.5, 2.0, 1.5, 0.1, 1.8, 3.2, 1.8, 1.2, 0.2],
id = repeat([1, 2], inner = 5),
time = repeat([0, 0.5, 1, 2, 3], 2)
)
listingtable(
data,
:concentration => "Concentration (ng/mL)",
rows = :id => "ID",
cols = :time => "Time (hr)",
summarize_rows = [
length => "N",
mean => "Mean",
std => "SD",
]
)
Time (hr) | |||||
0 | 0.5 | 1 | 2 | 3 | |
ID | Concentration (ng/mL) | ||||
1 | 1.2 | 4.5 | 2 | 1.5 | 0.1 |
2 | 1.8 | 3.2 | 1.8 | 1.2 | 0.2 |
N | 2 | 2 | 2 | 2 | 2 |
Mean | 1.5 | 3.85 | 1.9 | 1.35 | 0.15 |
SD | 0.424 | 0.919 | 0.141 | 0.212 | 0.0707 |
Argument 1: table
The first argument can be any object that is a table compatible with the Tables.jl
API. Here are some common examples:
DataFrame
using DataFrames
using SummaryTables
data = DataFrame(value = 1:6, group1 = repeat(["A", "B", "C"], 2), group2 = repeat(["D", "E"], inner = 3))
listingtable(data, :value, rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | value | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
NamedTuple
of Vector
s
using SummaryTables
data = (; value = 1:6, group1 = repeat(["A", "B", "C"], 2), group2 = repeat(["D", "E"], inner = 3))
listingtable(data, :value, rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | value | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
Vector
of NamedTuple
s
using SummaryTables
data = [
(value = 1, group1 = "A", group2 = "D")
(value = 2, group1 = "B", group2 = "D")
(value = 3, group1 = "C", group2 = "D")
(value = 4, group1 = "A", group2 = "E")
(value = 5, group1 = "B", group2 = "E")
(value = 6, group1 = "C", group2 = "E")
]
listingtable(data, :value, rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | value | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
Argument 2: variable
The second argument primarily selects the table column whose data should populate the cells of the listing table. The column name is specified with a Symbol
:
using DataFrames
using SummaryTables
data = DataFrame(
value1 = 1:6,
value2 = 7:12,
group1 = repeat(["A", "B", "C"], 2),
group2 = repeat(["D", "E"], inner = 3)
)
listingtable(data, :value1, rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | value1 | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
Here we choose to list column :value2
instead:
using DataFrames
using SummaryTables
data = DataFrame(
value1 = 1:6,
value2 = 7:12,
group1 = repeat(["A", "B", "C"], 2),
group2 = repeat(["D", "E"], inner = 3)
)
listingtable(data, :value2, rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | value2 | |
A | 7 | 10 |
B | 8 | 11 |
C | 9 | 12 |
By default, the variable name is used as the label as well. You can pass a different label as the second element of a Pair
using the =>
operators. The label can be of any type (refer to Types of cell values for a list).
using DataFrames
using SummaryTables
data = DataFrame(
value1 = 1:6,
value2 = 7:12,
group1 = repeat(["A", "B", "C"], 2),
group2 = repeat(["D", "E"], inner = 3)
)
listingtable(data, :value1 => "Value", rows = :group1, cols = :group2)
group2 | ||
D | E | |
group1 | Value | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
Optional argument 3: pagination
A listing table can grow large, in which case it may make sense to split it into multiple pages. You can pass a Pagination
object with rows
and / or cols
keyword arguments. The Int
you pass to rows
and / or cols
determines how many "sections" of the table along that dimension are included in a single page. If there are no summary statistics, a "section" is a single row or column. If there are summary statistics, a "section" includes all the rows or columns that are summarized together (as it would not make sense to split summarized groups across multiple pages).
If the pagination
argument is provided, the return type of listingtable
changes to PaginatedTable{ListingPageMetadata}
. This object has an interactive HTML representation for convenience the exact form of which should not be considered stable across SummaryTables versions. The PaginatedTable
should be deconstructed into separate Table
s when you want to include these in a document.
Here is an example listing table without pagination:
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:30,
group1 = repeat(["A", "B", "C", "D", "E"], 6),
group2 = repeat(["F", "G", "H", "I", "J", "K"], inner = 5)
)
listingtable(data, :value, rows = :group1, cols = :group2)
group2 | ||||||
F | G | H | I | J | K | |
group1 | value | |||||
A | 1 | 6 | 11 | 16 | 21 | 26 |
B | 2 | 7 | 12 | 17 | 22 | 27 |
C | 3 | 8 | 13 | 18 | 23 | 28 |
D | 4 | 9 | 14 | 19 | 24 | 29 |
E | 5 | 10 | 15 | 20 | 25 | 30 |
And here is the same table paginated into groups of 3 sections along the both the rows and columns. Note that there are only five rows in the original table, which is not divisible by 3, so two pages have only two rows.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:30,
group1 = repeat(["A", "B", "C", "D", "E"], 6),
group2 = repeat(["F", "G", "H", "I", "J", "K"], inner = 5)
)
listingtable(data, :value, Pagination(rows = 3, cols = 3), rows = :group1, cols = :group2)
Page 1
group2 | |||
F | G | H | |
group1 | value | ||
A | 1 | 6 | 11 |
B | 2 | 7 | 12 |
C | 3 | 8 | 13 |
We can also paginate only along the rows:
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:30,
group1 = repeat(["A", "B", "C", "D", "E"], 6),
group2 = repeat(["F", "G", "H", "I", "J", "K"], inner = 5)
)
listingtable(data, :value, Pagination(rows = 3), rows = :group1, cols = :group2)
Page 1
group2 | ||||||
F | G | H | I | J | K | |
group1 | value | |||||
A | 1 | 6 | 11 | 16 | 21 | 26 |
B | 2 | 7 | 12 | 17 | 22 | 27 |
C | 3 | 8 | 13 | 18 | 23 | 28 |
Or only along the columns:
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:30,
group1 = repeat(["A", "B", "C", "D", "E"], 6),
group2 = repeat(["F", "G", "H", "I", "J", "K"], inner = 5)
)
listingtable(data, :value, Pagination(cols = 3), rows = :group1, cols = :group2)
Page 1
group2 | |||
F | G | H | |
group1 | value | ||
A | 1 | 6 | 11 |
B | 2 | 7 | 12 |
C | 3 | 8 | 13 |
D | 4 | 9 | 14 |
E | 5 | 10 | 15 |
Keyword: rows
The rows
keyword determines the grouping structure along the rows. It can either be a Symbol
specifying a grouping column, a Pair{Symbol,Any}
where the second element overrides the group's label, or a Vector
with multiple groups of the aforementioned format.
This example uses a single group with default label.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, rows = :group)
group | value |
A | 1 |
B | 2 |
C | 3 |
D | 4 |
E | 5 |
The label can be overridden using the Pair
operator.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, rows = :group => "Group")
Group | value |
A | 1 |
B | 2 |
C | 3 |
D | 4 |
E | 5 |
Multiple groups are possible as well, in that case you get a nested display where the last group changes the fastest.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group1 = ["F", "F", "G", "G", "G"],
group2 = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, rows = [:group1, :group2 => "Group 2"])
group1 | Group 2 | value |
F | A | 1 |
B | 2 | |
G | C | 3 |
D | 4 | |
E | 5 | |
Keyword: cols
The cols
keyword determines the grouping structure along the columns. It can either be a Symbol
specifying a grouping column, a Pair{Symbol,Any}
where the second element overrides the group's label, or a Vector
with multiple groups of the aforementioned format.
This example uses a single group with default label.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, cols = :group)
group | ||||
A | B | C | D | E |
value | ||||
1 | 2 | 3 | 4 | 5 |
The label can be overridden using the Pair
operator.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, cols = :group => "Group")
Group | ||||
A | B | C | D | E |
value | ||||
1 | 2 | 3 | 4 | 5 |
Multiple groups are possible as well, in that case you get a nested display where the last group changes the fastest.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:5,
group1 = ["F", "F", "G", "G", "G"],
group2 = ["A", "B", "C", "D", "E"],
)
listingtable(data, :value, cols = [:group1, :group2 => "Group 2"])
group1 | ||||
F | G | |||
Group 2 | Group 2 | |||
A | B | C | D | E |
value | ||||
1 | 2 | 3 | 4 | 5 |
Keyword: summarize_rows
This keyword takes a list of aggregation functions which are used to summarize the listed variable along the rows. A summary function should take a vector of values (usually that will be numbers) and output one summary value. This value can be of any type that SummaryTables can show in a cell (refer to Types of cell values for a list).
using DataFrames
using SummaryTables
using Statistics: mean, std
data = DataFrame(
value = 1:24,
group1 = repeat(["A", "B", "C", "D", "E", "F"], 4),
group2 = repeat(["G", "H", "I", "J"], inner = 6),
)
mean_sd(values) = Concat(mean(values), " (", std(values), ")")
listingtable(data,
:value,
rows = :group1,
cols = :group2,
summarize_rows = [
mean,
std => "SD",
mean_sd => "Mean (SD)",
]
)
group2 | ||||
G | H | I | J | |
group1 | value | |||
A | 1 | 7 | 13 | 19 |
B | 2 | 8 | 14 | 20 |
C | 3 | 9 | 15 | 21 |
D | 4 | 10 | 16 | 22 |
E | 5 | 11 | 17 | 23 |
F | 6 | 12 | 18 | 24 |
mean | 3.5 | 9.5 | 15.5 | 21.5 |
SD | 1.87 | 1.87 | 1.87 | 1.87 |
Mean (SD) | 3.5 (1.87) | 9.5 (1.87) | 15.5 (1.87) | 21.5 (1.87) |
By default, one summary will be calculated over all rows of a given column. You can also choose to compute one summary for each group of a row grouping column, which makes sense if there is more than one row grouping column.
In this example, one summary is computed for each level of the group1
column.
using DataFrames
using SummaryTables
using Statistics: mean, std
data = DataFrame(
value = 1:24,
group1 = repeat(["X", "Y"], 12),
group2 = repeat(["A", "B", "C"], 8),
group3 = repeat(["G", "H", "I", "J"], inner = 6),
)
mean_sd(values) = Concat(mean(values), " (", std(values), ")")
listingtable(data,
:value,
rows = [:group1, :group2],
cols = :group3,
summarize_rows = :group1 => [
mean,
std => "SD",
mean_sd => "Mean (SD)",
]
)
group3 | |||||
G | H | I | J | ||
group1 | group2 | value | |||
X | A | 1 | 7 | 13 | 19 |
B | 5 | 11 | 17 | 23 | |
C | 3 | 9 | 15 | 21 | |
mean | 3 | 9 | 15 | 21 | |
SD | 2 | 2 | 2 | 2 | |
Mean (SD) | 3 (2) | 9 (2) | 15 (2) | 21 (2) | |
Y | A | 4 | 10 | 16 | 22 |
B | 2 | 8 | 14 | 20 | |
C | 6 | 12 | 18 | 24 | |
mean | 4 | 10 | 16 | 22 | |
SD | 2 | 2 | 2 | 2 | |
Mean (SD) | 4 (2) | 10 (2) | 16 (2) | 22 (2) | |
Keyword: summarize_cols
This keyword takes a list of aggregation functions which are used to summarize the listed variable along the columns. A summary function should take a vector of values (usually that will be numbers) and output one summary value. This value can be of any type that SummaryTables can show in a cell (refer to Types of cell values for a list).
using DataFrames
using SummaryTables
using Statistics: mean, std
data = DataFrame(
value = 1:24,
group1 = repeat(["A", "B", "C", "D", "E", "F"], 4),
group2 = repeat(["G", "H", "I", "J"], inner = 6),
)
mean_sd(values) = Concat(mean(values), " (", std(values), ")")
listingtable(data,
:value,
rows = :group1,
cols = :group2,
summarize_cols = [
mean,
std => "SD",
mean_sd => "Mean (SD)",
]
)
group2 | |||||||
G | H | I | J | ||||
group1 | value | mean | SD | Mean (SD) | |||
A | 1 | 7 | 13 | 19 | 10 | 7.75 | 10 (7.75) |
B | 2 | 8 | 14 | 20 | 11 | 7.75 | 11 (7.75) |
C | 3 | 9 | 15 | 21 | 12 | 7.75 | 12 (7.75) |
D | 4 | 10 | 16 | 22 | 13 | 7.75 | 13 (7.75) |
E | 5 | 11 | 17 | 23 | 14 | 7.75 | 14 (7.75) |
F | 6 | 12 | 18 | 24 | 15 | 7.75 | 15 (7.75) |
By default, one summary will be calculated over all columns of a given row. You can also choose to compute one summary for each group of a column grouping column, which makes sense if there is more than one column grouping column.
In this example, one summary is computed for each level of the group1
column.
using DataFrames
using SummaryTables
using Statistics: mean, std
data = DataFrame(
value = 1:24,
group1 = repeat(["X", "Y"], 12),
group2 = repeat(["A", "B", "C"], 8),
group3 = repeat(["G", "H", "I", "J"], inner = 6),
)
mean_sd(values) = Concat(mean(values), " (", std(values), ")")
listingtable(data,
:value,
cols = [:group1, :group2],
rows = :group3,
summarize_cols = :group1 => [
mean,
std => "SD",
mean_sd => "Mean (SD)",
]
)
group1 | group1 | |||||||||||
X | Y | |||||||||||
group2 | group2 | |||||||||||
A | B | C | A | B | C | |||||||
group3 | value | mean | SD | Mean (SD) | value | mean | SD | Mean (SD) | ||||
G | 1 | 5 | 3 | 3 | 2 | 3 (2) | 4 | 2 | 6 | 4 | 2 | 4 (2) |
H | 7 | 11 | 9 | 9 | 2 | 9 (2) | 10 | 8 | 12 | 10 | 2 | 10 (2) |
I | 13 | 17 | 15 | 15 | 2 | 15 (2) | 16 | 14 | 18 | 16 | 2 | 16 (2) |
J | 19 | 23 | 21 | 21 | 2 | 21 (2) | 22 | 20 | 24 | 22 | 2 | 22 (2) |
Keyword: variable_header
If you set variable_header = false
, you can hide the header cell with the variable label, which makes the table layout a little more compact.
Here is a table with the header cell:
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:6,
group1 = repeat(["A", "B", "C"], 2),
group2 = repeat(["D", "E"], inner = 3)
)
listingtable(data, :value, rows = :group1, cols = :group2, variable_header = true)
group2 | ||
D | E | |
group1 | value | |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
And here is a table without it:
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:6,
group1 = repeat(["A", "B", "C"], 2),
group2 = repeat(["D", "E"], inner = 3)
)
listingtable(data, :value, rows = :group1, cols = :group2, variable_header = false)
group2 | ||
group1 | D | E |
A | 1 | 4 |
B | 2 | 5 |
C | 3 | 6 |
Keyword: sort
By default, group entries are sorted. If you need to maintain the order of entries from your dataset, set sort = false
.
Notice how in the following two examples, the group indices are "dos"
, "tres"
, "uno"
when sorted, but "uno"
, "dos"
, "tres"
when not sorted. If we want to preserve the natural order of these groups ("uno", "dos", "tres" meaning "one", "two", "three" in Spanish but having a different alphabetical order) we need to set sort = false
.
using DataFrames
using SummaryTables
data = DataFrame(
value = 1:6,
group1 = repeat(["uno", "dos", "tres"], inner = 2),
group2 = repeat(["cuatro", "cinco"], 3),
)
listingtable(data, :value, rows = :group1, cols = :group2)
group2 | ||
cinco | cuatro | |
group1 | value | |
dos | 4 | 3 |
tres | 6 | 5 |
uno | 2 | 1 |
listingtable(data, :value, rows = :group1, cols = :group2, sort = false)
group2 | ||
cuatro | cinco | |
group1 | value | |
uno | 1 | 2 |
dos | 3 | 4 |
tres | 5 | 6 |
If you have multiple groups, sort = false
can lead to splitting of higher-level groups if they are not correctly ordered in the source data.
Compare the following two tables. In the second one, the group "A" is split by "B" so the label appears twice.
using SummaryTables
using DataFrames
data = DataFrame(
value = 1:4,
group1 = ["A", "B", "B", "A"],
group2 = ["C", "D", "C", "D"],
)
listingtable(data, :value, rows = [:group1, :group2])
group1 | group2 | value |
A | C | 1 |
D | 4 | |
B | C | 3 |
D | 2 | |
data = DataFrame(
value = 1:4,
group1 = ["A", "B", "B", "A"],
group2 = ["C", "D", "C", "D"],
)
listingtable(data, :value, rows = [:group1, :group2], sort = false)
group1 | group2 | value |
A | C | 1 |
B | D | 2 |
C | 3 | |
A | D | 4 |