pairing_tables {CellaRepertorium} | R Documentation |
A contingency table of every combination of cluster_idx up to table_order is generated. Combinations that are found in at least min_expansion cells are reported. All cells that have these combinations are returned, as well as cells that have only orphan_level matching cluster_idx.
pairing_tables(
  ccdb,
  ranking_key = "grp_rank",
  table_order = 2,
  min_expansion = 2,
  orphan_level = 1,
  cluster_keys = character(),
  cluster_whitelist = NULL,
  cluster_blacklist = NULL
)
ccdb |
|
ranking_key |
field in |
table_order |
Integer larger than 1. The order of cluster_idx that will be paired, e.g., table_order = 2 means that the first and second highest ranked contigs will be sought and paired in each cell |
min_expansion |
the minimal number of times a pairing needs to occur for it to be reported |
orphan_level |
Integer in interval [1, |
cluster_keys |
optional |
cluster_whitelist |
a table of pairings or clusters that should always be reported. Here the clusters must be named "cluster_idx.1", "cluster_idx.2" (if order-2 pairs are being selected) rather than with 'ccdb$cluster_pk' |
cluster_blacklist |
a table of pairings or clusters that will never be reported. Must be named as per |
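The naming requirement for cluster_whitelist can be sketched in base R. This is only an illustration: the cluster values "1", "2", "3" are made up, and whether the values must have the same type as the cluster identifiers in ccdb is an assumption to verify against your own data.

```r
# Build a whitelist whose columns follow the "cluster_idx.<k>" naming
# described above. The cluster values are hypothetical placeholders.
table_order <- 2
key_names <- paste0("cluster_idx.", seq_len(table_order))

whitelist <- data.frame(
  c("1", "2"),   # first-ranked cluster of each pairing
  c("2", "3"),   # second-ranked cluster of each pairing
  stringsAsFactors = FALSE
)
names(whitelist) <- key_names
print(whitelist)

# With a ContigCellDB object `ccdb` at hand, the table would be passed as:
# pt <- pairing_tables(ccdb, table_order = table_order,
#                      cluster_whitelist = whitelist)
```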
For example, if table_order=2 and min_expansion=2 then heavy/light or alpha/beta pairs found two or more times will be returned (as well as alpha-alpha pairs, etc, if those are present). If orphan_level=1 then all cells that share just a single chain with an expanded clone will be returned.
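The roles of min_expansion and orphan_level can be illustrated on toy data in base R. This is a sketch of one reading of the description above, not the package's implementation; the data frame `cells` and its values are hypothetical, and orphans are matched position-wise here as a simplifying assumption.

```r
# Toy data: one row per cell with its first- and second-ranked cluster.
cells <- data.frame(
  cell = 1:5,
  cluster_idx.1 = c("A", "A", "A", "C", "D"),
  cluster_idx.2 = c("B", "B", "E", "F", "B"),
  stringsAsFactors = FALSE
)

# Count how many cells carry each (cluster_idx.1, cluster_idx.2) pairing.
pair_counts <- aggregate(cell ~ cluster_idx.1 + cluster_idx.2,
                         data = cells, FUN = length)
names(pair_counts)[3] <- "n"

# Keep pairings seen in at least `min_expansion` cells.
min_expansion <- 2
expanded <- pair_counts[pair_counts$n >= min_expansion, ]

# Cells that carry an expanded pairing.
cell_key <- paste(cells$cluster_idx.1, cells$cluster_idx.2)
expanded_key <- paste(expanded$cluster_idx.1, expanded$cluster_idx.2)
in_expanded <- cell_key %in% expanded_key

# Orphans (orphan_level = 1 in this sketch): cells sharing a single
# chain with an expanded pairing without carrying the full pairing.
shares_chain <- cells$cluster_idx.1 %in% expanded$cluster_idx.1 |
  cells$cluster_idx.2 %in% expanded$cluster_idx.2
orphans <- shares_chain & !in_expanded

cells[in_expanded | orphans, ]
```

In this toy data only the A/B pairing is expanded, so cells 1 and 2 are returned as expanded, and cells 3 and 5 are returned as orphans because each shares one chain with it.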
The cluster_idx.1_fct and cluster_idx.2_fct fields in cell_tbl, idx1_tbl, and idx2_tbl are cast to factors and ordered such that pairings will tend to occur along the diagonal when they are cross-tabulated. This facilitates plotting.
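Why the factor ordering matters can be seen in a small base R sketch. The reordering rule used here (levels taken in order of appearance after sorting by the first index) is a simple stand-in chosen for illustration, not the ordering the package applies; the data frame `pairs` is hypothetical.

```r
# Toy pairings: A goes with Z, B with X, C with Y.
pairs <- data.frame(
  idx1 = c("A", "A", "B", "C", "C"),
  idx2 = c("Z", "Z", "X", "Y", "Y"),
  stringsAsFactors = FALSE
)

# Default (alphabetical) levels scatter the pairings off the diagonal.
naive <- table(pairs$idx1, pairs$idx2)

# Reorder the levels of both factors so that partners share a position.
ord <- order(pairs$idx1)
idx1_fct <- factor(pairs$idx1, levels = unique(pairs$idx1[ord]))
idx2_fct <- factor(pairs$idx2, levels = unique(pairs$idx2[ord]))
aligned <- table(idx1_fct, idx2_fct)

sum(diag(naive))    # number of cells on the diagonal with default levels
sum(diag(aligned))  # number of cells on the diagonal after reordering
print(aligned)
```

With the reordered levels every pairing falls on the diagonal of the cross-tabulation, which is the layout a heatmap or tile plot of the pairing table benefits from.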
A list of tables. The cell_tbl is keyed by the cell_identifiers, with fields "cluster_idx.1", "cluster_idx.2", etc, identifying the contigs present in each cell. "cluster_idx.1_fct" and "cluster_idx.2_fct" cast these fields to factors, reordered to maximize the number of pairs along the diagonal. The idx1_tbl and idx2_tbl report information about the cluster_idx (passed in by feature_tbl). The cluster_pair_tbl reports all contig pairings found and the number of times each was observed.
library(dplyr)
library(CellaRepertorium)

tbl = tibble(clust_idx = gl(3, 2), cell_idx = rep(1:3, times = 2), contig_idx = 1:6)
ccdb = ContigCellDB(tbl, contig_pk = c('cell_idx', 'contig_idx'),
  cell_pk = 'cell_idx', cluster_pk = 'clust_idx')

# add `grp_rank` to ccdb$contig_tbl indicating how frequent a cluster is
ccdb = rank_prevalence_ccdb(ccdb, tie_break_keys = character())

# using `grp_rank` to determine pairing
# no pairs found twice
pt1 = pairing_tables(ccdb)

# all pairs found, found once.
pt2 = pairing_tables(ccdb, min_expansion = 1)
pt2$cell_tbl

tbl2 = bind_rows(tbl, tbl %>% mutate(cell_idx = rep(4:6, times = 2)))
ccdb2 = ContigCellDB(tbl2, contig_pk = c('cell_idx', 'contig_idx'),
  cell_pk = 'cell_idx', cluster_pk = 'clust_idx') %>%
  rank_prevalence_ccdb(tie_break_keys = character())

# all pairs found twice
pt3 = pairing_tables(ccdb2, min_expansion = 1)
pt3$cell_tbl

ccdb2$contig_tbl = ccdb2$contig_tbl %>%
  mutate(umis = 1, reads = 1, chain = rep(c('TRA', 'TRB'), times = 6))
ccdb2 = rank_chain_ccdb(ccdb2, tie_break_keys = character())
pt4 = pairing_tables(ccdb2, min_expansion = 1, table_order = 2)