remove_redundancy {tidybulk}R Documentation

Drop redundant elements (e.g., samples) for which feature (e.g., transcript/gene) abundances are correlated

Description

remove_redundancy() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | for correlation method or | <DIMENSION 1> | <DIMENSION 2> | <...> | for reduced_dimensions method, and returns a 'tbl' with dropped elements (e.g., samples).

Usage

remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column,
  Dim_b_column
)

## S4 method for signature 'spec_tbl_df'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'tbl_df'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'tidybulk'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'SummarizedExperiment'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

Arguments

.data

A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |

.element

The name of the element column (normally samples).

.feature

The name of the feature column (normally transcripts/genes)

.abundance

The name of the column including the numerical value the clustering is based on (normally transcript abundance)

method

A character string. The cluster algorithm to use, ay the moment k-means is the only algorithm included.

of_samples

A boolean. In case the input is a tidybulk object, it indicates Whether the element column will be sample or transcript column

correlation_threshold

A real number between 0 and 1. For correlation based calculation.

top

An integer. How many top genes to select for correlation based method

log_transform

A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data)

Dim_a_column

A character string. For reduced_dimension based calculation. The column of one principal component

Dim_b_column

A character string. For reduced_dimension based calculation. The column of another principal component

Value

A tbl object with with dropped redundant elements (e.g., samples).

A tbl object with with dropped redundant elements (e.g., samples).

A tbl object with with dropped redundant elements (e.g., samples).

A tbl object with with dropped redundant elements (e.g., samples).

A 'SummarizedExperiment' object

A 'SummarizedExperiment' object

Examples



 tidybulk::counts_mini %>% 
 tidybulk(sample, transcript, count) %>% 
 identify_abundant() %>% 
   remove_redundancy(
	   .element = sample,
	   .feature = transcript,
	   	.abundance =  count,
	   	method = "correlation"
	   	)

counts.MDS = 
 tidybulk::counts_mini %>% 
 tidybulk(sample, transcript, count) %>% 
 identify_abundant() %>% 
  reduce_dimensions( method="MDS", .dims = 3)

remove_redundancy(
	counts.MDS,
	Dim_a_column = `Dim1`,
	Dim_b_column = `Dim2`,
	.element = sample,
  method = "reduced_dimensions"
)


[Package tidybulk version 1.2.1 Index]