MSstatsTMT : A package for protein significance analysis in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling

2020-10-13

This vignette introduces MSstatsTMT and summarizes the options of all of its functions.

A set of tools for detecting differentially abundant peptides and proteins in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling.

In addition to TMT labeling, the types of experiment that MSstatsTMT supports include metabolic labeling or iTRAQ experiments. LC-MS, SRM, and DIA (SWATH) experiments with label-free or labeled synthetic peptides can be analyzed with another R package, MSstats.

MSstatsTMT includes the following three steps for statistical testing:

Converters for different peptide quantification tools to get the input with required format: PDtoMSstatsTMTFormat, MaxQtoMSstatsTMTFormat, SpectroMinetoMSstatsTMTFormat and OpenMStoMSstatsTMTFormat.

Protein summarization based on peptide quantification data: proteinSummarization

Group comparison on protein quantification data: groupComparisonTMT

1. Converters for different peptide quantification tools

PDtoMSstatsTMTFormat()

Preprocess PSM data from Proteome Discoverer and convert into the required input format for MSstatsTMT.

Arguments

input : data name of Proteome Discoverer PSM output. Read the PSM sheet.
annotation : data frame which contains column Run, Fraction, TechRepMixture, Channel, Condition, BioReplicate, Mixture.
which.proteinid : Use Protein.Accessions(default) column for protein name. Master.Protein.Accessions can be used instead.
useNumProteinsColumn : TRUE(default) removes shared peptides using the information in the # Proteins column of the PSM sheet.
useUniquePeptide : TRUE(default) removes peptides that are assigned to more than one protein. We assume that each protein is quantified with unique peptides.
rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
rmPSM_withfewMea_withinRun : only for rmPSM_withMissing_withinRun = FALSE. TRUE(default) will remove the features that have 1 or 2 measurements within each Run.
removeProtein_with1Peptide : TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.
summaryforMultipleRows : sum(default) or max - when there are multiple measurements for certain PSM in certain run, select the PSM with the largest summation or maximal value.
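
The two missing-value options above can be pictured on a toy table with one row per PSM and one column per channel. The following is a minimal base R sketch under stated assumptions, not the package's implementation: the table, its column names (ch126 to ch129) and its values are made up for illustration.

```r
# Toy PSM table for one run: one row per PSM, one column per TMT channel.
# Column names and values are hypothetical.
psm <- data.frame(
  PSM   = c("pepA_2", "pepB_3", "pepC_2"),
  Run   = "run1",
  ch126 = c(100, NA, 50),
  ch127 = c(120, NA, NA),
  ch128 = c( 90, 30, 70),
  ch129 = c(110, NA, 65),
  stringsAsFactors = FALSE
)
channels <- c("ch126", "ch127", "ch128", "ch129")

# Number of non-missing channel intensities per PSM within the run
n_measured <- rowSums(!is.na(psm[, channels]))

# Analogue of rmPSM_withMissing_withinRun = TRUE: keep only complete rows
complete_only <- psm[n_measured == length(channels), ]

# Analogue of rmPSM_withfewMea_withinRun = TRUE: drop rows with 1 or 2 measurements
enough_measurements <- psm[n_measured > 2, ]

print(complete_only$PSM)
print(enough_measurements$PSM)
```

In this toy table only pepA_2 is complete, while pepC_2 has three measurements and therefore survives the less strict filter.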

Example

# read in PD PSM sheet
# raw.pd <- read.delim("161117_SILAC_HeLa_UPS1_TMT10_5Mixtures_3TechRep_UPSdB_Multiconsensus_PD22_Intensity_PSMs.txt")
head(raw.pd)
#>    Checked Confidence Identifying.Node PSM.Ambiguity
#> 1:   FALSE       High      Mascot (O4)   Unambiguous
#> 2:   FALSE       High      Mascot (K2)   Unambiguous
#> 3:   FALSE       High      Mascot (K2)   Unambiguous
#> 4:   FALSE       High      Mascot (F2)      Selected
#> 5:   FALSE       High      Mascot (K2)   Unambiguous
#> 6:   FALSE       High      Mascot (K2)   Unambiguous
#>                        Annotated.Sequence
#> 1: [K].gFQQILAGEYDHLPEQAFYMVGPIEEAVAk.[A]
#> 2:          [R].qYPWGVAEVENGEHcDFTILr.[N]
#> 3:              [R].dkPSVEPVEEYDYEDLk.[E]
#> 4:                      [R].hEHQVMLmr.[Q]
#> 5:       [R].dNLTLWTADNAGEEGGEAPQEPQS.[-]
#> 6:         [R].aLVAIGTHDLDTLSGPFTYTAk.[R]
#>                                                      Modifications Marked.as
#> 1:                                 N-Term(TMT6plex); K30(TMT6plex)        NA
#> 2: N-Term(TMT6plex); C15(Carbamidomethyl); R21(Label:13C(6)15N(4))        NA
#> 3:                         N-Term(TMT6plex); K2(Label); K17(Label)        NA
#> 4:         N-Term(TMT6plex); M8(Oxidation); R9(Label:13C(6)15N(4))        NA
#> 5:                                                N-Term(TMT6plex)        NA
#> 6:                                    N-Term(TMT6plex); K22(Label)        NA
#>    X..Protein.Groups X..Proteins Master.Protein.Accessions
#> 1:                 1           1                    P06576
#> 2:                 1           1                    Q16181
#> 3:                 1           1                    Q9Y450
#> 4:                 1           1                    Q15233
#> 5:                 1           1                    P31947
#> 6:                 1           1                    Q9NSD9
#>                                                            Master.Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    Protein.Accessions
#> 1:             P06576
#> 2:             Q16181
#> 3:             Q9Y450
#> 4:             Q15233
#> 5:             P31947
#> 6:             Q9NSD9
#>                                                                   Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    X..Missed.Cleavages Charge DeltaScore DeltaCn Rank Search.Engine.Rank
#> 1:                   0      3     1.0000       0    1                  1
#> 2:                   0      3     1.0000       0    1                  1
#> 3:                   1      3     0.9730       0    1                  1
#> 4:                   0      4     0.5250       0    1                  1
#> 5:                   0      3     1.0000       0    1                  1
#> 6:                   0      3     0.9783       0    1                  1
#>     m.z..Da. MH...Da. Theo..MH...Da. DeltaM..ppm. Deltam.z..Da. Activation.Type
#> 1: 1270.3249 3808.960       3808.966        -1.51      -0.00192             CID
#> 2:  920.4493 2759.333       2759.332         0.31       0.00028             CID
#> 3:  920.1605 2758.467       2758.461         2.08       0.00192             CID
#> 4:  359.6898 1435.737       1435.738        -0.04      -0.00002             CID
#> 5:  920.0943 2758.268       2758.264         1.53       0.00141             CID
#> 6:  919.8502 2757.536       2757.532         1.48       0.00136             CID
#>    MS.Order Isolation.Interference.... Average.Reporter.S.N
#> 1:      MS2                  47.955590                  8.7
#> 2:      MS2                   9.377507                  8.1
#> 3:      MS2                  38.317050                 17.8
#> 4:      MS2                  21.390040                 36.5
#> 5:      MS2                   0.000000                 16.7
#> 6:      MS2                  30.619960                 26.7
#>    Ion.Inject.Time..ms. RT..min. First.Scan
#> 1:               50.000 212.2487     112815
#> 2:                3.242 164.7507      87392
#> 3:               13.596 143.4534      74786
#> 4:               50.000  21.6426       6458
#> 5:                6.723 174.1863      92950
#> 6:                8.958 176.4863      94294
#>                                   Spectrum.File File.ID Abundance..126
#> 1: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_03.raw      F1       2548.326
#> 2: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      22861.765
#> 3: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      25504.083
#> 4: 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_02.raw     F10      13493.228
#> 5: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      64582.786
#> 6: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      35404.709
#>    Abundance..127N Abundance..127C Abundance..128N Abundance..128C
#> 1:        3231.929        2760.839        4111.639        3127.254
#> 2:       25817.946       23349.498       29449.609       25995.929
#> 3:       27740.450       25144.974       25754.579       29923.176
#> 4:       14674.490       11187.900       12831.495       13839.426
#> 5:       50576.417       47126.037       56285.129       46257.310
#> 6:       31905.852       30993.941       36854.351       37506.001
#>    Abundance..129N Abundance..129C Abundance..130N Abundance..130C
#> 1:        1874.163        2831.423        2298.401        3798.876
#> 2:       22955.769       30578.971       30660.488       38728.853
#> 3:       34097.637       31650.255       27632.692       23886.881
#> 4:       12441.353       13450.885       14777.844       13039.995
#> 5:       52634.885       49716.850       60660.574       55830.488
#> 6:       25703.444       38626.598       35447.942       33788.409
#>    Abundance..131 Quan.Info Ions.Score Identity.Strict Identity.Relaxed
#> 1:       3739.067        NA         90              28               21
#> 2:      25047.280        NA         76              24               17
#> 3:      35331.092        NA         74              30               23
#> 4:      12057.121        NA         40              25               18
#> 5:      40280.577        NA         38              21               14
#> 6:      32031.516        NA         46              29               22
#>    Expectation.Value Percolator.q.Value Percolator.PEP
#> 1:      7.038672e-09                  0      1.396e-05
#> 2:      6.298627e-08                  0      3.349e-07
#> 3:      4.318385e-07                  0      9.922e-07
#> 4:      3.351211e-04                  0      1.175e-04
#> 5:      2.152501e-04                  0      1.383e-05
#> 6:      2.060469e-04                  0      7.198e-05

# Read in annotation including condition and biological replicates per run and channel.
# Users should make this annotation file. It is not the output from Proteome Discoverer.
# annotation.pd <- read.csv(file="PD_Annotation.csv", header=TRUE)
head(annotation.pd)
#>                                            Run Fraction TechRepMixture Channel
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1     126
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    127N
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    127C
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    128N
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    128C
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    129N
#>   Condition  Mixture   BioReplicate
#> 1      Norm Mixture1  Mixture1_Norm
#> 2     0.667 Mixture1 Mixture1_0.667
#> 3     0.125 Mixture1 Mixture1_0.125
#> 4       0.5 Mixture1   Mixture1_0.5
#> 5         1 Mixture1     Mixture1_1
#> 6     0.125 Mixture1 Mixture1_0.125
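
Because the annotation is written by hand, it can help to check it before conversion. The following is a minimal base R sketch, not part of MSstatsTMT: the helper name missing_annotation_cols and the toy table are hypothetical, and only the required column names are taken from the argument list above.

```r
# Columns the converters expect in the annotation data frame (from the argument list)
required_cols <- c("Run", "Fraction", "TechRepMixture", "Channel",
                   "Condition", "BioReplicate", "Mixture")

# Hypothetical helper: returns the required columns missing from an annotation table
missing_annotation_cols <- function(annotation) {
  setdiff(required_cols, colnames(annotation))
}

# Toy annotation for two channels of one run (values mirror the example above)
toy_annotation <- data.frame(
  Run = "161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw",
  Fraction = 1,
  TechRepMixture = 1,
  Channel = c("126", "127N"),
  Condition = c("Norm", "0.667"),
  Mixture = "Mixture1",
  BioReplicate = c("Mixture1_Norm", "Mixture1_0.667"),
  stringsAsFactors = FALSE
)

# All required columns are present, so nothing is reported
print(missing_annotation_cols(toy_annotation))

# Dropping the second column (Fraction) is detected
print(missing_annotation_cols(toy_annotation[, -2]))
```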

# do not remove PSM with missing values within one run
input.pd <- PDtoMSstatsTMTFormat(raw.pd, annotation.pd)
#> ** Shared PSMs (assigned in multiple proteins) are removed.
#> ** 55 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.pd)
#>   ProteinName               PeptideSequence Charge
#> 1      P04406        [K].lISWYDNEFGYSNR.[V]      2
#> 2      Q9NSD9           [K].irPFAVAAVLr.[N]      3
#> 3      P04406    [K].lVINGNPITIFQErDPSk.[I]      3
#> 4      P04406          [R].vVDLmAHMASkE.[-]      3
#> 5      P06576      [R].dQEGQDVLLFIDNIFR.[F]      3
#> 6      P06576 [R].iPSAVGYQPTLATDMGTMQEr.[I]      3
#>                               PSM  Mixture TechRepMixture
#> 1        [K].lISWYDNEFGYSNR.[V]_2 Mixture1              1
#> 2           [K].irPFAVAAVLr.[N]_3 Mixture1              1
#> 3    [K].lVINGNPITIFQErDPSk.[I]_3 Mixture1              1
#> 4          [R].vVDLmAHMASkE.[-]_3 Mixture1              1
#> 5      [R].dQEGQDVLLFIDNIFR.[F]_3 Mixture1              1
#> 6 [R].iPSAVGYQPTLATDMGTMQEr.[I]_3 Mixture1              1
#>                                            Run Channel Condition  BioReplicate
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#>     Intensity
#> 1    8348.351
#> 2   28327.492
#> 3 1275010.965
#> 4   80589.877
#> 5    2231.389
#> 6  144854.307

# remove PSM with missing values within one run
input.pd.no.miss <- PDtoMSstatsTMTFormat(raw.pd, annotation.pd,
                                 rmPSM_withMissing_withinRun = TRUE)
#> ** Shared PSMs (assigned in multiple proteins) are removed.
#> ** Rows which has any missing value within a run were removed from that run.
#> ** 0 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.pd.no.miss)
#>   ProteinName          PeptideSequence Charge                        PSM
#> 1      P12277 [K].lAVEALSSLDGDLAGr.[Y]      3 [K].lAVEALSSLDGDLAGr.[Y]_3
#> 2      P04406   [K].lVINGNPITIFQEr.[D]      3   [K].lVINGNPITIFQEr.[D]_3
#> 3      Q16181     [K].dVTNNVHYENYr.[S]      3     [K].dVTNNVHYENYr.[S]_3
#> 4      P04406         [K].qASEGPLk.[G]      2         [K].qASEGPLk.[G]_2
#> 5      Q15233         [R].rQQEEMMr.[R]      3         [R].rQQEEMMr.[R]_3
#> 6      P06576 [R].dQEGQDVLLFIDNIFr.[F]      3 [R].dQEGQDVLLFIDNIFr.[F]_3
#>    Mixture TechRepMixture                                          Run Channel
#> 1 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#> 2 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#> 3 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#> 4 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#> 5 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#> 6 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126
#>   Condition  BioReplicate  Intensity
#> 1      Norm Mixture1_Norm  23037.057
#> 2      Norm Mixture1_Norm 349661.432
#> 3      Norm Mixture1_Norm  40699.454
#> 4      Norm Mixture1_Norm  13882.684
#> 5      Norm Mixture1_Norm   9302.419
#> 6      Norm Mixture1_Norm  12261.325

MaxQtoMSstatsTMTFormat()

Preprocess PSM-level data from MaxQuant and convert into the required input format for MSstatsTMT.

Arguments

evidence : name of evidence.txt data, which includes PSM-level data.
proteinGroups : name of proteinGroups.txt data, which contains the detailed information of protein identifications.
annotation : data frame which contains column Run, Fraction, TechRepMixture, Channel, Condition, BioReplicate, Mixture.
which.proteinid : Use Proteins(default) column for protein name. Leading.proteins or Leading.razor.proteins can be used instead. However, those columns can potentially include shared peptides.
rmProt_Only.identified.by.site : TRUE will remove proteins with ‘+’ in the ‘Only.identified.by.site’ column of proteinGroups.txt, i.e. proteins that were identified only by a modification site. FALSE is the default.
useUniquePeptide : TRUE(default) removes peptides that are assigned to more than one protein. We assume that each protein is quantified with unique peptides.
rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
rmPSM_withfewMea_withinRun : only for rmPSM_withMissing_withinRun = FALSE. TRUE(default) will remove the features that have 1 or 2 measurements within each Run.
removeProtein_with1Peptide : TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.
summaryforMultipleRows : sum(default) or max - when there are multiple measurements for certain PSM in certain run, select the PSM with the largest summation or maximal value.
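
The idea behind useUniquePeptide can be sketched in base R as follows. This is an illustration under stated assumptions rather than the converter's actual code: the toy table and its identifiers are made up, and a peptide is treated as shared when it is assigned to more than one protein.

```r
# Toy peptide-to-protein assignments (identifiers are hypothetical)
assignments <- data.frame(
  PeptideSequence = c("AAK", "AAK", "CCR", "DDK", "EEK", "EEK"),
  ProteinName     = c("P1",  "P2",  "P1",  "P2",  "P3",  "P3"),
  stringsAsFactors = FALSE
)

# Count the distinct proteins each peptide is assigned to
pairs <- unique(assignments[, c("PeptideSequence", "ProteinName")])
n_proteins <- table(pairs$PeptideSequence)

# Peptides assigned to more than one protein are treated as shared
shared <- names(n_proteins)[as.vector(n_proteins) > 1]

# Keep rows for unique peptides only
unique_only <- assignments[!assignments$PeptideSequence %in% shared, ]

print(shared)
print(unique(unique_only$PeptideSequence))
```

Here AAK maps to both P1 and P2 and is removed, whereas EEK appears twice but always for P3 and is kept.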

Example

# Read in MaxQuant files
# proteinGroups <- read.table("proteinGroups.txt", sep="\t", header=TRUE)

# evidence <- read.table("evidence.txt", sep="\t", header=TRUE)

# Users should make this annotation file. It is not the output from MaxQuant.
# annotation.mq <- read.csv(file="MQ_Annotation.csv", header=TRUE)

input.mq <- MaxQtoMSstatsTMTFormat(evidence, proteinGroups, annotation.mq)
#> ** + Contaminant, + Reverse, + Only.identified.by.site, proteins are removed.
#> ** PSMs, that have all zero intensities across channels in each run, are removed.
#> ** 2 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.mq)
#>   ProteinName             PeptideSequence Charge                           PSM
#> 1      O15042    AAAEIYEEFLAAFEGSDGNK(ly)      3    AAAEIYEEFLAAFEGSDGNK(ly)_3
#> 2      Q9P258           DGQILPVPNVVVR(ar)      3           DGQILPVPNVVVR(ar)_3
#> 3      Q96P70             ICPFTIAIFLK(ly)      3             ICPFTIAIFLK(ly)_3
#> 4      P36578         FCIWTESAFR(ar)K(ly)      3         FCIWTESAFR(ar)K(ly)_3
#> 5      Q9P258       AAAAAWEEPSSGNGTAR(ar)      2       AAAAAWEEPSSGNGTAR(ar)_2
#> 6      Q96P70 VWTANPQQFVEDEDDDTFSYTVR(ar)      3 VWTANPQQFVEDEDDDTFSYTVR(ar)_3
#>    Mixture TechRepMixture                                      Run   Channel
#> 1 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#> 2 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#> 3 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#> 4 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#> 5 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#> 6 Mixture1              1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 channel.0
#>    BioReplicate Condition Intensity
#> 1 Mixture1_Norm      Norm   1031.50
#> 2 Mixture1_Norm      Norm   2219.20
#> 3 Mixture1_Norm      Norm    478.17
#> 4 Mixture1_Norm      Norm    534.43
#> 5 Mixture1_Norm      Norm    866.26
#> 6 Mixture1_Norm      Norm    388.78

SpectroMinetoMSstatsTMTFormat()

Preprocess PSM data from SpectroMine and convert into the required input format for MSstatsTMT.

Arguments

input : data name of SpectroMine PSM output. Read PSM sheet.
annotation : data frame which contains column Run, Fraction, TechRepMixture, Channel, Condition, BioReplicate, Mixture.
filter_with_Qvalue : TRUE(default) will filter out the intensities whose EG.Qvalue is greater than qvalue_cutoff. Those intensities will be replaced with NA and will be considered as censored missing values for imputation purposes.
qvalue_cutoff : Cutoff for EG.Qvalue. Default is 0.01.
useUniquePeptide : TRUE(default) removes peptides that are assigned to more than one protein. We assume that each protein is quantified with unique peptides.
rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
rmPSM_withfewMea_withinRun : only for rmPSM_withMissing_withinRun = FALSE. TRUE(default) will remove the features that have 1 or 2 measurements within each Run.
removeProtein_with1Peptide : TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.
summaryforMultipleRows : sum(default) or max - when there are multiple measurements for certain PSM in certain run, select the PSM with the largest summation or maximal value.
remove_norm_channel : TRUE(default) removes Norm channels from protein level data.
remove_empty_channel : TRUE(default) removes Empty channels from protein level data.
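
The q-value filter described above replaces intensities with NA instead of dropping rows. A minimal base R sketch of that behavior, assuming a toy long-format table in which only the EG.Qvalue column name is taken from the text (the other names and all values are made up):

```r
qvalue_cutoff <- 0.01  # default cutoff stated above

# Toy long-format table; PSM and Intensity values are hypothetical
toy <- data.frame(
  PSM       = c("pepA_2", "pepB_3", "pepC_2"),
  EG.Qvalue = c(0.001, 0.05, 0.01),
  Intensity = c(1500, 800, 1200),
  stringsAsFactors = FALSE
)

# Replace, rather than drop, intensities whose q-value exceeds the cutoff
filtered <- toy
filtered$Intensity[filtered$EG.Qvalue > qvalue_cutoff] <- NA

print(filtered)
```

The row count is unchanged; only the intensity of pepB_3 becomes NA, and a q-value exactly equal to the cutoff is kept in this sketch.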

Example

# Read in SpectroMine PSM report
# raw.mine <- read.csv('20180831_095547_CID-OT-MS3-Short_PSM Report_20180831_103118.xls', sep="\t")

# Users should make this annotation file. It is not the output from SpectroMine
# annotation.mine <- read.csv(file="Mine_Annotation.csv", header=TRUE)

input.mine <- SpectroMinetoMSstatsTMTFormat(raw.mine, annotation.mine)
#> ** Intensities with great than 0.01 in PG.QValue are replaced with NA.
#> ** Intensities with great than 0.01 in EG.Qvalue are replaced with NA.
#> ** 0 rows have all NAs are removed.
#> ** All peptides are unique peptides in proteins.
#> ** 0 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
#> ** For peptides overlapped between fractions of 1_1, use the fraction with maximal average abundance.
#> ** Fractions belonging to same mixture have been combined.
head(input.mine)
#>   ProteinName                               PeptideSequence Charge
#> 1      Q9GZT9 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR_      3
#> 2      Q9NVA2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]_      3
#> 3      Q9NVA2                 _[TMT_Nter]SLDLVTMK[TMT_Lys]_      2
#> 4      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      3
#> 5      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      2
#> 6      P06753                    _[TMT_Nter]AADAEAEVASLNRR_      3
#>                                               PSM Mixture TechRepMixture Run
#> 1 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR__3       1              1 1_1
#> 2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]__3       1              1 1_1
#> 3                 _[TMT_Nter]SLDLVTMK[TMT_Lys]__2       1              1 1_1
#> 4      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__3       1              1 1_1
#> 5      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__2       1              1 1_1
#> 6                    _[TMT_Nter]AADAEAEVASLNRR__3       1              1 1_1
#>    Channel BioReplicate Condition  Intensity
#> 1 TMT6_126            1         3   382.1107
#> 2 TMT6_126            1         3 33554.1900
#> 3 TMT6_126            1         3 44713.6300
#> 4 TMT6_126            1         3 20877.8700
#> 5 TMT6_126            1         3   506.1669
#> 6 TMT6_126            1         3 10065.2800

OpenMStoMSstatsTMTFormat()

Preprocess MSstatsTMT report from OpenMS and convert into the required input format for MSstatsTMT.

Arguments

input: data name of MSstatsTMT report from OpenMS. Read csv file.
useUniquePeptide : TRUE(default) removes peptides that are assigned to more than one protein. We assume that each protein is quantified with unique peptides.
rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
rmPSM_withfewMea_withinRun : only for rmPSM_withMissing_withinRun = FALSE. TRUE(default) will remove the features that have 1 or 2 measurements within each Run.
removeProtein_with1Peptide : TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.
summaryforMultipleRows : sum(default) or max - when there are multiple measurements for certain PSM in certain run, select the PSM with the largest summation or maximal value.
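
The summaryforMultipleRows option can be illustrated on a toy table in which one PSM was measured twice in the same run. This base R sketch follows the description above (keep the row with the largest sum or the largest maximum across channels); the helper pick_row, the column names and the values are hypothetical, and the package's own code may differ.

```r
# Toy table: the same PSM was measured twice in one run (values are made up)
dup <- data.frame(
  PSM   = c("pepA_2", "pepA_2", "pepB_3"),
  Run   = "run1",
  ch126 = c(100, 300, 50),
  ch127 = c(200,  10, 60),
  ch128 = c(150,  20, 70),
  stringsAsFactors = FALSE
)
channels <- c("ch126", "ch127", "ch128")

# Hypothetical helper: for each PSM/run, keep the row with the largest score,
# where the score is the row sum or row maximum of the channel intensities
pick_row <- function(data, score_fun) {
  score <- apply(data[, channels], 1, score_fun, na.rm = TRUE)
  groups <- split(seq_len(nrow(data)), paste(data$PSM, data$Run))
  keep <- vapply(groups, function(idx) idx[which.max(score[idx])], integer(1))
  data[sort(keep), ]
}

by_sum <- pick_row(dup, sum)  # pepA_2: sums 450 vs 330, first row kept
by_max <- pick_row(dup, max)  # pepA_2: maxima 200 vs 300, second row kept
print(by_sum)
print(by_max)
```

The two criteria can select different rows for the same PSM, as they do for pepA_2 here.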

Example

# read in MSstatsTMT report from OpenMS
# raw.om <- read.csv("OpenMS_20200222/20200225_MSstatsTMT_OpenMS_Export.csv")
head(raw.om)
#>   RetentionTime          ProteinName                 PeptideSequence Charge
#> 1      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#> 2      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#> 3      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#> 4      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#> 5      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#> 6      2924.491 sp|P11679|K2C8_MOUSE .(TMT6plex)AEAETMYQIK(TMT6plex)      2
#>   Channel Condition BioReplicate   Run Mixture TechRepMixture Fraction
#> 1       1   Long_LF            1 1_1_3       1            1_1        3
#> 2       2   Long_LF            2 1_1_3       1            1_1        3
#> 3       3    Long_M            3 1_1_3       1            1_1        3
#> 4       6    Long_M            6 1_1_3       1            1_1        3
#> 5       5      Norm            5 1_1_3       1            1_1        3
#> 6       9      Norm            9 1_1_3       1            1_1        3
#>   Intensity
#> 1  5727.319
#> 2  6985.365
#> 3  4553.897
#> 4  5937.782
#> 5  5151.292
#> 6  6800.128
#>                                                                                                          Reference
#> 1 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324
#> 2 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324
#> 3 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324
#> 4 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324
#> 5 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324
#> 6 PAMI-176_Mouse_A-J_TMT_40ug_22pctACN_25cm_120min_20160223_OT.mzML_controllerType=0 controllerNumber=1 scan=11324

# the function only requires one input file
input.om <- OpenMStoMSstatsTMTFormat(raw.om)
#> Joining, by = c("RetentionTime", "ProteinName", "PeptideSequence", "Charge", "Run", "Reference")
#> ** PSMs, that have all zero intensities across channels in each run, are removed.
#> Joining, by = c("RetentionTime", "ProteinName", "PeptideSequence", "Charge", "Run", "Reference")
#> ** 2 features have 1 or 2 intensities across runs are removed.
#> Joining, by = c("Run", "Channel")
#> ** PSMs have been aggregated to peptide ions.
#> ** For peptides overlapped between fractions of 2_2_2, use the fraction with maximal average abundance.
#> ** For peptides overlapped between fractions of 3_3_3, use the fraction with maximal average abundance.
#> ** Fractions belonging to same mixture have been combined.
head(input.om)
#>            ProteinName                           PeptideSequence Charge
#> 1 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#> 2 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#> 3 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#> 4 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#> 5 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#> 6 sp|O08663|MAP2_MOUSE .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR      2
#>                                           PSM Mixture TechRepMixture   Run
#> 1 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#> 2 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#> 3 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#> 4 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#> 5 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#> 6 .(TMT6plex)GQEC(Carbamidomethyl)EYPPTQDGR_2       1            1_1 1_1_1
#>   Channel Condition BioReplicate Intensity
#> 1       1   Long_LF            1  18748.36
#> 2      10  Short_LF           10  15084.31
#> 3       2   Long_LF            2  19591.20
#> 4       3    Long_M            3  17800.54
#> 5       4  Short_LF            4  21316.78
#> 6       5      Norm            5  17607.60

# use MSstats for protein summarization
quant.msstats <- proteinSummarization(input.pd,
                                      method="msstats",
                                      global_norm=TRUE,
                                      reference_norm=TRUE,
                                      remove_norm_channel = TRUE,
                                      remove_empty_channel = TRUE)
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     4-29
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-33
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-29
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     1-28
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     1-30
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     2-30
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     4-31
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-30
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     5-30
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-31
#> # of Transitions/Peptide   1-1
#> 
#> Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     1   1     1 1    1
#> # of Technical Replicates      2   2     2 2    2
#>   |======================================================================| 100%
#> 
#> Summary of Features :
#>                          count
#> # 
of Protein 10 #> # of Peptides/Protein 3-31 #> # of Transitions/Peptide 1-1 #> #> Summary of Samples : #> 0.125 0.5 0.667 1 Norm #> # of MS runs 2 2 2 2 2 #> # of Biological Replicates 1 1 1 1 1 #> # of Technical Replicates 2 2 2 2 2 #> | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% #> #> Summary of Features : #> count #> # of Protein 10 #> # of Peptides/Protein 1-31 #> # of Transitions/Peptide 1-1 #> #> Summary of Samples : #> 0.125 0.5 0.667 1 Norm #> # of MS runs 2 2 2 2 2 #> # of Biological Replicates 1 1 1 1 1 #> # of Technical Replicates 2 2 2 2 2 #> | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% #> #> Summary of Features : #> count #> # of Protein 10 #> # of Peptides/Protein 3-34 #> # of Transitions/Peptide 1-1 #> #> Summary of Samples : #> 0.125 0.5 0.667 1 Norm #> # of MS runs 2 2 2 2 2 #> # of Biological Replicates 1 1 1 1 1 #> # of Technical Replicates 2 2 2 2 2 #> | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% 
| |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% #> #> Summary of Features : #> count #> # of Protein 10 #> # of Peptides/Protein 2-30 #> # of Transitions/Peptide 1-1 #> #> Summary of Samples : #> 0.125 0.5 0.667 1 Norm #> # of MS runs 2 2 2 2 2 #> # of Biological Replicates 1 1 1 1 1 #> # of Technical Replicates 2 2 2 2 2 #> | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% #> #> Summary of Features : #> count #> # of Protein 10 #> # of Peptides/Protein 5-32 #> # of Transitions/Peptide 1-1 #> #> Summary of Samples : #> 0.125 0.5 0.667 1 Norm #> # of MS runs 2 2 2 2 2 #> # of Biological Replicates 1 1 1 1 1 #> # of Technical Replicates 2 2 2 2 2 #> | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% head(quant.msstats) #> Run Protein Abundance Channel #> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.59812 127C #> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw 
P04406 16.55729 129N #> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.71783 128N #> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.67190 129C #> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.51106 127N #> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.49448 130C #> BioReplicate Condition TechRepMixture Mixture #> 1 Mixture1_0.125 0.125 1 Mixture1 #> 2 Mixture1_0.125 0.125 1 Mixture1 #> 3 Mixture1_0.5 0.5 1 Mixture1 #> 4 Mixture1_0.5 0.5 1 Mixture1 #> 5 Mixture1_0.667 0.667 1 Mixture1 #> 6 Mixture1_0.667 0.667 1 Mixture1 # use Median for protein summarization # since median method doesn't impute missing values, # we need to use the input data without missing values quant.median <- proteinSummarization(input.pd.no.miss, method="Median", global_norm=TRUE, reference_norm=TRUE, remove_norm_channel = TRUE, remove_empty_channel = TRUE) head(quant.median) #> Run Protein Abundance Channel #> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.32534 127C #> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.55383 127N #> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.51731 128C #> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.76108 128N #> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.52052 129C #> 7 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P12277 15.32590 129N #> BioReplicate Condition TechRepMixture Mixture #> 2 Mixture1_0.125 0.125 1 Mixture1 #> 3 Mixture1_0.667 0.667 1 Mixture1 #> 4 Mixture1_1 1 1 Mixture1 #> 5 Mixture1_0.5 0.5 1 Mixture1 #> 6 Mixture1_0.5 0.5 1 Mixture1 #> 7 Mixture1_0.125 0.125 1 Mixture1
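The comment above notes that the median method does not impute missing values, which is why it is run on input without missing values. A minimal base R sketch of that idea follows. It assumes one protein in one run, uses hypothetical toy intensities, and is not the package's own implementation (normalization and other steps are left out).

```r
# Toy log2 intensities (hypothetical values):
# rows are PSMs of one protein in one run, columns are TMT channels
psm <- matrix(
  c(15.1, 16.0, 14.8,
    15.4, 16.2, 15.0,
    15.3, 15.9,   NA,
    15.0, 16.1, 14.9),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("PSM", 1:4), c("126", "127N", "127C"))
)

# No imputation: drop every PSM that has a missing value in any channel
complete <- psm[complete.cases(psm), , drop = FALSE]

# One abundance per channel: the median over the remaining PSMs
abundance <- apply(complete, 2, median)
abundance
```

Here PSM3 is discarded as a whole, so all channels are summarized from the same set of PSMs.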

## Profile plot without norm channels and empty channels
dataProcessPlotsTMT(data.peptide = input.pd,
                    data.summarization = quant.msstats,
                    type = 'ProfilePlot',
                    width = 21, # adjust the figure width since there are 15 TMT runs.
                    height = 7)
#> Warning: Removed 16 rows containing missing values (geom_point).
#> Drew the Profile plot for P04406 ( 1 of 10 )
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Warning: Removed 1 row(s) containing missing values (geom_path).
#> Drew the Profile plot for P06576 ( 2 of 10 )
#> Warning: Removed 23 rows containing missing values (geom_point).
#> Warning: Removed 1 row(s) containing missing values (geom_path).
#> Drew the Profile plot for P12277 ( 3 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot for P23919 ( 4 of 10 )
#> Drew the Profile plot for P31947 ( 5 of 10 )
#> Warning: Removed 52 rows containing missing values (geom_point).
#> Warning: Removed 3 row(s) containing missing values (geom_path).
#> Drew the Profile plot for Q15233 ( 6 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot for Q16181 ( 7 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot for Q9NSD9 ( 8 of 10 )
#> Warning: Removed 8 rows containing missing values (geom_point).
#> Drew the Profile plot for Q9UGP8 ( 9 of 10 )
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Drew the Profile plot for Q9Y450 ( 10 of 10 )
#> Warning: Removed 16 rows containing missing values (geom_point).
#> Warning: Removed 16 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for P04406 ( 1 of 10 )
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Warning: Removed 1 row(s) containing missing values (geom_path).
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for P06576 ( 2 of 10 )
#> Warning: Removed 23 rows containing missing values (geom_point).
#> Warning: Removed 1 row(s) containing missing values (geom_path).
#> Warning: Removed 23 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for P12277 ( 3 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for P23919 ( 4 of 10 )
#> Drew the Profile plot with summarization for P31947 ( 5 of 10 )
#> Warning: Removed 52 rows containing missing values (geom_point).
#> Warning: Removed 3 row(s) containing missing values (geom_path).
#> Warning: Removed 52 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for Q15233 ( 6 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for Q16181 ( 7 of 10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for Q9NSD9 ( 8 of 10 )
#> Warning: Removed 8 rows containing missing values (geom_point).
#> Warning: Removed 8 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for Q9UGP8 ( 9 of 10 )
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for Q9Y450 ( 10 of 10 )

# ## Profile plot with all the channels
# quant.msstats.all <- proteinSummarization(input.pd,
#                                           method="msstats",
#                                           global_norm=TRUE,
#                                           reference_norm=TRUE,
#                                           remove_norm_channel=FALSE,
#                                           remove_empty_channel=FALSE)
#
# dataProcessPlotsTMT(data.peptide = input.pd,
#                     data.summarization = quant.msstats.all,
#                     type = 'ProfilePlot',
#                     width = 21, # adjust the figure width since there are 15 TMT runs.
#                     height = 7)

## Quality control plot
# dataProcessPlotsTMT(data.peptide=input.pd,
#                     data.summarization=quant.msstats,
#                     type='QCPlot',
#                     width = 21, # adjust the figure width since there are 15 TMT runs.
#                     height = 7)

3. groupComparisonTMT()

Tests for significant changes in protein abundance across conditions, based on a family of linear mixed-effects models for TMT experiments. The experimental design of a case-control study (patients are not repeatedly measured) is determined automatically, and the appropriate statistical model is fitted accordingly.
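To make "a family of linear mixed-effects models" more concrete, one member of such a family can be sketched as follows. The notation is an assumption introduced here for illustration (indices m for mixture, t for technical replicate of a mixture, c for condition, b for biological replicate) and is not taken from this vignette; the model that the package actually fits for a given design may contain fewer or different terms.

```latex
% Illustrative sketch only: symbols and terms are assumptions, not the vignette's notation
Y_{mtcb} = \mu + \mathrm{Mixture}_m + \mathrm{TechRepMixture}_{t(m)}
         + \mathrm{Condition}_c + \mathrm{Subject}_{mcb} + \varepsilon_{mtcb},
\qquad \sum_{c} \mathrm{Condition}_c = 0,

\mathrm{Mixture}_m \sim N(0, \sigma^2_M), \quad
\mathrm{TechRepMixture}_{t(m)} \sim N(0, \sigma^2_T), \quad
\mathrm{Subject}_{mcb} \sim N(0, \sigma^2_S), \quad
\varepsilon_{mtcb} \sim N(0, \sigma^2)
```

Under this sketch, Y is the summarized log2 protein abundance, and a comparison between two conditions c and c' corresponds to the difference Condition_c - Condition_c', which is what a log2 fold change between those conditions estimates.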

Arguments

data : Name of the output of the proteinSummarization function. It should have columns named Protein, TechRepMixture, Mixture, Run, Channel, Condition, BioReplicate, Abundance.
contrast.matrix : Comparisons between conditions of interest. 1) The default is pairwise, which compares all possible pairs of conditions. 2) Otherwise, users can specify the comparisons of interest: based on the levels of the conditions, assign 1 or -1 to the conditions being compared and 0 to the others. The levels of conditions are sorted alphabetically.
moderated : If moderated = TRUE, a moderated t statistic is calculated; otherwise, the ordinary t statistic is used.
adj.method : Adjustment method for multiple comparisons. 'BH' is the default.
remove_norm_channel : TRUE (default) removes Norm channels from protein-level data.
remove_empty_channel : TRUE (default) removes Empty channels from protein-level data.
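The contrast.matrix description can be illustrated with a short base R sketch that builds a matrix holding two custom comparisons. The condition names are the four levels used in this vignette's example data; the choice of comparisons and the row labels are arbitrary and only for illustration.

```r
# Condition levels, in alphabetical order as described for contrast.matrix
conditions <- c("0.125", "0.5", "0.667", "1")

# One row per comparison, one column per condition, filled with 0
comparison <- matrix(0, nrow = 2, ncol = length(conditions),
                     dimnames = list(c("1-0.125", "0.667-0.5"), conditions))

# Put 1 and -1 on the two conditions being compared in each row
comparison["1-0.125", c("1", "0.125")] <- c(1, -1)
comparison["0.667-0.5", c("0.667", "0.5")] <- c(1, -1)

# Sanity check: every contrast row sums to zero
stopifnot(all(rowSums(comparison) == 0))
comparison
```

A matrix built this way has the shape expected by the contrast.matrix argument: row names label the comparisons and column names match the condition levels.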

# test for all the possible pairs of conditions
test.pairwise <- groupComparisonTMT(quant.msstats)
head(test.pairwise)
#>   Protein       Label       log2FC        SE       DF    pvalue adj.pvalue
#> 1  P04406   0.125-0.5 -0.031373953 0.0214787 102.0003 0.1471712  0.3896575
#> 2  P04406 0.125-0.667 -0.010442843 0.0214787 102.0003 0.6278717  0.8969595
#> 3  P04406     0.125-1 -0.005921016 0.0214787 102.0003 0.7833599  0.9509302
#> 4  P04406   0.5-0.667  0.020931110 0.0214787 102.0003 0.3321112  0.5720904
#> 5  P04406       0.5-1  0.025452937 0.0214787 102.0003 0.2387587  0.5968968
#> 6  P04406     0.667-1  0.004521827 0.0214787 102.0003 0.8336771  0.9324055
#>   issue
#> 1    NA
#> 2    NA
#> 3    NA
#> 4    NA
#> 5    NA
#> 6    NA

# Check the conditions in the protein data
levels(quant.msstats$Condition)
#> [1] "0.125" "0.5"   "0.667" "1"

# Only compare condition 0.125 and 1
comparison<-matrix(c(-1,0,0,1),nrow=1)
# Set the names of each row
row.names(comparison)<-"1-0.125"
# Set the column names
colnames(comparison)<- c("0.125", "0.5", "0.667", "1")
comparison
#>         0.125 0.5 0.667 1
#> 1-0.125    -1   0     0 1

test.contrast <- groupComparisonTMT(data = quant.msstats, contrast.matrix = comparison)
head(test.contrast)
#>   Protein   Label       log2FC         SE       DF    pvalue adj.pvalue issue
#> 1  P04406 1-0.125  0.005921016 0.02147870 102.0003 0.7833599  0.9509302    NA
#> 2  P06576 1-0.125 -0.001284321 0.02081887 102.0003 0.9509302  0.9509302    NA
#> 3  P12277 1-0.125 -0.013004897 0.02641674 102.0002 0.6235669  0.9142682    NA
#> 4  P23919 1-0.125  0.031852508 0.02484815 102.0002 0.2027886  0.6298913    NA
#> 5  P31947 1-0.125  0.034549433 0.03102666 102.0001 0.2680937  0.6298913    NA
#> 6  Q15233 1-0.125  0.010110290 0.02155178 102.0003 0.6399877  0.9142682    NA
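The adj.pvalue column reflects the adj.method argument. As a minimal sketch of what a 'BH' (Benjamini-Hochberg) adjustment does, the snippet below applies base R's p.adjust() to hypothetical p-values; these toy values and protein names are not taken from the output above, and the sketch assumes the adjustment is applied across proteins within a single comparison.

```r
# Hypothetical p-values for four proteins in one comparison (toy values)
pvalue <- c(P1 = 0.01, P2 = 0.04, P3 = 0.03, P4 = 0.20)

# Benjamini-Hochberg adjustment using base R
adj.pvalue <- p.adjust(pvalue, method = "BH")
adj.pvalue

# Proteins that pass a 5% false discovery rate threshold
significant <- names(adj.pvalue)[adj.pvalue < 0.05]
significant
```

Adjusted p-values are never smaller than the raw ones, so a protein that looks significant on its raw p-value may no longer pass the threshold after adjustment.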

MSstatsTMT : A package for protein significance analysis in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling

Ting Huang (thuang0703@gmail.com), Meena Choi (mnchoi67@gmail.com), Sicheng Hao (hao.sic@husky.neu.edu), Olga Vitek(o.vitek@northeastern.edu)

2020-10-13
