Data File Standard for Flow Cytometry, Version FCS3.0

Data File Standard for Flow Cytometry, Version FCS3.0

Data File Standards Committee of the International Society for Analytical Cytology (ISAC)

FCS version 2.0 can be found in Cytometry 1990;11(3):323-32.

ABSTRACT

The flow cytometry data file standard provides the specifications needed to completely describe flow cytometry data sets within the confines of the file containing the experimental data. In 1984 the first Flow Cytometry Standard format for data files was adopted as FCS1.0, this standard was modified in 1990 as FCS2.0. We report here on the proposed next generation Flow Cytometry Standard data file format, FCS3.0. The principal goal of the Standard is to provide a uniform file format allowing files created by one type of acquisition hardware and software to be analyzed by another type. The proposed FCS3.0 standard maintains backwards compatibility with previous versions by retaining the basic FCS file structure. The FCS structure requires that each data set in a file contains three segments: HEADER, TEXT and DATA, with an optional ANALYSIS segment. The HEADER appears first and contains plain text byte offsets needed to locate the other segments. The TEXT segment contains plain text keyword-value pairs that describe the experiment, the instrument, the specimen, the data and any other information which the file creator wishes to include. The DATA segment contains the actual FCM data in one of several formats specified in the TEXT segment. The ANALYSIS segment contains plain text keyword-value pairs that describe user-specified analyses of the data. The proposed changes in FCS3.0 include: a mechanism for handling data sets of 100 megabytes and larger, support for UNICODE text for keyword values, support for cyclic redundancy check (CRC) validation for each data set, a requirement for the inclusion of information describing the method of signal amplification and increased support for the inclusion of time as a measurement parameter.

INTRODUCTION

The goal of the Flow Cytometry Data File Standard is to facilitate the development of software for reading and writing flow cytometry data files in a standardized format. Application of a standard file format allows files created on one type of instrument to be read and analyzed by software implemented on a different computer. The original FCS standard was published in 1984 as FCS1.0 (1) and amended in 1990 as FCS2.0 (2).

The changes included in FCS3.0 were made necessary by rapid evolution in microcomputer technology, computer communications, instrument design and experimental complexity. These technological advances have resulted in an increase in the average data file size straining the limit of 99,999,999 bytes per data set designed into previous FCS versions. FCS3.0 provides a mechanism to avoid this restriction while retaining backwards compatibility for those data files which do not exceed the 100-megabyte limit. The growth of computer networks has resulted in the routine movements of large amounts of data between computers. This has created the need for a means of confirming file integrity. Therefore, FCS3.0 provides support for a cyclic redundancy check (CRC) word to be placed at the end of each FCS3.0 data set. The use of a CRC check word allows errors, occurring during file transfer or reading, to be detected. Time is increasingly used as a measurement parameter. Therefore, keyword support has been added to better describe the acquisition of time. Internationalization in the field of flow cytometry has caused a need for the incorporation of international characters in keyword values. Therefore, a provision has been made to support the use of multi-byte characters for some strings by providing a keyword to support the UNICODE character set (3).

Table of Contents

1. General
1.1 Scope
1.2 Conformance
2. Terminology and General Requirements
2.1 Conventions
2.2 Definitions
2.3 General Concepts
3. File Segments
3.1 HEADER Segment
3.2 TEXT Segment
3.3 DATA Segment
3.4 ANALYSIS segment
3.5 CRC Value
3.6 Other Segments
4. References
5. Appendices
5.1 Appendix A - Differences from FCS2.0
5.2 Appendix B - Data File Standard Committee Members
5.3 Appendix C - Proposed API for reading and writing FCS files

1. General

1.1 Scope

This is version 3.0 of the Flow Cytometry Data File Standard (FCS3.0). Its purpose is to provide detailed specifications for the structure of the data sets produced as a result of acquiring data on a cytometer and writing the data to a file.

1.2 Conformance

To be conformant with FCS3.0, a data file must conform to the file structure as described in this document and must contain all required keyword-value pairs in the primary TEXT segment of the file. A conformant file must not contain other segments not described in the data set HEADER segment. To be conformant with FCS3.0 an analysis program must be able to correctly read and interpret all of the data contained in any minimum FCS3.0 conformant file (a minimum FCS3.0 conformant file is one with only the required keyword-value pairs in the TEXT segment of the file and no information in the ANALYSIS segment).

2. Terminology and General Requirements

2.1 Conventions

2.1.1 The ASCII character code is used for all keywords and most of the keyword values throughout an FCS3.0 file (see section 3.2.20 regarding the use of UNICODE characters).

2.1.2 Numerical values are base 10 unless otherwise specified.

2.2 Definition

2.2.1 An FCS3.0 data file consists of one or more data sets.

2.2.2 A data set is defined as the collection of information produced by a cytometer as it carries out its measurements on some number of particles.

2.2.3 The collection of information in a data set is divided into at least four segments including a HEADER segment, a primary TEXT segment, a DATA segment, and an ANALYSIS segment. The ANALYSIS segment may be empty and any number of implementor-defined segments may follow the first four segments. New to FCS3.0 is the inclusion of an optional supplemental TEXT segment.

2.2.4 The HEADER segment identifies the data set as FCS3.0 and contains ASCII byte offsets from the start of the data set to the beginning and end of each of the other segments.

2.2.5 A keyword is the label of a data field. A keyword-value pair is the label of the data field with its associated value.

2.2.6 The TEXT segments contain a series of ASCII keyword-value pairs that describe the format of the DATA segment and most of the experimental operating conditions. The primary TEXT segment contains all required keyword-value pairs. The supplemental TEXT segment contains optional keyword-value pairs only.

2.2.7 The DATA segment contains either a list of the events or histograms of the data.

2.2.8 An event is an ordered list of the cytometric measurements for one particle. The length of an event is the number of parameters involved in the measurement.

2.2.9 A parameter is the signal produced by one of the detectors of the cytometer. Forward scattering is typically one of the measurement parameters. A parameter value is a digital representation of a parameter.

2.2.10 Each data set in a data file contains all the information needed to read and interpret the data set.

2.2.11 All space within a file which is not contained in a segment specified in the HEADER must be filled with a space character (ASCII 32). This includes unused space between the end of one segment and the beginning of the next segment and between the end of the last data set and the end of the file.

2.2.12 List mode data storage means that events are stored one after the other in a list.

2.2.13 All byte offsets are referenced to the beginning of the data set. The first data set in a file begins at byte zero of the file.

2.2.14 The implementor is the entity that creates the software to read and write FCS conformant data files.

2.2.15 The "delimiter" is the first character of the primary TEXT segment and is subsequently placed in the primary TEXT, supplemental TEXT and ANALYSIS segments to separate keywords from keyword values. The delimiter can be any ASCII character.

2.3 General Concepts

An FCS3.0 file is composed of one or more data sets, each containing at a minimum HEADER, TEXT and DATA segments. The HEADER, TEXT, and ANALYSIS segments contain ASCII-encoded text readable by a text editor (some keyword-values may contain UNICODE characters; See section 3.2.20). The DATA segment contains flow cytometry data stored in list mode or as histograms.

3. File Segments

3.1 HEADER Segment

3.1.1 The primary purpose of the HEADER segment is to describe the location of the other segments in the data set. The HEADER segment begins at byte offset zero from the beginning of the data set. The first six bytes in the HEADER segment comprise the version identifier (FCS3.0). Note, there is no space character between the FCS and the 3.0 in the identifier. The next 4 bytes (6 - 9) are occupied by space characters (ASCII 32). Following the identifier are at least three pairs of ASCII-encoded integers indicating the byte offsets for the start and end of the primary TEXT segment, the DATA segment, and the ANALYSIS segment, respectively. The byte offsets are referenced to the beginning of the data set. Under FCS3.0 these offsets remain limited to 8 bytes. Each ASCII encoded integer offset is right justified in its 8 byte space. The first byte offset (bytes 10 - 17) is that to the start of the primary TEXT segment. The next byte offset (bytes 18 - 25) is that for the end of the primary TEXT segment. The next offset (bytes 26 - 33) is that for the start of the DATA segment. The byte offset for the end of the DATA segment occupies bytes 34 - 41. That for the start of the ANALYSIS segment occupies bytes 42 - 49. The byte offset for the end of the ANALYSIS segment is in bytes 50 - 57. If there is no ANALYSIS segment these last two byte offsets can be set to zero (right justified) or left blank (filled with space characters). Offsets to the start and end of user-defined OTHER segments of the data set follow the ANALYSIS segment offsets. The user-defined segments will not be interpretable by others unless appropriate information is passed on by the data set originator.

A major change from previous FCS versions is the allowance for data sets larger than 99,999,999 bytes. When any portion of a segment falls outside the 99,999,999 byte limit, '0's are substituted in the HEADER for that segments begin and end byte offset. The byte offsets for begin DATA, end DATA, begin ANALYSIS, end ANALYSIS (begin and end supplemental TEXT if appropriate) will then only be found as keyword-value pairs in the primary TEXT segment. Note, when a segment is contained completely within the first 99,999,999 bytes of a data set, the byte offsets for that segment will be duplicated in the TEXT segment as keyword values. Note also, if the ANALYSIS offsets in the HEADER are zero, the $BEGINANALYSIS and $ENDANALYSIS keywords must be checked to determine if an ANALYSIS segment is present.

Table 1. Contents of HEADER fields and the byte offsets to the beginning and end of each field. Each offset is right justified in its field.

Contents Start and end byte positions

FCS3.0 00 - 05

ASCII(32) - space characters 06 - 09

ASCII-encoded offset to first byte of TEXT segment 10 - 17

ASCII-encoded offset to last byte of TEXT segment 18 - 25

ASCII-encoded offset to first byte of DATA segment 26 - 33

ASCII-encoded offset to last byte of DATA segment 34 - 41

ASCII-encoded offset to first byte of ANALYSIS segment 42 - 49

ASCII-encoded offset to last byte of ANALYSIS segment 50 - 57

ASCII-encoded offset to user defined OTHER segments 58 - beginning of next segment

One example HEADER segment is as follows:

FCS3.0*********256****1545****1792**202456*******0*******0

The '*' character is used to represent a space character here. The TEXT segment starts at byte 256 from the location of the 'F' in FCS3.0 and ends at byte offset 1545. The DATA segment starts at byte offset 1792 and ends at 202456. There is no ANALYSIS segment, so the start and end offsets are shown as zeros. They could be left blank. Note that the HEADER segment is a continuous byte stream with no return or line feed characters. The bytes between the end of the HEADER segment and the start of the next segment must be filled with the space character. In this example, the segments are in the order HEADER, TEXT, DATA, and ANALYSIS. The FCS standard requires only that the HEADER segment be at the start of the data set and the primary TEXT segment be located entirely within the first 99,999,999 bytes.

A second example of a legal HEADER segment is:

FCS3.0*********256****1545*******0*******0*******0*******0

The '0's in the begin DATA and end DATA positions indicates that the DATA segment exceeds the 99,999,999 byte limit. Therefore, the byte offsets to begin Data and end Data, are located only in the $BEGINDATA, $ENDDATA keyword values in the TEXT segment. The begin ANALYSIS and end ANALYSIS byte offsets are also located in the $BEGINANALYSIS and $ENDANALYSIS keyword values in TEXT segment, if an ANALYSIS segment exist.

A third example of a legal HEADER segment is:

FCS3.0******202451**203140****1792**202450*******0*******0

This HEADER is different from the other examples in that it describes a data set in which the primary TEXT segment follows the DATA segment.

3.2 TEXT Segment

3.2.1 The TEXT segments (primary and supplemental) contain a series of ASCII encoded keyword-value pairs that describe various aspects of the data set. For example, $TOT/5000/ is a keyword-value pair indicating that the total number of events in the file is 5000. $TOT is the keyword and 5000 is the value. The '$' character flags this keyword as an standard FCS keyword. In this example, the '/' is the delimiter character.

3.2.2 A data set must contain a primary TEXT segment which contains all required keyword-value pairs and any number of optional keyword-value pairs. The primary TEXT segment must be contained entirely in the first 99,999,999 bytes of data set.

3.2.3 A data set may contain an optional supplemental TEXT segment that can contain only optional keyword-value pairs and may be placed anywhere in a data set after the HEADER segment.

3.2.4 The byte offset to the beginning and end of the supplemental TEXT segment is found in the $BEGINSTEXT and $ENDSTEXT keyword-value pairs which must be located in the primary TEXT segment.

3.2.5 The first character in the primary TEXT segment is the ASCII delimiter character. This character must also be used as the delimiter in the ANALYSIS and supplemental TEXT segments.

3.2.6 The delimiter is placed at the start and end of a keyword value.

3.2.7 The delimiter may not be the first character in a keyword or keyword value. If the delimiter appears in a keyword or keyword value, it must be immediately followed by a second delimiter. For example, "$SYS/RSX-11//M/" shows a value of RSX-11/M for the keyword $SYS. Since null (zero length) keywords or keyword values are not permitted, two consecutive delimiters can never occur between a value and a keyword.

3.2.8 All keywords are encoded in ASCII. Keyword values are encoded in ASCII by default. The values of specified keywords may be in languages not representable in ASCII by use of the $UNICODE keyword.

3.2.9 Keywords and keyword values must have lengths greater than zero.

3.2.10 Keywords are case insensitive, They may be written in a file in lower case, upper case, or a mixture of the two. However, an FCS file reader must ignore keyword case. A keyword value may be in lower case, upper case or a mixture of the two. Keyword values are case sensitive.

3.2.11 There are no default values for any keyword.

3.2.12 FCS-defined keywords must begin with the '$' character. Only FCS-defined keywords may begin with the '$' character.

3.2.13 FCS-defined keywords may not be redefined by the implementor.

3.2.14 There are required and optional FCS keyword-value pairs. The required keyword-value pairs represent the minimum set needed to successfully read and write an FCS data set. Conformant FCS file reading programs must recognize required FCS keywords.

3.2.15 The TEXT segments must not contain return (ASCII 13), line feed (ASCII 10) or other unprintable characters unless they are within a keyword value or are used as the delimiter character.

3.2.16 The parameter description keywords (e.g. $PnR, $PnB, etc) are numbered consecutively in the order in which the parameters are written to the file, beginning with number 1.

The required and optional FCS keywords are listed below with one line descriptions. The keywords and their values are described in alphabetical order following the lists. Required keywords are so indicated.

3.2.18 The required FCS primary TEXT segment keywords are as follows:

$BEGINANALYSIS Byte-offset to the beginning of the ANALYSIS segment.

$BEGINDATA Byte-offset to the beginning of the DATA segment.

$BEGINSTEXT Byte-offset to the beginning of a supplemental TEXT segment.

$BYTEORD Byte order for data acquisition computer.

$DATATYPE Type of data in DATA segment (ASCII, integer, floating point).

$ENDANALYSIS Byte-offset to the end of the ANALYSIS segment.

$ENDDATA Byte-offset to the end of the DATA segment.

$ENDSTEXT Byte-offset to the end of a supplemental TEXT segment.

$MODE Data mode (list mode, histogram).

$NEXTDATA Byte offset to next data set in the file.

$PAR Number of parameters in an event.

$PnB Number of bits reserved for parameter number n.

$PnE Amplification type for parameter n.

$PnR Range for parameter number n.

$TOT Total number of events in the data set.

3.2.19 The optional FCS TEXT segment keywords are as follows:

$ABRT Events lost due to data acquisition electronic coincidence.

$BTIM Clock time at beginning of data acquisition.

$CELLS Description of objects measured.

$COM Comment.

$COMP Fluorescence compensation matrix.

$CSMODE Cell subset mode, number of subsets to which an object may belong.

$CSVBITS Number of bits used to encode a cell subset identifier.

$CSVnFLAG The bit set as a flag for subset n.

$CYT Type of flow cytometer.

$CYTSN Flow cytometer serial number.

$DATE Date of data set acquisition.

$ETIM Clock time at end of data acquisition.

$EXP Name of investigator initiating the experiment.

$FIL Name of the data file containing the data set.

$GATE Number of gating parameters.

$GATING Specifies region combinations used for gating.

$GnE Amplification type for gating parameter number n.

$GnF Optical filter used for gating parameter number n.

$GnN Name of gating parameter number n.

$GnP Percent of emitted light collected by gating parameter n.

$GnR Range of gating parameter n.

$GnS Name used for gating parameter n.

$GnT Detector type for gating parameter n.

$GnV Detector voltage for gating parameter n.

$INST Institution at which data acquired.

$LOST Number of events lost due to computer busy.

$OP Name of flow cytometry operator.

$Pkn Peak channel number of univariate histogram for parameter n.

$PKNn Count in peak channel of univariate histogram for parameter n.

$PnF Name of optical filter for parameter n.

$PnG Amplifier gain used for acquisition of parameter n.

$PnL Excitation wavelength for parameter n.

$PnN Short name for parameter n.

$PnO Excitation power for parameter n.

$PnP Percent of emitted light collected by parameter n.

$PnS Name used for parameter n.

$PnT Detector type for parameter n.

$PnV Detector voltage for parameter n.

$PROJ Name of the experiment project.

$RnI Gating region for parameter number n.

$RnW Window settings for gating region n.

$SMNO Specimen (tube or well) label.

$SRC Source of the specimen (patient name, cell types)

$SYS Type of computer and its operating system.

$TIMESTEP Time step for time parameter.

$TR Trigger parameter and its threshold.

$UNICODE UNICODE code page for string type keyword values.

3.2.20 Alphabetical listing and detailed description of keywords. For all the keywords below 'n', 'n1', 'n2', etc represent ASCII-encoded integer values. The character 'f' represents an ASCII-encoded floating point number. The word "string" represents an ASCII or UNICODE-encoded TEXT string that can be of any length greater than zero. If the optional $UNICODE keyword is used, a specified subset of the strings may be represented with two byte characters in a variety of UNICODE conformant languages. Otherwise strings are in single byte ASCII. The character 'c' represents a single ASCII-encoded character. The '/' character is used here as the delimiter for illustrative purposes.

$ABRT/n/ $ABRT/1265/

Number of events lost due to data acquisition electronic coincidence effects. The number of aborted events here was 1265.

$BEGINANALYSIS/n/ $BEGINANALYSIS/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the beginning of the optional ANALYSIS segment. If there is no ANALYSIS segment, a '0' should be placed in this keyword value. In this example, the ANALYSIS segment begins at byte 123,456,789.

$BEGINDATA/n/ $BEGINDATA/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the beginning of the DATA segment. If the DATA segment is completely contained within the first 99,999,999 bytes of the data set, this value duplicates the offset contained in the HEADER segment. In this example, the DATA segment begins at byte 123,456,789

$BEGINSTEXT/n/ $BEGINSTEXT/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the beginning of the supplemental TEXT segment. If there is no supplemental TEXT segment, the value should be set to '0'. In this example, the supplemental TEXT segment begins at byte 123,456,789.

$BTIM/hh:mm:ss[:tt]/ $BTIM/14:22:10:47/

Clock time at the beginning of data acquisition. The format of the value is 24-hour clock hours:minutes:seconds:number of fractional seconds in units of 1/60 of a second. The fractional seconds [:tt] is optional. Data acquisition began at 14 hours, 22 minutes, 10 seconds, and 47/60th of a second.

$BYTEORD/n1,n2,n3,n4/ $BYTEORD/4,3,2,1/ [REQUIRED]

This keyword specifies the order from numerically least significant[1] to numerically most significant[4] in which four binary data bytes are written to compose a 32-bit word in the data acquisition computer. The numbers are separated by commas (ASCII 44). In VAX computers and personal computers of the IBM PC type, the byte order is 1,2,3,4 with the least significant byte written first. In Hewlett Packard, Macintosh and Sun computers, the byte order is 4,3,2,1 meaning that the least significant byte is written last. In PDP-11 computers the byte order is 3,4,1,2 meaning that in the two 16-bit words comprising a 32-bit word, the most significant 16-bit word is written first. Within the 16-bit word, however, the least significant byte is written first, which is the same as for a PC. Byte order is discussed more fully in reference 4. In this example, the most significant byte is written first and the least significant byte is written last. Use of this keyword enables collection of data on one computer type and analysis of the data on another computer type.

$CELLS/string/ $CELLS/Normal human peripheral blood/

Type of cells or other objects measured. This specimen is normal human peripheral blood.

$COM/string/ $COM/Incubation time was 47 minutes./

This keyword is used to attach a comment to the data set. It should not to be used as a substitute for other standard keywords. This example shows the use of $COM to add a brief note to the data set, a note that otherwise might appear only in a laboratory notebook.

$COMP/n,f1,f2,f3,.../ $COMP/3,0.0,-0.1,0.0,-40.0,0.0,-0.6,0.0,-36.4,0.0/

This keyword enables the efficient storage of a fluorescence compensation matrix. The matrix has n rows and n columns where n represents the number of acquisition parameters. f1, f2,

f3, ... are floating point numbers representing the matrix elements. Both positive and negative values are allowed. A positive or unsigned value indicates that compensation has been additive while a negative value indicates the more common case of subtractive compensation. The elements are stored in row-major order, i.e., the elements in the first row appear first. The matrix element Cij is the percentage of FLj that has been subtracted electronically from FLi. In the example, the compensation matrix is 3 x 3 and the matrix elements have the following subtractive values: C11=0.0%, C12 = 0.1%, C13 = 0.0%, C21 = 40.0%, C22=0.0%, C23 = 0.6%, C31 = 0.0%, and C32 = 36.4%, C33 = 0.0%.

$CSMODE/n/ $CSMODE/3/

Cell subset mode, i.e., the number "n" of subsets to which a object may belong. The simplest case is that the cell subset parameter encodes a single value per object as would be indicated by n = 1. If the value of n is greater than 1 it indicates that the value of the cell subset parameter may encode n subset identifiers. In these cases, the $CSVBITS and $CSVnFLAG keyword values will specify how the cell subset values are encoded. It should be noted that regardless of the value for this keyword, a cell subset value of zero indicates that the object is undefined by the analysis scheme that was used.

$CSVBITS/n/ $CSVBITS/4/

The number of bits used to encode a cell subset value. When the $CSMODE keyword value is greater than 1, the number of bits used to encode a cell subset identifier must be specified by the $CSVBITS keyword value. In the cited example, 4 bits, i.e., values of 0-15, are used to encode cell subset identifiers. See the discussion of the ANALYSIS segment in section 3.4.

$CSVnFLAG $CSV1FLAG/4096/

The value used as a "flag" to indicate that the "n" identifier field encodes a value. In the cited example, if bit 13 is set in the value of the cell subset parameter (parameter value AND 8192 is TRUE), one should read the second field of bits to decode the value. It is not necessary to set "flags", but if one wishes to use zero to encode the first subset for any field, one must set a "flag" to indicate that the zero in that field refers to a subset. See the discussion of the ANALYSIS segment in section 3.4.

$CYT/string/ $CYT/FACScan/

The name of the flow cytometer used for the data set. Here a FACScan was used.

$CYTSN/string/ $CYTSN/400E370/

The serial number of the flow cytometer used for the data set. Here the serial number is 400E370.

$DATATYPE/c/ $DATATYPE/I/ [REQUIRED]

This keyword describes the type of data written in the DATA segment of the data set. The four allowed values are 'I', 'F', 'D', or 'A'. The DATA segment is a continuous bit stream with no delimiters. 'I' stands for unsigned binary integer, F stands for single precision IEEE floating point, 'D' stands for double precision IEEE floating point, and 'A' stands for ASCII. The additional keywords $PnB (bits per parameter) and $PnR (range per parameter) are needed to completely describe an event in the DATA segment.

$DATATYPE/I/ means that the events are written as unsigned binary integers. For each parameter in an event, both the maximum length in bits allocated for storage of the parameter and the actual integer range used by the parameter within that allocation are needed. The number of bits per parameter is specified by $PnB. For example, $P1B/16/ specifies that 16 bits are allocated for parameter 1. $P1R/1024/ specifies that parameter 1 values range from 0 to 1023. This allows the data word length to be specified, facilitating compatibility between machines with different data word lengths and enabling bit compression of the data.

$DATATYPE/F/ means that the data are written as single precision floating point values in the IEEE standard format. Note that the $PnB keywords should be set to a value of 32 for each parameter in an event. For example, $P1B/32/.

$DATATYPE/D/ means that the data are written as double precision floating point values in the IEEE standard format. The $PnB keyword should be set to a value of 64 for each parameter in an event. For example, $P3B/64/ says that parameter 3 is allocated 64 bits of storage space. The IEEE standard formats for single- and double-precision numbers are given in the table below:

Single-precision Double-precision

Sign bit 31 bit 63

Exponent bits 30-23 bits 62-52

bias 127 bias 1023

Fraction bits 22-0 bits 51-0

Range 3.402823e+38 1.797693e+308

approx. 1.175494e-38 2.225074e-308

$DATATYPE/A/ means that the data are written as ASCII-encoded integer values. In this case, the keyword $PnB specifies the number of bytes allocated per value (one byte per character). This represents fixed format ASCII data. $P1B/4/ indicates that the maximum value for parameter 1 would be 9999. Data are stored in a continuous byte stream, with no delimiters. If the value of the $PnB keyword is the * character, e.g., $P1B/*/, the data are free format and number of characters per parameter value may vary. In this case, all values are separated by one of the following delimiters: "space", "tab", "comma", "carriage return", or "line feed" characters. Note that multiple, consecutive delimiters are treated as a single delimiter. Since there are significant differences between the way in which consecutive delimiters are treated by different programming languages, care should be taken when using this format. Zero values must be explicitly specified by the zero (0) character. Thus, the string "1,3,, ,3" (note the space between the third and fourth commas) would only specify three values. It would be treated as between 3 and 5 values by different programming languages.

$DATE/dd-mmm-yyyy/ $DATE/01-OCT-1994/

This keyword specifies the date on which the data set was created. The format is day-month-year with the number of characters specified by dd-mmm-yyyy. This data set was created on 01 October 1994. Note that the all the character positions should be filled including leading zeros. Accepted abbreviations for the months are: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC.

$ENDANALYSIS/n/ $ENDANALYSIS/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the end of the ANALYSIS segment. If there is no ANALYSIS segment, a '0' should be placed in this keyword value. In this example, the ANALYSIS segment ends at byte 123,456,789

$ENDDATA/n/ $ENDDATA/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the end of the DATA segment. If the DATA segment is completely contained in the first 99,999,999 bytes of the data set, this value duplicates the offset contained in the HEADER segment. In this example, the DATA segment ends at byte 123,456,789.

$ENDSTEXT/n/ $ENDSTEXT/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the end of the supplemental TEXT segment. If there is no supplemental TEXT segment, the value should be set to '0'. In this example, the supplemental TEXT segment ends at byte 123,456,789.

$ETIM/hh:mm:ss[:tt]/ $ETIM/14:22:10:47/

Clock time at the end of data acquisition. The format of the value is 24-hour clock hours:minutes:seconds:number of fractional seconds in units of 1/60 of a second. Data acquisition ended at 14 hours, 22 minutes, 10 seconds, and 47/60th of a second. The fractional seconds keyword value is optional as indicated by the square brackets.

$EXP/string/ $EXP/A. Smith/

The name of the person initiating the experiment. This experiment was under the direction of A. Smith.

$FIL/string/ $FIL/071494.001/

The name of the data file that corresponds to this data set. If there is only one data set in the FCS file, then this file name should be the same as the name of the FCS file. If this data set is one of several in the FCS file, then the file name may correspond to a data file collected at some earlier time. In this example, the data are stored in a file named 071494.001.

$GATE/n/ $GATE/2/

This keyword specifies the number of parameters used for gating. It is analogous to the $PAR keyword, which specifies the total number of parameters for each event in the data set. In this example, there are two gating parameters. The current practice in many flow cytometry laboratories is that the gating parameters are collected as part of the data set. This fact is reflected in the redefinition of the $RnI keyword described below.

$GATING/string/ $GATING/R1/ $GATING/R1 AND (R2.0R.R3)/

This keyword specifies the conditions under which the data in the data set have been acquired. The conditions are set through Boolean operations among regions defined below using the $RnI and $RnW keywords. Allowed Boolean operators are AND, OR(inclusive), and NOT. The operands are the regions (Rn). Operators are separated from operands or other operators by spaces or periods. Operator precedence is from left to right unless overridden with parentheses. In the first example, data were collected using gating region R1. Events with parameter values falling outside R1 were excluded from the data set. In the second example, an event is included in the data set only if the appropriate parameter value is inside R1 and is inside R2 or R3 or both.

$GnE/f1,f2/ $G3E/4.0,0.01/

This keyword specifies whether linear or logarithmic amplifiers were used for gating parameter number n. When the amplification is logarithmic the value of f1 specifies the number of logarithmic decades and f2 represents the linear value that would have been obtained for a signal with a log value of 0. In the example above, the data for parameter 3 were collected using a four-decade logarithmic amplifier and the 0 channel represents the linear value, 0.01. When linear amplification is used or when amplification is undefined such as with some calculated parameters, f1 and f2 are set to 0.

$GnF/string/ $G2F/520LP/

This keyword specifies the optical filter that was used for the light reaching the detector for gating parameter n. This example shows that the optical filter used for the second gating parameter was a type 520 nm long pass.

$GnN/string/ $G1N/FL2/

This keyword specifies a short name for gating parameter number n. Here "FL2" is the name for gating parameter 1. Required short names for parameters include the following:

FS Forward Scatter

SS Side scatter

FLn Fluorescence channel n

AE Axial Extinction

CV Coulter Volume

TIME Time

$GnP/n1/ $P3P/27/

The amount of light collected by the detector for gating parameter number n1 expressed as a percentage of the light emitted by a fluorescent object. In the example, 27% of the emitted light was captured by the detector for gating parameter number 3.

$GnR/n1/ $G2R/1024/

This keyword specifies the range, n1, of gating parameter n. In this example, the events for gating parameter 2 range from 0 to 1023.

$GnS/string/ $G1S/FITC-CD45/

This keyword specifies a longer name for gating parameter n than is allowed by $GnN. Here, FITC-labeled CD45 is the name for gating parameter 1.

$GnT/string/ $G2T/PMT9524/

This keyword specifies the detector type for gating parameter n. Here, gating parameter 2 uses a photomultiplier tube (PMT) of type 9524.

$GnV/n1/ $G2V/645/

This keyword specifies the detector bias voltage for gating parameter n. In this example, the detector for gating parameter 2 is biased at 645 volts.

$INST/string/ $INST/Laboratory of FCM, RPCI/

The institution or laboratory in which the data were collected. In this example, the data were collected in the Laboratory of Flow Cytometry at Roswell Park Cancer Institute.

$LOST/n/ $LOST/457/

This keyword specifies the number of events lost during data acquisition because the computer was busy with other tasks. Here, 457 events were so lost.

$MODE/c/ $MODE/L/ [REQUIRED]

This keyword specifies the mode in which the data were acquired. Allowed values for the character c are 'C', 'L', or 'U'. These options are described as follows:

C One correlated multivariate histogram is stored in the data set as a multidimensional array. There can be only one such histogram per data set. In storing multiparameter correlated data, the index for the first parameter is incremented first, then the second, etc. For bivariate data, the first data value corresponds to index 1 for parameter 1 and index 1 for parameter 2, the second data value corresponds to index 2 for parameter 1 and index 1 for parameter 2, etc.

L List mode. For each event, the value of each parameter is stored in the order in which the parameters are described. The number of bits reserved for parameter 1 is described using the $P1B keyword. There can be only one set of list mode data per data set. The $DATATYPE keyword describes the data format. This is the most versatile mode for the storage of flow cytometry data because mode C and mode U data can be created from mode L data.

U Uncorrelated univariate histograms. There can be more than one univariate histogram per data set. The histogram frequencies for parameter 1 are stored first followed by those for parameter 2, etc. If the univariate histograms have been gated, they must all have been acquired with the same gates so that the total number of events in each histogram is the same.

$NEXTDATA/n/ $NEXTDATA/202512/ [REQUIRED]

When there is more than one data set in an FCS file, this keyword gives the byte offset from the beginning of a data set to the first byte in the HEADER of the next data set in the FCS file. If n is zero (0), this is the final or only data set in the file. This example shows that the next data set begins at byte 202512 from the beginning of the present data set. Each data set stands alone and must contain a full complement of keywords.

$OP/string/ $OP/Dave/

The name of the operator of the flow cytometer. Here Dave was the operator of this instrument.

$PAR/n/ $PAR/5/ [REQUIRED]

This keyword specifies the total number of parameters stored in each event in the data set. In this example, data for five parameters are stored for each event.

$Pkn/n1/ $PK2/374/

For a univariate histogram of parameter n, this keyword specifies the channel number, n1, containing the highest frequency of events. In this example, the peak in the univariate histogram for parameter 2 is located in channel 374. The $PKNn keyword specifies the count in that channel.

$PKNn/n1/ $PKN2/12803/

For a univariate histogram of parameter n, this keyword specifies the number of events, n1, in the channel number (histogram bin) containing the maximum event frequency. In this example, the univariate histogram for parameter 2 has a maximum event frequency of 12803. The $PKn keyword above specifies that this peak count occurs at channel 374.

$PnB/n1/ $P3B/16/ [REQUIRED]

For $DATATYPE/I/(binary integers), this keyword specifies the number of bits allocated, n1, for storage of parameter n. In this example, the data value for parameter 3 would be stored as two bytes (16 bits). This keyword is used in conjunction with $PnR to determine how the data are actually stored. A flow cytometer with 10-bit analog-to-digital converters (ADCs) would have $PnR/1024/. A 10-bit number would be stored in the 16-bit space allocated by $PnB/16/ leaving 6 empty bits per parameter. These keywords enable tight bit packing of events. For example, the data storage could be specified by $PnB/10/$PnR/1024/ for each of the n parameters in an event. Then fewer bits would be wasted in storing each event. However, packing these data for storage and unpacking them later for analysis is very time-consuming. In practice, most flow cytometers use $PnB/16/$PnR/1024/ for 10-bit data. A flow cytometer with 8-bit ADCs would use $PnB/8/$PnR/256/ where n represents integers from one to the number of parameters measured.

For $DATATYPE/A/(ASCII-encoded integers), $PnB specifies the number of characters, n, per measured value for parameter n.

$PnE/f1,f2/ $P3E/4.0,0.01/ [REQUIRED]

This keyword specifies whether linear or logarithmic amplifiers were used for parameter number n. When the amplification is logarithmic the value of f1 specifies the number of logarithmic decades and f2 represents the linear value that would have been obtained for a signal with a log value of 0. In the example above, the data for parameter 3 were collected using a four-decade logarithmic amplifier and the 0 channel represents the linear value, 0.01. When linear amplification is used or when amplification is undefined such as with some calculated parameters, f1 and f2 are set to 0.

$PnF/string/ $P2F/520LP/

This keyword specifies the optical filter that was used for the light reaching the detector for parameter n. This example shows that the optical filter used for the second parameter was a type 520 nm long pass.

$PnG/f/ $P2G/10.0/

This keyword specifies the gain that was used to amplify the signal for parameter n. This example shows that parameter 2 was amplified 10.0-fold before digitization.

$PnL/n1/ $P1L/488/

This keyword specifies the excitation wavelength, n1, in nm for parameter n. In this example, the wavelength was 488 nm for parameter number 1.

$PnN/string/ $P3N/FL1/

This keyword is used to specify the short name of parameter n. Here parameter 3 has a short name of FL1. Required short names for parameters include the following:

CS Cell subset

FS Forward Scatter

SS Side scatter

FLn Fluorescence channel n

AE Axial Extinction

CV Coulter Volume

TIME Time

$PnO/n1/ $P2O/200/

This keyword specifies the excitation power, n1, in milliwatts for the light source associated with the measurements for parameter n. Here 200 mW was used to produce the signal associated with parameter 2.

$PnP/n1/ $P4P/50/

The amount of light collected by the detector for parameter number n expressed as a percentage of the light emitted by a fluorescent object. In the example, 50% of the emitted light was captured by the detector for parameter number 4.

$PnR/n1/ $P2R/1024/ [REQUIRED]

This keyword specifies the maximum range, n1, of parameter n. For $MODE/L/ (list mode data), this corresponds to the ADC range, here 1024. The data values can range from 0 to 1023. For univariate histogram data ($MODE/C/ or $MODE/U/), it is the number of channels, n1, in the histogram for parameter n. Here the histogram channel numbers range from 0 to 1023.

$PnS/string/ $PnS/CD45 FITC Fluorescence/

This keyword specifies a long name to be used as an axis label in a plot of parameter n. Here FITC-labeled CD45 is the label. $PnS is the long name equivalent of $PnN.

$PnT/string/ $P2T/PMT9524/

This keyword specifies the detector type for parameter n. Here, parameter 2 uses a photomultiplier tube (PMT) of type 9524.

$PnV/n1/ $P2V/645/

This keyword specifies the detector bias voltage, n1, in Volts for parameter n. In this example, the detector for parameter 2 is biased at 645 Volts.

$PROJ/string/ $PROJ/AML patient study/

This keyword provides the name of the project. Here it is an AML patient study.

$RnI/string1,[string2]/ $R3I/P2,P4/ $R2I/G3/

This keyword associates a gating region number, n, with one or two parameters, here shown as string1 and string2. The two strings are of the form "Pn" or "Gn". "Pn" stands for collected parameter n, while "Gn" stands for gating parameter n. In the first example, gating region 3 is associated with a bivariate dot plot or bivariate histogram for parameters 2 and 4. The $RnW keyword described below specifies the shape of the gating region. In the second example, gating region 2 is associated with gating parameter 3. See the discussion for the $GATE keyword.

$RnW/n1, n2[;n3, n4;...]/ $R1W/345, 366/

This keyword specifies the window settings for gating region n. This window setting is useful only if the $RnI keyword is also specified. If the keyword $RnI has only a single value, then n1 and n2 specify the inclusive lower and upper bounds for the window in a univariate histogram. For example, $R2I/3/$R2W/345,366/ specifies that gating region 2 is associated with gating parameter 3. The gated events must range between channels 345 and 366 inclusive. If the $RnI keyword value has two values, then the window exists in a bivariate plot and it is specified in the $RnW keyword as a polygon. The x and y coordinates of the first point in the polygon are the pair n1, n2. The next point is separated from the first by a ';' character and is represented as n3, n4 above. The polygon can contain any number of points separated by semicolons. The first point and the last point are assumed to be connected. For example, $R1I/2,3/$R1W/310,205;515,304;480,615;240,514;354,542/ specifies that region 1 is defined in parameter 2 and 3 and that the region 1 window is a 5-sided polygon in this 2-parameter space. The $GATING keyword will specify the way the windows will be used (AND, OR, etc.).

$SMNO/string/ $SMN0/A7/

This keyword specifies the specimen number, which could be a tube or well number. Here the specimen number is A7.

$SRC/string/ $SRC/J. Doe, HIV positive patient/

This keyword specifies the source of the specimen. Note that this keyword value could contain patient information, which is protected by the U.S. Privacy Act and by strict U.S. National Institutes of Health guidelines. The acquiring laboratory may choose to use encoded information for this keyword value.

$SYS/string/ $SYS/Macintosh System 7.5/

This keyword specifies the type of computer and the operating system under which the data set was collected. Here the data set was collected on a Macintosh running System 7.5.

$TIMESTEP/f/ $TIMESTEP/0.0167/ $TIMESTEP/1.0/

The presence of this keyword indicates that time has been collected as one of the parameters in the data set. $PnN/TIME/ specifies which parameter represents time. $TIMESTEP specifies the time step in seconds. In the first example, the time step is 0.0167 seconds, which is 1/60 of a second and is the typical clock tick on a personal computer. For this example, an implementor specifies $P6N/TIME/$P6B/16/ $P6R/65536/$TIMESTEP/0.0167/. When the first event in the data set is captured by the computer, the number of clock ticks since the computer was turned on is read and saved as a constant, n Ticks. A zero value is entered into parameter 6 in this first event. When the second event arrives, the number of clock ticks is obtained from the computer clock. n Ticks is subtracted from this number and the result stored as parameter 6 of the second event. The actual number of seconds between any subsequent event and the first event is obtained by multiplying the parameter 6 value by the $TIMESTEP value. In this example, the maximum time range is approximately 17.5 minutes. In the second example, an implementor specifies $P6N/TIME/$P6B/16/$P6R/65536/$TIMESTEP/1.0/. Using the same procedure as in the first example, any events arriving less than 1.0 second after the first event have a parameter value of zero, while those arriving between 1.0 second and (less than) 2.0 seconds have a parameter 6 value of 1. The maximum time range is approximately 18 hours. If an external constant time interval generator is used to provide a signal input that increases linearly with time, the appropriate TEXT keywords might be $P6N/TIME/$P6B/16/$P6R/1024/

$TIMESTEP/0.001/. Here the time step is smaller than that available from the computer clock. However, the number of steps is limited by the range of the ADC, here 10 bits. The maximum time range for this example is 1023 seconds.

$TOT/n/ $TOT/25000/ [REQUIRED]

This keyword specifies the total number of events in the data set. This data set contains 25000 events.

$TR/string, n/ $TR/FS,54/

This keyword specifies the parameter name which serves as the trigger signal for an event. The number, n, is the channel number of the threshold signifying an event. When the threshold is exceeded, an event is declared. Here forward scatter (FS) is the trigger signal and the event threshold is at channel 54.

$UNICODE/n,string1,string2,etc/ $UNICODE/3,$SYS,$SRC/

The integer 'n' represents the UNICODE page number used and the comma delimited strings represent the keyword values where UNICODE text is used. UNICODE is an international standard that enables computer representation of most of the world's languages. The characters for each language are represented as two-byte codes on a code page. There are 65536 codes available. U.S. ASCII requires 256 two-byte characters. For computer systems that support UNICODE, implementors will be able to present axis labels and other appropriate text strings in the language of the country in which the flow cytometry data are being collected. If this keyword is not present, single byte U.S. ASCII is used for all strings. In the example above, UNICODE page 3 was used to write the values for the $SYS and $SRC keywords.

3.3 DATA Segment

The DATA segment contains the raw data in one of three modes (list, correlated or uncorrelated) described in the primary TEXT segment by the $MODE keyword value. The data are written to the DATA segment in one of four allowed formats (binary, floating point, double precision floating point or ASCII) described by the $DATATYPE keyword value. The most common form of data storage is list mode storage in the form of binary integers ($DATATYPE/I/ $MODE/L/). The $PnB set of keywords specify the bit width for the storage of each parameter. The $PnR set of keywords specify the channel number range for each parameter. For example, $P1B/16/ $P1R/1024/ specifies a 16-bit field for parameter 1 and a range for the values of parameter 1 from 0 to 1023, which corresponds to 10 bits. Implementors should use a bit mask when reading these list mode parameter values to insure that erroneous values are not read from the 4 unused bits.

3.4 ANALYSIS segment

ANALYSIS is an optional segment that, when present, contains the results of data processing. It is often the case that analysis is performed off-line, after the data has been collected and stored in a data set. Therefore, the ANALYSIS segment typically contains information added to a copy of the original file. For examples, the results of cell cycle analysis or immunophenotype determinations often involve more complex analyses than can be performed in "real time" as the data is collected and stored. The ANALYSIS segment has the same structure as the TEXT segment; i.e., it consists of a series of keyword-value pairs. There are no required keywords for the ANALYSIS segment. The optional FCS keywords are listed in 3.4.1 with one line descriptions and in 3.4.2 with full descriptions and examples. Implementors may add their own keywords.

A proposal has been made that the ANALYSIS segment be used for identifying cell subsets, determined either by region drawing or by some partitioning method such as cluster analysis (4). This may be particularly useful for immunophenotyping data. Three approaches to identifying cell subsets are discussed below. The first two use the least space in the data set but require the cell subsets be disjoint. The third approach adds a parameter to each event and supports overlapping cell subset assignments.

In method 1, the implementor uses the TEXT segment keyword-value pairs $CSMODE/1/ and $CSTOT/n/ to specify that there is one group of cell subsets containing n disjoint subsets of cells. The TEXT segment keyword-value pair $CSVBITS/8/ is used to indicate that the cell subset assignments for each event are stored in a binary vector of unsigned characters (8 bits each) whose length is the number of events in the data set. This vector is stored in an other segment following the ANALYSIS segment. The DATA segment contains a copy of the original data with the events written in the same order as in the original data set. In the ANALYSIS segment, $CSnNUM is used to count the number of cells in each of the n subsets.

In method 2, the implementor uses the TEXT segment keyword-value pairs $CSMODE/1/ and $CSTOT/n/ as above but does not use the $CSVBITS keyword. In the DATA segment, the events are written out one cell subset at time rather than in the original event order. In the ANALYSIS segment, $CSnNUM is used to count the number of cells in each of the n subsets. No other segment is required.

Method 3 creates an additional cell subset (CS) parameter for each event in the data set. Cell subsets may be defined by the method, e.g., cluster analysis, neural network, boolean gates on combinations of parameters, hyperplanes in n-dimensional space, etc. The value of the parameter may encode a single subset identifier number for each event ($CSMODE/1/) or more than one identifier number per event (value of $CSMODE greater than 1). The meanings of the identifier numbers are specified by the values of the $CSnNAME keywords in the TEXT and ANALYSIS segments. If the value of the CS parameter is 0 (zero), that event is unclassified by the definitions used to assign cell subsets. If the classification scheme creates unique non-overlapping populations, e.g., CD4 T cells, CD8 T cells, B cells, monocytes/macrophages, neutrophils, etc., then the simplest approach is to set the value of $CSMODE to "1" and use 1==CD4 T cell, 2 == CD8 T cells, etc. In some situations, it may be useful to be able to assign a single cell to more than one defined subset. For example, to extend the preceding example, subset identifiers 1 - 5 would correspond the definitions listed above with 6 == lymphocytes and 7 == mononuclear cells. This scheme would require $CSMODE/3/ since a single cell could belong to three defined subsets. Operationally, assuming that an event in the data set is a CD4 T cell, then the first bit field would encode a value of 1 (CD4 T cell), the second bit field would encode a value of 6 (lymphocyte), and the third bit field would encode a value of 7 (mononuclear cell). The bit fields and their interpretations in these cases would be defined by the values of the $CSVBITS and the $CSVnFLAG keywords as outlined in the reference (4). Method 3 also supports the creation of an ANALYSIS segment that includes a summary for the results written as the values for the keywords pertaining to the numbers of cells in each subset, etc. Method 3 has the size "cost" of an additional parameter, but it permits one to include a complete and explicit record of an analysis as an integral part of a data set.

3.4.1 Optional FCS ANALYSIS segment keyword list:

$CSDATE Cell subset analysis date.

$CSDEFFILE Cell subset definition file name.

$CSEXP Name of person who performed the cell subset analysis.

$CSnName Name of cell subset number n.

$CSnNUM Number of cells in cell subset number n.

3.4.2 Optional FCS ANALYSIS segment keywords:

$CSDATE/dd-mmm-yyyy/ $CSDATE/26-OCT-94/

Cell subset date. This keyword specifies the date on which the data set containing the cell subset analysis was created. The format is of the date is the same as that for $DATE. This data set was created on 26 October 1994.

$CSDEFFILE/string/ $CSDEFFILE/c:\filename.dat/

Cell subset definition file. The string is the name of the file containing the information needed to define each of the cell subsets. In the example the cell subset definition file is named filename.dat and is located on drive c:.

$CSEXP/string/ $CSEXP/A. Smith/

Cell subset experimenter. Name of the person who performed the cell subset analysis. Here, A. Smith performed the cell subset analysis.

$CSnName/string/ $CS2N/lymphocytes/

Cell subset name. This is a string naming cell subset number n. In the example, cell subset 2 is named "lymphocytes".

$CSnNUM/n1/ $CS2NUM/3456/

This keyword specifies the number of cells, n1, in cell subset number n. In the example, cell subset 2 contains 3456 cells.

3.5 CRC Value

The CRC word is computed for the part of each data set beginning with the first byte of the HEADER segment and ending with the last byte of the final segment of the data set (which could be a TEXT, DATA, ANALYSIS or OTHER segment). The CRC word is a 16-bit cyclic redundancy check value (5). This 16-bit CRC word conforms to the CCITT standard (Comite' Consultatif International Te'le'graphique et Te'le'phonique). This standard uses the CCITT polynomial X16 + X12 + X5 and requires that each input character be interpreted as its bit-reversed image. These requirements are satisfied by the icrc function in reference 6 if the last two function arguments are 0 and -1, respectively. The CRC value will be placed as ASCII in the 8 bytes immediately after the end data set. If an implementor chooses not to compute and store a CRC word then the 8 bytes immediately after the end of the data set should be filled with ASCII '0' characters.

3.6 Other Segments

Implementors may create any number of OTHER segments as they choose.

4. References

1. Murphy RF, Chused TM:A proposal for a flow cytometric data file standard. Cytometry 5:553-555, 1984.

2. Dean PN, Bagwell CB, Lindmo T, Murphy RF and Salzman GC: Data File Standard for Flow Cytometry. Cytometry 11:323-332, 1990.

3. The Unicode Consortium: The UNICODE Standard, Version 1.0, vol. 1. Addison-Wesley Publishing Co. Inc., Reading, MA, 1991.

4. Redelman D, Coder DM: Cell subset (CS) parameter to record the identities of individual cells in flow cytometric data. Cytometry 18:95-102, 1994.

5. Press WH, Teukolsky SA, Vetterling WT, Flannery BP: Numerical Recipes in C. 2nd ed. Cambridge University Press, Cambridge, UK, 1992.

5.1 Appendix A: Major Differences Between FCS2.0 and FCS3.0.

1) The HEADER has been modified to accommodate data sets longer than 99,999,999 bytes. Any offset value that requires more than 8-bytes is now represented by placing a '0' in the HEADER for that value and its associated "$BEGIN" value. The actual byte-offset value is then found in the primary TEXT segment of the data set. This system allows the vast majority of data files to be backwards compatible with analysis software designed for previous FCS versions. However, a '0' byte-offset in the HEADER will prevent previous FCS versions from reading very large data sets, avoiding read errors or partial data reads. Note, $BEGINDATA, $ENDDATA, $BEGINANALYSIS, $ENDANALYSIS, $BIGINSTEXT and $ENDSTEXT keyword-value pairs are required in the HEADER segment of FCS3.0 conformant files irrespective of the size of the data set. When the size of a data set remains below the 100 megabyte limit, the byte offsets will be found both in the HEADER and in keyword value pairs in the primary TEXT segment. When a data set reaches or exceeds 100 megabytes, byte offsets will only be located in the primary TEXT segment.

2) A supplemental TEXT segment may now be included in a data set. The supplemental TEXT segment may contain only optional keyword-value pairs and may be located anywhere in a data set after the HEADER segment.

3) A primary TEXT segment must contain all required keyword-value pairs and be located entirely within the first 99,999,999 bytes of a data set.

4) An optional 16-bit CRC check has been added to the end of each data set. This internal check-word allows for data set integrity checks.

5) To enable third party or off-line analysis software to correctly read and interpret data, the keyword $PnE is now required for each parameter. The $PnE keyword describes the method of amplification used for a given parameter.

6) There are a number of new optional FCS TEXT Segment keywords. $CSMODE, $CSTOT, $CSVBITS, $CSVnFLAG specify an added parameter to identify cell subsets. $CYTSN specifies the cytometer serial number. $RnI has been redefined. $TIMESTEP has been added to enable use of a time parameter. $UNICODE enables the specification of certain keywords in languages not representable with ASCII text.

7) The $DATE keyword value for year is increased by two bytes to -yyyy.

8) The following optional ANALYSIS segment keywords have been added: $CSDATE, CSDEFFILE, $CSEXP, $CSnN, and $CSnNUM to enable specification of cell subsets.

9) The definition of the $BYTEORD and $PnE keywords have been corrected and clarified. The $PnG keyword has been added, describing the linear gain applied to a signal.

10) The $COMP keyword has replaced $DFCiTOj for the description of fluorescence compensation.

5.2 Appendix B: Data File Standards Committee of the International Society for Analytical Cytology

Larry Seamer, Chair
Director, Flow Cytometry Facility
University of New Mexico
Cancer Center, Cytometry
900 Camino de Salud NE
Albuquerque, NM 87131
(505) 277-6206
lseamer@cobra.unm.edu

Bruce Bagwell
Maine Medical Center Research Institute
70 John Roberts Road, Suite 8
South Portland, ME 04106
75450.167@compuserve.com

Luther Barden
Div. of Computer Research and Technology,
Building 12A Room 2015
National Institutes of Health
9000 Rockville Pike
Bethesda, MD 20892
luther_barden@nih.gov

Marc Christofferson
Becton Dickinson Immunocytometry Systems
2350 Qume Drive
San Jose, California 95131-1807
(408) 954-2058
[N.B.: Marc Christofferson is no longer with BDIS.]

Louise E. Magruder
Division of Clinical Laboratory Devices
FDA/CDRH/ODE
72 Gaither Road
Rockville, MD 20850
lem@fdadr.cdrh.fda.gov

George Malachowski
Cytomation, Inc.
400 E. Horsetooth Rd.
Ft. Collins, CO
(303)226-2200

Robert F. Murphy
Associate Professor
Department of Biological Sciences and Center for
Light Microscope Imaging and Biotechnology
Carnegie Mellon University
4400 Fifth Avenue, Box 52
Pittsburgh, Pennsylvania 15213
(412) 268-3480
murphy+@cmu.edu

Doug Redelman
Sierra Cytometry
3150 Susileen Dr.
Reno, NV 89509

Gary C. Salzman
Life Sciences Division
Los Alamos National Laboratory
Mail Stop M888
Los Alamos, NM 87545
(505)667-5503
salzman@lanl.gov

James C.S. Wood
Coulter Corporation
Mail Code 52-A01
11800 S.W. 147th Avenue
Miami, FL 33196-2500
(305)380-2449 or 344-1290 (voice)
(305)344-5240 (FAX)
woodjcs@gate.net

� 1996 International Society for Analytical Cytology