File Formats

VAMPIRE currently supports two primary file types, Affymetrix text exports (single-sample or pivot files) and tab-delimited data tables, both of which most microarray users can produce easily. Microsoft Excel users should save worksheets as "Text (tab-delimited)" before uploading them to the VAMPIRE server. For additional examples of valid files, see the tutorial.

Do not log-transform data before loading it into VAMPIRE. Data normalized with RMA, CORGON, dChip, etc. should be used with caution: these tools have profound effects on the variance structure and can prevent it from being adequately modeled. For Affymetrix chips, we recommend MAS 5.0/GCOS scores. For Agilent chips, we recommend the processed signal intensities.

Affymetrix text file

VAMPIRE can directly import text files exported from Affymetrix GCOS or MAS 5.0. The text parser treats the first line of the file as the sample name. It then looks for a tab-delimited table with columns labeled "probe set name" and "signal". Any additional columns are ignored. Each successive line in the table is then recorded in the database. Below is an example of the Affymetrix file format.

010705A_hP1E
Sample Type: Human
Project: Olefsky
Sample User: uci-dmaf
Probe Array Type: HG-U133_Plus_2
Barcode: @52001900453325071905400306409473
Probe Array Lot: 4003064
Expiration Date: Jul 19 2005 12:00AM
Experiment User: uci-dmaf
Operator: SM
Hybridization Date: 1-31-05
Experiment Date: 1-26-05
Experiment Protocol: Standard
Array Type: HG133plus2
Array Lot Number: 4003064
Array Expired Date: 7-15-05
Algorithm: Statistical
HZ=4 VZ=4 BG=4 SmoothFactorBG=100 Epsilon=0.5 BF= Alpha1=0.05 Alpha2=0.065 Tau=0.015 Gamma1H=0.002 Gamma1L=0.002 Gamma2H=0.002667 Gamma2L=0.002667 Perturbation=1.1 TGT=500 NF=1.000000000000 SF=4.204811096191 SFGene=All Background=Avg:52.22,Stdev:0.95,Max:54.5,Min:49.9 Noise=Avg:2.68,Stdev:0.07,Max:2.9,Min:2.5 RawQ=1.60 Corner+=Avg:153,Count:32 Corner-=Avg:18621,Count:32 Central-=Avg:19633,Count:9

Probe Set Name    Stat Pairs    Stat Pairs Used    Signal    Detection    Detection p-value
AFFX-BioB-5_at    20    20    704.2    P    0.000581
AFFX-BioB-M_at    20    20    1155.3    P    0.000044
AFFX-BioB-3_at    20    20    664.0    P    0.000044
AFFX-BioC-5_at    20    20    1824.9    P    0.000052
AFFX-BioC-3_at    20    20    2619.1    P    0.000044
AFFX-BioDn-5_at    20    20    5159.5    P    0.000044
AFFX-BioDn-3_at    20    20    11575.4    P    0.000060
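
The parsing steps described above can be sketched in Python as follows. This is only an illustration, not VAMPIRE's actual parser: the function name parse_affymetrix_text is hypothetical, the case-insensitive match on the column labels is an assumption, and the result is returned as a dictionary rather than written to a database.

```python
def parse_affymetrix_text(text):
    """Return (sample_name, {probe_set_name: signal}) from a GCOS/MAS 5.0 text export.

    Hypothetical helper for illustration; not part of VAMPIRE.
    """
    lines = text.splitlines()
    # The first line of the file is taken as the sample name.
    sample_name = lines[0].strip()

    # Find the tab-delimited header row that carries both required labels.
    # Matching the labels case-insensitively is an assumption.
    header_idx = None
    for i, line in enumerate(lines[1:], start=1):
        cols = [c.strip().lower() for c in line.split("\t")]
        if "probe set name" in cols and "signal" in cols:
            header_idx = i
            name_col = cols.index("probe set name")
            signal_col = cols.index("signal")
            break
    if header_idx is None:
        raise ValueError('no "probe set name" / "signal" header row found')

    # Every later non-blank line is one probe set; other columns are ignored.
    signals = {}
    for line in lines[header_idx + 1:]:
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) <= max(name_col, signal_col):
            continue  # skip rows too short to hold both values
        signals[fields[name_col].strip()] = float(fields[signal_col])
    return sample_name, signals
```

Calling parse_affymetrix_text on the contents of a file like the one above would yield the sample name from the first line together with one signal value per probe set.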

Affymetrix pivot file

Users may also import pivot text files from Affymetrix software, which contain measurements from multiple samples in a single file. The first row should contain a blank first column, followed by sample labels in the form samplename_Signal. These sample labels will be imported directly into VAMPIRE, after stripping the _Signal suffix. Detection and description columns may also be present in the file, but these will be ignored. Below is an example of this file format.

    Sample1_Signal    Sample1_Detection    Sample2_Signal    Sample2_Detection    Descriptions
AFFX-BioB-5_at    10789.3    A    11726.4    A    E. coli GEN=bioB ...
AFFX-BioC-3_at    1021.9    A    985.7    A    E. coli GEN=bioC ...
AFFX-CreX-5_at    21    A    8.2    A    Bacteriophage P1 GEN=cre ...
AFFX-CreX-3_at    30.3    A    29.6    A    Bacteriophage P1 GEN=cre ...
AFFX-DapX-5_at    133.9    A    139.6    A    B. subtilis GEN=dapB ...
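
As a rough sketch of how a pivot file might be read, the Python function below keeps only the columns whose header ends in _Signal and strips that suffix to obtain the sample name. The function name parse_pivot_file and the nested-dictionary return shape are assumptions made for illustration; they do not describe VAMPIRE's internals.

```python
def parse_pivot_file(text, suffix="_Signal"):
    """Return {sample_name: {probe_set_name: signal}} from a pivot text file.

    Hypothetical helper for illustration; not part of VAMPIRE.
    """
    # Drop blank lines, but keep leading tabs (the header's first column is blank).
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split("\t")

    # Map column index -> sample name for every "<samplename>_Signal" column.
    # Detection and description columns do not match and are therefore ignored.
    sample_cols = {
        i: label.strip()[: -len(suffix)]
        for i, label in enumerate(header)
        if i > 0 and label.strip().endswith(suffix)
    }

    data = {name: {} for name in sample_cols.values()}
    for line in lines[1:]:
        fields = line.split("\t")
        probe = fields[0].strip()
        for i, name in sample_cols.items():
            if i < len(fields) and fields[i].strip():
                data[name][probe] = float(fields[i])
    return data
```

For the example file above, this would produce two samples, Sample1 and Sample2, each holding one signal per probe set.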

Data table

Alternatively, many users prefer to maintain their data as large worksheets containing all gene expression measurements. VAMPIRE can also import tables containing multiple samples as tab-delimited text files. Blank lines and lines beginning with the # character are ignored. The first non-ignored line is treated as the header row. The first column should be labeled "feature" and should contain the names of all microarray features/probe sets. Each subsequent column should be labeled with a sample name and contain the measurements for the corresponding microarray features. Below is an example of this file format.

# One-channel microarray tutorial data
# 3T3-L1: 24 hour TZD treatment

feature    con1    con2    con3
AFFX-MurIL2_at    18.5    35.1    135.6
AFFX-MurIL10_at    20.6    216.7    145.9
AFFX-MurIL4_at    7.3    62.8    78.8
AFFX-MurFAS_at    240.6    276.7    218.9
AFFX-BioB-5_at    2519.5    2982.3    5620.1
AFFX-BioB-M_at    5406.6    7233    13218.8
AFFX-BioB-3_at    2925.5    3147.6    6906.2
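
The rules for the data table format can be sketched in Python as shown below. The function name parse_data_table is hypothetical, and two details are assumptions rather than documented behavior: the "feature" label is compared case-insensitively, and a # is treated as a comment marker even when preceded by whitespace.

```python
def parse_data_table(text):
    """Return {sample_name: {feature: value}} from a tab-delimited data table.

    Hypothetical helper for illustration; not part of VAMPIRE.
    """
    # Ignore blank lines and comment lines that start with '#'.
    rows = [
        ln.split("\t")
        for ln in text.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    if not rows:
        raise ValueError("file contains no header row")

    # The first non-ignored line is the header: "feature", then sample names.
    header = [h.strip() for h in rows[0]]
    if header[0].lower() != "feature":
        raise ValueError('first column must be labeled "feature"')
    samples = header[1:]

    table = {sample: {} for sample in samples}
    for fields in rows[1:]:
        feature = fields[0].strip()
        # Pair each measurement with the sample named in the same column.
        for sample, value in zip(samples, fields[1:]):
            table[sample][feature] = float(value)
    return table
```

Applied to the example above, the result would contain the samples con1, con2 and con3, each mapping feature names such as AFFX-MurIL2_at to their measurements.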