Datasets


CSV format for QTL experiment data

There are several formats for QTL experiment data. Of these, CSV is probably the most convenient: it is easy to create and modify, highly readable, and requires only one file. The file format is described briefly below.

A CSV file is a comma-delimited text file that you can edit in any spreadsheet editor (such as MS Excel). In a CSV file for a QTL experiment, the first line should contain the phenotype names followed by the marker names. At least one phenotype must be included; for example, a numerical index for each individual can serve as one.

The second line should contain blanks in the phenotype columns, followed by chromosome identifiers for each marker in all other columns. If a chromosome has the identifier X or x, it is assumed to be the X chromosome; otherwise, it is assumed to be an autosome.
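The X-versus-autosome rule can be illustrated with a few lines of Python. This is only a sketch: the function name chromosome_type and the example row are hypothetical, and stripping surrounding whitespace before the comparison is an assumption, not documented behavior.

```python
def chromosome_type(identifier: str) -> str:
    """Classify a chromosome identifier taken from the second line of the file."""
    # Assumption: surrounding whitespace is ignored before comparing.
    return "X" if identifier.strip() in ("X", "x") else "autosome"


# Hypothetical second line: one blank phenotype column, then five markers.
chr_row = ["", "1", "1", "2", "X", "x"]
types = [chromosome_type(c) for c in chr_row[1:]]
print(types)
```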

An optional third line should contain blanks in the phenotype columns, followed by marker positions, in cM.

Marker order is taken from the cM positions, if provided; otherwise, it is taken from the column order.
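The ordering rule can be sketched as follows for the markers of a single chromosome. The function name marker_order and the marker names are hypothetical. Keeping column order for markers that share a position is an assumption of this sketch.

```python
from typing import List, Optional, Sequence


def marker_order(names: Sequence[str],
                 positions: Optional[Sequence[float]] = None) -> List[str]:
    """Return marker names in map order for one chromosome."""
    if positions is None:
        # No cM line in the file: fall back to the column order.
        return list(names)
    # sorted() is stable, so markers with equal positions keep column order.
    order = sorted(range(len(names)), key=lambda i: positions[i])
    return [names[i] for i in order]


with_positions = marker_order(["m1", "m2", "m3"], [10.0, 2.5, 7.0])
without_positions = marker_order(["m1", "m2", "m3"])
print(with_positions, without_positions)
```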

Subsequent lines should give the data, with one line for each individual, and with phenotypes followed by genotypes. If possible, phenotypes are made numeric; otherwise they are converted to factors.
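To make the layout concrete, here is a minimal Python sketch that parses a small file of this shape. Everything in it is illustrative: the sample content, the phenotype and marker names, the genotype codes A, H and B, and the function names are assumptions, and it is not the reader used by any QTL package. It assumes the third line is a position line whenever its phenotype columns are all blank, and it keeps non-numeric phenotypes as plain strings as a stand-in for factors.

```python
import csv
import io

# Hypothetical sample: two phenotypes (id, weight) and three markers.
SAMPLE = """\
id,weight,D1M1,D1M2,DXM1
,,1,1,X
,,5.0,12.3,8.1
1,23.5,A,H,A
2,19.8,H,B,B
3,21.1,B,H,A
"""


def to_numeric_or_labels(values):
    """Make a phenotype column numeric if every value parses as a number."""
    try:
        return [float(v) for v in values]
    except ValueError:
        return list(values)  # stand-in for a factor


def read_qtl_csv(text):
    rows = [r for r in csv.reader(io.StringIO(text)) if r]
    header, chr_row = rows[0], rows[1]
    # Phenotype columns are the leading columns left blank on the chromosome line.
    n_phe = 0
    while n_phe < len(chr_row) and chr_row[n_phe].strip() == "":
        n_phe += 1
    # Assumption: the third line holds cM positions if its phenotype columns are blank.
    has_pos = (len(rows) > 2 and n_phe > 0
               and all(c.strip() == "" for c in rows[2][:n_phe]))
    positions = [float(p) for p in rows[2][n_phe:]] if has_pos else None
    data = rows[3:] if has_pos else rows[2:]
    return {
        "chromosomes": chr_row[n_phe:],
        "positions": positions,
        "phenotypes": {name: to_numeric_or_labels([r[i] for r in data])
                       for i, name in enumerate(header[:n_phe])},
        "genotypes": {name: [r[n_phe + j] for r in data]
                      for j, name in enumerate(header[n_phe:])},
    }


result = read_qtl_csv(SAMPLE)
print(result["phenotypes"]["weight"])
print(result["genotypes"]["D1M2"])
```

A real reader would also need to handle missing phenotype values in the first data line, which this sketch would mistake for a position line.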

A sample CSV file can be found here.