Project Files

Each project has its own set of files. Project Files is the storage area for all your data, analyses, notes, and other materials. You can organize your files into categorical or hierarchical groups such as datasets or studies.

Supported files

tip

Choosing file names that are informative for both humans and machines is a simple step towards reproducible research and helps others find your materials.

Genotype file

The DDB Platform supports Variant Call Format (VCF) files saved in .vcf.gz format. A VCF file has a header and a body.

Header

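Mandatory lines

image

The #CHROM line contains 8 fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). When genotype data is present, these are followed by a FORMAT column and the sample names.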

image

Metadata lines

Metadata lines begin with a double hash (##), usually followed by a key such as INFO, FILTER, or FORMAT. These lines are optional.

image

Body

The body contains the data lines. Each data line describes one position in the genome.

image
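The header layout described above can be illustrated with a short Python sketch. It is not part of the DDB Platform; the function names and the sample data are hypothetical, and it only assumes the standard VCF layout (## metadata lines, then a tab-separated #CHROM line).

```python
import gzip
import io

# The 8 fixed columns of the #CHROM line, in order.
FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def read_vcf_header(stream):
    """Collect metadata lines and sample names from a text stream of VCF lines."""
    metadata = []
    samples = None
    for line in stream:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Metadata line: keep the text after the double hash.
            metadata.append(line[2:])
        elif line.startswith("#CHROM"):
            columns = line.lstrip("#").split("\t")
            if columns[:8] != FIXED_COLUMNS:
                raise ValueError("unexpected fixed columns in #CHROM line")
            # Column 9 is FORMAT; the sample names follow it.
            samples = columns[9:]
            break
    if samples is None:
        raise ValueError("no #CHROM line found")
    return metadata, samples


def read_vcf_header_from_path(path):
    """Same as read_vcf_header, but for a .vcf.gz file on disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return read_vcf_header(handle)


# Hypothetical in-memory example with two samples.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA0003\tNA0008\n"
    "1\t878\trs1\tA\tG\t.\tPASS\tDP=10\tGT\t0/0\t0/1\n"
)
metadata, samples = read_vcf_header(example)
print(len(metadata), samples)
```

Checking the sample names this way before upload can help confirm that they match the names used in your other project files.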

Official VCF file documentation

The complete VCF specification is maintained by GA4GH in the samtools/hts-specs repository.

Phenotype file

Phenotype files contain quantitative or categorical information about a sample's phenotypic traits, such as plant height, flowering time, or pericarp color. This file is used together with genotype files for genomic selection (GS) and genome-wide association study (GWAS) analyses. Phenotype files should be saved as CSV files encoded in UTF-8 (CSV UTF-8).

image

image
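As a rough pre-upload check, the sketch below confirms that a phenotype file decodes as UTF-8 and parses as CSV. The function name and the column names (sample, plant_height, pericarp_color) are hypothetical; the actual columns expected by the platform are shown in the images above, not in this example.

```python
import csv
import io


def load_phenotypes(raw_bytes):
    """Decode bytes as UTF-8 and parse them as CSV rows keyed by header name."""
    # "utf-8-sig" also accepts the byte-order mark that some spreadsheet
    # tools prepend when saving as "CSV UTF-8".
    text = raw_bytes.decode("utf-8-sig")  # raises UnicodeDecodeError if not UTF-8
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical file content; the second sample has a missing plant height.
raw = (
    b"\xef\xbb\xbfsample,plant_height,pericarp_color\n"
    b"NA0003,102.5,red\n"
    b"NA0008,,white\n"
)
rows = load_phenotypes(raw)
print(rows[0]["sample"], rows[0]["plant_height"])
```

For a file on disk, the bytes can be read with `open(path, "rb").read()` and passed to the same function.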

Crossing table file

The crossing table lists the pairs of individuals (maternal and paternal parents) to be crossed in Crossing Simulation. The file requires a header with 2 columns (ind1, ind2). Each row after the header is one cross, identified by the parents' sample names.

  • ind1: maternal parent sample name (e.g. NA0003)
  • ind2: paternal parent sample name (e.g. NA0008)

image

An example cross between sample NA0003 and NA0008

image
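A minimal sketch of reading and checking a crossing table follows. It assumes the file is comma-delimited and that both parents should appear among the sample names of your genotype file; neither assumption is stated on this page, and the function name is hypothetical.

```python
import csv
import io


def read_crosses(text, known_samples):
    """Parse crossing-table text and return (maternal, paternal) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != ["ind1", "ind2"]:
        raise ValueError("header must be exactly: ind1,ind2")
    crosses = []
    for row in reader:
        pair = (row["ind1"], row["ind2"])
        # Assumption: parents must be samples that exist in the genotype data.
        missing = [name for name in pair if name not in known_samples]
        if missing:
            raise ValueError(f"unknown sample name(s): {missing}")
        crosses.append(pair)
    return crosses


# The example cross from this page: NA0003 (maternal) x NA0008 (paternal).
table = "ind1,ind2\nNA0003,NA0008\n"
crosses = read_crosses(table, known_samples={"NA0003", "NA0008"})
print(crosses)
```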

Linkage map file

The linkage map file contains the relative location of genetic markers across the genome. It can be used to locate genes that are responsible for plant traits and diseases. The file requires a header with 4 columns:

  • chr: The chromosome number
  • physPos: The physical position of the genetic marker
  • SNPid: The genetic marker's SNP ID
  • linkMapPos: The linkage map position in centiMorgans (cM). By default, it is calculated by dividing physPos by 1,000,000

image

By default, the linkage map position (linkMapPos) is physPos divided by 1,000,000, expressed in cM (for example, 878 / 1,000,000 = 0.000878).

image
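The default calculation above can be sketched in a few lines of Python. The function names and the marker data are hypothetical; only the four column names and the division by 1,000,000 come from this page.

```python
def default_link_map_pos(phys_pos, bases_per_cm=1_000_000):
    """Default linkage map position in cM: physPos divided by 1,000,000."""
    return phys_pos / bases_per_cm


def build_linkage_rows(markers):
    """Turn (chr, physPos, SNPid) tuples into rows with the 4 required columns."""
    return [
        {
            "chr": chrom,
            "physPos": phys_pos,
            "SNPid": snp_id,
            "linkMapPos": default_link_map_pos(phys_pos),
        }
        for chrom, phys_pos, snp_id in markers
    ]


# Hypothetical markers; the first one uses the position from the example above.
linkage_rows = build_linkage_rows([(1, 878, "snp_a"), (1, 2_500_000, "snp_b")])
for row in linkage_rows:
    print(row)
```

If you have a measured genetic map, its positions would replace the default values in the linkMapPos column.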

Pedigree file

A pedigree file contains the parent information of each individual. It consists of a header (the first row) and a body, in which each line describes one individual sample.

Header

image

Header specifications

| Field | Description | Data type | Encoding | Required | Example |
|---|---|---|---|---|---|
| acquisitionDate | Date the material was acquired by the organization | Date (YYYY/MM/DD) | UTF-8 | No | 2022/06/21 |
| biologicalStatusOfAccessionCode | 3-digit numerical code that represents the genetic nature of the sample | Integer | UTF-8 | No | 410 |
| collection | A specific panel/collection/population name the sample belongs to | String | UTF-8 | No | F1_Hybrid |
| countryOfOriginCode | 3-letter ISO 3166-1 code of the country in which the sample was originally collected | String | UTF-8 | No | JPN |
| germplasmName | Name of the sample | String | UTF-8 | Yes | Basmati 217 |
| maternalName | Germplasm name of the sample's maternal parent | String | UTF-8 | No | Sample14 |
| paternalName | Germplasm name of the sample's paternal parent | String | UTF-8 | No | Sample897 |
| synonyms | Alternative names or IDs used to reference the sample. Each synonym is separated by a comma (,) | String | UTF-8 | No | NSFTV14 |
| remarks | Additional notes about the sample | String | UTF-8 | No | Additional info |

Body

image

Each data line contains information about an individual sample. There are 9 fixed fields per data line, the ones listed under Header specifications. All lines are comma-delimited. Missing values can be left empty.

tip

To avoid errors when uploading:

  • Avoid commas (,) and tabs (\t) in the actual values
  • Duplicate germplasmNames are not allowed
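The rules above can be checked before upload with a small script. The sketch below is only an illustration, not the platform's own validation: the function name is hypothetical, and it assumes the header must contain exactly the 9 field names from the header specifications, in any order.

```python
import csv
import io
from datetime import datetime

PEDIGREE_FIELDS = {
    "acquisitionDate", "biologicalStatusOfAccessionCode", "collection",
    "countryOfOriginCode", "germplasmName", "maternalName", "paternalName",
    "synonyms", "remarks",
}


def check_pedigree(text):
    """Return a list of human-readable problems found in pedigree CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    if len(header) != 9 or set(header) != PEDIGREE_FIELDS:
        return ["header must contain exactly the 9 pedigree fields"]
    problems = []
    seen = set()
    for number, row in enumerate(rows[1:], start=2):
        # A stray comma inside a value shows up as an extra field.
        if len(row) != len(header):
            problems.append(f"line {number}: expected 9 fields, found {len(row)}")
            continue
        record = dict(zip(header, row))
        name = record["germplasmName"]
        if not name:
            problems.append(f"line {number}: germplasmName is required")
        elif name in seen:
            problems.append(f"line {number}: duplicate germplasmName {name!r}")
        seen.add(name)
        if any("\t" in value for value in row):
            problems.append(f"line {number}: tab character in a value")
        date = record["acquisitionDate"]
        if date:
            try:
                datetime.strptime(date, "%Y/%m/%d")
            except ValueError:
                problems.append(f"line {number}: acquisitionDate is not YYYY/MM/DD")
    return problems


header_line = (
    "acquisitionDate,biologicalStatusOfAccessionCode,collection,"
    "countryOfOriginCode,germplasmName,maternalName,paternalName,synonyms,remarks\n"
)
first_row = "2022/06/21,410,F1_Hybrid,JPN,Basmati 217,Sample14,Sample897,NSFTV14,Additional info\n"
duplicate_row = ",,,,Basmati 217,,,,\n"  # same germplasmName, other fields empty

pedigree_problems = check_pedigree(header_line + first_row + duplicate_row)
print(pedigree_problems)
```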

Upload a file or create a folder

What you need:

  • You must be a member of a project

Steps:

  1. Inside your project, on the left sidebar, click Project Files.

    image

  2. You will be redirected to the Project Files page.

    image

  3. Click the button to upload a file or create a new folder.

    image

  4. Create a new folder or upload a file.

    image

  5. Check that each uploaded file shows a SUCCESS status. The example below shows genotype and phenotype files after upload.

    image