Comparing annotations from different annotation pipelines--UNDER CONSTRUCTION

From MAKER Wiki

This page will contain instructions for using MAKER to compare annotations generated by different pipelines.

As annotation becomes more tractable for small genome communities, and even for individual labs, it is not uncommon for an organism to be annotated multiple times. These annotations are often produced by different groups using different pipelines, evidence sets, and practices, which makes an apples-to-apples comparison challenging at best. A further complication is that many pipelines do not assign quality metrics to the features they annotate. MAKER provides a means of comparing such annotations.

MAKER assigns evidence-based quality metrics to all of the protein-coding gene annotations it produces. These quality metrics are discussed further down on this page. They also help when comparing annotations from pipelines other than MAKER: you can pass the annotations from any pipeline to MAKER in GFF3 format, and MAKER will calculate the quality metrics for those annotations. This lets you compare annotation sets from any pipeline on the same terms.

The following maker_opts.ctl settings are used to add quality metrics to protein-coding gene annotations from any annotation pipeline.

Start with the genome assembly.

genome=yourgenome.fasta

Make sure that both annotation sets you are comparing are based on the same assembly. If the assemblies differ, the coordinates in the GFF3 files will not match the assembly and the results will be meaningless. There are ways to compare annotations of different assemblies, which will be discussed later.
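One way to catch an assembly mismatch before running MAKER is to check that every feature in a GFF3 file refers to a sequence in the genome FASTA and stays within that sequence's length. The following Python sketch is not part of MAKER; the function names are hypothetical, and it assumes standard 9-column, tab-separated GFF3 and FASTA headers whose ID is the text up to the first whitespace.

```python
import sys


def read_fasta_lengths(lines):
    """Return {sequence_id: length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the sequence ID is the header text up to the first whitespace.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def find_gff3_mismatches(gff_lines, lengths):
    """Return (line_number, reason) pairs for features that do not fit the assembly."""
    problems = []
    for number, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            problems.append((number, "expected 9 tab-separated columns"))
            continue
        seqid = fields[0]
        try:
            start, end = int(fields[3]), int(fields[4])
        except ValueError:
            problems.append((number, "start or end is not an integer"))
            continue
        if seqid not in lengths:
            problems.append((number, "sequence '%s' is not in the assembly" % seqid))
        elif start < 1 or start > end or end > lengths[seqid]:
            problems.append((number, "coordinates %d-%d fall outside '%s'" % (start, end, seqid)))
    return problems


# Hypothetical command line: python check_gff3.py yourgenome.fasta annotations.gff3
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fasta:
        assembly_lengths = read_fasta_lengths(fasta)
    with open(sys.argv[2]) as gff:
        for line_number, reason in find_gff3_mismatches(gff, assembly_lengths):
            print("line %d: %s" % (line_number, reason))
```

A clean result does not prove that the annotations were built on this assembly, since two assemblies can share sequence names and lengths, but any reported line is a clear sign that the GFF3 file and the FASTA file do not belong together.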


The next step is to add the evidence.

est=assembled_mRNA_Seq.fasta
protein=protein.fasta

The evidence must be the same for each annotation set for the comparisons to be fair.
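If each annotation set is run from its own maker_opts.ctl file, a quick check that the genome and evidence settings are identical can prevent an unfair comparison. The Python sketch below is a minimal illustration, not a MAKER tool: the function names are hypothetical, and it assumes the control file uses `key=value` lines with optional `#` comments. It compares only the three options shown above (genome, est, protein).

```python
# Options that must match between runs for the comparison to be fair.
SHARED_KEYS = ("genome", "est", "protein")


def parse_ctl(lines):
    """Parse control-file lines of the assumed form 'key=value #comment' into a dict."""
    options = {}
    for line in lines:
        # Drop any trailing comment, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def differing_inputs(opts_a, opts_b, keys=SHARED_KEYS):
    """Return {key: (value_a, value_b)} for every shared key whose values differ."""
    return {
        key: (opts_a.get(key, ""), opts_b.get(key, ""))
        for key in keys
        if opts_a.get(key, "") != opts_b.get(key, "")
    }


run_a = parse_ctl([
    "genome=yourgenome.fasta #genome assembly",
    "est=assembled_mRNA_Seq.fasta",
    "protein=protein.fasta",
])
run_b = parse_ctl([
    "genome=yourgenome.fasta",
    "est=other_transcripts.fasta",  # hypothetical file name, used to show a mismatch
    "protein=protein.fasta",
])

for key, (value_a, value_b) in differing_inputs(run_a, run_b).items():
    print("%s differs: %r vs %r" % (key, value_a, value_b))
```

This compares the option values as strings only, so two different paths to the same file would be reported as a difference, and two identically named files with different contents would not.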