Updating annotations in light of new data--UNDER CONSTRUCTION

From MAKER Wiki
Revision as of 20:40, 20 July 2013 by Mcampbell (talk | contribs)

Additional data can improve an existing annotation. mRNA-Seq data from additional tissues, developmental time points, and experimental conditions can provide evidence for genes that are expressed only in specific tissues, at specific developmental time points, or under certain experimental conditions. Every genome annotation project that I am aware of starts before mRNA-Seq data are available for every combination of tissue, developmental time point, and experimental condition. As work continues on a given organism, more data of this type become available, and updating the original annotation becomes desirable. You can use MAKER to do this!

Here is how the MAKER control file looks when updating a subset of an annotation set in light of new data.

emacs maker_opts.ctl

genome=yourgenome.fasta
est=newdata.fasta #new data (de novo transcript assembly)
est_gff=newdata.gff #new data in GFF3 format (alignment-based transcript assembly)
model_org=all #use the same repeat masking options as the original annotation
rmlib=custom_lib.fasta #use the same repeat masking options as the original annotation
repeat_protein=te_proteins.fasta #use the same repeat masking options as the original annotation
pred_gff=annotations_to_update.gff #a GFF3 file with the annotations you wish to update
model_gff=annotation_to_not_change #a GFF3 file with the annotations you do not want changed
keep_preds=1 #add unsupported gene predictions to the final annotation set, 1 = yes, 0 = no
map_forward=1 #map names and attributes forward from old GFF3 genes
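
Before launching a long run, it can help to confirm that the control file really contains the settings shown above. The following is a minimal sketch of such a check, not part of MAKER itself: the function names parse_ctl and check_update_options are hypothetical, the parser assumes only the key=value #comment layout seen in the excerpt above, and the rules it enforces simply mirror this example (other kinds of new evidence would need different checks).

```python
# Hypothetical helper, not shipped with MAKER. It assumes control-file lines
# of the form "key=value #comment", as in the excerpt above.

def parse_ctl(text):
    """Return a dict mapping option names to values from control-file text."""
    opts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if not line or "=" not in line:
            continue  # skip blank lines and comment-only lines
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def check_update_options(opts):
    """Return a list of problems for an annotation-update run (assumed rules)."""
    problems = []
    if not opts.get("genome"):
        problems.append("genome is not set")
    if not (opts.get("est") or opts.get("est_gff")):
        problems.append("no new transcript evidence: set est and/or est_gff")
    if not opts.get("pred_gff"):
        problems.append("pred_gff is not set: nothing to update")
    for flag in ("keep_preds", "map_forward"):
        if opts.get(flag) != "1":
            problems.append(f"{flag} is not 1")
    return problems


# Abbreviated copy of the settings from the excerpt above.
EXAMPLE = """\
genome=yourgenome.fasta
est=newdata.fasta #new data
pred_gff=annotations_to_update.gff #annotations to update
model_gff=annotation_to_not_change #annotations to keep
keep_preds=1
map_forward=1
"""

if __name__ == "__main__":
    for problem in check_update_options(parse_ctl(EXAMPLE)):
        print("WARNING:", problem)
```

To check a real file, read maker_opts.ctl into a string and pass it to parse_ctl in place of EXAMPLE. An empty list from check_update_options means none of the assumed rules were violated; it does not confirm that the referenced files exist or are valid GFF3.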