https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&feed=atom&action=history
Repeat Library Construction--Basic - Revision history
2024-03-28T22:30:07Z
Revision history for this page on the wiki
MediaWiki 1.33.1
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=340&oldid=prev
Admin: /* 2. Exclusion of gene fragments */
2014-06-25T18:14:28Z
<p><span dir="auto"><span class="autocomment">2. Exclusion of gene fragments</span></span></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 18:14, 25 June 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l23" >Line 23:</td>
<td colspan="2" class="diff-lineno">Line 23:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==2. Exclusion of gene fragments ==</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==2. Exclusion of gene fragments ==</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were searched against a plant protein database from which transposon proteins had been excluded. Sequences matching plant proteins (considered gene fragments), as well as 50 bp flanking sequences, were excluded. If the remainder of a sequence was shorter than 50 bp after this exclusion, the entire sequence was excluded. A package for conducting this task is available [http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluder1.<del class="diffchange diffchange-inline">0</del>.tar.gz here] ([http://weatherby.genetics.utah.edu/MAKER/data/<del class="diffchange diffchange-inline">ProtExcluderManual</del>.docx manual]).</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were searched against a plant protein database from which transposon proteins had been excluded. Sequences matching plant proteins (considered gene fragments), as well as 50 bp flanking sequences, were excluded. If the remainder of a sequence was shorter than 50 bp after this exclusion, the entire sequence was excluded. A package for conducting this task is available [http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluder1.<ins class="diffchange diffchange-inline">1</ins>.tar.gz here] ([http://weatherby.genetics.utah.edu/MAKER/data/<ins class="diffchange diffchange-inline">ProtExcluder1.1Manual</ins>.docx manual]).</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>After exclusion of putative gene fragments, the sequences in ModelerID.lib were considered known TE sequences (AtBasicTE.lib).</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>After exclusion of putative gene fragments, the sequences in ModelerID.lib were considered known TE sequences (AtBasicTE.lib).</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that the sequences in AtBasicTE.lib are relatively reliable transposons, but this library does not contain all repeats (repeat numbers are underestimated). If this library is used, certain repeats are left out and may be annotated as genes or portions of genes. On the other hand, AtBasicAllRepeat.lib is more comprehensive but may contain sequences from novel gene families that are not present in the existing plant protein database, so the repeat number may be overestimated in this library and novel gene families might be masked.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that the sequences in AtBasicTE.lib are relatively reliable transposons, but this library does not contain all repeats (repeat numbers are underestimated). If this library is used, certain repeats are left out and may be annotated as genes or portions of genes. On the other hand, AtBasicAllRepeat.lib is more comprehensive but may contain sequences from novel gene families that are not present in the existing plant protein database, so the repeat number may be overestimated in this library and novel gene families might be masked.</div></td></tr>
<!-- diff cache key maker_wiki:diff::1.12:old-167:rev-340 -->
</table>
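The exclusion rule described in this revision (drop protein-matching regions plus 50 bp flanks, then discard any sequence whose remainder is shorter than 50 bp) can be sketched roughly as follows. This is not the ProtExcluder implementation: the function name, the 0-based half-open hit coordinates, and the choice to join the surviving pieces into one sequence are all assumptions made for illustration.

```python
def exclude_gene_fragments(seq, hits, flank=50, min_len=50):
    """Remove protein-matching intervals plus flanks from a repeat sequence.

    seq   : repeat consensus sequence (str)
    hits  : list of (start, end) matches to plant proteins,
            0-based, half-open (hypothetical input format)
    Returns the joined remainder, or None if it is shorter than min_len.
    """
    # Extend every hit by the flank length, clipped to the sequence bounds.
    expanded = sorted(
        (max(0, start - flank), min(len(seq), end + flank))
        for start, end in hits
    )

    kept = []
    pos = 0
    for start, end in expanded:
        if start > pos:
            kept.append(seq[pos:start])  # region before this excluded block
        pos = max(pos, end)              # also handles overlapping blocks
    kept.append(seq[pos:])               # tail after the last excluded block

    remainder = "".join(kept)
    return remainder if len(remainder) >= min_len else None
```

For a 300 bp sequence with one hit at positions 100 to 150, the region 50 to 200 is removed and the remaining 150 bp are kept; a 120 bp sequence with a hit at 50 to 70 is dropped entirely.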
Admin
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=167&oldid=prev
Mcampbell at 22:01, 1 October 2013
2013-10-01T22:01:53Z
<p></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 22:01, 1 October 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l1" >Line 1:</td>
<td colspan="2" class="diff-lineno">Line 1:</td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>This page describes the process of generating a species-specific repeat library suitable for repeat masking prior to protein-coding gene annotation with MAKER. For a more <del class="diffchange diffchange-inline">advanced guide to </del>repetitive <del class="diffchange diffchange-inline">element identification and </del>classification, see [[Repeat Library Construction--Advanced]].</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>This page describes the process of generating a species-specific repeat library suitable for repeat masking prior to protein-coding gene annotation with MAKER<ins class="diffchange diffchange-inline">. This is achieved by a repeat collection tool (RepeatModeler) that collects sequences reaching a certain copy number. The repetitive sequences are then classified based on their similarity to known transposable elements. As a result, low-copy-number transposable elements are not included in the collection. Moreover, a substantial number of sequences cannot be classified</ins>. For a more <ins class="diffchange diffchange-inline">comprehensive collection of </ins>repetitive <ins class="diffchange diffchange-inline">elements as well as better </ins>classification, see [[Repeat Library Construction--Advanced]].</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> </div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]''</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]''</div></td></tr>
</table>
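To illustrate the classified versus unclassified split mentioned in this revision, the sketch below separates FASTA records by the classification label in each definition line. It assumes RepeatModeler-style headers of the form `>name#Class/Family`, with unclassified families labelled `Unknown`; the function name is hypothetical, and treating a header with no label as unknown is an assumption.

```python
def split_by_classification(lines):
    """Split FASTA lines into (identified, unknown) record lists.

    lines : iterable of FASTA lines, headers like '>name#Class/Family ...'
    A record goes to 'unknown' when its label starts with 'Unknown'
    or when no '#label' is present (assumption for this sketch).
    """
    identified, unknown = [], []
    current = None  # list that receives the lines of the current record
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            first = tokens[0] if tokens else ""
            label = first.partition("#")[2]  # text after '#', '' if absent
            if not label or label.lower().startswith("unknown"):
                current = unknown
            else:
                current = identified
        if current is not None:
            current.append(line)
    return identified, unknown
```

The two returned lists correspond to what the page calls ModelerID.lib and Modelerunknown.lib, and each can be written out with `"".join(...)`.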
Mcampbell
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=162&oldid=prev
Mcampbell at 23:01, 30 September 2013
2013-09-30T23:01:19Z
<p></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 23:01, 30 September 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l1" >Line 1:</td>
<td colspan="2" class="diff-lineno">Line 1:</td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">This page describes the process of generating a species specific repeat library suitable for repeat masking prior to protein coding gene annotation with MAKER. For a more advanced guide to repetitive element identification and classification see [[Repeat Library Construction--Advanced]].</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;"> </ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]''</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]''</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
</table>
Mcampbell
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=161&oldid=prev
Mcampbell: /* 2. Exclusion of gene fragments */
2013-09-28T19:35:30Z
<p><span dir="auto"><span class="autocomment">2. Exclusion of gene fragments</span></span></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 19:35, 28 September 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l19" >Line 19:</td>
<td colspan="2" class="diff-lineno">Line 19:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>== 2. Exclusion of gene fragments ==</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>==2. <ins class="diffchange diffchange-inline"> </ins>Exclusion of gene fragments ==</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were used to search against a plant protein database where <del class="diffchange diffchange-inline">proteins from transposons </del>were excluded. Sequences match the plants proteins as well as 50 bp flanking sequences were excluded. After the exclusion if the remainder sequences were shorter than 50 bp, the entire sequence was excluded. A package for conducting this task is available [http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluder1.0.tar.gz here] ([http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluderManual.docx manual]).</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were used to search against a plant protein database <ins class="diffchange diffchange-inline"> </ins>where <ins class="diffchange diffchange-inline">transposon protein </ins>were excluded. Sequences match the plants proteins <ins class="diffchange diffchange-inline">(considered as gene fragments) </ins>as well as 50 bp flanking sequences were excluded. After the exclusion if the remainder sequences were shorter than 50 bp, the entire sequence was excluded. A package for conducting this task is available <ins class="diffchange diffchange-inline">at </ins>[http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluder1.0.tar.gz here] ([http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluderManual.docx manual]).</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div><del class="diffchange diffchange-inline">*</del>After exclusion of putative gene fragments, ModelerID.lib were considered as know TE sequences (AtBasicTE.lib)</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>After exclusion of putative gene fragments, ModelerID.lib were considered as know TE sequences (AtBasicTE.lib)</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that AtBasicTE.lib does not contain all repeats (<del class="diffchange diffchange-inline">repeats </del>numbers are underestimated)<del class="diffchange diffchange-inline">; </del>AtBasicAllRepeat.lib may contain sequences from novel gene families <del class="diffchange diffchange-inline">(Repeat </del>number <del class="diffchange diffchange-inline">are </del>overestimated<del class="diffchange diffchange-inline">)</del>.</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that <ins class="diffchange diffchange-inline">the sequences in </ins>AtBasicTE.lib <ins class="diffchange diffchange-inline">are relatively reliable transposons but this library </ins>does not contain all repeats (<ins class="diffchange diffchange-inline">repeat </ins>numbers are underestimated)<ins class="diffchange diffchange-inline">. If this library is used, certain repeats are left out and maybe annotated as genes or portion of genes. On the other hand, </ins>AtBasicAllRepeat.lib <ins class="diffchange diffchange-inline">is more comprehensive but </ins>may contain sequences from novel gene families <ins class="diffchange diffchange-inline">that are not present in the existing plant protein database, so the repeat </ins>number <ins class="diffchange diffchange-inline">may be </ins>overestimated <ins class="diffchange diffchange-inline">in this library and novel gene families might be masked</ins>.</div></td></tr>
</table>
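Combining AtBasicTE.lib with the filtered Modelerunknown.lib to form AtBasicAllRepeat.lib amounts to concatenating two FASTA libraries. A minimal sketch follows; the function name is hypothetical, and skipping records whose identifier was already seen is an added safeguard that the page does not describe.

```python
def combine_libraries(*libraries):
    """Concatenate FASTA libraries, keeping the first record for each ID.

    libraries : each is an iterable of FASTA lines
    Returns a list of lines, each terminated with a newline.
    """
    seen = set()
    combined = []
    keep = False  # lines before the first header are ignored
    for library in libraries:
        for line in library:
            if line.startswith(">"):
                tokens = line[1:].split()
                seq_id = tokens[0] if tokens else ""
                keep = seq_id not in seen  # drop repeated identifiers
                seen.add(seq_id)
            if keep:
                combined.append(line if line.endswith("\n") else line + "\n")
    return combined
```

Passing the lines of AtBasicTE.lib first and those of Modelerunknown.lib second yields the combined library in that order.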
Mcampbell
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=160&oldid=prev
Mcampbell: /* 1. Collecting repetitive sequences by RepeatModeler */
2013-09-28T19:34:39Z
<p><span dir="auto"><span class="autocomment">1. Collecting repetitive sequences by RepeatModeler</span></span></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 19:34, 28 September 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l3" >Line 3:</td>
<td colspan="2" class="diff-lineno">Line 3:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Building custom repeat library for plant genomes – Basic protocol</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Building custom repeat library for plant genomes – Basic protocol</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>== 1. Collecting repetitive sequences by [http://www.repeatmasker.org/RepeatModeler.html RepeatModeler] ==</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>== 1. <ins class="diffchange diffchange-inline"> </ins>Collecting repetitive sequences by [http://www.repeatmasker.org/RepeatModeler.html RepeatModeler] ==</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>The genomic sequence (called seqfile) was processed by RepeatModeler</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>The genomic sequence (called seqfile<ins class="diffchange diffchange-inline">, in FASTA format</ins>) was processed by RepeatModeler</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>First command:</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>First command:</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> DIR/BuildDatabase -name <del class="diffchange diffchange-inline">umseqfiledb </del>-engine ncbi seqfile</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div> DIR/BuildDatabase -name <ins class="diffchange diffchange-inline">seqfiledb </ins>-engine ncbi seqfile</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*DIR = path where RepeatModeler is.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*DIR = path where RepeatModeler is.</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*“-engine ncbi” refers to <del class="diffchange diffchange-inline">that </del>NCBI blast program was used as alignment tool. </div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*“-engine ncbi” refers to <ins class="diffchange diffchange-inline">the </ins>NCBI blast program <ins class="diffchange diffchange-inline">that </ins>was used as <ins class="diffchange diffchange-inline">the </ins>alignment tool. </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Second command:</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Second command:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> nohup DIR/RepeatModeler -database seqfiledb >& seqfile.out</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> nohup DIR/RepeatModeler -database seqfiledb >& seqfile.out</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*<del class="diffchange diffchange-inline">Among </del>the sequences <del class="diffchange diffchange-inline">generated by RepeatModeler</del>, <del class="diffchange diffchange-inline">some were associated with identities and others were not</del>. <del class="diffchange diffchange-inline">These </del>with identities were put in ModelerID.lib and <del class="diffchange diffchange-inline">the others </del>were in Modelerunknown.lib.</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*<ins class="diffchange diffchange-inline">After implementation of </ins>the <ins class="diffchange diffchange-inline">commands, the RepeatModeler program generates a directory called “RM…”. Inside the directory there is a document called “consensi.fa.classified” that contains all the repetitive </ins>sequences<ins class="diffchange diffchange-inline">. The definition line of each sequence contains the sequence name and the identity in RepeatMasker format. If the sequence is unidentified</ins>, <ins class="diffchange diffchange-inline">it is marked as “Unknown”</ins>.</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins class="diffchange diffchange-inline">*In our study, those </ins>with identities were put in ModelerID.lib and <ins class="diffchange diffchange-inline">those marked “Unknown” </ins>were in Modelerunknown.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
</table>
Mcampbell
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=159&oldid=prev
Mcampbell: /* 1. Collecting repetitive sequences by RepeatModeler */
2013-09-27T20:11:02Z
<p><span dir="auto"><span class="autocomment">1. Collecting repetitive sequences by RepeatModeler</span></span></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 20:11, 27 September 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l15" >Line 15:</td>
<td colspan="2" class="diff-lineno">Line 15:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> nohup DIR/RepeatModeler -database seqfiledb >& seqfile.out</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div> nohup DIR/RepeatModeler -database seqfiledb >& seqfile.out</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Among the sequences generated by RepeatModeler, some were associated with identities and others were not. These with identities were put in ModelerID.lib and the others were in Modelerunknown.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*Among the sequences generated by RepeatModeler, some were associated with identities and others were not. These with identities were put in ModelerID.lib and the others were in Modelerunknown.lib.</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib. </div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*Sequences in Modelerunknown.lib were searched against a transposase database (derived from [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]) and sequences matching transposase were considered as transposons belonging to the relevant superfamily and were incorporated into ModelerID.lib and excluded from Modelerunknown.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== 2. Exclusion of gene fragments ==</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== 2. Exclusion of gene fragments ==</div></td></tr>
</table>
Mcampbell
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=10&oldid=prev
Mlaw at 17:32, 30 May 2013
2013-05-30T17:32:12Z
<p></p>
<table class="diff diff-contentalign-left" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #222; text-align: center;">Revision as of 17:32, 30 May 2013</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l19" >Line 19:</td>
<td colspan="2" class="diff-lineno">Line 19:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== 2. Exclusion of gene fragments ==</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== 2. Exclusion of gene fragments ==</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'>−</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were used to search against a plant protein database where proteins from transposons were excluded. Sequences match the plants proteins as well as 50 bp flanking sequences were excluded. After the exclusion if the remainder sequences were shorter than 50 bp, the entire sequence was excluded. A package for conducting this task is available [here].</div></td><td class='diff-marker'>+</td><td style="color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>*All repeats collected by RepeatModeler were used to search against a plant protein database where proteins from transposons were excluded. Sequences match the plants proteins as well as 50 bp flanking sequences were excluded. After the exclusion if the remainder sequences were shorter than 50 bp, the entire sequence was excluded. A package for conducting this task is available [<ins class="diffchange diffchange-inline">http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluder1.0.tar.gz </ins>here] <ins class="diffchange diffchange-inline">([http://weatherby.genetics.utah.edu/MAKER/data/ProtExcluderManual.docx manual])</ins>.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*After exclusion of putative gene fragments, ModelerID.lib were considered as know TE sequences (AtBasicTE.lib)</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*After exclusion of putative gene fragments, ModelerID.lib were considered as know TE sequences (AtBasicTE.lib)</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that AtBasicTE.lib does not contain all repeats (repeats numbers are underestimated); AtBasicAllRepeat.lib may contain sequences from novel gene families (Repeat number are overestimated).</div></td><td class='diff-marker'> </td><td style="background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>*It is conceivable that AtBasicTE.lib does not contain all repeats (repeats numbers are underestimated); AtBasicAllRepeat.lib may contain sequences from novel gene families (Repeat number are overestimated).</div></td></tr>
</table>
Mlaw
https://weatherby.genetics.utah.edu/MAKER/wiki/index.php?title=Repeat_Library_Construction--Basic&diff=7&oldid=prev
Mlaw: Created page with "''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]'' Building custom repeat library for plant genomes – Basic protocol == 1. Collecting repetitiv..."
2013-05-29T23:06:14Z
<p>Created page with "''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]'' Building custom repeat library for plant genomes – Basic protocol == 1. Collecting repetitiv..."</p>
<p><b>New page</b></p><div>''Content contributed by [http://www.hrt.msu.edu/ning-jiang/ Dr. Ning Jiang]''<br />
<br />
Building custom repeat library for plant genomes – Basic protocol<br />
<br />
== 1. Collecting repetitive sequences by [http://www.repeatmasker.org/RepeatModeler.html RepeatModeler] ==<br />
<br />
The genomic sequence (called seqfile) was processed with RepeatModeler.<br />
<br />
First command:<br />
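DIR/BuildDatabase -name seqfiledb -engine ncbi seqfile<br />
*DIR = path to the directory where RepeatModeler is installed.<br />
*“-engine ncbi” specifies that the NCBI BLAST program was used as the alignment tool. The database name given here (seqfiledb) must match the name passed to RepeatModeler in the second command.<br />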
<br />
Second command:<br />
nohup DIR/RepeatModeler -database seqfiledb >& seqfile.out<br />
*Among the sequences generated by RepeatModeler, some were associated with identities and others were not. Those with identities were put in ModelerID.lib and the others were put in Modelerunknown.lib.<br />
*Sequences in Modelerunknown.lib were searched against a transposase database (derived from the [http://www.repeatmasker.org/ RepeatMasker] package and [http://www.ncbi.nlm.nih.gov/pubmed/21535899 Kennedy et al (2011)]). Sequences matching a transposase were considered transposons belonging to the relevant superfamily; they were moved into ModelerID.lib and removed from Modelerunknown.lib.<br />
<br />
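The split into ModelerID.lib and Modelerunknown.lib described above could be sketched as follows. This is a minimal illustration, not the script used in the study. It assumes the RepeatModeler output uses RepeatMasker-style definition lines of the form name#class, with unidentified sequences labelled “Unknown”; the function names and the example records are hypothetical, and treating a header with no class as unknown is an assumption of this sketch.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (definition line without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_identity(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate classified records from those labelled Unknown.

    Assumption: the first whitespace-delimited token of the header is
    "name#class"; a missing class is treated as unknown.
    """
    identified, unknown = [], []
    for header, seq in records:
        name = header.split()[0]
        classification = name.partition("#")[2]
        if classification == "" or classification.startswith("Unknown"):
            unknown.append((header, seq))
        else:
            identified.append((header, seq))
    return identified, unknown


# Hypothetical records standing in for the RepeatModeler output file.
example = [
    ">rnd-1_family-0#LTR/Gypsy ( hypothetical annotation )",
    "ACGTACGT",
    "ACGT",
    ">rnd-1_family-1#Unknown",
    "TTTTGGGG",
]
identified, unknown = split_by_identity(read_fasta(example))
```

In practice the two lists would be written out as ModelerID.lib and Modelerunknown.lib; passing an open file handle to read_fasta in place of the example list works the same way.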
== 2. Exclusion of gene fragments ==<br />
<br />
*All repeats collected by RepeatModeler were searched against a plant protein database from which transposon proteins had been excluded. Regions matching plant proteins, together with 50 bp of flanking sequence on each side, were excluded. If the remainder of a sequence after exclusion was shorter than 50 bp, the entire sequence was excluded. A package for conducting this task is available [here].<br />
*After exclusion of putative gene fragments, the sequences in ModelerID.lib were considered known TE sequences (AtBasicTE.lib).<br />
*AtBasicTE.lib was combined with Modelerunknown.lib (after exclusion of gene fragments) to form AtBasicAllRepeat.lib.<br />
*It is conceivable that AtBasicTE.lib does not contain all repeats (the number of repeats is underestimated), whereas AtBasicAllRepeat.lib may contain sequences from novel gene families (the number of repeats is overestimated).</div>
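The exclusion rule in section 2 (remove each protein-matching region plus 50 bp of flanking sequence, then discard the sequence if less than 50 bp remains) can be illustrated with a short sketch. This is not the ProtExcluder implementation. The function name, the 0-based half-open hit coordinates, joining the surviving pieces into one sequence, and leaving sequences without hits untouched are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def exclude_gene_fragments(
    seq: str,
    hits: List[Tuple[int, int]],
    flank: int = 50,
    min_len: int = 50,
) -> Optional[str]:
    """Return seq with protein-matching regions removed, or None if dropped.

    hits are (start, end) intervals, 0-based and half-open (an assumption).
    """
    if not hits:
        # Assumption: sequences without protein matches are kept unchanged.
        return seq
    # Extend each hit by the flank on both sides, clipped to the sequence.
    padded = sorted(
        (max(0, start - flank), min(len(seq), end + flank)) for start, end in hits
    )
    kept, pos = [], 0
    for start, end in padded:
        if start > pos:
            kept.append(seq[pos:start])  # piece before this excluded region
        pos = max(pos, end)              # also handles overlapping regions
    if pos < len(seq):
        kept.append(seq[pos:])           # piece after the last excluded region
    remainder = "".join(kept)
    # Drop the whole sequence when too little is left after exclusion.
    return remainder if len(remainder) >= min_len else None


# Hypothetical repeat with one protein match at positions 100..130.
repeat = "A" * 100 + "C" * 30 + "G" * 200
cleaned = exclude_gene_fragments(repeat, [(100, 130)])

# Hypothetical short repeat that is almost entirely covered by a match.
dropped = exclude_gene_fragments("A" * 120, [(40, 80)])
```

Here cleaned keeps the first 50 bases and the last 150 bases of the repeat, while dropped is None because nothing remains once the match and its flanks are removed.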
Mlaw