Genomalysis User Guide
Table of Contents
- A Note
- Introduction
- Graphical Layout
- Filter Algorithms
- Notes on Algorithm Use
- Sequence Length Filter
- Regex Filter
- Transmembrane Prediction Filter
- Secretion Signal Filters
- Clustal Omega Filter
- A Primer on Regex
- Fasta Files
- Algorithms for DNA and Proteins Sequences
A Note
This user guide is for the final Java version of Genomalysis. The Genomalysis project is being migrated to Python and will likely be available in a Django web interface. The final Java version of Genomalysis focuses on the core functions of mining and viewing proteomic and genomic sequences. A legacy version of Genomalysis, which includes more features (many of them not working), is available from the downloads section of this web site.
Introduction
Genomalysis is a graphical user interface for mining proteomes and genomes for sequences of particular interest to the user. It is essentially a graphical wrapper that allows third-party algorithms to be executed in rapid succession. These algorithms test sequences in silico for specific properties: transmembrane segments, sequence patterns, secretion signals, etc. Automating the execution of such algorithms on an entire genome or proteome makes it much easier and faster for the user to extract sequences that have specific features. Genomalysis allows the user to construct multiple rules, using multiple algorithms in succession, to mine for sequences that have multiple properties of interest. In addition to the mining functions, there is a viewing function that allows the user to open input and output files and view the sequences within.
Genomalysis is currently implemented for Windows, is written in Java/Swing, and can mine genomic and proteomic information that is in the FASTA format. FASTA is a text-only format that contains sequences with headers that describe what they are. These files can be opened in any text editor.
The current filters available in Genomalysis are as follows:
- Secretion Signal Filter: This filter tests for predicted secretion signals and their associated cleavage sites in protein sequences using the PrediSi algorithm. There are three implementations of this filter in Genomalysis: one for sequences from Gram negative bacteria, one for sequences from Gram positive bacteria, and one for sequences from eukaryotic cells. The PrediSi algorithm was developed by Hiller et al.
- Clustal Omega Filter: This allows the user to filter protein or DNA sequences based on various parameters of alignment to a known sequence, such as the total number of identities, strong groups, and weak groups. Additional information about Clustal can be found at their web page: http://www.clustal.org/
- Regex Filter: This filter allows the user to apply regular expressions to test protein or DNA sequences. If you are unfamiliar with regular expressions, then read "A Primer on Regex" at the end of the filters section of this user guide.
- Transmembrane Prediction Filter: This filter tests protein sequences for predicted transmembrane segments using the single sequence version of the TMAP algorithm. The filter can be configured to find sequences that have a minimum and maximum number of transmembrane segments. The TMAP algorithm was developed by Persson and Argos.
- Sequence Length Filter: This filter tests protein or DNA sequences based on the number of monomers they contain. Sequences pass the filter if they are between a user designated minimum and maximum length.
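The idea of applying several filters in succession to a FASTA file can be illustrated with a short sketch. This is not Genomalysis code: it is a minimal Python illustration, assuming a simple FASTA layout (a header line starting with ">" followed by one or more sequence lines). The names parse_fasta, length_filter, regex_filter, and apply_filters are hypothetical, and the example sequences and the pattern are made up for demonstration. Only the length and regex filters are sketched, because the other filters depend on third-party algorithms.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple

# A record is (header text without the leading '>', sequence string).
Record = Tuple[str, str]
Filter = Callable[[Record], bool]


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def length_filter(minimum: int, maximum: int) -> Filter:
    """Pass sequences whose length is between minimum and maximum, inclusive."""
    def passes(record: Record) -> bool:
        return minimum <= len(record[1]) <= maximum
    return passes


def regex_filter(pattern: str) -> Filter:
    """Pass sequences in which the regular expression matches anywhere."""
    compiled = re.compile(pattern)
    def passes(record: Record) -> bool:
        return compiled.search(record[1]) is not None
    return passes


def apply_filters(records: Iterable[Record], filters: List[Filter]) -> List[Record]:
    """Keep only the records that pass every filter (rules applied in succession)."""
    return [r for r in records if all(f(r) for f in filters)]


# Made-up example data, for illustration only.
EXAMPLE_FASTA = """>seq1 hypothetical protein
MKKLLAACNGS
TLLA
>seq2 short peptide
MKT
>seq3 no motif
MKKLLAAAAAAAAAA
"""

records = list(parse_fasta(EXAMPLE_FASTA))
# Rule: 10 to 20 residues long AND contains N, then any residue except P, then S or T.
hits = apply_filters(records, [length_filter(10, 20), regex_filter(r"N[^P][ST]")])
for header, sequence in hits:
    print(">" + header)
    print(sequence)
```

In this sketch only seq1 passes both rules: seq2 is too short, and seq3 has the right length but does not contain the pattern. Each filter is a small function that accepts or rejects a single record, so adding another rule means adding another function to the list.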