The EBI has a FASTA email server.

For instructions on how to use it, send an email message to fasta@ebi.ac.uk with the single word help in the body of the message. The server replies with instructions on how to submit jobs to the EBI FASTA email server.
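The help request can also be composed programmatically. The following is a minimal sketch, assuming Python's standard-library email package; the function name build_help_request and the sender address are hypothetical, and nothing is sent. Only the recipient address and the one-word body come from the text above.

```python
from email.message import EmailMessage


def build_help_request(sender):
    """Compose (but do not send) a help request for the FASTA email server.

    `sender` is a placeholder for your own address. The text above specifies
    only the recipient and the body, so no Subject header is set here.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "fasta@ebi.ac.uk"
    # The body consists of the single word "help".
    msg.set_content("help\n")
    return msg


if __name__ == "__main__":
    request = build_help_request("joe@somewhere.there")
    print(request.as_string())
```

How the message is delivered (for example through a local mail client or an SMTP relay) depends on your own setup and is left out of the sketch.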

This is the help file:


Date: Tue, 14 Aug 2001 06:20:42 +0100 (BST)
To: opperdoes@trop.ucl.ac.be
Subject: Fasta Help
From: EMBL FASTA Server <FastA@ebi.ac.uk>

Help [fasta]

Introduction
------------

This is the main help text for using the European Bioinformatics Institute fasta3 email server. The fasta3 (1) program provides rapid and sensitive scanning of single protein or nucleic acid sequences against protein or nucleic acid sequence databases.

Databases available
-------------------

The following databases are available. These are the most recent and up-to-date databases produced at the EBI. Several databases can be searched at once; please see the LIB section for details.

Name       Description                             Sequence Input
--------------------------------------------------------------
SWALL      SWALL NON-REDUNDANT protein             Protein
           sequence database
SWISSPROT  SWISS-PROT Protein Database             -"-
SWNEW      Updates to SWISS-PROT                   -"-
TREMBL     TREMBL (Translated EMBL)                -"-
TREMBLNEW  TREMBLNEW                               -"-
PDB        Protein Structure database              -"-
SGT        Structural Genomic Targets              -"-
--------------------------------------------------------------
EMBL       The EMBL Database                       nucleic
EFUN       EMBL Fungi                              -"-
EINV       EMBL Invertebrates                      -"-
EHUM       EMBL Human                              -"-
EMAM       EMBL Mammalian                          -"-
EORG       EMBL Organelles                         -"-
EPHG       EMBL Phages                             -"-
EPLN       EMBL Plants                             -"-
EPRO       EMBL Prokaryote                         -"-
EROD       EMBL Rodents                            -"-
ESTS       EMBL STSs                               -"-
ESYN       EMBL Synthetic                          -"-
EUNA       EMBL Unclassified                       -"-
EVRL       EMBL Viral                              -"-
EVRT       EMBL Vertebrates                        -"-
EEST       EMBL ESTs                               -"-
EGSS       EMBL Genome Survey Sequences            -"-
EHTG       EMBL High Throughput Genome Sequences   -"-
EMNEW      EMBL New (Updates)                      -"-
EMALL      EMBL + EMBL New (Updates)               -"-
IMGT       IMMUNOGENETICS DATABASE                 -"-
HGBASE     HUMAN GENIC BI-ALLELIC SEQUENCES        -"-
           A EUROPEAN SNP DATABASE

Genomes and Proteomes:

The following completed genomes and their corresponding proteomes are available for searching with fasta33. The proteomes are identified by the taxonomy id.

***NOTE*** Please note that this new service is still under development.
The address for submissions to search genomes and proteomes is test-fasta@ebi.ac.uk.

Organisms                         DNA lib. name   Prot. lib. name
------------------------------------------------------------------
##### Archaea
Archaeoglobus fulgidus            2234-g          2234-p
Aeropyrum pernix K1               56636-g         56636-p
Methanococcus jannaschii          2190-g          2190-p
Methanobacterium thermoauto.      2166-g          2166-p
Pyrococcus abyssi                 29292-g         29292-p
Pyrococcus horikoshii OT3         53953-g         53953-p
Lactococcus lactis IL1403                         1360-p
#### Bacteria
Aquifex aeolicus                  63363-g         63363-p
Bacillus subtilis                 1423t-g         1423-p
Borrelia burgdorferi              139-g           139-p
Campylobacter jejuni              197-g           197-p
Chlamydia muridarum               83560-g         83560-p
Chlamydia pneumoniae              83558-g         83558-p
Chlamydophila pneumoniae CWL029   115711-g        115711-p
Chlamydophila pneumoniae ar39     115713-g        115713-p
Chlamydophila pneumoniae J138     138677-g        138677-p
Chlamydia trachomatis             813-g           813-p
E. coli K-12 MG1655               562-g           562-p
Haemophilus influenzae Rd         71421-g         71421-p
Helicobacter pylori 26695         85962-g         85962-p
Helicobacter pylori J99           85963-g         85963-p
Mycobacterium tuberculosis        1773-g          1773-p
Mycoplasma genitalium G37         2097-g          2097-p
Mycoplasma pneumoniae             2104-g          2104-p
Neisseria meningitidis sgA        122586-g        65699-p
Neisseria meningitidis sgA        487-g           491-p
Pasteurella multocida PM70                        747-p
Rhizobium sp. NGR234 Plasmid      pNGR234-g       pNGR234-p
Rickettsia prowazekii             782-g           782-p
Synechocystis PCC6803             1148-g          1148-p
Treponema pallidum                160-g           160-p
Ureaplasma urealyticum            2130-g          2130-p
Xylella fastidiosa                2371-g          2371-p
Vibrio cholerae                   666-g           666-p
Pseudomonas aeruginosa            287-g           287-p
Bacillus halodurans               86665-g         86665-p
Deinococcus radiodurans           813-g           813-p
Thermotoga maritima               1148-g          1148-p
Buchnera sp.                      17806-g         17806-p
### Eukaryotes
Caenorhabditis elegans            6239-g          6239-p
Drosophila melanogaster           7227-g          7227-p
Saccharomyces cerevisiae          4932-g          4932-p
Homo sapiens (proteome)                           9606-p

Using the fasta email server
----------------------------

Using fasta through email is simple.
Send a properly formatted normal mail message to FASTA@EBI.AC.UK and wait for the results to drop into your mailbox. Please don't send interactive messages; the software can't handle them!

The Input Format
----------------

Since fasta through email is an automatic process without any human intervention, it only understands a limited set of commands. You therefore have to adhere to a well-defined syntax, which is easy to learn and should not cause any problems. Some general rules are:

- Your mail message must contain only one command per line.
- There is only one mandatory command, SEQ. All the other commands are optional, and default values will be used whenever they are not specified.
- You can use both uppercase and lowercase characters, or mix them.
- The order of the commands is not important, but make sure that SEQ is the last one, since everything following this line will be treated as a sequence (see below).
- Blank lines or space characters are accepted.

Here is a list of valid commands that are accepted by the email server:

HELP
    You know what it's for, don't you?

PATH
    This will normally not be required, but if you want the email server to send results somewhere else, type that email address here.
    Example: PATH joe@somewhere.there

TITLE
    If you want to identify your search with a title, type that description here.
    Example: TITLE gpr-ii-rpt

PROGRAM
    Use this option to set the name of the program you want to use. The table below explains the function of each of the available programs:

    PROGRAM NAME   FUNCTION
    -------------------------------------------------------------
    fasta3         scan a protein or DNA sequence against a
                   corresponding protein or DNA library
    fastx3         compare a DNA sequence to a protein library,
                   comparing the translated DNA sequence in
                   forward and reverse frames.
    fasty3         as above.
    tfastx3        compares a protein to a translated DNA library.
    tfasty3        as above.
    fasts3         compares linked peptides from mass-spectrometry
                   of a protein to a protein library.
    fastf3         compares mixed peptides obtained by Edman
                   degradation of a CNBr cleavage of a protein to
                   a protein library.

    NOTE: Please read the FORMATS section at the bottom of this
    document for details about how to submit sequences to fastf3
    and fasts3!
    -------------------------------------------------------------

    The default program is fasta3. If no program command is specified, fasta3 will be used.
    Example: PROGRAM fastf3

LIB
    LIB can be one of the databases listed above (the default is EMALL for DNA sequences or SWALL for protein sequences). Please refer to the list of available databases above for the names and descriptions of the databases you may use.
    Example: LIB tremblnew

    Multiple databases can be chosen by using "+" as list separator. No spaces please!
    Example: LIB ehum+emam
    This would search the human and mammalian divisions of the current release of EMBL.

    Please note that multiple database choice does not allow for combinations of the following type: LIB EFUN+EPRO+EVRL. If you specify EMBL or EMALL you have already specified all divisions! Choosing EMBL already includes all EMBL divisions in the search. When such cases are detected by the system, only EFUN+EPRO+EVRL will be searched. If you want to search all divisions, please specify EMBL alone. Please refer to the available databases section for details.

SEQ
    Your sequence itself.
    Example:

    SEQ
    MPNIPTISLNDGRPFAEPGLGTYNLRGDEGVAAMVAAIDSGYRLLDTAVNYENESEVGRA
    VRASSVDRDELIVASKIPGRQHGRAEAVDSIRGSLDRLGLDVIDLQLIHWPNPSVGRWLD
    TWRGMIDAREAGLVRSIGVSNFTEPMLKTLIDETGVTPAVNQVELHPYFPQAA
    END

END
    This is required in order to tell the server program where the sequence ends. Please see the SEQ command for an example.
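The rules above (one command per line, SEQ last, END after the sequence, "+" with no spaces between library names) can be sketched as a small Python helper. This is an illustration only, not part of the EBI service: the function name build_submission, its parameters and the 60-character line width are assumptions made for the example.

```python
def build_submission(sequence, lib=None, program=None, title=None,
                     path=None, width=60):
    """Return the text of a submission message body.

    Hypothetical helper: only the command names (PATH, TITLE, PROGRAM,
    LIB, SEQ, END) and the ordering rules come from the help text.
    """
    lines = []
    if path:
        lines.append(f"PATH {path}")
    if title:
        lines.append(f"TITLE {title}")
    if program:
        lines.append(f"PROGRAM {program}")
    if lib:
        # Accept one name or several; several are joined with "+", no spaces.
        names = [lib] if isinstance(lib, str) else list(lib)
        lines.append("LIB " + "+".join(names))

    # Strip all whitespace so the sequence can be re-wrapped cleanly.
    cleaned = "".join(sequence.split())
    if not cleaned:
        raise ValueError("SEQ is the one mandatory command; sequence is empty")

    # SEQ must come last: everything after it is read as sequence until END.
    lines.append("SEQ")
    lines.extend(cleaned[i:i + width] for i in range(0, len(cleaned), width))
    lines.append("END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequence fragment, used only to show the layout.
    print(build_submission("MPNIPTISLNDGRPFAEPGLGTYNLRGDEGVAAMVAAIDSGYRLL",
                           lib=["ehum", "emam"], title="My Sequence"))
```

Emitting the optional commands first and SEQ last keeps the message valid whatever order the arguments are supplied in.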
EXAMPLE OF A SIMPLE SUBMISSION
------------------------------

PATH joe@somewhere.there
TITLE My Sequence
LIB swall
SEQ
MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFC
TDLAVSSANFIPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPA
PSAGAYSRAGVVKTMTGGRAQSIGRRGKVEQLSPEEEEKRRIRRERNKMA
AAKCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAH
RPACKIPDDLGFPEEMSVASLDLTGGLPEVATPESEEAFTLPLLNDPEPK
PSVEPVKSISSMELKTEPFDDFLFPASSRPSGSETARSVPDMDLSGSFYA
ADWEPLHSGSLGMGPMATELEPLCTPVVTCTPSCTAYTSSFVFTYPEADS
FPSCAAAHRKGSSSNEPSSDSLSSPTLLAL
END

The following commands are optional. The mail server supplies default values, so you do not have to include them in your mail if you do not wish to do so.

MATRIX
    You may decide to use a matrix other than the default, blosum62. Specify the name of the matrix here. Values accepted are: blosum50, blosum62, blosum80, pam120, pam250, mdm10, mdm20 and mdm40.
    Example: MATRIX pam250

WORD
    Word (also known as ktup) size. This is the size of the words that will be used during the comparison. For protein and nucleic acid searches the defaults are 2 and 6 respectively. These values may be set smaller. This will increase the sensitivity of the search but dramatically increase the search time. We recommend you use the default values. However, if you are submitting a very short DNA sequence (less than 30 nucleotides) you should consider changing the ktup from 6 to 1. This will significantly increase the sensitivity of the program without compromising the biological significance of the results.
    Example: WORD 4

ALIGN
    The number of alignments that will be returned in the output file. The default is 25.
    Example: ALIGN 50

LIST
    This option sets the maximum number of reported scores in the output file. The default is 50.
    Example: LIST 25

STRAND
    This is a nucleic-acid-only option which tells the program to search the reverse complement of your DNA query sequence. Valid options are 'both', 'top' and 'bottom'.
    Example: STRAND bottom

HISTOGRAM
    This will make the program include a histogram in its output. Fasta3 calculates two scores, called Init1 and Initn, for each comparison between the query sequence and the sequences in the database. The histogram shows how many comparisons were observed with a certain score.
    Example: HISTOGRAM yes

Formats
-------

Fastf3 and Fasts3 require the following format:

SEQ
MILGY,
MLLEY,
MGDAP,
MLCYN
END

After the SEQ command, list the individual peptides one per line, each terminated by a comma except for the last peptide. The END command follows after the last peptide.

References
----------

(1) W. R. Pearson and D. J. Lipman (1988), "Improved Tools for Biological Sequence Analysis", PNAS 85:2444-2448.
    W. R. Pearson (1990), "Rapid and Sensitive Sequence Comparison with FASTP and FASTA", Methods in Enzymology 183:63-98.

Fasta3 and Blast services at the EBI. Rodrigo Lopez, EMBL Outstation, The European Bioinformatics Institute. embnet.news Vol. 7.1 (1997)

Contacts
--------

European Bioinformatics Institute
Services Programme (Support)
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD, UK
Tel: +44 (0) 1223-494444
Fax: +44 (0) 1223-494468
support@ebi.ac.uk
http://www.ebi.ac.uk/fasta3

William R. Pearson
Department of Biochemistry
Box 440, Jordan Hall
U. of Virginia
Charlottesville, VA