GBREL.TXT Genetic Sequence Data Bank
15 December 1990
GenBank(R) Release 66.0
Distribution Tape Release Notes
41057 loci, 51306092 bases, from 50908 reported sequences
This document describes the data written on GenBank distribution
tapes. The examples used are from the current release. If you have any
questions or comments about the data bank, the distribution tape, or
this document, please call (415)962-7364 or write to:
GenBank
c/o IntelliGenetics Inc.
700 East El Camino Real
Mountain View, California 94040
USA
The electronic mail address is: [email protected]
1. INTRODUCTION
1.1 Release 66.0
Release 66.0 has 41,057 loci representing 51,306,092 bases. Release
65.0 had 39,553 loci with 49,179,285 bases. Release 66.0 thus is 4.3%
larger in bases than Release 65.0. A statistical summary of Release
66.0 is presented in Appendix A.
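The growth figure can be checked directly from the totals quoted above. The following is a minimal Python sketch; the function name `percent_growth` is illustrative only and is not part of any GenBank software.

```python
def percent_growth(previous, current):
    """Return the percentage increase from previous to current."""
    return (current - previous) / previous * 100.0

# Totals quoted in Section 1.1 for Releases 65.0 and 66.0.
bases_65, bases_66 = 49_179_285, 51_306_092
loci_65, loci_66 = 39_553, 41_057

print(f"bases: {percent_growth(bases_65, bases_66):.1f}%")
print(f"loci:  {percent_growth(loci_65, loci_66):.1f}%")
```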
1.2 Organization of This Document
This introduction notes the changes to GenBank since the last release.
The next section describes the contents of the tape files. The third
section illustrates the formats of the tape files. The fourth section
describes the proposed changes planned for future releases. The fifth
section describes known problems in this release. The last section
contains notes about the administration of GenBank.
1.3 Recent Changes in the Data Bank
1.3.1 Changes in This Release
1.3.1.1 Data from EMBL and the DNA Data Bank of Japan
New sequence data from EMBL Release 23 have been incorporated into
this release of GenBank. Sequence data from Release 7.0 of the DNA
Data Bank of Japan (DDBJ) have also been incorporated into this
release. Entries with accession numbers beginning with the letter `D'
have been created and annotated by DDBJ. Release 66 contains
approximately 467 entries from DDBJ.
1.3.1.2 Changes in the Source and Definition Lines
The Source and Definition lines for new entries are now generated by
the GenBank database software from information in the GenBank
relational database, rather than being entered by an annotator.
These lines include information such as the organism name, the name
and type of the molecule, and the gene product. An example is:
DEFINITION A.auricula-judae (mushroom) 5S ribosomal RNA.
SOURCE A.auricula-judae (mushroom) ribosomal RNA.
In Release 66.0, these lines contain the same information, but have a
format that is less English-like. This change only applies to new
entries.
1.3.1.3 Changes in Locus Names
Several locus names have been changed to make the application of the
organism codes consistent throughout the data bank. These changes are
listed in Appendix B.
1.3.1.4 Date on LOCUS Line
The date on the LOCUS line now indicates the actual date on which the
data first appeared in the GenBank relational database or the date of
the last revision. Previously, this was the date of the release in
which the data first appeared or was revised.
1.3.1.5 Release 66 "Close-of-Data"
The freeze date for data to appear in Release 66 was November 22,
1990. This is the date on which flatfile generation began. The process
of converting the data from the relational database into the flatfile
ASCII format takes several days to complete. New data continue to be
added during this time; if data are added before their division has
been processed, they may appear in the release (even though dated
after the freeze date).
1.3.2 Changes in Earlier Releases
The following changes in GenBank format were implemented in previous
releases and described in their Release Notes. These changes are
described here again for those users who may not have received those
releases.
1.3.2.1 International Feature Table Format (Release 64)
The EMBL Data Library and GenBank, with the assistance of the DNA Data
Bank of Japan, have developed a standard feature table to be
implemented by all three data banks. The new feature table is designed
to be more understandable and useful. The common feature table will
also make software development easier and allow simpler data
conversion between data banks.
The new feature table was implemented in Release 64.0 (June 1990).
Details of the new format are described in a document entitled: `The
DDBJ/EMBL/GenBank feature table: Definition, Version 1.01, September
10, 1988.' Copies of this document are available on request at the
address given on the first page of these release notes.
Section 3.5.11 provides further information on the new feature table
format.
1.3.2.2 Minor Differences in GenBank Format (Release 64)
Beginning in the second quarter of 1990, the GenBank data bank has
been maintained in relational format. The data bank will continue to
be distributed in the standard flatfile format described in this
document. Starting with Release 64.0, the standard flatfile GenBank
data bank (generated from the relational database tables) contains a
few minor format differences from previous releases. These differences
are described in the remainder of this section.
Information about the relational database format can be requested by
calling (505) 665-2177 or by writing to:
GenBank
Group T-10, Mail Stop K710
Los Alamos National Laboratories
Los Alamos, NM 87545
1.3.2.2.1 Keyword Order
The order of keyword phrases on the KEYWORDS line is alphabetical.
1.3.2.2.2 Accession Number Order
The order of non-primary accession numbers on the ACCESSION line is
not necessarily preserved from one release to the next.
1.3.2.2.3 Reference Order
The order of the references in an entry is not necessarily preserved
from one release to the next.
1.3.2.2.4 Changes in Taxonomy
The taxonomic classification of many organisms was changed to ensure
uniformity throughout the data bank. A few organisms are listed as
unclassified as a result of inconsistencies in the data. The
annotation staff is addressing these inconsistencies.
1.3.2.2.5 Inconsistencies in Reference Information
Inconsistencies in reference information have not been preserved. All
occurrences of a single reference are identical, usually matching the
first occurrence in the data bank.
1.3.2.2.6 Formatting Changes
Certain text fields (for example, definition, comment, and title)
have been reformatted slightly in some entries. In most cases, the
actual text remains unchanged.
1.3.2.2.7 Comment Field Formatting
Any special formatting in the comment field may not be preserved. This
may be corrected in future releases.
1.3.2.2.8 Reference Line Changes
The reference lines for sites and review papers have been slightly
modified. All of these papers are classified as `sites', with the
original text describing the citation appearing in the comment field.
Also, when a reference is cited elsewhere in an entry, in most cases
the entire citation appears in square brackets. Previously, only the
number of the reference appeared in square brackets.
2. ORGANIZATION OF TAPE FILES
2.1 Tape Formats
The GenBank data bank is available in three formats on three different
physical media (see Section 5.4 for further details on which formats
are available on each medium), and on CD ROM.
GenBank is available on 9-track, unlabelled, industry-standard, ASCII
magnetic tapes. These tapes have been written in fixed-length records
of 80 characters, each with no carriage-return or line-feed
characters. Each record corresponds to one line in the data bank;
trailing blanks have been added to the lines to make them all exactly
80 characters long. (A completely blank line is therefore represented
by 80 blanks.)
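Because the records carry no line terminators, a program has to split the data every 80 characters itself. The following Python sketch shows one way to do this. It assumes the tape contents have already been copied to a byte stream or disk file; the names `read_fixed_records` and `RECORD_LENGTH` are illustrative.

```python
import io

RECORD_LENGTH = 80  # characters per record, as described above

def read_fixed_records(stream, record_length=RECORD_LENGTH):
    """Yield each fixed-length record as text with trailing blanks removed."""
    while True:
        record = stream.read(record_length)
        if not record:
            break
        if len(record) != record_length:
            raise ValueError("truncated record: %d bytes" % len(record))
        yield record.decode("ascii").rstrip(" ")

# Hypothetical two-record image: one text line and one completely blank line.
sample = b"LOCUS".ljust(80) + b" " * 80
lines = list(read_fixed_records(io.BytesIO(sample)))
print(lines)
```

A completely blank record comes back as an empty string, matching the 80-blank convention described above.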
The label affixed to the tape reel indicates its block size and
density. If no specifications are received from you, the tape is
written with a fixed block size of 160 records (12,800 characters) and
a density of 6250 bpi (bits per inch). We also offer tapes written at
a density of 1600 bpi and a block size of 40 records (3200
characters).
GenBank is also available as a VAX/VMS Backup saveset (on 9-track
tapes or TK-50 cartridges) or as compressed Unix tar archives (on
9-track tapes and Sun 1/4" QIC 24 format tape cartridges).
The GenBank tape distribution files are also available on ISO-9660
compatible CD ROM. The data are written as ASCII files with variable
length records. Each record corresponds to one line in the data bank
and ends with a carriage return and a line-feed character.
The data on the tapes have both uppercase and lowercase characters.
Upon special request, the unlabelled, 9-track tapes can be written
using uppercase characters only (Section 6.4 specifies which formats
are available in uppercase only).
2.2 Files
GenBank consists of twenty-two files in all magnetic tape
distributions. The list which follows describes each of the files
included in the distribution. In the following sections there are
additional lists indicating the breakdown of files on the various
media and formats.
2.2.1 File Descriptions
1. GBREL.TXT - Release notes (this document).
2. GBSDR.TXT - Short directory of the data bank.
3. GBNEW.TXT - List of new or substantially revised entries.
4. GBACC.IDX - Index of the entries according to accession number.
5. GBKEY.IDX - Index of the entries according to keyword phrase.
6. GBAUT.IDX - Index of the entries according to author.
7. GBJOU.IDX - Index of the entries according to journal citation.
8. GBHGM.IDX - Index of the entries according to gene symbol.
9. GBDAT.FRM - Forms for submitting sequences or corrections to GenBank.
10. GBPRI.SEQ - Primate sequence entries.
11. GBROD.SEQ - Rodent sequence entries.
12. GBMAM.SEQ - Other mammalian sequence entries.
13. GBVRT.SEQ - Other vertebrate sequence entries.
14. GBINV.SEQ - Invertebrate sequence entries.
15. GBPLN.SEQ - Plant sequence entries (including fungi and algae).
16. GBORG.SEQ - Eukaryotic organelle sequence entries.
17. GBBCT.SEQ - Bacterial sequence entries.
18. GBRNA.SEQ - Structural RNA sequence entries.
19. GBVRL.SEQ - Viral sequence entries.
20. GBPHG.SEQ - Phage sequence entries.
21. GBSYN.SEQ - Synthetic and chimeric sequence entries.
22. GBUNA.SEQ - Unannotated sequence entries.
2.2.2 Fixed Length Records
Approximately 197 MB of disk space is required for the Release 66.0
files in fixed-length record format. All the files fit on two 6250 bpi
tapes and are divided between the tapes as follows.
Tape 1
GBREL.TXT
GBSDR.TXT
GBNEW.TXT
GBACC.IDX
GBKEY.IDX
GBAUT.IDX
GBJOU.IDX
GBHGM.IDX
GBDAT.FRM
GBPRI.SEQ
GBROD.SEQ
GBMAM.SEQ
GBVRT.SEQ
GBINV.SEQ
GBPLN.SEQ
GBORG.SEQ
Tape 2
GBBCT.SEQ
GBRNA.SEQ
GBVRL.SEQ
GBPHG.SEQ
GBSYN.SEQ
GBUNA.SEQ
At 1600 bpi, seven tapes are required and the files are divided among
the tapes as follows:
Tape 1 Tape 4
GBREL.TXT GBVRT.SEQ
GBSDR.TXT GBBCT.SEQ
GBNEW.TXT
GBACC.IDX
GBKEY.IDX Tape 5
GBAUT.IDX
GBJOU.IDX GBINV.SEQ
GBHGM.IDX GBPLN.SEQ
GBDAT.FRM GBPHG.SEQ
GBMAM.SEQ
Tape 6
Tape 2
GBVRL.SEQ
GBPRI.SEQ GBSYN.SEQ
Tape 3 Tape 7
GBROD.SEQ GBUNA.SEQ
GBORG.SEQ
GBRNA.SEQ
2.2.3 VAX/VMS Backup Saveset
Saveset files are in directory order rather than in the order shown
for the formats above. The files are in compressed format (see Section
2.3 for details). Approximately 139 MB of disk space is required
for Release 66.0 files in VAX/VMS Backup Saveset format. The files
archived in the Backup Saveset use variable-length records, not the
80-character fixed-length records described above. All files fit on
one 6250 bpi tape. At 1600 bpi, two tapes are required. The division
of the files between the two tapes was not available at the time these
Release Notes were prepared. The files will appear in the following
order:
AAAREADME.TXT
DCOMPRESS.CLD
DCOMPRESS.EXE
DECMPRESS.COM
GBACC_IDX.Z
GBAUT_IDX.Z
GBBCT_SEQ.Z
GBDAT_FRM.Z
GBHGM_IDX.Z
GBINV_SEQ.Z
GBJOU_IDX.Z
GBKEY_IDX.Z
GBMAM_SEQ.Z
GBNEW_TXT.Z
GBORG_SEQ.Z
GBPHG_SEQ.Z
GBPLN_SEQ.Z
GBPRI_SEQ.Z
GBREL_TXT.Z
GBRNA_SEQ.Z
GBROD_SEQ.Z
GBSDR_TXT.Z
GBSYN_SEQ.Z
GBUNA_SEQ.Z
GBVRL_SEQ.Z
GBVRT_SEQ.Z
NOTE: When the files are uncompressed (as instructed in Section 2.3)
the `.Z' will be removed from the end of the file name and the
characters after the underscore will become the file extension. For
example, `GBACC_IDX.Z' will be named `GBACC.IDX'.
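The renaming rule in the note above can be expressed as a small function. This is only a Python sketch of the naming convention, not the DECMPRESS procedure itself; the function name `uncompressed_name` is hypothetical.

```python
def uncompressed_name(saveset_name):
    """Map a compressed saveset file name such as GBACC_IDX.Z to GBACC.IDX."""
    if not saveset_name.endswith(".Z"):
        raise ValueError("not a compressed file name: %r" % saveset_name)
    stem = saveset_name[:-2]                    # drop the trailing ".Z"
    base, underscore, extension = stem.rpartition("_")
    if not underscore:
        raise ValueError("no underscore in %r" % saveset_name)
    # The characters after the underscore become the file extension.
    return base + "." + extension

print(uncompressed_name("GBACC_IDX.Z"))
```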
One TK-50 cartridge is required; the files are in directory order and
are compressed as described above.
2.2.4 Unix tar Format
The files are compressed with the Unix compress utility before the tar
command is executed; they must therefore be uncompressed before use
(see Section 2.4 below for details). Approximately 45 MB of disk space
is required for the Release 66.0 files when in the compressed format;
the uncompressed files require approximately 139 MB.
The tar file uses variable-length records; the records are not padded
to 80 characters with space characters. To get fixed-length,
80-character records, first uncompress the .Z files, then filter each
file through dd with the conv=block and cbs=80 options set. Padding
the records adds approximately 58 MB of disk space.
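Where dd is not available, the same padding can be approximated in a few lines of Python. The sketch below is an illustration under stated assumptions, not a replacement for the procedure above: the function names are hypothetical, and lines longer than 80 characters are truncated, which is also the behavior of dd with conv=block.

```python
def pad_record(line, width=80):
    """Strip the line terminator and pad with blanks to exactly `width` characters.

    Lines longer than `width` are truncated.
    """
    text = line.rstrip("\r\n")
    return text.ljust(width)[:width]

def pad_file(source_path, target_path, width=80):
    """Copy source_path to target_path as fixed-length records with no newlines."""
    with open(source_path, "r", encoding="ascii", newline="") as src, \
         open(target_path, "w", encoding="ascii", newline="") as dst:
        for line in src:
            dst.write(pad_record(line, width))

print(repr(pad_record("LOCUS\n")))
```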
In the Unix tar file, the files are in directory order rather than in
the order shown for the fixed-length record formats. In addition, the
file names are in lowercase letters. All files fit on one 6250 bpi
tape or Sun cartridge. At 1600 bpi, two tapes are required, and the
files are divided between the tapes as follows:
Unix Tar File Order:
Tape 1
gbacc.idx.Z
gbaut.idx.Z
gbbct.seq.Z
gbdat.frm.Z
gbhgm.idx.Z
gbinv.seq.Z
gbjou.idx.Z
gbkey.idx.Z
gbmam.seq.Z
gbnew.txt.Z
gborg.seq.Z
gbphg.seq.Z
Tape 2
gbpln.seq.Z
gbpri.seq.Z
gbrel.txt.Z
gbrna.seq.Z
gbrod.seq.Z
gbsdr.txt.Z
gbsyn.seq.Z
gbuna.seq.Z
gbvrl.seq.Z
gbvrt.seq.Z
NOTE: When the files are uncompressed the `.Z' extension is removed
from the file names.
2.2.5 File Sizes
The following table indicates the approximate sizes of the individual
files in this release. Since minor changes to some of the files may
occur after the release notes are printed, these sizes should not be
used to determine file integrity. They are provided as an aid to
planning only. The columns in the table have the following meanings:
(1) - Sizes (in bytes) of the fixed-length record files (described in
Section 2.2.2)
(2) - Sizes (in bytes) of the compressed files included in the Unix
tarfile (Section 2.2.4)
(3) - Sizes (in bytes) of the files in the Unix tarfile after
uncompression (Section 2.2.4)
(4) - Sizes (in blocks) of the compressed files included in the VMS
Backup saveset (Sections 2.2.3 and 2.3)
(5) - Sizes (in blocks) of the files in the VMS Backup saveset after
decompression (Sections 2.2.3 and 2.3)
File                 (1)       (2)        (3)     (4)     (5)
GBACC.IDX        3616640    558701    1641276    1101    3294
GBAUT.IDX        9839040   1978063    5541281    4040   11106
GBBCT.SEQ       20356960   4995239   15062023   10510   30263
GBDAT.FRM          39600      8450      21155      19      43
GBHGM.IDX         168160     42557     127617      86     254
GBINV.SEQ       13488080   3145845    9751044    6643   19599
GBJOU.IDX        4474400    750259    2454640    1530    4928
GBKEY.IDX        3676480    866411    2497752    1754    4981
GBMAM.SEQ        6563120   1524109    4708672    3223    9463
GBNEW.TXT          96800     16960      55668      37     113
GBORG.SEQ        5763760   1375541    4241530    2893    8524
GBPHG.SEQ        2378720    560271    1671525    1179    3363
GBPLN.SEQ       14084400   3392955   10325013    7124   20746
GBPRI.SEQ       32726160   7473028   23428438   15823   47079
GBREL.TXT         367680     96837     298595     201     600
GBRNA.SEQ        4694720    821816    2794838    1798    5636
GBROD.SEQ       30187600   6727065   21356518   14252   42920
GBSDR.TXT        3289440   1074335    3285363    2181    6578
GBSYN.SEQ        2838560    611437    1841890    1308    3706
GBUNA.SEQ       11974000   2752439    8627808    5717   17370
GBVRL.SEQ       18141760   4437537   13647843    9285   27420
GBVRT.SEQ        7638320   1747319    5408466    3705   10877
AAAREADME.TXT                                       2       2
DCOMPRESS.CLD                                       4       4
DCOMPRESS.EXE                                     150     150
DECMPRESS.COM                                       2       2
Totals         196404400  44957174  138788955   94567  279021
NOTE: The sizes of the CD ROM files are approximately the same as
those of the uncompressed Unix tar files (Column 3). The addition of
carriage-return/line-feed characters at the end of each line in the CD
ROM files increases the total size of the distribution by
approximately 3 MB.
2.3 Loading Data Bank Files in VAX/VMS Backup Format
In order to use the VAX/VMS Backup Saveset format, you must be running
release 5.0 or greater of the VMS operating system. If you are not
running release 5.0 or greater, you should order the unlabelled ASCII
format instead of VAX/VMS Backup.
The following command should be used to load the saveset into the
current directory on your disk:
BACKUP/LOG MSA0:GENBANK []
(NOTE: Replace `MSA0' with the identifier for your tape drive.)
The following command should be used to uncompress the files. NOTE: If
you do not want to keep all of the files, delete those you do not want
before you run the uncompress procedure. The uncompress routine works
on all the files in the directory that have a `.Z' extension.
@DECMPRESS
The following commands were used to create the VAX/VMS Backup Saveset.
NOTE: The `...' indicates that the following line is a continuation
and should be typed without a break.
For 6250 bpi tape: BACKUP/DENSITY=6250/BUFFER=5/VERIFY/INTERCHANGE/...
LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK
For 1600 bpi tape: BACKUP/DENSITY=1600/BUFFER=5/VERIFY/INTERCHANGE/...
LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK
For TK-50 cartridge: BACKUP/BUFFER=5/VERIFY/INTERCHANGE/...
LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK
2.4 Loading Data Bank Files in Unix tar Format
The following commands should be used to load the Unix tar files into
the current directory on your disk:
tar xvfb /dev/rmt8 126 gb*.Z
uncompress gb*.Z
(NOTE: Replace `rmt8' with the identifier for your device.)
The following commands were used to write the tar file on the
distribution tape:
For 6250 and 1600 bpi tapes (at 1600 bpi, the command was executed
twice, once for each tape):
tar cvfb /dev/rmt8 20 gb*.Z
For Sun cartridge:
tar cvfb /dev/rst8 126 gb*.Z
3. FILE FORMATS
3.1 File Header Information
Each of the twenty-two files on the distribution tape begins with the
same header, except for the first line, which contains the file name,
and the sixth line, which contains the title of the file. The first
line of the file contains the file name in character positions 1 to 9
and the full data bank name (Genetic Sequence Data Bank) starting in
column 20. The brief names of the files in this release are listed in
section 2.2.
The second line contains the date of the current release in the form
`day month year', beginning in position 26. The fourth line contains
the current GenBank release number. The release number appears in
positions 41 to 45 and consists of two numbers separated by a decimal
point. The number to the left of the decimal is the major release
number. The digit to the right of the decimal indicates the version of
the major release; it is zero for the first version. The sixth line
contains a title for the file. The eighth line lists the number of
entries (loci), number of bases (or base pairs), and number of reports
of sequences in this release of GenBank. These numbers are
right-justified at fixed positions. The number of entries appears in
positions 1 to 7, the number of bases in positions 15 to 22, and the
number of reports in positions 36 to 40. (There are more reports of
sequences than entries since reported sequences that overlap or
duplicate each other are combined into single entries.) The third,
fifth, seventh, and ninth lines are blank.
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
GBACC.IDX          Genetic Sequence Data Bank
                         15 December 1990

                      GenBank(R) Release 66.0

                      Accession Number Index

  41057 loci, 51306092 bases, from 50908 reported sequences

---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 1. Sample File Header
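The fixed positions described in this section make the header easy to read by column. The following Python sketch illustrates this; the function names are hypothetical, and the sample header is rebuilt from the positions stated above, so the exact spacing of its release and title lines is an assumption.

```python
def column_slice(line, start, end=None):
    """Return columns start..end (1-based, inclusive) with blanks stripped."""
    return line[start - 1:end].strip()

def parse_header(lines):
    """Parse the first eight lines of a distribution file header."""
    counts = lines[7]
    return {
        "file_name": column_slice(lines[0], 1, 9),
        "data_bank": column_slice(lines[0], 20),
        "date": column_slice(lines[1], 26),
        "release": column_slice(lines[3], 41, 45),
        "title": lines[5].strip(),
        "loci": int(column_slice(counts, 1, 7)),
        "bases": int(column_slice(counts, 15, 22)),
        "reports": int(column_slice(counts, 36, 40)),
    }

# Sample header; spacing of the release and title lines is assumed.
header = [
    "GBACC.IDX".ljust(19) + "Genetic Sequence Data Bank",
    " " * 25 + "15 December 1990",
    "",
    " " * 22 + "GenBank(R) Release 66.0",
    "",
    " " * 22 + "Accession Number Index",
    "",
    "  41057 loci, 51306092 bases, from 50908 reported sequences",
]
info = parse_header(header)
print(info["release"], info["loci"], info["bases"], info["reports"])
```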
3.2 Directory Files
3.2.1 Short Directory File
The short directory file contains brief descriptions of all of the
sequence entries contained in this release. These descriptions are in
thirteen groups, one group for each of the thirteen sequence entry
data files. The first record at the beginning of a group of entries
contains the name of the group in uppercase characters, beginning in
position 21. The organism groups are PRIMATE, RODENT, OTHER MAMMAL,
OTHER VERTEBRATE, INVERTEBRATE, PLANT, ORGANELLE, BACTERIAL,
STRUCTURAL RNA, VIRAL, PHAGE, SYNTHETIC, or UNANNOTATED. The second
record is blank.
Each record in the short directory contains the sequence entry name
(LOCUS) in the first 12 positions, followed by a brief definition of
the sequence beginning in column 13. The definition is truncated (at
the end of a word) to leave room at the right margin for at least one
space, the sequence length, and the letters `bp'. The length of the
sequence is printed right-justified to column 77, followed by the
letters `bp' in columns 78 and 79. The next-to-last record for a group
has `ZZZZZZZZZZ' in its first ten positions (where the entry name
would normally appear). The last record is a blank line. An example of
the short directory file format, showing the descriptions of the last
entries in the Other Vertebrate sequence data file and the first
entries of the Invertebrate sequence data file, is reproduced below:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
ZEFHOX21    Zebrafish Hox-2.1 gene homologue (ZF-21).                     291bp
ZEFRZF21    Zebrafish mRNA for homeotic protein ZF-21.                   2073bp
ZEFZF54     Zebrafish homeotic gene ZF-54.                                246bp
ZEFZFEN     Zebrafish engrailed-like homeobox sequence.                   327bp
ZZZZZZZZZZ

                    INVERTEBRATE

ACAACTI     Amoeba (A. castellanii) actin gene-i.                        1571bp
ACAJJE      A.castellanii 18S ribosomal RNA.                              241bp
ACAJJEA     A.castellanii 18S ribosomal RNA.                              258bp
ACAJJEB     A.castellanii 18S ribosomal RNA.                              257bp
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 2. Short Directory File
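A short directory record can be split using the column positions given above. The Python sketch below handles sequence records only; group headings, `ZZZZZZZZZZ' terminators, and blank records are rejected with an error and would need to be filtered out first. The function name is hypothetical.

```python
def parse_short_directory_record(record):
    """Split one short directory record into (locus, definition, length)."""
    record = record.rstrip().ljust(79)
    if record[77:79] != "bp":
        raise ValueError("record does not have 'bp' in columns 78-79")
    locus = record[:12].strip()            # columns 1-12
    body = record[12:77]                   # definition, blank(s), length
    definition, _, length = body.rstrip().rpartition(" ")
    return locus, definition.strip(), int(length)

# Sample record built so that the length ends in column 77.
sample = ("ZEFHOX21".ljust(12)
          + "Zebrafish Hox-2.1 gene homologue (ZF-21).").ljust(74) + "291bp"
print(parse_short_directory_record(sample))
```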
3.2.2 New and Updated Entry File
The directory of new and updated entries is a list of those entries
that have been newly added or that have undergone substantive revision
in this release. These entries are listed in the same order in which
they appear in the actual data files; they are divided into thirteen
groups, one group for each of the thirteen sequence entry data files.
The first record at the beginning of a group of entries designates
that group, beginning in position 21. The second record is blank and
the third record has asterisks in its first ten positions. Within each
group, the entries are listed alphabetically. For each entry, the new
and updated entry file gives the information included under the LOCUS
and DEFINITION keywords in the same format in which they appear in the
actual sequence entry; these categories are described in section
3.5.2. After the last record of an entry comes a record containing
asterisks in its first ten positions. At the end of each group, a
dummy entry contains only a LOCUS line with the entry name
`ZZZZZZZZZZ'. Therefore, the next-to-last record has ten asterisks in
its first ten positions; the last record of the group is blank.
The following excerpt from the current release shows the last new or
revised entry from the Other Vertebrate sequence data file, followed
by the first new or revised entry from the Invertebrate sequence data
file:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
**********
LOCUS       RANCRYR23      266 bp ds-DNA             VRT       20-SEP-1990
DEFINITION  R.temporaria rho-crystallin gene, exon X.
**********
LOCUS       ZZZZZZZZZZ
**********

                    INVERTEBRATE

**********
LOCUS       ACAJJE         241 bp ss-rRNA            INV       05-NOV-1990
DEFINITION  A.castellanii 18S ribosomal RNA.
**********
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 3. New and Updated Entry File
3.3 Index Files
There are five files containing indices to the entries in this
release:
Accession number index file
Keyword phrase index file
Author name index file
Journal citation index file
Gene symbol index file
The index keys (accession numbers, keywords, authors, journals, and
gene symbols) of an index are sorted alphabetically. (The index keys
for the keyword phrases and author names appear in uppercase
characters even though they appear in mixed case in the sequence
entries.) Under each index key, the names of the sequence entries
containing that index key are listed alphabetically. Each sequence
name is also followed by its data file division and primary accession
number. The following codes are used to designate the data file
divisions:
1. PRI - primates
2. ROD - rodents
3. MAM - other mammals
4. VRT - other vertebrates
5. INV - invertebrates
6. PLN - plants, fungi, and algae
7. ORG - organelles
8. BCT - bacteria
9. RNA - structural RNAs
10. VRL - viruses
11. PHG - bacteriophage
12. SYN - synthetic sequences
13. UNA - unannotated sequences
The index key begins in column 1 of a record. An 11-character field
for the sequence entry name starts in position 14 of a record,
followed by a 3-character field for the data file division, starting
at position 25 and ending at position 27, and a 6-character field for
the primary accession number, starting at position 29 and ending at
position 34. All entries in the fields are left-justified.
Beginning at positions 36 and 58, the three fields repeat, so three
sets of sequence information can appear in one record. If there are
more than three entry names, the next records are used; the index key
is not repeated. For the accession number and human gene symbol index
files, the entry names begin in the same record as the index key,
since the key is always less than 12 characters. In the other index
files, the entry names begin on the record following the index key
record.
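The record layout above can be read with fixed column slices. The following Python sketch is one possible reader, with hypothetical function names; it assumes the nine-line file header has already been skipped and that the caller states whether the index key shares a record with its first entries (accession number and gene symbol indices) or occupies a record of its own (keyword, author, and journal indices).

```python
SET_COLUMNS = (14, 36, 58)   # 1-based starting column of each entry set

def parse_entries(record):
    """Return the (entry name, division, accession) sets in one index record."""
    record = record.rstrip("\r\n").ljust(79)
    entries = []
    for start in SET_COLUMNS:
        base = start - 1                                  # 0-based offset
        name = record[base:base + 11].strip()             # 11-character field
        division = record[base + 11:base + 14].strip()    # 3-character field
        accession = record[base + 15:base + 21].strip()   # 6-character field
        if name:
            entries.append((name, division, accession))
    return entries

def parse_index(records, key_shares_record):
    """Build {index key: [(name, division, accession), ...]} from index records."""
    index = {}
    key = None
    for record in records:
        if not record.strip():
            continue
        if record[:13].strip():                 # text in columns 1-13: new key
            if key_shares_record:
                key = record[:13].strip()
                index.setdefault(key, []).extend(parse_entries(record))
            else:
                key = record.rstrip()
                index.setdefault(key, [])
        else:                                   # continuation record
            index[key].extend(parse_entries(record))
    return index

def entry_set(name, division, accession):
    """Format one 22-column entry set (used only to build the sample)."""
    return "%-11s%-3s %-6s " % (name, division, accession)

sample = [
    "J00319".ljust(13) + entry_set("HUMUG1PA", "PRI", "J00319"),
    "J00320".ljust(13) + entry_set("HUMVIPMR1", "PRI", "L00154")
                       + entry_set("HUMVIPMR2", "PRI", "L00155")
                       + entry_set("HUMVIPMR3", "PRI", "L00156"),
    " " * 13 + entry_set("HUMVIPMR4", "PRI", "L00157")
             + entry_set("HUMVIPMR5", "PRI", "L00158"),
]
index = parse_index(sample, key_shares_record=True)
print(len(index["J00320"]))
```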
3.3.1 Accession Number Index File
Accession numbers consist of a single letter followed by five digits.
They provide an unchanging designation for the data with which they
are associated, and we encourage you to cite accession numbers
whenever you refer to data from the data bank. The primary accession
number is the first accession number of an entry. It is unique to that
entry. Citation of that number will enable other investigators to
locate the data no matter what entry name changes or other data bank
reorganizations may occur. The accession numbers, however, carry no
intrinsic information about the data.
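The accession number format lends itself to a simple check. The Python sketch below assumes the letter is uppercase, as in every example in this document; the names `ACCESSION_PATTERN` and `is_accession_number` are illustrative.

```python
import re

# One letter followed by five digits (uppercase letter assumed).
ACCESSION_PATTERN = re.compile(r"[A-Z][0-9]{5}")

def is_accession_number(text):
    """Return True if text is one uppercase letter followed by five digits."""
    return ACCESSION_PATTERN.fullmatch(text) is not None

print(is_accession_number("J00316"), is_accession_number("HUMUG1"))
```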
In addition to the primary accession number, some entries have
secondary accession numbers. Secondary accession numbers arise for a
number of reasons. For example, a single accession number may
initially be assigned to the sequence in an article. If it is later
discovered that the sequence must be entered into the data bank as
multiple entries, each entry would receive a new primary accession
number; the previous accession number would appear as the secondary
accession number in each entry.
The following excerpt from the accession number index file illustrates
the format of the index:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
J00316       HUMTBB11P  PRI J00316
J00317       HUMTBB46P  PRI J00317
J00318       HUMUG1     PRI J00318
J00319       HUMUG1PA   PRI J00319
J00320       HUMVIPMR1  PRI L00154 HUMVIPMR2  PRI L00155 HUMVIPMR3  PRI L00156
             HUMVIPMR4  PRI L00157 HUMVIPMR5  PRI L00158
J00321       BABA1AT    PRI J00321
J00322       CHPRSA     PRI J00322
J00323       AGMRSASPC  PRI J00323
J00324       BABATIII   PRI J00324
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 4. Accession Number Index File
If the same accession number is found in more than one entry (a result
of the infrequent occasions when a single entry is split into two or
more separate entries), then the additional entries and groups in
which the number appears are also given.
3.3.2 Keyword Phrase Index File
Keyword phrases consist of names for gene products and other
characteristics of sequence entries. There are approximately 12,000
keyword phrases. An excerpt from the keyword phrase index file is
shown below:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
DNA GYRASE
             ECOGYRA    BCT X06744 ECORECF    BCT K02179 ECORECFA   BCT X04341
DNA HELICASE
             ECOHELIV   BCT J04726 ECOUVRD    BCT X00738
DNA INVERTASE
             ECOPIN     BCT K00676 ECOPINP    BCT K03521 PMUGINMOM  PHG V01463
             STAINVSA   BCT M36694
DNA LIGASE
             ECOLIG     BCT M24278 ECOLIGA    BCT M30255 PT4G30     PHG X00039
             PT6LIG55   PHG M38465 PT7CG      PHG J02518 YSCCDC9    PLN X03246
             YSPCDC17   PLN X05107
DNA LIGASE I
             HUMLIGAA   PRI M36067
DNA MATURATION
             HS1CAS     VRL M22962
DNA METHYLASE
             HEHMTS     BCT J02677
DNA METHYLATION
             HEHMTS     BCT J02677 HUMSPM1    PRI X06585 HUMSPM2    PRI X06586
             HUMSPM3    PRI X06587 HUMSPM4    PRI X06588 HUMSPM5    PRI X07490
             HUMSPM6    PRI X07491 HUMSPM7    PRI X07492 HUMSPM8    PRI X07493
             HUMSPM9    PRI X07494
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 5. Keyword Phrase Index File
3.3.3 Author Name Index File
The author name index file lists all of the author names that appear
in the citations. An excerpt from the author name index file is shown
below:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
JACKOWSKI,S.
             ECOPANF    BCT M30953
JACKS,C.M.
             MUSRP32A   ROD M35397 MUSRPL32A  ROD M23453
JACKS,T.
             MMTGXPPR   VRL M16766
JACKSON,A.
             BOVMHBOLA  MAM M21044 BOVMHBOLB  MAM M21043
JACKSON,A.O.
             BSMRVPS    SYN M28702 M23023     UNA M23023 MBSRNAG    VRL M11511
             MBSRNAGND  VRL M16577 MBSRNAGSA  VRL M11509 MBSRNAGSB  VRL M11510
             MBSRNAGT   VRL M16576 SAPCAP     VRL M17182 SYENCP     VRL M17210
             SYERNA     VRL M13950 SYESC6     VRL M35689
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 6. Author Name Index File
3.3.4 Journal Citation Index File
The journal citation index file lists all of the citations that appear
in the references. All citations are truncated to 80 characters. An
excerpt from the citation index file is shown below:
1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
(IN) THE CELL NUCLEUS, VOLUME VIII: 261-305; ACADEMIC PRESS, NEW YORK (1981)
             RATUR5A    RNA K00783
(IN) THE IMMUNE SYSTEM: 132-138; S. KARGER, NEW YORK (1981).
             HUMIGHVX   PRI M35415
(IN) THE LENS: TRANSPARANCY AND CATARACT: 171-179; EURAGE, RIJSWIJK (1986)
             RANCRYG2A  VRT K02264 RANCRYG4A  VRT K02266 RANCRYG5A  VRT M22529
             RANCRYG6A  VRT M22530 RANCRYR    VRT X00659
(IN) UCLA SYMP. MOL. CELL. BIOL. NEW SER., VOL. 77: 339-352; ALAN R. LISS, INC.
             BOVTRNB2A  MAM M36431 HUMTRNB    PRI M36429 HUMTRNB1   PRI M36430
(IN) UCLA SYMPOSIA: 575-584; ALAN R. LISS, INC., NEW YORK (1987)
             PFAHGPRT   INV M54896
(IN) VIRUS RESEARCH. PROCEEDINGS OF 1973 ICN-UCLA SYMPOSIUM: 533-544; ACADEMIC
             LAMCG      PHG J02459
ACTA BIOCHIM. POL. 24, 301-318 (1977)
             LUPTRFJ    RNA K00345 LUPTRFN    RNA K00346
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79
Example 7. Journal Citation Index File
3.3.5 Cross-Reference To Gene Symbol Libraries
The gene symbol file contains the gene symbols used in the Genome Data
Base and other gene symbols, such as those for the E. coli genes. The
gene symbols are found in the feature table and have the form:
/gene="gene symbol"; an example is found in section 3.5.11.5. An
example of the format of the gene symbol index file follows:
1 10 20 30 40 50 60 70 79
---------+---------+---------+---------+---------+---------+---------+---------
INFC ECOHIMA BCT K02844 ECOTHRINF BCT V00291
INHA HUMINHA PRI M13981 HUMINHAA PRI M13144 HUMINHAG1 PRI X04445
HUMINHAG2 PRI X04446 HUMINHAG2 PRI X04446
INHBA HUMINHBA PRI M13436
INHBB HUMINHBB PRI M13437 HUMINHBB1 PRI M31668 HUMINHBB2 PRI M31669
HUMINHBB2 PRI M31669 HUMINHIB PRI M31682
INS HUMINS01 PRI J00265 HUMINS01 PRI J00265 HUMINSPR PRI M10039
HUMINV2 PRI M13903
INSR HUMINSR PRI M10051 HUMINSR01 PRI M23100 HUMINSR02 PRI M32823
HUMINSR03 PRI M32824 HUMINSR04 PRI M32825 HUMINSR05 PRI M32826
HUMINSR06 PRI M32827 HUMINSR07 PRI M32828 HUMINSR08 PRI M32829
HUMINSR09 PRI M32830 HUMINSR10 PRI M32831 HUMINSR11 PRI M32832
HUMINSR12 PRI M32833 HUMINSR13 PRI M32834 HUMINSR14 PRI M32835
HUMINSR15 PRI M32836 HUMINSR16 PRI M32837 HUMINSR17 PRI M32838
HUMINSR18 PRI M32839 HUMINSR19 PRI M32840 HUMINSR20 PRI M32841
HUMINSR21 PRI M32842 HUMINSR22 PRI M32972 HUMINSRA PRI X02160
HUMINSRA01 PRI M27195 HUMINSRA02 PRI M27197 HUMINSRB PRI J03466
HUMINSRC PRI M29929 HUMINSRD PRI M29930 HUMINSRMUT PRI M27196
HUMIRSRE PRI J05043
INT1 HUMINT1G PRI X03072
INT1L1 HUMIRP PRI X07876
IRGA VCHIRGA BCT M37773 VCHIRGA BCT M37773 VCHIRGA BCT M37773
VCHIRGA BCT M37773 VCHIRGA BCT M37773 VCHIRGB BCT M55988
VCHIRGB BCT M55988
---------+---------+---------+---------+---------+---------+---------+---------
1 10 20 30 40 50 60 70 79
Example 8. Gene Symbol Index File
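As a rough illustration, the index shown in Example 8 could be read with
a short Python routine such as the one below. This is only a sketch: the
function name parse_gene_index is hypothetical, and it assumes that the
fields are separated by white space and that gene symbols contain no
blanks (which may not hold for the entries affected by the problem
described in section 5.1). Ruler lines are not part of the file and must
not be passed in.

```python
from collections import defaultdict

def parse_gene_index(lines):
    """Map each gene symbol to a list of (locus, division, accession) triples.

    Assumption: fields are whitespace-separated. A record that starts a new
    symbol then has 3k+1 tokens; a continuation record has 3k tokens.
    """
    index = defaultdict(list)
    symbol = None
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if len(tokens) % 3 == 1:          # leading token is a new gene symbol
            symbol, tokens = tokens[0], tokens[1:]
        elif len(tokens) % 3 != 0 or symbol is None:
            raise ValueError(f"unexpected record: {line!r}")
        for i in range(0, len(tokens), 3):
            index[symbol].append(tuple(tokens[i:i + 3]))
    return dict(index)
```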
3.4 GenBank Data Submission Form and Error/Suggestion Report Form
The distribution tape includes a data submission form in the file
GBDAT.FRM. Due to the large volume of new sequence data, we encourage
authors to complete this form and return it to the address listed on
the form. This will enable data to be entered more quickly into the
data bank.
You can complete the form with any text editor. You can send the
completed form to GenBank on tape or floppy diskette, or
electronically via INTERNET or BITNET (the electronic mail address is:
gb-sub%[email protected]). We can use information saved on any computer
medium from any computer system. You can also print the form, fill it
in by hand, and send it to the mailing address given at the beginning
of the form.
The second form in this file is the GenBank Error/Suggestion Report
Form. It is separated from the Data Submission Form by a form-feed
character (<CTRL>L, ASCII octal value 014, ASCII decimal value 12). We
encourage all GenBank users to report any errors to the data bank
staff using this form. Like the GenBank Data Submission Form, it may
be printed and filled in by hand and sent by mail to the address given
at the beginning of the form. It may also be filled out using a text
editor and sent to GenBank by electronic mail at the address given at
the top of the form.
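The two forms can be separated programmatically at the form-feed
character. The following Python sketch assumes the file contents have
already been read into a string; the function name split_forms is
hypothetical.

```python
def split_forms(text):
    """Split the contents of GBDAT.FRM into its forms at form-feed characters."""
    form_feed = "\x0c"   # <CTRL>L: octal 014, decimal 12
    # Ignore empty pieces, e.g. when a form feed is the last character.
    return [part for part in text.split(form_feed) if part.strip()]
```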
If you have an IBM PC or compatible computer, or a Macintosh personal
computer, we request that you use the Authorin program for submitting
sequences to the data bank. See section 5.5 for information about
obtaining the Authorin program at no charge.
3.5 Sequence Entry Files
The distribution tape contains thirteen sequence entry data files, one
for each division of GenBank. Each file contains the entries for one
group of organisms.
3.5.1 File Organization
Each of these files has the same format and consists of two parts:
header information (described in section 3.1) and sequence entries for
that division (described in the following sections).
3.5.2 Entry Organization
In the second portion of a sequence entry file (containing the
sequence entries for that division), each record (line) consists of
two parts. The first part is found in positions 1 to 10 and may
contain:
1. A keyword, beginning in column 1 of the record (e.g., REFERENCE is
a keyword).
2. A subkeyword beginning in column 3, with columns 1 and 2 blank
(e.g., AUTHORS is a subkeyword of REFERENCE).
3. Blank characters, indicating that this record is a continuation of
the information under the keyword or subkeyword above it.
4. A code, beginning in column 6, indicating the nature of an entry
(feature key) in the FEATURES table; these codes are described in
Section 3.5.11.1 below.
5. A number, ending in column 9 of the record. This number occurs in
the portion of the entry describing the actual nucleotide sequence and
designates the numbering of sequence positions.
6. Two slashes (//) in positions 1 and 2, marking the end of an entry.
The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence.
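The six cases listed above can be told apart by inspecting the first ten
positions of a record. The Python sketch below is one possible way to do
this, not a definitive implementation; the function name classify_record
and the returned labels are hypothetical. It assumes the records contain
no tab characters and that column n corresponds to line[n - 1].

```python
def classify_record(line):
    """Return a rough label for one record of a sequence entry."""
    head = line[:10]
    if line.startswith("//"):
        return "terminator"                 # end of entry
    if not head.strip():
        return "continuation"               # positions 1 to 10 are blank
    if head[0] != " ":
        return "keyword"                    # begins in column 1
    if head[:2] == "  " and head[2:3] not in ("", " "):
        return "subkeyword"                 # begins in column 3
    if head[:9].strip().isdigit() and head[8:9].isdigit():
        return "sequence"                   # number ending in column 9
    return "feature_key"                    # anything else: a feature key
```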
The following is a brief description of each entry field. Detailed
information about each field may be found in Sections 3.5.4 to 3.5.13.
LOCUS - A short unique name for the entry, chosen to suggest the
sequence's definition. Mandatory keyword/exactly one record.
DEFINITION - A concise description of the sequence. Mandatory
keyword/one or more records.
ACCESSION - The primary accession number is a unique, unchanging
code assigned to each entry. (Please use this code when citing
information from GenBank.) Mandatory keyword/one or more records.
KEYWORDS - Short phrases describing gene products and other
information about an entry. Mandatory keyword in all annotated
entries/one or more records.
SEGMENT - Information on the order in which this entry appears in a
series of discontinuous sequences from the same molecule. Optional
keyword (only in segmented entries)/exactly one record.
SOURCE - Common name of the organism or the name most frequently used
in the literature. Mandatory keyword in all annotated entries/one or
more records/includes one subkeyword.
ORGANISM - Formal scientific name of the organism (first line)
and taxonomic classification levels (second and subsequent lines).
Mandatory subkeyword in all annotated entries/two or more records.
REFERENCE - Citations for all articles containing data reported
in this entry. Includes four subkeywords and may repeat. Mandatory
keyword/one or more records.
AUTHORS - Lists the authors of the citation. Mandatory
subkeyword/one or more records.
TITLE - Full title of citation. Optional subkeyword (present
in all but unpublished citations)/one or more records.
JOURNAL - Lists the journal name, volume, year, and page
numbers of the citation. Mandatory subkeyword/one or more records.
STANDARD - Lists information about the degree to which the
entry has been annotated and the level of review to which it has been
subjected. Mandatory subkeyword/exactly one record.
COMMENT - Cross-references to other sequence entries, comparisons to
other collections, notes of changes in LOCUS names, and other remarks.
Optional keyword/one or more records/may include blank records.
FEATURES - Table containing information on portions of the
sequence that code for proteins and RNA molecules and information on
experimentally determined sites of biological significance. Optional
keyword/one or more records.
BASE COUNT - Summary of the number of occurrences of each base
code in the sequence. Mandatory keyword/exactly one record.
ORIGIN - Specification of how the first base of the reported sequence
is operationally located within the genome. Where possible, this
includes its location within a larger genetic map. Mandatory
keyword/exactly one record.
- The ORIGIN line is followed by sequence data (multiple
records).
// - Entry termination symbol. Mandatory at the end of an
entry/exactly one record.
3.5.3 Sample Sequence Data File
An example of a complete sequence entry file follows. (This example
has only two entries.) Note that in this example, as throughout the
data bank, numbers in square brackets indicate items in the REFERENCE
list. For example, in ACARR58S, [1] refers to the paper by Mackay and
Doolittle.
1 10 20 30 40 50 60 70 79
---------+---------+---------+---------+---------+---------+---------+---------
GBSMP.SEQ Genetic Sequence Data Bank
15 December 1990
GenBank(R) Release 66.0
Structural Rna Sequences
2 loci, 280 bases, from 2 reported sequences
LOCUS AAURRA 118 bp ss-rRNA RNA 16-JUN-1986
DEFINITION A.auricula-judae (mushroom) 5S ribosomal RNA.
ACCESSION K03160
KEYWORDS 5S ribosomal RNA; ribosomal RNA.
SOURCE A.auricula-judae (mushroom) ribosomal RNA.
ORGANISM Auricularia auricula-judae
Eukaryota; Plantae; Thallobionta; Basidiomycotina;
Phragmobasidiomycetes; Heterobasidiomycetidae; Eutremellales;
Auriculariaceae; Auricularia; auricula-judae.
REFERENCE 1 (bases 1 to 118)
AUTHORS Huysmans,E., Dams,E., Vandenberghe,A. and De Wachter,R.
TITLE The nucleotide sequences of the 5S rRNAs of four mushrooms and
their use in studying the phylogenetic position of basidiomycetes
among the eukaryotes
JOURNAL Nucleic Acids Res. 11, 2871-2880 (1983)
STANDARD full staff_review
FEATURES Location/Qualifiers
rRNA 1..118
/note="5S ribosomal RNA"
BASE COUNT 27 a 34 c 34 g 23 t
ORIGIN 5' end of mature rRNA.
1 atccacggcc ataggactct gaaagcactg catcccgtcc gatctgcaaa gttaaccaga
61 gtaccgccca gttagtacca cggtggggga ccacgcggga atcctgggtg ctgtggtt
//
LOCUS ACARR58S 162 bp ss-rRNA RNA 15-MAR-1989
DEFINITION A.castellanii (amoeba) 5.8S ribosomal RNA.
ACCESSION K00471
KEYWORDS 5.8S ribosomal RNA; ribosomal RNA.
SOURCE A.castellani (amoeba; strain ATCC 30010) rRNA.
ORGANISM Acanthamoeba castellanii
Eukaryota; Animalia; Protozoa; Sarcomastigophora; Sarcodina;
Rhizopoda; Lobosa; Gymnamoeba; Amoebida; Acanthopodina;
Acanthamoebidae; Acanthamoeba; castellanii.
REFERENCE 1 (bases 1 to 162)
AUTHORS Mackay,R.M. and Doolittle,W.F.
TITLE Nucleotide sequences of AcanthamoebA.castellanii 5S and 5.8S
ribosomal ribonucleic acids: Phylogenetic and comparative
structural analyses
JOURNAL Nucleic Acids Res. 9, 3321-3334 (1981)
STANDARD simple staff_review
COMMENT [1] also sequenced A.castellanii 5S rRNA <K03160>.
FEATURES Location/Qualifiers
rRNA 1..162
/note="5.8S rRNA"
BASE COUNT 40 a 39 c 44 g 39 t
ORIGIN 5' end of mature rRNA.
1 aactcctaac aacggatatc ttggttctcg cgaggatgaa gaacgcagcg aaatgcgata
61 cgtagtgtga atcgcaggga tcagtgaatc atcgaatctt tgaacgcaag ttgcgctctc
121 gtggtttaac cccccgggag cacgttcgct tgagtgccgc tt
//
---------+---------+---------+---------+---------+---------+---------+---------
1 10 20 30 40 50 60 70 79
Example 9. Sample Sequence Data File
3.5.4 LOCUS Format
The pieces of information contained in the LOCUS record are always
found in fixed positions. The locus name (or entry name), which is
always ten characters or fewer, begins in position 13. The locus name
is designed to help group entries with similar sequences: the first
three characters usually designate the organism; the fourth and fifth
characters can be used to show other group designations, such as gene
product; for segmented entries the last character is one of a series
of sequential integers.
The number of bases or base pairs in the sequence ends in position 29.
The letters `bp' are in positions 31 to 32. Positions 34 to 36 give
the number of strands of the sequence. Positions 37 to 40 give the
type of molecule sequenced. If the sequence has a special topology,
a notation (such as `circular') is included in positions 43 to 52.
GenBank sequence entries are divided among thirteen taxonomic
divisions. Each entry's division is identified by a three-letter code
in positions 53 to 55. See Section 3.3 for the division codes.
Positions 63 to 73 of the record contain the date the entry was
entered or underwent any substantial revisions, such as the addition
of newly published data, in the form dd-MMM-yyyy.
The detailed format for the LOCUS record is as follows:
Positions Contents
1-12 LOCUS
13-22 Locus name
23-29 Length of sequence, right-justified
31-32 bp
34-36 Blank, ss- (single-stranded), ds- (double-stranded), or
ms- (mixed-stranded)
37-40 Blank, DNA, RNA, tRNA (transfer RNA), rRNA (ribosomal RNA),
mRNA (messenger RNA), or uRNA (small nuclear RNA)
43-52 Blank (implies linear), circular, or tandem
53-55 The division code (see Section 3.3)
63-73 Date, in the form dd-MMM-yyyy (e.g., 15-DEC-1990)
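Because the LOCUS fields sit in fixed positions, they can be extracted by
slicing. The Python sketch below follows the table above; the function
name parse_locus and the dictionary keys are hypothetical. Positions in
the table are 1-based and inclusive, so positions a-b correspond to the
slice line[a - 1:b].

```python
def parse_locus(line):
    """Slice a LOCUS record into its fixed-position fields."""
    line = line.ljust(73)                    # pad short records with blanks
    return {
        "name": line[12:22].strip(),         # positions 13-22
        "length": int(line[22:29]),          # positions 23-29, right-justified
        "strands": line[33:36].strip(),      # "", "ss-", "ds-" or "ms-"
        "molecule": line[36:40].strip(),     # "", "DNA", "RNA", "rRNA", ...
        "topology": line[42:52].strip() or "linear",   # blank implies linear
        "division": line[52:55],             # positions 53-55
        "date": line[62:73],                 # positions 63-73, dd-MMM-yyyy
    }
```

Note that a record must carry its original column spacing for this to
work; a line in which runs of blanks have been collapsed cannot be
sliced this way.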
3.5.5 DEFINITION Format
The DEFINITION record gives a brief description of the sequence,
proceeding from general to specific. It starts with the common name of
the source organism, then gives the criteria by which this sequence is
distinguished from the remainder of the source genome, such as the
gene name and what it codes for, or the protein name and mRNA, or some
description of the sequence's function (if the sequence is
non-coding). If the sequence has a coding region, the description may
be followed by a completeness qualifier, such as cds (complete coding
sequence). The length is limited to three lines and the last line must
end with a period.
3.5.6 ACCESSION Format
This field contains a series of six-character identifiers (accession
numbers: first character a letter, the remainder digits). The primary
(first) accession number occupies positions 13 to 18; subsequent
accession numbers occupy positions 20 to 25, 27 to 32, 34 to 39, 41 to
46, 48 to 53, 55 to 60, 62 to 67, and 69 to 74. No punctuation occurs
between accession numbers or after the final accession number;
accession numbers are separated only by one space.
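The accession field can be checked and split as in the following Python
sketch. The function name parse_accessions is hypothetical, and the
sketch assumes the leading letter is uppercase, as in the examples in
this document.

```python
import re

# One letter followed by five digits, six characters in all.
ACCESSION_RE = re.compile(r"[A-Z][0-9]{5}")

def parse_accessions(record):
    """Return (primary, secondaries) from one ACCESSION record."""
    numbers = record[12:].split()            # the field starts at position 13
    bad = [n for n in numbers if not ACCESSION_RE.fullmatch(n)]
    if bad or not numbers:
        raise ValueError(f"malformed accession field: {record!r}")
    return numbers[0], numbers[1:]
```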
3.5.7 KEYWORDS Format
The KEYWORDS field does not appear in unannotated entries, but is
required in all annotated entries. Keywords are separated by
semicolons; a keyword may be a single word or a phrase consisting of
several words. Each line in the keywords field ends in a semicolon;
the last line ends with a period. If no keywords are included in the
entry, the KEYWORDS record contains only a period.
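These rules translate into a small parser. The Python sketch below is
illustrative only; the function name parse_keywords is hypothetical. It
removes only the final period, since a keyword such as `5.8S ribosomal
RNA' may itself contain one.

```python
def parse_keywords(records):
    """Return the list of keywords from the KEYWORDS record(s) of one entry."""
    text = " ".join(r[12:].strip() for r in records)
    if not text.endswith("."):
        raise ValueError("KEYWORDS field must end with a period")
    text = text[:-1]                         # drop only the terminating period
    return [k.strip() for k in text.split(";") if k.strip()]
```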
3.5.8 SEGMENT Format
The SEGMENT keyword is used when two (or more) entries of known
relative orientation are separated by a short (<10 kb) stretch of DNA.
It is limited to one line of the form `n of m', where `n' is the
segment number of the current entry and `m' is the total number of
segments.
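A SEGMENT record can be read with a single pattern match, as in this
Python sketch (the function name parse_segment is hypothetical):

```python
import re

def parse_segment(record):
    """Return (n, m) from a SEGMENT record of the form `n of m'."""
    match = re.fullmatch(r"(\d+) of (\d+)", record[12:].strip())
    if match is None:
        raise ValueError(f"malformed SEGMENT record: {record!r}")
    n, m = int(match.group(1)), int(match.group(2))
    if not 1 <= n <= m:
        raise ValueError("segment number out of range")
    return n, m
```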
3.5.9 SOURCE Format
The SOURCE field consists of two parts. The first part is found after
the SOURCE keyword and contains free-format information including an
abbreviated form of the organism name followed by a molecule type;
multiple lines are allowed, but the last line must end with a period.
The second part consists of information found after the ORGANISM
subkeyword. The formal scientific name for the source organism (genus
and species, where appropriate) is found on the same line as ORGANISM.
The records following the ORGANISM line list the taxonomic
classification levels, separated by semicolons and ending with a
period.
3.5.10 REFERENCE Format
The REFERENCE field consists of five parts: the keyword REFERENCE, and
the subkeywords AUTHORS, TITLE (optional), JOURNAL and STANDARD.
The REFERENCE line contains the number of the particular reference and
(in parentheses) the range of bases in the sequence entry reported in
this citation. Additional prose notes may also be found within the
parentheses. The numbering of the references does not reflect
publication dates or priorities.
The AUTHORS line lists the authors in the order in which they appear
in the cited article. Last names are separated from initials by a
comma (no space); there is no comma before the final `and'. The list
of authors ends with a period.
The TITLE line is an optional field, although it appears in the
majority of entries. It does not appear in unpublished sequence data
entries that have been deposited directly into the GenBank data bank,
the EMBL Nucleotide Sequence Data Library, or the DNA Data Bank of
Japan. The TITLE field does not end with a period.
The JOURNAL line gives the appropriate literature citation for the
sequence in the entry. The word `Unpublished' will appear after the
JOURNAL subkeyword if the data did not appear in the scientific
literature, but was directly deposited into the data bank. For
published sequences the JOURNAL line gives the Thesis, Journal, or
Book citation, including the year of publication, the specific
citation, or In press.
The STANDARD line contains information about:
The degree to which the entry has been annotated:
`unannotated' for unannotated entries which include citation and
sequence only.
`simple' for partially annotated entries which include the organism name
and protein coding regions as well as the citation and sequence.
`full' for fully annotated entries which include all the data items
that were described by the author.
The level of modification and review:
`automatic' for data subjected only to automated (i.e., software)
checks.
`staff_entry' for data that passed both automated and annotator
checks.
`staff_review' for data that passed previous review levels as well as
a review by senior annotators and/or outside experts.
The format for the STANDARD line is: annotation degree <SPACE> review level
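The two parts of the STANDARD line can be validated against the values
listed above. The following Python sketch shows one way to do so; the
names parse_standard, ANNOTATION_DEGREES, and REVIEW_LEVELS are
hypothetical.

```python
ANNOTATION_DEGREES = {"unannotated", "simple", "full"}
REVIEW_LEVELS = {"automatic", "staff_entry", "staff_review"}

def parse_standard(record):
    """Return (annotation_degree, review_level) from a STANDARD record."""
    fields = record[12:].split()             # value starts at position 13
    if len(fields) != 2:
        raise ValueError(f"expected two fields: {record!r}")
    degree, review = fields
    if degree not in ANNOTATION_DEGREES or review not in REVIEW_LEVELS:
        raise ValueError(f"unknown STANDARD value: {record!r}")
    return degree, review
```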
3.5.11 FEATURES Format
This release uses the new feature table format. This format has been
designed jointly by GenBank, the EMBL Nucleotide Sequence Data
Library, and the DNA Data Bank of Japan, and will be common to all
three data banks.
The feature table contains information about genes and gene products,
including regions of the sequence that code for proteins and RNA
molecules, as well as other regions of biological significance reported
in the sequence. It also enumerates differences between different
reports of the same sequence, and provides cross-references to other
data collections, as described in more detail below.
The first line of the feature table is a header that includes the
keyword `FEATURES' and the column header `Location/Qualifiers.' Each
feature consists of a descriptor line containing a feature key and a
location (see sections below for details). If the location does not
fit on this line, a continuation line may follow. If further
information about the feature is required, one or more lines
containing feature qualifiers may follow the descriptor line.
The feature key begins in column 6 and may be no more than 15
characters in length. The location begins in column 22. Feature
qualifiers begin on subsequent lines at column 22. Location,
qualifier, and continuation lines may extend from column 22 to 80.
Feature tables are optional. However, a feature table must include one
header line and at least one feature descriptor line.
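The column layout described above is enough to group the records of a
feature table into features. The Python sketch below is one possible
approach under stated assumptions: the input holds only the records
after the FEATURES header line, and a qualifier value is considered
still open while it contains an odd number of double quotation marks
(so that a continuation line beginning with a slash inside quoted text
is not mistaken for a new qualifier). The function name parse_features
and the dictionary keys are hypothetical.

```python
def parse_features(records):
    """Group feature table records into a list of feature dictionaries."""
    features = []
    for line in records:
        # Feature key: columns 6-20. Location or qualifier: columns 22-80.
        key, body = line[5:20].strip(), line[21:80].strip()
        if key:                                       # descriptor line
            features.append({"key": key, "location": body, "qualifiers": []})
            continue
        if not features:
            raise ValueError("continuation line before any feature key")
        quals = features[-1]["qualifiers"]
        in_quotes = bool(quals) and quals[-1].count('"') % 2 == 1
        if body.startswith("/") and not in_quotes:
            quals.append(body)                        # a new qualifier
        elif quals:
            quals[-1] += " " + body                   # qualifier continuation
        else:
            features[-1]["location"] += body          # location continuation
    return features
```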
The sections below provide a brief introduction to the new feature
table format. For a thorough description of the new feature table
format, see the document `The DDBJ/EMBL/GenBank Feature Table:
Definition.' If you would like a copy of this publication, contact
GenBank at the address shown on the front page of these Release Notes.
3.5.11.1 Feature Key Names
The first column of the feature descriptor line contains the feature
key. It starts at column 6 and can continue to column 20. The list of
valid feature keys is shown below.
allele Related strain contains alternative gene form
attenuator Sequence related to transcription termination
CAAT_signal `CAAT box' in eukaryotic promoters
CDS Sequence coding for amino acids in protein (includes stop
codon)
cellular Region of cellular DNA
conflict Independent determinations differ
D-loop Displacement loop
enhancer Cis-acting enhancer of promoter function
exon Region that codes for part of spliced mRNA
GC_signal `GC box' in eukaryotic promoters
iDNA Intervening DNA eliminated by recombination
insertion_seq Insertion sequence (IS), a small transposon
intron Transcribed region excised by mRNA splicing
LTR Long terminal repeat
mat_peptide Mature peptide coding region (does not include stop codon)
misc_binding Miscellaneous binding site
misc_difference Miscellaneous difference feature
misc_feature Region of biological significance that cannot be described by
any other feature
misc_recomb Miscellaneous recombination feature
misc_RNA Miscellaneous transcript feature not defined by other RNA keys
misc_signal Miscellaneous signal
misc_structure Miscellaneous DNA or RNA structure
modified_base The indicated base is a modified nucleotide
mRNA Messenger RNA
mutation A mutation alters the sequence here
old_sequence Presented sequence revises a previous version
polyA_signal Signal for cleavage & polyadenylation
polyA_site Site at which polyadenine is added to mRNA
precursor_RNA Any RNA species that is not yet the mature RNA product
prim_transcript Primary (unprocessed) transcript
primer_bind Non-covalent primer binding site
promoter A region involved in transcription initiation
protein_bind Non-covalent protein binding site on DNA or RNA
provirus Proviral sequence
RBS Ribosome binding site
rep_origin Replication origin for duplex DNA
repeat_region Sequence containing repeated subsequences
repeat_unit One repeated unit of a repeat_region
rRNA Ribosomal RNA
satellite Satellite repeated sequence
scRNA Small cytoplasmic RNA
sig_peptide Signal peptide coding region
snRNA Small nuclear RNA
stem_loop Hair-pin loop structure in DNA or RNA
TATA_signal `TATA box' in eukaryotic promoters
terminator Sequence causing transcription termination
transit_peptide Transit peptide coding region
transposon Transposable element (TN)
tRNA Transfer RNA
unsure Authors are unsure about the sequence in this region
variation A related population contains stable mutation
virion Virion (encapsidated) viral sequence
- (hyphen) Placeholder
-10_signal `Pribnow box' in prokaryotic promoters
-35_signal `-35 box' in prokaryotic promoters
3'clip 3'-most region of a precursor transcript removed in processing
3'UTR 3' untranslated region (trailer)
5'clip 5'-most region of a precursor transcript removed in processing
5'UTR 5' untranslated region (leader)
3.5.11.2 Feature Location
The second column of the feature descriptor line designates the
location of the feature in the sequence. The location descriptor
begins at position 22. Several conventions are used to indicate
sequence location.
Base numbers in location descriptors refer to numbering in the entry,
which is not necessarily the same as the numbering scheme used in the
published report. The first base in the presented sequence is numbered
base 1. Sequences are presented in the 5' to 3' direction.
Location descriptors can be one of the following:
1. A single base;
2. A contiguous span of bases;
3. A site between two bases;
4. A single base chosen from a range of bases;
5. A single base chosen from among two or more specified bases;
6. A joining of sequence spans;
7. A reference to an entry other than the one to which the feature
belongs (i.e., a remote entry), followed by a location descriptor
referring to the remote sequence;
8. A literal sequence (a string of bases enclosed in quotation marks).
A site between two residues, such as an endonuclease cleavage site, is
indicated by listing the two bases separated by a caret (e.g., 23^24).
A single residue chosen from a range of residues is indicated by the
number of the first and last bases in the range separated by a single
period (e.g., 23.79). The symbols < and > indicate that the end point
of the range is beyond the specified base number.
A contiguous span of bases is indicated by the number of the first and
last bases in the range separated by two periods (e.g., 23..79). The
symbols < and > indicate that the end point of the range is beyond the
specified base number. Starting and ending positions can be indicated
by base number or by one of the operators described below.
Operators are prefixes that specify what must be done to the indicated
sequence to locate the feature. The following are the operators
available, along with their most common format and a description.
complement (location): The feature is complementary to the location
indicated. Complementary strands are read 5' to 3'.
join (location, location, .. location): The indicated elements should
be placed end to end to form one contiguous sequence.
order (location, location, .. location): The elements are found in the
specified order in the 5' to 3' direction, but nothing is implied
about whether it is reasonable to join them.
group (location, location, .. location): The elements are related and
should be grouped together, but no order is implied.
one-of (location, location, .. location): The element can be any one,
but only one, of the items listed.
replace (location, location): The first location indicated should be
replaced by the sequence from the second location; used for
insertions, deletions, and variants.
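The simpler location forms can be recognized with regular expressions.
The following Python sketch is illustrative and far from a complete
location parser: it reports only the outermost operator without
descending into its arguments, and it does not handle remote entry
references or literal sequences. The function name classify_location
and the returned labels are hypothetical.

```python
import re

_BASE = r"[<>]?\d+"          # a base number, optionally marked with < or >
_PATTERNS = [
    ("span", re.compile(rf"({_BASE})\.\.({_BASE})")),          # 23..79
    ("one_of_range", re.compile(rf"({_BASE})\.({_BASE})")),    # 23.79
    ("between", re.compile(r"(\d+)\^(\d+)")),                  # 23^24
    ("single", re.compile(rf"({_BASE})")),                     # 3204
]
_OPERATORS = ("complement", "join", "order", "group", "one-of", "replace")

def classify_location(text):
    """Return a tuple naming the kind of location and its components."""
    text = text.replace(" ", "")
    for op in _OPERATORS:
        if text.startswith(op + "(") and text.endswith(")"):
            return ("operator", op)
    for kind, pattern in _PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return (kind,) + match.groups()
    return ("unrecognized", text)
```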
3.5.11.3 Feature Qualifiers
Qualifiers provide additional information about features. They take
the form of a slash (/) followed by a qualifier name and, if
applicable, an equal sign (=) and a qualifier value. Feature
qualifiers begin at column 22.
Qualifiers convey many types of information. Their values can,
therefore, take several forms:
1. Free text;
2. Controlled vocabulary or enumerated values;
3. Citations or reference numbers;
4. Sequences;
5. Feature labels.
Text qualifier values must be enclosed in double quotation marks. The
text can consist of any printable characters (ASCII values 32-126
decimal). If the text string includes double quotation marks, each set
must be `escaped' by placing a double quotation mark in front of it
(e.g., /note="This is an example of ""escaped"" quotation marks").
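The escaping rule can be applied and reversed mechanically, as in this
Python sketch. The function names quote_qualifier_text and
unquote_qualifier_text are hypothetical.

```python
def quote_qualifier_text(text):
    """Wrap free text in double quotes, doubling any embedded double quote."""
    if any(not 32 <= ord(ch) <= 126 for ch in text):
        raise ValueError("only printable ASCII (32-126) is allowed")
    return '"' + text.replace('"', '""') + '"'

def unquote_qualifier_text(value):
    """Reverse quote_qualifier_text for a complete quoted value."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError(f"not a quoted value: {value!r}")
    return value[1:-1].replace('""', '"')
```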
Some qualifiers require values selected from a limited set of choices.
For example, the `/direction' qualifier has only three values: `left,'
`right,' and `both.' These are called controlled vocabulary qualifier
values. Controlled qualifier values are not case sensitive; they can
be entered in any combination of upper- and lowercase without changing
their meaning.
Citation or published reference numbers for the entry should be
enclosed in square brackets ([]) to distinguish them from other
numbers. Multiple citations are separated by commas (e.g.,
[1],[2],[3]).
A literal sequence of bases (e.g., "atgcatt") should be enclosed in
quotation marks. Literal sequences are distinguished from free text by
context. Qualifiers that take free text as their values do not take
literal sequences, and vice versa.
The `/label=' qualifier takes a feature label as its value.
Although feature labels are optional, they allow unambiguous
references to the feature. The feature label identifies a feature
within an entry; when combined with the accession number and the name
of the data bank from which it came, it is a unique tag for that
feature. Feature labels must be unique within an entry, but can be the
same as a feature label in another entry. Feature labels are not case
sensitive; they can be entered in any combination of upper- and
lowercase without changing their meaning.
The following is a list of valid feature qualifiers.
/anticodon Location of the anticodon of tRNA and the amino acid
for which it codes
/bound_moiety Moiety bound
/citation Reference to a citation providing the claim of or
evidence for a feature
/codon Specifies a codon that is different from any found in the
reference genetic code
/codon_start Indicates the reading frame of a protein coding region
/cons_splice Identifies intron splice sites that do not conform to
the 5'-GT... AG-3' splice site consensus
/direction Direction of DNA replication
/EC_number Enzyme Commission number for the enzyme product of the
sequence
/evidence Value indicating the nature of supporting evidence
/frequency Frequency of the occurrence of a feature
/function Function attributed to a sequence
/gene Symbol of the gene corresponding to a sequence region
/label A label used to permanently identify a feature
/mod_base Abbreviation for a modified nucleotide base
/note Any comment or additional information
/number A number indicating the order of genetic elements (e.g., exons
or introns) in the 5' to 3' direction
/organism Name of organism if different from that contained in
the entry's ORGANISM field
/partial Differentiates between complete regions and partial ones
/phenotype Phenotype conferred by the feature
/product Name of a product encoded by the sequence
/pseudo Indicates that this feature is a non-functional version of the
element named by the feature key
/rpt_family Type of repeated sequence; `Alu' or `Kpn,' for example
/rpt_type Organization of repeated sequence
/rpt_unit Identity of repeat unit that constitutes a
repeat_region
/standard_name Accepted standard name for this feature
/transl_except Translational exception: single codon, the translation
of which does not conform to the reference genetic code
/type Name of a strain if different from that in the SOURCE field
/usedin Indicates that feature is used in a compound feature in
another entry
3.5.11.4 Cross-Reference Information
One type of information in the feature table lists cross-references to
the annual compilation of transfer RNA sequences in Nucleic Acids
Research, which has kindly been sent to us on tape by Dr. Sprinzl.
Each tRNA entry of the feature table contains a /note= qualifier that
includes a reference such as `(NAR: 1234)' to identify code 1234 in
the NAR compilation. When such a cross-reference appears in an entry
that contains a gene coding for a transfer RNA molecule, it refers to
the code in the tRNA gene compilation. Similar cross-references in
entries containing mature transfer RNA sequences refer to the
companion compilation of tRNA sequences published by D.H. Gauss and M.
Sprinzl in Nucleic Acids Research. See section 3.5.11.5 for an
example.
The feature tables of human entries contain cross-references to the
Genome Data Base (GDB) in Baltimore, MD. GDB includes information on
mapped genes, probes, and restriction fragment length polymorphisms.
Each entry in that data bank contains the official symbol for the gene
or locus. GDB assigns each gene a unique identifier that remains
associated with that gene, regardless of changes in gene names. In
entries that contain sequences for mapped genes a /note= qualifier
includes this identifier placed within single quotes following the
term `/hgml_locus_uid='. The /note qualifier also includes the map
location in single quotes following the term `/map='. The gene symbol
formerly designated `/nomgen=' is contained in the /gene qualifier.
See section 3.5.11.5 for an example.
For more information about the Genome Data Base, contact:
Genome Data Base
1830 East Monument Street
Baltimore, MD 21205
Telephone: (203) 786-5515
3.5.11.5 Feature Table Examples
In the first example a number of key names, feature locations, and
qualifiers are illustrated, taken from different sequences. The first
table entry is a coding region consisting of a simple span of bases
and including a /gene qualifier. In the second table entry, an NAR
cross-reference is given (see the previous section for a discussion of
these cross-references). The third and fourth table entries use the
symbols `<' and `>' to indicate that the beginning or end of the
feature is beyond the range of the presented sequence. In the fifth
table entry, the symbol `^' indicates that the feature is between
bases. In the sixth table entry, the replace operator is shown.
1 10 20 30 40 50 60 70 79
---------+---------+---------+---------+---------+---------+---------+---------
CDS 5..1261
/note="alpha-1-antitrypsin precursor /map=`14q32.1'
/hgml_locus_uid=`LX0081X'"
/gene="PI"
tRNA 1..87
/note="Leu-tRNA-CAA (NAR: 1057)"
/anticodon=(pos:35..37,aa:Leu)
mRNA 1..>66
/note="alpha-1-acid glycoprotein mRNA"
transposon <1..267
/note="insertion element IS5"
misc_recomb 105^106
/note="B.subtilis DNA end/IS5 DNA start"
conflict replace(258..258,"t")
/citation=[2]
---------+---------+---------+---------+---------+---------+---------+---------
1 10 20 30 40 50 60 70 79
Example 10. Feature Table Entries
The next example shows the representation for a CDS that spans more
than one entry.
1 10 20 30 40 50 60 70 79
---------+---------+---------+---------+---------+---------+---------+---------
LOCUS HUMAPOB1 840 bp ds-DNA PRI 15-JUN-1989
DEFINITION Human apolipoprotein B-100 gene, exons 1 and 2.
ACCESSION M15053
KEYWORDS apolipoprotein B-100.
SEGMENT 1 of 2
.
.
.
FEATURES Location/Qualifiers
sig_peptide 283..354
/note="apolipoprotein B-100 signal peptide"
precursor_RNA 155..>840
/note="apoB100 mRNA"
intron 356..669
/note="apoB100 intron A"
intron 709..>840
/note="apoB100 intron B"
.
.
.
//
LOCUS HUMAPOB2 13872 bp ss-mRNA PRI 15-JUN-1989
DEFINITION Human apolipoprotein B-100 mRNA, starting at exon 3.
ACCESSION M15051 M15054
KEYWORDS apolipoprotein B-100.
SEGMENT 2 of 2
.
.
.
FEATURES Location/Qualifiers
precursor_RNA <1..13872
/note="apoB100 mRNA"
variation 3204
/note="g in lambda-B25; c in lambda B1"
CDS join(M15053:283..355,M15053:670..708,
1..13571)
/note="apolipoprotein B-100 precursor"
mat_peptide join(M15053:355..355,M15053:670..708,
1..13568)
/note="apolipoprotein B-100"
.
.
.
//
---------+---------+---------+---------+---------+---------+---------+---------
1 10 20 30 40 50 60 70 79
Example 11. Joining Sequences
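A join that mixes local spans with spans in a remote entry, as in the
CDS of HUMAPOB2 above, can be broken into its elements as follows. This
Python sketch assumes that every element is a plain span, optionally
prefixed by an accession number and a colon; the function names
parse_join and joined_length are hypothetical.

```python
import re

# Optional "ACCESSION:" prefix, then start..end (with optional < or >).
_SPAN = re.compile(r"(?:([A-Z]\d{5}):)?<?(\d+)\.\.>?(\d+)")

def parse_join(location):
    """Return [(accession_or_None, start, end), ...] for a join(...) location."""
    location = "".join(location.split())     # drop blanks and line breaks
    if not (location.startswith("join(") and location.endswith(")")):
        raise ValueError(f"not a join: {location!r}")
    spans = []
    for element in location[5:-1].split(","):
        match = _SPAN.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        spans.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return spans

def joined_length(spans):
    """Total number of bases covered by the joined spans."""
    return sum(end - start + 1 for _, start, end in spans)
```

An accession of None in the result marks a span in the entry that holds
the feature; any other value names the remote entry to be consulted.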
3.5.12 ORIGIN Format
The ORIGIN record may be left blank, may appear as `Unreported.' or
may give a local pointer to the sequence start, usually involving an
experimentally determined restriction cleavage site or the genetic
locus (if available). The ORIGIN record ends in a period if it
contains data, but does not include the period if the record is left
empty (in contrast to the KEYWORDS field which contains a period
rather than being left blank).
3.5.13 SEQUENCE Format
The nucleotide sequence for an entry is found in the records following
the ORIGIN record. The sequence is reported in the 5' to 3' direction.
There are sixty bases per record, listed in groups of ten bases
followed by a blank, starting at position 11 of each record. The
number of the first nucleotide in the record is given in columns 4 to
9 (right justified) of the record.
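The record layout just described can be sketched in Python as follows. The
function names and the demonstration sequence are made up for illustration.
The sketch assumes that columns 1 to 3 and column 10 are blank and that no
blank is written after the last group on a line; neither detail is stated
explicitly above.

```python
def format_sequence_records(sequence):
    """Yield SEQUENCE records: 60 bases each, in blank-separated groups of 10."""
    for offset in range(0, len(sequence), 60):
        chunk = sequence[offset:offset + 60]
        groups = [chunk[i:i + 10] for i in range(0, len(chunk), 10)]
        # Columns 1-3 blank (assumed), columns 4-9 the right-justified number
        # of the first base, column 10 blank, bases from column 11 onward.
        yield "   {:>6} {}".format(offset + 1, " ".join(groups))


def parse_sequence_record(record):
    """Return (number of first base, bases) for one SEQUENCE record."""
    first = int(record[3:9])               # columns 4 to 9
    bases = record[10:].replace(" ", "")   # column 11 onward, blanks removed
    return first, bases


demo = "acgt" * 20  # made-up 80-base sequence, for illustration only
records = list(format_sequence_records(demo))
for record in records:
    print(record)
```

Reading the records back with parse_sequence_record and concatenating the
bases recovers the original sequence, which is a convenient consistency
check when writing a reader for this format.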
4. FUTURE RELEASES
4.1 Changes Planned for Release 67.0
No changes are planned for Release 67.0.
5. KNOWN PROBLEMS WITH THE GENBANK DATABASE
5.1 Incorrect Gene Symbols in Entries and Index
The /gene qualifier should contain gene symbols. In this release,
however, the /gene qualifier for many entries incorrectly contains
values other than the gene symbol, such as the product or standard
name of the gene. The gene symbol index (GBHGM.IDX) is created from
the data in the /gene qualifier and therefore contains data other than
gene symbols. These errors will be corrected as soon as possible.
6. GENBANK ADMINISTRATION
IntelliGenetics Inc., a developer and distributor of molecular biology
computer programs and instrumentation, is the primary contractor for
the GenBank data bank. IntelliGenetics maintains the computerized data
center and oversees data distribution on all media. Under an
arrangement with IntelliGenetics, Los Alamos National Laboratory
(LANL) gathers, annotates, and organizes sequence data and transmits
it to IntelliGenetics. LANL is operated by the University of
California for the Department of Energy.
The electronic mail address of LANL is [email protected]; their
telephone number is (505) 665-2177. The IntelliGenetics address is on
the front page of these release notes.
6.1 Registered Trademark Notice
GenBank (R) is a registered trademark of the U.S. Department of Health
and Human Services for the Genetic Sequence Data Bank operated by
IntelliGenetics and Los Alamos National Laboratory under contract with
the National Institutes of Health.
6.2 GenBank Sponsorship
GenBank is sponsored by the National Institute of General Medical
Sciences, NIH; The National Library of Medicine, NIH; and the U.S.
Department of Energy.
6.3 Citing GenBank
If you have used GenBank in your research, we would appreciate it if
you would include a reference to GenBank in all publications related
to that research. You may also wish to note that the GenBank data bank
is publicly available from IntelliGenetics.
When citing data in GenBank, it is appropriate to give the sequence
name, primary accession number, and the publication in which the
sequence first appeared. If the data are unpublished, we urge you to
contact the group which submitted the data to GenBank to see if there
is a recent publication or if they have determined any revisions or
extensions of the data.
It is also appropriate to list a reference for GenBank itself. The
following publication, which describes the GenBank data bank, should
be cited:
Bilofsky, H.S. and Burks, C. The GenBank (R) Genetic Sequence Data Bank.
Nucl. Acids Res. 16: 1861-1864 (1988)
The following statement is an example of how you may cite GenBank
data. It cites the sequence, its primary accession number, the group
who determined the sequence, and GenBank. The numbers in brackets
refer to one of the GenBank citations above and the REFERENCE in the
GenBank sequence entry.
`We scanned the GenBank (1) data bank for sequence similarities and
found one sequence (2), GenBank accession number J01016, which showed
significant similarity...'
(1) Bilofsky, H.S. and Burks, C. Nucl. Acids Res. 16: 1861-1864 (1988)
(2) Nellen, W. and Gallwitz, D. J. Mol. Biol. 159, 1-18 (1982)
6.4 GenBank Distribution Formats and Media
The GenBank data bank is available in three formats on three physical
media. The three formats are fixed-length 80-character records,
VAX/VMS Backup saveset, and compressed Unix tar archive format. The
three media are industry-standard 9-track magnetic tapes, Sun 1/4" QIC
24 format cartridges, and TK-50 cartridges. The following chart
specifies which formats are available in each medium.
To request a change in the format, media, or density of the tapes you
receive, write to the address (or call the telephone number) on the
first page of these release notes.
                             FILE FORMATS
TAPE MEDIA                   Unlabelled ASCII       VAX/VMS        Unix
                             (fixed-length records) Backup Saveset tar tarfile
9-track, 2400' reel
  1600 bpi                   MU                     M              M
  6250 bpi                   MU                     M              M
TK-50 cartridge (DEC)        NA                     M              NA
1/4" QIC 24 cartridge (Sun)  NA                     NA             M
MU  tapes are available in both mixed-case and uppercase-only formats
M   tapes are available only in mixed-case characters
NA  not available
Table 1. Tape Media and Formats
6.5 Request for Direct Submission of Sequence Data
The growth of nucleotide sequence data is close to exponential. Both
the proposed Human Genome sequencing project and the increasing
automation of sequencing make it clear that GenBank is going to
continue to grow rapidly. A successful GenBank requires that the data
enter the data bank as soon as possible after publication, that the
annotations be as complete as possible, and that the sequence and
annotation data be accurate. All three of these requirements are best
met if authors of sequence data submit their data directly to GenBank
in a usable form. It is especially important that these submissions be
in computer-readable form.
GenBank must rely on direct author submission of data to ensure that
it achieves its goals of complete, accurate, and timely data. To
assist researchers in entering their own sequence data, GenBank has
developed AUTHORIN, an easy-to-use program that enables authors to
enter a sequence, annotate it, and submit it to GenBank or any of the
other data banks. The IBM PC compatible and Macintosh versions of
AUTHORIN may be obtained by completing the enclosed AUTHORIN request
card or by contacting GenBank at the address shown on the front of
these release notes. Versions for the VAX and Sun workstations are
also planned and will be announced in future release notes as they
become available.
For those who are unable to use the AUTHORIN program, GenBank has a
printed data submission form. This form is now standardized among
EMBL, DDBJ, GenBank, PIR, MIPS, and JIPID. GenBank also provides a
corresponding computer-readable data submission form that can be used
for electronic mail and floppy disk submissions. The GenBank Data
Submission Form (located in the file GBDAT.FRM) can be used to submit
your sequence and annotations. Electronic mail submissions should go
to the address "GB-SUB%[email protected]"; direct mail should go to our
postal address in Los Alamos, which is on the data submission form.
6.6 Request for Corrections and Comments
We welcome your suggestions for improvements to GenBank. We are
especially interested to learn of errors or inconsistencies in the
data. Please use the GenBank Error/Suggestion Report Form, which is
part of this distribution of GenBank (located in the file GBDAT.FRM),
to send your suggestions and corrections to the address on the first
page of these release notes. Please be certain to indicate the GenBank
release number (e.g., Release 66.0) and the primary accession number
of the entry to which your comments apply; it is helpful if you also
give the entry name and the current contents of any data field for
which you are recommending a change.
6.7 Disclaimer
IntelliGenetics Inc., Los Alamos National Laboratory, and the United
States Government make no representations or warranties regarding the
content or accuracy of the information. IntelliGenetics Inc., Los
Alamos National Laboratory, and the United States Government also make
no representations or warranties of merchantability or fitness for a
particular purpose and accept no responsibility for any consequences
of the receipt or use of the information.
APPENDIX A. Statistical Summary
Division               Entries       Bases   Reports
PRIMATE                   7511     9003383      9493
RODENT                    7652     7841099      9176
OTHER MAMMALIAN           1552     1935603      1817
OTHER VERTEBRATE          1876     2142926      2263
INVERTEBRATE              3195     4005462      3811
PLANT                     2976     4659180      3636
ORGANELLE                 1271     1848854      1569
BACTERIAL                 4293     6992664      5528
STRUCTURAL RNA            1647      445723      1946
VIRAL                     3707     6439492      4751
PHAGE                      593      682556       880
SYNTHETIC                 1028      516186      1129
UNANNOTATED               3756     4792964      4909
Total (13 divisions)     41057    51306092     50908
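The totals in the division summary can be checked mechanically. The
following Python sketch is one way to do so; the names SUMMARY and parse_row
are hypothetical. Because division names may contain blanks, the three
numeric columns are split off from the right-hand end of each row.

```python
# Division rows copied from the summary above: name, entries, bases, reports.
SUMMARY = """\
PRIMATE 7511 9003383 9493
RODENT 7652 7841099 9176
OTHER MAMMALIAN 1552 1935603 1817
OTHER VERTEBRATE 1876 2142926 2263
INVERTEBRATE 3195 4005462 3811
PLANT 2976 4659180 3636
ORGANELLE 1271 1848854 1569
BACTERIAL 4293 6992664 5528
STRUCTURAL RNA 1647 445723 1946
VIRAL 3707 6439492 4751
PHAGE 593 682556 880
SYNTHETIC 1028 516186 1129
UNANNOTATED 3756 4792964 4909
"""


def parse_row(line):
    """Split one row into (name, entries, bases, reports)."""
    # rsplit from the right keeps multi-word names such as OTHER MAMMALIAN whole.
    name, entries, bases, reports = line.rsplit(None, 3)
    return name, int(entries), int(bases), int(reports)


rows = [parse_row(line) for line in SUMMARY.splitlines()]
totals = tuple(sum(row[i] for row in rows) for i in (1, 2, 3))
print(len(rows), "divisions; entries, bases, reports =", totals)
```

The sums agree with the Total line: 41057 entries, 51306092 bases, and
50908 reports across 13 divisions.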
Sequences with greater than 30,000 bp
Locus     Div  Accession  Length (bp)
ADBCG     VRL  J01917           35937
CHKMYHE   VRT  J02714           31111
HS11UL    VRL  D00317          108360
HS4       VRL  V01555          172282
HS5HCMVU  VRL  X04650           43275
HUMADAG   PRI  M13792           36741
HUMFIXG   PRI  K02402           38059
HUMGHCSA  PRI  J03071           66495
HUMHBB    PRI  J00179           73326
HUMHPRTB  PRI  M26434           56736
HUMTPA    PRI  K03021           36594
LAMCG     PHG  J02459           48502
MPOCPCG   ORG  X04465          121024
MUSBGCXD  ROD  X14061           55856
PT7CG     PHG  J02518           39936
RABBGLOB  MAM  M18818           44594
RATCRYG   ROD  M19359           54670
TOBCPCG   ORG  Z00044          155844
VACCG     VRL  M35027          191737
VAZCG     VRL  X04370          124884
X14112    UNA  X14112          152260
X14720    UNA  X14720           35100
X15423    UNA  X15423           47081
X15917    UNA  X15917           40469
X17012    UNA  X17012           30000
X17403    UNA  X17403          229354
APPENDIX B. Entries with a change in locus name
Accession  Rel 65.0     Rel 66.0
---------  ----------   ----------
J03132     HUMICAM1     HUMICAM1M
J04134     MUSCNR       MUSCNRA
K00319     BEACPTRMF    PHVCPTRMF
K00336     BEACPTRF     PHVCPTRF
M22244     BOVSVSP      BOVSVSPA
M24637     CHKCEK       CHKCEK1
M29035     BSUPEPF      BSUPEPFA
M29579     RATGLUSA     RATGLUS
M30953     ECOPANA      ECOPANF
M31077     HUMSTATHG1   HUMSTATH1
M31612     TRBESAGC     TRBESAGCA
M31628     STMHISOP     STMHISOPA
M32331     HUMICAM1AA   HUMICAM1
M32639     HUMSTATHG2   HUMSTATH2
M36473     ACLP322P     ACCP322P
X02297     PARSP51A3    PARSP51A4
X03499     DDITRNV      DDITGNV
X03500     DDITRNE      DDITGNE
X04083     TVMXGG       TVMXCG
X12646     HUMRPHO2A    HUMAPP2A
X12656     HUMPP2A      HUMBPP2A
X17615     ECOFHUE      X17615
Y00448     ECOK2KORB    RK2KORB
APPENDIX C. Number of entries, reports, and bases by organism
PRIMATE
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AGM Cercopithecus aethiops 61 42 31391
2. ATR Aotus trivirgatus 7 7 7557
3. BAB Papio anubis 3 3 2653
4. BAB Papio doguera 1 1 2000
5. BAB Papio hamadryas 7 7 11052
6. BAB Papio papio 1 1 343
7. BAB Papio sp. 3 3 1601
8. CEB Cebus sp. 2 2 190
9. CEP Cebus apella 7 7 1819
10. CHP Pan paniscus 1 1 1683
11. CHP Pan troglodytes 68 53 72174
12. COL Colobus polykomos 2 2 1494
13. GCR Galago crassicaudatus 38 38 16345
14. GIB Hylobates lar 8 6 17024
15. GOR Gorilla gorilla 19 11 19990
16. GSE Galago senegalensis 1 1 369
17. HUM Homo sapiens 9156 7235 8687688
18. LEM Cheirogaleus medius 1 1 1899
19. LEM Lemur albifrons 1 1 1786
20. LEM Lemur macaco 3 3 5590
21. LEM Lemur sp. 1 1 1380
22. MAC Macaca fascicularis 8 7 6355
23. MAC Macaca mulatta 40 28 33283
24. MAC Macaca nemestrina 1 1 1115
25. MAC Macaca radiata 2 2 342
26. MAC Macaca sp. 7 6 7502
27. MNK Ateles geoffroyi 4 4 13966
28. MNK Monkey 11 11 2081
29. ORA Pongo pygmaeus 19 17 36667
30. SOE Saguinus oedipus 4 4 5207
31. TAR Tarsius sp. 6 5 10837
Total 9493 7511 9003383
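Several organisms in these tables share a key (for example, the five Papio
rows under BAB). The Python sketch below shows one possible way to read the
rows and total them by key. The names ROW_RE, SAMPLE, and totals_by_key are
hypothetical, and only a few rows from the PRIMATE table are used. Note that
the column order here is Reports, Entries, Bases, which differs from the
order used in Appendix A.

```python
import re
from collections import defaultdict

# Row layout: index, key, organism name (may contain blanks), then the
# Reports, Entries, and Bases columns.
ROW_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+(.+?)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")

# A few rows copied from the PRIMATE table above.
SAMPLE = """\
3. BAB Papio anubis 3 3 2653
4. BAB Papio doguera 1 1 2000
5. BAB Papio hamadryas 7 7 11052
6. BAB Papio papio 1 1 343
7. BAB Papio sp. 3 3 1601
10. CHP Pan paniscus 1 1 1683
11. CHP Pan troglodytes 68 53 72174
Total 9493 7511 9003383
"""


def totals_by_key(text):
    """Return {key: (reports, entries, bases)} summed over all rows."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue  # header, rule, or Total line
        _index, key, _name, reports, entries, bases = match.groups()
        for i, value in enumerate((reports, entries, bases)):
            totals[key][i] += int(value)
    return {key: tuple(values) for key, values in totals.items()}


by_key = totals_by_key(SAMPLE)
for key, (reports, entries, bases) in sorted(by_key.items()):
    print(key, reports, entries, bases)
```

The Total line is skipped because it has no leading index, so it is not
counted twice.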
RODENT
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. DIP Dipodomys ordii 1 1 3318
2. GPI Cavia cobaya 1 1 491
3. GPI Cavia cutleri 3 2 4959
4. GPI Cavia porcellus 23 22 36118
5. GPI Cavia sp. 2 2 3226
6. HAM Cricetulus griseus 32 28 38686
7. HAM Cricetulus longicaudatus 44 31 20752
8. HAM Cricetulus sp. 37 33 31583
9. HAM Cricetus cricetus 6 6 11446
10. HAM Mesocricetus auratus 81 58 77935
11. HAM Mesocricetus sp. 94 62 47934
12. MAR Marmota monax 8 6 8563
13. MUS Mus caroli 18 17 13904
14. MUS Mus domesticus 24 20 12906
15. MUS Mus muscaris 56 56 28343
16. MUS Mus musculus 5821 4884 4299073
17. MUS Mus pahari 7 7 7763
18. MUS Mus platythrix 1 1 315
19. MUS Mus sp. 11 11 9130
20. MUS Mus spretus 9 7 7454
21. PER Peromyscus leucopus 3 3 3640
22. PER Peromyscus maniculatus 11 11 1791
23. RAT Rattus colletti 4 4 7849
24. RAT Rattus fuscipes 1 1 1161
25. RAT Rattus leucopus 3 3 3481
26. RAT Rattus norvegicus 2599 2144 2797456
27. RAT Rattus rattus 209 177 284465
28. RAT Rattus sordidus 1 1 1161
29. RAT Rattus sp. 56 45 61254
30. RAT Rattus tunneyi 1 1 1161
31. RAT Rattus villosissimus 2 2 3369
32. SEH Spalax ehrenbergi 7 5 10412
Total 9176 7652 7841099
OTHER MAMMALIAN
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AXI Axis axis 2 2 1758
2. BOV Bos bovis 1 1 2363
3. BOV Bos taurus 755 642 741804
4. CAT Felis catus 40 40 23293
5. CAT Felis domesticus 3 3 4748
6. CAT Felis silvestris 8 6 9516
7. CAT Felis sp. 1 1 3534
8. DAV Dasyurus viverrinus 2 2 939
9. DOG Canis familiaris 55 45 52602
10. DOG Canis lupus 8 8 7867
11. DOG Canis sp. 19 18 22470
12. GOT Capra hircus 46 41 42286
13. HRS Equus caballus 66 38 32167
14. LEE Lepus capensis 1 1 434
15. LEE Lepus europaeus 1 1 3646
16. MMU Muntiacus muntjak 1 1 807
17. MVI Mustela vison 3 3 1909
18. OPO Didelphis virginiana 9 9 7139
19. ORC Orcinus orca 1 1 1579
20. PIG Sus scrofa 190 155 242401
21. RAB Basilea sp. 1 1 377
22. RAB Oryctolagus cuniculus 420 363 525433
23. RAB Oryctolagus sp. 57 57 70289
24. RAB Sylvilagus floridanus 1 1 1065
25. SEA Halichoerus grypus 3 3 2288
26. SHP Ovis aries 69 61 87157
27. SHP Ovis sp. 41 38 35507
28. SUN Suncus murinus 8 6 6231
29. VMP Desmodus rotundus 2 1 1725
30. WAL Macropus eugenii 1 1 754
31. WAL Macropus robustus 1 1 1465
32. WAL Macropus rufus 1 1 50
Total 1817 1552 1935603
OTHER VERTEBRATE
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ANG Anguilla australis 1 1 2191
2. APT Ascaphus truei 2 1 1897
3. BUJ Bufo japonicus 2 2 1116
4. CHK Gallus domesticus 158 120 174064
5. CHK Gallus gallus 1040 851 1029779
6. CPL Carcharhinus plumbeus 4 4 1821
7. CRC Caiman crocodylus 5 5 2902
8. DUK Aix sp. 1 1 165
9. DUK Anas platyrhynchos 15 14 21280
10. DUK Cairina moschata 30 22 25269
11. FAL Falco columbarius 1 1 174
12. FSA Myxine glutinosa 1 1 915
13. FSA Petromyzon marinus 5 5 9661
14. FSB Acipenser transmontano 2 1 1140
15. FSB Anarhichas lupus 1 1 3395
16. FSB Carassius auratus 12 11 9222
17. FSB Catostomus commersoni 4 3 1936
18. FSB Ctenopharyngobon idella 1 1 4243
19. FSB Cyprinus carpio 18 17 16913
20. FSB Electrophorus electricus 5 5 8692
21. FSB Elops saurus 9 5 3870
22. FSB Fundulus heteroclitus 1 1 1673
23. FSB Ictalurus punctatus 7 6 5679
24. FSB Limanda ferruginea 1 1 416
25. FSB Lophius americanus 8 5 2942
26. FSB Macrozoarces americanus 3 3 2657
27. FSB Misgurnus fossilis 2 2 1697
28. FSB Oncorhynchus keta 30 26 23515
29. FSB Oncorhynchus kisutch 2 2 3221
30. FSB Oncorhynchus tschawytscha 3 3 1862
31. FSB Paralichthys olivaceus 7 5 4859
32. FSB Pseudopleuronectus americanus 8 6 3002
33. FSB Salmo gairdneri 52 48 40813
34. FSB Salmo irideus 1 1 1278
35. FSB Salmo salar 8 6 6764
36. FSB Thunnus thynnus 1 1 911
37. FSC Torpedo californica 21 13 23035
38. FSC Torpedo marmorata 7 7 9992
39. GOO Anser anser 2 2 4906
40. GRE Geoclemys reevessi 1 1 239
41. HEF Heterodontus francisci 34 29 19173
42. KRY Kryptophaneron alfredi 1 1 1230
43. LSE Laticauda semifasciata 1 1 483
44. LSL Laticauda laticaudata 2 1 632
45. MRG Mergus serrator 1 1 2574
46. NEW Cynops pyrrhogaster 1 1 629
47. NVI Notophthalmus viridescens 10 10 1458
48. ORN Oreochromis mossambicus 2 1 237
49. ORN Oreochromis niloticus 1 1 847
50. PAG Pagrus major 2 1 906
51. PGN Columba sp. 2 2 1665
52. PHS Phasianus colchicus 1 1 739
53. PHU Phyllomedusa bicolor 1 1 781
54. PHU Phyllomedusa sauvagei 2 2 1315
55. PLS Phylloscopus trochilus 6 3 2593
56. PYU Pyura stolonifera 7 7 1029
57. QUL Coturnix coturnix 50 35 36670
58. QUL Coturnix japonica 1 1 311
59. RAN Rana catesbeiana 10 8 6949
60. RAN Rana pipiens 4 3 1625
61. RAN Rana temporaria 19 14 7422
62. SCC Scyliorhinus caniculus 2 2 667
63. SEQ Seriola quinqueradiata 1 1 879
64. SKT Raja erinacea 8 8 10209
65. SMD Triturus vulgaris 4 4 901
66. SMR Pleurodeles waltlii 5 4 2305
67. SNK Aipysurus laevis 6 3 1332
68. SNK Bothrops atrox 8 7 11423
69. SNK Crotalus durissus 3 2 1263
70. SNK Elaphe radiata 1 1 2483
71. SNK Naja naja 1 1 312
72. SNK Natrix tessellata 1 1 312
73. SNK Notechis scutatus 2 1 621
74. SRA Hemitripterus americanus 2 2 3294
75. TKY Meleagris gallopavo 16 14 6836
76. VIA Vipera ammodytes 2 1 607
77. XEB Xenopus borealis 27 26 19249
78. XEC Xenopus clivii 6 6 1406
79. XEL Xenopus laevis 503 440 505221
80. XET Xenopus tropicalis 11 8 14850
81. XIP Xiphophorus maculatus 4 2 983
82. XIP Xiphophorus sp. 2 1 4135
83. ZEF Brachydanio rerio 8 6 4264
Total 2263 1876 2142926
INVERTEBRATE
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ACA Acanthamoeba castellanii 32 28 28156
2. ACP Acropora formosa 2 2 236
3. ACP Acropora latistella 3 3 354
4. ADO Acheta domesticus 1 1 541
5. AEI Aequipecten irradians 2 2 1253
6. AEV Aequorea victoria 4 4 2595
7. AME Apis melifica 2 2 897
8. AMF Apis mellifera 8 8 2082
9. APL Aplysia californica 35 32 28418
10. APL Aplysia sp. 9 9 9086
11. APO Antheraea polyphemus 33 33 16883
12. APY Antheraea yamamai 1 1 1200
13. ARB Arbacia punctulata 3 2 5078
14. BBO Babesia bovis 4 4 2482
15. BBO Babesia rodhaini 1 1 3238
16. BMO Bombyx mandarina 3 3 2712
17. BMO Bombyx mori 152 120 87988
18. BPL Brachionus plicatilis 1 1 120
19. BRP Brugia malayi 9 8 8730
20. BRP Brugia pahangi 2 2 4893
21. BUG Bugula neritina 1 1 120
22. CAF Calanus finmarchicus 1 1 1487
23. CAR Carcinoscorpius rotundicauda 1 1 983
24. CAV Calliphora vicina 11 7 16088
25. CBL Trichoplusia ni 1 1 2475
26. CCA Caledia captiva 12 12 11713
27. CEI Ceratitis capitata 1 1 1115
28. CEL Caenorhabditis briggsae 7 4 5300
29. CEL Caenorhabditis elegans 139 108 228239
30. CER Calliphora erythrocephala 6 6 1677
31. CHI Chironomus pallidivittatus 21 16 5529
32. CHI Chironomus tentans 22 20 20059
33. CHI Chironomus thummi 32 30 32150
34. CLM Clam sp. 1 1 2163
35. CLM Spisula solidissima 2 2 3577
36. CPD Colpidium campylum 1 1 76
37. CPD Colpidium colpoda 1 1 76
38. CRB Cardisoma guanhumi 3 3 1992
39. CRB Gecarcinus lateralis 4 4 6714
40. CRB Geryon quinquedens 1 1 85
41. CRB Limulus polyphemus 4 4 4886
42. CRB Paguras pollicaris 4 4 419
43. CUL Culex pipiens 1 1 3105
44. DDI Dictyostelium discoideum 282 215 215990
45. DDI Dictyostelium sp. 2 1 4439
46. DEM Dermasterias imbricata 2 2 1016
47. DEP Dermatophagoides pteronyssinus 2 1 824
48. DIA Diadromus pulchellus 1 1 324
49. DIC Dicyema misakiense 1 1 116
50. DIR Dirofilaria immitis 2 2 2791
51. DRE Drosophila erecta 4 4 4847
52. DRF Drosophila funebris 2 1 3002
53. DRG Drosophila gymnobasis 10 10 1793
54. DRH Drosophila heteroneura 1 1 922
55. DRH Drosophila hydei 18 15 26936
56. DRH Drosophila irilis 1 1 1602
57. DRI Drosophila grimshawi 3 3 3860
58. DRL Drosophila silvarentis 2 2 356
59. DRM Drosophila mauritiana 6 4 8528
60. DRN Drosophila navojoa 2 1 2334
61. DRN Drosophila nebulosa 2 2 2991
62. DRO Drosophila melanogaster 1242 1001 1581360
63. DRO Drosophila subobscura 2 2 4827
64. DRP Drosophila pseudoobscura 14 7 38083
65. DRQ Drosophila sechellia 3 2 5559
66. DRR Drosophila orena 7 4 7293
67. DRS Drosophila simulans 23 18 21926
68. DRS Drosophila sp. 2 2 5761
69. DRT Drosophila teissieri 4 4 7787
70. DRU Drosophila mulleri 1 1 6778
71. DRV Drosophila virilis 47 37 72725
72. DRW Drosophila mojavensis 6 5 21612
73. DRY Drosophila yakuba 1 1 1853
74. ECC Echinococcus granulosus 2 2 846
75. EIM Eimeria acervulina 4 2 1764
76. EIM Eimeria tenella 5 4 4629
77. ENH Entamoeba histolytica 15 14 10247
78. EPH Ephydatia mulleri 1 1 2895
79. EUC Eurypelma californicum 1 1 1579
80. EWA Euplotes aediculatus 1 1 1882
81. EWC Euplotes crassus 2 2 1427
82. EWE Euplotes eurystomus 2 1 930
83. EWR Euplotes raikovi 1 1 593
84. FFL Luciola cruciata 1 1 1985
85. FHE Fasciola hepatica 1 1 894
86. GCH Glaucoma chattoni 3 3 564
87. GLA Giardia lamblia 14 10 9487
88. GLA Giardia sp. 1 1 831
89. GLY Glycera dibranchiata 1 1 745
90. GMO Glossina austeni 1 1 653
91. GMO Glossina fuscipes 1 1 239
92. GMO Glossina morsitans 7 7 3244
93. GMO Glossina palpalis 1 1 236
94. HAE Haemonchus contortus 4 4 3557
95. HCE Hyalophora cecropia 11 11 6608
96. HEL Heliothis virescens 1 1 2977
97. HIR Hirudo medicinalis 1 1 379
98. HLT Haliotis corrugata 1 1 650
99. HLT Haliotis rufescens 1 1 642
100. HOL Holothuria polii 5 5 1964
101. HOL Holothuria tubulosa 1 1 441
102. HYD Hydra sp. 2 2 4555
103. LAN Lingula anatina 1 1 120
104. LAR Lampetra reissneri 1 1 120
105. LEI Leishmania donovani 10 6 10013
106. LEI Leishmania enriettae 4 4 1562
107. LEI Leishmania enriettii 4 4 4213
108. LEI Leishmania major 10 8 9788
109. LEI Leishmania sp. 4 4 8884
110. LEI Leishmania tropica 1 1 1851
111. LIT Litomosoides carinii 2 2 214
112. LMI Locusta migratoria 5 5 3039
113. LOA Loa loa 1 1 839
114. LUM Lumbricus terrestris 2 2 5061
115. LYM Lymnaea stagnalis 2 2 1764
116. MDO Musca domestica 3 2 2916
117. MOT Manduca sexta 25 22 40693
118. MSQ Aedes aegypti 4 4 8350
119. MSQ Anopheles gambiae 21 21 7322
120. NEM Ascaris lumbricoides 35 34 17038
121. NEM Ascaris suum 2 2 6079
122. NEP Nephila clavipes 1 1 2336
123. NGR Naegleria gruberi 4 4 6389
124. OCT Octopus dofleini 1 1 1315
125. OCT Paroctopus defleini 1 1 1675
126. OFA Oxytricha fallax 34 13 9421
127. OMM Ommastrephes sloanei 2 2 1825
128. ONG Onchocerca sp. 2 2 214
129. ONG Onchocerca volvulus 25 24 15477
130. ONO Oxytricha nova 21 21 20172
131. OWE Owenia fusiformis 1 1 1548
132. PAA Parascaris sp. 2 2 215
133. PAL Paracentrotus lividus 11 9 13481
134. PAR Paramecium aurelia 5 5 1190
135. PAR Paramecium primaurelia 8 8 12822
136. PAR Paramecium tetraurelia 20 17 9174
137. PBA Plasmodium gallinaceum 1 1 799
138. PBE Plasmodium berghei 9 9 11176
139. PBS Plasmodium brasilianum 2 2 3010
140. PCH Plasmodium chabaudi 11 9 17151
141. PCR Philosamia cynthia ricini 2 2 240
142. PCY Plasmodium cynomolgi 6 6 7875
143. PFA Plasmodium falciparum 168 138 216729
144. PFA Plasmodium fragile 2 1 2307
145. PIO Pisaster ochraceus 5 5 9699
146. PKN Plasmodium knowlesi 11 9 8880
147. PLM Plasmodium malariae 1 1 1545
148. PLM Plasmodium reichenowi 1 1 654
149. PLO Plasmodium lophurae 6 5 6087
150. PMC Pneumocystis carinii 10 6 5670
151. PMI Prorocentrum micans 2 1 2451
152. PNG Panagrellus redivivus 2 2 322
153. PNG Panagrellus silusiae 1 1 682
154. PPY Photinus pyralis 1 1 2387
155. PVI Plasmodium vivax 13 7 8966
156. PYO Plasmodium yoelii 15 11 21793
157. SAR Sarcocystis gigantea 3 3 473
158. SCA Schistocerca americana 3 3 3458
159. SCA Schistocerca nitans 2 2 711
160. SCI Sciara coprophila 2 2 822
161. SCM Schistosoma japonicum 10 9 9090
162. SCM Schistosoma mansoni 58 52 46518
163. SCR Androctonus australis 7 7 2563
164. SEM Parastichopus parvimensis 1 1 1458
165. SHR Artemia salina 21 20 14517
166. SHR Artemia sp. 11 8 6407
167. SLE Stylonychia lemnae 4 4 7237
168. SLU Stylonychia pustulata 5 5 2820
169. SPE Sarcophaga peregrina 7 7 6405
170. SPF Spodoptera frugiperda 5 5 7382
171. SQD Loligo pealii 1 1 3693
172. STF Asterina pectinifera 1 1 2180
173. SUH Hemicentrotus pulcherrimus 1 1 2413
174. SUL Lytechinus pictus 14 13 10476
175. SUL Lytechinus variegatus 16 16 12102
176. SUP Psammechinus miliaris 50 38 25964
177. SUS Strongylocentrotus drobachiensis 4 4 1256
178. SUS Strongylocentrotus franciscanus 2 2 3832
179. SUS Strongylocentrotus purpuratus 167 158 114993
180. SUT Tripneustes gratilla 13 7 10207
181. SUU Sea urchin 12 12 2632
182. TAE Taenia solium 3 2 3737
183. TAT Tachypleus tridentatus 1 1 946
184. TCA Tribolium castaneum 1 1 707
185. TCK Boophilus microplus 2 1 2225
186. TCS Trichostrongylus colubriformis 2 2 1987
187. TEC Tetrahymena cosmopolitanis 1 1 511
188. TEH Tetrahymena hyperangularis 2 2 767
189. TEL Tetrahymena leucophrys 1 1 75
190. TEM Tetrahymena malaccensis 1 1 507
191. TEN Tenebrio molitor 23 23 5207
192. TEP Tetrahymena paravorax 1 1 74
193. TEP Tetrahymena pigmentosa 4 4 3072
194. TES Tetrahymena sonneborni 1 1 511
195. TET Tetrahymena thermophila 64 61 49075
196. TEU Tetrahymena patula 1 1 75
197. TEX Tetrahymena vorax 1 1 75
198. TEY Tetrahymena pyriformis 13 11 10560
199. THE Theileria annulata 2 2 2859
200. THE Theileria parva 3 3 6729
201. TOX Toxoplasma gondii 10 10 12582
202. TRB Trypanosoma brucei 252 223 288704
203. TRC Trypanosoma cruzi 35 33 31153
204. TRE Trypanosoma equiperdum 6 4 3816
205. TRF Crithidia fasciculata 16 12 20239
206. TRI Trichomonas vaginalis 1 1 582
207. TRL Leptomonas collosoma 1 1 154
208. TRL Leptomonas seymouri 6 4 2130
209. TRO Trypanosoma congolense 9 9 8879
210. TRV Trypanosoma vivax 2 2 857
211. TRY Trypanosoma rangeli 1 1 153
212. TSR Trichinella spiralis 3 2 2138
213. UCA Urechis caupo 1 1 718
214. VAH Vargula hilgendorfii 1 1 1818
215. VAI Vairimorpha necatrix 1 1 1244
216. VUI Eupelmus vuilleti 1 1 106
217. WSP Dolichovespula maculata 2 2 1367
218. WUC Wuchereria bancrofti 1 1 1323
Total 3811 3195 4005462
PLANT
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ABG Absidia glauca 2 1 1011
2. ACK Achlya ambisexualis 1 1 1121
3. ACK Achlya bisexualis 1 1 1809
4. ACK Achlya klebsiana 1 1 1254
5. ACT Actinidia chinensis 8 5 5231
6. ACT Actinidia deliciosa 1 1 634
7. AEG Aegilops tauschii 1 1 421
8. ALC Allium cepa 3 3 1135
9. ALF Medicago sativa 29 18 26748
10. AMA Antirrhinum majus 11 7 11860
11. APE Acremonium chrysogenum 2 2 2410
12. ASA Aspergillus awamori 4 3 7873
13. ASG Aspergillus niger 12 9 18564
14. ASL Aessosporon salmonicolor 1 1 119
15. ASN Aspergillus nidulans 49 42 89815
16. ASO Aspergillus oryzae 13 10 14743
17. AST Avena sativa 8 8 22641
18. ATH Arabidopsis thaliana 94 76 136042
19. AVO Persea americana 3 3 4672
20. BJE Bjerkandera adusta 1 1 118
21. BLY Hordeum vulgare 91 72 86901
22. BNA Brassica napus 25 21 23597
23. BOL Brassica campestris 8 5 5320
24. BOL Brassica juncea 3 2 356
25. BOL Brassica oleracea 12 9 3859
26. BRM Bremia lactucae 1 1 2869
27. BRN Bertholletia excelsa 1 1 621
28. CAG Canavalia gladiata 3 2 4797
29. CCI Coprinus cinereus 2 2 6091
30. CEN Canavalia ensiformis 1 1 1027
31. CFU Caldariomyces fumago 2 1 2787
32. CHE Chenopodium rubrum 6 3 1865
33. CHL Chlorella protothecoides 2 1 1332
34. CHL Chlorella sp. 6 6 1019
35. CIP Mesembryanthemum crystallinum 11 8 25401
36. CLC Claviceps purpurea 2 2 758
37. CLI Citrus limon 3 3 4958
38. CLR Clarkia unguiculata 2 1 2040
39. CNA Citrullus vulgaris 1 1 1334
40. COA Convolvulus arvensis 4 4 4549
41. COC Cochliobolus heterostrophus 2 2 2634
42. COG Colletotrichum capsici 1 1 2557
43. COG Colletotrichum gloeosporioides 1 1 1749
44. COG Colletotrichum graminicola 2 2 5286
45. COT Gossypium hirsutum 9 9 13403
46. CPA Carica papaya 3 3 2037
47. CPA Cyanophora paradoxa 1 1 528
48. CRE Chlamydomonas reinhardtii 49 38 52939
49. CTR Catharanthus roseus 1 1 1740
50. CUC Cucurbita maxima 5 3 15360
51. CUC Cucurbita moschata 2 1 1781
52. CUC Cucurbita pepo 8 8 7363
53. CUS Cucumis melo 1 1 1137
54. CUS Cucumis sativus 13 12 12742
55. DAR Daucus carota 11 9 11820
56. DBI Dolichos biflorus 3 3 5384
57. DUN Dunaliella salina 3 2 2541
58. EGR Euglena gracilis 9 6 9011
59. EPA Endothia parasitica 5 3 3809
60. EPK Ephedra kokanica 2 1 120
61. ERG Erysiphe graminis 1 1 2475
62. FIL Filobasidium capsuligenum 1 1 118
63. FIL Filobasidium floriforme 1 1 118
64. FLX Linum usitatissimum 5 5 1726
65. FSO Fusarium oxysporum 2 2 1347
66. FSO Fusarium solani 3 3 5508
67. FSO Fusarium sporotrichioides 2 1 1908
68. FTR Flaveria trinervia 1 1 752
69. GCO Gracilaria tikvahiae 1 1 1771
70. GCO Gracilaria verrucosa 1 1 1771
71. GNG Gnetum gnemon 2 1 120
72. GRO Gracilariopsis sp. 1 1 1782
73. HAS Hansenula anomala 2 1 2132
74. HAS Hansenula polymorpha 2 1 3637
75. HEV Hevea brasiliensis 1 1 1008
76. HNN Helianthus annuus 6 4 5752
77. HRA Armoracia rusticana 2 2 5828
78. IPB Ipomoea batatas 9 7 11859
79. LGI Lemna gibba 6 6 5024
80. LGI Lemna minor 1 1 119
81. LIL Lilium henryi 1 1 9345
82. LOL Lolium perenne 2 2 2082
83. LUP Lupinus luteus 20 15 5280
84. MAQ Marsilia quadrifolia 2 1 155
85. MIN Matthiola incana 1 1 509
86. MRA Mucor racemosus 8 7 6601
87. MRM Mucor circinelloides 1 1 4399
88. MRM Mucor miehei 2 2 3316
89. MRP Mucor pusillus 1 1 1965
90. MZE Zea mays 235 193 295885
91. MZE Zea mexicana 2 2 360
92. NAN Nanochlorum eucaryotum 2 1 1796
93. NEU Neurospora crassa 116 102 168075
94. OCH Ochromonas danica 1 1 1789
95. PAN Podospora anserina 3 3 1901
96. PAP Papaver somniferum 2 2 864
97. PBL Phycomyces blakesleeanus 3 3 545
98. PCP Physcomitrella patens 1 1 2544
99. PEA Pisum sativum 81 72 85166
100. PEC Penicillium chrysogenum 7 6 20758
101. PEP Penicillium patulum 1 1 6357
102. PET Petunia hybrida 37 31 28787
103. PET Petunia sp. 25 24 13154
104. PHA Phanerochaete chrysosporium 14 10 18832
105. PHN Pharbitis nil 10 5 534
106. PHO Petroselinum hortense 1 1 1431
107. PHT Phytophthora megasperma 4 2 3031
108. PHV Phaseolus lunatus 1 1 926
109. PHV Phaseolus vulgaris 48 37 43358
110. PIN Pinus contorta 1 1 745
111. PIN Pinus sylvestris 2 1 583
112. PIN Pinus thunbergii 4 2 1889
113. PMI Prorocentrum micans 2 1 3408
114. POA Polytomella agilis 3 3 6616
115. POM Polystichum munitum 4 4 4645
116. POP Populus sp. 17 10 3776
117. POT Solanum tuberosum 59 52 81544
118. PSJ Psathyrostachys juncea 2 2 2035
119. PTE Porphyra umbilicalis 2 1 121
120. PUM Petroselinum crispum 10 10 6270
121. PYL Pylaiella littoralis 2 1 1644
122. RAD Raphanus sativus 9 6 2470
123. RCC Ricinus communis 6 6 10485
124. RCH Rhizopus chinensis 1 1 1133
125. RCH Rhizopus niveus 2 2 3448
126. RCH Rhizopus oryzae 1 1 2290
127. RDT Rhodotorula rubra 2 1 3586
128. RHD Rhodosporidium toruloides 2 2 3181
129. RHP Parasponia andersonii 1 1 1520
130. RHP Parasponia rhizobium 2 2 5530
131. RIC Oryza sativa 81 56 78071
132. RYE Secale cereale 3 3 5870
133. SAL Sinapis alba 12 8 8316
134. SCN Schwanniomyces occidentalis 2 1 2292
135. SCO Schizophyllum commune 5 5 4836
136. SES Sesbania rostrata 6 5 3948
137. SIP Silene pratensis 4 4 3165
138. SLM Physarum polycephalum 59 49 52689
139. SOY Glycine max 144 111 165611
140. SPI Spinacia oleracea 37 30 34346
141. SRG Sorghum bicolor 9 5 6336
142. SRG Sorghum sp. 2 1 4638
143. SSI Scilla siberica 4 4 204
144. SSY Sisymbrium irio 2 1 433
145. TDA Thaumatococcus daniellii 1 1 931
146. TFR Trifolium repens 2 1 1268
147. THI Thinopyrum elongatum 1 1 1375
148. TLA Thermomyces lanuginosus 6 5 3804
149. TOB Nicotiana alata 1 1 804
150. TOB Nicotiana plumbaginifolia 12 11 22286
151. TOB Nicotiana rustica 2 2 593
152. TOB Nicotiana sylvestris 4 4 1382
153. TOB Nicotiana tabacum 67 51 71765
154. TOM Lycopersicon esculentum 91 72 94920
155. TOM Lycopersicon peruvianum 1 1 480
156. TRD Tripsacum dactyloides 6 6 1528
157. TRH Trichosanthes kirilowii 2 2 2237
158. TRR Trichoderma reesei 4 4 8343
159. TRT Trema tomentosa 2 1 1727
160. URO Uromyces appendiculatus 1 1 1449
161. USM Ustilago maydis 3 2 2656
162. VFA Vicia faba 39 33 29444
163. VIR Vigna mungo 2 1 1314
164. VIR Vigna radiata 5 4 4354
165. VVC Volvox carteri 11 9 17792
166. WHT Triticum aestivum 100 79 115801
167. WHT Triticum durum 1 1 898
168. WHT Triticum sp. 8 8 8441
169. WHT Triticum vulgare 1 1 965
170. YS1 Zygosaccharomyces fermentati 1 1 5416
171. YS2 Saccharomycopsis fibuligera 3 3 9339
172. YS4 Candida boidinii 2 2 1863
173. YS5 Candida glabrata 3 3 2758
174. YSA Candida albicans 11 8 13498
175. YSB Candida tropicalis 16 13 23751
176. YSC Saccharomyces cerevisiae 1131 933 1716468
177. YSCTY Transposable element TY1 53 44 52430
178. YSD Saccharomyces diastaticus 4 4 4319
179. YSD Saccharomyces douglassi 1 1 4072
180. YSE Candida pelliculosa 1 1 5327
181. YSF Candida maltosa 5 4 8167
182. YSG Saccharomyces carlsbergensis 23 20 38053
183. YSH Hansenula wingei 3 3 720
184. YSI Saccharomyces fibuligera 2 2 6761
185. YSJ Yarrowia lipolytica 8 7 17412
186. YSK Kluyveromyces lactis 41 32 86137
187. YSM Hansenula polymorpha 3 3 8018
188. YSN Kluyveromyces fragilis 1 1 4193
189. YSO Zygosaccharomyces rouxii 5 3 15025
190. YSP Schizosaccharomyces japonicus 1 1 108
191. YSP Schizosaccharomyces malidevorans 1 1 107
192. YSP Schizosaccharomyces octosporus 1 1 109
193. YSP Schizosaccharomyces pombe 135 114 206826
194. YSQ Pichia pastoris 3 3 899
195. YSS Cephalosporium acremonium 5 5 2757
196. YST Yeast sp. 38 37 17556
197. YSU Candida utilis 4 4 7578
198. YSV Saccharomyces uvarum 1 1 2001
199. YSW Kluyveromyces drosophilarum 1 1 4757
200. YSX Saccharomyces rosei 1 1 278
201. YSY Saccharomyces kluyveri 5 4 2875
202. YSZ Zygosaccharomyces bailii 1 1 5415
203. ZAM Zamia pumila 1 1 1813
Total 3636 2976 4659180
ORGANELLE
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ABGMT Mitochondrion Absidia glauca 1 1 596
2. ACUCP Chloroplast Acorus calamus 1 1 103
3. AEGCP Chloroplast Aegilops crassa 2 1 1436
4. AEGCP Chloroplast Aegilops squarrosa 1 1 203
5. AKOMT Mitochondrion Akodon aerosus 6 6 2406
6. AKOMT Mitochondrion Akodon andinus 1 1 401
7. AKOMT Mitochondrion Akodon boliviensis 2 2 802
8. AKOMT Mitochondrion Akodon jelskii 3 3 1203
9. AKOMT Mitochondrion Akodon juninensis 1 1 401
10. AKOMT Mitochondrion Akodon kofordi 1 1 401
11. AKOMT Mitochondrion Akodon mollis 1 1 401
12. AKOMT Mitochondrion Akodon puer 1 1 401
13. AKOMT Mitochondrion Akodon subfuscus 3 3 1203
14. AKOMT Mitochondrion Akodon torques 3 3 1203
15. ALFCP Chloroplast Medicago sativa 3 3 3460
16. AMDCP Chloroplast Acetabularia mediterranea 1 1 1175
17. AMFMT Mitochondrion Apis mellifera 1 1 2949
18. AMHCP Chloroplast Amaranthus hybridus 1 1 1187
19. AMTMT Mitochondrion Ambystoma tigrinum 2 1 225
20. ASIMT Mitochondrion Ascobolus immersus 2 1 5142
21. ASNMT Mitochondrion Aspergillus amstelodami 1 1 624
22. ASNMT Mitochondrion Aspergillus nidulans 22 18 32837
23. ASTCP Chloroplast Avena sativa 1 1 1623
24. ATHCP Chloroplast Arabidopsis thaliana 3 2 1499
25. ATHMT Mitochondrion Arabidopsis thaliana 2 1 880
26. ATPCP Chloroplast Atriplex patula 1 1 1786
27. ATPCP Chloroplast Atriplex rosea 1 1 1790
28. BETMT Mitochondrion Beta vulgaris 4 4 5919
29. BLSMT Mitochondrion Boletus satanas 2 1 341
30. BLYCP Chloroplast Hordeum vulgare 18 13 23006
31. BNACP Chloroplast Brassica napus 1 1 1633
32. BOLCP Chloroplast Brassica oleracea 1 1 543
33. BOLMT Mitochondrion Brassica oleracea 1 1 549
34. BOLMT Mitochondrion Brassica sp. 2 2 770
35. BOMMT Mitochondrion Bolomys amoenus 2 2 802
36. BOVMT Mitochondrion Bos taurus 6 5 19563
37. CEUMT Mitochondrion Cervus unicolor 1 1 2682
38. CHECP Chloroplast Chenopodium album 1 1 207
39. CHKMT Mitochondrion Gallus domesticus 1 1 3571
40. CHKMT Mitochondrion Gallus gallus 4 3 1153
41. CHLCP Chloroplast Chlorella ellipsoidea 12 9 15063
42. CHPMT Mitochondrion Pan troglodytes 1 1 896
43. CNAMT Mitochondrion Citrullus lanatus 1 1 4512
44. COCMT Mitochondrion Cochliobolus heterostrophus 1 1 1827
45. CODCP Chloroplast Codium fragile 4 4 304
46. COOCP Chloroplast Coleochaete orbicularis 1 1 1587
47. CPACP Chloroplast Cyanophora paradoxa 11 8 2748
48. CPACY Cyanelle Cyanophora paradoxa 5 5 5317
49. CRECP Chloroplast Chlamydomonas moewusii 3 2 8107
50. CRECP Chloroplast Chlamydomonas reinhardtii 42 33 37766
51. CRECP Chloroplast Chlamydomonas sp. 3 2 2346
52. CREMT Mitochondrion Chlamydomonas reinhardtii 26 22 21535
53. CRRMT Mitochondrion Corcorax melanorhamphos 1 1 239
54. CRYCP Chloroplast Cryptomonas phi 2 1 715
55. DARMT Mitochondrion Daucus carota 1 1 690
56. DIPMT Mitochondrion Dipodomys californicus 1 1 239
57. DIPMT Mitochondrion Dipodomys heermanni 1 1 239
58. DIPMT Mitochondrion Dipodomys panamintinus 2 2 478
59. DRMMT Mitochondrion Drosophila mauritiana 1 1 976
60. DROMT Mitochondrion Drosophila melanogaster 9 6 16133
61. DRSMT Mitochondrion Drosophila simulans 1 1 975
62. DRVMT Mitochondrion Drosophila virilis 1 1 191
63. DRYMT Mitochondrion Drosophila yakuba 9 3 19938
64. EGRCP Chloroplast Euglena gracilis 35 24 62419
65. EQQMT Mitochondrion Equus quagga 2 2 229
66. FHEMT Mitochondrion Fasciola hepatica 2 1 708
67. FRGMT Mitochondrion Rana catesbeiana 9 5 12109
68. FSBMT Mitochondrion Acipenser transmontanus 2 1 156
69. FSBMT Mitochondrion Cichlasoma centrarchus 1 1 239
70. FSBMT Mitochondrion Cichlasoma citrinellum 1 1 239
71. FSBMT Mitochondrion Cichlasoma labiatum 1 1 239
72. FSBMT Mitochondrion Cichlasoma nicaraguense 1 1 239
73. FSBMT Mitochondrion Cyprinus carpio 3 3 873
74. FSBMT Mitochondrion Julidochromis regani 1 1 239
75. FSBMT Mitochondrion Salmo gairdneri 2 2 855
76. FTRCP Chloroplast Flaveria bidentis 1 1 1839
77. FTRCP Chloroplast Flaveria pringlei 1 1 1842
78. GCOCP Chloroplast Gracilaria tenuistipitata 1 1 1930
79. GIBMT Mitochondrion Hylobates lar 1 1 896
80. GORMT Mitochondrion Gorilla gorilla 1 1 896
81. HAMMT Mitochondrion Cricetulus sp. 1 1 880
82. HNNMT Mitochondrion Helianthus annuus 1 1 1336
83. HUMMT Mitochondrion Homo sapiens 43 36 37585
84. HYRMT Mitochondrion Hydropotes inermis 1 1 2680
85. IPBCP Chloroplast Ipomoea batatas 1 1 2004
86. LEIKP Kinetoplast Leishmania aethiopica 1 1 376
87. LEIKP Kinetoplast Leishmania major 2 1 1031
88. LEIKP Kinetoplast Leishmania mexicana 3 3 2134
89. LEIKP Kinetoplast Leishmania tarentolae 22 16 28301
90. LEIMT Mitochondrion Leishmania tarentolae 1 1 189
91. LEMMT Mitochondrion Lemur catta 1 1 895
92. LIGCP Chloroplast Ligularia calthifolia 1 1 103
93. LMIMT Mitochondrion Locusta migratoria 3 3 5118
94. LUAMT Mitochondrion Lupinus angustifolius 2 2 1330
95. LUPMT Mitochondrion Lupinus luteus 2 1 630
96. MACMT Mitochondrion Macaca fascicularis 2 2 1598
97. MACMT Mitochondrion Macaca fuscata 1 1 896
98. MACMT Mitochondrion Macaca mulatta 1 1 896
99. MACMT Mitochondrion Macaca sylvanus 1 1 896
100. MCXMT Mitochondrion Microxus mimus 2 2 802
101. MMUMT Mitochondrion Muntiacus reevesi 1 1 2682
102. MPOCP Chloroplast Marchantia polymorpha 10 1 121024
103. MSQMT Mitochondrion Aedes albopictus 9 9 3448
104. MUSMT Mitochondrion Mus musculus 22 15 20678
105. MZECP Chloroplast Zea mays 72 55 54150
106. MZECP Chloroplast Zea perennis 2 2 1456
107. MZEMT Mitochondrion Zea mays 53 45 80743
108. NEUMT Mitochondrion Neurospora crassa 44 39 55773
109. NEUMT Mitochondrion Neurospora intermedia 5 4 10804
110. NRACP Chloroplast Neurachne munroi 1 1 1990
111. NRACP Chloroplast Neurachne tenuifolia 1 1 2010
112. OBECP Chloroplast Oenothera berteriana 6 4 1813
113. OBEMT Mitochondrion Oenothera berteriana 18 15 25036
114. OBOCP Chloroplast Oenothera odorata 2 2 964
115. ODOMT Mitochondrion Odocoileus virginianus 1 1 2677
116. OHOCP Chloroplast Oenothera hookeri 2 2 2132
117. OLICP Chloroplast Olisthodiscus luteus 1 1 714
118. ORAMT Mitochondrion Pongo pygmaeus 1 1 895
119. OSPMT Mitochondrion Oenothera sp. 2 2 1635
120. PALMT Mitochondrion Paracentrotus lividus 17 17 21974
121. PANMT Mitochondrion Podospora anserina 21 20 65750
122. PARMT Mitochondrion Paramecium aurelia 9 9 8110
123. PARMT Mitochondrion Paramecium primaurelia 4 3 5645
124. PARMT Mitochondrion Paramecium sp. 34 17 12563
125. PARMT Mitochondrion Paramecium tetraurelia 4 4 5844
126. PEACP Chloroplast Pisum sativum 36 30 55118
127. PEAMT Mitochondrion Pisum sativum 8 6 10694
128. PENCP Chloroplast Pennisetum americanum 2 1 325
129. PETCP Chloroplast Petunia hybrida 7 7 5806
130. PETMT Mitochondrion Petunia hybrida 4 3 2954
131. PETMT Mitochondrion Petunia parodii 1 1 1774
132. PFAMT Mitochondrion Plasmodium falciparum 1 1 935
133. PGYMT Mitochondrion Paragyrodon sphaerosporus 2 1 337
134. PHVMT Mitochondrion Phaseolus vulgaris 1 1 88
135. PIGMT Mitochondrion Sus scrofa 2 2 686
136. PILCP Chloroplast Pilayella littoralis 1 1 353
137. PMGMT Mitochondrion Placopecten magellanicus 5 5 4580
138. POGMT Mitochondrion Thomomys townsendi 1 1 239
139. PZOCP Chloroplast Pelargonium zonale 2 2 463
140. RADMT Mitochondrion Raphanus sativus 2 2 5752
141. RATMT Mitochondrion Rattus norvegicus 35 26 23507
142. RATMT Mitochondrion Rattus rattus 4 4 4388
143. RHS Mitochondrion Rhizopogon ochraceorubens 2 1 341
144. RHS Mitochondrion Rhizopogon subcaerulescens 2 1 341
145. RICCP Chloroplast Oryza sativa 11 9 12553
146. RICMT Mitochondrion Oryza sativa 5 5 8084
147. RYECP Chloroplast Secale cereale 9 7 9269
148. SAIMT Mitochondrion Saimiri sciureus 1 1 893
149. SALCP Chloroplast Sinapis alba 8 6 9874
150. SAOCP Chloroplast Saponaria officinalis 1 1 1252
151. SCOMT Mitochondrion Schizophyllum commune 1 1 1120
152. SLMMT Mitochondrion Physarum polycephalum 1 1 1536
153. SNICP Chloroplast Solanum nigrum 1 1 1501
154. SOLCP Chloroplast Spirodela oligorhiza 9 9 6538
155. SOYCP Chloroplast Glycine max 14 9 16045
156. SOYMT Mitochondrion Glycine max 5 5 8683
157. SPFMT Mitochondrion Spodoptera frugiperda 1 1 446
158. SPICP Chloroplast Spinacia oleracea 47 38 79017
159. SRGCP Chloroplast Sorghum bicolor 1 1 862
160. SRGMT Mitochondrion Sorghum sp. 4 2 4768
161. STFMT Mitochondrion Asterina pectinifera 1 1 3849
162. SUIMT Mitochondrion Suillus cavipes 2 1 339
163. SUSMT Mitochondrion Strongylocentrotus drobachiensis 2 2 965
164. SUSMT Mitochondrion Strongylocentrotus franciscanus 3 3 1276
165. SUSMT Mitochondrion Strongylocentrotus intermedius 2 2 960
166. SUSMT Mitochondrion Strongylocentrotus pallidus 2 2 961
167. SUSMT Mitochondrion Strongylocentrotus purpuratus 5 4 16929
168. TARMT Mitochondrion Tarsius syrichta 1 1 895
169. TETMT Mitochondrion Tetrahymena thermophila 1 1 53
170. TEYMT Mitochondrion Tetrahymena pyriformis 14 13 12462
171. TOBCP Chloroplast Nicotiana acuminata 1 1 2052
172. TOBCP Chloroplast Nicotiana debneyi 3 3 4016
173. TOBCP Chloroplast Nicotiana otophora 1 1 2052
174. TOBCP Chloroplast Nicotiana plumbaginifolia 6 4 4169
175. TOBCP Chloroplast Nicotiana tabacum 47 40 200231
176. TOBMT Mitochondrion Nicotiana plumbaginifolia 2 1 1740
177. TOBMT Mitochondrion Nicotiana tabacum 4 3 4074
178. TOMCP Chloroplast Lycopersicon esculentum 1 1 103
179. TOMMT Mitochondrion Lycopersicon esculentum 2 1 558
180. TRBKP Kinetoplast Trypanosoma brucei 27 21 37228
181. TRBMT Mitochondrion Trypanosoma brucei 4 4 2285
182. TRCKP Kinetoplast Trypanosoma cruzi 27 27 11864
183. TREKP Kinetoplast Trypanosoma equiperdum 2 2 2017
184. TREKP Kinetoplast Trypanosoma evansi 3 2 1998
185. TRFKP Kinetoplast Crithidia fasciculata 19 18 12549
186. TRFKP Kinetoplast Crithidia oncopelti 6 3 1231
187. TRFMT Mitochondrion Crithidia fasciculata 2 1 2034
188. TRFMT Mitochondrion Crithidia oncopelti 1 1 149
189. TRLKP Kinetoplast Leptomonas sp. 1 1 2568
190. TRWKP Kinetoplast Trypanosoma lewisi 2 2 2036
191. VFACP Chloroplast Vicia faba 6 6 9547
192. VFAMT Mitochondrion Vicia faba 4 4 9356
193. WARMT Mitochondrion Pomatostomus isidori 1 1 239
194. WARMT Mitochondrion Pomatostomus ruficeps 1 1 239
195. WARMT Mitochondrion Pomatostomus superciliosus 1 1 239
196. WARMT Mitochondrion Pomatostomus temporalis 1 1 239
197. WHTCP Chloroplast Triticum aestivum 28 26 26229
198. WHTMT Mitochondrion Triticum aestivum 33 24 28296
199. XELMT Mitochondrion Xenopus laevis 6 5 25196
200. XERMT Mitochondrion Xerocomus chrysenteron 2 1 339
201. YSCMT Mitochondrion Saccharomyces cerevisiae 191 171 142613
202. YSGMT Mitochondrion Saccharomyces carlsbergensis 1 1 149
203. YSKMT Mitochondrion Kluyveromyces lactis 78 39 3699
204. YSKMT Mitochondrion Kluyveromyces thermotolerans 5 3 1287
205. YSLMT Mitochondrion Torulopsis glabrata 10 9 6200
206. YSPMT Mitochondrion Schizosaccharomyces pombe 10 9 13361
207. YSSMT Mitochondrion Cephalosporium acremonium 2 2 3029
208. YSTMT Mitochondrion Yeast sp. 4 4 5196
209. YSUMT Mitochondrion Candida utilis 1 1 306
210. YSVMT Mitochondrion Saccharomyces uvarum 3 3 2296
Total 1569 1271 1848854
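The Total line of each table is the column-wise sum of the Reports,
Entries, and Bases figures in the rows above it. The following is a
minimal sketch, not part of the GenBank distribution, of how a reader
might recompute those sums from the text of this appendix. It assumes
each row has the layout "index. key name reports entries bases", and
that a row whose name is too long may carry its three counts on the
following line. The names ROW, parse_rows, and column_totals are
hypothetical.

```python
import re

# Assumed row layout: "<index>. <key> <name> <reports> <entries> <bases>".
# The name may itself end in digits (e.g. "Anabaena 7120"), so the last
# three integers on the line are taken as the counts.
ROW = re.compile(r"^(\d+)\.\s+(\S+)\s+(.+?)\s+(\d+)\s+(\d+)\s+(\d+)$")
ROW_START = re.compile(r"^\d+\.\s")


def _to_record(match):
    _, key, name, reports, entries, bases = match.groups()
    return key, name, int(reports), int(entries), int(bases)


def parse_rows(lines):
    """Yield (key, name, reports, entries, bases) for each table row."""
    pending = None  # start of a row whose counts may be on the next line
    for line in lines:
        text = line.strip()
        if pending is not None:
            joined, pending = pending + " " + text, None
            m = ROW.match(joined)
            if m:
                yield _to_record(m)
                continue
        m = ROW.match(text)
        if m:
            yield _to_record(m)
        elif ROW_START.match(text):
            pending = text
        # Headers, rules, and "Total" lines match neither and are skipped.


def column_totals(lines):
    """Return (reports, entries, bases) summed over all parsed rows."""
    reports = entries = bases = 0
    for _, _, r, e, b in parse_rows(lines):
        reports += r
        entries += e
        bases += b
    return reports, entries, bases


sample = [
    "Key Name Reports Entries Bases",
    "1. ABGMT Mitochondrion Absidia glauca 1 1 596",
    "2. ACUCP Chloroplast Acorus calamus 1 1 103",
    "163. SUSMT Mitochondrion Strongylocentrotus drobachiensis",
    "2 2 965",
]
totals = column_totals(sample)
```

Running column_totals over a complete table and comparing the result
with that table's Total line is one way to confirm that a copy of this
appendix was transferred intact.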
BACTERIAL
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ABC Acetobacter aceti 1 1 1624
2. ABC Acetobacter xylinum 1 1 9540
3. AC2 Plasmid pAC27 1 1 803
4. ACC Acinetobacter calcoaceticus 13 10 27286
5. ACC Acinetobacter sp. 4 3 5193
6. ACH Achromobacter sp. 1 1 2414
7. ACL Acholeplasma laidlawii 3 2 1858
8. ACN Actinobacillus actinomycetemcomitans 1 1 3842
9. ACN Actinobacillus pleuropneumoniae 1 1 3831
10. ACO Acetogenium kivui 1 1 2477
11. ACY Actinomyces naeslundii 1 1 2160
12. ACY Actinomyces viscosus 2 1 1850
13. AER Aeromicrobium erythreus 1 1 1463
14. AFA Alcaligenes eutrophus 9 9 23911
15. AFA Alcaligenes faecalis 4 4 8591
16. AFA Plasmid pJP4 7 5 11105
17. AHA Aphanocapsa sp. 1 1 1920
18. AMC Acidaminococcus fermentans 2 1 3245
19. AMS Ampullariella sp. 1 1 1892
20. ANA Anabaena 7120 8 6 15235
21. ANA Anabaena sp. 27 19 26386
22. ANI Anacystis nidulans 32 26 36901
23. ANN Actinoplanes missouriensis 2 1 1639
24. APM Anaplasma marginale 5 4 11487
25. AQU Agmenellum quadruplicatum 3 3 5497
26. ARF Archaeoglobus fulgidus 3 2 1727
27. ARG Arthrobacter sp. 1 1 2075
28. ATU Agrobacterium rhizogenes 8 6 14645
29. ATU Agrobacterium sp. 1 1 1599
30. ATU Agrobacterium tumefaciens 38 31 62767
31. AVH Azotobacter chroococcum 1 1 1654
32. AVI Azotobacter vinelandii 24 20 80709
33. AZS Azospirillum brasilense 1 1 1910
34. BAD Bacillus caldolyticus 1 1 1147
35. BAL Bacillus caldotenax 5 5 5550
36. BAM Bacillus amyloliquefaciens 19 16 17715
37. BAN Bacillus anthracis 4 4 14401
38. BBR Bacillus brevis 11 10 25879
39. BC1 Plasmid pBC16 1 1 257
40. BCC Bacillus coagulans 2 2 2433
41. BCE Bacillus cereus 18 14 17016
42. BCI Bacillus circulans 9 6 12823
43. BCQ Bacillus Q 5 5 786
44. BFI Bacillus firmus 1 1 1434
45. BIF Bifidobacterium longum 1 1 1767
46. BLI Bacillus licheniformis 21 18 19815
47. BLL Bacillus lautus 1 1 2323
48. BMA Bacillus macerans 1 1 2744
49. BME Bacillus megaterium 24 19 31050
50. BNG Bacteroides gingivalis 1 1 1420
51. BNO Bacteroides nodosus 6 5 4828
52. BNR Bacteroides fragilis 6 5 8556
53. BOR Borrelia burgdorferi 7 6 3181
54. BPE Bordetella bronchiseptica 1 1 4936
55. BPE Bordetella parapertussis 3 3 4749
56. BPE Bordetella pertussis 21 15 42985
57. BPO Bacillus polymyxa 4 3 8764
58. BPU Bacillus pumilus 12 9 8915
59. BRL Brevibacterium epidermidis 2 1 1721
60. BRL Brevibacterium lactofermentum 9 8 22049
61. BRU Brucella abortus 2 2 5253
62. BS2 Plasmid pBS2 1 1 2279
63. BSN Bacillus natto 1 1 676
64. BSP Bacillus sp. 22 21 48210
65. BSS Bacillus sphaericus 16 12 30172
66. BST Bacillus stearothermophilus 41 39 62425
67. BSU Bacillus subtilis 274 216 357850
68. BTH Bacillus thuringiensis 67 52 149803
69. BTT Thermoactinomyces thalpophilus 2 2 1036
70. BUT Butyrivibrio fibrisolvens 2 2 4411
71. C11 Plasmid pColBM-C1139 1 1 2149
72. C1B Plasmid Colicin B4 3 3 1561
73. CAJ Campylobacter coli 3 3 4985
74. CAJ Campylobacter fetus 1 1 3974
75. CAJ Campylobacter jejuni 4 4 5410
76. CB2 Plasmid Colicin B2 1 1 360
77. CCR Caulobacter crescentus 24 24 15105
78. CD1 Plasmid Colicin D 1 1 1099
79. CDC Caldocellum saccharolyticum 6 5 16338
80. CE1 Plasmid Colicin E1 42 31 17722
81. CE2 Plasmid Colicin E2 5 5 4629
82. CE3 Plasmid Colicin E3 1 1 392
83. CE5 Plasmid Colicin E5-099 2 2 2401
84. CE8 Plasmid Colicin E8 1 1 1268
85. CE9 Plasmid Colicin E9 2 2 2148
86. CEC Plasmid Colicin E3-CA38 8 3 4883
87. CEC Plasmid Colicin E6-CT14 2 2 4675
88. CFI Cellulomonas fimi 7 7 5787
89. CFI Cellulomonas uda 1 1 1828
90. CFR Citrobacter freundii 6 5 4522
91. CFX Chloroflexus aurantiacus 4 3 3568
92. CGF Chlorogloeopsis fritschii 1 1 210
93. CH1 Plasmid pCHL1 2 2 8101
94. CHT Chlamydia psittaci 5 5 6932
95. CHT Chlamydia trachomatis 36 22 61362
96. CIA Plasmid Colicin Ia 1 1 3727
97. CIB Plasmid Colicin Ib 5 5 10934
98. CIB Plasmid Colicin Ib-P9 2 2 2373
99. CLA Plasmid Colicin A 4 4 3950
100. CLD Plasmid CloDF13 13 1 9957
101. CLK Plasmid Colicin K 2 2 815
102. CLN Plasmid Colicin V 1 1 412
103. CLN Plasmid Colicin V-K30 1 1 1465
104. CLN2 Plasmid Colicin V2-K94 1 1 550
105. CLO Clostridium acetobutylicum 9 7 14479
106. CLO Clostridium acidiurici 1 1 2266
107. CLO Clostridium botulinum 1 1 4835
108. CLO Clostridium cellulolyticum 1 1 2405
109. CLO Clostridium difficile 4 3 10451
110. CLO Clostridium innocuum 1 1 1544
111. CLO Clostridium pasteurianum 25 17 18163
112. CLO Clostridium perfringens 10 9 7118
113. CLO Clostridium sordellii 1 1 1504
114. CLO Clostridium tetani 3 3 10529
115. CLO Clostridium thermoaceticum 1 1 1965
116. CLO Clostridium thermocellum 10 8 16656
117. CLO Clostridium thermohydrosulfuricum 1 1 4839
118. CLO Clostridium thermosulfurogenes 2 2 4258
119. CLT Calothrix sp. 12 12 1936
120. CLV Plasmid ColVBtrp 1 1 441
121. CN2 Plasmid pCN2 1 1 117
122. CN3 Plasmid pCN3 1 1 114
123. COR Corynebacterium diphtheriae 2 1 2529
124. COR Corynebacterium glutamicum 9 6 16234
125. COR Corynebacterium nephridii 1 1 615
126. COR Corynebacterium sp. 2 2 2512
127. COX Coxiella burnetii 3 3 5956
128. CPC Cryptococcus albidus 2 1 2984
129. CPC Cryptococcus neoformans 1 1 2029
130. CYA Cyanobacterium nostoc 2 2 4220
131. CYT Cytophaga lytica 1 1 1509
132. DCG Dictyoglomus thermophilum 3 3 6900
133. DEI Deinococcus radiodurans 2 2 4970
134. DMO Desulfurococcus mobilis 7 7 9730
135. DSB Desulfobacterium autotrophicum 1 1 1376
136. DSB Desulfobacterium niacini 1 1 1375
137. DSB Desulfobacterium vacuolatum 1 1 1383
138. DSF Desulfococcus multivorans 1 1 1372
139. DSI Desulfomicrobium baculatus 1 1 1379
140. DSL Desulfomonas pigra 1 1 1381
141. DSO Desulfotomaculum orientis 1 1 1402
142. DSO Desulfotomaculum ruminis 1 1 1368
143. DSP Desulfobacter curvatus 1 1 1396
144. DSP Desulfobacter hydrogenophilus 1 1 1390
145. DSP Desulfobacter latus 1 1 1373
146. DSP Desulfobacter sp. 2 2 2869
147. DSU Desulfobulbus propionicus 1 1 1371
148. DSU Desulfobulbus sp. 1 1 1365
149. DSV Desulfosarcina variabilis 1 1 1527
150. DVU Desulfovibrio africanus 1 1 1382
151. DVU Desulfovibrio baarsii 2 2 1589
152. DVU Desulfovibrio baculatus 2 1 2589
153. DVU Desulfovibrio desulfuricans 5 5 4678
154. DVU Desulfovibrio fructosovorans 1 1 3180
155. DVU Desulfovibrio gigas 2 2 4126
156. DVU Desulfovibrio multispirans 1 1 186
157. DVU Desulfovibrio salexigens 2 2 2107
158. DVU Desulfovibrio sapovorans 1 1 1395
159. DVU Desulfovibrio vulgaris 9 9 11185
160. EAM Erwinia amylovora 2 2 1641
161. ECA Erwinia carotovora 14 12 17877
162. ECB Erwinia herbicola 1 1 4902
163. ECH Erwinia chrysanthemi 19 13 27706
164. ECO Escherichia coli 1716 1188 1848929
165. ECO Escherichia fergusonii 1 1 3133
166. ECO F sex factor plasmid 3 3 5370
167. ECO Plasmid Colicin BM-Cl139 3 3 3707
168. ECO Plasmid pCU1 1 1 2056
169. ECO Plasmid pF166 1 1 2133
170. EHP Ectothiorhodospira halophila 1 1 121
171. EHR Ehrlichia risticii 2 1 1498
172. EHV Ectothiorhodospira vacuolata 1 1 120
173. ENC Enterococcus faecium 1 1 1900
174. ENR Plasmid ENTR 2 2 1273
175. ENS Plasmid ENT 1 1 866
176. ENT Enterobacter aerogenes 6 6 5330
177. ENT Enterobacter agglomerans 3 3 1272
178. ENT Enterobacter cloacae 8 8 9262
179. ETA Edwardsiella tarda 2 1 306
180. EUB Eubacterium sp. 6 5 10586
181. FA3 Plasmid pFA3 1 1 1597
182. FDI Fremyella diplosiphon 26 20 29933
183. FIB Fibrobacter succinogenes 2 2 3736
184. FPL Plasmid F 30 23 37662
185. FRA Frankia sp. 3 2 3758
186. FRN Francisella tularensis 1 1 1233
187. FVB Flavobacterium heparinum 1 1 1528
188. FVB Flavobacterium okeanokoites 8 8 9873
189. FVB Flavobacterium sp. 4 4 6028
190. GS5 Plasmid pGS05 1 1 1357
191. HAF Hafnia alvei 1 1 2961
192. HAL Halobacterium cutirubrum 7 5 8712
193. HAL Halobacterium halobium 36 26 45031
194. HAL Halobacterium salinarium 1 1 606
195. HAL Halobacterium sp. 14 13 22487
196. HAL Haloferax volcanii 1 1 3566
197. HCL Heliobacterium chlorum 1 1 1512
198. HCU Halobacterium cutirubrum 2 1 3116
199. HEC Helicobacter felis 3 2 2887
200. HEC Helicobacter mustelae 2 1 1435
201. HEH Haemophilus haemolyticus 2 2 3186
202. HEI Haemophilus influenzae 96 41 17798
203. HEP Haemophilus parainfluenzae 5 5 853
204. HLF Haloferax sp. 4 3 1187
205. HMO Halococcus morrhuae 2 2 4402
206. HPT Herpetosiphon aurantiacus 1 1 1484
207. HV2 Plasmid pHV2 1 1 6354
208. IM13 Plasmid pIM13 1 1 2246
209. INC Plasmid incB 1 1 352
210. INC Plasmid incI-1 1 1 418
211. INC Plasmid incI-gamma 1 1 417
212. INS Insertion sequence 10 10 4266
213. INS Insertion sequence IS1 5 4 3243
214. INS Insertion sequence IS150 2 1 1443
215. INS Insertion sequence IS186 2 2 2677
216. INS Insertion sequence IS2 4 4 517
217. INS Insertion sequence IS26 1 1 859
218. INS Insertion sequence IS30 1 1 1221
219. INS Insertion sequence IS4 1 1 1426
220. INS Insertion sequence IS476 1 1 1225
221. INS Insertion sequence IS493 1 1 1641
222. INS Insertion sequence IS5 3 2 1570
223. INS Insertion sequence IS891 1 1 1351
224. INS Insertion sequence ISHS1 1 1 1449
225. JD1 Plasmid pJD1 2 1 4207
226. JS3 Plasmid pJS37 3 3 252
227. KAE Klebsiella aerogenes 18 16 23367
228. KCI Kluyvera citrophila 1 1 2734
229. KPN Klebsiella pneumoniae 71 55 109716
230. KPN Plasmid pJHC-MW1 1 1 1352
231. KPO Klebsiella oxytoca 2 2 4901
232. KY1 Plasmid pKY1 1 1 3022
233. KYM Plasmid pKYM 1 1 2083
234. LAC Lactococcus lactis 13 12 24807
235. LAE Listonella ordalii 2 1 120
236. LAE Listonella tubiashii 2 1 120
237. LB1 Plasmid p1 1 1 533
238. LB3 Lactobacillus 30a 3 2 2189
239. LBA Lactobacillus acidophilus 1 1 400
240. LBB Lactobacillus bulgaricus 1 1 536
241. LBD Lactobacillus delbrueckii 7 4 5405
242. LBH Lactobacillus helveticus 1 1 3292
243. LBP Lactobacillus plantarum 3 2 3664
244. LBP Plasmid pC30il 1 1 2140
245. LBP Plasmid pLP1 1 1 2093
246. LCA Lactobacillus casei 6 6 9787
247. LCO Lactobacillus confusus 1 1 1320
248. LEP Leptospira biflexa 2 2 4788
249. LEP Leptospira interrogans 2 1 3244
250. LIS Listeria monocytogenes 3 2 3940
251. LM0 Plasmid pLM020 1 1 2330
252. LPN Legionella pneumophila 2 2 2005
253. LS1 Plasmid pLS11 1 1 253
254. MBA Methanobacterium ivanovii 2 1 1353
255. MBF Methanobacterium formicicum 1 1 3597
256. MBH Methanobrevibacter smithii 3 3 7221
257. MBI Methanobacterium thermoautotrophicum 10 7 26621
258. MBI Plasmid pME2001 1 1 1440
259. MBO Moraxella bovis 2 2 5044
260. MBO Moraxella lacunata 1 1 969
261. MBO Moraxella sp. 2 1 3034
262. MCL Mastigocladus laminosus 1 1 1701
263. MEC Micromonospora echinospora 1 1 398
264. MEF Methanothermus fervidus 9 8 18721
265. MEH Methanospirillum hungatei 1 1 295
266. MEN Methanolobus tindarius 1 1 128
267. MES Methanosarcina barkeri 5 3 13117
268. MLC Methylococcus capsulatus 1 1 2463
269. MLU Micrococcus luteus 10 9 19465
270. MLY Micrococcus lysodeikticus 1 1 166
271. MPL Mycoplasma-like organism 1 1 1535
272. MSG Mycobacterium bovis 7 6 7320
273. MSG Mycobacterium leprae 7 5 11027
274. MSG Mycobacterium tuberculosis 15 9 18387
275. MSG Plasmid pAL5000 1 1 4837
276. MTB Methylobacterium extorquens 1 1 4500
277. MTB Methylobacterium sp. 1 1 2791
278. MTB Methylobacterium specialis 2 1 2211
279. MTF Methylobacillus flagellatum 1 1 1349
280. MV1 Plasmid pMV158 1 1 2436
281. MVA Methanococcus vannielii 19 17 28753
282. MVO Methanococcus voltae 11 10 15241
283. MVT Methanococcus thermolithotrophicus 4 3 2820
284. MXA Myxococcus xanthus 16 15 27127
285. MXB Lysobacter enzymogenes 3 2 3218
286. MYC Mycoplasma capricolum 14 13 21175
287. MYC Mycoplasma hyopneumoniae 3 3 1928
288. MYC Mycoplasma mycoides 4 4 2716
289. MYC Mycoplasma sp. 41 37 51236
290. MYC Plasmid pADB201 1 1 1717
291. NAH Plasmid NAH7 (from P. putida) 6 5 3771
292. NAT Natronobacterium pharaonis 1 1 1015
293. NG2 Plasmid pNG2 1 1 1810
294. NGO Neisseria flavescens 1 1 1228
295. NGO Neisseria gonorrhoeae 63 55 50544
296. NGO Neisseria meningitidis 12 8 9940
297. NOC Nocardia mediterranei 3 3 450
298. NOS Nostoc commune 1 1 4241
299. NR1 Plasmid NR1 4 3 6463
300. NT1 Plasmid NTP1 2 2 1440
301. NT1 Plasmid NTP16 1 1 2730
302. P15 Plasmid P15A 2 2 1226
303. P18X Plasmid pACYC184 2 2 171
304. P23 Plasmid pMM2-3 2 2 182
305. P307 Plasmid P307 3 3 4629
306. P53 Plasmid pMM5-3 4 4 429
307. P55 Plasmid pMM5-5 4 4 420
308. PAC Plasmid P177 1 1 345
309. PAM Plasmid PAM177 1 1 1443
310. PAS Pasteurella haemolytica 4 3 15958
311. PAZ Plasmid pAZ1 1 1 808
312. PB0 Plasmid pUB110 9 8 12606
313. PB2 Plasmid pUB112 1 1 901
314. PBF4 Plasmid pBF4 1 1 1041
315. PBW Plasmid pBWH77 2 2 1623
316. PC1 Plasmid pC194 2 2 3946
317. PC2 Plasmid pC221 2 1 4555
318. PDE Paracoccus denitrificans 10 7 17422
319. PDG Plasmid pDGO100 2 2 3683
320. PDU Plasmid pDU1358 2 2 5076
321. PE1 Plasmid pE194 7 3 5039
322. PE2 Plasmid pED208 2 2 5640
323. PHL Plasmid pHly152 1 1 8215
324. PI25 Plasmid pI258 5 4 12140
325. PIJ Plasmid pIJ101 2 2 9188
326. PIP Plasmid pIP401 2 2 383
327. PIP Plasmid pIP630 1 1 1883
328. PIP11 Plasmid pIP1100 1 1 1386
329. PIP404 Plasmid pIP404 4 3 15188
330. PJH Plasmid pJH1 1 1 1489
331. PJM1 Plasmid pJM1 1 1 3581
332. PJR Plasmid PJR225 1 1 1527
333. PKL Plasmid pKLH1 1 1 160
334. PKL Plasmid pKLH102 2 2 351
335. PKL Plasmid pKLH104 1 1 131
336. PKL Plasmid pKLH2 2 2 674
337. PKL Plasmid pKLH201 1 1 153
338. PKM Plasmid pKM101 1 1 1797
339. PLB Plasmid pLB1 1 1 2190
340. PLM Plasmid pAA3.7X 3 1 9583
341. PLP Plasmid pSa 1 1 1447
342. PME Plasmid pMEA100 1 1 150
343. PMM Plasmid pMM110 1 1 240
344. PMO Plasmid pMON234 1 1 997
345. PNE Plasmid pNE131 2 1 2355
346. PNS Plasmid pNS1 1 1 3879
347. PNS Plasmid pNS1981 4 3 1819
348. PO2 Plasmid pOAD2 2 2 2914
349. PR1 Plasmid R1 13 10 7500
350. PR2 Plasmid R1126 1 1 428
351. PR6 Plasmid R6-5 2 1 858
352. PRC Plasmid R 1 1 1487
353. PRI Plasmid PRI13 2 1 2234
354. PRM Morganella morganii 2 2 1831
355. PRM Proteus mirabilis 7 6 16319
356. PRM Proteus vulgaris 13 8 14186
357. PRO Providencia sp. 1 1 1135
358. PRO Providencia stuartii 1 1 3889
359. PRS Propionibacterium shermanii 3 2 4951
360. PS1 Streptomyces lividans plasmid pS1 1 1 75
361. PSA Plasmid pSA2100 1 1 98
362. PSC Plasmid pSC101 16 10 16807
363. PSE Plasmid pCMS1 1 1 1322
364. PSE Pseudomonas aeruginosa 99 77 110225
365. PSE Pseudomonas amyloderamosa 5 2 4488
366. PSE Pseudomonas cepacia 2 2 5867
367. PSE Pseudomonas fluorescens 10 8 17694
368. PSE Pseudomonas fragi 2 2 1682
369. PSE Pseudomonas paucimobilis 1 1 1080
370. PSE Pseudomonas pseudoalcaligenes 1 1 2040
371. PSE Pseudomonas putida 28 26 68302
372. PSE Pseudomonas sp. 25 22 39293
373. PSE Pseudomonas syringae 11 10 27701
374. PSE Pseudomonas testosteroni 2 2 2435
375. PSE TOL Plasmid (from Pseudomonas putida) 11 7 9788
376. PSE Zoogloea ramigera 1 1 1524
377. PSM SYM megaplasmid (from R. meliloti) 9 8 5150
378. PSN Plasmid pSN2 1 1 1288
379. PT1 Plasmid pT181 7 4 5479
380. PTB Plasmid pTB913 1 1 1200
381. PWM Plasmid pWM5 1 1 569
382. PWP Plasmid pWP7b 1 1 1370
383. PWR Plasmid PWR60 1 1 4832
384. PYR Pyrodictium occultum 4 4 2077
385. PYW Pyrococcus woesi 2 1 124
386. R10 Plasmid R100 25 17 26157
387. R11 Plasmid R1162 5 5 2389
388. R12 Plasmid R124 1 1 272
389. R14 Plasmid R144 1 1 801
390. R26 Plasmid R26 1 1 1541
391. R27 Plasmid R27 1 1 1507
392. R36 Plasmid R386 1 1 441
393. R37 Plasmid R387 1 1 1160
394. R38 Plasmid R388 2 2 3204
395. R41 Plasmid R401 1 1 1857
396. R45 Plasmid R485 1 1 591
397. R46 Plasmid R46 3 3 2859
398. R48 Plasmid R483 1 1 1618
399. R53 Plasmid R538 3 2 1712
400. R65 Plasmid R65 2 2 1380
401. R67 Plasmid R67 1 1 293
402. R6K Plasmid R6K 7 6 1894
403. R75 Plasmid R751 4 4 1697
404. R77 Plasmid R773 1 1 4347
405. RA1 Plasmid RA1 1 1 758
406. RBH Plasmid pRBH1 2 2 1521
407. RBL Rhodopseudomonas acidophila 1 1 1491
408. RBL Rhodopseudomonas blastica 1 1 12368
409. RCA Rhodobacter capsulatus 32 26 52831
410. RDC Rhodocyclus purpureus 1 1 1478
411. REI Plasmid pRE-I 1 1 439
412. RER Rhodococcus erythropolis 2 1 2070
413. RER Rhodococcus fascians 2 1 121
414. RGN Plasmid RGN238 2 1 2427
415. RHA Azorhizobium caulinodans 4 2 3849
416. RHB Bradyrhizobium japonicum 23 18 34554
417. RHF Rhizobium fredii 1 1 2862
418. RHH Rhizobium phaseoli 4 4 3681
419. RHI Bradyrhizobium sp. 2 2 6665
420. RHI Rhizobium sp. 8 7 10894
421. RHJ Rhizobium japonicum 10 8 10225
422. RHL Plasmid pRL1JI 5 1 12055
423. RHL Rhizobium leguminosarum 13 13 19168
424. RHM Rhizobium meliloti 52 45 93945
425. RHR Rhizobium IRc78 2 2 2199
426. RHT Rhizobium trifolii 5 5 6886
427. RIA Plasmid Ri 1 1 21126
428. RIR Rickettsia conorii 1 1 539
429. RIR Rickettsia prowazekii 5 4 9706
430. RIR Rickettsia rickettsii 4 4 8555
431. RIR Rickettsia tsutsugamushi 2 2 5186
432. RIR Rickettsia typhi 2 2 1067
433. RIR Rochalimaea quintana 1 1 1493
434. RK2 Plasmid RK2 10 7 9952
435. RMV Rhodomicrobium vannielii 1 1 1484
436. ROS Roseburia cecicola 1 1 1031
437. RP1 Plasmid RP1 1 1 2709
438. RP4 Plasmid RP4 5 5 2997
439. RSF Plasmid RSF1010 7 7 10521
440. RSP Rhodospirillum rubrum 11 11 21804
441. RSS Rhodobacter sphaeroides 17 17 14217
442. RTS Plasmid Rts1 2 1 1855
443. RUA Ruminobacter amylophilus 2 1 2867
444. RUM Ruminococcus albus 1 1 2180
445. RVI Rhodopseudomonas viridis 5 4 3885
446. SA2 Plasmid pSAM2 3 3 866
447. SAC Sulfolobus acidocaldarius 6 5 15339
448. SAU Stigmatella aurantiaca 2 1 1300
449. SB2 Plasmid pSB24.2 2 2 7412
450. SCP Plasmid SCP1 1 1 2513
451. SE2 Plasmid pSE211 2 2 3017
452. SER Saccharopolyspora erythraea 10 7 8075
453. SHD Shigella dysenteriae 7 7 11010
454. SHF Plasmid pMYSH6000 1 1 4472
455. SHF Shigella flexneri 12 10 24401
456. SHS Shigella sonnei 11 11 6738
457. SLP1 Plasmid SLP1 3 3 630
458. SMA Serratia marcescens 31 25 35311
459. SMA Serratia sp. 1 1 2570
460. SME Spiroplasma melliferum 1 1 1510
461. SME Spiroplasma sp. 1 1 5025
462. SMY Plasmid pSL2 1 1 345
463. SMY1 Plasmid pSL1 2 2 633
464. SPA Spirochaeta aurantia 1 1 1257
465. SPO Sporolactobacillus laevis 1 1 118
466. SPO Sporosarcina ureae 2 1 116
467. SPU Spirulina platensis 1 1 5273
468. SSO Sulfolobus shibatae 1 1 1495
469. SSO Sulfolobus solfataricus 6 6 5873
470. SSP Sulfolobus sp. 7 4 19617
471. STA Plasmid pT48 1 1 2475
472. STA Staphylococcus aureus 75 56 85940
473. STA Staphylococcus carnosus 1 1 720
474. STA Staphylococcus epidermidis 1 1 423
475. STA Staphylococcus haemolyticus 1 1 1087
476. STA Staphylococcus hyicus 1 1 2212
477. STA Staphylococcus mutans 1 1 2288
478. STA Staphylococcus simulans 1 1 1486
479. STA Staphylococcus staphylolyticus 1 1 1825
480. STM Streptomyces antibioticus 1 1 1567
481. STM Streptomyces avidinii 1 1 638
482. STM Streptomyces azureus 1 1 1521
483. STM Streptomyces clavuligerus 6 4 5745
484. STM Streptomyces coelicolor 15 12 17749
485. STM Streptomyces fradiae 6 6 11455
486. STM Streptomyces glaucescens 6 4 6431
487. STM Streptomyces griseus 16 12 25048
488. STM Streptomyces hygroscopicus 7 7 6374
489. STM Streptomyces lavendulae 2 2 2078
490. STM Streptomyces limosus 1 1 2291
491. STM Streptomyces lividans 25 20 14900
492. STM Streptomyces plicatus 2 2 1245
493. STM Streptomyces rochei 4 4 2390
494. STM Streptomyces sp. 47 39 49732
495. STM Streptomyces thermotolerans 1 1 1260
496. STM Streptomyces vinaceus 1 1 1119
497. STR Plasmid pAM-beta-1 3 3 6001
498. STR Plasmid pMK157 1 1 1920
499. STR Streptococcus equisimilis 1 1 2568
500. STR Streptococcus faecalis 6 6 8343
501. STR Streptococcus lactis 8 7 8222
502. STR Streptococcus mutans 15 13 46945
503. STR Streptococcus pneumoniae 49 34 55746
504. STR Streptococcus pyogenes 24 18 36619
505. STR Streptococcus sanguis 3 3 8094
506. STR Streptococcus sobrinus 1 1 4995
507. STR Streptococcus sp. 17 14 31406
508. STV Streptoverticillum sp. 1 1 1130
509. STY Plasmid R1767 1 1 1519
510. STY Plasmid R64 1 1 482
511. STY Salmonella infantis 1 1 3430
512. STY Salmonella potsdam 1 1 1727
513. STY Salmonella rubislaw 1 1 1479
514. STY Salmonella sp. 7 6 6050
515. STY Salmonella typhimurium 174 142 204225
516. SYC Synechocystis sp. 15 11 18768
517. SYN Synechococcus sp. 15 15 38283
518. TBA Thermophilic bacterium 6 3 11617
519. TDT Trichodesmium thiebautii 1 1 357
520. TFE Plasmid pTF-FC2 1 1 329
521. TFE Thiobacillus acidophilus 9 4 599
522. TFE Thiobacillus ferrooxidans 9 7 14624
523. TFE Thiobacillus sp. 1 1 2172
524. THA Thermoplasma acidophilum 3 3 7703
525. THC Thermococcus celer 2 2 1312
526. THF Thermomonospora fusca 1 1 264
527. THP Thermofilum pendens 1 1 240
528. THR Thermomicrobium roseum 1 1 1528
529. TIP Plasmid pTiAch5 1 1 1164
530. TIP Plasmid pTiAg162 1 1 420
531. TIP Plasmid pTiB6S3 1 1 4203
532. TIP Plasmid pTiC58 4 4 5214
533. TIP Plasmid Ti (from A. tumefaciens) 61 52 120107
534. TMO Thermotoga maritima 2 2 2763
535. TRN Transposon gamma-delta 6 3 1092
536. TRN Transposon Tn10 1 1 830
537. TRN Transposon Tn21 1 1 1333
538. TRN Transposon Tn2501 1 1 480
539. TRN Transposon Tn3 2 2 389
540. TRN Transposon Tn3411 2 1 2925
541. TRN Transposon Tn4521 1 1 1315
542. TRN Transposon Tn5 1 1 2040
543. TRN Transposon Tn501 1 1 86
544. TRN Transposon Tn602 4 4 639
545. TRN10 Transposon Tn10 11 4 6024
546. TRN15 Transposon Tn1525 1 1 1721
547. TRN16 Transposon Tn1681 1 1 658
548. TRN17 Transposon Tn1721 8 8 1797
549. TRN1771 Transposon Tn1771 3 3 348
550. TRN21 Transposon Tn21 4 4 6671
551. TRN25 Transposon Tn2501 1 1 1539
552. TRN26 Transposon Tn2680 1 1 194
553. TRN3 Transposon Tn3 10 8 6351
554. TRN34 Transposon Tn3411 1 1 1321
555. TRN43 Transposon Tn4351 2 1 1982
556. TRN431 Transposon Tn431 3 3 2405
557. TRN4551 Transposon Tn4551 1 1 2080
558. TRN4556 Transposon Tn4556 2 2 86
559. TRN5 Transposon Tn5 9 6 4978
560. TRN501 Transposon Tn501 4 4 7310
561. TRN554 Transposon Tn554 5 1 6691
562. TRN7 Transposon Tn7 7 7 7535
563. TRN9 Transposon Tn9 2 2 1362
564. TRN903 Transposon Tn903 6 6 6118
565. TRN916 Transposon Tn916 1 1 1740
566. TRN917 Transposon Tn917 3 3 6353
567. TRNCAM Transposon Tn-Cam204 1 1 921
568. TRP Treponema pallidum 7 6 8045
569. TTE Thermoproteus tenax 8 8 4399
570. TTH Thermus aquaticus 6 5 7004
571. TTH Thermus caldophilus 1 1 1229
572. TTH Thermus flavus 1 1 1771
573. TTH Thermus thermophilus 23 15 28221
574. TTV Thermoproteus tenax virus 1 2 1 13669
575. URE Ureaplasma urealyticum 2 2 3982
576. VCH Vibrio cholerae 17 15 18430
577. VI1 Plasmid pVI150 1 1 972
578. VIB Aeromonas hydrophila 7 6 7237
579. VIB Aeromonas sobria 2 1 2510
580. VIB Photobacterium leiognathi 5 4 10047
581. VIB Photobacterium sp. 5 5 8715
582. VIB Vibrio alginolyticus 5 4 12901
583. VIB Vibrio anguillarum 1 1 4379
584. VIB Vibrio fischeri 4 4 5791
585. VIB Vibrio harveyi 12 11 14283
586. VIB Vibrio parahaemolyticus 2 2 2861
587. VIB Vibrio sp. 3 2 1390
588. VIT Vitreoscilla sp. 2 1 689
589. VIT Vitreoscilla stercoraria 1 1 745
590. VVU Vibrio vulnificus 1 1 2237
591. W10 Plasmid pWR100 2 1 4761
592. WOL Wolinella succinogenes 1 1 91
593. WP1 Plasmid pWP113a 1 1 1316
594. WP1 Plasmid pWP116a 1 1 1336
595. WP1 Plasmid pWP14a 1 1 1336
596. XAA Xanthobacter autotrophicus 1 1 3041
597. XAN Xanthomonas campestris 3 3 4683
598. XEN Xenorhabdus luminescens 1 1 2553
599. YEP Plasmid pYV03 2 1 3316
600. YEP Yersinia bercovieri 2 2 257
601. YEP Yersinia enterocolitica 26 24 18913
602. YEP Yersinia pestis 4 4 4462
603. YEP Yersinia pseudotuberculosis 10 8 15439
604. ZMO Zymomonas mobilis 9 8 13565
Total 5528 4293 6992664
STRUCTURAL RNA
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AAU Auricularia auricula-judae 1 1 118
2. ABC Acetobacter sp. 2 2 236
3. ACA Acanthamoeba castellanii 3 3 400
4. ACC Acinetobacter calcoaceticus 2 2 1652
5. ACH Achromobacter cycloclastes 1 1 120
6. ACH Achromobacter xylosoxidans 1 1 114
7. ACL Acholeplasma entomophilum 1 1 1476
8. ACL Acholeplasma modicum 1 1 1473
9. ACN Actinobacillus actinomycetemcomitans 3 3 494
10. ACN Actinobacillus equuli 2 2 445
11. ACN Actinobacillus hominis 3 3 494
12. ACN Actinobacillus lignieresii 3 3 1931
13. ACS Avian sarcoma virus 1 1 75
14. ACY Actinomyces bovis 1 1 1368
15. ACY Actinomyces israelii 2 2 1879
16. ACY Actinomyces naeslundii 1 1 1378
17. ACY Actinomyces odontolyticus 1 1 1359
18. ACY Actinomyces pyogenes 2 1 1361
19. ACY Actinomyces viscosus 1 1 1351
20. AED Agaricus edulis 1 1 118
21. AEQ Actinia equina 2 1 120
22. AFA Alcaligenes eutrophus 1 1 1511
23. AFA Alcaligenes faecalis 6 6 3410
24. AKK Akkesiphycus lubricum 2 1 118
25. ALF Medicago sativa 1 1 119
26. ALL Asteroleplasma anaerobium 1 1 1471
27. ALR Rous sarcoma virus 1 1 75
28. AMG Acyrthosiphon magnoliae 2 2 281
29. AMO Amoebidium parasiticum 1 1 119
30. AMP Amoeba proteus 1 1 419
31. ANC Ancylobacter aquaticus 1 1 117
32. ANI Anacystis nidulans 4 4 371
33. ANM Anisodoris nobilis 6 3 994
34. ANP Anaeroplasma abactoclasticum 1 1 1453
35. ANP Anaeroplasma bactoclasticum 1 1 1436
36. ANP Anaeroplasma varium 1 1 1436
37. APE Acremonium persicinum 1 1 119
38. APL Aplysia kurodai 1 1 119
39. APN Anthoceros punctatus 1 1 118
40. APR Antheraea pernyi 1 1 120
41. APU Aeromonas punctata 1 1 109
42. AQU Agmenellum quadruplicatum 1 1 76
43. ARB Arbacia punctulata 9 3 1049
44. ARG Arthrobacter globiformis 4 3 1774
45. ARG Arthrobacter luteus 1 1 122
46. ARG Arthrobacter oxidans 2 1 121
47. ARG Arthrobacter sp. 2 1 121
48. ARN Argulus nobilis 1 1 1843
49. ARO Arhodomonas oleiferhydrans 1 1 1487
50. ARU Arundinaria gigantea 1 1 50
51. ASC Acinetospora crinita 2 1 118
52. ASE Aquaspirillum serpens 1 1 116
53. ASF Aspergillus flavus 1 1 119
54. ASG Aspergillus niger 1 1 119
55. ASN Aspergillus nidulans 4 4 476
56. AST Avena sativa 1 1 50
57. ATT Atractiella solani 1 1 119
58. ATU Agrobacterium tumefaciens 1 1 120
59. AUT Aureobacterium testaceum 1 1 120
60. AVI Azotobacter vinelandii 1 1 120
61. AXY Amphibacillus xylanus 2 1 116
62. BAC Bacillus acidocaldarius 1 1 117
63. BAE Batrachospermum ectocarpum 1 1 121
64. BAS Basidiobolus magnus 1 1 120
65. BBR Bacillus brevis 2 2 1674
66. BDE Bdellovibrio stolpii 1 1 1553
67. BEG Beggiatoa alba 1 1 120
68. BFI Bacillus firmus 1 1 116
69. BGA Blue Green Algae 1 1 76
70. BGL Bacillus globigii 1 1 116
71. BHA Beneckea harveyi 1 1 122
72. BJA Blepharisma japonicum 2 2 476
73. BLI Bacillus licheniformis 1 1 116
74. BLK Blakeslea trispora 1 1 120
75. BLT Blastobacter viscosus 1 1 118
76. BLY Hordeum vulgare 9 7 487
77. BME Bacillus megaterium 1 1 116
78. BMO Bombyx mori 14 13 1362
79. BNA Brassica napus 2 2 196
80. BNC Bacteroides asaccharolyticus 1 1 48
81. BNG Bacteroides gingivalis 1 1 53
82. BNI Bacteroides intermedius 1 1 52
83. BNO Bacteroides nodosus 1 1 1532
84. BOV Bos taurus 21 18 1426
85. BPA Bacillus pasteurii 1 1 117
86. BPL Brachionus plicatilis 1 1 121
87. BRA Branchiostoma belcheri 1 1 120
88. BRA Branchiostoma californiense 6 3 974
89. BRL Brevibacterium helvolum 2 1 120
90. BRL Brevibacterium linens 1 1 123
91. BRP Brugia pahangi 1 1 363
92. BRU Brucella abortus 2 1 1429
93. BSI Blastocladiella simplex 1 1 118
94. BST Bacillus stearothermophilus 10 10 845
95. BSU Bacillus subtilis 16 14 1153
96. BVO Bresslaua vorax 1 1 120
97. BVU Beta vulgaris 1 1 120
98. CAI Capniomyces stellatus 1 1 121
99. CAO Carpopeltis crispata 1 1 121
100. CAU Caulobacter spinosum 1 1 117
101. CBC Caseobacter polymorphus 1 1 121
102. CCI Coprinus cinereus 1 1 118
103. CCO Crypthecodinium cohnii 4 4 492
104. CDB Cardiobacterium hominis 1 1 1470
105. CEL Caenorhabditis elegans 9 7 713
106. CET Ceratobasidium cornigerum 1 1 118
107. CFI Cellulomonas biazotea 1 1 120
108. CHA Chaetopterus sp. 6 3 975
109. CHB Chlorobium limicola 2 2 1615
110. CHB Chlorobium phaeobacteroides 1 1 110
111. CHF Chordaria flagelliformis 2 1 118
112. CHH Chaetomorpha moniligera 1 1 120
113. CHK Gallus gallus 25 23 2774
114. CHL Chlorella pyrenoidosa 2 1 119
115. CHL Chlorella sp. 5 3 2082
116. CHO Chilomonas paramecium 1 1 124
117. CHR Chromobacterium fluviatile 1 1 1473
118. CHR Chromobacterium violaceum 1 1 1475
119. CHS Christiansenia pallida 1 1 120
120. CLL Callinectes sapidus 1 1 1861
121. CLM Spisula solidissima 6 3 937
122. CLO Clostridium aminovalericum 2 1 1554
123. CLO Clostridium barkeri 2 1 1527
124. CLO Clostridium bifermentans 1 1 117
125. CLO Clostridium butyricum 2 2 234
126. CLO Clostridium carnis 2 1 117
127. CLO Clostridium pasteurianum 4 3 1745
128. CLO Clostridium ramosum 1 1 1530
129. CLO Clostridium sticklandii 2 2 1501
130. CLO Clostridium tyrobutyricum 4 4 464
131. COE Coemansia mojavensis 1 1 120
132. COR Corynebacterium aquaticum 1 1 120
133. COR Corynebacterium glutamicum 1 1 121
134. COR Corynebacterium sp. 2 1 1366
135. COR Corynebacterium xerosis 2 2 243
136. COT Gossypium hirsutum 1 1 118
137. COX Coxiella burnetii 2 1 1484
138. CPA Cyanophora paradoxa 3 3 356
139. CRA Coprinus radiatus 1 1 118
140. CRB Limulus polyphemus 6 3 977
141. CRE Chlamydomonas reinhardtii 3 3 399
142. CRE Chlamydomonas sp. 1 1 118
143. CRS Cryptochiton stelleri 6 3 923
144. CTU Coleosporium tussilaginis 1 1 118
145. CUN Cunninghamella elegans 1 1 120
146. CUR Curtobacterium citreum 1 1 122
147. CVN Chromatium vinosum 1 1 1526
148. CYR Cycas revoluta 1 1 120
149. CYT Cytophaga aquatilis 1 1 111
150. CYT Cytophaga heparina 1 1 114
151. CYT Cytophaga johnsonae 1 1 116
152. DAC Dryopteris acuminata 1 1 121
153. DDE Dacrymyces deliquescens 1 1 118
154. DDI Dictyostelium discoideum 7 6 1170
155. DIT Diatoma tenue 1 1 118
156. DJA Dugesia japonica 1 1 120
157. DJA Dugesia tigrina 6 3 962
158. DOG Canis lupus 1 1 149
159. DOG Canis sp. 2 2 191
160. DPS Dipsacomyces acuminosporus 1 1 119
161. DRO Drosophila melanogaster 42 36 5318
162. DSA Desulfuromonas acetoxidans 1 1 1522
163. DSM Desulfomonile tiedjei 1 1 1505
164. DSP Desulfobacter postgatei 1 1 1519
165. DSV Desulfosarcina variabilis 1 1 1527
166. DUK Cairina moschata 1 1 78
167. DVU Desulfovibrio vulgaris 1 1 120
168. EAL Enchytraeus albidus 1 1 120
169. EAR Equisetum arvense 2 1 120
170. EBI Eisenia bicyclis 1 1 118
171. ECO Escherichia coli 148 116 14668
172. EFI Efibulobasidium albescens 1 1 118
173. EGR Euglena gracilis 4 4 391
174. EHP Ectothiorhodospira halophila 1 1 1494
175. EIK Eikenella corrodens 4 4 5933
176. EJA Entosphenus japonicus 2 2 241
177. EMP Emplectonema gracile 2 2 239
178. ERL Erythrobacter longus 1 1 119
179. ERP Protomonas extorquens 1 1 116
180. ERY Erysipelothrix rhusiopathiae 1 1 1487
181. ESE Endophyllum sempervivi 1 1 118
182. ESP Euphausia superba 1 1 75
183. EUT Eucidaris tribuloides 6 3 923
184. EVA Exobasidium vaccinii 1 1 118
185. EWO Euplotes woodruffi 1 1 120
186. EXI Exidia glandulosa 1 1 118
187. FAE Faenia rectivirgula 1 1 1246
188. FBC Flexibacter sp. 1 1 117
189. FSB Misgurnus fossilis 3 3 399
190. FSB Oncorhynchus keta 1 1 75
191. FSB Salmo gairdneri 2 2 282
192. FSO Fusarium culmorum 2 2 387
193. FSO Fusarium decemcellulare 6 6 1161
194. FSO Fusarium graminearum 2 2 387
195. FSO Fusarium javanicum 4 4 768
196. FSO Fusarium moniliforme 6 6 1159
197. FSO Fusarium nivale 2 2 384
198. FSO Fusarium oxysporum 3 3 1010
199. FSO Fusarium solani 4 4 767
200. FVB Flavobacterium sp. 1 1 121
201. GBI Ginkgo biloba 1 1 120
202. GCL Gymnosporangium clavariaeforme 1 1 118
203. GCO Gracilaria compressa 3 2 242
204. GEA Gelidium amansii 2 2 241
205. GEM Gemmata obscuriglobus 1 1 108
206. GEN Genistelloides hibernus 1 1 122
207. GLA Giardia lamblia 1 1 127
208. GLC Gloiopeltis complanata 1 1 120
209. GOL Golfingia gouldii 6 3 973
210. GRA Graphiola phoenicis 1 1 118
211. HAL Halobacterium volcanii 56 46 5074
212. HAM Mesocricetus sp. 2 1 94
213. HAP Halichondria panicea 1 1 120
214. HAZ Haemophilus aphrophilus 3 3 494
215. HCU Halobacterium cutirubrum 15 13 1050
216. HDI Hymenolepis diminuta 2 2 215
217. HEA Haemophilus aegyptius 1 1 116
218. HEI Haemophilus influenzae 3 3 1917
219. HJA Halichondria japonica 1 1 120
220. HLF Haloferax mediterranei 2 1 123
221. HMO Halococcus morrhuae 2 2 309
222. HOC Haliclona oculata 1 1 120
223. HPT Herpetosiphon aurantiacus 1 1 117
224. HRO Halocynthia roretzi 2 1 119
225. HSA Hymeniacidon sanguinea 2 2 276
226. HUM Homo sapiens 51 45 6394
227. HYD Hydra sp. 6 3 966
228. HYF Hydrurus foetidus 1 1 118
229. HYV Hyphomicrobium sp. 1 1 119
230. HYV Hyphomicrobium vulgare 1 1 119
231. IGU Iguana iguana 1 1 120
232. ISO Isosphaera pallida 1 1 111
233. JLA Aurelia aurita 2 2 240
234. JLC Chrysaora quinquecirrha 1 1 120
235. JLN Nemopsis dofleini 1 1 120
236. JLS Spirocodon saltatrix 1 1 121
237. KAB Kabatiella microsticta 1 1 120
238. KIN Kingella denitrificans 1 1 1475
239. KIN Kingella indologenes 1 1 1474
240. KIN Kingella kingae 1 1 1476
241. LAE Listonella aestuarianus 1 1 119
242. LAN Lingula anatina 1 1 119
243. LAN Lingula reevi 6 3 919
244. LAP Lamprometra palmata 9 3 1044
245. LBK Lactobacillus kandleri 2 1 1528
246. LBM Lactobacillus minor 2 1 1524
247. LBR Lactobacillus brevis 1 1 117
248. LBT Lactobacillus halotolerans 2 1 1529
249. LCA Lactobacillus casei 2 1 1574
250. LCA Lactobacillus catenaforme 1 1 1549
251. LCO Lactobacillus confusus 2 1 1525
252. LEI Leishmania enriettii 1 1 68
253. LEU Leuconostoc cremoris 2 1 1493
254. LEU Leuconostoc lactis 2 1 1499
255. LEU Leuconostoc mesenteroides 2 1 1554
256. LEU Leuconostoc oenos 2 1 1510
257. LEU Leuconostoc paramesenteroides 2 1 1524
258. LGE Lineus geniculatus 1 1 120
259. LHE Lophocolea heterophylla 1 1 119
260. LND Linderina macrospora 1 1 119
261. LPN Fluoribacter bozemanae 6 6 796
262. LPN Fluoribacter dumoffii 6 6 825
263. LPN Fluoribacter gormanii 3 3 385
264. LPN Legionella pneumophila 11 10 1252
265. LSY Leptosynapta inhaerens 6 3 1051
266. LTT Leptothrix discophora 1 1 117
267. LUM Lumbricus sp. 6 3 976
268. LUP Lupinus luteus 5 5 380
269. LVI Lactobacillus viridescens 4 3 1816
270. LVI Lactobacillus vitulinus 1 1 1477
271. LYC Lycopodium clavatum 1 1 121
272. LYO Lycoperdon pyriforme 1 1 118
273. MAG Methylomonas agile 1 1 119
274. MAG Methylomonas methanica 2 2 1400
275. MAG Methylomonas rubra 1 1 119
276. MBF Methanobacterium formicicum 1 1 1476
277. MBI Methanobacterium thermoautotrophicum 4 4 415
278. MES Methanosarcina barkeri 1 1 130
279. MET Metridium senile 6 3 963
280. MGL Metasequoia glyptostroboides 1 1 120
281. MJU Microstroma juglandis 1 1 121
282. MLC Methylococcus capsulatus 3 3 1469
283. MLM Moloney murine leukemia virus 1 1 74
284. MLU Micrococcus luteus 2 2 238
285. MLY Micrococcus lysodeikticus 1 1 120
286. MNI Mnium rugicum 2 1 157
287. MOR Mortierella formosensis 1 1 120
288. MPO Marchantia polymorpha 1 1 119
289. MSE Megasphaera elsdenii 1 1 1567
290. MSG Mycobacterium asiaticum 2 1 1368
291. MSG Mycobacterium aurum 2 1 1349
292. MSG Mycobacterium avium 4 2 2735
293. MSG Mycobacterium chelonei 2 1 1355
294. MSG Mycobacterium chitae 2 1 1359
295. MSG Mycobacterium fallax 2 1 1348
296. MSG Mycobacterium flavescens 2 1 1357
297. MSG Mycobacterium gordonae 2 1 1373
298. MSG Mycobacterium kansasii 2 1 1369
299. MSG Mycobacterium leprae 1 1 313
300. MSG Mycobacterium neoaurum 2 1 1354
301. MSG Mycobacterium nonchromogenicum 2 1 1376
302. MSG Mycobacterium paratuberculosis 2 1 1367
303. MSG Mycobacterium phlei 2 1 1357
304. MSG Mycobacterium senegalense 2 1 1356
305. MSG Mycobacterium smegmatis 1 1 77
306. MSG Mycobacterium sp. 4 2 2715
307. MSG Mycobacterium terrae 2 1 1363
308. MSG Mycobacterium thermoresistible 2 1 1359
309. MSG Mycobacterium triviale 2 1 1351
310. MSG Mycobacterium tuberculosis 1 1 116
311. MSL Mytilus edulis 1 1 119
312. MTB Methylobacterium extorquens 2 2 1471
313. MTB Methylobacterium organophilum 2 2 1431
314. MTB Methylobacterium sp. 1 1 1052
315. MTE Methylosporovibrio methanica 1 1 1306
316. MUS Mus musculus 40 40 4555
317. MYA Mya arenaria 6 3 927
318. MYC Mycoplasma capricolum 3 3 259
319. MYC Mycoplasma hyopneumoniae 3 3 1799
320. MYC Mycoplasma mycoides 7 6 1885
321. MYC Mycoplasma sp. 24 24 34006
322. MYL Methylosinus trichosporium 2 2 1575
323. MYM Methylophilus methylotrophus 2 2 1619
324. MYP Methylocystis parvus 2 2 1433
325. MZE Zea mays 1 1 50
326. NDU Nematospiroides dubius 1 1 360
327. NEC Nectria haematococca 6 6 1152
328. NEM Ascaris suum 22 22 1251
329. NEU Neurospora crassa 6 6 1100
330. NGO Neisseria denitrificans 1 1 1478
331. NGO Neisseria gonorrhoeae 1 1 1486
332. NIF Nitella flexilis 1 1 121
333. NIT Nitrobacter winogradskyi 1 1 117
334. OCE Oceanospirillum linum 1 1 1542
335. ONG Onchocerca gibsoni 1 1 363
336. OPW Ophiocoma wendtii 9 3 1036
337. PAE Palaemonetes kadiakensis 1 1 1877
338. PAR Paramecium caudatum 1 1 366
339. PAR Paramecium primaurelia 1 1 366
340. PAR Paramecium tetraurelia 1 1 120
341. PAS Pasteurella multocida 3 3 1926
342. PBL Phycomyces blakesleeanus 1 1 120
343. PBR Perinereis brevicirris 1 1 120
344. PCL Prochloron sp. 1 1 122
345. PCR Philosamia cynthia ricini 2 2 289
346. PDE Paracoccus denitrificans 1 1 117
347. PEA Pisum sativum 8 8 824
348. PEC Penicillium chrysogenum 1 1 119
349. PEP Penicillium patulum 1 1 119
350. PEU Penaeus aztecus 1 1 1902
351. PFA Plasmodium falciparum 1 1 78
352. PGO Phascolopsis gouldii 1 1 120
353. PHS Phasianus colchicus 1 1 95
354. PHV Phaseolus vulgaris 1 1 75
355. PHY Pythium hydnosporum 1 1 118
356. PIL Pilayella littoralis 1 1 118
357. PIR Phlyctochytrium irregulare 1 1 118
358. PIS Pimelobacter simplex 1 1 120
359. PIV Pivellula marina 2 1 2885
360. PLA Platygloea peniophorae 1 1 119
361. PLC Planococcus citreus 2 1 116
362. PLC Planococcus kocurii 2 1 116
363. PLE Phleogena faginea 1 1 119
364. PLL Pirella marina 1 1 110
365. PLL Pirella sp. 2 2 222
366. PLT Planctomyces brasiliensis 1 1 110
367. PLT Planctomyces limnophilus 1 1 111
368. PLT Planctomyces staleyi 1 1 1525
369. PMC Pneumocystis carinii 1 1 120
370. PMI Prorocentrum micans 1 1 364
371. PNC Pseudonocardia thermophila 1 1 1246
372. PNU Psilotum nudum 2 2 171
373. POC Procaris ascensionis 1 1 1874
374. POO Prosthecochloris aestuarii 1 1 110
375. POR Porocephalus crotali 1 1 1830
376. POS Pleurotus ostreatus 1 1 118
377. PPO Puccinia poarum 1 1 118
378. PRA Procambarus leonensis 1 1 1869
379. PRE Planocera reticulata 1 1 120
380. PRM Proteus vulgaris 4 4 1925
381. PSE Pseudomonas aeruginosa 2 2 1637
382. PSE Pseudomonas cepacia 2 2 1589
383. PSE Pseudomonas fluorescens 2 1 120
384. PSE Pseudomonas sp. 1 1 118
385. PT4 Bacteriophage T4 17 12 979
386. PT5 Bacteriophage T5 9 9 711
387. PTE Porphyra tenera 1 1 121
388. PTR Plagiomnium trichomanes 1 1 119
389. PYE Porphyra yezoensis 1 1 121
390. QUL Coturnix coturnix 1 1 136
391. RAB Oryctolagus cuniculus 15 11 4955
392. RAT Rattus norvegicus 60 45 6084
393. RAT Rattus rattus 4 4 230
394. RCA Rhodobacter capsulatus 2 2 235
395. RCY Russula cyanoxantha 1 1 119
396. REC Renobacter vacuolatum 1 1 116
397. RER Rhodococcus equi 2 1 1360
398. RER Rhodococcus erythropolis 1 1 121
399. RHC Rhizoctonia crocorum 1 1 119
400. RHZ Rhizoctonia hiemalis 1 1 119
401. RIC Oryza sativa 2 2 417
402. RIF Riftia pachyptila 6 3 929
403. RIR Rickettsia rickettsii 2 1 1443
404. RIR Rickettsia typhi 2 1 1444
405. RMA Rhodopseudomonas marina 1 1 1417
406. RPA Rhodopseudomonas palustris 1 1 119
407. RRU Rhodospirillum rubrum 2 2 161
408. RSP Rhodospirillum rubrum 1 1 1446
409. RSS Rhodobacter sphaeroides 1 1 115
410. RTO Rhabditis tokai 1 1 119
411. RYE Secale cereale 3 3 356
412. SAC Sulfolobus acidocaldarius 2 2 204
413. SAG Schizochytrium aggregatum 1 1 119
414. SAH Saccharum officinarum 1 1 50
415. SAU Stigmatella aurantiaca 2 2 239
416. SCC Scyliorhinus caniculus 1 1 120
417. SCL Styela clava 6 3 967
418. SCM Schistosoma mansoni 2 2 215
419. SCS Saccharopolyspora hirsuta 1 1 1284
420. SCU Thyone briareus 6 3 1059
421. SEP Septobasidium carestianum 1 1 119
422. SFE Saprolegnia ferax 1 1 118
423. SFU Sargassum fulvellum 1 1 118
424. SHE Shewanella hanedai 2 1 120
425. SHP Ovis sp. 1 1 76
426. SHR Artemia salina 2 2 282
427. SJA Sabellastarte japonica 1 1 120
428. SLI Synechococcus lividus 1 1 120
429. SLM Physarum polycephalum 6 5 845
430. SME Spiroplasma sp. 11 11 14826
431. SMI Smittium culisetae 1 1 121
432. SNL Arion rufus 3 2 276
433. SNL Helix pomatia 1 1 119
434. SOB Scenedesmus obliquus 5 5 407
435. SOF Sepia officinalis 2 1 120
436. SOS Stichopus oshimae 1 1 120
437. SPG Saprospira grandis 1 1 121
438. SPI Spinacia oleracea 3 2 205
439. SPL Spirillum volutans 1 1 1492
440. SPM Spirobolus marginatus 6 3 977
441. SPO Sporolactobacillus inulinus 1 1 117
442. SPS Spirogyra sp. 1 1 120
443. SQD Illex illecebrosus 1 1 120
444. SRG Sorghum bicolor 1 1 50
445. SSO Sulfolobus solfataricus 1 1 126
446. SSP Sulfolobus sp. 1 1 131
447. STA Staphylococcus aureus 1 1 115
448. STA Staphylococcus epidermidis 5 3 264
449. STC Stentor coeruleus 1 1 353
450. STE Stella humosa 1 1 117
451. STF Asterias amurensis 2 2 195
452. STF Asterias forbesi 9 3 1045
453. STF Asterina pectinifera 1 1 120
454. STM Streptomyces griseus 1 1 120
455. STN Stenopus hispidus 1 1 1885
456. STR Streptococcus cremoris 1 1 117
457. STR Streptococcus faecalis 1 1 117
458. STR Streptococcus sp. 1 1 1577
459. STY Salmonella typhimurium 7 6 459
460. SUD Pseudocentrotus depressus 1 1 120
461. SUE Heliocidaris erythrogramma 6 3 1043
462. SUE Heliocidaris tuberculata 6 3 910
463. SUH Hemicentrotus pulcherrimus 1 1 120
464. SUL Lytechinus pictus 6 3 1046
465. SUP Psammechinus miliaris 8 8 422
466. SUS Strongylocentrotus purpuratus 6 3 988
467. SYB Syntrophospora bryantii 1 1 1532
468. SYC Synechocystis sp. 1 1 76
469. SYN Synechococcus lividus 1 1 119
470. SYW Syntrophomonas wolfei 1 1 1532
471. TAM Tatlockia maceachernii 3 3 458
472. TAM Tatlockia micdadei 9 9 1286
473. TAN Tilletiaria anomala 1 1 118
474. TAP Taphrina deformans 1 1 119
475. TCO Tilletiaria controversa 1 1 118
476. TET Tetrahymena thermophila 7 7 615
477. TEY Tetrahymena pyriformis 3 3 623
478. TFE Acidiphilium cryptum 1 1 122
479. TFE Thiobacillus acidophilus 1 1 120
480. TFE Thiobacillus ferrooxidans 2 2 240
481. TFE Thiobacillus intermedius 1 1 117
482. TFE Thiobacillus neapolitanus 1 1 119
483. TFE Thiobacillus novellus 1 1 120
484. TFE Thiobacillus perometabolis 1 1 116
485. TFE Thiobacillus sp. 1 1 117
486. TFE Thiobacillus thiooxidans 1 1 121
487. TFE Thiobacillus thioparus 1 1 118
488. TFE Thiobacillus versutus 1 1 116
489. TFE Thiomicrospira pelophila 1 1 118
490. TFE Thiomicrospira sp. 1 1 117
491. THA Artificial gene 3 3 276
492. THC Thermococcus celer 2 2 1611
493. THR Thermomicrobium roseum 1 1 127
494. THT Thiothrix nivea 1 1 122
495. THT Thiothrix sp. 1 1 120
496. THV Thiovulum sp. 1 1 123
497. TLA Thermomyces lanuginosus 2 2 276
498. TLP Torulopsis utilis 1 1 121
499. TOB Nicotiana tabacum 2 2 152
500. TOR Trichosporon oryzae 1 1 118
501. TRB Trypanosoma brucei 2 2 106
502. TRD Tripsacum dactyloides 1 1 50
503. TRF Crithidia fasciculata 7 7 1305
504. TRI Trichomonas vaginalis 1 1 341
505. TTE Thermoproteus tenax 1 1 1504
506. TTH Thermus aquaticus 2 2 243
507. TTH Thermus sp. 2 2 243
508. TTH Thermus thermophilus 5 4 354
509. TUL Tulasnella violea 1 1 118
510. TUM Tuberoidobacter mutans 1 1 116
511. TVI Thraustochytrium visurgense 1 1 119
512. UPE Ulva pertusa 1 1 120
513. URE Ureaplasma urealyticum 1 1 1464
514. UTH Uthatobasidium fusisporum 1 1 118
515. UUN Urechis unicinctus 1 1 120
516. VCH Vibrio cholerae 1 1 119
517. VER Verrucomicrobium spinosum 1 1 116
518. VFA Vicia faba 2 2 327
519. VIB Aeromonas hydrophila 1 1 118
520. VIB Aeromonas media 1 1 119
521. VIB Aeromonas salmonicida 1 1 119
522. VIB Alteromonas colwelliana 2 1 120
523. VIB Alteromonas putrefaciens 1 1 120
524. VIB Photobacterium angustum 1 1 120
525. VIB Photobacterium leiognathi 1 1 120
526. VIB Photobacterium sp. 1 1 120
527. VIB Plesiomonas shigelloides 1 1 120
528. VIB Vibrio alginolyticus 1 1 121
529. VIB Vibrio anguillarum 1 1 120
530. VIB Vibrio carchariae 1 1 120
531. VIB Vibrio cincinnatii 1 1 120
532. VIB Vibrio damsela 1 1 120
533. VIB Vibrio fischeri 1 1 120
534. VIB Vibrio fluvialis 2 2 240
535. VIB Vibrio gazogenes 1 1 120
536. VIB Vibrio logei 1 1 120
537. VIB Vibrio marinus 2 2 236
538. VIB Vibrio metschnikovii 1 1 120
539. VIB Vibrio mimicus 1 1 120
540. VIB Vibrio natriegens 1 1 121
541. VIB Vibrio nereis 2 1 121
542. VIB Vibrio parahaemolyticus 2 2 239
543. VIB Vibrio pelagius 1 1 120
544. VIB Vibrio proteolyticus 1 1 120
545. VIB Vibrio psychroerythrus 1 1 119
546. VIB Vibrio sp. 5 4 478
547. VIT Vitreoscilla sp. 2 2 234
548. VIT Vitreoscilla stercoraria 2 2 1608
549. VVU Vibrio vulnificus 1 1 120
550. WHT Triticum aestivum 17 14 1729
551. WHT Triticum sp. 2 2 152
552. WHT Triticum vulgare 2 2 282
553. WLB Wolbachia persica 2 1 1475
554. WOL Wolinella succinogenes 1 1 1503
555. XEB Xenopus borealis 3 3 477
556. XEL Xenopus laevis 20 17 4158
557. XET Xenopus tropicalis 2 2 242
558. XYL Xylella fastidiosa 1 1 1493
559. YSA Candida albicans 1 1 121
560. YSC Saccharomyces cerevisiae 62 47 4920
561. YSG Saccharomyces carlsbergensis 4 2 242
562. YSK Kluyveromyces lactis 1 1 121
563. YSP Schizosaccharomyces pombe 8 8 630
564. YSR Pichia membranaefaciens 1 1 120
565. YST Yeast sp. 7 6 492
566. YSU Candida utilis 12 10 815
567. Unidentified 177 169 19082
Total 1946 1647 445723
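The rows in these summary tables follow a regular layout: an index, an
optional key, a name, and three counts (Reports, Entries, Bases). The
following is a minimal sketch, not part of the GenBank distribution, of
how a row might be parsed and the count columns summed for comparison
with a Total line. The names ROW, parse_row and sum_columns are
hypothetical, and the pattern assumes that keys consist only of
upper-case letters and digits, as in the rows shown here. A row whose
counts are split onto a second line would need to be joined before
parsing.

```python
import re

# Assumed row layout: "<index>. [KEY] <name> <reports> <entries> <bases>".
# The key is optional (the "Unidentified" row has none). The name may itself
# end in a number (e.g. "Thermoproteus tenax virus 1"), so the name is matched
# lazily and the last three integers on the line are taken as the counts.
ROW = re.compile(
    r"^\s*(\d+)\.\s+(?:([A-Z][A-Z0-9]*)\s+)?(.+?)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_row(line):
    """Return a dict for a table row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    index, key, name, reports, entries, bases = m.groups()
    return {
        "index": int(index),
        "key": key,  # None when the row has no key
        "name": name,
        "reports": int(reports),
        "entries": int(entries),
        "bases": int(bases),
    }


def sum_columns(lines):
    """Sum Reports, Entries and Bases over all lines that parse as rows."""
    reports = entries = bases = 0
    for line in lines:
        row = parse_row(line)
        if row is None:
            continue  # headers, rules and Total lines are skipped
        reports += row["reports"]
        entries += row["entries"]
        bases += row["bases"]
    return reports, entries, bases
```

Applied to all the rows of one table, the tuple returned by sum_columns
can be compared with the three figures on that table's Total line.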
VIRAL
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AA2 Adeno associated virus 9 6 7879
2. AAF Avian musculoaponeurotic fibrosarcoma virus 2 1 3171
3. AC2 Avian carcinoma virus 14 11 18641
4. ACB Avian erythroblastosis virus 19 15 18700
5. ACE Avian endogenous virus 5 5 2772
6. ACF Fujinami sarcoma virus 3 2 7503
7. ACM Avian myelocytomatosis retrovirus 10 8 11975
8. ACR Avian reticuloendotheliosis virus 12 8 8401
9. ACS Avian sarcoma virus 15 13 16357
10. AD4 Mastadenovirus h40 4 4 10795
11. AD4 Mastadenovirus h41 3 3 8920
12. ADA Mastadenovirus s30 5 5 1318
13. ADB Mastadenovirus 2 82 5 36399
14. ADB Mastadenovirus c2 1 1 196
15. ADC Mastadenovirus h3 12 11 9026
16. ADD Mastadenovirus h4 7 5 5078
17. ADE Mastadenovirus 7 1 1 2718
18. ADE Mastadenovirus h5 35 11 30276
19. ADG Mastadenovirus h7 15 6 13245
20. ADG Mastadenovirus s7 6 5 4931
21. ADI Mastadenovirus 9 2 2 332
22. ADJ Mastadenovirus 10 1 1 135
23. ADL Mastadenovirus 2 2 1 430
24. ADL Mastadenovirus h12 41 23 19901
25. ADR Mastadenovirus 18 2 2 364
26. ADT Tupaia adenovirus 4 4 3784
27. ADU Mastadenovirus 19 1 1 154
28. ADV Adenovirus VA 8 7 11146
29. ADV Mastadenovirus 2 2 1549
30. ADV Mastadenovirus h40 1 1 1849
31. ADV Mastadenovirus h41 2 1 1939
32. ADX Mastadenovirus bos1 1 1 159
33. ADX Mastadenovirus mus 7 5 9449
34. ADY Eggdrop syndrome-1976 virus 1 1 52
35. ADZ Mastadenovirus 31 2 2 300
36. ADZ Mastadenovirus bos3 1 1 2849
37. ADZ Mastadenovirus c2 2 2 3689
38. AEA Avian adenovirus 3 3 576
39. AEC Canine adenovirus 4 4 805
40. AEE Equine adenovirus 4 4 617
41. AIN Aino virus 1 1 850
42. ALE Rous associated virus 9 8 5119
43. ALK Avian leukemia virus 2 2 454
44. ALM Avian myeloblastosis virus 6 6 6465
45. ALR Rous sarcoma virus 162 136 65936
46. ALV Avian leukosis virus 12 12 5400
47. APH Foot and mouth disease virus 125 120 77985
48. ARE Avian retrovirus 2 2 698
49. ARE Avian retrovirus IC10 3 2 6013
50. ARR Adult diarrhea rotavirus 2 2 1445
51. ASB Avocado sunblotch viroid 20 19 4715
52. ASS Apple scar skin viroid 1 1 329
53. ASV African swine fever virus 4 3 6376
54. BBM Broad bean mottle virus 3 3 680
55. BBV Black beetle virus 4 3 4893
56. BCT Beet curly top virus 1 1 2993
57. BEC Bovine enteritic coronavirus 1 1 1710
58. BEV Bovine enterovirus 1 1 7414
59. BIM Bovine immunodeficiency-like virus 1 1 8482
60. BLC Bunyamwera virus 3 3 12294
61. BLC Bunyavirus La Crosse 19 18 9070
62. BLC Germiston bunyavirus 2 2 5514
63. BLV Bovine leukemia virus 20 20 31133
64. BNY Beet necrotic yellow vein mosaic virus 6 6 17031
65. BOO Boolarra virus 2 1 1305
66. BRV Berne virus 2 2 376
67. BTV Bluetongue virus 31 23 43490
68. BVD Bovine viral diarrhea virus 1 1 12573
69. BWY Beet western yellow virus 5 3 7958
70. BYD Barley yellow dwarf virus 3 2 6280
71. CAD Canine distemper virus 4 4 6857
72. CAN Carnation etched ring virus 1 1 7932
73. CAP Capripoxvirus 4 4 10417
74. CAS Cassava latent virus 2 2 5503
75. CASNS Cas NS1 retrovirus 1 1 2711
76. CBV Choristoneura biennis virus 1 1 1173
77. CCC Cadang-cadang coconut viroid 3 3 779
78. CCP Cricket paralysis virus 1 1 1594
79. CEA Caprine arthritis encephalitis virus 9 8 11854
80. CEV Citrus exocortis viroid 4 4 1484
81. CFD Coconut foliar decay virus 1 1 1291
82. CHM Chloris striate mosaic virus 1 1 2750
83. CHV Chlorella virus 2 2 3727
84. CMV Carnation mottle virus 1 1 4003
85. CNV Cucumber necrosis virus 1 1 4701
86. CO4 Coliphage N4 2 2 1759
87. COB Bovine coronavirus 8 5 15070
88. CPB Chlorella PBCV-1 virus 1 1 4265
89. CPE Euxoa scandens cytoplasmic polyhedrosis virus 1 1 882
90. CPF Cucumber pale fruit viroid 3 2 604
91. CPR Chandipura virus 1 1 1751
92. CPV Cowpox virus 20 20 17648
93. CRV Cymbidium ringspot virus 2 1 4733
94. CSO Campoletis sonorensis virus 8 6 9418
95. CSV Chrysanthemum stunt viroid 4 3 1040
96. CTN Coconut tinangaja viroid 1 1 254
97. CXB Coxsackievirus B1 2 2 8844
98. CXB Coxsackievirus B3 5 5 20481
99. CXB Coxsackievirus B4 1 1 7395
100. CYM Clover yellow mosaic potexvirus 1 1 1051
101. CYS Lymphocystis disease virus of fish 3 3 5310
102. DEN Dengue virus 105 103 90462
103. DHV Dhori virus 1 1 1479
104. DMB Thymotropic retrovirus type B 1 1 285
105. DNV Densonucleosis virus 1 1 4277
106. DPF Dapple peach fruit disease viroid 1 1 297
107. DPP Dapple plum and peach fruit disease viroid 1 1 297
108. DUG Dugbe nairovirus 1 1 1712
109. EAE Equine arthritis encephalitis virus 1 1 2580
110. EBO Ebola virus 2 2 3178
111. ECV Echo 11 virus 1 1 98
112. ECV Echo 6 virus 1 1 99
113. ECV Echo 9 virus 2 2 615
114. EEE Eastern equine encephalomyelitis virus 5 5 5163
115. EEV Venezuelan equine encephalitis virus 7 6 8300
116. EEW Western equine encephalitis virus 2 2 4521
117. EIA Equine infectious anemia virus 18 11 24136
118. EMC Encephalomyocarditis virus 10 9 27249
119. FCG Gardner-Arnstein Feline Leukemia oncovirus B 2 2 3863
120. FCL Feline calicivirus 2 2 6358
121. FCR RD114 retrovirus 1 1 126
122. FCS Feline sarcoma virus 7 7 14248
123. FCV Feline leukemia virus 17 15 38510
124. FHV Flock house virus 2 1 1400
125. FIP Feline infectious peritonitis virus 1 1 4500
126. FIV Feline immunodeficiency virus 6 3 19318
127. FLA Influenza virus type A 504 412 430270
128. FLB Influenza virus type B 85 72 102701
129. FLC Influenza virus type C 29 29 46959
130. FMV Figwort mosaic virus 1 1 7743
131. FPV Fowlpox virus 6 6 25819
132. FV3 Frog virus 3 3 2 2273
133. GPB Granulosis virus 1 1 999
134. GPR Gottfried porcine rotavirus 1 1 3302
135. GSB GS virus 1 1 307
136. GSH Ground squirrel hepatitis virus 1 1 3311
137. GVI Grapevine chrome mosaic virus 4 2 11653
138. GVI Grapevine viroid 3 2 665
139. GVT Trichoplusia ni granulosis virus 1 1 998
140. GYS Grapevine yellow speckle viroid 3 2 730
141. HAN Hantaan virus 6 5 9735
142. HBD Duck hepatitis B virus 6 5 12249
143. HBH Heron hepatitis B virus 1 1 3027
144. HCV Hog cholera virus 3 2 24567
145. HIV Human immunodeficiency virus type 1 101 53 187276
146. HIV Human immunodeficiency virus type 2 13 9 75733
147. HIV Human lymphotropic virus type III 1 1 261
148. HIV Human T-cell lymphotropic virus type II 7 3 10520
149. HJV Highlands J virus 2 2 505
150. HL1 Human lymphotropic virus type I 15 13 26925
151. HL2 Human lymphotropic virus type II 4 4 5400
152. HLV Hop latent viroid 2 1 256
153. HOB Human coronavirus 4 3 3550
154. HOJ HoJo virus 1 1 3613
155. HOM Mus hortulanus virus 3 3 4668
156. HOP Hop Stunt Viroid 10 7 2094
157. HPA Hepatitis A virus 13 11 42881
158. HPB Hepatitis B virus 74 70 76139
159. HPC Hepatitis C virus 3 2 7893
160. HPD Hepatitis delta virus 4 3 3523
161. HPE Hepatitis E virus 2 1 2570
162. HPU Duck hepatitis virus 2 1 3021
163. HPV Hepatitis virus 2 2 4553
164. HRD Human retrovirus type D 1 1 8785
165. HRV Human rhinovirus 11 9 31251
166. HS1 Herpes simplex virus type 1 148 109 315961
167. HS2 Herpes simplex virus type 2 37 31 54029
168. HS4 Epstein-Barr virus 88 68 310429
169. HS5 Human cytomegalovirus 44 40 136382
170. HS5 Murine cytomegalovirus 4 3 6921
171. HS5 Simian cytomegalovirus 1 1 880
172. HS6 Human herpesvirus type 6 2 2 26298
173. HSB Bovine herpesvirus type 1 9 9 8227
174. HSC Simian cytomegalovirus 2 2 2294
175. HSE Equine herpesvirus type 1 19 17 45132
176. HSG Gallid herpesvirus type 1 5 5 10607
177. HSK Gallid herpesvirus type 2 4 4 11325
178. HSL Feline herpesvirus 1 1 1619
179. HSL Herpesvirus ateles 1 1 2577
180. HSM Gallid herpesvirus type 1 2 2 3367
181. HSO Herpesvirus papio 2 2 806
182. HSP Human spumaretrovirus 3 3 12095
183. HSS Herpesvirus saimiri 30 29 22133
184. HSS Pseudorabies virus 18 14 38325
185. HST Herpesvirus tamarinus 2 1 2556
186. HSU Herpesvirus tupaia 1 1 863
187. HSV Herpes simplex virus 1 1 501
188. HSY Herpesvirus sylvilagus 1 1 559
189. HTV Human adult T-cell leukemia virus 3 2 3556
190. IBA Avian infectious bronchitis virus 24 24 31163
191. IBB Infectious bronchitis virus 3 2 7215
192. IBD Infectious bursal disease virus of chickens
4 3 9138
193. IHN Infectious hematopoietic necrosis virus 2 2 2961
194. INS Insect iridescent virus type 22 1 1 2183
195. IPN Infectious pancreatic necrosis virus 2 1 3097
196. IRI Iridescent virus type 1 1 1 2461
197. IRI Iridescent virus type 6 3 3 10327
198. JEV Japanese encephalitis virus 4 4 18496
199. KUN Kunjin virus 1 1 10664
200. KVS Killer virus of S.cerevisiae 8 6 1872
201. LCV Lymphocytic choriomeningitis virus 11 11 19716
202. LDV Lactate dehydrogenase-elevating virus 6 6 1684
203. LEE Lee virus 1 1 3616
204. LSV Lassa virus 4 4 10307
205. MAA Alfalfa mosaic virus 28 19 15793
206. MAV Myeloblastosis-associated virus 1 1 1173
207. MBG Bean golden mosaic virus 4 4 10465
208. MBG Bean yellow mosaic virus 1 1 1015
209. MBR Brome mosaic virus 16 12 9903
210. MBS Barley stripe mosaic virus 11 8 13655
211. MBV Middelburg virus 4 3 3394
212. MCA Cauliflower mosaic virus 28 25 41207
213. MCC Cowpea chlorotic mottle virus 7 6 6379
214. MCF Mink cell focus-forming virus 8 8 8202
215. MCG Cucumber green mottle mosaic virus 2 2 2421
216. MCM Maize chlorotic mottle virus 2 1 4437
217. MCP Cowpea mosaic virus 8 8 10170
218. MCV Cucumber mosaic virus 53 52 34142
219. MDP Aleutian mink disease parvovirus 4 2 8255
220. MEA Measles virus 30 25 83843
221. MEV Maus-Elberfeld virus 1 1 54
222. MGR Maguari bunyavirus 1 1 945
223. MHV Murine hepatitis virus 39 33 39489
224. MLA Abelson murine leukemia virus 8 6 10848
225. MLE Mouse RFV endogenous retrovirus 2 2 684
226. MLF Friend mink cell focus-inducing virus 5 4 7000
227. MLF Friend murine leukemia virus 2 2 4170
228. MLF Friend spleen focus-forming virus 9 9 13488
229. MLG Gross passage A murine leukemia virus 2 2 1220
230. MLK Kirsten murine leukemia virus 1 1 1335
231. MLM Moloney murine leukemia virus 56 42 35727
232. MLN Murine non-leukemogenic retrovirus 1 1 529
233. MLO AKV murine leukemia virus 7 2 9000
234. MLR Rauscher spleen focus-forming virus 2 2 2244
235. MLS Soule murine leukemia virus 2 2 1310
236. MLT Tikaut murine leukemia virus 1 1 641
237. MLV Murine leukemia virus 53 44 48928
238. MLX Xenotropic murine leukemia virus 1 1 3060
239. MMT Mouse mammary tumor virus 29 28 37314
240. MNC Narcissus mosaic potexvirus 1 1 6955
241. MOK Mokola lyssavirus 2 2 152
242. MOP Mopeia virus 1 1 3419
243. MPM Mouse polyomavirus 1 1 1155
244. MPV Monkeypox virus 1 1 1276
245. MRV Marburg virus 1 1 59
246. MSB Southern bean mosaic virus 2 2 793
247. MSC Sugarcane mosaic virus 1 1 1782
248. MSH Harvey murine sarcoma virus 4 4 3226
249. MSJ FBJ murine osteosarcoma virus 1 1 4226
250. MSK Kirsten murine sarcoma virus 2 2 1933
251. MSN Solanum nodiflorum mottle virus 1 1 377
252. MSR FBR murine osteosarcoma virus 1 1 3811
253. MSV Murine sarcoma virus 5 5 5020
254. MSY Myeloproliferative sarcoma virus 3 3 5305
255. MTG Tomato golden mosaic virus 3 3 6342
256. MTR Tobacco rattle virus 7 7 20386
257. MTS Lucerne transient streak virus 3 3 970
258. MTV Tobacco mosaic virus 22 8 16050
259. MTV Velvet tobacco mottle virus 1 1 366
260. MTY Andean potato latent virus 1 1 96
261. MTY Clitoria yellow vein virus 1 1 120
262. MTY Eggplant mosaic virus 3 3 6469
263. MTY Kennedya yellow mosaic virus 1 1 83
264. MTY Ononis yellow mosaic virus 2 2 6342
265. MTY Turnip yellow mosaic virus 20 16 15958
266. MUM Mumps virus 11 9 13622
267. MUR Murine retrovirus SL3-2 1 1 492
268. MVE Murray Valley encephalitis virus 2 2 5994
269. MVM Minute virus of mice 9 7 16222
270. MYX Myxoma virus 3 2 4473
271. MZS Maize streak virus 5 4 8139
272. NDV Newcastle disease virus 49 47 98864
273. NEV Nephropathia epidemica 2 2 5466
274. NOD Nodamura virus 2 1 1335
275. NPA Antheraea pernyi nuclear polyhedrosis virus
1 1 285
276. NPA Autographa californica nuclear polyhedrosis virus
45 41 67447
277. NPB Bombyx mori nuclear polyhedrosis virus 3 3 3931
278. NPG Galleria mellonella nuclear polyhedrosis virus
5 5 2556
279. NPM Mamestra brassicae nuclear polyhedrosis virus
1 1 2598
280. NPO Orgyia pseudotsugata polyhedrosis virus 8 8 16937
281. NPS Spodoptera frugiperda nuclear polyhedrosis virus
1 1 1557
282. OLV Ovine lentivirus 2 2 18512
283. ONN O'Nyong-nyong virus 2 1 11835
284. ORF Orf virus 2 1 5003
285. PCB Baboon endogenous virus 8 7 20105
286. PCC Colobus type C cpc-1 endogenous retrovirus
2 2 373
287. PCE Chimpanzee type C endogenous retrovirus 2 2 430
288. PCG Gibbon leukemia virus 5 4 9202
289. PCM Macaca endogenous retrovirus 1 1 126
290. PCM Macaca mulatta type C retrovirus 4 4 938
291. PCS Simian sarcoma virus 12 9 10868
292. PEB Pea early browning virus 2 1 7073
293. PEV Subacute sclerosing panencephalitis virus 3 3 3444
294. PIB Bovine parainfluenza virus type 3 3 1 8700
295. PIC Pichinde Arenavirus 8 8 12637
296. PIF Human parainfluenza virus type 3 29 27 46139
297. PLV Potato leaf roll virus 5 3 6650
298. PLY Budgerigar fledgling disease virus 1 1 4980
299. PLY Polyomavirus 131 39 35440
300. PLY Polyomavirus BK 2 2 799
301. PLY Polyomavirus JC 3 3 765
302. PMP Papaya mosaic potexvirus 2 2 1039
303. PMS Simian paramyxovirus (SV5) 1 1 1382
304. PMV Pepper mottle virus 1 1 1480
305. POL Poliovirus 120 106 78406
306. POV Porcine parvovirus 1 1 3670
307. PPA Avian papillomavirus 2 2 786
308. PPB Bovine papillomavirus 14 14 32821
309. PPC Hamster papovavirus 2 2 10672
310. PPD Deer papillomavirus 1 1 8374
311. PPE European Elk papillomavirus 4 3 8842
312. PPH Human papillomavirus 40 38 101284
313. PPI Micromys minutus papillomavirus 3 3 487
314. PPL Lymphotropic papovavirus 1 1 5270
315. PPM Monkey B-lymphotropic papovavirus 4 4 10920
316. PPR Reindeer papillomavirus 2 2 930
317. PPV Plum pox potyvirus 3 3 13827
318. PRH Prospect Hill virus 1 1 1675
319. PRV Porcine rotavirus 11 9 12190
320. PSV Peanut stunt virus 1 1 393
321. PTP Punta toro phlebovirus 6 6 7130
322. PTV Potato spindle tuber viroid 5 5 1795
323. PV1 Parvovirus H1 3 2 5302
324. PV3 Parvovirus H3 1 1 125
325. PVA Raccoon parvovirus 2 1 2410
326. PVB Papovavirus BKV 38 25 18937
327. PVB Parvovirus B19 4 4 11325
328. PVC Canine parvovirus 4 4 10016
329. PVD Bovine parvovirus 2 1 5517
330. PVF Feline panleukopenia virus 10 6 16703
331. PVM Mink enteritis virus 4 2 4888
332. PVR Kilham rat virus 1 1 125
333. PVR Parvovirus R1 2 2 548
334. PVS Potato virus S 1 1 3552
335. PVX Potato virus X 8 6 22573
336. PVY Potato virus Y 8 5 17509
337. RAV Rabies virus 21 20 32337
338. RBF Malignant rabbit fibroma virus 3 3 446
339. RBF Rabbit fibroma virus 15 15 28212
340. RBV Rabbit rotavirus 1 1 1036
341. RCM Red clover mottle virus 1 1 3543
342. RDV Rice dwarf virus 5 3 5209
343. REO Reovirus sp. 2 2 2903
344. REO Reovirus type 1 21 19 14348
345. REO Reovirus type 2 14 12 6823
346. REO Reovirus type 3 48 34 25499
347. RML Rauscher murine leukemia virus 3 3 395
348. RNM Red clover necrotic mosaic virus 3 2 5338
349. RO1 Rotavirus sp. 3 3 4074
350. RO1 Rotavirus subgroup 1 3 2 2712
351. RO2 Rotavirus subgroup 2 6 6 5955
352. ROB Bovine rotavirus 21 16 24214
353. ROH Human rotavirus 6 6 8164
354. ROR Rhesus rotavirus 2 2 3424
355. ROT Simian rotavirus SA11 19 15 20730
356. RPF Rinderpest virus 5 5 10323
357. RPV Raccoonpox virus 1 1 2195
358. RRV Ross river virus 4 3 19686
359. RSB Bovine syncytial virus 1 1 1201
360. RSH Human respiratory syncytial virus 33 21 19332
361. RSV Rat sarcoma virus 1 1 1380
362. RUB Rubella virus 12 7 26119
363. RVF Rift Valley fever virus 1 1 3884
364. SAM Satellite arabis mosaic virus 1 1 300
365. SAP Satellite panicum mosaic virus 1 1 826
366. SFS Sandfly fever Sicilian virus 2 1 1747
367. SFV Semliki forest virus 12 3 15380
368. SHV Simian hepatitis A virus 6 4 4331
369. SIG Sigma virus 1 1 1718
370. SIN Sindbis virus 16 6 18450
371. SIV Simian immunodeficiency virus 31 23 106170
372. SIV Simian immunodeficiency virus 1 1 7759
373. SIV Simian immunodeficiency virus 3 3 9130
374. SLO St. Louis encephalitis virus 8 8 5391
375. SMF Simian foamy virus 1 1 3534
376. SND Parainfluenza virus 39 31 59944
377. SND Parainfluenza virus type 4A 2 2 3767
378. SNV Spleen necrosis virus 9 8 3909
379. SPV Spiroplasma virus 1 1 4421
380. SRV Sapporo rat virus 2 2 5420
381. SSH Snowshoe hare bunyavirus 11 10 6726
382. STL Simian T-cell lymphotropic virus type I 3 3 8227
383. STT St. Thomas 3 rotavirus 1 1 1062
384. SUV Subterranean clover mottle virus 2 2 720
385. SV4 Rhesus macaque polyomavirus 180 42 15361
386. SV5 Simian virus 5 4 4 5586
387. SVC Spring viremia of carp virus 2 2 778
388. SVD Swine vesicular disease virus 2 2 7475
389. SYE Sonchus yellow net virus 3 3 2822
390. TAC Tacaribe virus 4 3 10607
391. TAS Tomato apical stunt viroid 3 2 723
392. TBE Tick-borne encephalitis virus 7 5 32678
393. TBR Tomato black ring virus 11 11 18222
394. TBS Tomato bushy stunt virus 3 2 5172
395. TBV Tick-borne virus 1 1 1586
396. TCV Turnip crinkle virus 5 5 6433
397. TEV Tobacco etch virus 4 3 21315
398. TGE Transmissible gastroenteritis virus 14 9 20969
399. TME Theiler's murine encephalomyelitis virus 4 4 26220
400. TMG Tobacco mild green mosaic virus 3 2 7768
401. TNC Tobacco necrosis virus 1 1 3660
402. TNS Satellite tobacco necrosis virus 4 2 1380
403. TOA Tomato aspermy virus 5 5 943
404. TOS Tomato ringspot virus 2 2 3096
405. TPM Tomato plant macho viroid 1 1 360
406. TRS Tobacco ringspot virus 3 3 790
407. TSV Tobacco streak virus 3 3 2525
408. TVM Tobacco vein mottling virus 4 2 9892
409. UST Ustilago maydis P6 virus 1 1 1234
410. UUK Uukuniemi virus 2 2 4951
411. VAC Vaccinia virus 97 88 389431
412. VAR Variola virus 1 1 1274
413. VAZ Varicella-zoster virus 11 7 131533
414. VLV Visna virus 2 2 9690
415. VSV Vesicular stomatitis virus 169 146 192746
416. VYS Saccharomyces cerevisiae virus ScV1 1 1 819
417. WCP White clover mosaic virus 4 3 13303
418. WDV Wheat dwarf virus 2 2 2829
419. WHV Woodchuck hepatitis virus 7 7 19239
420. WMS Woolly monkey sarcoma virus 2 1 1431
421. WNF West Nile virus 7 4 11434
422. WTV Wound tumor virus 11 9 14409
423. YFV Flavivirus febricis 5 3 22289
424. ZYM Zucchini yellow mosaic virus 1 1 1374
Total 4751 3707 6439492
PHAGE
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AL3 Bacteriophage alpha3 7 5 1299
2. BAZ Bacteriophage Z 1 1 370
3. BBF Bacteriophage BF23 3 3 1434
4. BEO Corynebacteriophage omega 1 1 1880
5. BET Corynebacteriophage beta 3 2 4162
6. BEU Corynebacteriophage gamma 2 2 139
7. BFR Bacteriophage fr 4 4 2053
8. BM2 Bacteriophage PM2 3 3 1025
9. BNF Bacteriophage NF 6 5 3258
10. BO1 Bacteriophage Bo1 1 1 205
11. BP2 Bacteriophage P21 2 2 191
12. BPH Bacteriophage phi-11 2 2 2041
13. BT1 Bacteriophage T1 1 1 1091
14. BU3 Bacteriophage U3 1 1 201
15. BZ3 Bacteriophage Bz13 1 1 218
16. C31 Bacteriophage phi-c31 1 1 3413
17. CF1 Bacteriophage Cf16 1 1 500
18. CP1 Bacteriophage Cp-1 5 3 3364
19. CP5 Bacteriophage Cp-5 4 2 1850
20. CP7 Bacteriophage Cp-7 5 3 4792
21. CP9 Bacteriophage Cp-9 1 1 1253
22. CPT Bacteriophage Cp-T1 1 1 730
23. D18 Bacteriophage D108 9 7 3935
24. F1C Bacteriophage f1 16 13 16373
25. F2C Bacteriophage f2 1 1 58
26. FR1 Bacteriophage fr1 1 1 205
27. G14 Bacteriophage G14 1 1 113
28. H19B Bacteriophage H19B 2 2 3301
29. H30 Bacteriophage H30 1 1 1905
30. H44 Bacteriophage H4489A 1 1 3222
31. HB3 Bacteriophage HB-3 1 1 1319
32. HP1 Bacteriophage HP1 4 2 10673
33. IKE Bacteriophage Ike 3 2 7200
34. J93 Bacteriophage 933J 2 1 1499
35. JP3 Bacteriophage Jp34 2 2 1070
36. JP5 Bacteriophage Jp501 1 1 205
37. K5T Bacteriophage BK5-T 5 5 2070
38. KU1 Bacteriophage Ku1 1 1 220
39. L17 Bacteriophage L17 2 2 240
40. L54 Bacteriophage L54a 1 1 1626
41. LAM Bacteriophage lambda 120 24 55603
42. LP7 Bacteriophage LP7 1 1 2110
43. M13 Bacteriophage M13 12 8 8218
44. M13MP7 Bacteriophage M13mp7 1 1 60
45. M13MP8 Bacteriophage M13mp8 3 3 240
46. M13MP9 Bacteriophage M13mp9 2 2 318
47. M2Y Bacteriophage M2Y 2 2 336
48. MS2 Bacteriophage MS2 16 8 4679
49. OX2 Bacteriophage Ox2 2 2 2641
50. P15 Bacteriophage phi-105 1 1 1306
51. P16 Bacteriophage 16-3 1 1 720
52. P18 Bacteriophage 186 1 1 3561
53. P21 Bacteriophage phi-21 2 2 949
54. P22 Bacteriophage P22 17 15 18461
55. P29 Bacteriophage phi-29 18 15 29805
56. P42 Bacteriophage 42D 2 2 1986
57. P434 Bacteriophage 434 7 5 2933
58. P80 Bacteriophage phi-80 7 6 4714
59. P82 Bacteriophage 82 1 1 1200
60. P93 Bacteriophage 933W 2 1 1661
61. PA2 Bacteriophage PA-2 1 1 2816
62. PF1D Bacteriophage Pf1 1 1 435
63. PF3 Bacteriophage Pf3 4 4 12981
64. PFD Bacteriophage fd 12 7 7334
65. PFI Bacteriophage Fi 1 1 78
66. PG4 Bacteriophage G4 12 8 7247
67. PGA Bacteriophage Ga 4 4 4022
68. PH1 Bacteriophage H1 1 1 98
69. PH15 Bacteriophage phi-15 3 3 2352
70. PH2 Bacteriophage 21 1 1 1688
71. PH3 Bacteriophage phi-3T 2 2 3422
72. PH5 Bacteriophage phi-105 6 5 3851
73. PH6 Bacteriophage phi-6 7 7 13619
74. PHC Lactococcus 1 bacteriophage 1 1 1654
75. PHI Bacteriophage phi-H 1 1 2465
76. PHK Bacteriophage phi-K 3 2 426
77. PHM Bacteriophage phi-vML3 1 1 1208
78. PK3 Bacteriophage K3 7 6 6630
79. PM1 Bacteriophage M1 1 1 1714
80. PM2 Bacteriophage M2 1 1 1820
81. PMU Bacteriophage Mu 48 37 18049
82. PP1 Bacteriophage P1 43 41 20939
83. PP2 Bacteriophage P2 11 10 7614
84. PP4 Bacteriophage P4 9 8 14159
85. PP7 Bacteriophage P7 6 5 3315
86. PQB Bacteriophage Q-beta 14 13 1866
87. PR4 Bacteriophage PR4 2 2 240
88. PR5 Bacteriophage PR5 2 2 238
89. PR722 Bacteriophage PR722 2 2 240
90. PRD1 Bacteriophage PRD1 9 9 9061
91. PS2 Bacteriophage PBS2 1 1 720
92. PSP Bacteriophage Sp 3 2 4542
93. PST Bacteriophage ST 1 1 246
94. PT2 Bacteriophage T2 9 6 8743
95. PT3 Bacteriophage T3 21 18 16851
96. PT4 Bacteriophage T4 120 67 128557
97. PT5 Bacteriophage T5 29 26 26837
98. PT6 Bacteriophage T6 4 3 2938
99. PT7 Bacteriophage T7 41 18 47176
100. PVK Bacteriophage VK 1 1 246
101. PX1 Bacteriophage phi-X174 40 15 7239
102. PZA Bacteriophage PZA 3 1 19366
103. R17 Bacteriophage R17 9 7 463
104. RB1 Bacteriophage RB18 1 1 674
105. RB5 Bacteriophage RB51 1 1 700
106. RHO Bacteriophage Rho11s 2 1 2187
107. S13 Bacteriophage S13 2 1 5386
108. SF6 Bacteriophage SF6 1 1 996
109. SP1 Bacteriophage SPO1 20 20 6864
110. SP2 Bacteriophage SPO2 1 1 3040
111. SP6 Bacteriophage SP6 5 4 2948
112. SP8 Bacteriophage SP82 5 5 1527
113. SPB Bacteriophage SP-beta 4 3 2224
114. SPC Bacteriophage S-phi-C 1 1 1377
115. SPP Bacteriophage SPP1 2 2 1558
116. SPR Bacteriophage SPR 3 1 2129
117. ST1 Bacteriophage ST-1 2 2 844
118. T12 Bacteriophage T12 1 1 1837
119. TH1 Bacteriophage TH1 1 1 220
120. TW1 Bacteriophage TW19 1 1 76
121. TW2 Bacteriophage TW28 1 1 260
Total 880 593 682556
SYNTHETIC
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. ACC Cloning vector 1 1 1337
2. AD2 Artificial gene 1 1 128
3. ADB Artificial gene 8 8 573
4. ADH Artificial gene 1 1 106
5. ADL Artificial gene 3 3 273
6. ADV Artificial gene 1 1 106
7. ALM Avian myeloblastosis virus 1 1 337
8. ALR Rous sarcoma virus 4 4 413
9. AMH Artificial gene 1 1 234
10. APH Artificial gene 1 1 400
11. ARB Artificial gene 3 3 1180
12. ARC Cloning vector 2 2 760
13. ARE Artificial gene 1 1 255
14. ARG Artificial gene 1 1 249
15. ARH Artificial gene 5 5 440
16. ARI Artificial gene 1 1 465
17. ARL Artificial gene 2 2 424
18. ARM Artificial gene 1 1 457
19. ARN Cloning vector 1 1 333
20. ARP Cloning vector 6 6 1079
21. ARS Artificial gene 1 1 529
22. ART Artificial gene 1 1 60
23. ARY Artificial gene 1 1 264
24. ATH Artificial gene 1 1 417
25. BAM Artificial gene 1 1 85
26. BKV BK Virus 6 3 1560
27. BOV Bos taurus 18 18 6440
28. BSF Cloning vector 2 2 54
29. BSM Cloning vector 1 1 54
30. BSU Bacillus subtilis 10 9 3921
31. BTH Artificial gene 2 2 104
32. CAR Artificial gene 1 1 3616
33. CEL Caenorhabditis elegans 1 1 186
34. CHK Gallus sp. 4 4 701
35. CHS Artificial gene 1 1 478
36. CMVMUS Artificial gene 1 1 1376
37. COT Artificial gene 1 1 7876
38. CRO Artificial gene 2 2 198
39. CVC Cloning vector 1 1 46
40. CVE Cloning vector 1 1 60
41. CVJ Cloning vector 5 5 390
42. CVK Cloning vector 1 1 120
43. CYN Artificial gene 1 1 282
44. DRO Drosophila sp. 4 4 4200
45. E6V Artificial gene 2 2 206
46. ECO Escherichia coli 124 111 24710
47. EGF Artificial gene 1 1 299
48. ERY Artificial gene 1 1 217
49. EXP Cloning vector 2 2 123
50. EZZ Cloning vector 1 1 60
51. FCN Artificial gene 1 1 42
52. FCS Cloning vector 2 2 136
53. FLA Artificial gene 1 1 69
54. FLU Influenza virus 6 6 861
55. FSB Artificial gene 2 2 747
56. GFA Artificial gene 1 1 176
57. HAL Artificial gene 4 3 1633
58. HBV Hepatitis B virus 3 3 315
59. HCY Artificial gene 1 1 313
60. HET Heteropolymeric DNA 2 2 594
61. HIR Artificial gene 1 1 220
62. HIV Artificial gene 6 6 879
63. HL1 Artificial gene 4 4 238
64. HNR Artificial gene 1 1 90
65. HPB Artificial gene 1 1 556
66. HS1 Artificial gene 1 1 780
67. HS2 Artificial gene 2 2 129
68. HS5 Human cytomegalovirus 1 1 210
69. HSV Herpes Simplex Virus 6 6 323
70. HUM Artificial human gene 80 71 22856
71. HY3 Plasmid pHY300PLK 1 1 4870
72. IFH Cloning vector 1 1 63
73. IL1 Artificial gene 1 1 88
74. INS Artificial gene 1 1 232
75. ISN Insertion element 6 6 378
76. JRD Cloning vector 3 3 6852
77. KAN Cloning vector 3 3 210
78. KPN Klebsiella pneumoniae 2 2 354
79. KY1 Artificial gene 1 1 171
80. LAC Cloning vector 2 2 1173
81. LAM Bacteriophage lambda 4 4 336
82. LET Artificial gene 1 1 212
83. LGT Cloning vector lambda gt11 1 1 210
84. LHM Artificial gene 1 1 232
85. LOR Cloning vector 1 1 5614
86. M13 Cloning vector M13 8 8 643
87. M13MP7 Cloning vector M13mp7 2 2 120
88. M13MP8 Cloning vector M13mp8 1 1 382
89. M13MP9 Cloning vector M13mp9 1 1 60
90. M13TG103 Cloning vector M13tg103 1 1 66
91. M13TG114 Cloning vector M13tg114 1 1 60
92. M13TG115 Cloning vector M13tg115 1 1 66
93. M13TG117 Cloning vector M13tg117 1 1 63
94. M13TG120 Cloning vector M13tg120 1 1 54
95. M13TG130 Cloning vector M13tg130 1 1 93
96. M13TG131 Cloning vector M13tg131 1 1 93
97. MBO Artificial gene 1 1 91
98. MBR Artificial gene 3 3 157
99. MCA Cauliflower mosaic virus 2 2 139
100. MCV Cucumber mosaic virus 5 5 284
101. MHI Mouse-human hybrid 4 4 1574
102. MLE Artificial gene 1 1 936
103. MLF Artificial gene 1 1 213
104. MLM Artificial gene 2 2 178
105. MML Cloning vector 12 4 24042
106. MNV Artificial gene 1 1 87
107. MP7 Artificial gene 2 1 69
108. MP8 Artificial gene 2 1 60
109. MP9 Artificial gene 2 1 60
110. MS2 Artificial gene 1 1 100
111. MSM Artificial gene 2 2 331
112. MUS Mus musculus 38 38 4298
113. NEU Artificial gene 2 2 171
114. NNL Plasmid pNNL 1 1 815
115. NPA Autographa californica nuclear polyhedrosis virus
3 3 922
116. OVC Artificial gene 1 1 738
117. P17X Plasmid pACYC177 7 6 4190
118. P18X Plasmid pACYC184 5 4 4593
119. P23 Artificial gene 1 1 119
120. PAC Cloning vector 1 1 83
121. PAH Artificial gene 2 2 107
122. PBD Cloning vector 1 1 79
123. PBG Cloning vector 6 3 12379
124. PBR Plasmid pBR322 43 23 7123
125. PBR313 Plasmid pBR313 1 1 200
126. PBR322SV Plasmid pBR322/SV40 hybrid 5 5 209
127. PBR325 Plasmid pBR325 3 3 319
128. PBR327 Plasmid pBR327 3 3 3334
129. PBR329 Plasmid pBR329 1 1 4150
130. PBR345 Plasmid pBR345 2 2 1024
131. PBRH4 Plasmid pBRH4 1 1 71
132. PCE Cloning vector 1 1 510
133. PCG86 Plasmid pCG86 2 2 654
134. PCZ Plasmid pCZ 2 2 208
135. PDPL Plasmid PDPL13 1 1 79
136. PEA Artificial gene 1 1 1004
137. PEM Cloning vector pEMBL8m 4 2 7878
138. PES Cloning vector 1 1 99
139. PF1 Bacteriophage f1 1 1 254
140. PFD Artificial gene 1 1 85
141. PFE Plasmid pFE 2 2 180
142. PFH Plasmid pFH 1 1 120
143. PFL Cloning vector 2 1 4588
144. PFR Plasmid pFR 4 4 341
145. PFX Artificial gene 2 1 3627
146. PHP Plasmid pHP45 1 1 155
147. PHS Plasmid pHS 3 3 2877
148. PHV100 Cloning vector 1 1 396
149. PHV33 Artificial gene 13 13 650
150. PIC Plasmid pIC 5 5 477
151. PIG Artificial pig gene 3 3 440
152. PIP1088 Plasmid pIP1088 2 2 142
153. PIVX Cloning vector pi-VX 1 1 902
154. PJSC73 Plasmid pJSC73 1 1 3564
155. PK18 Cloning vector 1 1 2661
156. PKN Plasmodium knowlesi 2 1 360
157. PKT Artificial gene 3 3 264
158. PKU Cloning vector 3 2 7825
159. PL2 Artificial gene 2 2 240
160. PL5 Artificial gene 2 2 310
161. PLB Cloning vector 1 1 852
162. PLF Cloning vector 2 1 3641
163. PLY Artificial gene 1 1 66
164. PMB9 Cloning vector 1 1 138
165. PMC1843 Plasmid pMC1843 1 1 62
166. PMK20 Artificial gene 2 1 4028
167. PMT Cloning vector 1 1 2854
168. PMU Artificial gene 4 4 576
169. POG Cloning vector 1 1 352
170. POL Artificial gene 2 2 129
171. POLY Cloning vector 6 3 6226
172. PORI17 Plasmid pOri17 2 2 490
173. PPI Cloning vector 1 1 4734
174. PPUC Cloning vector 1 1 75
175. PQB Artificial gene 1 1 64
176. PRK Cloning vector 2 2 839
177. PRT Artificial gene 1 1 711
178. PRW1707 Plasmid pRW1707 1 1 66
179. PRW1718 Plasmid pRW1718 1 1 72
180. PRW1724 Plasmid pRW1724 1 1 66
181. PRW1725 Plasmid pRW1725 1 1 66
182. PSE Artificial gene 2 2 139
183. PSI Cloning vector 1 1 81
184. PSKS104 Plasmid pSKS104 1 1 69
185. PSKS105 Plasmid pSKS105 1 1 60
186. PSKS106 Plasmid pSKS106 1 1 60
187. PSKS107 Plasmid pSKS107 1 1 46
188. PSMF Cloning vector 2 2 259
189. PSP Cloning vector 3 3 215
190. PSR Artificial gene 1 1 138
191. PSS Cloning vector 2 2 475
192. PT4 Bacteriophage T4 4 4 725
193. PT7 Bacteriophage T7 2 2 282
194. PTK Plasmid pTK 1 1 68
195. PTL Cloning vector 1 1 51
196. PTN Plasmid pTN 1 1 355
197. PTR Plasmid pTr 1 1 137
198. PTU Artificial gene 4 4 883
199. PTZ Plasmid pTZ12 1 1 2517
200. PUC Cloning vector 2 1 3914
201. PUEX Cloning vector 2 1 6728
202. PVH51 Plasmid pVH51 1 1 3847
203. PX1 Bacteriophage phi-X174 1 1 59
204. PYM Artificial gene 1 1 252
205. PYR Artificial gene 1 1 158
206. PZ189 Cloning vector 1 1 153
207. R38 Plasmid R388 1 1 1167
208. R67 Plasmid R67 1 1 353
209. R6K Cloning vector 2 2 176
210. RAD Artificial gene 1 1 955
211. RAT Rattus sp. 14 13 1864
212. RET Cloning vector 2 2 780
213. RMT Artificial gene 6 6 638
214. RNA Artificial gene 1 1 328
215. ROT Artificial gene 2 2 141
216. RRNA Artificial gene 1 1 136
217. RSC1 Plasmid Rsc13 3 1 7894
218. RSF1050 Plasmid RSF1050 1 1 104
219. RSP Artificial gene 1 1 100
220. RSV Rous Sarcoma Virus 3 3 450
221. RTS Artificial gene 1 1 280
222. S100 Artificial gene 1 1 283
223. SAA Bacteriophage sigma-11-AA248 1 1 83
224. SAU Staphylococcus aureus 1 1 60
225. SFV Semliki forest virus 3 3 171
226. SHI Cloning vector 3 3 428
227. SHU Artificial gene 3 3 272
228. SIN Cloning vector 2 2 596
229. SLM Artificial gene 10 10 817
230. SOMINS Artificial gene 1 1 226
231. SOP Artificial gene 1 1 111
232. SP02 Bacteriophage SP02 1 1 487
233. SP6 Artificial gene 1 1 78
234. SPI Artificial gene 2 2 295
235. SRU Artificial gene 1 1 252
236. STA Artificial gene 4 4 619
237. STM Artificial gene 1 1 71
238. STY Salmonella sp. 1 1 135
239. SV4 Simian Virus 40 13 13 2415
240. SVA Artificial gene 1 1 213
241. SYN Plasmid pDSP1 161 146 49857
242. SYN Synthetic sequence 51 48 131042
243. T13 Artificial gene 1 1 223
244. T4L Artificial gene 1 1 518
245. TAC Artificial gene 1 1 842
246. THA Artificial gene 1 1 641
247. THY Plasmid pUC8 2 1 503
248. TI Plasmid Ti 4 3 6552
249. TN3 Artificial gene 1 1 110
250. TNP Artificial gene 1 1 192
251. TNS Cloning vector 2 2 144
252. TOB Artificial gene 1 1 788
253. TRN28 Cloning vector 2 2 284
254. TRN3 Transposon Tn3 10 10 1119
255. TRN5 Artificial gene 1 1 80
256. TRNB Artificial gene 1 1 84
257. TU4 Cloning vector 2 2 350
258. VAC Cloning vector 4 4 683
259. VCH Artificial gene 1 1 444
260. VEC Cloning vector 1 1 143
261. VTR Cloning vector 1 1 148
262. WHL Artificial gene 1 1 507
263. XEL Xenopus laevis 13 13 1227
264. YSC Saccharomyces cerevisiae 41 40 10154
265. YSE Artificial gene 2 1 1795
266. YST Artificial gene 1 1 82
267. ZMO Artificial gene 3 3 740
Total 1129 1028 516186
UNANNOTATED
Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. Unidentified 4909 3756 4792964
Total 4909 3756 4792964