swiss-prot
Background and rationale
Almost since its creation by Amos Bairoch in 1986, the SWISS-PROT protein sequence database has
been a collaborative effort of the Department of Medical Biochemistry of the University of Geneva
and what was then called the Data Library group of the European Molecular Biology Laboratory (EMBL)
in Heidelberg (Germany). In 1994, the activities of the Data Library were broadened and
incorporated in a new Outstation of EMBL, the European Bioinformatics Institute (EBI) in Hinxton
(UK). In April this year, the SWISS-PROT group in Geneva joined a new academic institution, the
Swiss Institute of Bioinformatics (SIB), which it helped to create. SWISS-PROT is therefore an
equal partnership of the SIB and EBI/EMBL. The founder, Amos Bairoch, is ultimately responsible for
the scientific content and format of SWISS-PROT.
The enormous growth in the quantity of sequence and characterization data has made the task of
producing an annotated and comprehensive protein sequence database a major challenge. While
automation of some aspects of this work has brought significant gains in productivity, the work
nonetheless remains intensive in terms of human resources and requires an increasing amount of
expertise. Recent years have shown that public funding for
such an activity is not going to keep pace with its financial requirements. During the same period,
the importance of high quality annotation for all kinds of life sciences research activities has
grown. We are therefore faced with the paradoxical situation where no major life sciences research
lab can function without a database such as SWISS-PROT, yet the existence and continued development
of such a resource is in jeopardy.
We believe that the only feasible solution to this problem is to obtain additional funds through
the payment of yearly license fees by non-academic users for access to SWISS-PROT. We have,
therefore, explored legal and organizational solutions to achieve this goal. After careful
consideration of all potential options, we have planned the following solution:
Starting in September 1998, we intend to implement a system of an annual subscription fee for
commercial use of the database. Both SIB and EBI will mandate a new company, Geneva Bioinformatics
(GeneBio) to act as their representative for the purpose of concluding the necessary license
agreements and levying the fees.
We will describe here in detail the consequences of this change. The most important take-home
message is that these changes should not have any impact on the way SWISS-PROT is accessed or
redistributed. Academic users will not be affected by these changes. Industrial end-users will also
not directly be affected as long as their employer pays the license fee. The same holds true for
bioinformatics companies. Academic software or database developers as well as providers of database
distribution services will be only minimally affected by these changes. We hope to be able to keep
the spirit of SWISS-PROT alive and at the same time ensure its long-term financial survival. We
sincerely hope and believe that in the next two years the only change that will matter will be the
increase in scope and timeliness of the database.
How are these new funds going to be used?
The funds obtained through licensing of SWISS-PROT to industrial users will be used by both SIB
and EBI to contribute to the further development of the database. In particular, new annotation and
programming support positions will be created. We will also hire persons whose task will be to
interact with users and to further enhance the successful dialog that has been established between
SWISS-PROT and the scientists who are contributing the information that is used to build the
database. The growth of the SWISS-PROT staff will allow us to hire specialized annotators with a
larger knowledge spread (medicine, pharmacology, virology, etc.) than is currently represented
among our staff.
When can you expect these new developments to enhance the impact of SWISS-PROT?
It takes about a year to train an annotator. As new funds will not be generated before the last
quarter of 1998, most changes will only be apparent in the last months of 1999. Nevertheless, we
believe that before the release of "SWISS-PROT 2000", we will have achieved a number of specific
goals. We will have among other things:
Finished the first pass of the complete annotation of the proteins encoded in a number of
complete genomes, and in particular those of E.coli, B.subtilis, M.jannaschii and yeast
(S.cerevisiae);
Made a substantial effort toward the full annotation of human and rodent proteins;
Significantly increased the speed of annotation so as to be able to annotate key members of new
protein families as soon as they become available;
Changed the taxonomy currently used in SWISS-PROT to that used by the DNA databases;
Converted the "ALL UPPER CASE" format currently used to a more appealing and user-readable "Mixed
Case" format;
Created mirror sites of the WWW ExPASy server so as to provide users in every part of the world
with a comprehensive database and software environment for protein studies.
Why the funding model of SWISS-PROT is not applicable to nucleotide sequence databases
We consider that the funding model that has to be adopted to secure the viability of SWISS-PROT is
not applicable to the international nucleotide sequence databases (EMBL/GenBank/DDBJ), even though
these are also curated. Nucleotide sequences, from which SWISS-PROT entries are derived, must
remain in the public domain in recognition of the fact that they are the primary data, and have
been submitted to public-domain collections by individual scientists. This same consideration holds
for primary databases of macromolecular structures (such as PDB).
Will there be changes on the way SWISS-PROT can be accessed?
The take-home message is: if you are a user of SWISS-PROT from a non-profit organization, you will
not be affected by these changes. If you are a for-profit user, you should not be directly
affected, but your company will have to pay a yearly license fee to allow you and your colleagues
to make use of the database. SWISS-PROT will be copyrighted, so that it can be legally
protected against unauthorized use.
We are planning no major changes in the procedures currently used for access. We are aware that
SWISS-PROT is redistributed in many different forms and media by numerous organizations and
bioinformatics companies around the world, and we have decided to keep the current system in place.
There will be no password scheme and no limitation on access. The whole system will be based on
trust. What this means is that we trust commercial companies to contribute to the financial health
of the database by paying their yearly subscription. We will, of course, investigate any cases of
flagrant abuse. If you are an academic user of SWISS-PROT, you should not see any changes other than
improvements such as those listed above and the fact that SWISS-PROT entries will now contain a
statement which will probably look very much like this one:
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL Outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------
Such a statement will appear in the majority of SWISS-PROT entries. However, it will not appear in
any entry whose sequence originates solely from direct protein sequencing or from the translation
of a DNA sequence which is not available in the international nucleotide sequence databases
(EMBL/GenBank/DDBJ). This decision was taken to allow industrial users that have submitted protein
sequences to retrieve them without any legal consequences.
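For redistributors who want to see which entries in a local copy carry the statement, the following Python sketch may help. It assumes the standard flat-file layout (each line starts with a two-letter line code, and each entry ends with a line containing only "//"). It also assumes that the final wording keeps the phrase "This SWISS-PROT entry is copyright"; the text shown above is only a draft, so the marker string is an assumption. The function names are illustrative and not part of any official tool.

```python
# Assumed marker: taken from the draft statement above, which may still change.
COPYRIGHT_MARKER = "This SWISS-PROT entry is copyright"


def split_entries(flat_file_text):
    """Split flat-file text into entries; each entry ends with a '//' line."""
    entries, current = [], []
    for line in flat_file_text.splitlines():
        if line.strip() == "//":
            if current:
                entries.append(current)
            current = []
        else:
            current.append(line)
    if current:  # tolerate a final entry without a terminator
        entries.append(current)
    return entries


def has_copyright_notice(entry_lines):
    """Return True if any CC (comment) line carries the assumed marker."""
    return any(
        line.startswith("CC") and COPYRIGHT_MARKER in line
        for line in entry_lines
    )
```

Entries for which `has_copyright_notice` returns False would correspond to the exception described above (sequences obtained solely by direct protein sequencing or from DNA not present in EMBL/GenBank/DDBJ), provided the marker assumption holds.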
Is SWISS-PROT release 36 still in the public domain?
To facilitate the transition for all users, SWISS-PROT release 36 will remain completely in
the public domain and is not subject to any of the changes mentioned in this document.
Redistribution of SWISS-PROT
There are many ways in which all or part of SWISS-PROT can be redistributed. The most common cases
are the following:
Distribution of the entire database by FTP;
Distribution of the entire database on CD-ROM, either in its original format or reformatted and
indexed to be used with a specific software package;
Access to specific entries using a WWW server.
We are aware that many academic institutions and software companies redistribute SWISS-PROT and we
therefore want to minimize disruption of the existing schemes. If you are redistributing SWISS-PROT
or making it available to all users on the Internet you need to explicitly ask permission to do so
by registering your service with either the SIB or the EBI. Such permission will be granted if
you agree to observe the following rules:
In the case of a WWW or FTP server, you are asked to make available to SIB or the EBI that part
of your log files that specifically deals with access to SWISS-PROT. Such information should be
provided at least twice a year. SIB and EBI will then inform GeneBio of access to SWISS-PROT by
companies that have not yet paid their yearly license fee. Apart from this specific use and that of
building statistics of global usage of SWISS-PROT, this information will not be used in any other
way, nor will it be made available to third parties.
In the case of CD-ROM distribution, you are asked to provide the list of the for-profit
institutions to which such CD-ROMs were distributed. Such a list should be provided at least twice
a year.
You must update the copy of SWISS-PROT that you make available on a timely basis. For an FTP
service, you should provide the last full update of the database at the latest three weeks after it
has been released. For CD-ROM distribution, you should provide a minimum of two updates per year.
For a WWW server you should provide the latest database no more than one month after it becomes
available on the official WWW servers of SIB and EBI. It goes without saying that by registering
your service, you will directly receive from SIB or EBI information about the availability of major
and weekly releases so as to help you to comply with this request. This condition is purely made on
behalf of users of the database: most existing servers already provide up-to-date information, but
in a recent survey, we found one or two services that were offering releases dating back more than
18 months.
You should not redistribute SWISS-PROT in a format other than those listed hereafter without the
explicit consent of SIB/EBI. The formats that are accepted by default, in addition to the original
format, are those known as 'ASN.1' and as 'GCG'. You can also make SWISS-PROT available in 'FASTA'
or 'Blast' formats as long as you also provide a version that includes all annotations in the
agreed formats. Again, this request is made on behalf of users, who are confused as to what type of
information is available in SWISS-PROT when the database is reformatted and some of its structure
is lost after such a transformation. It is fair to say that SIB/EBI will look positively into any
request for redistribution which keeps the structure of the database, and will respond negatively
only in the rare cases where the integrity of the structure is degraded.
You should make available to users of SWISS-PROT to whom you redistribute the database some
information items that will be provided by SIB and EBI. These information items are meant to
briefly describe the database and its content; to point out the original sites (SIB and EBI) where
the database can be obtained; and to summarize the principle of the new licensing system. We will
tailor these information items to the specific technical requirements of your distribution system.
For example: if you have a WWW server running SRS, we will provide the 'IT' file that describes
SWISS-PROT or, if you have an FTP server, we will provide the 'readme' file to be stored in the
SWISS-PROT directory.
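The first rule above asks operators of WWW or FTP servers to supply only that part of their log files that deals with access to SWISS-PROT. As a minimal sketch of how such an extract might be produced, the Python fragment below filters lines in the Common Log Format used by many WWW servers. The path prefix `/databases/swissprot/` is a hypothetical example and must be replaced by the location actually used on your server. FTP transfer logs use a different layout and would need a different pattern.

```python
import re

# Hypothetical URL prefix under which this server exposes SWISS-PROT.
SWISSPROT_PATH_PREFIX = "/databases/swissprot/"

# Captures the requested path from the quoted request field of a
# Common Log Format line, e.g. "GET /some/path HTTP/1.0".
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+)')


def swissprot_log_lines(log_lines, prefix=SWISSPROT_PATH_PREFIX):
    """Yield only the log lines whose requested path starts with the prefix."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(1).startswith(prefix):
            yield line
```

Writing the yielded lines to a separate file twice a year would give an extract limited to SWISS-PROT accesses, leaving the rest of the server log out of the report.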
Incorporation of SWISS-PROT in a similarity search service
What we discuss here are services that allow users to detect similarities between a 'test'
sequence and the sequences stored in one or more sequence databases. Most services of that kind are
those based on the well-known FASTA or Blast series of programs. The same conditions also apply to
protein identification services (e.g. proteomics tools used in the context of mass spectrometry
(MS) or 2D-PAGE) as well as services based on the identification of protein families (e.g. PROSITE,
BLOCKS, Pfam, etc.).
There are a number of cases to consider:
If you are an academic institution providing such a service to internal users of your
institution, you do not need to do anything, not even register your service;
If you are an academic institution providing such a service on the Internet to any users either
academic or industrial, you need to register your service with SIB/EBI and observe the following
rules:
Your search service should offer a version of SWISS-PROT that is not more than 2 months older
than that available on the official WWW servers of SIB and EBI. If, for technical reasons, you are
not able to comply with this request, we may, on a case-by-case basis, decide to relax this rule.
The output produced by your search engine should explicitly state what version of SWISS-PROT was
used to do the search (Example: 'Release 36 with updates up to October 24, 1998').
While this seems similar to the case discussed previously (services redistributing or providing
access to SWISS-PROT entries), there is an important difference. You do not need to provide to SIB
or EBI the log file of such a service as long as your similarity search service does not directly
provide, as its output, any SWISS-PROT entry. This is the case for most existing programs, such as
Blast or FASTA.
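The rule that search output must state the version of SWISS-PROT used can be met with a single line in the report header. The Python sketch below builds such a line in the style of the example given above. The function name and the leading "SWISS-PROT" label are illustrative choices, not a required format.

```python
from datetime import date

# English month names are spelled out so the result does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def release_banner(release_number, last_update):
    """Build a line such as 'SWISS-PROT Release 36 with updates up to ...'."""
    return (
        f"SWISS-PROT Release {release_number} with updates up to "
        f"{MONTHS[last_update.month - 1]} {last_update.day}, {last_update.year}"
    )
```

A search service could print the result of `release_banner(36, date(1998, 10, 24))` at the top of each result page, updating both arguments whenever the local copy is refreshed.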
If you want to create links from the results of a search to the full display of the relevant
SWISS-PROT entries, you are encouraged to do so. We specifically encourage you to make these links
to the original entries on the SIB or EBI WWW servers, as these will always contain the most
up-to-date information. However, if you want to provide links to a copy of SWISS-PROT stored on
your local server, you can also do so, but in that case you will become a redistributor of
SWISS-PROT and the rules in the relevant section of this document will be applicable.
Use of SWISS-PROT as the primary resource for a derived database or information service
We distinguish here three types of usage:
integration of all of SWISS-PROT into another database (example: Entrez or OWL);
integration of part of SWISS-PROT into a specialized database (examples: AmsDb, GeneCards,
EcoGene, etc.);
integration of SWISS-PROT into a specialized information service. Examples: ProDom, which shows
the domain organization of SWISS-PROT entries, or ProtoMap which clusters entries by families on
the basis of similarity.
Integration of all of SWISS-PROT into another database will require explicit permission from
SIB/EBI. Such permission will only be granted if there is a valid scientific or technical reason to
encapsulate the entire content of SWISS-PROT into a new resource. In the event that such permission
is granted you will become a redistributor of SWISS-PROT and the rules in the relevant section of
this document will be applicable.
Integration of part of SWISS-PROT into a specialized database is encouraged. However, if you are
using or intend to use SWISS-PROT entries or part of SWISS-PROT entries, you need to contact SIB or
EBI to get an explicit permission to do so. You will be asked to describe the scope of your
database, what part of SWISS-PROT you want to incorporate and how the information will be presented
and distributed.
Services that make use of SWISS-PROT to build an information resource are bound by the same rules
as those described for similarity search services. However, we do not want to hinder services such
as ProDom or ProtoMap, which require huge computing resources to produce a new release. We will
therefore consider relaxing the time constraints for such services on a case-by-case basis, on
explicit request.
Use of SWISS-PROT in an educational context
Use of SWISS-PROT for educational purposes is actively encouraged. As a member of an academic
institution you can use SWISS-PROT in any courses or seminars with no restriction whatsoever. If
you organize courses that are attended by industrial users you can make them aware of the following
statement:
Industrial participants in courses and seminars are free to make use of SWISS-PROT during the
course or seminar irrespective of whether or not their employer is currently subscribing to the
database.
We also encourage use of the databases for educational purposes by exempting companies whose
purpose is to organize courses and/or seminars for the Life Sciences community from the obligation
of paying for a yearly subscription. Such companies need to contact SIB or EBI to register. They
are asked to provide the charter of their organization so as to ensure that they are actively
engaged in such educational activities and that they are not using a peripheral educational
activity as a means of being exempted from paying their subscription!
Making reference to SWISS-PROT entries
There are no restrictions on either academic or industrial users making references to SWISS-PROT
entries in any form of publications, printed or electronic. We only want to take this opportunity
to remind you again (but believe us, this is needed!) that when you cite a database entry you need
to cite the primary accession number of that entry. Accession numbers are fixed identifiers; entry
names are not. Of course, you can include the entry name in the citation of an entry, but the
accession number is the primary means of identification of an entry and should always be used.
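For authors who assemble citations from the flat file, the Python sketch below shows one way to pick out the primary accession number. It relies on the flat-file convention that the AC line lists accession numbers separated by semicolons, with the primary accession number first. The function name is illustrative, and the accession number and entry name in the usage note are invented examples.

```python
def primary_accession(entry_lines):
    """Return the first accession number on the first AC line, or None."""
    for line in entry_lines:
        if line.startswith("AC"):
            # Drop the two-letter line code, then take the text before
            # the first semicolon: that is the primary accession number.
            first = line[2:].split(";")[0].strip()
            return first or None
    return None
```

For an entry whose lines include `AC   P12345; Q67890;` this returns `P12345`, which is the identifier to cite even if the entry name on the ID line (say, `TEST_HUMAN`) changes in a later release.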
Incorporating SWISS-PROT information or entries in printed publications
If you are writing a book, a book chapter or an article and you want to illustrate it with one or
more figures representing SWISS-PROT entries or excerpts of entries, you are encouraged to do so
and do not need to ask for permission. However, if you intend to publish a book which contains the
printout of a significant number of entries from the database, you need to ask explicitly for
permission. We intend to grant permissions, but we want to make a distinction between illustrating
an article or a book with excerpts from the database (which is encouraged) and printing a book
which solely or substantially consists of printouts from the database.
For more information...
If, after having read this document, you still have questions, you can send an email to one of the
following addresses:
General information: info@isb-sib.ch
Licensing information: license@isb-sib.ch