Regarding the inclusion of sequences in a Sequence Listing, the "what/where/when/why/how" can be a source of confusion for many, and it is therefore one of the most commonly asked questions (in various forms). Below is a brief summary of the main rules governing the inclusion (or exclusion) of nucleotide and protein sequences in Sequence Listings.
Nucleotide and protein sequences often include symbols that represent modified positions or draw attention to a certain position. One such symbol is the asterisk (*), and while asterisks can represent a plethora of things, they most often indicate the presence of stop codons. This is extremely important from a Sequence Listing preparation standpoint, as an incorrect interpretation can result in a rejection from the United States Patent and Trademark Office. Sequence Listing rules are quite specific as to how such sequences are to be disclosed. For example, per 37 CFR § 1.822(d)(5), "an amino acid sequence that contains internal terminator symbols (e.g., ‘Ter’, ‘*’, ‘.’, etc.) may not be represented as a single amino acid sequence." In other words, if a stop codon is “embedded” in what appears to be a single protein sequence, the sequence cannot be disclosed as one long, contiguous sequence in a Sequence Listing. Each sequence of four or more specifically defined residues must be assigned its own unique SEQ ID NO identifier. This rule also has ramifications for nucleotide sequences when the coding feature (known as the CDS region) is included in a Sequence Listing. Each separate coding region must be indicated as an individual CDS region. For instance, coding through a stop codon in a nucleotide sequence will result in a rejection even though the protein sequence may continue to be coded. Keep in mind that an asterisk does not indicate a stop codon every time. It is, however, the most common interpretation of the symbol and, absent any further definition, should likely be treated as such in order to interpret the sequence correctly and prevent the issuance of a rejection.
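As a rough illustration of the splitting rule described above, the following Python sketch breaks a one-letter protein string at internal terminator symbols and numbers the resulting pieces. The function names are hypothetical, the sketch assumes that '*' and '.' have already been confirmed to mean stop codons in the disclosure, and it does not handle the three-letter 'Ter' notation. It is an aid for thinking through the rule, not a compliance tool.

```python
import re
from typing import Dict, List

def split_at_terminators(protein: str) -> List[str]:
    """Split a one-letter protein string at '*' or '.' (assumed to be stop codons)."""
    return [segment for segment in re.split(r"[*.]", protein) if segment]

def assign_seq_ids(segments: List[str], first_id: int = 1) -> Dict[int, str]:
    """Give a SEQ ID NO to each segment with four or more specifically defined residues.

    'X' (the variable amino acid) is not counted as specifically defined.
    """
    numbered: Dict[int, str] = {}
    next_id = first_id
    for segment in segments:
        defined = sum(1 for residue in segment.upper() if residue != "X")
        if defined >= 4:
            numbered[next_id] = segment
            next_id += 1
    return numbered

# Hypothetical disclosure with two embedded asterisks.
pieces = split_at_terminators("MKLVA*GHTRP*AC")
for seq_id, segment in assign_seq_ids(pieces).items():
    print(f"SEQ ID NO: {seq_id} -> {segment}")
```

In this made-up example the final fragment has only two residues, so it falls below the four-residue threshold and receives no identifier.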
What/Where/When/Why/How Do I Include this Sequence in a Sequence Listing
2016 Ⓒ Boston Patent Law Association
The overriding statement regarding the inclusion of sequences derives from 37 CFR § 1.821(a): "Nucleotide and/or amino acid sequences as used in §§ are interpreted to mean an unbranched sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides… Sequences with fewer than four specifically defined nucleotides or amino acids are specifically excluded from this section. ‘Specifically defined’ means those amino acids other than ‘Xaa’ and those nucleotide bases other than ‘n’…"
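The thresholds in the passage quoted above can be sketched as a simple check. This is a minimal Python illustration with hypothetical names. It assumes sequences are given as plain strings, with proteins in one-letter notation where 'X' stands in for 'Xaa', and it does not test for branching or for whether every symbol is a permitted one.

```python
# Thresholds taken from the quoted text of 37 CFR § 1.821(a).
MIN_LENGTH = {"nucleotide": 10, "protein": 4}
# Variable symbols that are NOT "specifically defined" ('x' stands in for 'Xaa').
VARIABLE_SYMBOL = {"nucleotide": "n", "protein": "x"}
MIN_SPECIFICALLY_DEFINED = 4

def count_specifically_defined(sequence: str, kind: str) -> int:
    """Count letters other than the variable symbol for this sequence type."""
    variable = VARIABLE_SYMBOL[kind]
    return sum(1 for symbol in sequence.lower() if symbol.isalpha() and symbol != variable)

def meets_inclusion_threshold(sequence: str, kind: str) -> bool:
    """Return True if the sequence is long enough and has enough defined positions."""
    if kind not in MIN_LENGTH:
        raise ValueError("kind must be 'nucleotide' or 'protein'")
    length = sum(1 for symbol in sequence if symbol.isalpha())
    defined = count_specifically_defined(sequence, kind)
    return length >= MIN_LENGTH[kind] and defined >= MIN_SPECIFICALLY_DEFINED

print(meets_inclusion_threshold("acgtnnnnnn", "nucleotide"))  # ten bases, four defined
print(meets_inclusion_threshold("MKXX", "protein"))           # four residues, two defined
```

Both conditions have to hold: a ten-base sequence made up mostly of 'n' can still fail on the specifically defined count, and a fully defined sequence can still fail on length.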
37 CFR § 1.821(a)(1) goes on to provide additional information for nucleotide sequences: "Nucleotides are intended to embrace only those nucleotides that can be represented using symbols set forth in WIPO [World Intellectual Property Organization] Standard ST.25 (1998) Appendix 2, Table 1." WIPO Standard ST.25 (1998), Appendix 2, Table 1 outlines the symbols which correspond to the 5 naturally occurring bases (a, c, t, g and u), as well as the degenerate bases (r, y, m, k, s, w, b, d, h and v) and the universal variable base (n). Additionally, 37 CFR § 1.821(a)(2) provides more specific information for protein sequences: "Amino acids are those L-amino acids commonly found in naturally occurring proteins and are listed in WIPO Standard ST.25 (1998), Appendix 2, Table 3." Again, WIPO Standard ST.25 (1998), Appendix 2, Table 3 lists the twenty naturally occurring amino acids (A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y), special residue designations (B and Z), and the universal variable amino acid (X). Sounds fairly straightforward, right? Ha! Not so fast. While these rules sound simple enough, they are merely a jumping-off point for a litany of other rules to follow when preparing compliant Sequence Listings. So, although determining which sequences qualify for inclusion in a Sequence Listing may seem straightforward, keep in mind that it is just the beginning. Below are two examples of common sequence disclosures that require additional interpretation.
April 2016 - Tara Rix, Group Leader at Harbor Consulting IP Services, Inc
Example One:
Example Two:
It is not uncommon for sequences to contain gaps or regions representing an unknown number of undefined residues/bases. While scientifically the disclosure may represent a single molecule, scientific accuracy and Sequence Listing rules do not always see eye-to-eye. When this occurs, what do Sequence Listing rules require? With regard to the presence of gaps or undefined regions within sequences, 37 CFR § 1.822(e) states that “a sequence with a gap or gaps shall be presented as a plurality of separate sequences, with separate sequence identifiers, with the number of separate sequences being equal in number to the number of continuous strings of sequence data.” So while a sequence containing gaps or stop codons may represent a single molecule, from a Sequence Listing standpoint there are two options:
1. Assign each portion of the defined sequence a unique identifier, or
2. Use feature keys such as "non_cons" (non-consecutive residues, in protein sequences only) to indicate that two residues in the sequence are not consecutive and that there may be an uncertain number of unknown residues between them, or "unsure" (usable in nucleotide or protein sequences) to note that the applicants/authors are unsure of the position/region.
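The first option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the disclosure marks gaps with runs of hyphens, the function names are hypothetical, and every resulting piece would still need to meet the length and "specifically defined" thresholds before it actually receives an identifier.

```python
import re
from typing import List, Tuple

def split_at_gaps(sequence: str) -> List[str]:
    """Return the continuous strings of sequence data, splitting at runs of '-'."""
    return [part for part in re.split(r"-+", sequence) if part]

def number_segments(segments: List[str], first_id: int = 1) -> List[Tuple[str, str]]:
    """Pair each continuous string with its own sequence identifier label."""
    return [(f"SEQ ID NO: {first_id + offset}", segment)
            for offset, segment in enumerate(segments)]

# Hypothetical nucleotide disclosure with two gaps of unknown length.
disclosed = "acgtacgtac---ggttccaagg--ttaaccggtt"
for label, segment in number_segments(split_at_gaps(disclosed)):
    print(label, segment)
```

The number of identifiers produced equals the number of continuous strings, which mirrors the wording of 37 CFR § 1.822(e) quoted above.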
Sequence Listings are a mandatory formality, and because the rules can be difficult to navigate and interpret, compliance can be time-consuming and frustrating.