ProteaseDigestion

class pyopenms.ProteaseDigestion

Bases: object

Cython implementation of _ProteaseDigestion

Documentation is available at http://www.openms.de/current_doxygen/html/classOpenMS_1_1ProteaseDigestion.html

– Inherits from [‘EnzymaticDigestion’]

Digestion can be performed using simple regular expressions, e.g. [KR] | [^P] for trypsin. Also missed cleavages can be modeled, i.e. adjacent peptides are not cleaved due to enzyme malfunction/access restrictions. If n missed cleavages are allowed, all possible resulting peptides (cleaved and uncleaved) with up to n missed cleavages are returned. Thus no random selection of just n specific missed cleavage sites is performed. —– Usage:

from pyopenms import * from urllib.request import urlretrieve # urlretrieve (”http://www.uniprot.org/uniprot/P02769.fasta”, “bsa.fasta”) # dig = ProteaseDigestion() dig.setEnzyme(‘Lys-C’) bsa_string = “”.join([l.strip() for l in open(“bsa.fasta”).readlines()[1:]]) bsa_oms_string = String(bsa_string) # convert python string to OpenMS::String for further processing # minlen = 6 maxlen = 30 # # Using AASequence and digest result_digest = [] result_digest_min_max = [] bsa_aaseq = AASequence.fromString(bsa_oms_string) dig.digest(bsa_aaseq, result_digest) dig.digest(bsa_aaseq, result_digest_min_max, minlen, maxlen) print(result_digest[4].toString()) # GLVLIAFSQYLQQCPFDEHVK print(len(result_digest)) # 57 peptides print(result_digest_min_max[4].toString()) # LVNELTEFAK print(len(result_digest_min_max)) # 42 peptides # # Using digestUnmodified without the need for AASequence from the EnzymaticDigestion base class result_digest_unmodified = [] dig.digestUnmodified(StringView(bsa_oms_string), result_digest_unmodified, minlen, maxlen) print(result_digest_unmodified[4].getString()) # LVNELTEFAK print(len(result_digest_unmodified)) # 42 peptides

__init__()
  • Cython signature: void ProteaseDigestion()

  • Cython signature: void ProteaseDigestion(ProteaseDigestion &)

Methods

__init__

  • Cython signature: void ProteaseDigestion()

digest

  • Cython signature: size_t digest(AASequence & protein, libcpp_vector[AASequence] & output)

digestUnmodified

Cython signature: size_t digestUnmodified(StringView sequence, libcpp_vector[StringView] & output, size_t min_length, size_t max_length)

getEnzymeName

Cython signature: String getEnzymeName() Returns the enzyme for the digestion

getMissedCleavages

Cython signature: size_t getMissedCleavages() Returns the number of missed cleavages for the digestion

getSpecificity

Cython signature: Specificity getSpecificity() Returns the specificity for the digestion

getSpecificityByName

Cython signature: Specificity getSpecificityByName(String name) Returns the specificity by name.

isValidProduct

  • Cython signature: bool isValidProduct(AASequence protein, size_t pep_pos, size_t pep_length, bool ignore_missed_cleavages, bool methionine_cleavage)

peptideCount

Cython signature: size_t peptideCount(AASequence & protein) Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings

setEnzyme

  • Cython signature: void setEnzyme(String name)

setMissedCleavages

Cython signature: void setMissedCleavages(size_t missed_cleavages) Sets the number of missed cleavages for the digestion (default is 0).

setSpecificity

Cython signature: void setSpecificity(Specificity spec) Sets the specificity for the digestion (default is SPEC_FULL)

digest()
  • Cython signature: size_t digest(AASequence & protein, libcpp_vector[AASequence] & output)

  • Cython signature: size_t digest(AASequence & protein, libcpp_vector[AASequence] & output, size_t min_length, size_t max_length)

Parameters
  • protein – Sequence to digest

  • output – Digestion products (peptides)

  • min_length – Minimal length of reported products

  • max_length – Maximal length of reported products (0 = no restriction)

Returns

Number of discarded digestion products (which are not matching length restrictions)

digestUnmodified()

Cython signature: size_t digestUnmodified(StringView sequence, libcpp_vector[StringView] & output, size_t min_length, size_t max_length)

Parameters
  • sequence – Sequence to digest

  • output – Digestion products

  • min_length – Minimal length of reported products

  • max_length – Maximal length of reported products (0 = no restriction)

Returns

Number of discarded digestion products (which are not matching length restrictions)

getEnzymeName()

Cython signature: String getEnzymeName() Returns the enzyme for the digestion

getMissedCleavages()

Cython signature: size_t getMissedCleavages() Returns the number of missed cleavages for the digestion

getSpecificity()

Cython signature: Specificity getSpecificity() Returns the specificity for the digestion

getSpecificityByName()

Cython signature: Specificity getSpecificityByName(String name) Returns the specificity by name. Returns SPEC_UNKNOWN if name is not valid

isValidProduct()
  • Cython signature: bool isValidProduct(AASequence protein, size_t pep_pos, size_t pep_length, bool ignore_missed_cleavages, bool methionine_cleavage)

param protein

Protein sequence

param pep_pos

Starting index of potential peptide

param pep_length

Length of potential peptide

param ignore_missed_cleavages

Do not compare MC’s of potential peptide to the maximum allowed MC’s

param allow_nterm_protein_cleavage

Regard peptide as n-terminal of protein if it starts only at pos=1 or 2 and protein starts with ‘M’

param allow_random_asp_pro_cleavage

Allow cleavage at D|P sites to count as n/c-terminal

returns

True if peptide has correct n/c terminals (according to enzyme, specificity and above flags) - Cython signature: bool isValidProduct(String protein, size_t pep_pos, size_t pep_length, bool ignore_missed_cleavages, bool methionine_cleavage)

Forwards to isValidProduct using protein.toUnmodifiedString()

  • Cython signature: bool isValidProduct(String sequence, int pos, int length, bool ignore_missed_cleavages)

Parameters
  • protein – Protein sequence

  • pep_pos – Starting index of potential peptide

  • pep_length – Length of potential peptide

  • ignore_missed_cleavages – Do not compare MC’s of potential peptide to the maximum allowed MC’s

Returns

True if peptide has correct n/c terminals (according to enzyme, specificity and missed cleavages)

peptideCount()

Cython signature: size_t peptideCount(AASequence & protein) Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings

setEnzyme()
  • Cython signature: void setEnzyme(String name) Sets the enzyme for the digestion (by name)

  • Cython signature: void setEnzyme(DigestionEnzyme * enzyme) Sets the enzyme for the digestion

setMissedCleavages()

Cython signature: void setMissedCleavages(size_t missed_cleavages) Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when log model is used

setSpecificity()

Cython signature: void setSpecificity(Specificity spec) Sets the specificity for the digestion (default is SPEC_FULL)