Identify peptides and derivatives from small molecule datasets

PepSift

Summary

Identify peptides and their derivatives from small molecule datasets.

Installation

pip install pepsift

Usage

PepSift relies on multiple criteria defining different types of amino acids and polymers thereof.

Five levels are currently available, listed from most to least stringent:

| level | description | comment |
|---|---|---|
| SiftLevel.NaturalLAminoAcids | natural L-amino acids and peptides thereof | e.g. identify L-Alanine or the sequence ACDEFGHIKLMNPQRSTVWY |
| SiftLevel.NaturalLDAminoAcids | natural L- and D-amino acids and peptides thereof | e.g. identify L-Alanine or the sequences D-A L-W, L-H D-Q, D-M D-K |
| SiftLevel.NaturalAminoAcidDerivatives | derivatives of natural L- and D-amino acids and peptides thereof | i.e. any compound containing a canonical amino acid/peptide backbone |
| SiftLevel.NonNaturalAminoAcidDerivatives | non-natural amino acid derivatives and peptides thereof | e.g. identify beta-homo-alanine or alpha-methyl-Tyr |
| SiftLevel.AllAmineAndAcid | compounds containing amine and carboxylic acid moieties | e.g. 3-[3-(2-Aminoethyl)cyclohexyl]propionic acid |

These levels allow for granular selection of different types of amino acids/peptides.
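As a rough illustration of selecting compounds from a dataset at one chosen level, the sketch below splits a list of SMILES strings into peptide-like, other, and unparsable entries. The helper names `sift_smiles` and `sift_with_pepsift` are hypothetical and not part of PepSift; only `PepSift(...)`, `SiftLevel` and `is_peptide` are taken from this README. The parser and the predicate are passed in as plain callables so the helper can be tried without RDKit or PepSift installed.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def sift_smiles(
    smiles_list: Iterable[str],
    parse: Callable[[str], Optional[object]],
    is_peptide: Callable[[object], bool],
) -> Tuple[List[str], List[str], List[str]]:
    """Split SMILES into (peptide-like, other, unparsable) lists."""
    hits: List[str] = []
    misses: List[str] = []
    invalid: List[str] = []
    for smiles in smiles_list:
        mol = parse(smiles)
        if mol is None:
            # RDKit's Chem.MolFromSmiles returns None when parsing fails
            invalid.append(smiles)
        elif is_peptide(mol):
            hits.append(smiles)
        else:
            misses.append(smiles)
    return hits, misses, invalid


def sift_with_pepsift(smiles_list: Iterable[str]):
    """Hypothetical wiring to RDKit and PepSift (both must be installed)."""
    from rdkit import Chem
    from pepsift import PepSift, SiftLevel

    # The level chosen here is only an example; any SiftLevel can be used
    sifter = PepSift(SiftLevel.NaturalAminoAcidDerivatives)
    return sift_smiles(smiles_list, Chem.MolFromSmiles, sifter.is_peptide)
```

Keeping unparsable entries in a separate list avoids passing `None` to `is_peptide` and makes it easy to report which records were skipped.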


The decreasing stringency of `SiftLevel` criteria is exemplified below.

```python
from pepsift import PepSift, SiftLevel
from rdkit import Chem

# One sifter per level, from most to least stringent
ps1 = PepSift(SiftLevel.NaturalLAminoAcids)
ps2 = PepSift(SiftLevel.NaturalLDAminoAcids)
ps3 = PepSift(SiftLevel.NaturalAminoAcidDerivatives)
ps4 = PepSift(SiftLevel.NonNaturalAminoAcidDerivatives)
ps5 = PepSift(SiftLevel.AllAmineAndAcid)

mols = [Chem.MolFromSmiles('C[C@@H](C(=O)O)N'),        # L-Ala
        Chem.MolFromSmiles('C[C@H](C(=O)O)N'),         # D-Ala
        Chem.MolFromSmiles('C[C@@H](CN)C(=O)O'),       # Beta-homo-Ala
        Chem.MolFromSmiles('CC(C)(C(=O)O)N'),          # Alpha-methyl-Ala
        Chem.MolFromSmiles('NCCCCCCCCCCCCCCCC(=O)O'),  # Amino-hexadecanoic acid
        Chem.MolFromSmiles('c1ccccc1'),                # Benzene
        ]

for mol in mols:
    print((ps1.is_peptide(mol),
           ps2.is_peptide(mol),
           ps3.is_peptide(mol),
           ps4.is_peptide(mol),
           ps5.is_peptide(mol)))

# L-Ala
# (True, True, True, True, True)
# D-Ala
# (False, True, True, True, True)
# Beta-homo-Ala
# (False, False, True, True, True)
# Alpha-methyl-Ala
# (False, False, False, True, True)
# Amino-hexadecanoic acid
# (False, False, False, False, True)
# Benzene
# (False, False, False, False, False)
```
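Because the levels are ordered from most to least stringent, a molecule can be labelled with the first level that accepts it. The following is a minimal sketch of that idea; `most_stringent_level` and `pepsift_checks` are hypothetical helper names, not part of PepSift, and the only library calls assumed are `PepSift(level)` and `is_peptide(mol)` as used above.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

Mol = TypeVar("Mol")


def most_stringent_level(
    mol: Mol,
    checks: Iterable[Tuple[str, Callable[[Mol], bool]]],
) -> Optional[str]:
    """Return the name of the first check that accepts mol, or None.

    checks must be ordered from most to least stringent.
    """
    for name, accepts in checks:
        if accepts(mol):
            return name
    return None


def pepsift_checks() -> List[Tuple[str, Callable[[object], bool]]]:
    """Hypothetical wiring: one PepSift predicate per level (needs pepsift)."""
    from pepsift import PepSift, SiftLevel

    levels = [
        ("NaturalLAminoAcids", SiftLevel.NaturalLAminoAcids),
        ("NaturalLDAminoAcids", SiftLevel.NaturalLDAminoAcids),
        ("NaturalAminoAcidDerivatives", SiftLevel.NaturalAminoAcidDerivatives),
        ("NonNaturalAminoAcidDerivatives", SiftLevel.NonNaturalAminoAcidDerivatives),
        ("AllAmineAndAcid", SiftLevel.AllAmineAndAcid),
    ]
    return [(name, PepSift(level).is_peptide) for name, level in levels]
```

With the results listed above, `most_stringent_level(mol, pepsift_checks())` would be expected to label D-Ala as `NaturalLDAminoAcids` and benzene as `None`.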

:warning: Any peptide containing a natural amino acid is considered a derivative of natural amino acids, even if it also contains non-natural amino acids.

