
Estimate protein copy numbers from deep-profiling MS experiments using the Proteomic Ruler algorithm from Wiśniewski, J. R., Hein, M. Y., Cox, J. and Mann, M. (2014) A “Proteomic Ruler” for Protein Copy Number and Concentration Estimation without Spike-in Standards. Mol Cell Proteomics 13, 3497–3506.

Project description

Proteomic Ruler

A Python implementation of the Proteomic Ruler algorithm available in Perseus, as described in Wiśniewski, J. R., Hein, M. Y., Cox, J. and Mann, M. (2014) A “Proteomic Ruler” for Protein Copy Number and Concentration Estimation without Spike-in Standards. Mol Cell Proteomics 13, 3497–3506. It is used to estimate protein copy numbers from deep-profiling experiments.

Requirements

Python >= 3.9

Installation

pip install proteomicruler

Usage

To use the package, first load the input data into a pandas.DataFrame object. The following basic parameters are also required:

  • accession_id_col - column name that contains protein accession ids
  • mw_col - column name that contains molecular weight of proteins
  • ploidy - ploidy of the organism
  • total_cellular_protein_concentration - total cellular protein concentration used for calculation of total volume
  • intensity_columns - list of column names that contain sample intensities

import pandas as pd

# column with the protein accession ids; used as unique index and to directly fetch mw data from UniProt
accession_id_col = "Protein IDs"

# molecular weight column name
mw_col = "Mass"

# ploidy number
ploidy = 2

# cellular protein concentration used for calculation of total volume
total_cellular_protein_concentration = 200

filename = r"example_data\example_data.tsv"  # example data from Perseus
df = pd.read_csv(filename, sep="\t")

# select the 16 sample intensity columns, starting at positional index 57
intensity_columns = df.columns[57:57+16]
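
Positional slicing such as df.columns[57:57+16] depends on the exact column order of the export. As a possible alternative, the intensity columns could be selected by name. The sketch below assumes the sample columns share a common prefix (here "Intensity ", a hypothetical convention; check the headers of your own file) and uses a small made-up table instead of the example data.

```python
import pandas as pd

# Toy table standing in for a real export; all column names are made up.
toy = pd.DataFrame({
    "Protein IDs": ["P1", "P2"],
    "Mass": [50000.0, 12500.0],
    "Intensity": [30.0, 70.0],            # summed column, not a single sample
    "Intensity sample_A": [10.0, 20.0],
    "Intensity sample_B": [20.0, 50.0],
})

# Assumed prefix; the trailing space excludes the plain "Intensity" column.
prefix = "Intensity "
selected_columns = [c for c in toy.columns if c.startswith(prefix)]
print(selected_columns)
```

A list built this way can be passed wherever intensity_columns is used, provided the prefix really identifies only the sample columns.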

If the data do not contain molecular weight information, it must be fetched from UniProt before the calculation.

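from proteomicRuler.ruler import add_mw

# fetch molecular weights from UniProt using the accession id column
df = add_mw(df, accession_id_col)
# keep only rows that have a molecular weight
df = df[pd.notnull(df[mw_col])]
# make sure the molecular weight is numeric
df[mw_col] = df[mw_col].astype(float)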

The Ruler object can be created by passing the DataFrame object and the required parameters.

from proteomicRuler.ruler import Ruler

ruler = Ruler(df, intensity_columns, mw_col, accession_id_col, ploidy, total_cellular_protein_concentration)
# write ruler.df to a tab-separated file
ruler.df.to_csv("output.txt", sep="\t", index=False)
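
For orientation, the scaling idea described in the cited paper can be sketched in a few lines: the summed histone signal is taken as a proxy for the DNA mass per cell, which sets the scale for all other proteins. This is only an illustration of that idea, not the code used by this package. The function name, the boolean histone column, the genome size and the average base-pair mass are all assumptions made for the example, and molecular weights are assumed to be in Da.

```python
import pandas as pd

AVOGADRO = 6.02214076e23   # molecules per mole
BASE_PAIR_MASS = 650.0     # g/mol per base pair; approximate, assumed value
GENOME_SIZE_BP = 3.2e9     # haploid genome size in base pairs; assumed value


def ruler_sketch(df, intensity_col, mw_col, is_histone_col, ploidy, protein_conc_g_per_l):
    """Hypothetical helper: histone-scaled copy numbers and cell volume."""
    # DNA mass per cell in grams
    dna_mass = ploidy * GENOME_SIZE_BP * BASE_PAIR_MASS / AVOGADRO
    # summed MS signal of the rows flagged as histones
    histone_signal = df.loc[df[is_histone_col], intensity_col].sum()
    # grams of protein per cell represented by each row
    protein_mass = df[intensity_col] * dna_mass / histone_signal
    # molecules per cell; mw_col is assumed to be in Da (g/mol)
    copy_number = protein_mass * AVOGADRO / df[mw_col]
    # total volume in litres from total protein mass and concentration in g/l
    cell_volume_l = protein_mass.sum() / protein_conc_g_per_l
    return copy_number, cell_volume_l


# Made-up two-protein table for illustration only.
toy = pd.DataFrame({
    "Protein IDs": ["H", "X"],
    "Mass": [15000.0, 30000.0],
    "Intensity sample_A": [100.0, 300.0],
    "is_histone": [True, False],
})
copies, volume_l = ruler_sketch(toy, "Intensity sample_A", "Mass", "is_histone", 2, 200.0)
print(copies.tolist(), volume_l)
```

In this sketch the ploidy only enters through the DNA mass, and the total cellular protein concentration only enters through the volume estimate.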

It is also possible to use the package through the command line interface.

Usage: ruler [OPTIONS]

Options:
  -i, --input FILENAME          Input file containing intensity of samples and
                                uniprot accession ids
  -o, --output FILENAME         Output file
  -p, --ploidy INTEGER          Ploidy of the organism
  -t, --total-cellular FLOAT    Total cellular protein concentration
  -m, --mw-column TEXT          Molecular weight column name
  -a, --accession-id-col TEXT   Accession id column name
  -c, --intensity-columns TEXT  Intensity columns list delimited by commas
  -g, --get-mw                  Get molecular weight from uniprot
  --help                        Show this message and exit.
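
The -c option expects the intensity columns as a single comma-delimited string. As a small sketch, such a string and the full command line could be assembled in Python from a list of column names. The file names and column names below are hypothetical, and only options from the help text above are used. The command is printed, not executed.

```python
import shlex

# Hypothetical column names; replace with the headers from your own file.
intensity_columns = ["Intensity sample_A", "Intensity sample_B"]

args = [
    "ruler",
    "-i", "example_data.tsv",            # hypothetical input path
    "-o", "output.txt",
    "-p", "2",
    "-t", "200",
    "-m", "Mass",
    "-a", "Protein IDs",
    "-c", ",".join(intensity_columns),   # one comma-delimited argument
]

# shlex.join quotes arguments containing spaces for a POSIX shell.
command = shlex.join(args)
print(command)
```

Column names that themselves contain commas would not survive this delimiter, and quoting rules differ on Windows shells.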

