It is important to know the molecular weight of a peptide or protein, its content of the amino acids that contribute to absorbance at 280 nm, and its molar absorptivity (extinction coefficient). "Extinction coefficient" is the older term; "absorptivity" is now generally used. Both can be defined as the optical density of a 1% or 0.1% solution of a substance at a specified wavelength, generally measured in a cuvette with a 1 cm path length. The molar absorptivity is the absorbance that a 1 M solution would be predicted to have, although such a concentration is impossible to attain with most proteins. This program takes a protein sequence as input and calculates the expected absorptivity of a 0.1% (1 mg/ml) solution of that protein. You can also enter an experimentally determined OD280 value, and the program will calculate the protein concentration. The program ignores numbers, spaces, punctuation and characters such as B or Z that do not correspond to one of the 20 genetically encoded amino acids. This version can also handle FASTA format sequences (see here for info on that); in other words, it ignores any line of text that starts with a ">" character. It can also process multiple FASTA format entries, counting only the sequence data and ignoring the text on each ">" line.
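The input handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual source: the function name `clean_sequence` and the constant `STANDARD_AMINO_ACIDS` are hypothetical, and the sketch assumes a header line is any line whose first character is ">".

```python
# Hypothetical sketch of the input cleaning described above (not the program's own code).

# One-letter codes of the 20 genetically encoded amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(text: str) -> str:
    """Return only the standard amino acid letters found in pasted text.

    Lines starting with ">" (FASTA headers) are skipped entirely, so several
    FASTA entries pasted together are concatenated into one sequence.
    Numbers, spaces, punctuation and letters such as B or Z are dropped.
    """
    residues = []
    for line in text.splitlines():
        if line.startswith(">"):
            continue  # FASTA header line: ignore everything on it
        for ch in line.upper():  # accept upper and lower case input
            if ch in STANDARD_AMINO_ACIDS:
                residues.append(ch)
    return "".join(residues)
```

For example, `clean_sequence(">entry 1\nMKW y12\n>entry 2\nbzx-AC")` keeps only the letters M, K, W, Y, A and C from the two sequence lines.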
Molecular Weight Calculation: It was actually quite interesting to research how to make this program. The atomic weights used for each element are from the International Union of Pure and Applied Chemistry (IUPAC) web site here. These are average atomic weights based on the average isotopic content found on Earth, which can of course vary somewhat. The values, in daltons, are H=1.00794, C=12.0107, N=14.0067, O=15.9994 and S=32.065. For phosphoproteins, P=30.973761. You will find that the results from this program are close to, but not quite the same as, those from other programs. If you just want the molecular weight to get an idea of where to look for the protein in a gel, any of these programs will be accurate enough. However, in the days of mass spectrometry it can be quite important to get very accurate results. The small differences between programs are due to several factors. One is that other programs may use slightly different values for the average atomic weight of each element, either older IUPAC numbers or simply rounded values. Another is that the molecular weight of a protein is a function of its ionization state. At neutral pH, the vast majority of aspartic and glutamic acid side chains have lost their acidic hydrogen, so a protein loses almost 1.00794 daltons for each aspartic and glutamic acid it contains. Similarly, at neutral pH the vast majority of lysine and arginine residues are protonated, that is, they carry an additional hydrogen, which adds 1.00794 daltons for each of these two amino acids. Finally, histidine becomes protonated at neutral pH, but only to a limited extent, roughly 50% at pH 7.0. So, when you get down to it, a protein only ever has an average molecular weight, which is a function of the local pH. At least we are telling you exactly how our program works...
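As a rough illustration of the calculation described above, the following Python sketch sums the atomic weights quoted in the text over the atoms of each residue and adds one water for the free N- and C-termini. It is an assumption-laden sketch, not the program's own code: the names `RESIDUE_ATOMS`, `molecular_weight` and `neutral_ph_adjust` are hypothetical, the residue formulas are the standard ones for amino acids minus water, and the optional pH adjustment handles only Asp, Glu, Lys and Arg as whole hydrogens (histidine and the termini are left out).

```python
from collections import Counter

# Average atomic weights in daltons, as quoted in the text above.
ATOMIC_WEIGHT = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994, "S": 32.065}
ELEMENTS = ("C", "H", "N", "O", "S")

# Atom counts (C, H, N, O, S) for each residue, i.e. the free amino acid minus H2O.
RESIDUE_ATOMS = {
    "A": (3, 5, 1, 1, 0),  "R": (6, 12, 4, 1, 0), "N": (4, 6, 2, 2, 0),
    "D": (4, 5, 1, 3, 0),  "C": (3, 5, 1, 1, 1),  "E": (5, 7, 1, 3, 0),
    "Q": (5, 8, 2, 2, 0),  "G": (2, 3, 1, 1, 0),  "H": (6, 7, 3, 1, 0),
    "I": (6, 11, 1, 1, 0), "L": (6, 11, 1, 1, 0), "K": (6, 12, 2, 1, 0),
    "M": (5, 9, 1, 1, 1),  "F": (9, 9, 1, 1, 0),  "P": (5, 7, 1, 1, 0),
    "S": (3, 5, 1, 2, 0),  "T": (4, 7, 1, 2, 0),  "W": (11, 10, 2, 1, 0),
    "Y": (9, 9, 1, 2, 0),  "V": (5, 9, 1, 1, 0),
}

WATER = 2 * ATOMIC_WEIGHT["H"] + ATOMIC_WEIGHT["O"]


def molecular_weight(sequence: str, neutral_ph_adjust: bool = False) -> float:
    """Average molecular weight in daltons of an unmodified, linear chain.

    `sequence` must contain only the 20 standard one-letter codes;
    any other character raises KeyError.
    """
    counts = Counter(sequence.upper())
    total = WATER  # one H2O for the free amino and carboxyl termini
    for residue, n in counts.items():
        atoms = RESIDUE_ATOMS[residue]
        total += n * sum(k * ATOMIC_WEIGHT[e] for k, e in zip(atoms, ELEMENTS))
    if neutral_ph_adjust:
        # Simplifying assumption: every Asp/Glu has lost one H and every
        # Lys/Arg has gained one H; His and the termini are ignored.
        h = ATOMIC_WEIGHT["H"]
        total -= h * (counts["D"] + counts["E"])
        total += h * (counts["K"] + counts["R"])
    return total
```

With these values a single glycine comes out at about 75.067 daltons, matching the free amino acid, which is a quick sanity check on the residue table.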
Calculating Absorptivity/Extinction Coefficients from Protein Sequence: Typically spectrophotometers measure the optical density of protein solutions at a wavelength of 280 nm. At this wavelength only Tryptophan, Tyrosine and Cystine absorb significantly. Tryptophan has the biggest influence, with a molar absorptivity of 5,500 M⁻¹cm⁻¹, while Tyrosine contributes only 1,490 M⁻¹cm⁻¹. Cystine consists of two disulfide-linked Cysteine residues and absorbs only 125 M⁻¹cm⁻¹. Proteins inside cells generally contain few or no Cystine residues, while most extracellular proteins contain many, so select the appropriate option using the buttons below. It turns out that Tryptophan is the rarest amino acid, with only one codon and an abundance of only about 1.3% in a typical genome. This means that a protein may have an unusually high amount of Tryptophan, or none at all, and the exact number has a huge impact on the absorptivity/extinction coefficient. It is therefore important, particularly for small proteins, to know the Tryptophan content. Similarly, Tyrosine has only two codons and is typically present at about 3.25% in typical genomes. Cysteine is also quite rare, with only two codons and a general abundance of about 2%. It is therefore quite possible for a protein to have no Tryptophan, Tyrosine or Cystine residues, in which case it cannot be accurately quantified using OD280 measurements. The basic method was published by Gill and von Hippel and is generally accurate to about 3%, assuming a favorable Tryptophan and Tyrosine content; if only Cystines are present, quantification is much less accurate. The original paper on which this method is based can be downloaded from here.
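The arithmetic described above can be sketched as follows. This is a minimal illustration, not the program's actual implementation: the function names are hypothetical, the per-residue values are the ones quoted in the text, and the sketch assumes that choosing "extracellular" means every pair of Cysteines forms one Cystine while "cytoplasmic" means no Cystines are counted. The molecular weight is passed in as a number so the example stands on its own.

```python
# Molar absorptivities at 280 nm in M^-1 cm^-1, as quoted in the text above.
EPSILON_TRP = 5500
EPSILON_TYR = 1490
EPSILON_CYSTINE = 125


def molar_absorptivity(sequence: str, extracellular: bool = False) -> int:
    """Predicted molar absorptivity at 280 nm (M^-1 cm^-1).

    Assumption: for an extracellular protein all Cysteines are paired into
    Cystines (an odd one out is left unpaired); otherwise none are.
    """
    seq = sequence.upper()
    n_cystine = seq.count("C") // 2 if extracellular else 0
    return (EPSILON_TRP * seq.count("W")
            + EPSILON_TYR * seq.count("Y")
            + EPSILON_CYSTINE * n_cystine)


def absorbance_0_1_percent(epsilon: float, mol_weight: float) -> float:
    """Absorbance of a 0.1% (1 mg/ml) solution in a 1 cm cuvette."""
    return epsilon / mol_weight


def concentration_mg_per_ml(od280: float, epsilon: float, mol_weight: float,
                            path_cm: float = 1.0) -> float:
    """Protein concentration in mg/ml from a measured OD280 (Beer-Lambert law)."""
    if epsilon == 0:
        raise ValueError("no Trp, Tyr or Cystine: OD280 cannot be used")
    molar = od280 / (epsilon * path_cm)  # mol/l
    return molar * mol_weight            # g/l, which equals mg/ml
```

For a hypothetical 20,000 dalton protein with a molar absorptivity of 10,000 M⁻¹cm⁻¹, a 1 mg/ml solution is predicted to read 0.5 at 280 nm, so a measured OD280 of 0.5 corresponds to 1 mg/ml.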
Please type or paste your protein sequence into the box below. It can be in upper or lower case, or a mixture of both. Also select the button for an extracellular or cytoplasmic protein, and enter an experimentally determined OD value if you wish.