[Figure: Animal cell]

Protein Sorting

Different compartments in eukaryotic cells contain different sets of proteins. Even prokaryotic cells have differences between cytosolic proteins, membrane proteins, and secretory proteins. So how does the cell know where to put which proteins? In other words: Where is the information that makes protein sorting possible?
Proteins have intrinsic signals that govern their transport and localization in the cell. This discovery earned Günter Blobel the 1999 Nobel Prize in Physiology or Medicine. In the protein sorting group, we aim to characterize and predict these intrinsic signals, the "zip codes" of proteins.

[Figure: Plant cell]

The best-known and most ubiquitous protein “zip code” is the secretory signal peptide, which is found in all domains of life. In prokaryotes, it signals export across the plasma membrane, while in eukaryotes, it signals export across the endoplasmic reticulum (ER) membrane. The protein sorting group is responsible for the SignalP server, which predicts signal peptides and their cleavage sites. The SignalP web server is used more than 1,000 times daily, and thousands of users have downloaded the program for use on their own computers. The articles about SignalP, now in version 5.0, have been cited more than 20,000 times in total; see Henrik Nielsen's Google Scholar, Scopus, and Publons pages.

Other prediction methods from the protein sorting group include:

  • TargetP (now in version 2.0): predicts transit peptides for protein import into chloroplasts and mitochondria;
  • DeepLoc: predicts eukaryotic protein subcellular localization in 10 categories;
  • NetGPI: predicts GPI-anchoring, a post-translational modification responsible for attaching many proteins to the outer face of the plasma membrane.

The prediction methods are primarily artificial neural networks. We have used neural networks since the first SignalP version in 1996, but over the past decade they have experienced a revival, as deep learning became practical with implementations of convolutional neural networks (CNNs) and long short-term memory (LSTM) networks.
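
To give a rough idea of how a convolutional layer reads a protein sequence, the sketch below one-hot encodes the residues and slides a single filter along them. It is a toy illustration only and not the architecture of SignalP or any other server mentioned here: the function names, the hand-set "hydrophobic" filter weights, and the example sequence are all assumptions made up for this example. In a real network the weights would be learned from data.

```python
# Toy illustration only: one-hot encoding plus a single 1D convolution filter.
# Names, weights, and the example sequence are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode each residue as a 20-element vector containing a single 1.0."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        row[index[residue]] = 1.0  # raises KeyError for non-standard residues
        encoded.append(row)
    return encoded


def conv1d(encoded, kernel, bias=0.0):
    """Slide a (width x 20) kernel along the sequence without padding, then apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(len(AMINO_ACIDS)):
                total += kernel[offset][channel] * encoded[start + offset][channel]
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


def hydrophobic_kernel(width=3):
    """Hypothetical hand-set filter: weight 1 for A, I, L, V, F, M and 0 otherwise."""
    hydrophobic = set("AILVFM")
    row = [1.0 if aa in hydrophobic else 0.0 for aa in AMINO_ACIDS]
    return [row[:] for _ in range(width)]


if __name__ == "__main__":
    toy_sequence = "MKLLVAADEK"  # made-up sequence, not a real protein
    print(conv1d(one_hot(toy_sequence), hydrophobic_kernel()))
```

With this filter, each output value simply counts the hydrophobic residues in a window of three positions, so the activation is highest over the hydrophobic stretch of the toy sequence. A trained CNN stacks many such filters with learned weights, and an LSTM layer can then combine their outputs along the sequence.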

Other interests

Protein sorting isn't everything. The protein sorting group has also worked on:

  • Prediction of various protein post-translational modifications (PTMs).
  • Language modeling for protein sequences (see preprint).
  • Expressibility and solubility of proteins expressed in a production host.
  • Prediction of eukaryotic start codons in nucleotide sequences (the NetStart web server).
  • Prediction of protein structure (the NetSurfP server).
  • Prediction of various aspects of protein function.



Henrik Nielsen
Associate Professor
DTU Health Tech
+45 45 25 20 98


Jose Juan Almagro Armenteros
Guest Postdoc
DTU Health Tech