2024 bioinformatics group Mittweida
SequenceCEROSENE - color encoding of residues obtained by spatial neighborhood embedding

Using SequenceCEROSENE is straightforward. You can upload a single structure or an archive containing multiple structures. All common archive formats are supported (.tar, .tar.gz, .zip, .7z, .rar; handled by the Java Archiving Library and Junrar). At the moment, internal processing requires the data to be in Protein Data Bank (PDB) format. Alternatively, you can specify any valid PDB identifier and let the server retrieve the corresponding structure data in the background. In the advanced settings section, you can choose which coordinates are used for the color embedding. By default, the beta-carbon coordinates are used for amino acids (the alpha-carbon for glycine), but you can instead choose alpha-carbons, side chain centroids, or terminal heavy atoms. Note that centroid coordinates are always used for ligands, RNA, and DNA!
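To illustrate the default choice of reference coordinates described above, here is a minimal Python sketch that picks the beta-carbon of each amino acid and falls back to the alpha-carbon for glycine. It is not the server's implementation. The function names reference_coordinates and atom_line and the sample residues are made up for this example, and the parsing only assumes the standard fixed-column layout of PDB ATOM records.

```python
def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format a minimal PDB ATOM record (hypothetical helper for the sample data)."""
    return "ATOM  %5d  %-3s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z)


def reference_coordinates(pdb_text):
    """Return {(chain, residue number, residue name): (x, y, z)}.

    Uses the CB atom of each residue, or CA when the residue is glycine.
    """
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other records
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        wanted = "CA" if res_name == "GLY" else "CB"
        if atom_name != wanted:
            continue
        key = (line[21], int(line[22:26]), res_name)
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep the first occurrence if alternate locations are present.
        coords.setdefault(key, xyz)
    return coords


# Made-up sample: one alanine (has CB) and one glycine (no CB).
sample = "\n".join([
    atom_line(1, "N", "ALA", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "ALA", "A", 1, 1.5, 0.0, 0.0),
    atom_line(3, "CB", "ALA", "A", 1, 2.0, 1.5, 0.5),
    atom_line(4, "N", "GLY", "A", 2, 3.0, 0.0, 0.0),
    atom_line(5, "CA", "GLY", "A", 2, 4.5, 0.25, 0.0),
])

coords = reference_coordinates(sample)
for key in sorted(coords):
    print(key, coords[key])
```

Side chain centroids or terminal heavy atoms would need the full atom list of each residue rather than a single atom name, so they are left out of this sketch.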


Further, you can specify which of the output formats SVG, PNG, PML, and CSV are generated. As rendering an SVG file in the background can take some time, especially for large structures, you may consider turning this output off. You do not have to provide an ID when submitting a job, but you can do so for your own convenience.


Upon submitting a job, you are redirected to the session page, where a bookmark URL and a list of processed and finished jobs are presented. You can return to this page via the View results tab or the bookmark URL. Your data is exclusively associated with your session ID and is removed from the server after 72 hours. Each job can be accessed by selecting it from the list and clicking the View results button. Again, each job can contain multiple processed structures if an archive has been provided.


On the result page, the generated color encodings are presented. If multiple structures have been processed in one job, each is presented in its own selectable tab. The ProteinViewer (PV, M. Biasini (2014), "PV - WebGL-based Protein Viewer") and the colorized sequences are fully interactive and interconnected, allowing you to select residues and regions of the protein in sequence and structure simultaneously. The Shift key can be used to select multiple residues and alter PV's behavior. Furthermore, the representations can be downloaded in PNG and SVG format. Raw data is available in CSV text format. Here, the chain ID, PDB residue index, one-letter code, reference coordinates, and RGB color values (ranging from 0 to 255) are reported for each residue or compound. The downloadable PML (PyMOL log file) can be useful to PyMOL users. The corresponding structure can be loaded into a PyMOL instance and colorized as proposed by SequenceCEROSENE using the PML script. This allows you to work offline with the generated color-encoded sequences.
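As a rough illustration of how the CSV data relates to PyMOL coloring, the following Python sketch turns per-residue RGB values into PyMOL set_color and color commands. The exact column layout of the downloaded CSV and the contents of the server's PML file are not documented here, so the header names (chain, resi, aa, x, y, z, r, g, b), the sample rows, the function name csv_to_pml, and the color names are all assumptions made for this example.

```python
import csv
import io

# Made-up sample in an ASSUMED column layout; the real CSV may differ.
SAMPLE_CSV = """chain,resi,aa,x,y,z,r,g,b
A,1,M,12.1,3.4,-0.8,255,128,0
A,2,G,13.0,4.9,1.2,0,64,255
"""


def csv_to_pml(csv_text, object_name="structure"):
    """Build PyMOL commands that color each residue by its RGB value."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Scale 0-255 integers to the 0.0-1.0 range.
        r, g, b = (int(row[c]) / 255.0 for c in ("r", "g", "b"))
        color_name = "cero_{}_{}".format(row["chain"], row["resi"])
        commands.append(
            "set_color {}, [{:.3f}, {:.3f}, {:.3f}]".format(color_name, r, g, b))
        commands.append(
            "color {}, {} and chain {} and resi {}".format(
                color_name, object_name, row["chain"], row["resi"]))
    return commands


pml = csv_to_pml(SAMPLE_CSV)
print("\n".join(pml))
```

The printed lines could be saved to a .pml file and run in PyMOL after loading the structure under the object name passed as object_name. For regular use, the PML file provided by the server is the simpler route.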


If you encounter any bugs or issues with the server, or simply find this service useful to you, please feel free to contact us. Any feedback is very much appreciated :)