Some of you may have noticed that the SIRIUS 4.4 GUI sometimes failed to start and did not report any error.
This might be due to old, incompatible configs (.sirius directory) from version 4.0.1. SIRIUS 4.4.21 fixes this problem and now uses a separate config directory (.sirius-4.4). It is now possible to use versions 4.0.1 and 4.4.x side by side on the same system without them interfering with each other.
Since we fixed the deadlocks of the SIRIUS GUI on Mac with build 4.4.18, the SIRIUS 4.4 GUI is now also available for MacOS: https://bio.informatik.uni-jena.de/software/sirius/
We are happy to introduce CANOPUS, a tool for the comprehensive annotation of compound classes from MS/MS data (certain restrictions apply, see below). In principle, CANOPUS does something similar to CSI:FingerID: Whereas CSI:FingerID can tell you what substructures are part of the query compound, CANOPUS does so for compound classes. The differences between both tasks are subtle but have massive consequences. See this preprint for the details of this difference, how CANOPUS works, how well it works etc.
At present, CANOPUS predicts 1270 compound classes. In more detail, CANOPUS predicts ClassyFire compound classes. ClassyFire is not the first but, to the best of our knowledge, by far the most comprehensive approach to assign classes solely from structure. (This last point is key, as this allows us to assign thousands of classes for millions of molecular structures.) Please have a look at the ClassyFire class definitions if you use CANOPUS: Certain compound class definitions may not be what you expect. For example, we found that many phytosteroids are classified as bile acids in ClassyFire. While the biochemical origin of both classes is very different, they are structurally very similar and, therefore, represented by the same class in the ClassyFire ontology.
You can download, install and use CANOPUS through SIRIUS 4.4. You will notice a new tab where you can access, for each compound, all compound classes it does or does not belong to (and how sure we are about that). Fancier visualizations (see the preprint) will be made available with upcoming releases.
ps. Clearly, CANOPUS is comprehensive only within the limits of the LC-MS/MS technology: If a compound does not ionize, if no fragmentation spectrum is recorded in Data Dependent Acquisition, if a compound does not show any fragmentation, if multiple compounds are fragmented in a single spectrum etc., then CANOPUS cannot help you. We don’t do magic. Also, CANOPUS is limited by the available (structure and MS/MS) training data; but several years of thinking have been invested to get the most out of it.
We are happy to introduce ZODIAC, a tool for the comprehensive annotation of molecular formulas for complete LC-MS/MS runs. SIRIUS 4 is currently best-of-class for this task (as far as we know); but ZODIAC can do better. Unlike SIRIUS, which considers one compound at a time, ZODIAC considers a complete dataset, assuming that all compounds are somehow related (usually through biotransformations). See the preprint for evaluation and method details.
ZODIAC is about de novo annotations, meaning that we can assign molecular formulas for novel compounds currently absent from any structure database. ZODIAC takes into account “uncommon” elements, as in C24H47BrNO8P or C15H30ClIO5; both examples are indeed novel molecular formulas annotated by ZODIAC (and verified by us). Enter those molecular formulas into the PubChem search and see what you get back. (Fun fact: the first query now returns two entries created Jan 2020 based on our annotations.)
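To get a feeling for the two example formulas, the sketch below parses a flat molecular formula and sums monoisotopic masses. It is only an illustration and not part of SIRIUS or ZODIAC: the function names are made up for this example, the mass table covers just the elements that occur in the two formulas, and brackets, charges and isotope labels are not handled.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
# Deliberately limited to the elements in the two example formulas.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "Cl": 34.96885268,
    "Br": 78.9183371,
    "I": 126.904473,
}


def parse_formula(formula: str) -> dict[str, int]:
    """Split a flat formula such as 'C15H30ClIO5' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"not a flat molecular formula: {formula!r}")
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing number means the element occurs once.
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic masses of all atoms in the neutral formula."""
    total = 0.0
    for element, count in parse_formula(formula).items():
        if element not in MONOISOTOPIC_MASS:
            raise KeyError(f"no mass stored for element {element!r}")
        total += MONOISOTOPIC_MASS[element] * count
    return total


for example in ("C24H47BrNO8P", "C15H30ClIO5"):
    print(example, parse_formula(example), round(monoisotopic_mass(example), 4))
```

The masses are those of the neutral molecules; an instrument would report the m/z of an adduct such as [M+H]+, which this sketch does not compute.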
You can download, install and use ZODIAC through SIRIUS 4.4. Results of ZODIAC are simply displayed in the molecular formula tab, if you choose to run it. You should definitely use ZODIAC if you want to run CANOPUS: Assigning compound classes to novel compounds implies that some of the molecular formulas may be novel, too; and you do not want to provide CANOPUS with a wrong molecular formula.
ps. Sorry for tweeting early, WordPress sometimes has a mind of its own.
We are happy to announce that SIRIUS 4.4 is finally released. (Unfortunately, the MacOS version will have to wait a few more days.) There have been numerous changes and improvements, only a few of which can be mentioned here.
Probably the biggest change is that SIRIUS 4.4 now reads mzML files (“centroided” data) and processes complete LC-MS/MS datasets. You can use ProteoWizard to transform your dataset to mzML. This not only makes things easier for you; it also allows SIRIUS to extract isotope patterns and adduct information more thoroughly from the MS1 data. SIRIUS 4.4 also supports multi-run datasets and aligns runs.
If you are using the graphical user interface (GUI), you no longer have to care about installing (the correct version of) Java. It is now part of the installed SIRIUS software.
SIRIUS 4.4 uses the same project space for the command-line (CLI) and the GUI version, allowing you to use the SIRIUS GUI to browse through results computed with the CLI. The GUI also allows you to save your project and reload it later, including all previously computed results. Finally, you can export summary CSV and mzTab-M files for downstream analysis.
CSI:FingerID also had some updates:
Yesterday (27 April 2020) our university computer network experienced some issues and was unavailable for several hours. Not unexpectedly, this also resulted in the unavailability of the CSI:FingerID web service, website etc. As usual, computer problems cause more computer problems: It looks like today (28 April 2020) we still have certain issues restarting the CSI:FingerID workers. We hope to resolve this soon. We apologize for any inconvenience.
Some of you may have noticed that the SIRIUS 4.4 beta was released yesterday, April 17. This update is huge, so we are being particularly careful not to break too many things. (We will definitely break some things, so please report bugs using the SIRIUS GitHub repository or .) Here is what you can expect:
The International Max Planck Research School at the Max Planck Institute for Chemical Ecology in Jena is looking for PhD students. One of the projects is from our group on “making SIRIUS and CSI:FingerID GCMS-ready”. Deadline is May 08, 2020.
SIRIUS and CSI:FingerID are the best-of-class tools for MS-based compound identification in metabolomics, natural products and related fields. More than one million compound queries have been submitted to our web service, from over 3000 users and 47 countries. See our recent publication in Nature Methods (Dührkop et al., 2019).
Currently, our tools can only process tandem mass spectrometry data; extending them to Gas Chromatography Electron Ionization appears natural, but comes with numerous challenging problems from algorithmics and machine learning. This will be done in cooperation with the group of Georg Pohnert, see his recent publication in Nature (Thume et al., 2018).
We are searching for motivated candidates from bioinformatics, machine learning, cheminformatics and/or computer science who want to work in this exciting, quickly evolving interdisciplinary field. Please contact Sebastian Böcker in case of questions.
Half a position is being paid by the IMPRS; this will be supplemented by funding from our chair to 2/3 TV-L E13. (Note that the cost of living in East Germany is still considerably lower than in West Germany.) Jena is a beautiful city and wine is grown in the region: https://www.youtube.com/watch?v=DQPafhqkabc.
SIRIUS & CSI:FingerID: https://bio.informatik.uni-jena.de/software/sirius/
Literature: https://bio.informatik.uni-jena.de/publications/ and https://bio.informatik.uni-jena.de/textbook-algoms/
It’s been a while since SIRIUS 4 received its last update. We are excited to announce that SIRIUS 4.4 is coming soon.
It comes with many new features, e.g.:
To provide user-friendly but also flexible and customizable access to the different tools, we completely redesigned the command line interface (CLI).
We know that this might break your workflows, and therefore we provide an early-access version of the CLI that can be used for testing and adapting your workflows:
You will also find an updated version of the manual, which is still work in progress but already contains an updated section on the new CLI.
No worries: even once SIRIUS 4.4 is released (as soon as the GUI is ready), version 4.0.1 will still be available for some time.
If you find bugs or have any feedback feel free to open an issue on the SIRIUS GitHub repository or contact us via .
A preprint of our paper “ZODIAC: database-independent molecular formula annotation using Gibbs sampling reveals unknown small molecules.” is now available: https://doi.org/10.1101/842740
ZODIAC takes advantage of the fact that an organism produces related metabolites. ZODIAC builds upon SIRIUS and reranks molecular formula candidates, optimizing annotations on whole datasets. By applying ZODIAC to multiple datasets, we greatly increased the number of correct annotations and identified novel molecular formulas which are not even present in PubChem.
ZODIAC will be made available in an upcoming release of the SIRIUS software.
Sebastian, Kai, Martin and Marcus are attending the German Conference on Bioinformatics in Heidelberg. We look forward to a great conference.
Marcus (with the help of The People) wrote a not-too-short, not-too-shabby HowTo document on, well, how to use SIRIUS 4 and CSI:FingerID. This will be published as a book chapter in a few months, but check out a preprint here.
I have just uploaded a new version (0.8.3) of the Lecture Notes on Algorithmic Mass Spectrometry. As expected, I did not have too much time to work on it (them?) during lecture time, which is luckily over now. It is a lot of small improvements. Also, Magnus Palmblad was kind enough to take an expert look through the isotope pattern sections. Unfortunately, the stuff that was missing from the previous version is still missing now…
Meet Markus at the ISMB/ECCB 2019 in Basel.
On Tuesday, Markus will give a talk about “SIRIUS 4: turning tandem mass spectra into metabolite structure information”.
There is also a corresponding poster in Session A (J-06) which will be presented on Tuesday 6:00pm-8:00pm.
Meet Marcus and Sebastian at the conference of the Metabolomics Society 2019.
On Monday and Tuesday, Marcus will present a poster (539) about SIRIUS 4 and turning tandem mass spectra into metabolite structure information.
The idea of the project is to integrate retention times from liquid chromatography into the SIRIUS/CSI:FingerID identification pipeline. Literally hundreds of papers have been published on the topic of retention time prediction, but all of them fail to provide predictions that are transferable across chromatography conditions and compound classes; see Héberger’s review (Journal of Chromatography A, 2007) where he speaks rather frankly about the malpractices of publishing such RT-prediction methods. On the other hand, retention times can indeed be used to further boost CSI:FingerID’s identification performance. Also, transferable retention prediction is not impossible, as we have shown here. The trick is not to try to predict retention time (which is extremely dependent on instrument parameters etc) but rather retention order.
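To illustrate why retention order is a more transferable target than retention time, here is a minimal sketch. It is not the method used in CSI:FingerID; the compound names, retention times and the function name are hypothetical. It measures the fraction of compound pairs that elute in the same order in two runs, a quantity that stays at 1.0 whenever the two setups differ only by a monotone change of the time axis.

```python
from itertools import combinations


def pairwise_order_agreement(rt_a: dict[str, float], rt_b: dict[str, float]) -> float:
    """Fraction of shared compound pairs that elute in the same order in both runs.

    Pairs that are tied in either run are counted as disagreements.
    """
    shared = sorted(set(rt_a) & set(rt_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        raise ValueError("need at least two compounds measured in both runs")
    same = sum(
        1
        for x, y in pairs
        # Same sign of both differences means same elution order.
        if (rt_a[x] - rt_a[y]) * (rt_b[x] - rt_b[y]) > 0
    )
    return same / len(pairs)


# Hypothetical retention times in minutes for four compounds on two setups.
short_gradient = {"A": 1.2, "B": 2.5, "C": 4.1, "D": 6.3}
long_gradient = {"A": 3.0, "B": 7.8, "C": 9.5, "D": 21.0}
# A third hypothetical setup where compounds B and C swap places.
other_column = {"A": 3.0, "B": 9.9, "C": 9.5, "D": 21.0}

print(pairwise_order_agreement(short_gradient, long_gradient))
print(pairwise_order_agreement(short_gradient, other_column))
```

The absolute times in the first two runs differ by up to a factor of three, yet every pair keeps its order; in the third run, only the single swapped pair lowers the agreement.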
We are searching for a qualified and motivated PhD student who wants to accept this challenge. (S)he should be knowledgeable in machine learning and preferably also bioinformatics in general; biochemistry knowledge is clearly also a plus. We believe that this can be the next big thing to further push CSI:FingerID’s performance. Please contact Sebastian or Kathrin in case you are interested and qualified.
Meet Kai, Markus and Martin at ASMS 2019.
On Wednesday Kai presents a poster (WP-408) about SIRIUS 4 and how it turns tandem mass spectra into metabolite structure information.
Some of you might have noticed problems with the