SIRIUS and CSI:FingerID are offered to the public as freely available resources. (Re-)distribution of the methods, in whole or in part, for commercial purposes requires explicit permission of the authors and explicit acknowledgment of the source material and the original publications. We ask that users who use SIRIUS and CSI:FingerID cite the corresponding papers in any resulting publications.
The CSI:FingerID web service hosted by the Böcker group at https://www.csi-fingerid.uni-jena.de, which is used by default in SIRIUS, is for non-commercial use only. For commercial users, Bright Giant GmbH provides CSI:FingerID-related services that can be used with SIRIUS.
SIRIUS is a Java-based software framework for de novo identification of metabolites using single and tandem mass spectrometry. SIRIUS uses isotope pattern analysis to determine the molecular formula and then analyses the fragmentation pattern of a compound using fragmentation trees. Fragmentation trees can be uploaded to CSI:FingerID via a web service, and results can be displayed in the SIRIUS graphical user interface. (This is also possible using the command-line version of SIRIUS.) This is the recommended way of using CSI:FingerID.
SIRIUS+CSI:FingerID GUI and CLI - Version 4.5.1 (2020-11-24)
These versions already include the JRE. Just download, install/unpack and execute.
SIRIUS+CSI:FingerID Commandline only - Version 4.5.1 (2020-11-24)
These versions already include the JRE. Just download, install/unpack and execute.
On Windows and macOS, the installer version of SIRIUS (msi/pkg) is preferred but might require admin permissions. Since we do not pay Microsoft/Apple for certification, you might have to confirm that you want to trust software from an unknown source on Windows/macOS. On macOS, the option to confirm the execution of the installer (pkg) might be hidden under 'System Settings' -> 'Security & Privacy'.
Sources on GitHub
For SIRIUS 4.0.1 click here.
Integration of CSI:FingerID
Fragmentation trees and spectra can be uploaded directly from SIRIUS to a CSI:FingerID web service (without the need to access the CSI:FingerID website). Results are retrieved from the web service and can be displayed in the SIRIUS graphical user interface. This functionality is also available for the SIRIUS command-line tool. The training structures of the CSI:FingerID predictors are available through the CSI:FingerID WebAPI.
Training structures for positive ion mode:
Training structures for negative ion mode:
Fragmentation Tree Computation
The manual interpretation of tandem mass spectra is time-consuming and non-trivial. SIRIUS analyses the fragmentation pattern and computes hypothetical fragmentation trees, in which nodes are annotated with the molecular formulas of the fragments and arcs represent fragmentation events. SIRIUS allows for the automated and high-throughput analysis of small-compound MS data beyond elemental composition without requiring compound structures or a mass spectral database.
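The tree structure described above can be sketched in a few lines of Python. This is only an illustration of the data structure, not the SIRIUS implementation or its file format: the class name `FragmentNode`, the helper functions and the toy formulas are all hypothetical. Each node carries a molecular formula, and the label of each arc (the neutral loss) is the element-wise difference between parent and child formula.

```python
from dataclasses import dataclass, field


@dataclass
class FragmentNode:
    """One node of a toy fragmentation tree (hypothetical structure)."""
    formula: dict                                  # element symbol -> atom count
    children: list = field(default_factory=list)   # child FragmentNode objects


def format_formula(formula):
    """Render a formula with carbon first, hydrogen second, the rest alphabetically."""
    order = [e for e in ("C", "H") if e in formula]
    order += sorted(e for e in formula if e not in ("C", "H"))
    return "".join(e + (str(formula[e]) if formula[e] > 1 else "") for e in order)


def neutral_loss(parent, child):
    """Formula of the loss on the arc parent -> child (element-wise difference)."""
    elements = set(parent.formula) | set(child.formula)
    diff = {e: parent.formula.get(e, 0) - child.formula.get(e, 0) for e in elements}
    if any(n < 0 for n in diff.values()):
        raise ValueError("child is not a sub-formula of its parent")
    return {e: n for e, n in diff.items() if n > 0}


def list_arcs(node):
    """Depth-first list of (parent formula, loss formula, child formula) triples."""
    arcs = []
    for child in node.children:
        arcs.append((format_formula(node.formula),
                     format_formula(neutral_loss(node, child)),
                     format_formula(child.formula)))
        arcs.extend(list_arcs(child))
    return arcs


# Toy tree with made-up fragments, not taken from a measured spectrum.
leaf = FragmentNode({"C": 5, "H": 8, "O": 4})
middle = FragmentNode({"C": 6, "H": 10, "O": 5}, [leaf])
root = FragmentNode({"C": 6, "H": 12, "O": 6}, [middle])

for parent, loss, child in list_arcs(root):
    print(f"{parent} --[-{loss}]--> {child}")
# C6H12O6 --[-H2O]--> C6H10O5
# C6H10O5 --[-CH2O]--> C5H8O4
```

Storing formulas as element counts keeps the loss computation a simple subtraction and makes it easy to reject a child that is not a sub-formula of its parent.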
Isotope Pattern Analysis
SIRIUS deduces molecular formulas of small compounds by ranking isotope patterns from high-resolution mass spectra. After preprocessing, the output of a mass spectrometer is a list of peaks that correspond to the masses of the sample molecules and their abundances. In principle, elemental compositions of small molecules can be identified using only accurate masses. However, even with very high mass accuracy, many candidate formulas are obtained in higher mass regions. High-resolution mass spectrometry allows us to determine the isotope pattern of a sample molecule with outstanding accuracy and to apply this information to identify its elemental composition. SIRIUS can be downloaded either as a graphical user interface (see Sirius GUI) or as a command-line tool.
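To see why accurate mass alone is often not enough, the following brute-force sketch enumerates all CHNO formulas whose monoisotopic mass lies within a ppm window of a query mass. It is a simplified illustration and not the decomposition algorithm used by SIRIUS: the function name `decompose`, the restriction to four elements, the use of neutral masses (ionization is ignored) and the simple hydrogen bound H <= 2C + N + 2 are assumptions made for this example.

```python
# Monoisotopic masses of the most abundant isotopes, in Da.
MONO = {"C": 12.0, "H": 1.00782503223, "N": 14.00307400443, "O": 15.99491461957}


def decompose(mass, ppm=5.0):
    """Return all (C, H, N, O) count tuples matching a neutral mass within +/- ppm."""
    tol = mass * ppm * 1e-6
    upper = mass + tol
    hits = []
    for c in range(int(upper // MONO["C"]) + 1):
        rest_c = upper - c * MONO["C"]
        for n in range(int(rest_c // MONO["N"]) + 1):
            rest_n = rest_c - n * MONO["N"]
            for o in range(int(rest_n // MONO["O"]) + 1):
                heavy = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                # The tolerance is far below 1 Da, so at most one hydrogen count fits.
                h = round((mass - heavy) / MONO["H"])
                if h < 0 or h > 2 * c + n + 2:   # crude saturation bound (assumption)
                    continue
                if abs(heavy + h * MONO["H"] - mass) <= tol:
                    hits.append((c, h, n, o))
    return hits


# Compare the number of candidates at a low and at a higher query mass.
for query in (180.063388, 600.2):
    print(query, len(decompose(query)))
```

Running the loop for increasing query masses shows how the candidate list grows even at a fixed ppm tolerance, which is the ambiguity that isotope pattern analysis is meant to reduce.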
Kai Dührkop, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker. Sirius 4: turning tandem mass spectra into metabolite structure information. Nat Methods, 16, 2019.
Kai Dührkop, Louis-Félix Nothias, Markus Fleischauer, Raphael Reher, Marcus Ludwig, Martin A. Hoffmann, Daniel Petras, William H. Gerwick, Juho Rousu, Pieter C. Dorrestein and Sebastian Böcker. Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Nature Biotechnology, 2020. (Cite if you are using CANOPUS)
Marcus Ludwig, Louis-Félix Nothias, Kai Dührkop, Irina Koester, Markus Fleischauer, Martin A. Hoffmann, Daniel Petras, Fernando Vargas, Mustafa Morsy, Lihini Aluwihare, Pieter C. Dorrestein and Sebastian Böcker. ZODIAC: database-independent molecular formula annotation using Gibbs sampling reveals unknown small molecules. bioRxiv, 2019. (Cite if you are using ZODIAC)
Yannick Djoumbou Feunang, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, Christoph Steinbeck, Shankar Subramanian, Evan Bolton, Russell Greiner and David S. Wishart. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J Cheminform, 8, 2016. (Cite if you are using CANOPUS)
Kai Dührkop and Sebastian Böcker. Fragmentation trees reloaded. J Cheminform, 8:5, 2016. (Cite this for fragmentation pattern analysis and fragmentation tree computation)
Kai Dührkop, Huibin Shen, Marvin Meusel, Juho Rousu, and Sebastian Böcker. Searching molecular structure databases with tandem mass spectra using CSI:FingerID. Proc Natl Acad Sci U S A, 112(41):12580-12585, 2015. (cite this when using CSI:FingerID)
Sebastian Böcker, Matthias C. Letzel, Zsuzsanna Lipták and Anton Pervukhin. SIRIUS: decomposing isotope patterns for metabolite identification. Bioinformatics (2009) 25 (2): 218-224. (Cite this for isotope pattern analysis)
Marcus Ludwig, Kai Dührkop and Sebastian Böcker. Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints. Bioinformatics, 34(13):i333-i340, 2018. Proc. of Intelligent Systems for Molecular Biology (ISMB 2018). (Cite for CSI:FingerID Scoring)
W. Timothy J. White, Stephan Beyer, Kai Dührkop, Markus Chimani and Sebastian Böcker. Speedy Colorful Subtrees. In Proc. of Computing and Combinatorics Conference (COCOON 2015), volume 9198 of Lect Notes Comput Sci, pages 310-322. Springer, Berlin, 2015. (cite this on why computations are swift, even on a laptop computer)
Huibin Shen, Kai Dührkop, Sebastian Böcker and Juho Rousu. Metabolite Identification through Multiple Kernel Learning on Fragmentation Trees. Bioinformatics, 30(12):i157-i164, 2014. Proc. of Intelligent Systems for Molecular Biology (ISMB 2014). (Introduces the machinery behind CSI:FingerID)
Imran Rauf, Florian Rasche, François Nicolas and Sebastian Böcker. Finding Maximum Colorful Subtrees in practice. J Comput Biol, 20(4):1-11, 2013. (More, earlier work on why computations are swift today)
Markus Heinonen, Huibin Shen, Nicola Zamboni and Juho Rousu. Metabolite identification and molecular fingerprint prediction through machine learning. Bioinformatics, 28(18):2333-2341, 2012. (Introduces the idea of predicting molecular fingerprints from tandem MS data)
Florian Rasche, Aleš Svatoš, Ravi Kumar Maddula, Christoph Böttcher, and Sebastian Böcker. Computing Fragmentation Trees from Tandem Mass Spectrometry Data. Analytical Chemistry (2011) 83 (4): 1243–1251. (Cite this for introduction of fragmentation trees as used by SIRIUS)
Sebastian Böcker and Florian Rasche. Towards de novo identification of metabolites by analyzing tandem mass spectra. Bioinformatics (2008) 24 (16): i49-i55. (The very first paper to mention fragmentation trees as used by SIRIUS)
Starting with version 4.4.27, SIRIUS is licensed under the GNU Affero General Public License (AGPL). If you integrate SIRIUS into other software, we strongly encourage you to make the usage of SIRIUS as well as the literature to cite transparent to the user.
- GUI: Progress information for running jobs
- GUI: More detailed Visualisation of what has already been computed
- more bugfixes 😉
- improvement: CLP native libs are now compatible with glibc 2.12+ (instead of 2.18+)
- fix: project-spaces with outdated fingerprint versions (e.g. from SIRIUS 4.4) are now handled correctly and can be converted.
- fix: database formulas could be used as candidates even if they were incompatible with the adduct
- fix: mzml/mzxml files are now shown in input file selector
feature: CANOPUS: for negative ion mode data
feature: Bayesian (individual tree) scoring is now the default for ranking structure candidates
update: Structure DB update due to major changes in PubChem standardization since the last one.
- feature: COCONUT, NORMAN and Super Natural are now officially supported
feature: Custom-DB importer View (GUI)
feature: mgf export for Feature Based Molecular Networking is now available in the GUI
breaking: additional columns (
retentionTimeInSeconds) have been added to project wide summary files such as
breaking: column names in
breaking: column names describing scores now use camel case instead of underscores:
fix: incompatibility with recent macOS versions caused by Gatekeeper. We now provide installable packages.
fix: missing SCANS annotation in mgf-export subtool, which now creates valid input for FBMN
fix: un-parsed retention times in CEF format.
fix: Structure DB linking (wrong ids, missing link flags, duplicate entries, etc.)
fix: reduced memory consumption of CLI and GUI
JRE is now included in all versions of SIRIUS
Many more bug fixes and performance improvements
NOTE: SIRIUS versions will now follow semantic versioning (all upcoming releases) regarding the command line interface and project-space output.
- fix: Error when parsing FragTree json with non numeric double values
- fix: layout of screener progress bar on Mac
- feature: Retention time will now be imported by SIRIUS
- RT is shown in the compound list in the SIRIUS GUI and the list can be sorted by RT
- RT is part of the compound.info file in the project-space
- feature: Loglevel can now be changed from CLI
- feature: Summaries can now be written without closing the SIRIUS GUI
- improvement: Better progress reporting when writing summaries (GUI)
- fix: Agilent CEF files without CE can now be imported
- feature: coin-or ilp solver (CLP) is now included. This allows parallel computation of FragTrees without the need for a commercial solver.
- improvement: Compounds without a given charge can now be imported. SIRIUS tries to guess the charge from the name (keyword: pos/neg) or falls back to positive.
- improvement: additional parameters in compute dialog
- improvement: commands of the 'show command' dialog can now be copied
- fix: error when writing/reading fragmentation trees with new Jackson parser
- fix: mgf exporter (CLI) now outputs feature name properly
- fix: deadlock during connection check without internet connection
- fix: tree rendering bug on non linux systems
- fix: crash when aborting recompute dialog
- upgrade (GUI): included JRE to
- fix: deadlock and waiting time due to webservice connections
- fix/improvement: Adduct Settings and Adduct detection
- fix: memory leak in third party json lib -> Zodiac memory consumption has been reduced dramatically
- fix: several minor bug fixes in the sirius libs
- fix: removed spring boot packaging to
- solve several class not found issues,
- solve github issue #7
- errors when importing and aligning mzml files.
- improve startup time
- fix: cosine similarity tool ignores instances without spectra (failed before)
- fix: mgf-export tool skips invalid instances if possible (failed before)
- instance validation after lcms-align tool
- feature: MS2 isotope scorer now available in CLI and GUI
- fix: wrong missing value handling in xlogp filter (some candidates were invisible)
- improvement: fewer cores are used for computations while the GUI is running, to leave more CPU time for GUI tasks
- improvement: show deviation to target ion in FragTree root if precursor is missing in MS/MS
- fix: Classloader exceptions when using CLI from the GUI version
- fix: Wrong mass deviation for trees with adducts
- fix: misplaced labels when exporting svg/pdf fragtrees
- fix: some minor GUI bugs
- fix: incompatibilities with existing configs from previous versions (.sirius)
- fix: CANOPUS detail view has size zero
- fix: failing CSI:FingerID computation with Zodiac re-ranking and existing Adducts
- improvement: errors that occur before GUI is started are now reported
- improvement: minor GUI improvements
- fix: some more fixes on MacOS GUI freezes
- fix: GUI Deadlock on MacOS X fixed. Mac version is now available.
- improvement: Character separated files in project-space have now .tsv extension for better excel compatibility.
- feature: Windows headless executable respects %JAVA_HOME% as JRE location.
- improvement: Improved packaging and startup of the GUI version
- fixes GitHub issues: 4 and 6
- feature: CSI:FingerID for negative ion mode is available
- NOTE: CANOPUS for negative mode data is not ready yet and will still take some time.
- fix: Too small Heapsize on Windows
- improvement: better GUI performance
- feature: CLI Sub-Tool to export projects to mgf.
- feature: multiple candidate number for Zodiac.
- fix: zodiac score rendering.
- fix: deadlock project-space import
- fixes: tree rendering
- improvement: import and deletion performance
- improvement: import progress now shown
- fix: MacOS included JRE not found.
- fix: ignored parameters.
- fix: recompute does not correctly invalidate and delete previous results.
- fix: UI now updates correctly when data is deleted by the computations.
New (and newly integrated) tools:
- CANOPUS: A tool for the comprehensive annotation of compound classes from MS/MS data.
- ZODIAC: Builds upon the SIRIUS molecular formula identifications and uses, say, its top 50 molecular formula annotations as candidates for one compound. It then re-ranks molecular formula candidates using Bayesian statistics.
- PASSATUTTO: Is now part of SIRIUS and allows you to generate dataset specific decoy databases from computed fragmentation trees.
- Other handy standalone tools e.g. compound similarity calculation, mass decomposition, custom-db creation and project-space manipulation.
Project-Space: A standardized persistence layer shared by CLI and GUI that makes both fully compatible.
- Save and reimport your projects with all previously calculated results.
- Review your results computed with the CLI in the GUI.
- Handy project-space summary CSV and mzTab-M files for downstream analysis.
- Projects can be stored and modified as a directory structure or as a compressed archive.
LCMS-Runs: SIRIUS can now handle full LCMS-Runs given in mzML/mzXML format and performs automatic feature detection.
- The lcms-align preprocessing tool performs feature detection and feature alignment for multiple LCMS-Runs based on the available MS/MS spectra.
Redesigned Command line interface: SIRIUS is now a toolbox containing many subtools that may be combined to ToolChains based on the project-space.
CSI:FingerID had some massive updates, including more and larger molecular properties.
- Structure DBs: New version of the CSI:FingerID PubChem copy that now uses PubChem standardized structures.
- NORMAN is now available as search DB
- All available database filters can now be combined to arbitrary subsets for searching (even with custom databases).
Interactive fragmentation tree viewer with vector graphics export in the GUI.
Java 11 or higher is now mandatory
- GUI version ships with an integrated JRE
Many minor improvements and Bugfixes
- Java 9 and higher are now supported
- CSI:FingerID training structures available
- Training structures available via WebAPI.
- Training structures are flagged in CSI:FingerID candidate list.
- SMARTS filter for candidate list (GUI)
- Molecular Property filter for candidate list (GUI)
- Available prediction workers of the CSI:FingerID webservice can be listed from SIRIUS
- Improved connection handling and auto reconnect to Webservice
- Improved error messages
- Improved stability and load balancing of the CSI:FingerID webservice
- Several bug fixes
- Fragmentation tree heuristics
- Negative ion mode data is now supported
- Polished and more informative GUI
- Sirius Overview: Explained intensity, number of explained peaks, median mass deviation
- Fragmentation trees: Color coding of nodes by intensity/mass deviation, more informative Fragmentation tree nodes
- CSI:FingerID Overview: Number of PubMed publications with PubMed linking for each candidate, visualization of CSI:FingerID score.
- Predicted Fingerprints: Visualisation of prediction (posterior probability), predictor quality (F1) and number of training examples.
- Several small improvements
- CPLEX ILP solver support
- Consider a specific list of ionizations for Sirius
- Consider a specific list of adducts for CSI:FingerID
- Custom ionizations/adducts can be specified (CLI and GUI)
- Full-featured standalone command line version (headless version)
- Improved parallelization and task management
- Improved stability of the CSI:FingerID webservice
- Time limit for fragmentation tree computations
- Specify fields to import name and ID from .sdf into a custom database (GUI).
- CSI:FingerID results can be filtered by Custom databases (GUI).
- Better filtering performance (GUI)
- Bug fix in Database filtering view (GUI)
- Error Reporter bug fixed (GUI)
- Logging bugs fixed
- Many minor bug fixes
- Custom databases can be imported by hand or via csv file. You can manage multiple databases within Sirius.
- New Bayesian Network scoring for CSI:FingerID which takes dependencies between molecular properties into account.
- CSI:FingerID Overview which lists results for all molecular formulas.
- Visualization of the predicted fingerprints.
- ECFP fingerprints are now also in the CSI:FingerID database and no longer have to be computed on the user's side.
- Connection error detection and refresh feature. No restart required to apply Sirius internal proxy settings anymore.
- System wide proxy settings are now supported.
- Many minor bug fixes and small improvements of the GUI
- element prediction using isotope pattern
- CSI:FingerID now predicts more molecular properties which improves structure identification
- improved structure of the result output generated by the command line tool to its final version
- fix missing MS2 data error
- MacOSX compatible start script
- add proxy settings, bug reporter, feature request
- new GUI look
- integration of CSI:FingerID and structure identification into SIRIUS
- it is now possible to search formulas or structures in molecular databases
- isotope pattern analysis is now rewritten and hopefully more stable than before
- fix bug with penalizing molecular formulas on intrinsically charged mode
- fix critical bug in CSV reader
- Sirius User Interface
- new output type -O sirius. The .sirius format can be imported into the User Interface.
- Experimental support for in-source fragmentations and adducts
- fix crash when using GLPK solver
- fix bug: SIRIUS uses the old scoring system by default when -p parameter is not given
- fix some minor bugs
- if MS1 data is available, SIRIUS will now always use the parent peak from MS1 to decompose the parent ion, instead of using the peak from an MS/MS spectrum
- fix bugs in isotope pattern selection
- SIRIUS ships now with the correct version of the GLPK binary
- release version