COSMIC is a tool that allows you to assign confidence to structure annotations. For every structure annotated by CSI:FingerID, COSMIC provides a confidence score (a number between 0 and 1) that tells you how likely it is that this annotation is correct. This is similar in spirit to what is done in spectral library search: Not only is the cosine score used to decide which candidate best fits the query spectrum; in addition, we use the cosine score of the top-scoring candidate (the hit) to decide whether it is likely correct (say, above 0.8), incorrect (say, below 0.6) or in the “twilight” in between. If you have been using CSI:FingerID for some time, you might have noticed that finding such thresholds is not possible for the CSI:FingerID score. COSMIC closes this gap and tells you whether an annotation is likely correct or incorrect. COSMIC will soon be integrated in SIRIUS.
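The library-search analogy above can be sketched in a few lines of Python. This is only an illustration of the cosine thresholds mentioned in the text (0.8 and 0.6 are the "say" values from above, not recommended cutoffs); it is not how COSMIC computes its confidence score. The function names and the representation of a spectrum as a dictionary from m/z bin to intensity are assumptions made for this example.

```python
import math


def cosine_score(query, reference):
    """Cosine similarity of two spectra given as {m/z bin: intensity} dicts.

    The binned-dict representation is an assumption made for this sketch.
    """
    shared_bins = set(query) & set(reference)
    dot = sum(query[b] * reference[b] for b in shared_bins)
    norm_q = math.sqrt(sum(v * v for v in query.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_q == 0 or norm_r == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_q * norm_r)


def classify_hit(score, accept=0.8, reject=0.6):
    """Label the top hit using the illustrative thresholds from the text."""
    if score > accept:
        return "likely correct"
    if score < reject:
        return "likely incorrect"
    return "twilight"


# Toy spectra (made-up values, for illustration only)
query = {100: 3.0, 150: 4.0}
reference = {100: 3.0, 150: 4.0, 200: 1.0}
print(classify_hit(cosine_score(query, reference)))
```

The point of the analogy is the second function: a score is only useful for accepting or rejecting a hit if such thresholds exist, and for the raw CSI:FingerID score they do not.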


COSMIC will be integrated into SIRIUS soon. Please download the appropriate SIRIUS version for your operating system here.
The GUI version (which also includes the full command line version) requires no external JRE. Everything is included. Download, extract, execute.


COSMIC is parameter-free and is executed automatically every time a CSI:FingerID search is performed. COSMIC scores for a compound are shown in the compound list on the left. The compound list can be sorted by COSMIC score by right-clicking it and selecting “Order by COSMIC”. COSMIC scores need to be interpreted with some care, and it is important to understand that COSMIC scores are not probabilities.

An annotation received a low COSMIC score, even though I am sure it is correct!

In this example, the spectrum of Campherol from the SIRIUS training data receives the low COSMIC score of 0.3, even though the structure annotation is correct. The reason is that the database we searched in contains multiple extremely similar structures, which have very similar fingerprint representations as well as CSI:FingerID scores. In some of these cases, the mass spectra of very similar structure candidates may be close to indistinguishable. For that reason, COSMIC assigns a low confidence.



SIRIUS documentation.

Use cases

Apart from enhancing your previous CSI:FingerID experience by telling you if an annotation is likely correct or not, COSMIC makes searching in hypothetical databases viable. We demonstrate this by generating a database of hypothetical bile acid structures, combinatorially adding amino acids to bile acid cores, yielding 28,630 plausible bile acid conjugate structures. We then searched query MS/MS data from a mouse fecal dataset in this combinatorial database, and used the COSMIC confidence score to distinguish between hits that are likely correct or incorrect. We manually evaluated the top 12 hits and found that 11 annotations (91.7%) were likely correct; two annotations were further confirmed using synthetic standards. All 11 bile acid conjugates are “truly novel”, meaning that we could not find those structures in PubChem or any other structure database (or publication). Whereas reporting 11 novel bile acid conjugates may appear rather cool, we argue it is even cooler that we did this without a biological hypothesis beyond “there might be bile acid conjugates out there which nobody knows about”; and that COSMIC found the top bile acid conjugate annotations in a fully automated manner.
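To make the idea of a combinatorial database concrete, here is a minimal sketch of the enumeration step. It only pairs names from two short placeholder lists; the actual database was built from molecular structures, and neither the lists below nor the resulting count correspond to the 28,630 structures mentioned above. All names in the code are hypothetical and chosen for illustration.

```python
from itertools import product

# Placeholder inputs for illustration, NOT the lists used for the real database.
bile_acid_cores = ["cholic acid", "deoxycholic acid", "chenodeoxycholic acid"]
amino_acids = ["glycine", "alanine", "leucine", "phenylalanine", "tyrosine"]


def enumerate_conjugates(cores, residues):
    """Return every (core, residue) pairing as a hypothetical conjugate."""
    return list(product(cores, residues))


conjugates = enumerate_conjugates(bile_acid_cores, amino_acids)
print(len(conjugates))  # 3 cores x 5 amino acids = 15 pairings
```

Each enumerated candidate would then be a searchable entry of the hypothetical database, and the COSMIC score decides which of the resulting hits deserve a closer look.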

Additionally, COSMIC enables you to “flip the workflow”: Annotate large quantities of data, then look at novel compound annotations with high confidence and form your hypothesis from there! To demonstrate this ability, we have also annotated 2,666 LC-MS/MS runs from human samples with molecular structures which are currently absent from HMDB, and for which no MS/MS reference data are available; and finally, 17,414 LC-MS/MS runs with annotations for which no MS/MS reference data are available (see the COSMIC preprint for details).



Assigning confidence to structural annotations from mass spectra with COSMIC
Martin A. Hoffmann, Louis-Félix Nothias, Marcus Ludwig, Markus Fleischauer, Emily C. Gentry, Michael Witting, Pieter C. Dorrestein, Kai Dührkop, Sebastian Böcker


Data availability