A few years ago, Michael Witting and I joined forces to get transferable prediction of retention times going. That is, we want to predict retention times (more precisely, retention order) for a column even if we have no training data for that column. Yet, to describe a column to a machine learning model, you have to provide numerical values that allow the model to learn which columns are similar, and how similar. We are currently focusing on reversed-phase (RP) columns because more datasets are available for them, and also because it appears to be much easier to predict retention times for RP.
Tanaka parameters and Hydrophobic Subtraction Model (HSM) parameters are reasonable choices for describing a column. Unfortunately, for many columns that are in “heavy use” by the metabolomics and lipidomics community, we do not know these parameters! Michael recently tweeted about this problem, and we got some helpful literature references; kudos for that. Yet, there are still many columns whose parameters remain unknown.
Now, the problem is not so much that the machine learning community cannot make use of training data from these columns simply because a few column parameters are unknown. This is unfortunate, but so be it. The much bigger problem is that even if someone comes up with a fantastic machine learning model for transferable retention time prediction, it may not be applicable to your column, because we do not know the parameters for your column! That would be very sad.
So, here is a list of columns that are heavily used, but where we do not know Tanaka parameters, HSM parameters, or both. Columns are ordered by “importance to the community”, whatever that means… If you happen to know parameters for any of the columns below, please let us know! You can post a comment below, send us an email, or send a carrier pigeon, whatever you prefer. Edit: I have switched off comments; it was all spam.
Missing HSM parameters
- Waters ACQUITY UPLC HSS T3
- Waters ACQUITY UPLC HSS C18
- Restek Raptor Biphenyl
- Waters CORTECS UPLC C18
- Phenomenex Kinetex PS C18
Missing Tanaka parameters
- Waters CORTECS T3
- Waters ACQUITY UPLC HSS T3
- Waters ACQUITY UPLC HSS C18
- Restek Raptor Biphenyl
- Waters CORTECS UPLC C18
- Phenomenex Kinetex PS C18
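
To see at a glance which columns lack both parameter sets and which lack only one, the two lists above can be compared with plain set operations. This is just a small sketch: the variable names are made up for illustration, and the column names are copied verbatim from the lists above.

```python
# Columns copied from the two lists above; variable names are illustrative only.
missing_hsm = {
    "Waters ACQUITY UPLC HSS T3",
    "Waters ACQUITY UPLC HSS C18",
    "Restek Raptor Biphenyl",
    "Waters CORTECS UPLC C18",
    "Phenomenex Kinetex PS C18",
}
missing_tanaka = {
    "Waters CORTECS T3",
    "Waters ACQUITY UPLC HSS T3",
    "Waters ACQUITY UPLC HSS C18",
    "Restek Raptor Biphenyl",
    "Waters CORTECS UPLC C18",
    "Phenomenex Kinetex PS C18",
}

# Columns for which neither parameter set is known.
missing_both = sorted(missing_hsm & missing_tanaka)
# Columns that lack only one of the two parameter sets.
missing_tanaka_only = sorted(missing_tanaka - missing_hsm)
missing_hsm_only = sorted(missing_hsm - missing_tanaka)

print("Missing both:", missing_both)
print("Missing Tanaka only:", missing_tanaka_only)
print("Missing HSM only:", missing_hsm_only)
```

Sets are used because the order within each list ("importance to the community") does not matter for this comparison; the sorted output is alphabetical, not by importance.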