Cool stuff!
From a feature standpoint, it looks like there are two paths to enable this: either using SimilarityModel() as a whole, or adding its parts individually.
Adding the SimilarityModel() requires the model wrapper to change, which is currently behind the scenes. This would essentially mean a new type of workspace, where the training code “behind” the workspace differs. If you want to compare to our old workflow, this would mean a new training component.
Adding parts of it would require:
- Adding the similarity loss function
- Adding a way to do clustering in PL (maybe a new workspace dedicated to ensemble models, clustering, etc., kind of a deployment pipeline builder)
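To make the first bullet a bit more concrete, here is a minimal, framework-agnostic sketch of one common similarity loss, the pairwise contrastive loss. This is not how SimilarityModel() or PL implements it; the function names (`euclidean_distance`, `contrastive_loss`, `batch_contrastive_loss`) and the `margin` default are assumptions made up for illustration, and a real version would operate on tensors inside the training loop.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, is_similar, margin=1.0):
    """Pairwise contrastive loss for one pair of embeddings (illustrative).

    Similar pairs are penalized by their squared distance, pulling them together.
    Dissimilar pairs are penalized only while they are closer than `margin`,
    pushing them apart until the margin is reached.
    """
    d = euclidean_distance(emb_a, emb_b)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_contrastive_loss(pairs, margin=1.0):
    """Mean contrastive loss over an iterable of (emb_a, emb_b, is_similar)."""
    losses = [contrastive_loss(a, b, s, margin) for a, b, s in pairs]
    return sum(losses) / len(losses)
```

The loss only needs embeddings and a similar/dissimilar label per pair, which is why it could in principle be added as a standalone custom loss without swapping out the whole model wrapper.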
Our plan for better tackling these kinds of things is to gradually open up customization more and more,
starting with custom losses, then a custom trainer (enabling you to use SimilarityModel(), for example), as well as custom data ingestion.
What we want to make sure of this time, though, is that these pieces are modular, easy to edit and customize, and, most importantly, save-able and share-able. The hope is that when one person finds something cool, they can implement it and share it with the rest of the community without friction.