nrtk currently helps to evaluate the robustness of existing models or algorithms. The goal would be to extend its functionality to enable insights from datasets alone.
Using unsupervised manifold learning, one can automatically extract insights from an un-annotated dataset by projecting the n-dimensional latent space down to a reduced 2D/3D space (with PCA, UMAP, or t-SNE, as you are already doing).
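To make the idea concrete, here is a minimal sketch of that projection step using PCA from scikit-learn (UMAP or t-SNE could be swapped in the same way); the embedding dimensions and dataset size are just illustrative:

```python
# Minimal sketch: project n-dimensional embeddings down to 2D with PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))  # e.g. 64-d latent vectors from an encoder

# Reduce to 2D for plotting; UMAP / t-SNE offer non-linear alternatives.
projected = PCA(n_components=2).fit_transform(embeddings)
print(projected.shape)  # (100, 2)
```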
The additional features for nrtk would be:
- Select an "embedding model" such as a convolutional VAE (CVAE), or a classical no-reference image quality metric like OpenCV's BRISQUE.
- Train the model if required (for ML-based techniques like a CVAE).
- Run inference on an un-labelled dataset, optionally with augmentation (not sure whether nrtk supports datasets without annotations).
- Visualize the features in the new, reduced space.
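The steps above could be sketched roughly as follows; note that `embed` and `project` are purely illustrative names, not an existing nrtk API, and the flatten-based "embedding" is a stand-in for a real encoder such as a CVAE or a BRISQUE score:

```python
# Hypothetical sketch of the proposed workflow: embed, then project.
import numpy as np
from sklearn.decomposition import PCA

def embed(images: np.ndarray) -> np.ndarray:
    # Placeholder "embedding model": flatten each image into a vector.
    # A CVAE encoder or a no-reference quality metric would take this
    # role in practice.
    return images.reshape(len(images), -1)

def project(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    # Reduce the latent space for visualization (PCA here; UMAP or
    # t-SNE would also fit).
    return PCA(n_components=n_components).fit_transform(features)

rng = np.random.default_rng(0)
dataset = rng.random((50, 8, 8))  # 50 un-annotated 8x8 "images"
coords = project(embed(dataset))
print(coords.shape)  # (50, 2)
```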
We are currently working on this topic with Ifremer; you can find a first POC here. The current visualization is quite basic: a static HTML file exported from Bokeh.
Let us know what you think,