Feature proposal: unsupervised manifold learning to understand unlabelled datasets? #6

@ltetrel

nrtk currently helps evaluate the robustness of existing models or algorithms. The goal would be to extend its functionality to provide insights from datasets alone.
Using unsupervised manifold learning, one can automatically extract insights from an unannotated dataset by projecting the n-dimensional latent space onto a reduced 2D/3D space (with PCA, UMAP or t-SNE, as you are already doing).
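
To make the projection step concrete, here is a minimal sketch of reducing latent vectors to 2D with PCA, written with NumPy only. It is an illustration, not nrtk code: `pca_project` is a hypothetical helper, and the random `latents` array is a stand-in for real embeddings. UMAP or t-SNE would replace this function in the same place.

```python
import numpy as np


def pca_project(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project (n_samples, n_features) data onto its top principal components.

    Hypothetical helper for illustration; not part of nrtk.
    """
    # PCA operates on mean-centered data.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Stand-in for embeddings produced by some model (assumption: 64-D latents).
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 64))

coords = pca_project(latents, n_components=2)
print(coords.shape)
```

Each row of `coords` is the 2D position of one sample and can be handed directly to a scatter plot.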

The additional features for nrtk would be:

  1. Select an "embedding model", such as a convolutional VAE (CVAE) or a classical no-reference image quality metric like OpenCV's BRISQUE.
  2. Train the model if required (for ML-based techniques like a CVAE).
  3. Run inference with the model on an unlabelled dataset, optionally with augmentation (I am not sure whether nrtk supports datasets without annotations).
  4. Visualize the features in the new space.
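
Steps 1 to 3 could be sketched roughly as follows. Everything here is an assumption made for illustration: `EmbeddingModel`, `IntensityStatsEmbedder` and `embed_dataset` are hypothetical names, not existing nrtk interfaces, and the toy embedder (per-image intensity statistics) only stands in for a real CVAE or BRISQUE scorer.

```python
from typing import Callable, Optional, Protocol

import numpy as np


class EmbeddingModel(Protocol):
    """Hypothetical interface: anything that maps images to feature vectors."""

    requires_training: bool

    def fit(self, images: np.ndarray) -> None: ...

    def embed(self, images: np.ndarray) -> np.ndarray: ...


class IntensityStatsEmbedder:
    """Toy stand-in for a CVAE or BRISQUE: per-image mean, std, min and max."""

    requires_training = False

    def fit(self, images: np.ndarray) -> None:
        pass  # Nothing to learn for a hand-crafted descriptor.

    def embed(self, images: np.ndarray) -> np.ndarray:
        flat = images.reshape(len(images), -1).astype(np.float64)
        return np.stack(
            [flat.mean(axis=1), flat.std(axis=1), flat.min(axis=1), flat.max(axis=1)],
            axis=1,
        )


def embed_dataset(
    model: EmbeddingModel,
    images: np.ndarray,
    augment: Optional[Callable[[np.ndarray], np.ndarray]] = None,
) -> np.ndarray:
    """Train if needed (step 2), optionally augment, then embed (step 3)."""
    if model.requires_training:
        model.fit(images)
    if augment is not None:
        images = np.stack([augment(img) for img in images])
    return model.embed(images)


# Unlabelled dataset stand-in: 10 random RGB images, no annotations needed.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)

features = embed_dataset(IntensityStatsEmbedder(), images, augment=np.fliplr)
print(features.shape)
```

The `augment` argument accepts any per-image callable, so a perturbation could be plugged in there; the resulting `features` matrix is what step 4 would project and plot.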

We are currently working on this topic with Ifremer; you can find a first POC here. The current visualization is very basic: a static HTML file exported from Bokeh.

Let us know what you think,
