Comparison-of-Deep-Image-Embedding-Methods

This repo compares different deep image embedding methods, with the goal of obtaining good general-purpose image embeddings from a small amount of training data.

This repo was created for an assignment in a deep vision course at OTH Amberg-Weiden, so a report is included.

Datasets

The following datasets were used:

Tiny ImageNet (Download) for comparing the different embedding methods.
Internal and External Parts of Cars for the final test set.

The notebooks expect the datasets to be in the root of the repo.

Backbones

The Backbones-Notebook compares the following backbones.

Results:

Backbone	F1-Score
ResNet50	0.664
EfficientNetV2_L	0.540
MobileNetV3	0.367
DenseNet169	0.612
ViT	0.893
Swin	0.934
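
The README does not say how the F1-Score is averaged across classes. As a point of reference, the following is a minimal sketch of a macro-averaged F1, assuming predicted labels have already been obtained from the embeddings. The function name `macro_f1` is illustrative and not taken from the notebooks.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores.

    Assumption: macro averaging; the notebooks may average differently.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the true class and was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Macro averaging weights every class equally, which is one reasonable choice when each class has the same number of samples.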

Losses

The Losses-Notebook compares the following loss functions.

Results:

Loss	F1-Score
ContrastiveLoss	0.650
TripletLoss	0.660
SupConLoss	0.709
SNRLoss	0.685
NTXentLoss	0.618
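
To illustrate what one of these objectives optimises, here is a minimal pure-Python sketch of the standard triplet margin loss for a single (anchor, positive, negative) triplet. The Euclidean distance and the margin of 0.2 are assumptions for illustration; the notebook's actual settings are not stated here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed value).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


if __name__ == "__main__":
    # Well-separated triplet: no loss
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    # Positive farther away than negative: positive loss
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```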

Embedding Size

The Embedding Size-Notebook compares different embedding sizes.

Results:

Embedding Size	F1-Score
64	0.654
128	0.683
256	0.712
512	0.719
1024	0.724
2048	0.731

Dataset Size

The Dataset Size-Notebook compares different numbers of training samples per class in the dataset.

Results:

Samples per Class	F1-Score
10	0.223
20	0.280
30	0.369
50	0.443
80	0.507
100	0.520
200	0.632
400	0.703
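
A reduced training set with a fixed number of samples per class could be built roughly as follows. This is only a sketch: the function name `subsample_per_class`, the `(item, label)` pair layout, and the seeded random sampling are assumptions, not the notebook's code.

```python
import random
from collections import defaultdict


def subsample_per_class(samples, n_per_class, seed=0):
    """Return at most `n_per_class` randomly chosen samples for each label.

    `samples` is a list of (item, label) pairs (an assumed layout).
    """
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)

    rng = random.Random(seed)  # seeded so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        chosen = rng.sample(items, min(n_per_class, len(items)))
        subset.extend((item, label) for item in chosen)
    return subset


if __name__ == "__main__":
    data = [(f"img{i}.jpg", i % 3) for i in range(30)]
    print(len(subsample_per_class(data, 4)))
```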

Augmentation Factor

The Augmentation Factor-Notebook compares different augmentation factors for a small dataset with 20 images per class.

Results:

Factor	F1-Score
1x (Baseline)	0.306
2x	0.352
4x	0.437
8x	0.518
16x	0.569
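
The README does not define the augmentation factor precisely. One plausible reading, assumed here, is that a factor of k yields k versions of every training image: the original plus k - 1 augmented copies. The sketch below follows that reading; `expand_dataset` and the `jitter` stand-in augmentation are hypothetical names, and real image transforms would take the place of `jitter`.

```python
import random


def expand_dataset(samples, factor, augment, seed=0):
    """Return the originals plus (factor - 1) augmented copies of each sample.

    Assumption: "Nx" means the training set grows to N times its size.
    """
    rng = random.Random(seed)
    expanded = []
    for item, label in samples:
        expanded.append((item, label))  # keep the original sample
        for _ in range(factor - 1):
            expanded.append((augment(item, rng), label))
    return expanded


def jitter(values, rng):
    """Hypothetical stand-in for an image augmentation: adds small noise."""
    return [v + rng.uniform(-0.1, 0.1) for v in values]


if __name__ == "__main__":
    data = [([0.0, 1.0], "a"), ([1.0, 0.0], "b")] * 10  # 20 samples
    print(len(expand_dataset(data, 4, jitter)))
```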

Augmentation Methods

The Augmentation Methods-Notebook compares different auto-augmentation methods built into PyTorch.

Results:

Method	F1-Score
Baseline	0.289
AutoAugment	0.392
RandAugment	0.436
TrivialAugmentWide	0.441

Zero Shot Learning

In the Zero Shot-Notebook we test how well a Swin network fine-tuned on the "Tiny ImageNet" dataset embeds images from the "Internal and External Parts of Cars" dataset.

Results:

F1-Score
0.865
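
How the embeddings of unseen classes are turned into predictions is not described in this README. A common approach, assumed here, is to assign each query embedding the label of its most similar labelled reference embedding. The sketch below uses cosine similarity; the function names and the example labels are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbor_label(query, references):
    """Return the label of the reference embedding most similar to `query`.

    `references` is a list of (embedding, label) pairs (an assumed layout).
    """
    best = max(references, key=lambda ref: cosine_similarity(query, ref[0]))
    return best[1]


if __name__ == "__main__":
    # Hypothetical 2-D embeddings and labels, for illustration only
    refs = [([1.0, 0.0], "door"), ([0.0, 1.0], "wheel")]
    print(nearest_neighbor_label([0.9, 0.1], refs))
```

Because the reference embeddings only need labels, not gradient updates, this kind of lookup works for classes the network never saw during fine-tuning.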

Putting it all together

In the Final-Notebook we fine-tune a network on 20 images of 4 classes of the "Internal and External Parts of Cars" dataset and perform normal and zero-shot detection on all 8 classes, with 230 images per class.

Results:

Mode	F1-Score
Normal	0.995
Zero-Shot	0.975

