Abstract
Sketch-based image retrieval (SBIR) is attracting growing interest in the computer vision community due to its relevance to the human visual perception system and its potential application across a wide range of industries. In the literature, we observe significant advances when models are evaluated on public datasets. However, when assessed in real environments, their performance drops drastically. The main problem is that state-of-the-art SBIR models follow a supervised regime, depending heavily on a considerable amount of labeled sketch-photo pairs, which is infeasible in real contexts. Therefore, we propose SBIR-BYOL, an extension of the well-known BYOL, designed to work in a bimodal scenario for sketch-based image retrieval. To this end, we also propose a two-stage self-supervised training methodology that exploits existing sketch-photo pairs as well as contour-photo pairs generated from photographs of a target catalog. We demonstrate the benefits of our model in eCommerce environments, where search is a critical component; here, our self-supervised SBIR model shows an increase of over 60% in mAP.
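The paper's implementation details are not reproduced on this page. As a rough illustration only, a BYOL-style regression objective adapted to a bimodal sketch-photo setting can be sketched as below; all function and variable names are hypothetical, not taken from the paper:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Project each row (embedding) onto the unit sphere.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def byol_pair_loss(online_pred, target_proj):
    # Standard BYOL regression loss: 2 - 2 * cosine similarity between
    # the online network's prediction and the target network's projection.
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return (2.0 - 2.0 * np.sum(p * z, axis=-1)).mean()

def bimodal_byol_loss(pred_sketch, proj_photo, pred_photo, proj_sketch):
    # Hypothetical symmetrized bimodal variant: the online branch encodes
    # one modality (a sketch or a photo contour) and must predict the
    # target branch's embedding of the paired photo, and vice versa.
    return 0.5 * (byol_pair_loss(pred_sketch, proj_photo)
                  + byol_pair_loss(pred_photo, proj_sketch))
```

When a prediction exactly matches its paired target embedding the loss is 0; orthogonal embeddings yield the maximum of 2 per pair.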
| Original language | English |
| --- | --- |
| Pages (from-to) | 5395-5408 |
| Number of pages | 14 |
| Journal | Neural Computing and Applications |
| Volume | 35 |
| Issue number | 7 |
| DOIs | |
| State | Published - Mar 2023 |
Bibliographical note
Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
Keywords
- Deep-learning
- Representation learning
- Self-supervision
- Sketch-based image retrieval