Abstract
We propose modifications to the RetinaNet architecture and Focal Loss [1] to improve their effectiveness for object detection. We show that four changes improve the performance of the model: normalizing the embeddings generated by each receptive field in the feature map, classifying them using cosine similarity, increasing the loss for the foreground anchors, and forcing the dispersion of the classification vectors. We test our proposal on the FlickrLogos-32 [2] dataset, which contains 40 examples for each of the 32 classes, and achieve results competitive with state-of-the-art approaches: a precision of 0.975, a recall of 0.944, and an accuracy of 0.949, while improving the mAP on the dataset from 0.657 with RetinaNet [1] to 0.775 with our new architecture, DLDENet.
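To illustrate two of the ideas named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each receptive field yields an embedding vector and each class has one classification vector. The function names (`l2_normalize`, `cosine_scores`, `dispersion_penalty`) are hypothetical, and the penalty shown (mean squared pairwise cosine similarity) is only one plausible way to encourage dispersion; the abstract does not state the exact formula used in DLDENet.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def cosine_scores(embedding, class_vectors):
    """Cosine similarity between one embedding and each class vector.

    Both sides are L2-normalized, so each score is a dot product in [-1, 1].
    """
    e = l2_normalize(embedding)
    return [
        sum(a * b for a, b in zip(e, l2_normalize(w)))
        for w in class_vectors
    ]


def dispersion_penalty(class_vectors):
    """Assumed regularizer: mean squared cosine similarity over all class pairs.

    It is 0 when the class vectors are mutually orthogonal and grows as they
    align, so minimizing it pushes the vectors apart.
    """
    ws = [l2_normalize(w) for w in class_vectors]
    total, pairs = 0.0, 0
    for i in range(len(ws)):
        for j in range(i + 1, len(ws)):
            total += sum(a * b for a, b in zip(ws[i], ws[j])) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy example: one 2-D embedding scored against two class vectors.
scores = cosine_scores([3.0, 4.0], [[1.0, 0.0], [0.0, 2.0]])
penalty = dispersion_penalty([[1.0, 0.0], [0.0, 2.0]])
```

Because both the embedding and the class vectors are normalized, the scores depend only on direction and not on magnitude, which is the property that cosine-similarity classification relies on.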
Original language | English |
---|---|
Title of host publication | 2019 38th International Conference of the Chilean Computer Science Society, SCCC 2019 |
Publisher | IEEE Computer Society |
ISBN (Electronic) | 9781728156132 |
DOI | |
Status | Published - Nov 2019 |
Externally published | Yes |
Event | 38th International Conference of the Chilean Computer Science Society, SCCC 2019 - Concepcion, Chile Duration: 4 Nov 2019 → 9 Nov 2019 |
Publication series
Name | Proceedings - International Conference of the Chilean Computer Science Society, SCCC |
---|---|
Volume | 2019-November |
ISSN (Print) | 1522-4902 |
Conference
Conference | 38th International Conference of the Chilean Computer Science Society, SCCC 2019 |
---|---|
Country/Territory | Chile |
City | Concepcion |
Period | 4 Nov 2019 → 9 Nov 2019 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.