Abstract
We propose modifications to the RetinaNet architecture and Focal Loss [1] to improve their effectiveness for object detection. We show that normalizing the embeddings generated by each receptive field in the feature map, classifying them using cosine similarity, increasing the loss for the foreground anchors, and forcing the dispersion of the classification vectors improve the performance of the model. We test our proposal on the FlickrLogos-32 [2] dataset, which contains 40 examples for each of the 32 classes, and achieve results competitive with state-of-the-art approaches: a precision of 0.975, a recall of 0.944, and an accuracy of 0.949. Our new architecture, DLDENet, also improves the mAP on the dataset from 0.657 with RetinaNet [1] to 0.775.
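As an illustration of two of the ideas named in the abstract (classifying normalized embeddings by cosine similarity and encouraging dispersion of the classification vectors), here is a minimal NumPy sketch. It is not the DLDENet implementation: the function names, the array shapes, the random inputs, and the particular dispersion measure (mean pairwise cosine similarity between class vectors) are assumptions chosen for illustration only.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against zero rows)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def cosine_scores(embeddings, class_vectors):
    """Cosine similarity between every embedding and every class vector.

    embeddings:    (num_anchors, dim), one row per receptive field / anchor
    class_vectors: (num_classes, dim), one vector per class
    returns:       (num_anchors, num_classes), values in [-1, 1]
    """
    return l2_normalize(embeddings) @ l2_normalize(class_vectors).T


def dispersion_penalty(class_vectors):
    """Hypothetical regularizer: mean pairwise cosine similarity between
    distinct class vectors. Lower values mean more spread-out vectors."""
    w = l2_normalize(class_vectors)
    sims = w @ w.T
    off_diagonal = sims[~np.eye(w.shape[0], dtype=bool)]
    return off_diagonal.mean()


# Illustrative random data; 32 classes mirrors FlickrLogos-32.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))
class_vectors = rng.normal(size=(32, 8))

scores = cosine_scores(embeddings, class_vectors)
predicted_class = scores.argmax(axis=1)  # most similar class per anchor
penalty = dispersion_penalty(class_vectors)
```

Because both operands are unit-normalized, the scores depend only on the angle between an embedding and a class vector, not on their magnitudes. A term such as `penalty` could be added to a training loss to push class vectors apart, though the weighting and exact form used by the authors are not stated in this record.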
Original language | English |
---|---|
Title of host publication | 2019 38th International Conference of the Chilean Computer Science Society, SCCC 2019 |
Publisher | IEEE Computer Society |
ISBN (Electronic) | 9781728156132 |
State | Published - Nov 2019 |
Externally published | Yes |
Event | 38th International Conference of the Chilean Computer Science Society, SCCC 2019 - Concepcion, Chile. Duration: 4 Nov 2019 → 9 Nov 2019 |
Publication series
Name | Proceedings - International Conference of the Chilean Computer Science Society, SCCC |
---|---|
Volume | 2019-November |
ISSN (Print) | 1522-4902 |
Conference
Conference | 38th International Conference of the Chilean Computer Science Society, SCCC 2019 |
---|---|
Country/Territory | Chile |
City | Concepcion |
Period | 4/11/19 → 9/11/19 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Keywords
- Convolutional neural networks
- Deep learning
- Object detection