DLDENet: Deep Local Directional Embeddings with Increased Foreground Focal Loss for object detection

Fabian Souto Herrera, Jose M. Saavedra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

We propose modifications to the RetinaNet architecture and Focal Loss [1] to improve their effectiveness for object detection. We show that four changes improve the performance of the model: normalizing the embeddings generated by each receptive field in the feature map, classifying them using cosine similarity, increasing the loss for the foreground anchors, and forcing the dispersion of the classification vectors. We test our proposal on the FlickrLogos-32 [2] dataset, which contains 40 examples for each of its 32 classes. The model achieves results competitive with state-of-the-art approaches, with a precision of 0.975, a recall of 0.944, and an accuracy of 0.949, and improves the mAP on the dataset from 0.657 with RetinaNet [1] to 0.775 with our new architecture, DLDENet.
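The four ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sigmoid over scaled cosine scores, the `scale` and `fg_weight` parameters, and the form of `dispersion_penalty` are all assumptions made here for illustration. Only the focal loss itself, with the usual defaults alpha = 0.25 and gamma = 2, follows the standard definition from the RetinaNet paper [1].

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_scores(embedding, class_vectors):
    """Cosine similarity between one local embedding and each class vector."""
    e = normalize(embedding)
    return [sum(a * b for a, b in zip(e, normalize(w))) for w in class_vectors]

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Standard binary focal loss for one probability p and a 0/1 target."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def anchor_loss(embedding, class_vectors, label, scale=4.0, fg_weight=2.0):
    """Classification loss for one anchor.

    label is the class index for a foreground anchor, or None for background.
    scale and fg_weight are hypothetical knobs, not values from the paper.
    """
    total = 0.0
    for k, s in enumerate(cosine_scores(embedding, class_vectors)):
        p = 1.0 / (1.0 + math.exp(-scale * s))  # assumed mapping from cosine to probability
        total += focal_loss(p, 1 if k == label else 0)
    # Assumed form of the "increased foreground" term: a constant multiplier.
    return total * (fg_weight if label is not None else 1.0)

def dispersion_penalty(class_vectors):
    """Assumed dispersion term: mean squared cosine over distinct class-vector pairs."""
    unit = [normalize(w) for w in class_vectors]
    pairs = [(i, j) for i in range(len(unit)) for j in range(i + 1, len(unit))]
    return sum(
        sum(a * b for a, b in zip(unit[i], unit[j])) ** 2 for i, j in pairs
    ) / len(pairs)

# Usage: two hypothetical class vectors and one foreground anchor of class 0.
classes = [[1.0, 0.0], [0.0, 1.0]]
loss = anchor_loss([0.9, 0.1], classes, label=0) + dispersion_penalty(classes)
```

Because the embedding and the class vectors are both normalized, the scores depend only on direction, which is one plausible reading of the "directional embeddings" in the title. Under these assumptions, the penalty is zero when the class vectors are mutually orthogonal and grows as they align.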

Original language: English
Title of host publication: 2019 38th International Conference of the Chilean Computer Science Society, SCCC 2019
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728156132
State: Published - Nov 2019
Externally published: Yes
Event: 38th International Conference of the Chilean Computer Science Society, SCCC 2019 - Concepcion, Chile
Duration: 4 Nov 2019 to 9 Nov 2019

Publication series

Name: Proceedings - International Conference of the Chilean Computer Science Society, SCCC
Volume: 2019-November
ISSN (Print): 1522-4902

Conference

Conference: 38th International Conference of the Chilean Computer Science Society, SCCC 2019
Country/Territory: Chile
City: Concepcion
Period: 4/11/19 to 9/11/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

  • Convolutional neural networks
  • Deep learning
  • Object detection
