Amanda Duarte

About Me

Hi, I'm Amanda Duarte!

I am a PhD student at Barcelona Supercomputing Center / Universitat Politècnica de Catalunya under the supervision of Prof. Jordi Torres and Prof. Xavier Giró, working with the Image Processing Group (GPI).
I graduated in Systems Analysis from Instituto Federal Sul-rio-grandense Câmpus Pelotas (IFSul) in Brazil, spending one year at Northern Virginia Community College (NVCC) in the United States as part of an exchange program. I received my master's in Computer Engineering from the Federal University of Rio Grande (FURG) under the supervision of Dr. Silvia Silva da Costa Botelho and Dr. Paulo Lilles Jorge Drews Junior, working with the Automation and Intelligent Robotics (NAUTEC) group.

My research interests lie in Computer Vision, Machine Learning, Neural Machine Translation, Speech Recognition, Video Generation, Video Understanding, Multi-modal Learning, and Computer Vision applied to Robotics.

Besides research, I am passionate about travel, photography, and art.


Please feel free to contact me for any further information.



Marie Skłodowska-Curie Fellowship - INPhINIT - ”La Caixa” Doctoral Fellowship Oct 2018

Happy to announce that I was awarded a Marie Skłodowska-Curie fellowship under the INPhINIT - ”La Caixa” Doctoral fellowship programme. Thus, I will be joining the Barcelona Supercomputing Center as a research student during my PhD studies.

Participating in the Frederick Jelinek Memorial Summer Workshop 2018 Summer 2018

I will be part of the Grounded Sequence-to-Sequence Transduction team during the six-week research program on Machine Learning for Speech, Language and Computer Vision Technology at the Frederick Jelinek Memorial Summer Workshop, held at Johns Hopkins University over the summer.

Participating in the JHU Summer School on Human Language Technology Jun 2018

Accepted to participate in the Summer School on Human Language Technology at Johns Hopkins University.

Presenting at WiML 2017 Dec 2017

I will be presenting our work Temporal-aware Cross-modal Embeddings for Video and Audio Retrieval at the Women in Machine Learning (WiML) workshop in December.

Student Volunteer at NIPS 2017 Dec 2017

I will be a student volunteer at NIPS 2017. Looking forward to seeing you in Long Beach, CA.

New start: PhD Student at UPC Oct 2017

I am very excited to announce that I am now a PhD student at Universitat Politècnica de Catalunya (UPC) under the supervision of Prof. Jordi Torres and Prof. Xavier Giró.
I will be part of the Image Processing Group (GPI).

Facebook/Caffe2 research award Oct 2017

Happy to announce that our project “Speech2Signs: Spoken to Sign Language Translation using Neural Networks” won one of the five Caffe2 research awards.
Thanks to the Facebook Research and Academic Relations Program.

Teaching Assistant at NVIDIA Deep Learning Institute Workshop Sep 2017

I will be a teaching assistant at the NVIDIA Deep Learning Institute Workshop organized by UPC and BSC.

Visiting Student at BSC - UPC Sep - Dec 2017

I will be a visiting student at the Barcelona Supercomputing Center (BSC) - Universitat Politècnica de Catalunya (UPC), doing research under the supervision of Prof. Jordi Torres and Prof. Xavier Giró.
Thanks to the Severo Ochoa Mobility Program.

Master in Computer Engineering April 2017

I received my master's in Computer Engineering from the Federal University of Rio Grande.
Research topic: Dataset Generation for Computer Vision and Performance Analysis of Image Restoration Methods Applied to Underwater Environments.
My Master's thesis is available in Portuguese.

TURBID Dataset is now Available Feb 2017

Our public Underwater image dataset for algorithm performance analysis is now available.


Full list at my Google Scholar profile.

A. Duarte, G. Camli, J. Torres, X. Giró-i-Nieto. Towards Speech to Sign Language Translation. In European Conference on Computer Vision (ECCV) Workshop on Shortcomings in Vision and Language (SiVL), Munich, September 2018.

D. Surís, A. Duarte, A. Salvador, X. Giró-i-Nieto. Cross-modal Embeddings for Video and Audio Retrieval. arXiv preprint arXiv:1801.02200, January 2018.

A. Duarte, F. Codevilla, J.D.O. Gaya and S.S.C. Botelho. TURBID: An Underwater Turbid Image Dataset. In European Conference on Computer Vision (ECCV) Workshop on Datasets and Performance Analysis in Early Vision, Amsterdam, October 2016.

A. Duarte, F. Codevilla, J.D.O. Gaya and S.S.C. Botelho. A dataset to evaluate underwater image restoration methods. In IEEE OCEANS 2016-Shanghai, Shanghai, China, April 2016.

J.O. Gaya, F. Codevilla, A.C. Duarte, P. Drews-Jr, S.S. Botelho. Single Image Restoration for Participating Media Based on Prior Fusion. arXiv preprint arXiv:1603.01864, Jan 2017.

M.M. dos Santos, G.B. Zaffari, A.C. Duarte, D.A. Fernandes, P.L.J. Drews-Jr, S.S.C. Botelho. A modified topological descriptor for forward looking sonar images. In IEEE OCEANS 2016-Shanghai (Oral Paper), Shanghai, China, April 2016.

G.B. Zaffari, M.M. dos Santos, A.C. Duarte, D.A. Fernandes, S.S.C. Botelho. Exploring the DolphinSLAM’s parameters. In IEEE OCEANS 2016-Shanghai (Oral Paper), Shanghai, China, April 2016.

A.C. Duarte, G.B. Zaffari, R.T.S. da Rosa, L.M. Longaray, P. Drews-Jr, S.S.C. Botelho. Towards Comparison of Underwater SLAM Methods: An Open Dataset Collection. In IEEE OCEANS 2016-Monterey, Monterey, United States, October 2016.

J.D.O. Gaya, L. Gonçalves, A.C. Duarte, B. Zanchetta, P. Drews-Jr, S.S.C. Botelho. Vision-based Obstacle Avoidance Using Deep Learning. In 13th Latin-America Robotics Symposium - LARS 2016.

Research Projects

Neural Machine Translation for Multimedia Accessibility

Speech2Signs: Spoken to Sign Language Translation using Neural Networks 2017-2021

Hearing impairment is the most common communication disorder, affecting about 360 million people worldwide according to the World Health Organization. For many of these individuals, Sign Language is their primary means of communication. Speech2Signs aims to remove the difficulties and barriers that deaf people encounter when watching online video by automatically generating a puppet, embedded in online videos, that interprets the translation of the speech signal into American Sign Language.
This project was awarded one of the five Caffe2 Research Awards of 2017 granted by Facebook.

Single Image Restoration for Participating Media

Single Image Restoration for Participating Media Based on Prior Fusion 2015-2016

Unlike existing methods for the single image restoration problem in participating media (i.e. underwater, fog, smoke, cloud, and sandstorm), which are tailored to a specific kind of degradation, we propose a novel interpretation of the image formation model in these environments that accounts for the color variation present in the medium. On top of it, we propose a general single image restoration method based on the idea that joining different image priors, such as local contrast and color data, yields a restoration approach that is more robust to environment changes.
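The prior-fusion idea can be sketched with the standard participating-media image formation model, I = J·t + A·(1 − t), where J is the scene radiance, t the per-pixel transmission, and A the airlight. The snippet below is an illustrative sketch, not the published method: it assumes two hypothetical per-pixel transmission estimates (one from a contrast prior, one from a color prior), fuses them with a weighted average, and inverts the model.

```python
import numpy as np

def restore(image, airlight, t_contrast, t_color, w=0.5):
    """Invert I = J * t + A * (1 - t) using a fused transmission map.

    `t_contrast` and `t_color` are per-pixel transmission estimates
    coming from two different priors; they are fused by a weighted
    average, then clipped to avoid division by near-zero values.
    """
    t = np.clip(w * t_contrast + (1.0 - w) * t_color, 0.1, 1.0)
    # Recover the scene radiance J channel-wise.
    return (image - airlight) / t[..., None] + airlight
```

With a perfect transmission estimate the inversion recovers J exactly; in practice the fusion weight w would be tuned per environment.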

General Participating Media Image Restoration Using Deep Learning 2016-2017

For image restoration methods designed for participating media (underwater, fog, smoke, cloud, and sandstorm), one of the most important points besides a good restoration is handling the different amounts of degradation that can occur in the same environment. To that end, methods should consider different conditions when proposing priors. In this project we propose to use learning approaches such as Convolutional Neural Networks (CNNs) to solve this problem and make the method more robust to environment changes. This project is still under development but already achieves results competitive with the state of the art.

Computer Vision and Deep Learning Applied to Underwater Robotics

Vision-based Obstacle Avoidance using Deep Learning 2016

In this project we developed a solution applicable to small autonomous underwater vehicles equipped with inexpensive hardware such as a single monocular camera. We proposed a real-time obstacle avoidance method that works on monocular images and outputs a direction of escape. For each input image, our approach uses a deep neural network to compute a transmission map, which can be understood as a relative depth map. With this map we identify the most appropriate Region of Interest (RoI) and indicate the direction of escape. This work not only provides the first underwater obstacle avoidance method using deep learning, but also proposes a new convolutional neural network (CNN) topology to estimate the transmission map of an input image, which can be used in many other applications.
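The escape-direction step can be illustrated with a deliberately simplified sketch (the block scan below is a hypothetical stand-in for the RoI selection, not the network or method itself): scan the predicted transmission map for the block with the highest mean transmission, i.e. the region presumed farthest away, and steer toward its centre.

```python
import numpy as np

def escape_direction(transmission, k=8):
    """Pick the k x k block with the highest mean transmission
    (interpreted as the most distant, obstacle-free region) and
    return the offset of its centre from the image centre."""
    h, w = transmission.shape
    best, best_pos = -1.0, (0, 0)
    for i in range(0, h - k + 1, k):
        for j in range(0, w - k + 1, k):
            m = transmission[i:i + k, j:j + k].mean()
            if m > best:
                best, best_pos = m, (i + k // 2, j + k // 2)
    # Offset (dy, dx) from the image centre: the direction to steer.
    return best_pos[0] - h // 2, best_pos[1] - w // 2
```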

A Topological Descriptor for Forward Looking Sonar Images 2016

In this project we proposed a method capable of producing a segmentation, a description, and a comparison between Forward Looking Sonar (FLS) images. The proposed method is based on the shapes of, and the topological relationships among, the detected objects. The main idea is to build a graph over Gaussian probability density functions that represents both the shape and the topological relations.

DolphinSLAM: An Open-source Bio-inspired Solution to Underwater SLAM 2016

DolphinSLAM is an open-source bio-inspired method for underwater Simultaneous Localization and Mapping (SLAM). The method is inspired by the brain mechanisms for self-localization found in rodents: its external input is filtered by a Continuous Attractor Neural Network (CANN) to estimate the robot's position in an unknown environment.

Dataset Creation and Algorithm Performance Analysis

TURBID: An Underwater Turbid Image Dataset 2015-2017

Although the number of datasets available for many research problems keeps growing, there are hardly any high-quality underwater imaging resources currently available to the research community. In this project, we presented a dataset composed of different sets of underwater images with several levels of degradation, together with their reference images. The degradation was produced in two ways: (i) in a controlled environment, where the amount of degradation was increased by the successive addition of different substances to the water, and (ii) with a turbidity simulator, in which the degradation caused by real underwater particles is reproduced in non-degraded images according to the distance of the objects. The simulator was developed because it is not possible to capture the large number of underwater images with ground truth, whether in a controlled space or in real environments, that learning-based methods require, for example.
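The simulator side can be sketched with the participating-media formation model commonly used for such environments, with transmission decaying exponentially with distance, t(x) = exp(−β·d(x)). The snippet below is an illustrative sketch under that assumption, not the actual TURBID simulator:

```python
import numpy as np

def simulate_turbidity(image, depth, beta, airlight):
    """Degrade a clear image with a simple participating-media model:
    t(x) = exp(-beta * d(x)),  I = J * t + A * (1 - t).

    `depth` is a per-pixel distance map; a larger `beta` (scattering
    coefficient) means murkier water."""
    t = np.exp(-beta * depth)[..., None]
    return image * t + airlight * (1.0 - t)
```

At zero distance (or β = 0) the image is untouched; as β·d grows, every pixel converges to the airlight colour, mimicking increasing turbidity.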

Simulated Datasets for Underwater Simultaneous Localization and Mapping (SLAM) 2016

In this project, we present an open collection of simulated datasets produced using the UnderWater Simulator (UWSim). These datasets contain several trajectories in simulated scenarios with various levels of turbidity. Several sensors for estimating the robot's displacement are also available, and ground truth is provided through Global Positioning System (GPS) data. This information can be used to analyze and benchmark methods for underwater SLAM.