Dena Bazazian is a research scientist at CTTC (Centre Tecnològic de Telecomunicacions de Catalunya), where her research focuses on computer vision and geometric deep learning algorithms for analyzing 3D point clouds.
Before that she was a postdoctoral researcher at the Computer Vision Center (CVC), Autonomous University of Barcelona (UAB), where she completed her PhD in 2018.
She carried out research stays at NAVER LABS Europe in Grenoble, France, and at the Media Integration and Communication Center (MICC), University of Florence, Italy. Between 2013 and 2015 she also worked on edge extraction and feature description of unorganized point clouds at the Universitat Politècnica de Catalunya (UPC).
Dena Bazazian was a member of the organizing committee of the Women in Computer Vision (WiCV) workshops at CVPR 2018 and ECCV 2018. She was also a co-organizer of the ICDAR 2017 Robust Reading Challenge on Omnidirectional Video.
Can Generative Adversarial Networks Teach Themselves Text Segmentation?
Al-Rawi, M., Bazazian, D., Valveny, E. ICCV w, 2019.
In the information age in which we live, text segmentation from scene images is a vital prerequisite task used in many text understanding applications. Text segmentation is a difficult problem because of the potentially vast variation in text and scene landscape. Moreover, systems that learn to perform text segmentation usually need non-trivial annotation efforts. We present in this work a novel unsupervised method to segment text at the pixel level from scene images. The model we propose, which relies on generative adversarial neural networks, segments text intelligently, and therefore does not need to associate the scene image that contains the text with the ground truth of the text. The main advantage is thus skipping the need to obtain a pixel-level annotation dataset, which is normally required to train powerful text segmentation models. The results are promising and, to the best of our knowledge, constitute the first step towards reliable unsupervised text segmentation. Our work opens a new research path in unsupervised text segmentation and poses many research questions, with ample room for further improvement.
Word Spotting in Scene Images based on Character Recognition. Bazazian, D., Karatzas, D., and Bagdanov, A. CVPR w, 2018.
In this work we address the challenge of spotting text in scene images without restricting the words to a fixed lexicon or dictionary. Typical out-of-dictionary words include, for instance, words containing exclamation or other punctuation marks, telephone numbers, URLs, and dates.
To this end, we train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. The key advantage of the proposed method is that it allows unconstrained out-of-dictionary word spotting, independent of any dictionary or lexicon. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
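The scoring step above can be illustrated with a minimal NumPy sketch. This is not the paper's rectangle classifier: `word_score`, the toy heatmaps, and the box format are hypothetical names chosen for illustration, and the score is simply the mean of each query character's peak response inside a candidate box.

```python
import numpy as np

def word_score(char_maps, box, query):
    """Score a candidate box for a query word as the mean of the peak
    responses of the query's characters inside the box (a simplified
    stand-in for a rectangle classifier over character heatmaps)."""
    x0, y0, x1, y1 = box
    peaks = []
    for ch in query:
        fmap = char_maps.get(ch)
        if fmap is None:                    # character class we have no map for
            return 0.0
        peaks.append(fmap[y0:y1, x0:x1].max())
    return float(np.mean(peaks))

# Toy per-character heatmaps for 'h' and 'i', each peaking at one pixel.
h = np.zeros((8, 8)); h[2, 2] = 1.0
i = np.zeros((8, 8)); i[2, 4] = 1.0
maps = {'h': h, 'i': i}

print(word_score(maps, (0, 0, 8, 8), "hi"))   # both peaks inside the box -> 1.0
print(word_score(maps, (0, 5, 8, 8), "hi"))   # peaks outside the box -> 0.0
```

Because the query is matched character by character, nothing ties it to a lexicon, which mirrors the unconstrained word-spotting claim above.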
FAST: Facilitated and Accurate Scene Text Proposal through FCN Guided Pruning. Bazazian, D., Gomez, R., Gomez, L., Nicolaou, A., Karatzas, D., and Bagdanov, A. Pattern Recognition Letters (PRL), 2017.
Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
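The pruning idea can be sketched in a few lines of NumPy. This is an illustration, not the authors' implementation: `rank_proposals`, the toy heatmap, and the `(x0, y0, x1, y1)` box convention are assumptions, and a real system would use confidences from a trained FCN rather than a hand-made array.

```python
import numpy as np

def rank_proposals(heatmap, boxes, keep=2):
    """Rank candidate boxes by the mean text confidence inside each box
    and return the top `keep` boxes -- a sketch of FCN-guided pruning."""
    scores = []
    for (x0, y0, x1, y1) in boxes:
        region = heatmap[y0:y1, x0:x1]
        scores.append(region.mean() if region.size else 0.0)
    order = np.argsort(scores)[::-1]          # highest confidence first
    return [boxes[i] for i in order[:keep]]

# Toy confidence map: a bright "text" blob in the top-left corner.
heatmap = np.zeros((10, 10))
heatmap[1:4, 1:6] = 0.9
boxes = [(1, 1, 6, 4), (6, 6, 9, 9)]          # (x0, y0, x1, y1)

print(rank_proposals(heatmap, boxes, keep=1))  # -> [(1, 1, 6, 4)]
```

Keeping only the highest-scoring boxes shrinks the proposal set handed to the word classifier, which is where the speed-up described above comes from.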
Reading Text in the Wild from Compressed Images
Galteri, L., Bazazian, D., Seidenari, L., Bertini, M., Bagdanov, A., Nicolaou, A., Karatzas, D. ICCV w, 2017.
Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on the application context, and this compression introduces artifacts that distort image content. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts, leading to an improvement in text recognition. Experimental results on the ICDAR Challenge 4 dataset demonstrate that compression artifacts have a significant impact on text localization and recognition, and that our approach yields an improvement in both, especially at high compression rates.
Improving Text Proposals for Scene Images with Fully Convolutional Networks. Bazazian, D., Gomez, R., Gomez, L., Nicolaou, A., Karatzas, D., and Bagdanov, A. ICPR, 2016.
Text Proposals have emerged as a class-dependent version of object proposals: efficient approaches to reduce the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield state-of-the-art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm, combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and COCO-Text datasets show superior performance over the current state of the art.
ICDAR2017 Robust Reading Challenge on Omnidirectional Video
Iwamura, M., Morimoto, N., Tainaka, K., Bazazian, D., Gomez, L., Karatzas, D. ICDAR, 2017.
This challenge focuses on scene text localization and recognition on the Downtown Osaka Scene Text (DOST) dataset. Five tasks are offered within the challenge: Text Localisation in Videos, Text Localisation in Still Images, Cropped Word Recognition, End-to-End Recognition in Videos, and End-to-End Recognition in Still Images. The DOST dataset preserves scene text as it was observed in the real environment. It contains videos (sequential images) captured in shopping streets in downtown Osaka with an omnidirectional camera; the use of an omnidirectional camera helps exclude the photographer's intention from the captured images. The sequential images in the dataset encourage the development of new text detection and recognition techniques that exploit temporal information. Another important feature of the DOST dataset is that it contains non-Latin text: since the images were captured in Japan, it contains a large amount of Japanese text alongside an adequate amount of Latin text. Because of these features, the DOST dataset preserves scene text in the wild.
Segmentation-based Multi-Scale Edge Extraction to Measure the Persistence of Features in Unorganized Point Clouds. Bazazian, D., Casas, J.R., Ruiz-Hidalgo, J. VISAPP, 2017.
Edge extraction has attracted a lot of attention in computer vision. The accuracy of edge extraction in point clouds can be a significant asset in a variety of engineering scenarios. To this end, we propose a segmentation-based multi-scale edge extraction technique. In this approach, different regions of a point cloud are segmented by a global analysis according to geodesic distance. Afterwards, a multi-scale operator is defined over local neighborhoods. By applying this operator at multiple scales of the point cloud, the persistence of features is determined. We illustrate the proposed method by computing a feature weight that measures the likelihood of a point being an edge, and then detecting edge points based on that value at both global and local scales. We evaluate our method quantitatively and qualitatively. Experimental results show that the proposed approach achieves superior accuracy, and we demonstrate its robustness on noisier real-world datasets.
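The persistence idea can be sketched as follows, under stated assumptions: `edge_weight`, `persistent_edges`, the choice of neighborhood sizes as scales, and the covariance-eigenvalue weight are illustrative simplifications, not the paper's operator, and neighbors are found by brute force rather than a spatial index.

```python
import numpy as np

def edge_weight(points, idx, k):
    """Edge likelihood of one point: the smallest eigenvalue of the
    covariance of its k nearest neighbors, over the eigenvalue sum."""
    d = np.linalg.norm(points - points[idx], axis=1)
    nn = np.argsort(d)[:k]                      # k nearest neighbors (incl. self)
    lam = np.sort(np.linalg.eigvalsh(np.cov(points[nn].T)))
    return lam[0] / lam.sum()

def persistent_edges(points, scales=(6, 10, 14), thresh=0.02):
    """Flag points whose edge weight exceeds `thresh` at EVERY
    neighborhood scale -- only persistent features survive."""
    flags = [
        np.array([edge_weight(points, i, k) > thresh
                  for i in range(len(points))])
        for k in scales
    ]
    return np.logical_and.reduce(flags)

# Two planar faces meeting in a right-angle crease along the x axis.
xs = np.linspace(0, 1, 5)
face1 = np.array([[x, y, 0.0] for x in xs for y in xs])      # z = 0 plane
face2 = np.array([[x, 0.0, z] for x in xs for z in xs[1:]])  # y = 0 plane
cloud = np.vstack([face1, face2])

mask = persistent_edges(cloud)
# Crease points (y = z = 0) stay flagged across scales; face interiors do not.
```

Requiring the weight to survive all scales is what filters out points that only look edge-like at one particular neighborhood size, e.g. because of noise.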
Fast and Robust Edge Extraction in Unorganized Point Clouds. Bazazian, D., Casas, J.R., Ruiz-Hidalgo, J. DICTA, 2015.
Edges provide important visual information about scene surfaces. The need for fast and robust feature extraction from 3D data is nowadays fostered by the widespread availability of cheap commercial depth sensors and multi-camera setups. This article investigates the challenge of detecting edges in surfaces represented by unorganized point clouds. Generally, edge recognition requires the extraction of geometric features such as normal vectors and curvatures. Since normals alone do not provide enough information about the geometry of the cloud, further analysis of the extracted normals, such as clustering, is needed for edge extraction. Edge extraction through these techniques consists of several steps with parameters that depend on the density and scale of the point cloud. In this paper we propose a fast and precise method to detect sharp edge features by analysing the eigenvalues of the covariance matrix defined by each point's k-nearest neighbors. Moreover, we evaluate the proposed method quantitatively and qualitatively for sharp edge extraction using several dihedral angles and well-known examples of unorganized point clouds. Furthermore, we demonstrate the robustness of our approach on noisier real-world datasets.
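A minimal sketch of the eigenvalue analysis described above, assuming the common "surface variation" weight (smallest covariance eigenvalue over the eigenvalue sum) as the edge measure; the function name, the brute-force neighbor search, and the toy two-plane cloud are all illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def edge_weights(points, k=8):
    """Per-point edge weight from the covariance of the point's k nearest
    neighbors: lambda_min / (l0 + l1 + l2). A flat neighborhood gives ~0;
    a neighborhood spanning a sharp crease gives a clearly larger value."""
    # Pairwise distances (brute force; a k-d tree would be used in practice).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    weights = np.empty(len(points))
    for i in range(len(points)):
        nn = np.argsort(d[i])[:k]               # k nearest neighbors (incl. self)
        lam = np.sort(np.linalg.eigvalsh(np.cov(points[nn].T)))
        weights[i] = lam[0] / lam.sum()
    return weights

# Two planar faces meeting at a right angle along the x axis.
xs = np.linspace(0, 1, 5)
face1 = np.array([[x, y, 0.0] for x in xs for y in xs])      # z = 0 plane
face2 = np.array([[x, 0.0, z] for x in xs for z in xs[1:]])  # y = 0 plane
cloud = np.vstack([face1, face2])

w = edge_weights(cloud)
# w is largest for points on the crease (y = z = 0), near zero inside a face.
```

No normal estimation or clustering step is needed: the eigenvalue ratio is computed directly from each local neighborhood, which is what makes this style of method fast.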
An interview with the International Association for Pattern Recognition (IAPR) about the second IAPR Education Committee Research Scholar. [Link - Pages 7 and 8]
Fully Convolutional Networks for Text Understanding in Scene Images. CVC, 2018. [Video]
The introduction of the fifth Women in Computer Vision Workshop (WiCV) at ECCV 2018. [Video]
The WiCV workshop was featured in the Best of ECCV section of Computer Vision News by RSIP Vision. [Link - Pages 20-23]
RSIP Vision article about WiCV at CVPR 2018. [Link - Pages 46-50]