Object Detection and Recognition in Unstructured Outdoor Environments
Ostovar, Ahmad
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Robotics)
2019 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Computer vision and machine learning based systems are often developed to replace humans in harsh, dangerous, or tedious situations, and to reduce the time required to accomplish a task. Another goal is to increase performance by introducing automation to tasks such as inspection in manufacturing, sorting of timber during harvesting, surveillance, fruit grading, yield prediction, and harvesting operations. Depending on the task, a variety of object detection and recognition algorithms can be applied, including both conventional and deep learning based approaches. Moreover, when developing image analysis algorithms it is essential to consider environmental challenges, e.g. illumination changes, occlusion, shadows, and variation in the colour, shape, texture, and size of objects.

The goal of this thesis is to address these challenges to support the development of autonomous agricultural and forestry systems with enhanced performance and a reduced need for human involvement. The thesis provides algorithms and techniques based on adaptive image segmentation for tree detection in forest environments and for recognition of yellow peppers in greenhouses. For segmentation, seed point generation and a region growing method were used to detect trees, and an algorithm based on reinforcement learning was developed to detect yellow peppers. RGB and depth data were integrated and used in classifiers to detect trees, bushes, stones, and humans in forest environments. Another part of the thesis describes deep learning based approaches that detect stumps and classify the level of rot based on images.

Another major contribution of this thesis is a method using infrared images to detect humans in forest environments. To detect humans, one shape-dependent and one shape-independent method were proposed.

Algorithms to recognize the intention of humans based on hand gestures were also developed. 3D hand gestures were recognized by first detecting and tracking hands in a sequence of depth images, and then utilizing optical flow constraint equations.

The thesis also presents methods to answer human queries about objects and their spatial relations in images. The solution was developed by merging a deep learning based method for object detection and recognition with natural language processing techniques.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2019. p. 88
Series
Report / UMINF, ISSN 0348-0542 ; 19.08
Keywords [en]
Computer vision, Deep Learning, Harvesting Robots, Automatic Detection and Recognition
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-165069
ISBN: 978-91-7855-147-7 (print)
OAI: oai:DiVA.org:umu-165069
DiVA, id: diva2:1368781
Public defence
2019-12-05, MA121, MIT Building, Umeå, 13:00 (English)
Available from: 2019-11-14 Created: 2019-11-08 Last updated: 2019-11-12 Bibliographically approved
List of papers
1. Detection of Trees Based on Quality Guided Image Segmentation
2014 (English) In: Second International Conference on Robotics and associated High-technologies and Equipment for Agriculture and forestry (RHEA-2014): New trends in mobile robotics, perception and actuation for agriculture and forestry / [ed] Pablo Gonzalez-de-Santos and Angela Ribeiro, RHEA Consortium, 2014, p. 531-540. Conference paper, Published paper (Refereed)
Abstract [en]

Detection of objects is crucial for any autonomous field robot or vehicle. Typically, object detection is used to avoid collisions when navigating, but detection capability is essential also for autonomous or semi-autonomous object manipulation such as automatic gripping of logs with harvester cranes used in forestry. In the EU financed project CROPS, special focus is given to detection of trees, bushes, humans, and rocks in forest environments. In this paper we address the specific problem of identifying trees using color images. The presented method combines algorithms for seed point generation and segmentation similar to region growing. Both algorithms are tailored by heuristics for the specific task of tree detection. Seed points are generated by scanning a vertically compressed hue matrix for outliers. Each of these seed points is then used to segment the entire image into segments with pixels similar to a small surrounding around the seed point. All generated segments are refined by a series of morphological operations, taking into account the predominantly vertical nature of trees. The refined segments are evaluated by a heuristically designed quality function. For each seed point, the segment with the highest quality is selected among all segments that cover the seed point. The set of all selected segments constitutes the identified tree objects in the image. The method was evaluated with images containing in total 197 trees, collected in forest environments in northern Sweden. In this preliminary evaluation, precision in detection was 81% and the recall rate 87%.
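To make the two-stage idea concrete, the following is a minimal Python sketch of seed point generation over a vertically compressed hue matrix followed by region growing. It is an illustration only: the outlier rule, seed placement, and similarity tolerance are simplified stand-ins for the paper's heuristics, and the morphological refinement and quality function are omitted.

```python
# Sketch: seed points from a vertically compressed hue matrix, then
# 4-connected region growing around each seed. `hue` is a float array
# (H x W) with values in [0, 1].
from collections import deque

import numpy as np


def seed_points(hue, k=2.0):
    """Scan the column means of the hue matrix for outlier columns."""
    col_mean = hue.mean(axis=0)  # vertical compression
    outliers = np.abs(col_mean - col_mean.mean()) > k * col_mean.std()
    # One seed per outlier column, at the row closest to that column's mean hue.
    return [(int(np.abs(hue[:, c] - col_mean[c]).argmin()), int(c))
            for c in np.flatnonzero(outliers)]


def grow_region(hue, seed, tol=0.05):
    """Grow a segment of pixels whose hue is close to the seed's local mean."""
    h, w = hue.shape
    r0, c0 = seed
    ref = hue[max(r0 - 1, 0):r0 + 2, max(c0 - 1, 0):c0 + 2].mean()
    mask = np.zeros((h, w), dtype=bool)
    mask[r0, c0] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < h and 0 <= cc < w and not mask[rr, cc] \
                    and abs(hue[rr, cc] - ref) < tol:
                mask[rr, cc] = True
                queue.append((rr, cc))
    return mask
```

In the paper, each grown segment would additionally be refined by morphological operations and scored by the quality function before the best segment per seed is kept.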

Place, publisher, year, edition, pages
RHEA Consortium, 2014
Keywords
Seed point, Image segmentation, Region growing
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:umu:diva-93290 (URN), 978-84-697-0248-2 (ISBN)
Conference
Second International Conference on Robotics and associated High-technologies and Equipment for Agriculture and forestry (RHEA-2014)
Funder
EU, FP7, Seventh Framework Programme, 246252
Available from: 2014-09-15 Created: 2014-09-15 Last updated: 2019-11-11 Bibliographically approved
2. Adaptive Image Thresholding of Yellow Peppers for a Harvesting Robot
2018 (English) In: Robotics, E-ISSN 2218-6581, Vol. 7, no 1, article id 11. Article in journal (Refereed) Published
Abstract [en]

The presented work is part of the H2020 project SWEEPER, with the overall goal to develop a sweet pepper harvesting robot for use in greenhouses. As part of the solution, visual servoing is used to direct the manipulator towards the fruit. This requires accurate and stable fruit detection based on video images. To segment an image into background and foreground, thresholding techniques are commonly used. The varying illumination conditions in the unstructured greenhouse environment often cause shadows and overexposure. Furthermore, the color of the fruits to be harvested varies over the season. All this makes it sub-optimal to use fixed pre-selected thresholds. In this paper we suggest an adaptive, image-dependent thresholding method. A variant of reinforcement learning (RL) is used with a reward function that computes the similarity between the segmented image and the labeled image to give feedback for action selection. The RL-based approach requires less computational resources than exhaustive search, which is used as a benchmark, and results in higher performance than a Lipschitzian based optimization approach. The proposed method also requires fewer labeled images than other methods. Several exploration-exploitation strategies are compared, and the results indicate that the Decaying Epsilon-Greedy algorithm gives the highest performance for this task. The highest performance with the Epsilon-Greedy algorithm (ϵ = 0.7) reached 87% of the performance achieved by exhaustive search, with 50% fewer iterations than the benchmark. The performance increased to 91.5% using the Decaying Epsilon-Greedy algorithm, with 73% fewer iterations than the benchmark.
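As an illustration of the exploration-exploitation mechanism described above, here is a minimal Python sketch that treats threshold selection as a stateless bandit with a decaying epsilon-greedy policy, using Jaccard overlap with the labeled image as the reward. The candidate threshold set, decay rate, and reward function are assumptions for the sketch, not the paper's exact design.

```python
# Sketch: decaying epsilon-greedy selection of an image threshold,
# rewarded by similarity between the thresholded image and its label.
import numpy as np


def jaccard(pred, label):
    inter = np.logical_and(pred, label).sum()
    union = np.logical_or(pred, label).sum()
    return inter / union if union else 0.0


def pick_threshold(image, label, thresholds, iters=200, eps0=0.7, decay=0.99, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(len(thresholds))  # running value estimate per threshold
    n = np.zeros(len(thresholds))  # times each threshold was tried
    eps = eps0
    for _ in range(iters):
        # Explore with probability eps, otherwise exploit the best estimate.
        a = int(rng.integers(len(thresholds))) if rng.random() < eps else int(q.argmax())
        reward = jaccard(image > thresholds[a], label > 0)
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]  # incremental mean update
        eps *= decay                    # the "decaying" part of the strategy
    return thresholds[int(q.argmax())]
```

With decay = 1.0 this reduces to the plain Epsilon-Greedy strategy that the paper uses as a comparison point.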

Place, publisher, year, edition, pages
MDPI, 2018
Keywords
reinforcement learning, Q-Learning, image thresholding, ϵ-greedy strategies
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Analysis
Identifiers
urn:nbn:se:umu:diva-144513 (URN), 10.3390/robotics7010011 (DOI), 000432680200008 ()
Funder
EU, Horizon 2020, 644313
Available from: 2018-02-05 Created: 2018-02-05 Last updated: 2019-11-11 Bibliographically approved
3. Integrating Kinect depth data with a stochastic object classification framework for forestry robots
2012 (English) In: Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics: Volume 2, SciTePress, 2012, p. 314-320. Conference paper, Published paper (Other academic)
Place, publisher, year, edition, pages
SciTePress, 2012
National Category
Robotics
Identifiers
urn:nbn:se:umu:diva-71443 (URN)
Conference
9th International Conference on Informatics in Control, Automation and Robotics, 28-31 July 2012, Rome, Italy
Available from: 2013-05-29 Created: 2013-05-29 Last updated: 2019-11-11 Bibliographically approved
4. Detection and classification of Root and Butt-Rot (RBR) in Stumps of Norway Spruce Using RGB Images and Machine Learning
2019 (English) In: Sensors, ISSN 1424-8220, E-ISSN 1424-8220, Vol. 19, no 7, article id 1579. Article in journal (Refereed) Published
Abstract [en]

Root and butt-rot (RBR) has a significant impact on both the material and economic outcome of timber harvesting, and therewith on the individual forest owner and collectively on the forest and wood processing industries. An accurate recording of the presence of RBR during timber harvesting would enable a mapping of the location and extent of the problem, providing a basis for evaluating spread in a climate anticipated to enhance pathogenic growth in the future. Therefore, a system to automatically identify and detect the presence of RBR would constitute an important contribution to addressing the problem without increasing workload complexity for the machine operator. In this study, we developed and evaluated an approach based on RGB images to automatically detect tree stumps and classify them as to the absence or presence of rot. Furthermore, since knowledge of the extent of RBR is valuable in categorizing logs, we also classify stumps into three classes of infestation: rot = 0%, 0% < rot < 50%, and rot ≥ 50%. In this work we used deep-learning approaches and conventional machine-learning algorithms for the detection and classification tasks. The results showed that tree stumps were detected with a precision of 95% and a recall of 80%. Using only the correct output (TP) of the stump detector, stumps without and with RBR were correctly classified with an accuracy of 83.5% and 77.5%, respectively. Classifying rot into three classes resulted in 79.4%, 72.4%, and 74.1% accuracy for stumps with rot = 0%, 0% < rot < 50%, and rot ≥ 50%, respectively. With some modifications, the developed algorithm could be used either during the harvesting operation to detect RBR regions on the tree stumps or as an RBR detector for post-harvest assessment of tree stumps and logs.
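The abstract implies a two-stage pipeline: detect stumps first, then classify rot only on the detected crops. A minimal Python sketch of that structure follows; `detect_stumps` and `classify_rot` are hypothetical stand-ins for the paper's deep-learning and machine-learning models, and the confidence filter is an assumption for the sketch.

```python
# Sketch: stump detection followed by rot classification on the crops.
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple   # (x0, y0, x1, y1) in pixel coordinates
    score: float


def rot_class(rot_fraction):
    """Map a rot fraction to the paper's three infestation classes."""
    if rot_fraction == 0.0:
        return 0  # rot = 0%
    return 1 if rot_fraction < 0.5 else 2  # 0% < rot < 50%, or rot >= 50%


def assess_image(image, detect_stumps, classify_rot, min_score=0.5):
    """Run the detector, then classify rot for each confident stump crop."""
    results = []
    for det in detect_stumps(image):
        if det.score < min_score:  # keep only confident detections
            continue
        x0, y0, x1, y1 = det.box
        results.append((det.box, classify_rot(image[y0:y1, x0:x1])))
    return results
```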

Place, publisher, year, edition, pages
MDPI, 2019
Keywords
deep learning, forest harvesting, tree stumps, automatic detection and classification
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Analysis
Identifiers
urn:nbn:se:umu:diva-157716 (URN), 10.3390/s19071579 (DOI), 000465570700098 (), 30939827 (PubMedID)
Projects
PRECISION
Funder
The Research Council of Norway, NFR281140
Available from: 2019-04-01 Created: 2019-04-01 Last updated: 2019-11-11 Bibliographically approved
5. A Direct Method for 3D Hand Pose Recovery
2014 (English) In: 22nd International Conference on Pattern Recognition, 2014, p. 345-350. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a novel approach for performing intuitive 3D gesture-based interaction using depth data acquired by Kinect. Unlike current depth-based systems that focus only on the classical gesture recognition problem, we also consider 3D gesture pose estimation for creating immersive gestural interaction. In this paper, we formulate the gesture-based interaction system as a combination of two separate problems, gesture recognition and gesture pose estimation. We focus on the second problem and propose a direct method for recovering hand motion parameters. Based on the range images, a new version of the optical flow constraint equation is derived, which can be utilized to directly estimate 3D hand motion without the need to impose other constraints. Our experiments illustrate that the proposed approach performs accurately in real time. As a proof of concept, we demonstrate the system performance in 3D object manipulation. This application is intended to explore the system capabilities in real-time biomedical applications. Finally, a system usability test was conducted to evaluate the learnability, user experience, and interaction quality of 3D interaction in comparison to 2D touch-screen interaction.
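For orientation, the classical relation that such derivations build on can be written as follows; this is a standard form from the optical/range flow literature, shown as a hedged reconstruction rather than the paper's exact equation. Brightness constancy for an intensity image gives $I_x u + I_y v + I_t = 0$; for a depth map $Z(x, y, t)$, the analogous range flow constraint keeps the out-of-plane velocity on the right-hand side:

\[
Z_x\,u + Z_y\,v + Z_t = w,
\]

where $(u, v, w)$ is the 3D velocity of the observed surface point and $Z_x, Z_y, Z_t$ are the partial derivatives of the depth map.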

Series
International Conference on Pattern Recognition, ISSN 1051-4651
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:umu:diva-108475 (URN), 10.1109/ICPR.2014.68 (DOI), 000359818000057 (), 978-1-4799-5208-3 (ISBN)
Conference
22nd International Conference on Pattern Recognition (ICPR), 24–28 August 2014, Stockholm, Sweden
Available from: 2015-09-14 Created: 2015-09-11 Last updated: 2019-11-11 Bibliographically approved
6. Human Detection Based on Infrared Images in Forestry Environments
2016 (English) In: Image Analysis and Recognition (ICIAR 2016): 13th International Conference, ICIAR 2016, in Memory of Mohamed Kamel, Póvoa de Varzim, Portugal, July 13-15, 2016, Proceedings, 2016, p. 175-182. Conference paper, Published paper (Refereed)
Abstract [en]

It is essential to have a reliable system that detects humans in close range of forestry machines, so that cutting or carrying operations can be stopped to prevent any harm to humans. Due to the lighting conditions and high occlusion from the vegetation, human detection using RGB cameras is difficult. This paper introduces two human detection methods for forestry environments using a thermal camera; one shape-dependent and one shape-independent approach. Our segmentation algorithm estimates the location of the human by extracting vertical and horizontal borders of regions of interest (ROIs). Based on the segmentation results, features such as the ratio of height to width and the location of the hottest spot are extracted for the shape-dependent method. For the shape-independent method, all extracted ROIs are resized to the same size, and the pixel values (temperatures) are then used as the set of features. The features from both methods are fed into different classifiers and the results are evaluated using side-accuracy and side-efficiency. The results show that by using shape-independent features, based on three consecutive frames, we reach a precision of 80% and a recall of 76%.
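The shape-independent branch lends itself to a compact sketch: every ROI is resized to a fixed grid and its raw pixel temperatures become the feature vector for a classifier. The grid size and the SVM below are assumptions for illustration; the paper compares several classifiers.

```python
# Sketch: shape-independent features from thermal ROIs, fed to an SVM.
import numpy as np
from skimage.transform import resize
from sklearn.svm import SVC


def roi_features(roi, shape=(16, 8)):
    """Resize a thermal ROI to a fixed size and flatten its temperatures."""
    return resize(roi, shape, anti_aliasing=True).ravel()


def train_human_classifier(rois, labels):
    """Fit a classifier on ROI features; labels: 1 = human, 0 = not human."""
    X = np.stack([roi_features(r) for r in rois])
    return SVC(kernel="rbf").fit(X, labels)
```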

Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9730
Keywords
Human detection, Thermal images, Shape-dependent, Shape-independent, Side-accuracy, Side-efficiency
National Category
Robotics
Identifiers
urn:nbn:se:umu:diva-124428 (URN), 10.1007/978-3-319-41501-7_20 (DOI), 000386604000020 (), 978-3-319-41501-7 (ISBN), 978-3-319-41500-0 (ISBN)
Conference
13th International Conference on Image Analysis and Recognition, ICIAR 2016, July 13-15, 2016, Póvoa de Varzim, Portugal
Available from: 2016-08-10 Created: 2016-08-10 Last updated: 2019-11-11 Bibliographically approved
7. Natural Language Guided Object Retrieval in Images
2019 (English) In: Sensors, ISSN 1424-8220, E-ISSN 1424-8220. Article in journal (Refereed) Submitted
Abstract [en]

In this paper we propose a method for generating responses to natural language queries regarding objects and their spatial relations in given images. The responses comprise identification of objects in the image and generation of appropriate text answering the query. The proposed method uses a pre-trained neural network (YOLO) for object detection, combined with natural language processing of the given queries. Probabilistic measures are constructed for object classes, spatial relations, and word similarity such that the most likely grounding of the query can be determined. By computing semantic similarity, our method overcomes the problem of the limited number of object classes in pre-trained network models, while achieving flexibility regarding the varying ways users express spatial relations. The method was implemented and evaluated by 30 test users, who considered 81.9% of the generated answers to be correct. The work may be applied wherever visual input (images or video) and natural language input (speech or text) have to be related to each other. For example, processing of videos may benefit from functionality that relates audio to visual content. Urban Search and Rescue (USAR) robots are used to find people in catastrophic situations such as flooding or earthquakes; it would be very beneficial if such a robot were able to respond to verbal questions from the operator about what the robot sees with its remote cameras.
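As a toy illustration of grounding a spatial query against detector output, the sketch below scores candidate objects by label similarity combined with a single spatial relation ("left of"). The paper uses YOLO detections and semantic word similarity; here difflib string similarity is a stand-in for the semantic measure, and the detections are hypothetical.

```python
# Sketch: ground "the <target> left of the <anchor>" in a list of detections.
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity as a stand-in for semantic word similarity."""
    return SequenceMatcher(None, a, b).ratio()


def center_x(box):
    x0, _, x1, _ = box
    return (x0 + x1) / 2


def ground_left_of(target, anchor, detections, anchor_min=0.7):
    """Return the detection best matching `target` that lies left of `anchor`."""
    anchors = [d for d in detections if similarity(d["label"], anchor) > anchor_min]
    best, best_score = None, 0.0
    for det in detections:
        if any(center_x(det["box"]) < center_x(a["box"]) for a in anchors):
            score = similarity(det["label"], target) * det["score"]
            if score > best_score:
                best, best_score = det, score
    return best


# Hypothetical detections: the query "cup left of the laptop" resolves to the cup.
dets = [{"label": "cup", "box": (10, 40, 60, 90), "score": 0.90},
        {"label": "laptop", "box": (120, 30, 300, 200), "score": 0.95}]
print(ground_left_of("cup", "laptop", dets))
```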

Place, publisher, year, edition, pages
MDPI, 2019
Keywords
convolutional neural network, natural language grounding, object retrieval, spatial relations, semantic similarity
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:umu:diva-165065 (URN)
Available from: 2019-11-08 Created: 2019-11-08 Last updated: 2019-12-10

Open Access in DiVA

fulltext (FULLTEXT02.pdf, 776 kB)
spikblad (SPIKBLAD01.pdf, 343 kB)
