Fayez Tarsha Kurdi; Wijdan Amakhchan; Zahra Gharineiat

doi:10.19080/IJESNR.2021.28.556234

Mini Review

Random Forest Machine Learning Technique for Automatic Vegetation Detection and Modelling in LiDAR Data

Fayez Tarsha Kurdi¹*, Wijdan Amakhchan² and Zahra Gharineiat³

¹Institute of Integrated and Intelligent Systems, Griffith University, Australia

²Faculté des Sciences et Techniques de Tanger

³School of Civil Engineering & Surveying, Faculty of Health, Engineering and Sciences, University of Southern Queensland, Australia

Submission: May 12, 2021; Published: June 04, 2021

*Corresponding author: Fayez Tarsha Kurdi, Research Fellow, Institute of Integrated and Intelligent Systems, Griffith University, Nathan, QLD 4111, Australia

How to cite this article: Fayez Tarsha K, Wijdan A, Zahra G. Random Forest Machine Learning Technique for Automatic Vegetation Detection and Modelling in LiDAR Data. Int J Environ Sci Nat Res. 2021; 28(2): 556234. DOI: 10.19080/IJESNR.2021.28.556234

Abstract

Machine learning techniques have gained a prominent position in the automatic processing of Light Detection and Ranging (LiDAR) data. They represent a current research topic in the remote sensing domain. This paper presents one supervised machine learning method, called Random Forest. The algorithm is discussed, and its primary applications in automatic vegetation extraction and modelling from LiDAR data are presented.

Keywords: LiDAR; Random forest; Classification; Modelling

Introduction

Nowadays, Light Detection and Ranging (LiDAR) data holds a leading position among remote sensing data sources [1]. Automatic vegetation detection and modelling in forest and urban areas is one of the important envisaged applications of LiDAR data. In fact, automatic tree detection belongs to the broader topic of automatic LiDAR data classification. The scanned scene may contain man-made objects such as buildings, bridges, roads and dams, as well as natural classes such as vegetation, terrain, rivers and lakes. In order to model the project area, its LiDAR point cloud has to be classified according to the main classes [2]. Once the classification has been achieved successfully, the next step is to model each class separately. For vegetation detection and modelling, classic approaches employed in the literature include the RANdom SAmple Consensus (RANSAC) algorithm [3,4], the local maximum algorithm [5], the surface growing algorithm and multiple echo analysis [6], the voxel layer single tree modelling algorithm [7], the morphological algorithm [8] and the analysis of full-waveform LiDAR data [9].

Recently, machine learning has enhanced the automatic processing of LiDAR data. It has quickly become widespread and now occupies a major position relative to the classical approaches. This paper presents one machine learning method that is widely applied for automatic vegetation recognition and modelling in the LiDAR data field: Random Forest (RF). The paper aims to summarize the principle of this technique in addition to its main applications in the LiDAR data field.

The next section discusses the Random Forest technique.

Random Forest

RF is an ensemble supervised learning algorithm used for classification and regression in predictive modelling [10]. It gathers the predictions of several decision trees and chooses as the final output either the mode of the predicted classes (the class that appears most often in the set of decision tree results) for classification, or the mean prediction for regression.
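The aggregation step described above can be sketched in a few lines. This is only an illustration: the function names and the per-tree outputs are hypothetical and are not taken from the cited works.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the class label predicted most often by the trees (the mode)."""
    # On a tie, Counter.most_common keeps the label that was seen first.
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_outputs):
    """Return the mean of the numeric predictions made by the trees."""
    return mean(tree_outputs)


# Hypothetical outputs of five trees for one LiDAR point (classification)
votes = ["vegetation", "building", "vegetation", "terrain", "vegetation"]
# Hypothetical outputs of four trees for one tree height in metres (regression)
heights = [12.0, 14.0, 13.0, 13.0]

predicted_class = aggregate_classification(votes)
predicted_height = aggregate_regression(heights)
```

Here three of the five trees vote for "vegetation", so that class is returned, and the regression output is the average of the four height estimates.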

RF works by splitting the dataset into two parts, the training set and the test set. Multiple samples are then randomly selected from the training set. Next, a decision tree is built for each sample, dividing each selection into two daughter nodes using the best split. This last step is repeated, and finally each tree's prediction counts as a vote and the most voted prediction is selected as the final result (Figure 1).
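The workflow above can be illustrated with a deliberately small sketch. It makes several assumptions that are not in the source: each tree is limited to a single split (a decision stump), the best split is chosen by Gini impurity, the random feature subsets used by full RF implementations are omitted, and the two features (height above ground in metres, number of returns) and all values are hypothetical toy data.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Find the split into two daughters with the lowest weighted Gini impurity."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold: midpoint between neighbours
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no split possible: always predict the majority class
        label = majority(labels)
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_forest(rows, labels, n_trees, rng):
    """Train one stump per bootstrap sample drawn from the training set."""
    forest, n = [], len(rows)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Each stump votes; the most voted class is the final result."""
    return majority([predict_stump(s, row) for s in forest])


# Hypothetical toy points: (height above ground in m, number of returns)
data = [
    ((0.1, 1), "ground"), ((6.0, 3), "vegetation"),
    ((0.3, 1), "ground"), ((7.5, 2), "vegetation"),
    ((0.0, 1), "ground"), ((9.0, 4), "vegetation"),
    ((0.4, 1), "ground"), ((6.8, 3), "vegetation"),
    ((0.2, 1), "ground"), ((8.1, 2), "vegetation"),
    ((0.1, 1), "ground"), ((7.0, 3), "vegetation"),
]
train, test = data[:8], data[8:]  # fixed split into training and test sets

forest = fit_forest([r for r, _ in train], [y for _, y in train],
                    n_trees=25, rng=random.Random(0))
predictions = [predict_forest(forest, r) for r, _ in test]
```

The split is fixed rather than random so that the sketch is reproducible. An odd number of trees avoids tied votes in this two-class case.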

The main hyperparameters of Random Forest are used either to increase the predictive power of the model or to make the model faster [11]. In this context, a higher number of trees can increase performance and make the predictions more stable, but the processing time becomes longer. Furthermore, tuning the maximum number of features and the minimum number of leaf samples required to split an internal node may improve the algorithm's performance. Once the training step is complete, the trained model can be applied to a dataset that was not used for training. This procedure makes it possible to obtain the model's predictions and compare them with the expected values [12].
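Comparing predictions on held-out data with the expected values can be sketched as follows. The metric choice (overall accuracy plus counts of each expected/predicted pair) and the label lists are assumptions made for illustration; the source does not prescribe a specific evaluation measure.

```python
from collections import Counter


def accuracy(expected, predicted):
    """Fraction of held-out points whose predicted class matches the expected one."""
    if len(expected) != len(predicted) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


def confusion_counts(expected, predicted):
    """Count how often each (expected, predicted) pair of classes occurs."""
    return Counter(zip(expected, predicted))


# Hypothetical labels for six points that were not used for training
expected = ["vegetation", "vegetation", "building", "terrain", "vegetation", "building"]
predicted = ["vegetation", "building", "building", "terrain", "vegetation", "terrain"]

acc = accuracy(expected, predicted)
confusion = confusion_counts(expected, predicted)
```

Four of the six points are classified correctly here, and the pair counts show which classes are confused with one another, for example one vegetation point labelled as building.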

In the literature, many authors applied RF exclusively to LiDAR data [13,14], whereas other authors used additional data together with the LiDAR point cloud as input for the RF algorithm [15,16]. From another viewpoint, several applications on LiDAR data were achieved using the RF technique. Yu et al. [14] suggested an approach for estimating tree characteristics such as height, diameter, and stem volume using LiDAR data. To achieve this goal, RF is used as a classifier. Levick et al. [17] fused the Digital Surface Model (DSM) calculated from the LiDAR point cloud and field-measured wood volume using the RF algorithm. Chen et al. [13] used a feature selection method and the RF algorithm for forested landslide detection. For this purpose, the Digital Terrain Model (DTM) and the slope model were established for the scanned scene, and the selected features were calculated at the pixel level. The same principle was applied by Guan et al. [18] to classify city components in urban zones.
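As an illustration of a pixel-level feature derived from a DTM, the sketch below computes slope with central finite differences. This is one common way to build a slope model and is an assumption here; it is not claimed to be the procedure used in [13], and the grid values and cell size are hypothetical.

```python
import math


def slope_degrees(dtm, cell_size):
    """Slope in degrees at interior cells of a DTM grid (None on the border)."""
    n_rows, n_cols = len(dtm), len(dtm[0])
    slopes = [[None] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            # Central differences of elevation along columns (x) and rows (y)
            dz_dx = (dtm[i][j + 1] - dtm[i][j - 1]) / (2 * cell_size)
            dz_dy = (dtm[i + 1][j] - dtm[i - 1][j]) / (2 * cell_size)
            slopes[i][j] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slopes


# Hypothetical 3 x 3 DTM (elevations in metres) rising 1 m per 1 m cell along x
dtm = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
]
slopes = slope_degrees(dtm, cell_size=1.0)
center_slope = slopes[1][1]
```

For this inclined plane the centre cell has a gradient of 1 along x and 0 along y, which corresponds to a slope of about 45 degrees. Border cells are left undefined because central differences need neighbours on both sides.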

RF has been broadly used for vegetation detection in forest and urban areas. Niemeyer et al. [19] classified the scanned city elements by integrating an RF classifier into a Conditional Random Field (CRF) framework. Moreover, Man et al. [16] extracted grasses and trees in urban areas using airborne LiDAR and hyperspectral data. RF and object-based classification methods were employed together to extract the distribution map of urban vegetation. Other studies [20,21] underlined the efficiency of RF for vegetation detection in forest and urban areas. Huang & Zhu [15] developed an approach for fusing hyperspectral images and LiDAR data based on RF. In this context, each feature is ranked by RF, and the more useful features are selected as inputs to RF for data classification.
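The rank-then-select idea can be sketched as below, assuming that an importance score per feature has already been produced by a trained RF model. The feature names, the scores and the choice of keeping the top two features are hypothetical and do not come from [15].

```python
def select_top_features(importances, k):
    """Rank features by importance (highest first) and keep the k best names."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def project(rows, names):
    """Keep only the selected features of each sample, in ranked order."""
    return [[row[name] for name in names] for row in rows]


# Hypothetical importance scores, e.g. as reported by a trained RF model
importances = {
    "height_above_ground": 0.40,
    "intensity": 0.10,
    "ndvi": 0.35,
    "return_number": 0.15,
}
# Hypothetical samples combining LiDAR and hyperspectral features
samples = [
    {"height_above_ground": 7.2, "intensity": 31.0, "ndvi": 0.81, "return_number": 2},
    {"height_above_ground": 0.1, "intensity": 55.0, "ndvi": 0.12, "return_number": 1},
]

selected = select_top_features(importances, k=2)
reduced = project(samples, selected)
```

The reduced samples would then serve as input to a second RF run that performs the actual classification.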

Conclusion

RF is an efficient machine learning technique that can be used for automatic vegetation extraction and modelling from LiDAR data in forests and urban zones. In this context, LiDAR data can be used exclusively or together with supplementary data such as field measurements and hyperspectral data.

IJESNR.MS.ID.556234
