Data Readiness and Data Exploration for Successful Power Line Inspection

Eldad Antwi-Bekoe; Gerald Tietaa Maale; Ezekiel Mensah Martey; William Asiedu; Gabriel Nyame; Emmanuel Frimpong Nyamaah

doi:10.5772/intechopen.112637

Abstract

Sufficiently large, curated, and representative training data remains key to the successful implementation of deep learning applications for wide-scale power line inspection. However, most researchers have offered limited insight regarding the inherent readiness of the knowledge bases that drive power line algorithm development. In most cases, these high-dimensional datasets are also unexplored before modeling. In this article, a power line image data readiness (PLIDaR) scale for AI algorithm development is proposed. Using the PLIDaR benchmark, this study analyzes the fundamental steps involved in preparing overhead transmission power line (OTPL) insulator image data for deep supervised learning algorithm development. A data visualization approach is implemented by reengineering the ground truth instance annotations of two recent public insulator datasets, while exploratory data analysis is also employed by implementing a robust dimensionality reduction technique to optimize construction, visualization, clustering, and analysis of these recent insulator datasets in a lower dimensional space. The implementations reveal representational variabilities and hidden patterns that could be exploited to improve data quality before predictive modeling. Moreover, the visualizations from the dimensionality reduction technique have the potential to help develop classifiers that are more reliable.

Keywords

preprocessing
dimensionality reduction
insulator image data
data curation
transmission power line

Author Information

Eldad Antwi-Bekoe*
- AAMUSTED, Kumasi, Ghana
Gerald Tietaa Maale
- University of Electronic Science and Technology of China, Chengdu, China
Ezekiel Mensah Martey
- University of Mines and Technology, Takoradi, Ghana
William Asiedu
- AAMUSTED, Kumasi, Ghana
Gabriel Nyame
- AAMUSTED, Kumasi, Ghana
Emmanuel Frimpong Nyamaah
- AAMUSTED, Kumasi, Ghana

*Address all correspondence to: eabekoe@aamusted.edu.gh

1. Introduction

Research efforts to provide enhanced computer-aided solutions for recognition and diagnosis of overhead power line insulators have evolved over the past decade. Interest in these efforts is sustained by the significant role insulators play in power transmission and distribution networks. This role includes maintaining quality service delivery, forestalling costly power interruptions, providing safety guarantees to both humans and the environment, and so on. The ever-increasing global need for uninterrupted power supply by electricity consumers also imposes a crucial necessity for new and efficient inspection techniques. Compared with relatively older learning algorithms, deep learning (DL) [1] stands as the potential catalyst to satisfy diagnostic demands due to its recent impressive performance. However, DL currently lacks a mechanism for learning abstractions through explicit, verbal definition and works best when there are thousands, millions, or even billions of training examples [2].

Supervised learning has recently been the predominant approach for insulator inspection in the machine learning (ML) literature. This means that, as a prerequisite, insulator image data needs to be preprocessed into the format required to facilitate supervised or semi-supervised learning. In summary, data must be ready in order to ‘make sense’ for a DL model. As standard evaluated baselines, generic and successful datasets such as ImageNet [3] and COCO [4] have offered impactful experiences. Currently, to the authors’ knowledge, the only existing public insulator datasets are the Chinese Power Line Insulator Dataset (CPLID) [5] and the Electric Power Research Institute (EPRI) Insulator Defect Image Dataset (EPRI-IDID V1.1) [6], both of which are recent. Moreover, there is no study that provides a guideline or procedure for overhead transmission power line (OTPL) image data preparation for deep supervised learning at the preprocessing step. Furthermore, high-dimensional datasets are very difficult to visualize or understand. The only two recently released box-level public insulator datasets offer no understanding of their internal structure. Visualizing and understanding the internal structure of a high-dimensional dataset before predictive modeling can be beneficial for decision-making on data and for improving data quality for algorithm development. Visualization techniques offer a deeper understanding of datasets by uncovering patterns, trends, and relationships that are not readily apparent in the raw data and by identifying potential outliers, anomalies, and errors.

In this article, motivated by the accessibility and interoperability tenets of FAIR (Findable, Accessible, Interoperable, and Reusable) [7], a Power Line Image Data Readiness (PLIDaR) scale for AI algorithm development, with a focus on insulator image data, is proposed to highlight insulator image data collection and preparation essentials. Other contributions are:
- Using a data visualization approach, the shapes of two recent public insulator datasets are evaluated by reengineering their ground truth instance annotations to visualize, analyze, and compare them.
- Implementation of a supervised dimensionality reduction technique using the topological data analysis tool UMAP, to visualize, analyze, and compare two public high-dimensional insulator datasets while preserving the structure of the datasets.
- Implementation of a clustering task that reveals hidden patterns and outliers, suggesting room for data quality improvements and possibilities for refinement of the predictive modeling process, for more precise results.

The proposed PLIDaR benchmark aims to bridge the gap in existing literature regarding data collection and preparation essentials for OTPL image data inspection. The concept of Power Distribution Equipment Identifier (PDEID) integration is introduced, which could have potential benefits for data analysis. Using custom image samples of insulators from industry, the pivotal role that two randomly selected labeling tools play in the data preparation process for three vision tasks is illustrated. Though this presentation focuses on insulators, the proposed methodology encompasses almost all essential OTPL components of inspection interest, such as spacers, conductor fittings, and other nuisances on power towers, including nests and so forth.

2. Data readiness - PLIDaR scale

Crowdsourcing (e.g., for annotations) and collaborative annotation have increasingly involved participants with multidisciplinary backgrounds. This approach has been a common practice for the image data preparation process in recent times. However, these participants are usually given little insight into the significance of their task. Several organizations have emphasized the need for AI development within a country to happen at the grassroots level so that those implementing AI as a solution understand the context of the problem being solved [8, 9]. To better allow interdisciplinary conversations and bridge the educational gap on the inherent readiness of data, this article conforms to accessibility and interoperability standards, as highlighted by the FAIR data principles [7]. A power line image data readiness (PLIDaR) scale is proposed as a basic standard of appraisal reference for insulator image data. PLIDaR is inspired by the three-point scale schema in [10] but takes into account additional and specific requirements relating to the context of power line image data. The relevant entities are formulated into context by excluding nonrelevant processes and reconstructing the MIDaR scale [11] into a four-level context-aligned PLIDaR scale, shown in Figure 1. It is believed that the PLIDaR scale will serve as a standard reference during collaborative academic, technical, and business conversations (including preparedness levels during grant seeking). Furthermore, it is expected to sustain a common level of understanding and facilitate assessment of the relevant stages involving data readiness with respect to power line image data, as well as AI development projects under insulator inspection.

Figure 1.
PLIDaR. The four-level power line image data readiness scale.

2.1 PLIDaR - level 1

The first level of the PLIDaR scale portrays aerial image data in its ‘natural habitat’: the industrial setting of OTPL, or, in some cases, a combination of industry and other sources as in [12, 13, 14, 15] and so on. Typically, these are elements that facilitate electric utility delivery or provide infrastructural support, without consideration for any kind of research. They are unverified in both quality and quantity. The elements of interest are yet to be inspected or tracked. This type of data is the most abundant of the four levels, yet it is not easily accessible to interested researchers of electric transmission systems globally. Level 1 data is the least valuable of all the data readiness levels and is unsuitable for supervised or semi-supervised machine learning tasks. Level 1 data is defined by the attributes summarized in Figure 1.

Access to level 1 inspection data is in itself often a daunting challenge for some interested researchers globally. This is because stakeholders of the image data acquisition phase usually act as a restricted coalition group, limiting access to approved researchers only and conducting initial tests within a closed system. Removing these barriers should enable interested researchers to gain access, increasing the possibilities for faster development of deep learning applications for OTPL inspection. Moreover, further efforts through consensus at an institutional, provincial, or nationwide level could enable the combination and transformation of these aerial image repositories into level 2 data. Such data could serve as a new industry or foundation on which algorithms can be built to enhance generalizability.

2.2 PLIDaR - level 2

This data level characterizes raw data obtained after ethical approval and subsequent data collection from source. Acquired data is typically subject to errors, omissions, noise, and artifacts affecting both image and value of metadata. Level 2 data is defined by the attributes summarized in Figure 1. Data is considered level 2 data after ethical approval, data access, and collection from the industrial setting of OTPL. Access to collected data is usually restricted by authorization at this level.

2.3 PLIDaR - level 3

Level 3 data is defined by the attributes summarized in Figure 1. Data controllers are fully aware of data quantity and quality of relevant dataset(s) at this stage. Large-scale errors in data format and structure are also accounted for at this level. Task-specific data is detached from inferior quality or unwanted data as well. To refine data to meet standard of level 3, the processes of querying, visualization, resampling, and initial control of quality are performed on level 2 data.

2.4 PLIDaR - level 4

Data at level 4 exemplifies readiness for algorithm learning. It is as close as possible to perfect data for algorithmic learning and development. It is better structured, fully annotated, has minimal noise, and, most importantly, is contextually appropriate and ready for a specific ML task [11]. Level 4 data is defined by the attributes summarized in Figure 1. While representing the apex of data readiness (and the most valuable level), level 4 data represents the smallest fraction of data volume compared to the other levels on the PLIDaR benchmark. This small fraction is accounted for by the cycle of quality control, which excludes inferior quality and some unwanted data, and by the lack of human annotators to provide adequate and well-labeled data. More often than not, researchers and the industry struggle to obtain enough level 4 data to provide robust statistical analysis. The context of the PLIDaR scale is extended in the subsequent section to provide a more elaborate description of the processes summarized in Figure 1.

3. Data collection and preparation cycle

3.1 Data acquisition - ethical consent

Before insulator images can be acquired for the development of an AI algorithm in most countries, certain local ethical procedures and checks are required. For example, institutional permission needs to be granted, regulatory checks passed, and data collection and usage compliance needs met. In some cases, an institutional review board may need to evaluate the potential risks to both the individual and environment, as well as benefits to the industry. This could be a lengthy process depending on the setup of the local ethics committee. Moreover, some jurisdictions may require authorization to fly certain classification of drones. In cases where data already exist, research participants escape bureaucratic procedures and explicit informed consent is generally waived. Where prospective data gathering is required, informed consent is necessary. The lack of adequate public datasets makes prospective data collection more prevalent compared to retrospective use cases.

Documenting clear objectives regarding the purpose and target task(s) to be undertaken, including any future objectives, before collecting data could prove beneficial. For instance, in image classification tasks, the assumption is that there is only one major object in the image, and recognizing its category is the objective. In object detection, semantic image segmentation, and instance segmentation tasks, not only the categories are of interest but also the multiple objects and their specific positions within the image. Drones equipped with multipurpose camera modes, such as infrared imaging capabilities alongside visible color images, could be considered. Due to the increasing complexity and diversity of vision research, it is recommended that objectives are carefully thought through and projected beyond the immediate research objectives to minimize prospective data collection cycles. An official approval from owners or stakeholders of power transmission equipment fulfills the conceptual ethical obligation phase and secures the mandate to begin the data collection process. This signifies entry into the level 2 phase of the PLIDaR data readiness scale.

3.2 Data collection overview

To the authors’ knowledge, all insulator image-based inspection research in the current literature is supported by offline inspection techniques. While Figure 2(b) provides an abstract illustration of a drone-based real-time inspection scheme (which remains a target of the technology industry), Figure 2(a) demonstrates the UAV-based (Unmanned Aerial Vehicle) offline inspection (and data collection) technique that forms the bedrock of existing research in the current literature.

Figure 2.
(a) Offline inspection technique. Existing methods have predominantly used offline inspection techniques for aerial image data collection on OTPL. Images are collected and stored on storage devices during inspection and processed later. (b) Online inspection technique. Embedded vision computing platform for UAVs and wireless communication will enable automated real-time communication and assessment.

3.3 Data collection

Most integrated cameras on UAVs are enabled with picture- or video-storing capabilities through external storage such as a USB drive or an SD card. The larger the storage capacity, the fewer times an on-site operator needs to swap media or dump footage. The risk of harm to both the individual and the environment, together with the risk of device damage from magnetic interference between the UAV and high-tension conductors, explains the central role of an experienced operator in the data collection procedure in most cases. For example, safe navigation through usually difficult geographic terrain and compliance with safe flight distances for drones have to be ensured. To our knowledge, existing reports from the literature have failed to link insulator samples to Power Distribution Equipment Identifiers (PDEIDs) during data preparation. We recommend a scheme of photographing the PDEID of each power tower ahead of every insulator image group for that tower. In this paper, PDEID is defined as the unique identification label used for exclusively identifying equipment such as a power tower in order to identify customers served by a transmission or distribution network. Establishing an image nomenclature system that associates PDEIDs with insulator image IDs should offer an additional level of data structuring (involving tagging) that could prove beneficial in tracking the location of a group of insulators linked to a tower spot within a mapped study area. Moreover, it adds an additional layer of semantics to data that could be useful in analytics, especially for instance segmentation tasks.
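The recommended association between PDEIDs and image IDs can be illustrated with a minimal sketch. The file-name pattern `<PDEID>_<sequence>.jpg`, the tower identifiers, and the function names below are hypothetical and chosen only for illustration; they are not taken from any of the datasets discussed here.

```python
import re
from collections import defaultdict

# Hypothetical pattern: <PDEID>_<4-digit sequence>.jpg, e.g. "TWR-0153_0007.jpg".
NAME_PATTERN = re.compile(r"^(?P<pdeid>[A-Z]+-\d+)_(?P<seq>\d{4})\.jpg$")


def make_image_name(pdeid: str, seq: int) -> str:
    """Build an image file name that embeds the tower's PDEID."""
    return f"{pdeid}_{seq:04d}.jpg"


def group_by_tower(file_names):
    """Map each PDEID to the image file names captured at that tower."""
    groups = defaultdict(list)
    for name in file_names:
        match = NAME_PATTERN.match(name)
        if match is None:
            continue  # name does not follow the scheme; leave it for manual review
        groups[match.group("pdeid")].append(name)
    return dict(groups)


names = [make_image_name("TWR-0153", i) for i in range(1, 4)]
names.append(make_image_name("TWR-0154", 1))
names.append("DSC_0099.jpg")  # untagged camera file, skipped by the grouping
towers = group_by_tower(names)
```

With such a scheme, every insulator image can be traced back to its tower from the file name alone, before any annotation file exists.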

After the data collection process, relevant data needs to be initially queried, access-secured, properly cleaned, and stored for further preprocessing. The next step is to link images to PDEIDs, annotate images (including labeling), and structure the data in homogenized and machine-readable formats. The final step is to link the images to ground-truth information, which can be one or more labels, bounding box, polygonal segments, and so on. The entire process to prepare power line aerial images for AI development is summarized in Figure 3.

Figure 3.
Summary of insulator image data acquisition and preparation process for AI algorithm development. Often, this process is undervalued and underappreciated; nonetheless, it is labor-intensive, takes up most of the ML development process, and remains the bedrock on which AI algorithm development thrives.

3.4 Data storage, querying, and resampling

3.4.1 Storage

Depending on the purpose of usage, multiple options exist for storing image datasets specifically: (1) store image files individually on a file system; (2) store a few aggregated files in a file system by putting multiple images into a file; (3) store images as values (or records) in a key-value storage (or relational database) [16]. Lim et al. [16] found inefficiencies in using image files on local file systems, including poor indexing when locating files, as well as inappropriate caching mechanisms when used as a back end to train deep neural networks on large datasets. In their study, key-value systems demonstrated more efficient caching and indexing for large training samples.
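Option (3) can be sketched with Python's built-in `sqlite3` module used as a simple key-value store. This is only an illustration of the idea, not the storage system evaluated in [16]; the table name, keys, and placeholder bytes are assumptions.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a single-table key-value store for encoded images."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images (key TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_image(conn, key, data):
    """Insert or overwrite the encoded image stored under `key`."""
    conn.execute(
        "INSERT OR REPLACE INTO images (key, data) VALUES (?, ?)", (key, data)
    )
    conn.commit()


def get_image(conn, key):
    """Return the stored bytes for `key`, or None if the key is absent."""
    row = conn.execute("SELECT data FROM images WHERE key = ?", (key,)).fetchone()
    return None if row is None else bytes(row[0])


store = open_store()
# Placeholder bytes stand in for a real encoded image file.
put_image(store, "TWR-0153_0001.jpg", b"\xff\xd8 placeholder bytes \xff\xd9")
payload = get_image(store, "TWR-0153_0001.jpg")
```

The primary key gives indexed lookup by image ID, which is the property the key-value option relies on.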

3.4.2 Querying

Collected data is highly unstructured and inconsistent, rendering it unfit for ML analysis. It will most likely contain accidental shots, including objects of no interest, severely truncated objects, damaged photos, and so on. Unstructured data needs to be queried, visualized, and cleaned to engineer resampled data that is ready to be annotated. At this stage, task-specific data is separated from unwanted or poor-quality data. Figure 4 shows some damaged and inferior quality image samples considered for removal during the data preprocessing stage for both insulator detection and segmentation tasks. The process of excluding unwanted and low-quality images from unprocessed data is a quality control measure, in pursuit of level 3 data readiness of the PLIDaR scale.
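A first automated pass over collected files can flag structurally damaged images before manual review. The sketch below assumes JPEG files and only checks the standard start-of-image (`FF D8`) and end-of-image (`FF D9`) markers; it cannot detect blur, truncated objects, or images without insulators, which still require visual inspection. The sample byte strings are synthetic.

```python
def looks_like_complete_jpeg(data: bytes) -> bool:
    """Cheap structural check: JPEG start-of-image and end-of-image markers."""
    return len(data) >= 4 and data.startswith(b"\xff\xd8") and data.endswith(b"\xff\xd9")


def screen(files):
    """Split a {name: bytes} mapping into (kept, rejected) lists of names."""
    kept, rejected = [], []
    for name, data in sorted(files.items()):
        (kept if looks_like_complete_jpeg(data) else rejected).append(name)
    return kept, rejected


# Synthetic byte strings standing in for files read from the storage medium.
samples = {
    "ok.jpg": b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9",
    "truncated.jpg": b"\xff\xd8" + b"\x00" * 16,  # end marker missing
    "not_jpeg.jpg": b"plain text",
}
kept, rejected = screen(samples)
```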

Figure 4.
(a) and (b) Sample damaged UAV images excluded from unprocessed data as a quality control measure, in pursuit of level 3 data readiness of the PLIDaR benchmark. (c) An accidental photographic image from an industrial field (with no insulator object) captured by UAV camera during an offline inspection.

4. Ground truth and label assignment

Current AI algorithms for OTPL image inspection are predominantly based on a supervised learning approach. This means that before a computer vision algorithm can be trained and tested, the ground truth must be defined and linked to the image. Ground truth here is restricted to the insulator component representing the actual target object for the deep supervised learning model, as reflected by the relevant dataset. Ground truth definition encompasses metadata pointers or tags that define the spatial boundaries of the object of relevance within an image. In contrast, labels are metadata tags (e.g., object classes) used to mark up insulator elements within the image dataset, providing pointers to what is relevant to be memorized by the vision algorithm. Figure 5 illustrates a typical user interface of labelImg¹, a tagging-based image-labeling tool with image annotation support for object detection tasks, while Figure 6 depicts the user interface of the VGG image annotator (VIA) [17], a browser-based tool that supports annotation of images for segmentation tasks (semantic and instance). VIA also supports the labeling of videos and voice records. Sager et al. [18] provide a fair analysis and a balanced decision template for annotation tool selection on several task-specific and multitasking computer vision tasks.

Figure 5.
A snapshot from the user interface of the labelImg image-labeling tool.

Figure 6.
A snapshot from the user interface of VGG image annotator.

4.1 Choosing appropriate label and ground truth definition

Ground truth labels intended to identify key features within images to enforce meaning during algorithmic training should be accurate. Creating predefined labels and ensuring they are consistent with conventions before annotating is invaluable. The concept of “garbage in is garbage out” with computers is especially the case with regard to machine learning [19]. Flawless (consistent) labels are likewise always necessary to validate algorithms, even when weak labels [20, 21] are used for training. Pointers or metadata tags chosen to spatially define boundaries for objects must also be appropriate. Appropriateness is defined by the use case in question, the required input format of the chosen algorithm, and the desired output. For example, bounding box annotations have predominantly been used for ground truth definition of insulator objects in detection tasks, with PASCAL VOC [22] and YOLO input formats dominating COCO [4] and other formats in use cases. While the annotation format for PASCAL VOC is embodied as an XML file, annotations for COCO detection tasks can be saved as a JSON file [23] in VOC-format or COCO-format (as a single file for the COCO dataset) [24]. These file formats, including others such as CSV, XLS, and so forth, define machine-readable content for a task.
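The relationship between the two bounding box conventions can be made concrete with a short sketch. PASCAL VOC stores each box as corner coordinates (`xmin`, `ymin`, `xmax`, `ymax`) in XML, whereas COCO stores `[x, y, width, height]` in JSON. The XML snippet, file name, category mapping, and function name below are made-up illustrations, not records from CPLID or EPRI-IDID.

```python
import xml.etree.ElementTree as ET

# Made-up PASCAL VOC annotation for a single image.
VOC_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>1152</width><height>864</height><depth>3</depth></size>
  <object>
    <name>insulator</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>700</xmax><ymax>320</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_coco(xml_text, image_id=1, category_ids=None):
    """Convert VOC corner boxes into COCO-style annotation dictionaries."""
    category_ids = {} if category_ids is None else category_ids
    root = ET.fromstring(xml_text)
    annotations = []
    for ann_id, obj in enumerate(root.iter("object"), start=1):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            # Assign category ids in order of first appearance (an assumption).
            "category_id": category_ids.setdefault(name, len(category_ids) + 1),
            "bbox": [xmin, ymin, width, height],  # COCO: top-left corner + size
            "area": width * height,
            "iscrowd": 0,
        })
    return annotations


coco_annotations = voc_to_coco(VOC_XML)
```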

Annotating pixel-wise segmentation instead of bounding boxes puts even higher pressure on the sustainability of manual annotation [22]. This is particularly the case with strain and suspension insulators on transmission lines given their slender, irregular, and morphological contour. However, polygon annotation can express more fine details of object shape, and define spatial expressiveness in a better refined segment for each object in a scene compared to coarse bounding box detection (see Figure 7). Moreover, annotating individual instances of a class can add more meaning to an image description and improve data structure. Semiautomated labeling tasks [25, 26] for large-scale datasets in different domains are increasingly gaining attention, with potential to nudge humans and assume the supervision role of filtering, selecting, and updating to ease the tedious (and repetitive) labeling burden. Moreover, automated labeling tasks [27, 28] hold potential to eliminate human-in-the-loop based on some predefined model of classes. Zhou et al. [27] affirm that fully automated labeling has the propensity to yield better results than manual image labels, as the latter is always subject to human bias and error.

Figure 7.
Linking images to ground truths: Context readiness exemplified by outputs from visualization tools to illustrate readiness for (a) object detection, (b) semantic object segmentation, and (c) instance segmentation.

4.2 The setup of manual labeling

Inferring from the inadequacy of data volumes in literature, the authors conjecture that manual labeling is the annotation approach of choice and crowd-sourcing is an unlikely annotation approach for power line image data preparation. Manual labeling task organization can be grouped into the options of single-user and collaborative labeling. Collaborative labeling is likely to stand as a check to minimize bias, as compromise adds another level of quality control in support of data readiness (at PLIDaR Level 3). Additional meta-data tags are added to image data at this stage. The process of annotation is interlaced with continual screening, selection, and updating. It enables an expert to review ground truth labels and definitions as a quality control measure.

4.3 Context readiness and machine-readability

Post-processing the annotated dataset for model training is crucial. Some annotation formats also accommodate additional metadata (e.g., captions) for images that may not be relevant for a particular machine learning pipeline. Post-processing may include creating a dataframe to eliminate metadata or data outliers that are not relevant for model training, parsing through the annotations to inspect and verify label accuracy and readiness of annotated images for training, and reformatting the predefined annotation format to other input formats (if necessary). The annotated dataset must also be split into training and test samples and linked to images. At this stage, training and testing data must be code-ready for other pre-training statistical analysis and documentation. Linking images to machine-readable ground truth provides an intuitive perspective on the behavior of a model on different classes, relative to the ground truth, in a much faster and more precise manner.
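The train/test split mentioned above can be sketched as a reproducible shuffle over image IDs, so that annotations remain linked to their images through the shared ID. The 80/20 ratio, the seed, and the function name are illustrative assumptions rather than settings reported in this chapter.

```python
import random


def split_dataset(image_ids, test_fraction=0.2, seed=0):
    """Shuffle image ids reproducibly and return (train_ids, test_ids)."""
    ids = sorted(image_ids)  # fixed starting order makes the split repeatable
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


def select_annotations(annotations, ids):
    """Keep only the annotation records whose image belongs to `ids`."""
    wanted = set(ids)
    return [a for a in annotations if a["image_id"] in wanted]


train_ids, test_ids = split_dataset(range(1, 11), test_fraction=0.2, seed=42)
```

Splitting by image ID rather than by annotation keeps all instances of one image on the same side of the split.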

For the same ground truth image, Figure 7(a) shows a GUI application preview of ground truth annotations obtained by importing the ground truth annotated image as an XML file, to be passed to the Faster RCNN [29] object detection model for training. Figure 7(b) shows a preprocessed single-channel input image mask generated from a JSON ground truth annotation file, to be passed to the UNet [30] semantic segmentation model. Figure 7(c) is obtained by importing the image dataset as a JSON file, to be passed to the Mask RCNN [31] instance segmentation model. The order of arrangement signifies the increasing semantic difficulty of the task (a to c). Notice that object instance segmentation (Figure 7c) aims to distinguish different instances of the same object class, as opposed to semantic segmentation (Figure 7b) and object detection (Figure 7a), which do not. The entire process of executing the attributes of level 3 PLIDaR data involves the craft of data molding, manipulation, and cleansing by expert knowledge that renders the image data machine ready.
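How a polygon annotation becomes a single-channel mask can be sketched in plain Python. The example assumes the polygon is available as two lists of vertex coordinates and rasterizes it by testing pixel centers with the even-odd rule; production pipelines would normally rely on an image library instead, and the tiny rectangle used here is synthetic.

```python
def point_in_polygon(px, py, xs, ys):
    """Even-odd rule test of a point against a polygon's vertices."""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Does the edge (j -> i) straddle the horizontal line through py?
        if (ys[i] > py) != (ys[j] > py):
            x_cross = xs[i] + (py - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])
            if px < x_cross:
                inside = not inside
        j = i
    return inside


def polygon_to_mask(xs, ys, width, height):
    """Rasterize one polygon into a 0/1 mask stored as a list of rows."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, xs, ys) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Synthetic rectangle spanning x in [1, 4] and y in [1, 3] on a 6x4 image.
mask = polygon_to_mask([1, 4, 4, 1], [1, 1, 3, 3], width=6, height=4)
```

For semantic segmentation, masks of all polygons of a class are merged into one channel; for instance segmentation, each polygon keeps its own mask.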

5. The case of EPRI-IDID (V1.1) and CPLID

The ground truth data formats employed for the two public insulator datasets, EPRI-IDID and CPLID, are COCO and PASCAL VOC, respectively. Figure 8(a) and (b) show sample images from EPRI-IDID and CPLID.

Figure 8.
Sample images from EPRI-IDID and CPLID. Best viewed in color.

Sample images from EPRI-IDID are annotated with ground truth instances for the categories of good string, broken string, flashover damage string, good shell, broken shell, and flashover damage shell, while ground truth instances of CPLID are annotated for the categories of good string, broken string (defective insulator strings), and defective insulator shells (defective discs). Figure 9(a–d) and Figure 9(e, f) are bar plots that detail the statistical distribution of annotated instances from EPRI-IDID and CPLID, respectively, based on their categories.

Figure 9.
Enumerated categories of EPRI-IDID (a–d) and CPLID (e and f) -labeled assets grouped into bins. Best viewed in color.

Each bar plot is an encoding of data from the specified dataset that communicates information to the human visual system. The goal is to make salient features of the high-dimensional datasets available pre-attentively, with less effort, and also for further analysis. The encoded data is based on probability density functions, binning, and counting to represent the distribution of variables in the datasets. Let $X$ be a discrete random variable representing a category or label, with probability mass function $f(x)$. Suppose that $k$ is the number of bins used in the histogram of $X$. The probability density function (pdf) of $X$ is a function that describes the likelihood of $X$ taking on a certain value. The probability mass function and the pdf are equal for discrete variables: $f(x) = P(X = x)$. The range of $X$ is divided into $k$ nonoverlapping intervals or bins to produce a histogram of $X$. Each bin is represented by an interval $[a_i, b_i]$, where $a_i$ and $b_i$ are the lower and upper bounds of bin $i$, respectively. Each bin has a width of $h = b_i - a_i$, and the bin boundaries are chosen to cover the whole range of $X$. To construct a histogram, the number of observations that fall within each bin is counted. Assume $n_i$ is the number of observations that fall within the $i$-th bin; then the count is given by:

$$n_i = \sum_{j=1}^{n} I(a_i \le X_j \le b_i), \qquad (1)$$

where the indicator function $I(a_i \le X_j \le b_i)$ has a value of 1 if the $j$-th observation falls within the $i$-th bin and 0 otherwise. The histogram of $X$ is a graph that represents the frequency of observations that fall within each bin. Each bar represents one bin, and the height of the $i$-th bar is proportional to the count $n_i$ divided by the bin width $h$, given as:

$$h_i = \frac{n_i}{h}, \qquad (2)$$

where $h_i$ denotes the height of the $i$-th bar in the graph.
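The binning and counting steps can be sketched directly from the two equations above. To keep bins nonoverlapping, this sketch assumes half-open bins, with only the last bin including its upper bound; the sample values are made up and the function name is illustrative.

```python
def histogram(values, k):
    """Return (counts, heights, h) for k equal-width bins over the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; bin width would be zero")
    h = (hi - lo) / k  # common bin width, h = b_i - a_i
    counts = [0] * k
    for x in values:
        # Index of the bin containing x; the maximum is clamped into the last bin.
        i = min(int((x - lo) / h), k - 1)
        counts[i] += 1  # accumulates n_i, the indicator sum of Eq. (1)
    heights = [n_i / h for n_i in counts]  # h_i = n_i / h, Eq. (2)
    return counts, heights, h


# Made-up per-image instance counts.
values = [1, 1, 2, 3, 3, 3, 4, 6, 7, 9]
counts, heights, width = histogram(values, k=4)
```

The counts always sum to the number of observations, which is a quick sanity check on any binning choice.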

The scatter plots are developed from the mathematical concept of showing the distribution of a bivariate dataset. Given a dataset of $n$ observations $(x_i, y_i)$, where $i = 1, \ldots, n$, a scatter plot represents a graph in which the values of $x_i$ are plotted along the horizontal axis, and the values of $y_i$ are plotted along the vertical axis. Each observation is represented by a single point on the graph, which can be colored or labeled according to a variable that represents a category. The scatter plot’s points will typically congregate around a line or curve if there is a strong correlation between $x$ and $y$. The graph’s points will be dispersed at random if there is no association.

Figure 9(a) is a data visualization of bar plots that details statistical distribution of annotated instances from EPRI-IDID based on their categories. Notice that all data visualizations modeled on EPRI-IDID in this study are based only on the training set.

A graphical analysis enumerating the distribution of defective shell instances, grouped into bins and by their frequency of occurrence, can be visualized in Figure 9(c) and (d) for EPRI-IDID, while Figure 9(b) features the distribution of shell instances labeled as good shells across all EPRI images. Figure 9(e) depicts bar plots detailing statistical distribution of annotated instances from CPLID. CPLID broken shell instances are represented in Figure 9(f). The maximum bin size value for each visualization plot references the maximum number of shells per image with respect to the dataset. The sum of all shell instance occurrences (from each bin) across all bins per chart denotes the total number of instances for that category.
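The per-image instance counts behind such bar plots can be derived from COCO-style annotation records, as in the sketch below. The records and the category id used for "broken shell" are made up for illustration and do not reproduce the EPRI-IDID or CPLID category ids.

```python
from collections import Counter

# Made-up COCO-style annotation records (only the fields needed here).
annotations = [
    {"image_id": 1, "category_id": 2},
    {"image_id": 1, "category_id": 2},
    {"image_id": 2, "category_id": 2},
    {"image_id": 3, "category_id": 2},
    {"image_id": 3, "category_id": 2},
    {"image_id": 4, "category_id": 1},
]
BROKEN_SHELL = 2  # hypothetical category id


def instances_per_image(records, category_id):
    """Count how many instances of one category each image contains."""
    return Counter(r["image_id"] for r in records if r["category_id"] == category_id)


def count_distribution(per_image):
    """Map k -> number of images that contain exactly k instances."""
    return Counter(per_image.values())


per_image = instances_per_image(annotations, BROKEN_SHELL)
distribution = count_distribution(per_image)
max_per_image = max(per_image.values())
total_instances = sum(k * n_images for k, n_images in distribution.items())
```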

From Figure 9(b)–(d), the shapes of both good and defective shell instance distributions from EPRI resemble a log-normal distribution (positively skewed). The right-skewed distribution of good shells in Figure 9(b) also reveals outlier observations of 124, fitting into bin 10 to 11. Statistically, outliers and the distribution of data could have a potential influence on probability prediction. While shell instance distribution across images of EPRI follows a positive skew, that of CPLID assumes a uniform shape as shown in Figure 9(f). With reference to Figure 9(e), it is obvious that each image (for the broken shells category) has a single broken shell.

While most visualizations in Figure 9 present individual class observations of each dataset as a numeric range (fitted into bins), Figure 10(a) enables a comparison of class distribution for the EPRI and CPLID datasets at a glance. Class distributions for each dataset category are easily understood by comparing the bar heights for each class. Note that emphasis is placed on defective instances relative to good insulator strings by excluding good insulator shell instances in Figure 10(a). The higher ratio of defective instances to good strings for EPRI compared to CPLID (for comparable categories) perhaps emphasizes a defect-oriented data modeling for machine learning algorithm development. On the other hand, CPLID reflects a more natural observation in industrial settings, where there is usually an imbalanced distribution with more negative insulator samples than positive defective cases.

Figure 10.
EPRI and CPLID comparison based on (a) defects to good insulator ratio, (b) insulator Bbox aspect ratio, (c) and (d) insulator asset width to height relationships. Best viewed in color.

Figure 10(b) compares the distribution of insulator bounding boxes for each dataset based on their aspect ratios. Aspect ratio for each insulator component is computed as width/height for both datasets. EPRI offers a more straightforward computation with width and height parameters extracted from the coordinates of the insulator instances (JSON COCO file). Width and height for CPLID’s PASCAL VOC’s XML file are computed from the relative coordinates of each annotated insulator instance using the equations width = Xmax−Xmin and height = Ymax−Ymin, respectively.
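The two computations can be sketched as follows. This is a minimal illustration that assumes the standard COCO layout (a `bbox` stored as `[x, y, width, height]` under `annotations`) and the standard PASCAL VOC layout (`bndbox` elements with `xmin`, `ymin`, `xmax`, `ymax`); the sample strings are hypothetical and far smaller than the real annotation files.

```python
import json
import xml.etree.ElementTree as ET


def coco_aspect_ratios(coco_json_text):
    """Width/height per annotation; COCO stores bbox as [x, y, width, height]."""
    data = json.loads(coco_json_text)
    return [ann["bbox"][2] / ann["bbox"][3] for ann in data["annotations"]]


def voc_aspect_ratios(voc_xml_text):
    """Width/height per object, with width = xmax - xmin and height = ymax - ymin."""
    root = ET.fromstring(voc_xml_text)
    ratios = []
    for box in root.iter("bndbox"):
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        ratios.append((xmax - xmin) / (ymax - ymin))
    return ratios


# Hypothetical minimal annotation snippets.
coco_sample = '{"annotations": [{"bbox": [10, 20, 50, 200]}]}'
voc_sample = """
<annotation>
  <object>
    <name>insulator</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>500</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""

coco_ratios = coco_aspect_ratios(coco_sample)
voc_ratios = voc_aspect_ratios(voc_sample)
```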

From Figure 10(b), the general orientation of insulator objects in each dataset can be perceived at a glance without physically examining the entire dataset. For example, the distribution for CPLID shows that the majority of insulator objects (≈93.7%) have an aspect ratio x > 2 and are therefore oriented in the landscape position (see Figure 8(b)). This follows because the bounding box is correlated with the geometry of the insulator object. The pattern is contrary to observations from EPRI, where the majority of insulator objects (x < 1, ≈88.8%) retain a portrait orientation (see Figure 8(a)). The difference also outlines representational variability between datasets used for algorithm development.
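Orientation shares of this kind can be derived directly from a list of aspect ratios. The sketch below assumes the thresholds used in the text (x > 2 for landscape, x < 1 for portrait); the ratio values are hypothetical.

```python
def orientation_shares(aspect_ratios, landscape_above=2.0, portrait_below=1.0):
    """Fraction of boxes that are landscape (ratio > 2) or portrait (ratio < 1)."""
    n = len(aspect_ratios)
    if n == 0:
        raise ValueError("no aspect ratios given")
    landscape = sum(1 for r in aspect_ratios if r > landscape_above)
    portrait = sum(1 for r in aspect_ratios if r < portrait_below)
    return {"landscape": landscape / n, "portrait": portrait / n}


# Hypothetical aspect ratios (width / height).
ratios = [0.2, 0.5, 0.8, 1.5, 2.5, 4.0, 3.0, 2.2]
shares = orientation_shares(ratios)
```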

Figure 10(c) depicts an overlay density plot of 1788 data points, with each data point representing the relationship between the width and height of a bounding box for an insulator asset in EPRI, while Figure 10(d) shows a similar plot of 1321 data points for insulator assets in CPLID. These visualizations move beyond the ‘all insulators’ aspect ratio analysis in Figure 10(b) toward a class-specific analysis of insulator assets. The relative class distribution, or clusters of data points, creates an opportunity to target and prioritize class-specific predictive modeling. For instance, Figure 10(d) creates an opportunity to model an algorithm that prioritizes defective (broken) insulators based on the dimensional (width-height) clusters of bounding boxes (e.g., via anchor box choices). Choosing anchor box settings closer to the ground truth enables the model to converge faster during training. Works in [32, 33] provide some leads. The plot also appears to suggest a strong relationship among the groups of clusters. Visualizations like these also make it possible to observe outliers and thereby improve preprocessing before algorithm development, which in turn can make algorithm development for applications more efficient.
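One way to turn width-height clusters into anchor box choices is to cluster the ground-truth box dimensions. The sketch below uses plain k-means with Euclidean distance on (width, height) pairs for simplicity; this is a swapped-in simplification, since detectors such as YOLOv3 [33] cluster box dimensions with their own distance measure. The box values and the deterministic initialisation are assumptions made for illustration.

```python
def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs with plain k-means; return k centroids."""
    if not 0 < k <= len(boxes):
        raise ValueError("k must be between 1 and the number of boxes")
    # Deterministic initialisation: pick k boxes evenly spaced by area.
    ordered = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            nearest = min(
                range(k),
                key=lambda c: (w - centroids[c][0]) ** 2 + (h - centroids[c][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return sorted(centroids)


# Hypothetical box sizes: three landscape and three portrait insulators.
boxes = [(100, 20), (110, 22), (90, 18), (20, 100), (22, 110), (18, 90)]
anchors = kmeans_anchors(boxes, k=2)
```

The resulting centroids could serve as starting anchor dimensions; the number of clusters is a design choice that depends on the detector.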

From Figure 10(c), it can be observed that the majority of the data points for the respective classes fall within a certain range of width-height space (rectangular space). This justifies modeling an algorithm that leverages knowledge of this dimensional space to improve algorithm performance and application development. In some cases, preprocessing techniques like normalization [34, 35] and data augmentation [36, 37, 38] could prove beneficial to the algorithm development process. Besides performance gains, normalization has been known to facilitate convergence during training, while data augmentation could improve the balance of the data distribution to mitigate the effect of outliers and imbalanced datasets. These visualizations offer a more definitive insight into the two public insulator datasets available for algorithm development, compared to the limited or absent data visualization in the literature.
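As a small illustration of the two preprocessing ideas, the sketch below standardizes a flat list of pixel intensities and mirrors a bounding box for a horizontal-flip augmentation. Both functions are generic examples chosen for illustration, not the specific methods of [34, 35, 36, 37, 38]; the COCO-style `[x, y, width, height]` box layout is an assumption.

```python
import math


def standardize(pixels):
    """Scale pixel intensities to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    if std == 0:
        return [0.0] * n  # constant image: nothing to scale
    return [(p - mean) / std for p in pixels]


def hflip_bbox(bbox, image_width):
    """Mirror an [x, y, width, height] box for a horizontally flipped image."""
    x, y, w, h = bbox
    return [image_width - x - w, y, w, h]


scaled = standardize([0, 128, 255])
flipped = hflip_bbox([10, 20, 50, 200], image_width=224)
```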

5.1 Supervised dimensionality reduction with UMAP

To improve the topological representation of the EPRI and CPLID datasets, a manifold learning technique is applied using the UMAP algorithm [39]. Specifically, the geometric properties (input features) of these datasets in their high-dimensional representation are used to implement clustering and supervised dimensionality reduction to yield a more compact, more easily interpretable representation of the insulator target concept. This is a useful data preparation technique before training a predictive model. The goal is to provide a general overview of the data and its structure and to identify possible relationships or patterns in the data. In order to project high-dimensional data into a lower-dimensional space that preserves the fundamental structure (and makes sense to humans visually), UMAP incorporates a variety of methods: k-nearest neighbor (k-NN) search, graph theory applied to the k-NN graph, stochastic gradient descent, and optimization techniques intended to minimize the distortions in the projection. UMAP is chosen over similar dimensionality reduction techniques because the data under consideration have a complex nonlinear structure. UMAP offers flexibility and computing efficiency, and it is better at preserving the local and global structure of data in clustering. This can help to reveal hidden patterns and relationships in the data that may not be immediately apparent. Besides, less information is lost compared with other popular dimensionality reduction techniques while producing high-quality embeddings.

Previous visualization approaches are limited and are often used to show the relationship between two variables. Each data point is represented by a single point on a two-dimensional grid, with the x- and y-coordinates determined by the values of the two variables being plotted. UMAP, on the other hand, explores the structure and connections between data points in a high-dimensional dataset by projecting the data points into a two- or three-dimensional visualization while preserving some of the data’s original characteristics. Each data point in a UMAP scatter plot can be represented by a point in a low-dimensional space, with the position of the point being selected by the dimensionality reduction technique.

UMAP algorithm implements embeddings of the data input features from a probabilistic viewpoint, while input features are measured and optimized along a topological data manifold. In UMAP, dimensionality reduction is implemented in two steps. First, a high-dimensional graph is constructed for the data, and then, a cost function is optimized to represent the lowest-dimensional graph that is the most comparable to the high-dimensional graph. The cross-entropy between probability distributions in the high-dimensional and low-dimensional spaces provides the foundation for the cost function optimization in UMAP. The goal is to find a low-dimensional representation that preserves the topological structure. The optimization process involves finding a low-dimensional representation that minimizes the cost function E using gradient descent. The resulting representation, which is simpler to visualize, understand, and analyze, captures the structure of the high-dimensional data in a reduced dimensional space. The standard cost function can be expressed as:

$$E = \min_{Y} \sum_{i,j} w_{i,j} \cdot L_{i,j}(Y) + \alpha \sum_{i} \Omega_i(a_i) \tag{3}$$

where the dataset’s low-dimensional embedding is represented by $Y$. The pairwise cost $L_{i,j}$ between data points $i$ and $j$ is given the weight $w_{i,j}$. In the high-dimensional space, the weight is a function of the separation between the data points. The pairwise cost between data points $i$ and $j$ in the low-dimensional space is expressed as $L_{i,j}(Y)$. In order to preserve the global and local structure of the high-dimensional dataset, the cost is a function of the distance between the data points in the low-dimensional space. The hyperparameter $\alpha$ regulates how “fuzzy” the low-dimensional embedding is. A repulsive force, $\Omega_i(a_i)$, is applied to each data point $i$ in the low-dimensional embedding. The force, which is intended to avoid overcrowding in the low-dimensional embedding, is a function of the local density of the high-dimensional dataset surrounding the data point.
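The cross-entropy mentioned above can be made concrete with a short sketch. It follows the fuzzy-set cross-entropy form given in the UMAP paper [39], which compares edge weights in the high-dimensional graph (here `high_weights`) with those in the low-dimensional graph (here `low_weights`); it is not a term-by-term implementation of Eq. (3), and the variable names and sample weights are assumptions.

```python
import math


def fuzzy_cross_entropy(high_weights, low_weights, eps=1e-12):
    """Sum over edges of v*log(v/w) + (1-v)*log((1-v)/(1-w))."""
    total = 0.0
    for v, w in zip(high_weights, low_weights):
        # Clip to (0, 1) so the logarithms stay finite.
        v = min(max(v, eps), 1.0 - eps)
        w = min(max(w, eps), 1.0 - eps)
        attractive = v * math.log(v / w)                          # pulls linked points together
        repulsive = (1.0 - v) * math.log((1.0 - v) / (1.0 - w))   # pushes unlinked points apart
        total += attractive + repulsive
    return total


# Hypothetical edge weights for three pairs of points.
high = [0.9, 0.5, 0.1]
matched = fuzzy_cross_entropy(high, high)               # embedding reproduces the graph
mismatched = fuzzy_cross_entropy(high, [0.1, 0.5, 0.9])  # embedding inverts the graph
```

The cost is zero when the low-dimensional weights reproduce the high-dimensional ones and grows as the two graphs disagree, which is what gradient descent reduces during optimization.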

The objective function is used to modify the low-dimensional representation of the data to guarantee that it is consistent with the original data distribution, whereas the cost function in UMAP is used to optimize the low-dimensional representation of the data points. The standard objective function for UMAP can be written as follows:

$$Y_{i\cdot} = \underset{Y_{i\cdot} \in \mathbb{R}^{n}}{\arg\min} \; \sum_{j=1}^{n} w_{i,j}\, f\!\left(\frac{\lVert Y_{i\cdot} - Y_{j\cdot} \rVert}{\rho_{i,j}}\right) + \beta \sum_{j \neq i} w_{i,j} \left[ f\!\left(\frac{\lVert Y_{i\cdot} - Y_{j\cdot} \rVert}{\rho_{i,j}}\right) - f\!\left(\frac{d_{i,j}}{\rho_{i,j}}\right) \right] \tag{4}$$

where $Y_{i\cdot}$ denotes the low-dimensional representation of the $i$-th data point, $\rho_{i,j}$ is the scale of the kernel that computes the edge weight, $\beta$ denotes the regularization parameter, $d_{i,j}$ is the distance between data points $i$ and $j$ in the high-dimensional space, $w_{i,j}$ represents the weight of the edge connecting data points $i$ and $j$ in the high-dimensional space, and $f(\cdot)$ is a decreasing function that maps distances to affinities. While the second term promotes the low-dimensional representations of the data points to be evenly distributed in the low-dimensional space, the first component in the equation measures how similar the pairwise distances of the data points in the high-dimensional and low-dimensional spaces are.

During experiments, the UMAP model takes in the target insulator data when fitting, together with a target parameter that specifies the target insulator information (obtained via categorical labels), to perform supervised dimension reduction. This enables UMAP to learn a mapping that highlights the differences between various classes or groups in the data while still trying to preserve as much of the fundamental structure of the data as possible. Experiments are conducted on the individual datasets separately, and the datasets are finally compared in the same lower-dimensional space. To more accurately represent the internal structure of the data, results are presented as embeddings (or data points) in 3-D visualization plots.
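Supervised fitting of this kind can be sketched with the open-source `umap-learn` package, where passing the labels as `y` to `fit_transform` enables supervised dimension reduction. This is a minimal sketch, not the authors' script: the helper names `build_umap_params` and `supervised_embedding` are hypothetical, and the package is imported lazily so the parameter helper can be used without it.

```python
def build_umap_params(**overrides):
    """Collect UMAP keyword arguments; 3 components for a 3-D scatter plot."""
    params = {"n_components": 3}
    params.update(overrides)  # e.g. n_neighbors, min_dist, metric
    return params


def supervised_embedding(features, labels, **overrides):
    """Fit UMAP on a (n_samples, n_features) array with class labels as target."""
    import umap  # third-party package "umap-learn", imported only when needed

    reducer = umap.UMAP(**build_umap_params(**overrides))
    # Supplying y switches UMAP from unsupervised to supervised reduction.
    return reducer.fit_transform(features, y=labels)


default_params = build_umap_params()
custom_params = build_umap_params(n_neighbors=50, min_dist=0.5)  # illustrative values
```

A call such as `supervised_embedding(X, y, n_neighbors=50)` would return one 3-D coordinate per sample; the custom values shown are placeholders, since the specific custom settings are not stated here.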

5.1.1 Experimental setup

The UMAP algorithm was implemented on a computer with the following specifications: CPU, Intel Core i7-6700HQ @ 2.60 GHz; RAM, 16 GB DDR4; storage, 1 TB HDD. The operating system was Windows 10, and the algorithm was implemented in Python 3.7. The experiments were carried out on a single-core version of the UMAP algorithm. In all experimental runs, the entire dimensionality specified for the category of either the EPRI or CPLID dataset is passed to the UMAP model, reconstructed, optimized in a lower-dimensional map, and projected as a three-dimensional scatter plot. After each experimental run with the default parameter settings, an alternate experiment is carried out with custom parameter settings. The aim is to optimize the quality of the resulting low-dimensional representation and to provide more perspective on the underlying structure of the data.

Input images are first resized to a fixed size of 224×224 pixels and represented as a 3-dimensional array, so that each image has 224 rows, 224 columns, and three color channels. The image pixels are then flattened into one long vector of 224 × 224 × 3 = 150,528 features for UMAP.
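The feature count can be verified with a short sketch. It uses nested Python lists in place of an image array, and the blank image is a stand-in for a resized input.

```python
def flatten_image(image):
    """Flatten a rows x columns x channels nested list into one feature vector."""
    return [value for row in image for pixel in row for value in pixel]


HEIGHT, WIDTH, CHANNELS = 224, 224, 3

# Stand-in for a resized RGB image: every pixel is [0, 0, 0].
blank_image = [[[0] * CHANNELS for _ in range(WIDTH)] for _ in range(HEIGHT)]
feature_vector = flatten_image(blank_image)
n_features = len(feature_vector)
```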

Each neighborhood graph contains information about the cost function parameter settings, n_neighbors and min_dist, where n_neighbors is the number of approximate nearest neighbors used to create the initial high-dimension graph and min_dist represents the minimum distance between points in the low-dimensional map. Jaccard metric is chosen for experimental runs particularly due to the fact that both datasets under investigation are binary, non-Euclidean, and represent datasets with class imbalance. The Jaccard metric is more easily understood and is robust to the size of the sets under comparison. The “number of epochs” (iteration) is set to 400, and a random seed state is set to “none” for all experimental runs.
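For reference, the Jaccard distance between two binary feature vectors can be sketched as below. This is the textbook definition, one minus the ratio of shared active features to features active in either vector; the sample vectors are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # two all-zero vectors are treated as identical
    return 1.0 - both / either


d_partial = jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0])   # one shared feature of three
d_same = jaccard_distance([1, 0, 1], [1, 0, 1])
d_disjoint = jaccard_distance([1, 0], [0, 1])
```

Because only active features enter the ratio, the distance does not depend on how many features are zero in both vectors, which is one sense in which it is robust to the size of the sets being compared.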

Information is available for all of the datasets regarding the class to which each data point belongs. However, this information is only used to assign a color to the map points and does not impact their spatial coordinates. Therefore, the color coding serves as a means to assess how accurately the map maintains the similarities between the data points within each class.

5.1.2 Experimental results and analysis

Figures 11 and 12 show low-dimensional visualizations of the EPRI and CPLID datasets, respectively, from the supervised UMAP algorithm. These figures show visualizations with UMAP’s default parameter settings in the left column and visualizations with custom parameter settings in the right column, both using the Jaccard metric. Besides the visualization task, the results also depict UMAP’s implementation of the clustering task using features from the datasets. Data points are clustered based on feature similarities and differences. It is worth noting that the quality of the embeddings is also optimized by the categorical labels, which serve as extra information to the UMAP model. Each neighborhood graph is captioned to capture the visualization of the specified dataset based on all or some categorical features.

Figure 11.
Visualizations of embeddings as similarity clusters from EPRI dataset.

Figure 12.
Visualizations of embeddings as similarity clusters from CPLID dataset.

The top row of Figure 11 shows clustered UMAP embeddings of all the different categories of the EPRI dataset based on their similarities and differences. The second row projects defective insulator disc features of EPRI as embeddings in a low-dimensional 3-D space. Figure 12 depicts clustered UMAP embeddings of all the different categorical features of the CPLID dataset based on their similarities and differences. Figure 13 represents a comparison of the combined features of EPRI and CPLID as embeddings. The EPRI and CPLID datasets are combined and compared on the same neighborhood graph on the argument that, although they are different datasets, they represent the same type of object (target concept) or scenes and could serve to reveal the underlying patterns and relationships among the insulator objects.

Figure 13.
Visualizations of embeddings as similarity clusters from EPRI and CPLID.

The first notable observation from these visualizations is a cleanly separated set of classes (with a small amount of stray noise, that is, points that differed enough from their class to be separated from the others). The internal structure of the individual classes has also been retained. Generally, it can be observed that the parameter settings influence the visualizations. A higher value of n_neighbors results in a smoother representation of the data, whereas a lower value results in a more detailed representation. This is most prominent in Figure 12, where more details are revealed on the CPLID dataset. As a result, the selection of n_neighbors can alter both the separation of various clusters and the level of detail in the display. On the other hand, the tightness of the clusters in the low-dimensional space is controlled by the minimum distance value (min_dist). A lower value of min_dist produces more compact clusters, while a greater value produces more dispersed clusters. The depiction of the distance between clusters and their compactness can therefore be influenced by the selection of min_dist. By adjusting these parameter settings, researchers can control the trade-off between local and global structure preservation in the embedding. The aim is to find the balance between these factors that yields a low-dimensional embedding that best captures the underlying structure of the data. This is a useful step because it may enhance the effectiveness of subsequent machine learning tasks.

It can also be observed from Figure 12 that data points from CPLID appear to be more “glued together” at the default UMAP setting, where more local details are expected. Coalesced data points in a visualization map using UMAP dimensionality reduction may be a sign that these points are particularly similar to one another in the high-dimensional space. This typically occurs when there are dense groupings of points in the high-dimensional space that are challenging to separate, but it is easier to do so in the low-dimensional space. In order to retain the local structure of the data, UMAP seeks to ensure that points close to one another in the high-dimensional space are equally close to one another in the low-dimensional space. This can reveal information about the data’s structure and is frequently a sign that these points are part of the same cluster or group. However, it is also possible that the coalescing is an artifact of the visualization process and that additional data analysis is required to fully grasp the connections between the data points. In the case of CPLID, the former is the more likely case, as many CPLID images appear similar in high-dimensional space. Another possibility is that data augmentation techniques applied to generate some CPLID images may have introduced artifacts (noise) or distortions that affect the relationships between the data points, which in turn impact the UMAP result. However, further investigation into this is not an objective of this chapter.

Another observation from Figure 12 is the difference in the pattern of separation between cluster groups of CPLID and EPRI dataset in Figure 11. The close proximity of CPLID cluster groups compared to well-separated cluster groups of EPRI dataset could be an inherent feature of (or the complexity of) the underlying structure of CPLID dataset. The reason could be the genuine similarity of data points. A potential solution to this problem is to experiment with various UMAP parameter settings, such as adjusting the number of nearest neighbors or the minimum distance between points.

Based on the UMAP visualizations in Figure 13, UMAP successfully implements clustering as similarity clusters for features of each dataset after the combination of EPRI and CPLID. Custom parameter settings (right column) particularly show that the internal structures of individual classes of both datasets have been maintained. No distinct grouping pattern of clusters is observed for any 3-D plot, regardless of parameter settings, when the EPRI and CPLID datasets are combined. For example, insulator discs (shells) and strings are not grouped by their state of condition or by type (discs or strings), as observed with the previous visualizations in Figures 11 and 12. This is a property of the data under investigation and underlines its complexity. Moreover, there is no clearly discernible cluster grouping by dataset type. It is also important to note that occlusion and distortions are introduced with the 3-D visualizations in the default settings of Figure 13. Such occurrences can affect the interpretation of the results and could lead to misinterpretation. One option to reduce this effect is to visualize the data in 2-dimensional space.

Visualizations from the default settings (left column) of Figure 13 also depict a few data points (or isolated clusters) that deviate sufficiently from their class. These data points persist as miniature clusters that extend their class even after enhancing the global structure of the dataset via the custom settings (right column) of Figure 13. Such ‘stray noise’ could be representative of outliers in the datasets. Identifying outliers in image data is an important preprocessing step before model development. It may be necessary to pay extra attention to outliers in order to prevent or lessen their impact on machine learning models. This might entail methods like outlier elimination, robust normalization, or the adoption of alternative model architectures that are less susceptible to outliers.
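Stray points of this kind could be shortlisted programmatically from the embedding coordinates. The sketch below is a simple heuristic assumed for illustration, not a method used in this study: a point is flagged when its distance to its class centroid exceeds a multiple (`factor`, a hypothetical parameter) of the median distance within that class. The coordinates and labels are made up.

```python
import math
from statistics import median


def flag_stray_points(embedding, labels, factor=3.0):
    """Indices of points farther than factor x the class-median centroid distance."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    flagged = []
    for indices in by_class.values():
        dims = len(embedding[indices[0]])
        centroid = [
            sum(embedding[i][d] for i in indices) / len(indices) for d in range(dims)
        ]
        distances = {i: math.dist(embedding[i], centroid) for i in indices}
        threshold = factor * median(distances.values())
        flagged.extend(i for i, dist in distances.items() if dist > threshold)
    return sorted(flagged)


# Hypothetical 3-D embedding: class "a" contains one far-away point (index 4).
points = [
    (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (10, 10, 10),
    (20, 20, 20), (21, 20, 20), (20, 21, 20),
]
classes = ["a", "a", "a", "a", "a", "b", "b", "b"]
stray = flag_stray_points(points, classes)
```

Flagged indices are candidates for manual inspection rather than automatic removal, since a point can be far from its class for legitimate reasons.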

In general, supervised UMAP visualization can be beneficial for classification tasks, where the objective is to predict the class or label of a new data point based on its features. A classification model can be trained on the supervised UMAP visualization to learn to recognize the patterns and connections between various classes of data points. This can help create classifiers that are more reliable and accurate and are good at generalizing to new data. For examining and studying complex datasets, UMAP graphs and other dimensionality reduction methods are generally more effective tools. It is crucial to remember that they do not serve as a replacement for proper statistical analysis. It would be helpful to supplement them with the proper methodologies to make sure that their findings are robust and trustworthy. Moreover, while basic visualization plots are a flexible tool for two-dimensional data visualization, UMAP scatter plots are a more sophisticated technique for seeing high-dimensional data in a lower-dimensional environment and require more specific expertise and resources to develop and analyze.

6. Conclusions

With a focus on insulator image data, this chapter outlines procedures for preparing aerial image data that serve as the knowledge base for deep learning algorithm development. The Power Line Image Data Readiness benchmark is proposed and is expected to serve as a standard appraisal reference in future technical, academic, and business conversations. The chapter also proposes the concepts of PDEID to be integrated with the data curation process, with the potential to improve data analytics. Using two data visualization techniques, ground truth instance annotations of two recent public insulator datasets are first reengineered to visualize, analyze, and compare them. In the second approach, a data exploratory tool is implemented as a dimensionality reduction strategy to intuitively visualize, cluster, and analyze the datasets. The visualization approaches reveal representational variabilities that could be exploited to improve predictive modeling in applications for power line inspection.

References

1. Wang H, Raj B, Xing EP. On the origin of deep learning. arXiv. 2017
2. Marcus G. Deep learning: A critical appraisal. arXiv. 2018
3. Deng J, Dong W, Socher R, Li L, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009, Miami, Florida, USA. NW, Washington: IEEE Computer Society; 2009. pp. 248-255. DOI: 10.1109/CVPR.2009.5206848
4. Lin T, Maire M, Belongie SJ, Hays J, Perona P, Ramanan D, et al. Microsoft COCO: Common objects in context. In: Fleet DJ, Pajdla T, Schiele B, Tuytelaars T, editors. Computer Vision-ECCV 2014-13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V. vol. 8693 of Lecture Notes in Computer Science. Cham: Springer; 2014. pp. 740-755. DOI: 10.1007/978-3-319-10602-1_48
5. Raimundo A. Insulator Data Set-Chinese Power Line Insulator Dataset (CPLID). IEEE DataPort; [online]. 2020. Available from: https://ieee-dataport.org/open-access/insulator-data-set-chinese-power-line-insulator-dataset-cplid
6. Lewis D, Kulkarni P. EPRI insulator defect image dataset. IEEE DataPort. 2021
7. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data. 2016;3(1):160018. DOI: 10.1038/sdata.2016.18
8. Mbayo H. Data and Power: AI and Development in the Global South; [online]. 2020. Available from: https://www.oxfordinsights.com/insights/2020/10/2/data-and-power-ai-and-development-in-the-global-south
9. Gul E. Is Artificial Intelligence the frontier solution to Global South’s wicked development challenges? [online]. 2019. Available from: https://towardsdatascience.com/is-artificial-intelligence-the-frontier-solution-to-global-souths-wicked-development-challenges-4206221a3c78
10. Lawrence ND. Data readiness levels. arXiv. 2017
11. Harvey H, Glocker B. In: Ranschaert ER, Morozov S, Algra PR, editors. A Standardised Approach for Preparing Imaging Data for Machine Learning Tasks in Radiology. Cham: Springer International Publishing; 2019. pp. 61-72. DOI: 10.1007/978-3-319-94878-2_6
12. Chang W, Yang G, Yu J, Liang Z. Real-time segmentation of various insulators using generative adversarial networks. IET Computer Vision. 2018;12(5):596-602. DOI: 10.1049/iet-cvi.2017.0591
13. Sampedro Pérez C, Rodriguez-Vazquez J, Rodríguez Ramos A, Carrio A, Campoy P. Deep learning-based system for automatic recognition and diagnosis of electrical insulator strings. IEEE Access. 2019;7:1
14. Liu C, Wu Y, Liu J, Han J. MTI-YOLO: A light-weight and real-time deep neural network for insulator detection in complex aerial images. Energies. 2021;14(5):1426
15. Wu C, Ma X, Kong X, Zhu H. Research on insulator defect detection algorithm of transmission line based on CenterNet. PLoS One. 2021;16(7):e0255135
16. Lim SH, Young S, Patton R. An analysis of image storage systems for scalable training of deep neural networks. In: The Seventh Workshop on Big Data Benchmarks, Performance Optimization, and Emerging Hardware (in Conjunction with ASPLOS’16). Atlanta, GA, USA, 2016. Oak Ridge, TN (United States): Oak Ridge National Lab. (ORNL); 2016
17. Dutta A, Zisserman A. The VIA annotation software for images, audio and video. In: Amsaleg L, Huet B, Larson MA, Gravier G, Hung H, Ngo C, et al., editors. Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, Nice, France, October 21–25, 2019. New York, NY, United States: ACM; 2019. pp. 2276-2279. DOI: 10.1145/3343031.3350535
18. Sager C, Janiesch C, Zschech P. A survey of image labelling for computer vision applications. arXiv. 2021
19. Schmelzer R. Data engineering, preparation, and labeling for AI 2019 CGR-DE100. Cognilytica. 2019
20. Sukhbaatar S, Bruna J, Paluri M, Bourdev L, Fergus R. Training convolutional networks with Noisy labels. arXiv e-prints. 2014
21. Agarwal V, Podchiyska T, Banda JM, Goel V, Leung TI, Minty EP, et al. Learning statistical models of phenotypes using noisy labeled training data. Journal of the American Medical Informatics Association. 2016;23(6):1166-1173. DOI: 10.1093/jamia/ocw028
22. Everingham M, Gool LV, Williams CKI, Winn JM, Zisserman A. The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision. 2010;88(2):303-338. DOI: 10.1007/s11263-009-0275-4
23. Lv T, Yan P, He W. Survey on JSON data modelling. Journal of Physics: Conference Series. 2018;1069:012101
24. Liu X, Miao X, Jiang H, Chen J. Review of data analysis in vision inspection of power lines with an in-depth discussion of deep learning technology. arXiv. 2020
25. Bianco S, Ciocca G, Napoletano P, Schettini R. An interactive tool for manual, semi-automatic and automatic video annotation. Computer Vision and Image Understanding. 2015;131:88-99. DOI: 10.1016/j.cviu.2014.06.015
26. Larumbe-Bergera A, Porta S, Cabeza R, Villanueva A. SeTA: Semiautomatic tool for annotation of eye tracking images. In: Krejtz K, Sharif B, editors. Proceedings of the 11th ACM Symposium on eye Tracking Research & Applications, ETRA 2019, Denver, CO, USA, June 25–28, 2019. New York, NY, United States: ACM; 2019. pp. 1-45. DOI: 10.1145/3314111.3319830
27. Zhuo X, Fraundorfer F, Kurz F, Reinartz P. Automatic annotation of airborne images by label propagation based on a Bayesian-CRF model. Remote Sensing. 2019;11(2):145. DOI: 10.3390/rs11020145
28. Cao J, Zhao A, Zhang Z. Automatic image annotation method based on a convolutional neural network with threshold optimization. PLoS One. 2020;15(9):1-21. DOI: 10.1371/journal.pone.0238956
29. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 1. NIPS’15. Cambridge, MA, USA: MIT Press; 2015. pp. 91-99
30. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. arXiv. 2015
31. He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV). NJ, USA: IEEE; 2017. pp. 2980-2988. DOI: 10.1109/ICCV.2017.322
32. Zhao Z, Zhen Z, Zhang L, Qi Y, Kong Y, Zhang K. Insulator detection method in inspection image based on improved faster R-CNN. Energies. 2019;12(7):1204
33. Redmon J, Farhadi A. YOLOv3: An Incremental Improvement. arXiv. 2018
34. Koo KM, Cha EY. Image recognition performance enhancements using image normalization. Human-centric Computing and Information Sciences. 2017;7(1):33. DOI: 10.1186/s13673-017-0114-5
35. Ulyanov D, Vedaldi A, Lempitsky VS. Instance normalization: The missing ingredient for fast stylization. arXiv. 2016
36. Shorten C, Khoshgoftaar TM. A survey on image data augmentation for deep learning. Journal of Big Data. 2019;6:60. DOI: 10.1186/s40537-019-0197-0
37. Zoph B, Cubuk ED, Ghiasi G, Lin T, Shlens J, Le QV. Learning data augmentation strategies for object detection. In: Vedaldi A, Bischof H, Brox T, Frahm J, editors. Computer Vision-ECCV 2020-16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVII. Vol. 12372 of Lecture Notes in Computer Science. Cham: Springer; 2020. pp. 566-583. DOI: 10.1007/978-3-030-58583-9_34
38. Ghiasi G, Cui Y, Srinivas A, Qian R, Lin T, Cubuk ED, et al. Simple copy-paste is a strong data augmentation method for instance segmentation. arXiv. 2020
39. McInnes L, Healy J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv. 2018

Notes

https://github.com/tzutalin/labelImg

[1] 1. Wang H, Raj B, Xing EP. On the origin of deep learning. arXiv. 2017

[2] 2. Marcus G. Deep learning: A critical appraisal. arXiv. 2018

[3] 3. Deng J, Dong W, Socher R, Li L, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009, Miami, Florida, USA. NW, Washington: IEEE Computer Society; 2009. pp. 248-255. DOI: 10.1109/CVPR.2009.5206848

[4] 4. Lin T, Maire M, Belongie SJ, Hays J, Perona P, Ramanan D, et al. Microsoft COCO: Common objects in context. In: Fleet DJ, Pajdla T, Schiele B, Tuytelaars T, editors. Computer Vision-ECCV 2014-13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V. vol. 8693 of Lecture Notes in Computer Science. Cham: Springer; 2014. pp. 740-755. DOI: 10.1007/978-3-319-10602-1_48

[5] 5. Raimundo A. Insulator Data Set-Chinese Power Line Insulator Dataset (CPLID). IEEE DataPort; [online]. 2020. Available from: https://ieee-dataport.org/open-access/insulator-data-set-chinese-power-line-insulator-dataset-cplid

[6] 6. Lewis D, Kulkarni P. EPRI insulator defect image dataset. IEEE DataPort. 2021

[7] 7. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data. 2016;3(1):160018. DOI: 10.1038/sdata.2016.18

[8] 8. Mbayo H. Data and Power: AI and Development in the Global South; [online]. 2020. Available from: https://www.oxfordinsights.com/insights/2020/10/2/data-and-power-ai-and-development-in-the-global-south

[9] 9. Gul E. Is Artificial Intelligence the frontier solution to Global South’s wicked development challenges? [online]. 2019. Available from: https://towardsdatascience.com/is-artificial-intelligence-the-frontier-solution-to-global-souths-wicked-development-challenges-4206221a3c78

[10] 10. Lawrence ND. Data readiness levels. arXiv. 2017

[11] 11. Harvey H, Glocker B. In: Ranschaert ER, Morozov S, Algra PR, editors. A Standardised Approach for Preparing Imaging Data for Machine Learning Tasks in Radiology. Cham: Springer International Publishing; 2019. pp. 61-72. DOI: 10.1007/978-3-319-94878-2_6

[12] 12. Chang W, Yang G, Yu J, Liang Z. Real-time segmentation of various insulators using generative adversarial networks. IET Computer Vision. 2018;12(5):596-602. DOI: 10.1049/iet-cvi.2017.0591

[13] 13. Sampedro Pérez C, Rodriguez-Vazquez J, Rodríguez Ramos A, Carrio A, Campoy P. Deep learning-based system for automatic recognition and diagnosis of electrical insulator strings. IEEE Access. 2019;7:1

[14] 14. Liu C, Wu Y, Liu J, Han J. MTI-YOLO: A light-weight and real-time deep neural network for insulator detection in complex aerial images. Energies. 2021;14(5):1426

[15] 15. Wu C, Ma X, Kong X, Zhu H. Research on insulator defect detection algorithm of transmission line based on CenterNet. PLoS One. 2021;16(7):e0255135

[16] 16. Lim SH, Young S, Patton R. An analysis of image storage systems for scalable training of deep neural networks. In: The Seventh Workshop on Big Data Benchmarks, Performance Optimization, and Emerging Hardware (in Conjunction with ASPLOS’16). Atlanta, GA, USA, 2016. Oak Ridge, TN (United States): Oak Ridge National Lab. (ORNL); 2016

[17] 17. Dutta A, Zisserman A. The VIA annotation software for images, audio and video. In: Amsaleg L, Huet B, Larson MA, Gravier G, Hung H, Ngo C, et al., editors. Proceedings of the 27th ACM international conference on multimedia, MM 2019, Nice, France, October 21-25, 2019. New York, NY, United States: ACM; 2019. p. 2276-2279. DOI: 10.1145/3343031.3350535

[18] 18. Sager C, Janiesch C, Zschech P. A survey of image labelling for computer vision applications. arXiv. 2021

[19] 19. Schmelzer R. Data engineering, preparation, and labeling for AI 2019 CGR-DE100. Cognilytica. 2019

[20] 20. Sukhbaatar S, Bruna J, Paluri M, Bourdev L, Fergus R. Training convolutional networks with Noisy labels. arXiv e-prints. 2014

[21] 21. Agarwal V, Podchiyska T, Banda JM, Goel V, Leung TI, Minty EP, et al. Learning statistical models of phenotypes using noisy labeled training data. Journal of the American Medical Informatics Association. 2016;23(6):1166-1173. DOI: 10.1093/jamia/ocw028

[22] 22. Everingham M, Gool LV, Williams CKI, Winn JM, Zisserman A. The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision. 2010;88(2):303-338. DOI: 10.1007/s11263-009-0275-4

[23] 23. Lv T, Yan P, He W. Survey on JSON data modelling. Journal of Physics: Conference Series. 2018;1069:012101

[24] 24. Liu X, Miao X, Jiang H, Chen J. Review of data analysis in vision inspection of power lines with an in-depth discussion of deep learning technology. arXiv. 2020

[25] 25. Bianco S, Ciocca G, Napoletano P, Schettini R. An interactive tool for manual, semi-automatic and automatic video annotation. Computer Vision and Image Understanding. 2015;131:88-99. DOI: 10.1016/j.cviu.2014.06.015

[26] 26. Larumbe-Bergera A, Porta S, Cabeza R, Villanueva A. SeTA: Semiautomatic tool for annotation of eye tracking images. In: Krejtz K, Sharif B, editors. Proceedings of the 11th ACM Symposium on eye Tracking Research & Applications, ETRA 2019, Denver, CO, USA, June 25–28, 2019. New York, NY, United States: ACM; 2019. pp. 1-45. DOI: 10.1145/3314111.3319830

[27] 27. Zhuo X, Fraundorfer F, Kurz F, Reinartz P. Automatic annotation of airborne images by label propagation based on a Bayesian-CRF model. Remote Sensing. 2019;11(2):145. DOI: 10.3390/rs11020145

[28] 28. Cao J, Zhao A, Zhang Z. Automatic image annotation method based on a convolutional neural network with threshold optimization. PLoS One. 2020;15(9):1-21. DOI: 10.1371/journal.pone.0238956

[29] 29. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 1. NIPS’15. Cambridge, MA, USA: MIT Press; 2015. pp. 91-99

[30] 30. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. arXiv. 2015

[31] 31. He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV). NJ, USA: IEEE; 2017. pp. 2980-2988. DOI: 10.1109/ICCV.2017.322

[32] 32. Zhao Z, Zhen Z, Zhang L, Qi Y, Kong Y, Zhang K. Insulator detection method in inspection image based on improved faster R-CNN. Energies. 2019;12(7):1204

[33] 33. Redmon J, Farhadi A. YOLOv3: An Incremental Improvement. arXiv. 2018

[34] 34. Koo KM, Cha EY. Image recognition performance enhancements using image normalization. Human-centric Computing and Information Sciences. 2017;7(1):33. DOI: 10.1186/s13673-017-0114-5

[35] 35. Ulyanov D, Vedaldi A, Lempitsky VS. Instance normalization: The missing ingredient for fast stylization. arXiv. 2016

[36] 36. Shorten C, Khoshgoftaar TM. A survey on image data augmentation for deep learning. Journal of Big Data. 2019;6:60. DOI: 10.1186/s40537-019-0197-0

[37] 37. Zoph B, Cubuk ED, Ghiasi G, Lin T, Shlens J, Le QV. Learning data augmentation strategies for object detection. In: Vedaldi A, Bischof H, Brox T, Frahm J, editors. Computer Vision-ECCV 2020-16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVII. Vol. 12372 of Lecture Notes in Computer Science. Cham: Springer; 2020. pp. 566-583. DOI: 10.1007/978-3-030-58583-9_34

[38] 38. Ghiasi G, Cui Y, Srinivas A, Qian R, Lin T, Cubuk ED, et al. Simple copy-paste is a strong data augmentation method for instance segmentation. arXiv. 2020

[39] 39. McInnes L, Healy J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv. 2018

Deep Learning - Recent Findings and Research
