Open access peer-reviewed chapter - ONLINE FIRST

Use of Data Mining Algorithms in Chicken Breeding: A Systematic Review

Written By

Thobela Louis Tyasi, Madumetja Cyril Mathapo, Kwena Mokoena, Victoria Rankotsane Hlokoe and Kagisho Madikadike Molabe

Submitted: 16 January 2024 Reviewed: 24 January 2024 Published: 30 April 2024

DOI: 10.5772/intechopen.1004389

From the Edited Volume

Association Rule Mining and Data Mining - Recent Advances, New Perspectives and Applications [Working Title]

Dr. Jainath Yadav

Abstract

Data mining algorithms have been used to reveal factors that can enhance live body weight and egg weight in chicken breeding. This work systematically reviewed published articles on the use of data mining algorithms in chicken breeding. ScienceDirect, Web of Science, PubMed and Google Scholar were searched using combinations of the keywords chicken or chicken breeding, data mining algorithm or decision tree, body weight and egg weight. Of the 120 articles retrieved, 8 were included. The 8 included articles were published from 2016 to 2021, and most originated from South Africa (n = 3), followed by Turkey (n = 2). CHAID was the most used data mining algorithm (n = 5), followed by CART (n = 4). Of the 8 included articles, 6 used the coefficient of determination (R2) as the selection criterion, and CART was found to be the best model, followed by CHAID. It is concluded that CART, followed by CHAID, is the recommended data mining algorithm for improving egg production and growth performance of chickens.

Keywords

  • body weight
  • coefficient of determination
  • chicken breeding
  • data mining algorithm
  • egg weight

1. Introduction

Chicken breeding focuses on improving production traits, including growth performance, carcass characteristics and egg production. Several studies have sought to improve growth performance [1, 2, 3] and egg production [1, 4, 5, 6, 7] using different data mining algorithms. Data mining algorithms are nonparametric methods that are considered superior and simpler for the statistical analysis of complex data sets [3]. Moreover, Gevrekçi and Takma [5] reported that they are computer-based procedures that extract evidence from data, remove multicollinearity and can handle large data sets. The data mining algorithms commonly used to estimate chicken live body weight are classification and regression tree (CART) and artificial neural network (ANN) in the Sasso breed [1], chi-square automatic interaction detector (CHAID) and exhaustive CHAID in Hy-line Silver Brown and Potchefstroom Koekoek chickens [3], and multivariate adaptive regression splines (MARS) in Hy-line Silver Brown chickens [2]. For egg traits, the algorithms used include CHAID in Hy-line Silver Brown and Boschveld layers [8], CHAID and CART in indigenous chickens of Zambia [7], k-nearest neighbor (KNN), linear discriminant analysis (LDA) and support vector machine (SVM) in Beijing You Chicken and Dwarf Beijing You Chicken [6], and CHAID and ridge regression (RR) in White layer hybrids [4]. CHAID and CART are the algorithms most commonly used to improve egg production [5].

To the best of the authors’ knowledge, there is no systematic review on the use of data mining algorithms in chicken breeding. To close this knowledge gap, the objective of this work was to systematically review the information on the use of data mining algorithms in chicken breeding. This chapter will help chicken breeders and researchers identify data mining algorithms that might be used to estimate live body weight and egg weight.

2. Methods and materials

2.1 Eligibility criteria

The Population, Exposure and Outcomes (PEO) components were identified as outlined by Bettany-Saltikov [9]. “Chicken” was defined as the population of the study, “Data mining algorithm or decision tree” as the exposure, and “Egg weight” and “Body weight” as the outcomes. A preliminary search of the PEO components on Google Scholar, Web of Science, PubMed and ScienceDirect was performed before deciding to conduct the study.

2.2 Search strategy

A literature search was performed independently by two investigators (Kwena Mokoena and Thobela Louis Tyasi) in Google Scholar, Web of Science, PubMed and ScienceDirect, covering records up to 10 November 2023. The search used the following keyword combinations: “Chicken” or “Chicken breeding”, “Data mining algorithm” or “Decision tree”, “Body weight” and “Egg weight”.
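
The keyword combination above can be sketched as a Boolean search string. This is only an illustration: the chapter does not state the exact operators or database syntax, so the AND/OR layout, the choice to join the two outcome terms with OR, and the helper names `or_group` and `build_query` are assumptions.

```python
# Keyword groups taken from the search strategy; how they were combined
# in each database is not stated, so the AND/OR layout is an assumption.
population = ["Chicken", "Chicken breeding"]
exposure = ["Data mining algorithm", "Decision tree"]
outcomes = ["Body weight", "Egg weight"]


def or_group(terms):
    """Quote each term, join the terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join the OR-groups with AND to form one Boolean search string."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(population, exposure, outcomes)
print(query)
```

Each database has its own field tags and quoting rules, so a string like this would normally be adapted per database before use.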

2.3 Inclusion criteria

Retrieved articles were screened for eligibility and considered for inclusion if they addressed the following:

  • Chicken

  • Data mining algorithm or decision tree

  • Egg weight

  • Live body weight

2.4 Exclusion criteria

Retrieved articles were excluded for the following reasons:

  • Records irrelevant to data mining algorithms, egg weight, carcass weight or body weight

  • Studies published as abstracts without full text

  • Duplicate records

  • Studies not on chickens

  • Articles with no original data available in the publication and whose authors could not be contacted

2.5 Data extraction

The data for the current study were extracted independently by Kwena Mokoena and Thobela Louis Tyasi, and agreement was reached on all items. The following information was obtained from each article:

  • First author

  • Year of publication

  • Number of eggs weighed

  • Chicken breed

  • Data mining algorithm or decision tree

  • Dependent variables (egg weight and live body weight).

2.6 Ethical considerations

In performing this work, all authors took care to avoid plagiarism, fabrication and data falsification.

3. Results

3.1 Searched results

One hundred and twenty (n = 120) articles were retrieved from the publication search, of which twenty-five (n = 25) were duplicates and were removed. As a result, ninety-five (n = 95) articles were considered for title and abstract screening, after which seventy-two (n = 72) were eliminated. Twenty-three (n = 23) articles were considered for full-text review, and fifteen (n = 15) were eliminated at this stage for the reasons stated in Figure 1. A total of eight (n = 8) articles qualified for inclusion in the study.

Figure 1.

Flowchart of identification and selection of studies for systematic review.
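
The selection counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative and not taken from the chapter.

```python
# Counts as reported in Section 3.1.
retrieved = 120
duplicates_removed = 25
excluded_title_abstract = 72
excluded_full_text = 15

# Each screening stage starts from what survived the previous one.
screened = retrieved - duplicates_removed                 # title/abstract screening
full_text_reviewed = screened - excluded_title_abstract   # full-text review
included = full_text_reviewed - excluded_full_text        # final inclusion

print(screened, full_text_reviewed, included)
```

The three derived values match the 95, 23 and 8 articles reported at each stage.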

3.2 Characterization of included articles

Table 1 shows the eight articles that met the inclusion criteria. Five studies [2, 3, 4, 5, 8] used commercial chicken breeds and their eggs. The study with the largest sample of chickens was [8], while the study with the largest sample of eggs was [4]. The most common chicken breed was the Hy-line Silver Brown layer [2, 3, 5, 8].

| Authors | Year | Country | Breeds | Data mining algorithm |
| --- | --- | --- | --- | --- |
| Gevrekçi and Takma | 2018 | Turkey | | Classification and regression tree (CART) and chi-square automatic interaction detector (CHAID) |
| Dong et al. | 2021 | China | Beijing You Chicken and Dwarf Beijing You Chicken | k-nearest neighbor (KNN), linear discriminant analysis (LDA) and support vector machine (SVM) |
| Liswaniso et al. | 2020 | Zambia | Indigenous chicken of Zambia | Classification and regression tree (CART) and chi-square automatic interaction detector (CHAID) |
| Okoro et al. | 2017 | South Africa | Hy-line Silver Brown and Boschveld layers | Chi-square automatic interaction detector (CHAID) |
| Orhan et al. | 2016 | Turkey | White layer hybrids | Chi-square automatic interaction detector (CHAID) and ridge regression (RR) |
| Tyasi et al. | 2020 | South Africa | Hy-line Silver Brown | Multivariate adaptive regression splines (MARS) |
| Tyasi et al. | 2021 | South Africa | Hy-line Silver Brown and Potchefstroom Koekoek | Classification and regression tree (CART), chi-square automatic interaction detector (CHAID) and exhaustive chi-square automatic interaction detector (exhaustive CHAID) |
| Yakubu and Madaki | 2017 | Nigeria | Sasso breed | Artificial neural network (ANN) and classification and regression tree (CART) |

Table 1.

Characteristics of included studies.

3.3 Publication by year

Figure 2 shows the year of publication of the included articles. The years 2017 [1, 8], 2020 [2, 7] and 2021 [3, 6] each had the highest number of articles (n = 2), while 2016 [4] and 2018 [5] had the fewest (n = 1).

Figure 2.

Publication by year.

3.4 Publication by country

The origin of the included articles is presented in Figure 3. Of the eight articles included in the review, South Africa [2, 3, 8] had the highest number (n = 3), followed by Turkey [4, 5] with two articles. Nigeria [1], Zambia [7] and China [6] each contributed one article.

Figure 3.

Publications by country.
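
The per-year and per-country counts can be reproduced from the year and country columns of Table 1. The sketch below transcribes those two columns by hand; the tuple layout and variable names are assumptions made for illustration.

```python
from collections import Counter

# (authors, year, country) transcribed from Table 1.
studies = [
    ("Gevrekçi and Takma", 2018, "Turkey"),
    ("Dong et al.", 2021, "China"),
    ("Liswaniso et al.", 2020, "Zambia"),
    ("Okoro et al.", 2017, "South Africa"),
    ("Orhan et al.", 2016, "Turkey"),
    ("Tyasi et al.", 2020, "South Africa"),
    ("Tyasi et al.", 2021, "South Africa"),
    ("Yakubu and Madaki", 2017, "Nigeria"),
]

# Tally articles per publication year and per country of origin.
by_year = Counter(year for _, year, _ in studies)
by_country = Counter(country for _, _, country in studies)

for year in sorted(by_year):
    print(year, by_year[year])
for country, count in by_country.most_common():
    print(country, count)
```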

3.5 Publications by data mining algorithms

Figure 4 shows the number of published articles by data mining algorithm. CHAID was the most commonly used data mining algorithm (n = 5), followed by CART (n = 4). KNN, LDA, SVM, RR, MARS, exhaustive CHAID and ANN were each used in one article (n = 1).

Figure 4.

Publications by data mining algorithms.
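
As a cross-check, the algorithm frequencies can be tallied from the last column of Table 1. In this sketch the dictionary keys are informal study labels chosen for illustration, and each study counts an algorithm at most once.

```python
from collections import Counter

# Algorithms per included study, transcribed from Table 1.
algorithms_by_study = {
    "Gevrekçi and Takma 2018": ["CART", "CHAID"],
    "Dong et al. 2021": ["KNN", "LDA", "SVM"],
    "Liswaniso et al. 2020": ["CART", "CHAID"],
    "Okoro et al. 2017": ["CHAID"],
    "Orhan et al. 2016": ["CHAID", "RR"],
    "Tyasi et al. 2020": ["MARS"],
    "Tyasi et al. 2021": ["CART", "CHAID", "Exhaustive CHAID"],
    "Yakubu and Madaki 2017": ["ANN", "CART"],
}

# Count how many studies used each algorithm.
usage = Counter(
    algorithm
    for algorithms in algorithms_by_study.values()
    for algorithm in algorithms
)

for algorithm, count in usage.most_common():
    print(f"{algorithm}: {count}")
```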

3.6 Predictive performance of data mining algorithms

Table 2 shows the predictive performance of the data mining algorithms used in the included articles. Of the eight included articles, seven reported goodness-of-fit criteria. Six of these seven used the coefficient of determination (R2) as the selection criterion, while one used CV, RAE, MAD and RMSE [5]. Among the six studies that used R2, CART was found to be the best model, followed by CHAID. The study of Gevrekçi and Takma [5] indicated that CHAID was the best data mining algorithm.
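
For readers less familiar with these criteria, the sketch below shows how R2, RMSE and MAD are commonly computed from observed and predicted values. The formulas are the usual textbook definitions, which may differ in detail from those used in the included studies (for example, in the divisor used for RMSE), and the egg weights are made-up numbers used only to exercise the functions.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    return 1 - ss_residual / ss_total


def rmse(observed, predicted):
    """Root mean square error, dividing by n (an assumption)."""
    squared_errors = [(y - p) ** 2 for y, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def mad(observed, predicted):
    """Mean absolute deviation between observed and predicted values."""
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / len(observed)


# Hypothetical egg weights in grams, not data from any included study.
observed = [50.0, 55.0, 60.0, 65.0]
predicted = [52.0, 54.0, 61.0, 63.0]

print(r_squared(observed, predicted))
print(rmse(observed, predicted))
print(mad(observed, predicted))
```

A higher R2 indicates a better fit, whereas lower RMSE and MAD indicate a better fit, which is why the two kinds of criteria are read in opposite directions in Table 2.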

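| Author and year | Dependent variable | Goodness of fit criteria | CART | CHAID | Exhaustive CHAID | RR | MARS | ANN | KNN | LDA | SVM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gevrekçi and Takma, 2018 | Egg production | CV% | 10.57 | 9.32 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Gevrekçi and Takma, 2018 | Egg production | RAE | 0.0024 | 0.0021 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Gevrekçi and Takma, 2018 | Egg production | MAD | 8.85 | 7.56 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Gevrekçi and Takma, 2018 | Egg production | RMSE | 11.25 | 9.93 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Dong et al. 2021 | Egg discrimination (fatty acid) | R2 | N/A | N/A | N/A | N/A | N/A | N/A | 91.7% | 83.3% | 91.7% |
| Dong et al. 2021 | Egg discrimination (flavor characteristics) | R2 | N/A | N/A | N/A | N/A | N/A | N/A | 50% | N/A | 16.7% |
| Liswaniso et al. 2020 | Egg weight | R2 | 59.3% | 82.3% | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Orhan et al. 2016 | Egg weight | R2 | N/A | 99.98% | N/A | 93.15% | N/A | N/A | N/A | N/A | N/A |
| Tyasi et al. 2020 | Body weight | R2 | N/A | N/A | N/A | N/A | 100% | N/A | N/A | N/A | N/A |
| Yakubu and Madaki, 2017 | Body weight (deep litter) | R2 | 93.4% | N/A | N/A | N/A | N/A | 87% | N/A | N/A | N/A |
| Yakubu and Madaki, 2017 | Body weight (battery cage) | R2 | 93.4% | N/A | N/A | N/A | N/A | 99% | N/A | N/A | N/A |
| Tyasi et al. 2021 | Body weight | R2 | 83.2% | 65.9% | 64.1% | N/A | N/A | N/A | N/A | N/A | N/A |
| Okoro et al. 2017 | Egg size and performance | R2 | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |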

Table 2.

Goodness of fit criteria for data mining algorithms in prediction of dependent variables.

CART: Classification and regression tree; CHAID: Chi-square automatic interaction detector; RR: Ridge regression; MARS: Multivariate adaptive regression splines; ANN: Artificial neural network; KNN: K-nearest neighbor; LDA: linear discriminant analysis; SVM: Support vector machine; R2: Coefficient of determination; CV: Coefficient of variation; RAE: Relative approximate error; MAD: Mean absolute deviation; RMSE: Root mean square error; N/A: Not applicable.
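
One way to read the R2 rows of Table 2 is to pick, within each row, the model with the highest reported value. The sketch below does this for the rows that report R2, using the values as they appear in the table. The row labels and the helper `top_models` are illustrative, and ties are reported together rather than broken.

```python
# R2 values (%) transcribed from Table 2; N/A cells are omitted.
r2_percent = {
    "Dong et al. 2021 (fatty acid)": {"KNN": 91.7, "LDA": 83.3, "SVM": 91.7},
    "Dong et al. 2021 (flavor)": {"KNN": 50.0, "SVM": 16.7},
    "Liswaniso et al. 2020": {"CART": 59.3, "CHAID": 82.3},
    "Orhan et al. 2016": {"CHAID": 99.98, "RR": 93.15},
    "Tyasi et al. 2020": {"MARS": 100.0},
    "Yakubu and Madaki 2017 (deep litter)": {"CART": 93.4, "ANN": 87.0},
    "Yakubu and Madaki 2017 (battery cage)": {"CART": 93.4, "ANN": 99.0},
    "Tyasi et al. 2021": {"CART": 83.2, "CHAID": 65.9, "Exhaustive CHAID": 64.1},
}


def top_models(scores):
    """Return the model name(s) with the highest R2 in one table row."""
    best = max(scores.values())
    return sorted(model for model, value in scores.items() if value == best)


best_by_row = {row: top_models(scores) for row, scores in r2_percent.items()}

for row, models in best_by_row.items():
    print(f"{row}: {', '.join(models)}")
```

Because the rows come from different breeds, traits and sample sizes, the comparison is made within each row only; R2 values from different studies are not ranked against one another here.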

4. Discussion

This review was conducted to identify, from the 8 included articles, the data mining algorithm best suited to chicken breeding. CHAID was the most used data mining algorithm (5/8), followed by CART (4/8). Six of the included articles used the coefficient of determination (R2) as the selection criterion, which suggests that R2 is the most widely accepted goodness-of-fit criterion for selecting the best model; CV, RAE, MAD and RMSE were used by only one article [5]. Among the six articles that used R2 as the selection criterion, CART was found to be the best model, followed by CHAID. CART is a machine learning technique used to build a decision tree [2]. The study of Gevrekçi and Takma [5] indicated that CHAID was the best data mining algorithm. Liswaniso et al. [7] used different data mining algorithms to predict egg weight as the dependent variable from egg characteristics and found that CHAID was the best model. Similarly, Tyasi et al. [3] found that the CHAID model was the best in predicting the body weights of Hy-line Silver Brown commercial layers and Potchefstroom Koekoek indigenous chickens raised in South Africa. To the best of the authors’ knowledge, this is the first systematic review reporting the use of data mining algorithms in chicken breeding. Hence, there are no comparable reviews against which to compare our findings. The implication of this work is that the CART method might be used to predict live body weight and egg weight for the improvement of growth performance and egg production in different countries. A strength of this review is that no similar study has been done.
The contribution of this systematic review is that, among the statistical techniques commonly used to predict egg weight and live body weight in chickens, CART is the best model for use in chicken breeding. However, more studies are needed to confirm and extend these results.

5. Conclusion

The current systematic review was conducted to identify the best data mining algorithm that chicken breeders might use to identify factors for improving live body weight and egg weight. The included articles identified factors that can be used as selection criteria during breeding to improve egg production and growth performance. The included articles used different data mining algorithms, including classification and regression tree (CART), artificial neural network (ANN), chi-square automatic interaction detector (CHAID), exhaustive chi-square automatic interaction detector (exhaustive CHAID), multivariate adaptive regression splines (MARS), k-nearest neighbor (KNN), linear discriminant analysis (LDA), support vector machine (SVM) and ridge regression (RR). They used goodness-of-fit criteria such as the coefficient of determination and root mean square error to select the best data mining algorithm. This systematic review concludes that CART was the best data mining algorithm for use in chicken breeding, followed by CHAID. Researchers should therefore consider using the CART and CHAID methods in chicken breeding to predict egg weight and live body weight.

Conflict of interest

The authors declare no competing interests.

Declarations

We declare that this is our work.

References

  1. Yakubu A, Madaki J. Modelling growth of dual-purpose Sasso hens in the tropics using different algorithms. Journal Genetic Biology. 2017;1(1):1-9
  2. Tyasi TL, Makgowo KM, Mokoena K, Rashijane LT, Mathapo MC, Danguru LW, et al. Multivariate adaptive regression splines data mining algorithm for prediction of body weight of Hy-line Silver Brown commercial layer chicken breed. Advances in Animal and Veterinary Sciences. 2020;8(8):794-799. DOI: 10.17582/journal.aavs/2020/8.8.794.799
  3. Tyasi TL, Eyduran E, Celik S. Comparison of tree-based regression tree methods for predicting live body weight from morphological traits in Hy-line Silver Brown commercial layer and indigenous Potchefstroom Koekoek breeds raised in South Africa. Tropical Animal Health and Production. 2021;53(7):1-8. DOI: 10.1007/s11250-020-02443-y
  4. Orhan H, Eyduran E, Tatliye A, Saygici H. Prediction of egg weight from egg quality characteristics via ridge regression and regression tree methods. 2016;45(7):380-385
  5. Gevrekci Y, Takma C. A comparative study for egg production in layers by decision tree analysis. Pakistan Journal of Zoology. 2018;50(2):437-444. DOI: 10.17582/journal.pjz/2018.50.2.437.444
  6. Dong X, Gao L, Zhang H, Wang J, Qiu K, Qi G, et al. Discriminating eggs from two local breeds based on fatty acid profile and flavor characteristics combined with classification algorithms. Food Science of Animal Resources. 2021;41(6):936-949. DOI: 10.5851/kosfa.2021.e47
  7. Liswaniso S, Qin N, Tyasi TL, Chimbaka IM, Sun X, Xu R. Use of data mining algorithms CHAID and CART in predicting egg weight from egg quality traits of indigenous free-range chickens in Zambia. Advances in Animal and Veterinary Sciences. 2020;9(2):215-220. DOI: 10.17582/journal.aavs/2021/9.2.215.220
  8. Okoro VMO, Ravhuhali KE, Mapholi TH, Mbajiorgu EF, Mbajiorgu AC. Comparison of commercial and locally developed layers performance and egg size prediction using regression tree methods. The Journal of Applied Poultry Research. 2017;26:477-484
  9. Bettany-Saltikov J. Learning how to undertake a systematic review: Part 2. Nursing Standard. 2010;24:47-56
