Given that common egg counting methods on conventional layer farms are inefficient and costly, there is growing demand for cost-effective solutions with high counting accuracy, expandable functionality, and the flexibility to be shared easily between different coops. However, accurate real-time egg counting is challenging owing to the eggs' small size, density variation, and mutual similarity, exacerbated by dynamic poses. Moreover, current methods in the animal industry emphasize single-image counting, limiting their suitability for video-based counting because they lack frame-to-frame target association. To tackle these issues, the you only look once version 5-DeepSORT-spatial encoding (YOLOv5-DSE) algorithm is proposed as a solution for efficient and reliable egg counting. The algorithm contains three main modules: 1) an egg detector that utilizes the improved YOLOv5 to locate eggs in video frames automatically; 2) a DeepSORT-based tracking module that continuously tracks each egg's position between frames, preventing the detector from losing egg localization; and 3) a spatial encoding (SE) module designed to count eggs. Extensive experiments are conducted on 4808 eggs on a commercial farm. The proposed egg-counting approach achieves a counting accuracy of 99.52% and a speed of 22.57 fps, surpassing not only the DeepSORT-SE and ByteTrack-SE versions of eight advanced YOLO-series object detectors (YOLOX and YOLOv6-v9) but also other egg-counting methods. The proposed YOLOv5-DSE provides real-time and reliable egg counting for commercial layer farms. This approach could be further extended along the egg conveyor to locate the cages of low-laying hens and help companies cull more efficiently.
Index Terms-Computer vision, deep learning, edge computing, egg counting, smart-oriented layer farms.
I. Introduction
WITH rising global population and urbanization, egg demand remains high [1]. In China, more than 88.0%
Automatic and reliable egg counting via intelligent CV algorithms on the collection transmission line of a commercial layer farm is the goal of this study. Theoretically, using an object detection (OD) algorithm to directly detect eggs in each frame seems intuitive. However, eggs' small size (occupying less than 1.1% of the image) and varying density pose significant challenges for reliable detection. Moreover, lacking object tracking may result in duplicate counts across consecutive frames, compromising count accuracy. Though object tracking correlates eggs across frames, direct counting using tracked IDs is unreliable due to occlusions, overlaps, and losses, necessitating additional methods to enhance count stability.
Generally, existing counting methods in agriculture fall into two categories: 1) Handcrafted feature-based methods [4] rely on prior scene knowledge to manually design suitable features and
Recently, researchers in agriculture have presented many works tackling counting problems in various scenarios. Liu et al. [9], [10] designed a fruit counting pipeline using a monocular camera. Target tracking is performed with a Kalman-filter-corrected Kanade-Lucas-Tomasi tracker based on fruit segmentation. Subsequently, structure from motion is utilized to acquire relative three-dimensional (3-D) positions for counting. This method is suitable only for rigid-shape, stationary target counting tasks and does not work for moving eggs. Chen et al. [6] combined keypoint detection with online tracking to determine pig correlations between video frames. With the aid of spatial filtering, a stable estimate of the pig count can be achieved; this technique is deployed on a commercial inspection robot by JD Finance America Corporation. However, detecting the key points of a small, nearly circular target such as an egg presents greater challenges. With respect to eggs, Ulaszewski et al. [14] integrated the YOLOv3 object detection algorithm and a distance tracker for egg counting on production lines. However, without algorithmic enhancements to improve egg localization and interframe target association, or to mitigate ID errors from background interference, occlusion, or egg pose variations, its performance is limited.
Motivated by the aforementioned observations, in this article, based on the constructed counting system, we propose a you only look once version 5-DeepSORT-spatial encoding (YOLOv5-DSE) real-time egg counting pipeline to achieve automatic and high counting accuracy on commercial layer farms. Here, YOLOv5 is an improved version of the you only look once (YOLO) OD algorithm designed to localize targets of interest (eggs in this study) in images automatically. As demonstrated in
Fig. 3, the proposed approach consists of three main modules: 1) egg detector module, 2) DeepSORT-based egg tracker module, and 3) spatial encoding (SE) based counting module. Specifically, the CIoU-enhanced YOLOv5 egg detector is employed to reliably detect eggs in video frames, and then, these positions are used for the follow-up egg tracker; the DeepSORT-based tracking module is used to continuously track each egg's position by correlating sequence frames to ensure stable position tracking; the SE module reliably counts eggs based on tracking results by encoding spatial positions within the camera's field of view (FOV) and analyzing the encoding sequence of each egg.
In our work, the algorithm is deployed in the constructed egg counting system based on an edge computing device on a commercial layer farm. The main contributions of this work are as follows.
The proposed enhanced YOLOv5 egg detector, leveraging the CIoU loss, substantially improves egg detection accuracy by effectively addressing the challenge of accurately determining the overlap between predicted and true bounding boxes. This improvement imposes no additional computational burden during inference, thereby enhancing the efficiency and effectiveness of the detection process.
A novel SE module is designed to count eggs. It employs an FOV coding strategy that avoids the errors linked to directly using tracked egg IDs. To ensure accurate counting, eggs whose encoding has not yet changed are also taken into account in the start and end frames.
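The SE counting rule can be sketched as follows. This is a minimal illustration rather than the paper's exact implementation: the FOV partition is reduced to a single hypothetical counting line at `boundary`, tracks are assumed to arrive as per-ID lists of `(frame_index, x_center)` observations, and `last_frame` marks the final frame of the clip.

```python
def zone_code(x_center, boundary):
    """Encode an egg's horizontal position relative to a counting line:
    code 0 before the line, code 1 after it (an illustrative two-zone FOV)."""
    return 0 if x_center < boundary else 1


def count_eggs(tracks, boundary, last_frame):
    """Count eggs from tracked trajectories.

    tracks: dict mapping tracked egg ID -> list of (frame_index, x_center)
    observations (a hypothetical input format). An egg is counted once when
    its code sequence changes 0 -> 1; eggs whose code never changes are
    credited only when they start past the line in the first frame or are
    still before it in the last frame, mirroring the start/end-frame rule.
    """
    counted = set()
    for egg_id, obs in tracks.items():
        codes = [(f, zone_code(x, boundary)) for f, x in obs]
        # Counted on a 0 -> 1 transition: the egg crossed the counting line.
        if any(a == 0 and b == 1 for (_, a), (_, b) in zip(codes, codes[1:])):
            counted.add(egg_id)
        # Start/end-frame handling for eggs whose encoding never changed.
        elif codes[0] == (0, 1) or codes[-1] == (last_frame, 0):
            counted.add(egg_id)
    return len(counted)
```

Counting on a code transition rather than on raw IDs means a track that is lost and re-detected on the same side of the line cannot be counted twice.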
The proposed counting pipeline presents a practical and efficient system for egg counting on commercial layer farms. Through extensive experiments involving 4808 eggs, the system achieves an outstanding counting accuracy of 99.52% and operates at a speed of 22.57 fps, surpassing existing methods in both real-time accuracy and speed. We also investigated the impact of the egg collection transmission line's direction on counting accuracy. Experimental results show no significant accuracy changes with direction alteration.
Fig. 3. Technical route of the proposed method. The enhanced YOLOv5 egg detector first detects the bounding boxes of all eggs in each video frame. Then, the DeepSORT online egg tracker creates the temporal correlation between frames to avoid losing their localization and assigns IDs to the eggs. Finally, the SE (spatial encoding) module produces the final counts.
II. Methodology
A. Overview of the Proposed Approach
To provide an efficient and intelligent egg-counting algorithm for commercial layer farms, as illustrated in Fig. 3, the proposed YOLOv5-DSE egg-counting pipeline is composed of three main modules. First, to locate eggs quickly and accurately, an egg detector module is constructed by improving YOLOv5 with the CIoU loss on a manually labeled egg detection dataset. Second, within the framework of the DeepSORT egg tracker module, eggs are correlated across different frames to achieve continuous and reliable tracking, mitigating the risk of target loss. Finally, a spatial encoding (SE) module is designed to enhance the robustness of egg counting; it incorporates spatial and identity information to improve the accuracy and reliability of the counting process.
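The three-module flow can be sketched as a frame-by-frame loop. In this sketch, `detector` and `tracker` are injected callables standing in for the improved YOLOv5 model and the DeepSORT tracker, and the single counting line at `boundary` is an illustrative simplification of the SE module's FOV encoding; none of these names come from the paper's code.

```python
class EggCounterPipeline:
    """Minimal sketch of the YOLOv5-DSE control flow: detect, track, count."""

    def __init__(self, detector, tracker, boundary):
        self.detector = detector  # frame -> list of (x1, y1, x2, y2) boxes
        self.tracker = tracker    # boxes -> list of (egg_id, box) pairs
        self.boundary = boundary  # x position of the SE counting line
        self.prev_code = {}       # egg_id -> last observed zone code
        self.count = 0

    def process(self, frame):
        # 1) Detector localizes eggs in the current frame.
        boxes = self.detector(frame)
        # 2) Tracker associates detections with persistent IDs.
        for egg_id, (x1, _, x2, _) in self.tracker(boxes):
            # 3) SE rule: count once when an egg's zone code flips 0 -> 1.
            code = 0 if (x1 + x2) / 2 < self.boundary else 1
            if self.prev_code.get(egg_id) == 0 and code == 1:
                self.count += 1
            self.prev_code[egg_id] = code
```

With real models, `detector` would wrap a YOLOv5 forward pass and `tracker` a DeepSORT update; the counting logic itself stays unchanged.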
B. Egg Detector Module Based on Improved YOLOv5
YOLO is a CNN-based OD technique that treats the detection problem as a regression task. It is widely used in the field of target detection because of its speed and good adaptability to complex scenes [12], [13], [14], [15]. YOLOv5 [16] is an improved version of the YOLO series that introduced three data enhancement methods, namely, mosaic augmentation, adaptive anchor box computation, and adaptive picture scaling. It also employs advanced feature extraction and fusion structures such as the focus module, cross-stage partial (CSP) connections, the feature pyramid network, and the path aggregation network. Equipped with these techniques, YOLOv5 achieves a good tradeoff between detection performance and cost. As illustrated in Fig. 4, it includes four parts: the input, backbone, neck, and prediction head.
The input stage involves filtering and labeling egg images, resizing them, and sending them to the backbone; it also applies the above three image enhancement methods to improve detection performance. In addition, CSP and focus structures are integrated into the backbone to enhance the learning ability while maintaining the accuracy of the model. Moreover, the CSP structure designed in the neck effectively improves its feature fusion capability. The prediction head is composed of the GIoU loss and nonmaximum suppression functions, which eliminate redundant bounding boxes to achieve egg detection.
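Greedy nonmaximum suppression as used in the prediction head can be illustrated in a few lines. This is a generic sketch of the algorithm, not the YOLOv5 source: keep the highest-scoring box, drop any remaining box that overlaps it beyond a threshold, and repeat.

```python
def iou(a, b):
    """IoU of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: return indices of the boxes kept after suppressing
    lower-scoring boxes that overlap a kept box above `thresh`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

In practice the threshold trades duplicate suppression against merging adjacent eggs: too high leaves duplicate boxes, too low can suppress a genuinely neighboring egg on a dense conveyor.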
As shown in Fig. 5(a), although the GIoU solves the problem of the true bounding box and the predicted bounding box not intersecting, when the predicted box lies within the true box, i.e., $A \cap B = B$, the position information of the predicted box cannot be obtained well.
Fig. 4. Egg detector module's structure based on the fine-tuned YOLOv5. The detector consists of three main parts, the backbone, the neck, and the head, which are used for image feature extraction, multiscale feature fusion, and outputting detection bounding boxes, respectively.
Fig. 5. Schematic diagram of different IoU variants. (a) GIoU's inability to identify the bounding box's location. (b) CIoU's principle diagram.
To address this issue, we introduce the CIoU loss, which simultaneously considers the sizes of boxes $A$ and $B$ as well as the distance between them. As shown in Fig. 5(b), the CIoU can accurately determine the overlap between bounding boxes $A$ and $B$ by considering their scales and center distances. This allows the position information of the predicted box to be extracted, which improves the network parameters during backpropagation. It is calculated as
$$CIoU_{loss} = 1 - IoU + \frac{d^2}{L^2} + \frac{z^2}{(1 - IoU) + z}$$
where $L$ is the diagonal of the smallest box enclosing $A$ and $B$, $d$ is the distance between the center points of $A$ and $B$, and $z$ is the parameter measuring the aspect-ratio consistency of the predicted bounding box
$$z = \frac{4}{\pi^2}\left(\arctan\frac{w^A}{h^A} - \arctan\frac{w^B}{h^B}\right)^2$$
where $w^A$, $h^A$, $w^B$, and $h^B$ are the widths and heights of the true and the predicted bounding boxes, respectively.
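Under the definitions above, the CIoU loss for a pair of axis-aligned boxes can be computed as follows. This is a plain reference sketch, not the training code; `eps` is a small constant added for numerical stability and is not part of the formula itself.

```python
import math

def ciou_loss(box_a, box_b, eps=1e-9):
    """CIoU loss between ground-truth box_a and predicted box_b,
    each given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # IoU term.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # d^2: squared distance between box centers;
    # L^2: squared diagonal of the smallest box enclosing A and B.
    d2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    L2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
         + (max(ay2, by2) - min(ay1, by1)) ** 2

    # z: aspect-ratio consistency term.
    z = (4 / math.pi ** 2) * (
        math.atan((ax2 - ax1) / (ay2 - ay1))
        - math.atan((bx2 - bx1) / (by2 - by1))
    ) ** 2

    return 1 - iou + d2 / (L2 + eps) + z ** 2 / ((1 - iou) + z + eps)
```

Identical boxes give a loss of zero, while disjoint boxes are penalized beyond the plain 1 − IoU = 1 by the normalized center distance, which is what provides a useful gradient even when the boxes do not overlap.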
Received 19 March 2024; revised 22 May 2024, 3 July 2024, and 31 July 2024; accepted 16 August 2024. Date of publication 23 September 2024; date of current version 7 January 2025. This work was supported in part by the National Key R&D Program of China under Grant 2023YFD2000801, in part by the Fundamental Research Funds for the Central University under Grant 226-2022-00067, and in part by the China Agriculture Research System under Grant CARS-40. Paper no. TII-24-1255. (Corresponding author: Yibin Ying.)