Object detection is the task of locating and classifying objects in an image. Most current detectors rely on deep learning, which can predict an object’s position in an image quickly and accurately and is therefore useful in a wide range of applications.
Because objects in natural scenes are typically upright due to gravity, early research concentrated mainly on horizontal object detection. In other contexts, such as aerial imagery, industrial inspection, and scene text, oriented bounding boxes are preferred, and oriented object detection has rapidly gained prominence as a result. Unfortunately, two significant challenges remain: the boundary discontinuity problem, caused mainly by angular periodicity, and the square-like problem, which arises when the orientation of a square bounding box cannot be uniquely defined. To address these problems, a research team from Southeast University in China proposes using phase-shifting coding for angle prediction in oriented object detection.
The authors adapt phase-shifting coding (PSC), a technique originally developed for optical measurement, to oriented object detection. This choice was made for two main reasons:
1 – In optical measurement, phase-shifting converts the measured distance into periodic phases. The orientation angle can likewise be encoded into periodic phases, which automatically resolves the boundary discontinuity.
2 – Phase-shifting has its own periodic fuzzy problem, which is comparable to the square-like problem, and several solutions to it already exist. The dual-frequency phase-shifting approach, for instance, resolves the periodic fuzzy problem by combining phases of several frequencies.
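The first point can be illustrated with a minimal sketch, which is not taken from the authors' code. It assumes the common N-step phase-shifting formulation, in which a phase is represented by N cosines with evenly spaced shifts and recovered with an arctangent. The function names, the choice of three steps, and the angle range [-90°, 90°) are assumptions made for illustration.

```python
import numpy as np

N_STEPS = 3  # number of phase shifts (assumed value for illustration)


def psc_encode(theta, n_steps=N_STEPS):
    """Encode an angle theta (radians, in [-pi/2, pi/2)) as n_steps cosines."""
    # A rectangular box repeats every 180 degrees, so 2 * theta spans one full cycle.
    phase = 2.0 * theta
    shifts = 2.0 * np.pi * np.arange(n_steps) / n_steps
    return np.cos(phase + shifts)


def psc_decode(code):
    """Recover theta from the phase-shifted cosines (requires at least 3 steps)."""
    n_steps = len(code)
    shifts = 2.0 * np.pi * np.arange(n_steps) / n_steps
    # Standard N-step phase retrieval: the sums isolate sin(phase) and cos(phase).
    phase = np.arctan2(-np.sum(code * np.sin(shifts)),
                       np.sum(code * np.cos(shifts)))
    return phase / 2.0


# Two nearly identical boxes on either side of the angular boundary.
code_a = psc_encode(np.deg2rad(89.9))
code_b = psc_encode(np.deg2rad(-89.9))
print(np.abs(code_a - code_b).max())  # small, although the raw angles differ by 179.8 degrees
```

In this sketch, a regression target built from the encoded values changes smoothly across the boundary, whereas the raw angle jumps by almost 180 degrees.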
The authors argue that the boundary problem and the square-like problem can be unified by reconsidering both. The boundary problem arises because a rectangular bounding box is identical to itself when rotated by 180 degrees, whereas the square-like problem arises because a square bounding box is identical to itself when rotated by 90 degrees. Although their cycles differ, both are periodic fuzzy problems. An enhanced version, the dual-frequency phase-shifting coder (PSCD), is proposed to handle both.
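The idea of treating the two periods as two frequencies can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the angle is encoded once with a 180-degree period and once with a 90-degree period, and the decoder is assumed to take the fine 90-degree phase and use the coarse 180-degree phase only to choose between the two candidate angles. All function names are hypothetical.

```python
import numpy as np


def _encode(phase, n_steps):
    """Represent a phase as n_steps cosines with evenly spaced shifts."""
    shifts = 2.0 * np.pi * np.arange(n_steps) / n_steps
    return np.cos(phase + shifts)


def _decode(code):
    """Recover a phase in (-pi, pi] from its phase-shifted cosines."""
    shifts = 2.0 * np.pi * np.arange(len(code)) / len(code)
    return np.arctan2(-np.sum(code * np.sin(shifts)),
                      np.sum(code * np.cos(shifts)))


def pscd_encode(theta, n_steps=3):
    """Encode theta at two frequencies: 180-degree and 90-degree periods."""
    return np.concatenate([_encode(2.0 * theta, n_steps),
                           _encode(4.0 * theta, n_steps)])


def pscd_decode(code):
    """Assumed decoding rule: fine phase for precision, coarse phase to disambiguate."""
    n_steps = len(code) // 2
    coarse = _decode(code[:n_steps])  # 2 * theta, wrapped
    fine = _decode(code[n_steps:])    # 4 * theta, wrapped
    # fine / 2 equals 2 * theta up to a multiple of pi, giving two candidates.
    candidates = np.array([fine / 2.0, fine / 2.0 + np.pi])
    gap = np.angle(np.exp(1j * (candidates - coarse)))  # circular distance to coarse
    best = candidates[np.argmin(np.abs(gap))]
    best = np.angle(np.exp(1j * best))  # wrap back to (-pi, pi]
    return best / 2.0


theta = 1.2
print(pscd_decode(pscd_encode(theta)))  # recovers theta up to floating-point error
```

Under these assumptions, two orientations of a square that differ by 90 degrees share the same fine-frequency code, so that part of the target no longer depends on which of the equivalent angles was annotated.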
The proposed methods (PSC and PSCD) were evaluated on three publicly available datasets, DOTA, HRSC, and OCDPCB, using PyTorch, ultralytics/yolov5, and the MMRotate toolkit. Mean average precision (mAP) was chosen as the principal metric for comparison with existing work.
Additionally, to confirm the efficacy of the dual-frequency module and help researchers choose between the two variants, the study provides an intuitive comparison between single-frequency PSC and dual-frequency PSCD. A visual comparison shows that the dual-frequency approach works as expected and provides a unified solution to the boundary discontinuity and square-like problems. The dual-frequency variant is therefore strongly recommended in settings with square-like objects.
In this work, the phase-shifting coder is used for the first time in deep learning to tackle orientation angle regression. The method encodes the orientation angle into a periodic phase to solve the boundary discontinuity problem. Building on PSC, an improved dual-frequency variant, PSCD, resolves both the boundary discontinuity and square-like problems by mapping rotational periodicities of different cycles into phases of different frequencies. The authors have released well-written public code with reproducible results.
Check out the paper and code. All credit for this research goes to the researchers on this project.
Mahmoud is a PhD researcher in machine learning. He also holds a
bachelor’s degree in physical science and a master’s degree in
telecommunications and networking systems. His current areas of
research concern computer vision, stock market prediction and deep
learning. He produced several scientific articles about person re-
identification and the study of the robustness and stability of deep