An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

Yuming Ning; Tuanjie Li; Yulin Zhang; Ziang Li; Wenqian Du; Yan Zhang

doi:10.1186/s10033-025-01321-8

An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

Ning, Yuming; Li, Tuanjie; Zhang, Yulin; Li, Ziang; Du, Wenqian; Zhang, Yan 2025-08-05 00:00:00 Robots are key to expanding the scope of space applications. The end-to-end training for robot vision-based detec- tion and precision operations is challenging owing to constraints such as extreme environments and high computa- tional overhead. This study proposes a lightweight integrated framework for grasp detection and imitation learning, named GD-IL; it comprises a grasp detection algorithm based on manipulability and Gaussian mixture model (manip- ulability–GMM), and a grasp trajectory generation algorithm based on a two-stage robot imitation learning algorithm ( TS-RIL). In the manipulability–GMM algorithm, we apply GMM clustering and ellipse regression to the object point cloud, propose two judgment criteria to generate multiple candidate grasp bounding boxes for the robot, and use manipulability as a metric for selecting the optimal grasp bounding box. The stages of the TS-RIL algorithm are grasp trajectory learning and robot pose optimization. In the first stage, the robot grasp trajectory is characterized using a second-order dynamic movement primitive model and Gaussian mixture regression (GMM). By adjusting the func- tion form of the forcing term, the robot closely approximates the target-grasping trajectory. In the second stage, a robot pose optimization model is built based on the derived pose error formula and manipulability metric. This model allows the robot to adjust its configuration in real time while grasping, thereby effectively avoiding singulari- ties. Finally, an algorithm verification platform is developed based on a Robot Operating System and a series of com- parative experiments are conducted in real-world scenarios. The experimental results demonstrate that GD-IL signifi- cantly improves the effectiveness and robustness of grasp detection and trajectory imitation learning, outperforming existing state-of-the-art methods in execution efficiency, manipulability, and success rate. Keywords Grasp detection, Robot imitation learning, Manipulability, Dynamic movement primitives, Gaussian mixture model and Gaussian mixture regression, Pose optimization 1 Introduction Robots have emerged as an ultimate solution for the on- orbit assembly and maintenance of ultra-large structures in space, which is a strategic high-ground contested by *Correspondence: Tuanjie Li major spacefaring nations worldwide [1–5]. To cope with [email protected] extreme environmental constraints [6] and reduce mis- State Key Laboratory of Electromechanical Integrated Manufacturing sion costs and risks [7], on-orbit servicing imposes higher of High-performance Electronic Equipments, Xidian University, Xi’an 710071, China requirements on the intelligence level of robots [8]. The School of Mechano-Electronic Engineering, Xidian University, traditional on-orbit operation mode relies on human– Xi’an 710071, China robot teleoperation [9, 10] or repetitive programming School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, United Kingdom to drive robotic actions [11, 12]; however, this does not Beijing MegaRobo Technologies Co., Ltd., Beijing 100085, China support unstructured on-orbit task scenarios and cannot © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 2 of 20 satisfy the growing demand for customized services. Fol- within 20 min, improving the planning efficiency by lowing this, grasp detection [13] and imitation learning 26.1% compared with traditional approaches. Kim et al. [14] are currently two important research and applica- [33] applied a transformer-based self-attention mecha- tion directions in the field of robot-based on-orbit ser - nism to deep imitation learning for dual-arm manipu- vicing [15], as well as key indicators of a robot’s ability lation tasks to improve performance by focusing on to achieve a high level of autonomy in unmanned envi- important sensory inputs and reducing distractions in ronments [16]. Grasp detection ensures that the robot real-world experiments. Cortes et al. [34] proposed an can accurately locate the target object, whereas imitation imitation learning algorithm for a soft gripper by com- learning enables the robot to generalize learned opera- bining a Mask RCNN for object localization and deep tional skills to different task scenarios [17, 18], thereby learning for robot grasping tasks. They demonstrated enhancing its adaptability to diverse environments. The high performance and grasping success in various object combination of grasp detection and imitation learning configurations and environments. Jonnavittula et al. provides valuable insights into exploring new models for [35] introduced VIEW, an algorithm that improves the on-orbit servicing, attracting widespread attention across human-to-robot visual imitation learning (VIL) efficiency various industries [14, 19]. by extracting condensed trajectories using agent-agnostic The real-time grasp detection performance of a robot rewards and segmenting tasks into phases. Wang et al. is a crucial factor affecting its efficiency [20]. Generally, [36] proposed a robust imitation learning framework robot grasping is achieved using vision-guided algo- for dual-arm tasks; it incorporated shared teleoperation, rithms, which can be categorized into two main types. coupled dynamical systems, mutual following, and reac- The first category includes methods such as reinforce - tive obstacle avoidance to improve generalization and ment learning (RL) [21, 22], artificial neural networks stability. Duan et al. [37] proposed a vision-based hand (ANNs) [23], and knowledge-based reasoning [24]. Lin gesture control system and a reinforcement learning et al. [25] proposed a fast and robust image-processing method that integrated demonstrations and environmen- algorithm that utilizes a recurrent deep deterministic tal rewards to accelerate imitation learning for construc- policy gradient (recurrent DDPG) to detect obstacles tion robots. Leading research teams worldwide have and predict a collision-free path based on the current actively explored intelligent grasp controls to enhance state. Ribeiro et al. [26] introduced a novel convolutional the intelligence of robotic on-orbit servicing. Xie et al. neural network (CNN) to improve the visual percep- [38] reviewed the progress in learning-based grasping, tion stage in robotic grasping tasks for accurately esti- highlighting the role of deep learning, 3D object segmen- mating the pose of the object to be grasped. Chen et al. tation, and tactile sensing in improving adaptability. In [27] developed an algorithm that utilizes point cloud space robotics, artificial intelligence (AI) techniques such data from multiple stereo vision systems for object pose as deep learning and reinforcement learning have enabled estimation and introduced an improved iterative clos- more autonomous and precise manipulation, replacing est point (ICP) method for real-world object pose esti- traditional grippers with algorithm-driven systems [12]. mation. However, these first-category algorithms suffer Liu et al. [39] proposed a deep reinforcement learning from long pretraining times and low motion robustness, framework with a multimodal gripper capable of diverse making them difficult to apply in on-orbit operations grasping modes. Jung et al. [40] introduced a physics- with limited computational resources. On the contrary, guided reward model to improve learning generalization. the second category employs online grasp detection Liu et al. [41] developed DexRepNet, which achieves high methods, such as the Oriented FAST and Rotated BRIFE grasp success through spatial hand-object representation (ORB) [28, 29], Scale-Invariant Feature Transform (SIFT) learning. Deng et al. [42] addressed grasping in cluttered [30], and Speeded-Up Robust Features (SURF) [31]; these scenes using a suction gripper system with affordance- methods offer better real-time performances than the based exploration. Although these methods enhance the first category algorithms with the constrained computa - autonomy of robot manipulation to a certain extent, they tional resources of a spacecraft. typically depend on large volumes of sample data and In addition to grasp detection, imitation learning is require extended training times, making them unsuit- a key technology enabling robots to achieve high lev- able for space-on-orbit assembly tasks. In addition, when els of autonomy. Currently, robot imitation learning is the operating environment and object configuration of generally classified into two categories: model-free and the robot are changed, the generalization ability of these model-based. For model-free methods, Zhang et al. [32] model-free methods significantly diminish. developed a CNN-based imitation learning framework Model-based methods are generally imitation learn- for efficient grasping point selection. This approach ena - ing methods based on trajectory representation, such bled robots to quickly learn the correct grasping postures as dynamic systems [43], movement primitives [44], the N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 3 of 20 Gaussian Mixture Model and Gaussian Mixture Regres- rithm and a TS-RIL-based grasp trajectory genera- sion (GMM/GMR) [45], and the Hidden Markov Model tion algorithm. (HMM) [46]. Among these, dynamic movement primi- (2) For the manipulability–GMM algorithm, we tives (DMPs) and their variants have become mainstream applied GMM clustering and ellipse regression to the methods in robot imitation learning owing to their sim- point cloud of the object, proposed two judgment ple modeling and low computational complexity [47]. criteria to generate multiple candidate grasp bound- The concept of DMPs was first introduced by Ijspeert ing boxes for the robot, and used manipulability as a et al. [48], and it involves the construction of a nonlinear metric for selecting the optimal grasp bounding box. system for robot control. Yi et al. [49] proposed an auton- (3) The TS-RIL algorithm performs grasp trajec - omous grasping approach for complex-shaped objects tory learning and robot pose optimization. In the using a high-DOF robotic hand, combining human dem- first stage, we use a second-order DMPs model and onstration data, 3D object reconstruction, and DMPs for GMM/GMR to characterize the robot’s grasp trajec- efficient grasping; they evaluated multiple objects and tory. The robot can closely approximate the target obtained promising results. Chen et al. [50] proposed grasp trajectory by adjusting the functional form of an online trajectory guidance framework for novice sur- the forcing term. In the second stage, we built a robot geons by using DMP-based imitation learning. They inte - pose optimization model based on the derived pose grated obstacle avoidance, augmented reality (AR), and error formula and manipulability metric. This model interactive feedback (IF) to improve the manipulation allows the robot to adjust its configuration in real performance during surgery. Coelho et al. [51] applied time while grasping, thereby effectively avoiding sin - learning from demonstration (LfD) to teach collaborative gularities. robot—human-like movements using DMPs and a covar- (4) We developed an algorithm verification platform iance matrix adaptation evolution strategy (CMA-ES) based on a robot operating system (ROS) and con- for skill transfer. Lauretti et al. [52] proposed a method ducted a series of comparative experiments in real- to scale the DMP parameters using two demonstrations, world scenarios to validate the effectiveness and improving the generalization in the reachable workspace robustness of the GD-IL framework. of the robot while ensuring a fast and efficient learning process. Lu et al. [53] introduced a framework for learn- The remainder of this paper is organized as follows. ing robot tool-use skills using DMPs, focusing on object In Section 2, we present the system architecture of the operation and tool-flipping skills, thereby enabling bet - GD-IL and describe its working principles. Section 3 ter generalization and adaptation to new tasks and tools. provides detailed design steps for the manipulability– These DMP variants effectively expanded the field of GMM-based grasp detection algorithm. Section 4 intro- robot imitation learning. duces the specific working principles of the two stages of However, the integration of lightweight robot grasp the TS-RIL algorithm: grasp trajectory learning and pose detection and imitation learning methods remains chal- optimization. Section 5 describes the development of an lenging. Firstly, many scholars continue to research grasp ROS-based algorithm verification platform and validates detection and imitation learning separately, rather than the effectiveness and robustness of the GD-IL algorithm integrating them; this reduces the smoothness and effi - through a series of prototype experiments. Finally, Sec- ciency of visually guided robot operations. Secondly, tion 6 concludes the study. existing methods rely on end-to-end training; however, as real-world robot tasks often involve a wide variety of 2 GD‑IL Framework motion trajectories, they require long pretraining times In this section, we introduce the system framework and and high computational power. This severely limits the working principle of GD-IL. Figure 1(a) shows the hard- application of robots in on-orbit space services. ware system and network architecture of GD-IL. At the To overcome the aforementioned challenges, we pro- beginning of the task, the host computer sends the ini- pose a lightweight integrated framework for grasp tialization parameters to the move_group [54] and depth detection and imitation learning, called GD-IL. This camera interfaces. Subsequently, the manipulability– framework enables precise online grasping under visual GMM algorithm, shown in Figure 1(b), performs point- guidance, without relying on large-sample training. The cloud clustering and ellipse regression on the target main contributions of this study are as follows. object, filtering the optimal grasp bounding box based on the manipulability metric. Finally, the TS-RIL algorithm, (1) We propose GD-IL—an integrated framework for as depicted in Figure 1(c), generates multiple subtrajec- grasp detection and imitation learning. It comprises tories for the robot grasping task using a second-order a manipulability–GMM-based grasp detection algo- DMPs model and GMM/GMR, and calculates the joint Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 4 of 20 Figure 1 System architecture of the proposed GD-IL angle profiles to drive the robot movement through the information is represented by six basic parameters: the pose optimizer for discretized trajectory points. 3D coordinates x , y , z , leng th W and width H of the t t t grasping bounding box, and angle θ between the bound- 3 Manipulability–GMM‑based Grasp Detection ing box and the X-axis, that is, Algorithm g = x , y , z , θ , H , W t t t t (1) In this section, we discuss the manipulability–GMM- based robot grasp detection algorithm. First, we intro- where, the five basic parameters x , y , θ , H , W can t t t duce a GMM to cluster the point clouds of the objects determine the size and pose of the grasping bounding to be grasped and perform ellipse fitting while propos - box in the plane, while z determines the grasping depth ing two judgment criteria to generate multiple candidate of the robot end-effector, as shown in Figure 2 grasping frames for the robot. Subsequently, based on the The GMM is a mixture of K Gaussian distributions and robot dynamic model, we establish manipulability as an has been widely applied in engineering fields such as defect evaluation metric to assess the quality of the candidate detection, pattern recognition, and clustering analysis [55]. detection results, ultimately determining the optimal When K = 1 , the GMM degenerates into a single Gauss- grasping box for the robot grasping task. ian distribution whose projection in the 2D space forms an Object grasping is a fundamental function of intelligent ellipse, as shown in Figure 3(b). In this section, we first use robots. For any object to be grasped, its grasp detection Figure 2 Graphical representation of robot grasp detection N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 5 of 20 Figure 3 Graphical representation of clustering and elliptical fitting of the point cloud of objects to be grasped using GMM: (a) Workpiece to be grasped, (b) Fitted ellipse, (c) Candidate grasping bounding box the GMM to cluster the point clouds of the objects to be π N(x |µ , � ) k n k k γ (n, k) = . grasped and perform ellipse fitting. The expression for a K (7) π N x µ , � j n j j GMM with K Gaussian distributions can be written as j=1 We then establish the expected log-likelihood function p(x) = π N(x|µ , � ), (2) k k k Q based on Eqs. (5) and (7), as expressed in Eq. (8). k=1 N K 1 1 T −1 { | } Q = γ (n, k) ln π N(x µ , � ) . k n k k (8) N(x|µ , � ) = exp − (x − µ ) � (x − µ ) , k k k k (2π) |� | n=1 k=1 (3) By computing the partial derivatives of Q with respect where, each Gaussian probability density function to µ and , and setting them to zero, we further derive k k N(x|µ , � ) is a component of p(x) , with its correspond- k k the updated formulas for parameter set Θ = {π ,µ, �} , a s ing mean and covariance denoted as µ and , resp e c- k k shown in Eqs. (9)−(12). tively. The coefficient π represents the mixture weight, which satisfies the normalization condition new µ = γ (n, k)x , n (9) K n,k n=1 π = 1, π ∈ [0, 1]. k k (4) k=1 new new new � = γ (n, k) x − µ x − µ , n n k k k n,k n=1 Based on the above analysis, the position and shape (10) of a GMM can be determined using the parameters n,k π = {π , π , ··· , π } , µ = {µ ,µ , ··· ,µ } , and new 1 2 K 1 2 K π = , (11) = { , , ··· , } . In this case, the parameter set 1 2 K Θ = {π ,µ, �} is estimated by constructing an objective where, function and using the maximum likelihood method [56], as shown in Eqs. (5) and (6). N = γ (n, k). (12) n,k N K n=1 L(X|Θ )= ln (p(X|π ,µ, � ))= ln π N(x |µ , � ) , k n k k n=1 k=1 Next, we use the previously derived Eqs. (9)–(12) to (5) cluster the point cloud of the target object and fit an ellipse (see Figure 3(b)). Furthermore, we compute and Θ = arg max L(X|Θ ), (6) generate candidate grasping bounding boxes for the robot end-effector based on the geometric dimensions where, X = (x , x , ··· , x ) represents a set of observed and rotation angle of the fitted ellipse; the length W and 1 2 N data, and for any data point x in X , its posterior prob- width H of the grasping bounding box can be obtained ability γ (n, k) is computed using Bayes’ theorem [57]: using Eq. (13). Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 6 of 20 Then, we use Criterion 1 to check whether W = f · 2 , s 2 (13) W ∈ [0,L ] . If this condition is satisfied, the process max H = · W , terminates, and the unique candidate grasping box is where f is the scaling factor for the long side of the selected as the optimal grasping box for the robot. If this grasping bounding box; we set f = 1.25 . and are the condition is not met, we set K = K + 1 . When K ≥ 2 , we s 1 2 eigenvalues of the covariance matrix , as shown in Fig further apply Criterion 2 to check whether there is any 3(c). overlap between multiple candidate grasping boxes. If However, grasp detection is more challenging than tra- an overlap exists, we increment K by 1 (see Figure 4(c)). ditional data clustering. The candidate grasping bound - Otherwise, each candidate grasping box is sequentially c c c c ing boxes generated based on the fitted ellipse may be output and recorded as a sequence g = g , g , ··· , g , 1 2 invalid when the grasping box length W exceeds the as shown in Figure 4(d). maximum opening distance of the robot end-effector. After obtaining multiple candidate grasping boxes for However, for more complex target objects, the ellipses the target object using the above steps, we establish an fitted by GMM and the corresponding candidate grasp - evaluation metric based on the robot dynamic model to ing boxes are often not unique ( K ≥ 2 ). Therefore, estab - evaluate the quality of the candidate detection results. In lishing evaluation metrics and selecting the most suitable particular, the fundamental dynamic model of the manip- grasping bounding box for a robot from among multiple ulator system can be expressed using Eq. (16). candidates are critical issues that need to be addressed. ¨ ˙ M(θ )θ + C θ, θ + G(θ ) = τ − J (θ ) F , (16) u e To address these challenges, we first define the fol - lowing criteria for generating candidate grasp bounding n×1 where, τ ∈ R represents the joint torques of the boxes: n×n manipulator, M θ ∈ R represents the inertia matrix ( ) Criterion 1: Check whether the length W of the gen- of the manipulator, C θ , θ ∈ R denotes the Coriolis erated candidate grasping box is smaller than the maxi- and centrifugal force vector, and G θ ∈ R represents ( ) mum opening distance L of the robot end effector. If max m×n the gravity vector. J θ ∈ R represents the Jacobian ( ) W ∈ [0,L ] , we set S = 1 ; otherwise, we set S = 0. max W W matrix of the manipulator, and F is the external force Criterion 2: If a target object has multiple candidate and torque acting on the robot end-effector. Here, m grasping boxes, we check for an overlap between them. denotes the degrees of freedom of the end effector, and If no overlap is detected, we set S = 1 ; otherwise, we set n denotes the degrees of freedom of the manipulator S = 0. ( m < n for a redundant manipulator). Considering the workpiece in Figure 4(a) as an exam- According to the robot dynamic model in Eq. (16) and ple, we initialized K = 1 . At this point, the GMM the mapping relationship between the joint space and the degenerates into a single Gaussian distribution (see Fig- operational space of the manipulator, the velocity vector ure 4(b)), and its parameters µ and are computed using of the robot end effector can be expressed as Eqs. (14)–(15). v = x˙ = J (θ )θ . (17) N ee ee µ = x , n (14) Furthermore, the acceleration vector can be expressed n=1 as ¨ ˙ v˙ = x ¨ = J (θ )θ + J (θ )θ . (18) 1 ee ee � = (x − µ )(x − µ ) . n n (15) n=1 At this point, we construct two new vectors, u and v , a s shown in Eq. (19). Figure 4 Graphical representation of generating candidate grasping bounding boxes for the robot using the above two judgment criteria N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 7 of 20 easier to evaluate which grasp point offers better u˜ = τ − J (θ ) F − C θ , θ − G(θ ), u e (19) dynamic performance), the end-effector acceleration ˙ ˙ v˜ = v˙ − J (θ )θ . ee vector v ˙ can be simplified as ee By combining Eqs. (16), (18), and (19), the relationship v˙ = v˜ + J (θ )θ = v˜ . (24) ee between the two new vectors u and v is obtained as −1 Combining Eqs. (22), (23), and (24), we obtain: v˜ = J (θ )M (θ )u˜ . (20) T T + + Dp MJ MJ Dp = 1. (25) The unit sphere of the joint driving torque in the joint ee ee space satisfies Eq. (25) can be further simplified as (21) u˜ u˜ = 1. ˙ ˙ v˙ = v˜ + J (θ )θ = v˜ , (26) ee Substituting Eq. (21) into Eq. (20), the expression for where D is a metric for evaluating the dynamic perfor- the acceleration ellipsoid in the operational space can be mance of the robot during the target grasping process. written as: Here, we refer to this as manipulability, as shown in + + Figure 5. v˜ MJ MJ v˜ = 1, (22) Using Eq. (26), we sequentially compute the manipu- where, J is the pseudo-inverse matrix. lability D (i = 1, 2, ··· , m) of the robot corresponding to Furthermore, the acceleration vector of the robot end- each of the m candidate grasping boxes, and determine effector v ˙ can be expressed as ee the maximum value as D = max{D ,D , ··· ,D } . At max 1 2 m this point, the candidate grasping box corresponding to v˙ = Dp , ee (23) ee D is the optimal grasping box, which is denoted as max g . where, p = cosγ , cosγ , cosγ ∈ R represents the [ ] opt 1 2 3 ee Through the above GMM-based point cloud cluster - three directional components of the acceleration vec- ing and the manipulability-based candidate grasping tor of the robot end-effector. γ , γ , γ denote the angles 1 2 3 box selection strategy, we determine the optimal grasp- between the acceleration vector and the positive direc- ing box g in the plane for the robot, along with its tions of the X, Y, and Z coordinate axes, respectively. opt five geometric parameters x , y , θ , H , W , as shown in Next, we sequentially set the centers of the m can- t t t Figure 6. didate grasping boxes g (i = 1,2, ··· , m) as the Based on the mapping relationship between the 3D center of the robot acceleration ellipsoid, denoted as point cloud of the object to be grasped and the RGB p = x , y . By setting θ = 0 (while this indicates t t t init i i i image [58], we obtain the depth information of the tar- that the robotic arm initiates the grasping motion from get object as shown in Eq. (27) and in Figure 7. a stationary state, by setting candidate grasp points as the centers of the acceleration ellipsoid, it becomes Figure 5 Graphical representation of the robot manipulability D Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 8 of 20 Figure 6 Grasp detection results for four typical workpieces using the manipulability–GMM algorithm: (a) 2D point clouds of four target objects, (b) Candidate grasping boxes, (c) Optimal grasping box Figure 7 Coordinate transformation between the point within an RGB image and its corresponding point in the 3D point cloud N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 9 of 20  � �� P c corresponding trajectory points along the path, thereby x = d x − x f , i c x i i � �� P c ensuring that the two time series exhibit the highest pos- y = d y − y f , (27) i c y i i sible similarity. z = d , For example, suppose we obtain two dem- c c c where p x , y denote arbitrary points in the RGB onstration trajectories manually, denoted as i i i image, and their corresponding points in the 3D point Ŵ , Ŵ ∈ Ŵ . The discrete trajectory demo,1 demo,2 demo P P P points contained in each trajectory are represented cloud are denoted as P x , y , z . d denotes the depth i i i i i as Ŵ = Ŵ , Ŵ , ··· , Ŵ ∈ Ŵ and value at point P . x , y denote the optical center positions tp tp ,1 tp ,2 tp ,N demo,1 i c c 1 1 1 1 1 of the RGB-D camera in the pixel coordinate system. Ŵ = Ŵ , Ŵ , ··· , Ŵ ∈ Ŵ . We define tp tp ,1 tp ,2 tp ,N demo,2 2 2 2 2 f , f are the internal parameters of the camera. x y a time warping sequence T (k) to record the time indices By combining the depth information z obtained t,opt mapping between discrete trajectory points Ŵ ∈ Ŵ tp ,i tp 1 1 from Eq. (27) with the 2D pose of the optimal grasping and Ŵ ∈ Ŵ , as shown in Eq. (28). tp ,j tp 2 2 box x , y , θ , the robot motion planner can t,opt t,opt t,opt T (k) = Ŵ (k), Ŵ (k) , tp tp (28) effectively execute the grasping planning instructions 1 2 and determine the opening distance L of the end-effector where, Ŵ k and Ŵ k represent the time points cor- ( ) ( ) tp tp 1 2 based on the short side length H of the optimal grasping responding to the discrete trajectory points of the aligned box. trajectories Ŵ and Ŵ , resp e ctively . demo,1 demo,2 We then define the distance function D T to compute ( ) the Euclidean distance between the corresponding points 4 TS‑RIL‑Based Grasp Trajectory Generation on the two trajectories, as shown in Eq. (29). Algorithm In this section, we propose a two-stage robot imitation D(T) = D Ŵ (1 : N ), Ŵ (1 : N ) tp 1 tp 2 1 2 learning algorithm, TS-RIL; its achieves robot grasp tra- = D i, j (29) jectory generation in two stages: grasp trajectory learn- = Ŵ − Ŵ . ing and pose optimization. First, we introduce the DTW tp ,i tp ,j 1 2 [59] to align the collected multiple-grasp demonstration Furthermore, we use dynamic programming [60] to trajectories, ensuring that they have the same time steps. compute the minimum cost path, as shown in Eq. (30). Subsequently, in the grasp trajectory learning stage, we introduce a second-order DMPs model and GMM/ p i, j = D i, j + min p i − 1, j , p i, j − 1 , p i − 1, j − 1 , cost cost cost cost GMR to model these trajectories. The robot can closely (30) approximate the target grasp trajectory by adjusting the where p i, j represents the minimum-cost path from cost functional form of the forcing term. Finally, in the pose the origin (1,1) to i, j . optimization stage, we establish a robot grasp pose opti- Figure 8 shows a graphical representation of the DTW mization model by integrating the pose error formula alignment of multiple trajectories. We take two trajec- and manipulability derived in Section 3. This model tories with time steps of 5 (blue line) and 7 (red line) enables real-time adjustment of the robot configuration as examples and view them as two vectors of lengths 5 during the grasping task, thereby effectively avoiding and 7, respectively. From this, an accumulated distance singularities. matrix D can be obtained, where each element is 5×7 computed using Eq. (30). After applying the DTW-based alignment processing to multiple trajectories, all demon- 4.1 Grasp Trajectory Preprocessing stration trajectories had the same time steps. Trajectory data acquisition is a critical step in the process of robot imitation and grasp execution. However, owing to the difficulty in ensuring consistency in both time and 4.2 TS‑RIL Algorithm space through manual demonstrations, the acquired tra- 4.2.1 Gr asp Trajectory Learning Stage jectory data are highly susceptible to distortion. Ijspeert et al. [48] were the first to propose the use of In this section, we first introduce DTW to perform a nonlinear system—dynamic movement primitives time-series alignment on the demonstration trajectory (DMPs)—to represent the trajectory of a robot end-effec - data. The fundamental concept of DTW is to measure tor. By dynamically adjusting the forcing term, the robot the similarity between two time series by computing can generalize new trajectories that satisfy the task exe- their distances and then aligning them along an opti- cution requirements in similar scenarios. A basic DMPs mal path. This alignment involves local scaling along model can be described as: the time axis to minimize the total distance between the Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 10 of 20 Figure 8 Graphical representation of DTW aligning multiple trajectories: (a) Warping between two trajectories, (b) Accumulated distance matrix 5×7 τ z˙ = α β g − y − z + f , z z 0 τ s˙ =−α s,0 < s ≤ 1, s (32) (31) τ y˙ = z, where, s is the phase variable, and α is a constant. where, z is an auxiliary variable, α and β are two time- By combining Eqs. (31) and (32), the general form of the z z dependent constants, and α = 4β . τ is a temporal scal- second-order DMPs model for the manipulator can be z z ing factor, g is a point attractor, and y is the reference derived, as shown in Eq. (33). A graphical representation of trajectory generated by transformation system. f repre- the DMPs model is presented in Figure 9. sents the forcing term, which comprises a series of kernel     � � � � v˙ α β g − y − v + f (s) z z 0 functions.     y˙ = · v , (33) To overcome the jump in the acceleration profiles dur - s˙ −α s ing the initial movement of the manipulator, we intro- duce an exponential decay function to alleviate the where f (s) represents the forcing term, which consists of abrupt motions of each robot joint, as shown in Eq. (32). a series of kernel functions as shown in Eq. (34). s is the Figure 9 Graphical representation of the second-order DMPs model for the manipulator N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 11 of 20 phase variable, v is the velocity of the robot end effector, 1 N s, f µ , � = and α is a constant. (2π) |� | T T −1 exp − [s, f ] − µ � [s, f ] − µ , n n n ψ (s) · w i i i=1 (41) f (s) = � (s) · s g − y = · s g − y . θ 0 0 The conditional probability can be computed using ψ (s) the GMR [61], as shown in Eq. (42). i=1 (34) P f |s ∼ φ N f |s ;ˆη , σˆ , (42) n n Here, ψ is the kernel function and w is its correspond- i i n=1 ing weight, as shown in Eq. (35). where, ψ (s) = exp −h (s − c ) , i i i (35) −1 ηˆ = µ + � � s − µ , (43) n f ,n fs,n s,n s,n where, h is the variance of this series of kernel functions, −1 and c is the center point corresponding to ψ . 2 i i σˆ = � − � � � , (44) f ,n fs,n s,n sf ,n Next, DTW is applied to align multiple end-effector demonstration trajectories obtained through man- π N s µ , � ual teaching for the robot grasping task, denoted as n s,n s,n φ = . t ,N ˙ ¨ Ŵ = Ŵ , Ŵ , Ŵ . Here, Ŵ represents (45) DTW t,n t,n t,n t ,n t=0,n=1 π N s µ , � j s,n s,n the position profiles of the robot end-effector, t denotes j=1 the discrete time segments of each DTW-processed Eq. (42) can be further simplified as trajectory, and N is the number of demonstration tra- jectories. Using Eq. (34), the corresponding f (t ) target n P f |s ∼ N ηˆ, σˆ , for each grasp trajectory can be computed using Eqs. (46) (36) and (37). where, f (t ) = τ y¨ − α β g − y − τ y˙ ,1 ≤ n ≤ N, target n DTW z z DTW DTW N N (36) 2 2 2 ηˆ = φ ηˆ , σˆ = φ σˆ . (47) n n n n f (t ) f (t ) ··· f (t ) f = . target 1 target 2 target N target n=1 n=1 (37) Through Eqs. (43)–(47), we represent the multiple Subsequently, following the GMM approach men- demonstration trajectories f (t ) contained in Eq. target n tioned in Section 3, we model the demonstration trajec- (36) as Eq. (48). tories. Specifically, we use the joint distribution P s, f of N trajectories to encode the nonlinear function f , a s target −1 f (t ) =ˆη = φ µ + � � s − µ . target n n s,n s,n f ,n fs,n shown in Eq. (38). n=1 (48) P s, f = π N s, f µ ,� , Using a robot grasping motion as an example, we first n n (38) n=1 obtain six demonstration trajectories through human teaching. Then, we use DTW to align these trajectories, where, ensuring that they had equal time steps. Next, we model the DTW-processed demonstration trajectories using π = 1, π ∈ [0, 1], n n (39) GMM and generate multiple Gaussian distribution ellip- n=1 ses. Finally, the GMR uses the time index as the input for the regression computation and generates a smooth rep- resentative trajectory that preserves the characteristics µ � � s,n s,n sf ,n µ = , � = . n (40) µ of the demonstration trajectories, as shown in Figure 10. � � f ,n fs,n f ,n GMR not only enables smoothing of a single robot tra- At this point, the Gaussian probability distribution jectory, but also allows multiple fitted ellipses generated N s, f µ ,� can be expressed as n by the GMM to be regressed into a new trajectory, which can be directly used in the second-order DMPs model. Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 12 of 20 Figure 10 Robot grasp trajectory learning process: (a) Collected multiple demonstration trajectories, (b) Demonstration trajectories after DTW processing with the same time steps, (c) Modeling the DTW-processed trajectories using GMM, (d) Performing regression using GMR to generate a smooth representative trajectory We sequentially performed the above computation function approximation problem. By adjusting the process for multiple discrete robot grasping actions and function form of the forcing term, the actual grasp recorded the execution orders of the final representa - trajectory converges to the target trajectory as much tive trajectories. This enables the robot to generate all as possible. The adjustable weight w corresponding the execution trajectories corresponding to a complete to each kernel function can then be determined using grasping task, as shown in Figure 11. Locally Weighted Regression (LWR) [62]. Specifically, Finally, by inputting smooth representative trajec- for any dimension of the robot motion, we expect to tories into the second-order DMPs model, we trans- achieve f ≈ f . target formed the grasp trajectory learning problem into a Figure 11 Generating all representative trajectories for a complete robot grasping task using GMM/GMR and second-order DMPs model N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 13 of 20   The mathematical model of the function-approxima - r r r p 11 12 13 � �   0 0 tion problem is described using Eq. (49). r r r p R (θ ) P (θ )  y  0 i−1 21 22 23 7 7 T (θ ) = T = = ,   7 i  r r r p  0 1 31 32 33 i=1 � � � 2 w = arg min f (t ) − f (t ) , n target n (51) w (1≤i≤N) n=1 i−1 � � � �� where T represents the pose transformation matrix  −1  ˆ f (t ) = φ µ + � � s − µ , target n n f ,n fs,n s,n s,n 0 0 between adjacent joints of the robot. R (θ ) and P (θ ) n=1 7 7 � � s.t. represent the robot posture matrix and end-effector posi -  ψ (s) = exp −h (s − c ) , i i i tion matrix, respectively. τ s˙ = −α s,0 < s ≤ 1, Subsequently, we integrate the manipulability D from Eq. (49) (26) and the forward kinematics equation from Eq. (52) to where f (t ) represents the forcing term of the target n construct the robot pose optimization model. Its expres- desired robot grasp trajectory processed by the GMM/ sion is shown in Eq. (52). GMR, and f (t ) represents the forcing term of the actual robot grasp trajectory. θ = arg min F(θ), Gen.i θ (1≤i≤7) By solving Eq. (49), we obtain the expression for w , a s F(θ) = w F (θ) + w F (θ), w = w = 0.5, shown in Eq. (50). F 1 F 2 F F  1 2 1 2 � � −1 2  � � � � /  T + + F (θ) = p MJ MJ p ,  1 T ee ee w w ··· w   w = . (50) 1 2 N  1 2  /  3 9 � �  � � � �  ∗ ∗   F θ = p − p + r − r , ( ) 2 ξ kj ξ kj Next, we characterize and learn multi-segment dem- s.t. ξ k,j onstration trajectories Ŵ using the second-order DTW   �  0 i−1 T (θ) = T , forward kinematic equation, DMPs model and GMM/GMR, respectively, and gen- 7 i  i=1 eralize a new grasp trajectory Ŵ for the robot in a  Gen  θ ∈ [θ , θ ], i = 1, 2, ··· , 7, i i,min i,max � � real-world operating scenario by calculating the weight ξ = x, y, z , k, j = {1, 2, 3}, matrix w online. (52) where F (θ ) and F (θ ) are objective functions formulated 1 2 based on robot pose error and robot manipulability, 4.2.2 Pose Optimization Stage respectively. To ensure that their values remain within In the second stage of TS-RIL, we discretize the tra- the same order of magnitude, we normalized them using jectory Ŵ into m trajectory points, denoted as Gen Eqs. (53) and (54). The corresponding weights, w and Ŵ ∈ Ŵ , i = 1, 2, ··· , m , and further design a pose Gen,i Gen w , are assigned to F (θ ) and F (θ ) , respectively. Here, F 1 2 optimizer to calculate the joint position profiles during i−1 we set w = w = 0.5 . T represents the pose trans- F F 1 2 i robot trajectory execution. Specifically, considering the formation matrix between adjacent joints of the robot, 7-DoF manipulator shown in Figure 12 as an example, θ is the joint angle sequence corresponding to the ith Gen,i we describe the forward kinematics equation using Eq. trajectory point Ŵ ∈ Ŵ , i = 1, 2, ··· , m . p and p Gen,i Gen ξ (51). represent the actual position and target position of the Figure 12 The 7-DoF manipulator: (a) Integrated robot hand-eye system, (b) Virtual prototype model, (c) D-H coordinate system Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 14 of 20 robot end-effector, respectively. r and r represent the the beginning of the task, the host computer first sends kj kj the initialization parameter information to both move_ actual position and target posture of the robot end-effec - group and GD-IL. Subsequently, the GD-IL sequentially tor, respectively. generates multiple subtrajectories for the robot grasping F (θ ) − F (θ ) 1,max 1 task and publishes discretized trajectory points. At this F (θ ) = , 1 (53) F (θ ) − F (θ ) 1,max 1,min point, move_group subscribes to a topic containing these discrete trajectory points, and drives the robot to move according to the specified poses. Finally, the final results F (θ ) − F (θ ) 2,max 2 F (θ ) = . 2 (54) of robot grasp detection and motion execution are dis- F (θ ) − F (θ ) 2,max 2,min played using the 3D visualization software Rviz [64]. Next, we employ a conventional optimization algo- Subsequently, we designed a multi-object pick-and- rithm, such as the Particle Swarm Optimization (PSO) place task scenario, as shown in Figure 14(a); the experi- algorithm [63], to solve the model in Eq. (52). This allows mental workspace was limited to a 0.9 × 0.6 m tabletop. us to obtain an optimal singularity-free robot configura - This experiment aimed to validate the generalization tion with the best dynamic performance, along with the capability of the GD-IL algorithm through a representa- corresponding joint angles. tive irregular object-grasping task, while laying both theoretical and engineering foundations for the future ground-based intelligent assembly of modular anten- 5 Experiments and Results nas. We employed the Adaptive PSO (A-PSO) algo- In this section, we describe the development of an ROS- rithm [65] to solve the robot pose optimization model based experimental testing platform comprising an inte- in Eq. (52), and set the number of particles m = 80 , grated robot hand-eye system, a force-feedback-enabled initial inertia weight ω = 0.3 , initial learning fac- end effector, a host computer, a network architecture, the max tors c = 1.5, c = 1.5 , and approximation coefficient GD-IL system framework, and a 3D visualization simula- 1 2 −3 tion platform. A graphical representation of this is shown δ = 10 . Considering the randomness of the prototype in Figure 13. We effectively integrated the move_group experiments, each set of experiments was repeated 20 interface in ROS with the proposed GD-IL framework. At times. All comparative experiments were conducted with Figure 13 System architecture of ROS-based robot semi-physical simulation and testing platform N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 15 of 20 Figure 14 Experimental process of the robot performing grasp detection and imitation learning using GD-IL: (a) Output the depth information and 3D point clouds, (b) Calculate the 3D object coordinates, (c) TS-RIL generalizes executable grasp sub-trajectories for the robot identical initial parameters to ensure consistent software shown in Figure 14(b). Finally, the TS-RIL algorithm uti- and hardware configurations across the experiments lizes the 3D object coordinates obtained in the previous (hardware configuration: Intel Core i7-10850H CPU @ step to generalize the executable grasp subtrajectories for 2.60 GHz, 16 GB RAM; software version: Ubuntu 18.04 the manipulator, as shown in Figure 14(c). LTS, ROS Melodic). As shown in Figure 15, when the robot performs grasp In the multi-object pick-and-place task shown in Fig- detection and imitation learning in the real-world sce- ure 14(a), the manipulator sequentially picked each nario using GD-IL, it first applies the manipulability– object from the tabletop and smoothly placed it into the GMM algorithm to compute the dynamic performance κ c blue basket on the right. During this process, the robot index D for each candidate grasp bounding box g of i κ ,i first uses a depth camera D435i [66] on its end-effector the κ objects to be grasped. Next, it sequentially outputs to detect objects on the tabletop, outputting the number D and sorts these values in descending order to gen- max of objects, their corresponding depth information, and erate the robot grasp task sequence. Finally, the TS-RIL 3D point clouds. The manipulability–GMM algorithm is algorithm generates grasp subtrajectories for each object. then applied to sequentially compute the optimal grasp- To verify the superior performance of GD-IL in ing bounding box for each object, with the 3D coordi- handling grasp detection and grasp trajectory imita- nates of the objects displayed on the host computer, as tion learning, multiple comparative experiments were Figure 15 Screenshots of the robot performing grasp detection and imitation learning using GD-IL in a real-world scenario Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 16 of 20 conducted using imitating CNN [32], Transformer- deviation between the actual and target robot motion DIL [33], IL-RGC [34], IL-ARG [49], vision IL [50], and trajectories. GD-IL (ours). The following performance metrics were Figures 16 and 17 illustrate the results of grasp trajec- used for evaluation: average computation time (CT), tory imitation learning using GD-IL. After the two pro- average execution time (ET), average manipulability, root cessing stages (grasp trajectory learning and robot pose mean square error (RMSE), and success rate (SR); the optimization) in the TS-RIL, the position profiles of the mathematical expression for RMSE is given in Eq. (55). robot joints became significantly smoother, as shown in Figure 16. The position and velocity curves of the robot wp end effector exhibited no abrupt changes or discontinui - (55) RMSE = p − p , wp,i target,i ties in the X-, Y-, and Z-directions, as shown in Figure 17. wp i=1 Additionally, during the preprocessing phase, the com- plete grasping task was segmented into multiple grasping where N represents the number of discrete waypoints wp subtasks, each corresponding to a time-dependent robot in the robot motion trajectory, and p and p wp,i target,i motion trajectory. The TS-RIL then sequentially trains denote the actual and target waypoints of the robot end these subtrajectories, enhancing the granularity of the effector, respectively. RMSE was used to measure the Figure 16 Joint angle profiles during the sequential robot grasping of multiple objects Figure 17 Motion trajectories for robot grasping generated using GD-IL in a real-world scenario: (a) Grasping motion trajectories of the robot end-effector, (b)−(d) the position and velocity profiles of the robot end-effector in the X, Y, and Z directions N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 17 of 20 robot’s grasping motion, enabling the robot to execute imitating CNN and transformer-DIL achieves shorter and manage motion commands with higher precision. computation times owing to their prolonged early stage Table 1 and Figure 18 present the comparative results training. However, compared to these pre-training- of the six algorithms used in the robot grasping experi- dependent machine learning algorithms, GD-IL incurs ments. As shown in Table 1, the proposed GD-IL out- only negligible computational overhead, losing less performs the other five algorithms in terms of efficiency, than 10% of the processing time. Therefore, the GD-IL robot manipulability, and success rate for both grasp remained within an engineering-acceptable range in detection and trajectory imitation learning. This advan - terms of computational efficiency. The manipulability tage stems from the integration of the manipulabil- increased by 47.1%, 67.7%, 135.2%, 56.3%, and 48.1%, ity–GMM and TS-RIL algorithms within the GD-IL while RMSE decreased by 74.3%, 28.4%, 17.4%, 84.1%, framework. These algorithms not only enable the robot and 55.0%, respectively. These comparative results indi - to rapidly compute the 3D coordinates of target objects cate that GD-IL outperforms the other five algorithms in without requiring extensive pre-training but also quickly terms of avoiding singular configurations and joint limi - converge the actual grasping trajectory to the target tations. This is because of the pose optimizer detailed in trajectory by adjusting the function form of the forcing Section 4.2.2, which effectively guides the robot out of term. Specifically, compared to CNN, IL-ARG, trans - the local optima and quickly searches for the optimal former-DIL, IL-RGC, and vision IL, the ET of GD-IL was singularity-free configuration. In addition, the results reduced by approximately 10.6%, 17.9%, 13.3%, 20.0%, qualitatively reflect the relationship between the manipu - and 15.5%, respectively. This demonstrates that GD-IL lability and RMSE. Specifically, the higher the manipula - maintains a leading execution time without relying on bility during task execution, the better the manipulator extensive pretraining or high computational power. The approaches the target trajectory. CT is reduced by approximately −7.4%, 24.2%, −10.2%, Finally, the GD-IL achieved a significant improve - 22.6%, and 34.3%, respectively. It is worth noting that ment in SR across multiple prototype experiments. Table 1 Comparative results of six imitation learning algorithms in robot grasping experiments Algorithm Short pre‑training Pose optimizer CT (s) ET (s) Manipulability RMSE SR (%) period −1 imitating CNN × × 2.8234 229.4388 0.5067 83.3 3.057 × 10 −1 IL-ARG × × 4.0028 249.7093 0.4445 66.6 1.097 × 10 −2 Transformer-DIL × × 2.7517 236.3922 0.3170 73.3 9.502 × 10 −1 IL-RGC × × 3.9220 256.3190 0.4769 63.3 4.948 × 10 −1 vision IL √ × 4.6169 242.6557 0.5035 80.0 1.743 × 10 −2 GD-IL (ours) √ √ 3.0337 205.0108 0.7455 93.3 7.851 × 10 CT and ET denote the average computation time and the average execution time, respectively. Figure 18 Relative ratios of six imitation learning algorithms in robot grasping experiments Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 18 of 20 Compared to the other five algorithms, the SR of GD-IL operational capabilities of robotic systems in complex increased by approximately 12.0%, 40.1%, 27.3%, 47.4%, space environments. and 16.6%, respectively. The experimental results dem - onstrate that the proposed GD-IL framework is both Supplementary Information efficient and reliable for multi-object grasp detection The online version contains supplementary material available at https:// doi. org/ 10. 1186/ s10033- 025- 01321-8. and trajectory imitation learning in complex real-world environments. In addition, through multiple multi- Additional file 1. object pick-and-place experiments, it was observed that manipulability is a critical factor affecting the Acknowledgements success rate of multi-object grasping tasks. Therefore, Not applicable after deploying the robot pose optimizer within the Authors’ Contributions GD-IL framework, the success rate of robot trajectory Yuming Ning and Tuanjie Li were responsible for the formal analysis, visualiza- imitation learning and motion execution was further tion, and editing; Yulin Zhang and Ziang Li contributed to the investigation, improved. conceptualization, and conducting of the experiments; and Wenqian Du and Yan Zhang assisted with methodology and resources. All authors read and approved the final manuscript. 6 Conclusions Funding Supported by National Natural Science Foundation of China (Grant No. In this study, we proposed a lightweight integrated 52475280), and Shaanxi Provincial Natural Science Basic Research Program framework for grasp detection and imitation learning, (Grant No. 2025SYSSYSZD-105). called GD-IL, which includes the following key compo- Data Availability nents and findings: The datasets and materials used and/or analyzed during the current study are available from the corresponding author on reasonable request. (1) We designed a manipulability–GMM-based grasp detection algorithm that applies GMM clus- Declarations tering and ellipse regression to the point cloud of Competing Interests an object. Two selection criteria were introduced to The authors declare no competing financial interests. generate multiple candidate grasp bounding boxes, and the optimal one was selected based on the Received: 28 February 2025 Revised: 21 June 2025 Accepted: 2 July 2025 manipulability metric. (2) We proposed a two-stage robot imitation learning method (TS-RIL). In Stage 1, a second-order DMPs model combined with the GMM/GMR was used to References model the grasping trajectory of the robot. In Stage [1] E Y Zhang, H Y Sai, Y H Li, et al. Modular robotic manipulator and ground 2, a robot pose optimization model was constructed assembly system for on-orbit assembly of space telescopes. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical using a derived pose error formulation and manip- Engineering Science, 2024, 238(6): 2283-2293. ulability index, allowing real-time configuration [2] C J Zhao, W Z Guo, M Chen, et al. Space truss construction modeling adjustments to avoid singularities during grasping. based on on-orbit assembly motion feature. Chinese Journal of Aeronaut- ics, 2024, 37(3): 365-379. (3) An ROS-based experimental platform was devel- [3] M H Nair, M C Rai, M Poozhiyil, et al. Robotic technologies for in-orbit oped to verify the GD-IL algorithm. Real-world assembly of a large aperture space telescope: A review. Advances in Space experiments were conducted to evaluate their per- Research, 2024, 74(10): 5118-5141. [4] D Timmermann, C Plasberg, F Graaf, et al. AI-based assembly sequence formance in comparison with existing methods. The planning in a robotic on-orbit assembly application. 10th International results showed that the GD-IL significantly enhanced Conference on Automation, Robotics and Applications, Greece, February the grasp detection and trajectory learning perfor- 22-24, 2024: 69-74. [5] A Nanjangud, C Underwood, C M Rai, et al. Towards robotic on-orbit mances. Specifically, the average computation time, assembly of large space telescopes: Mission architectures, concepts, and execution time, and RMSE reduced by more than analyses. Acta Astronautica, 2024, 224: 379-396. 20%, 15%, and 30%, respectively, and the average [6] Z Sheng, W Chen, Z Chen, et al. Sequence planning for on-orbit robotic assembly based on symbiotic organisms search with diversification manipulability and grasp success rate improved by strategy. Acta Astronautica, 2024, 219: 941-951. more than 50% and 15%, respectively. [7] B Sawik. Space mission risk, sustainability and supply chain: review, multi- objective optimization model and practical approach. Sustainability, 2023, 15(14): 11002. In future work, we will further investigate imita- [8] I Rodríguez, A S Bauer, K Nottensteiner, et al. Autonomous robot planning tion learning for time-varying tasks and in grasp- system for in-space assembly of reconfigurable structures. 2021 IEEE ing moving objects with the aim of enhancing the Aerospace Conference, USA, March 06-13, 2021: 1-17. N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 19 of 20 [9] K Hambuchen, J Marquez, T Fong. A review of NASA human-robot inter- [33] H Kim, Y Ohmura, Y Kuniyoshi. Transformer-based deep imitation learning action in space. Current Robotics Reports, 2021, 2(3): 265-272. for dual-arm robot manipulation. 2021 IEEE/RSJ International Conference [10] B Ma, Z Jiang, Y Liu, et al. Advances in space robots for on-orbit servic- on Intelligent Robots and Systems, September 27-30, 2021: 8965-8972. ing: A comprehensive review. Advanced Intelligent Systems, 2023, 5(8): [34] D S D Cortes, G Hwang, K U Kyung. Imitation learning based soft robotic 2200397. grasping control without precise estimation of target posture. 2021 [11] B Y Cheng, Q G Sun, L B Zeng, et al. Workspace optimization of on-orbit IEEE 4th International Conference on Soft Robotics, USA, April 12-16, 2021: assembly robotic system. Proceedings of the 2023 3rd International Confer- 149-154. ence on Robotics and Control Engineering, USA, May 12-14, 2023: 90-95. [35] A Jonnavittula, S Parekh, D P Losey. View: Visual imitation learning with [12] H Jahanshahi, Z H Zhu. Review of machine learning in robotic grasping waypoints. Autonomous Robots, 2025, 49(1): 1-26. control in space application. Acta Astronautica, 2024, 220: 37-61. [36] W Wang, C Zeng, Z Lu, et al. A novel robust imitation learning framework [13] R B Ashith Shyam, Z Hao, U Montanaro, et al. Autonomous robots for for dual-arm object-moving tasks. IEEE Transactions on Industrial Electron- space: Trajectory learning and adaptation using imitation. Frontiers in ics, 2024. Robotics and AI, 2021, 8: 638849. [37] K Duan, Z Zou, T Y Yang. Training of construction robots using imitation [14] M Alizadeh, Z H Zhu. A comprehensive survey of space robotic manipula- learning and environmental rewards. Computer-Aided Civil and Infrastruc- tors for on-orbit servicing. Frontiers in Robotics and AI, 2024, 11: 1470950. ture Engineering, 2025, 40(9): 1150-1165. [15] Z Huang, L L Shi, L J Fan, et al. Multirobot assembly sequence planning of [38] Z Xie, X Liang, C Roberto. Learning-based robotic grasping: A review. a large space telescope. Journal of Spacecraft and Rockets, 2024: 1-13. Frontiers in Robotics and AI, 2023, 10: 1038658. [16] S B Yang, S Lu, M X Li, et al. Study on virtual training platform for reinforce- [39] F Liu, F Sun, B Fang, et al. Hybrid robotic grasping with a soft multimodal ment learning of space manipulator. 2023 International Conference on gripper and a deep multistage learning scheme. IEEE Transactions on Service Robotics, China, July 21-23, 2023: 73-77. Robotics, 2023, 39(3): 2379-2399. [17] K Fujii, I Rodríguez, M Schedl, et al. Comparative analysis of robotic grip- [40] Y Jung, L Tao, M Bowman, et al. Physics-guided hierarchical reward ping solutions for cooperative and non-cooperative targets. 2024 IEEE mechanism for learning-based robotic grasping. arXiv preprint, 2022: Aerospace Conference, USA, March 2-9, 2024: 1-14. 2205.13561. https:// arxiv. org/ abs/ 2205. 13561 [18] C Li, J Y Yang, S Chang. Review on key technologies of space intelligent [41] Q T Liu, Y Cui, Q Ye, et al. Dexrepnet: Learning dexterous robotic grasping grasping robot. Journal of the Brazilian Society of Mechanical Sciences and network with geometric and spatial hand-object representations. 2023 Engineering, 2022, 44(2): 64. IEEE/RSJ International Conference on Intelligent Robots and Systems, USA, [19] E Papadopoulos, F Aghili, O Ma, et al. Robotic manipulation and capture October 1-5, 2023: 3153-3160. in space: A survey. Frontiers in Robotics and AI, 2021, 8: 686723. [42] Y Deng, X Guo, Y Wei, et al. Deep reinforcement learning for robotic [20] S Kumra, S Joshi, F Sahin. Gr-convnet v2: A real-time multi-grasp detec- pushing and picking in cluttered environment. 2019 IEEE/RSJ International tion network for robotic grasping. Sensors, 2022, 22(16): 6208. Conference on Intelligent Robots and Systems, China, November 3-8, 2019: [21] Y L Chen, Y R Cai, M Y Cheng. Vision-based robotic object grasping—a 619-626. deep reinforcement learning approach. Machines, 2023, 11(2): 275. [43] M Saveriano, F J Abu-Dakka, A Kramberger, et al. Dynamic movement [22] H Sekkat, S Tigani, R Saadane, et al. Vision-based robotic arm control primitives in robotics: A tutorial survey. The International Journal of Robot- algorithm using deep reinforcement learning for autonomous objects ics Research, 2023, 42(13): 1133-1184. grasping. Applied Sciences, 2021, 11(17): 7917. [44] C G Yang, C Z Chen, W He, et al. Robot learning system based on adaptive [23] S Ainetter, F Fraundorfer. End-to-end trainable deep neural network for neural control and dynamic movement primitives. IEEE Transactions on robotic grasp detection and semantic segmentation from rgb. 2021 IEEE Neural Networks and Learning Systems, 2018, 30(3): 777-787. International Conference on Robotics and Automation, China, June 03-05, [45] B Ti, Y Gao, Q Li, et al. Dynamic movement primitives for movement gen- 2021: 13452-13458. eration using GMM-GMR analytical method. 2019 IEEE 2nd International [24] C C Li, G H Tian, M Y Zhang. A semantic knowledge-based method for Conference on Information and Computer Technologies, USA, March 14-17, home service robot to grasp an object. Knowledge-Based Systems, 2024, 2019: 250-254. 297: 111947. [46] A Vakanski, I Mantegh, A Irish, et al. Trajectory learning for robot program- [25] G C Lin, L X Zhu, J H Li, et al. Collision-free path planning for a guava- ming by demonstration using hidden Markov model and dynamic harvesting robot based on recurrent deep reinforcement learning. time warping. IEEE Transactions on Systems, Man, and Cybernetics, Part B Computers and Electronics in Agriculture, 2021, 188: 106350. (Cybernetics), 2012, 42(4): 1039-1052. [26] E G Ribeiro, R de Queiroz Mendes, V Grassi Jr. Real-time deep learning [47] Z Lu, N Wang, D Shi. DMPs-based skill learning for redundant dual-arm approach to visual servo control and grasp detection for autonomous robotic synchronized cooperative manipulation. Complex & Intelligent robotic manipulation. Robotics and Autonomous Systems, 2021, 139: Systems, 2022, 8(4): 2873-2882. 103757. [48] S Schaal, P Mohajerian, A Ijspeert. Dynamics systems vs. optimal [27] F Chen, M Selvaggio, D G Caldwell. Dexterous grasping by manipulability control—a unifying view. Progress in Brain Research, 2007, 165: 425-445. selection for mobile manipulator with visual guidance. IEEE Transactions [49] J B Yi, J Kim, T Kang, et al. Anthropomorphic grasping of complex-shaped on Industrial Informatics, 2018, 15(2): 1202-1210. objects using imitation learning. Applied Sciences, 2022, 12(24): 12861. [28] F F Zheng, Z Y Gong, B Tao. Robotic grasping pose estimation based on [50] Z Chen, K Fan. An online trajectory guidance framework via imitation point cloud accelerated by image feature correspondence. 2023 IEEE learning and interactive feedback in robot-assisted surgery. Neural Net- International Conference on Real-time Computing and Robotics, China, July works, 2025: 107197. 17-20, 2023: 334-339. [51] L Coelho, S M Cerqueira, V Martins, et al. A DMPs-based approach for [29] Y Tian, J Y Zhang, Z K Yin, et al. Robot structure prior guided temporal human-like robotic movements. 2024 IEEE International Conference on attention for camera-to-robot pose estimation from image sequence. Autonomous Robot Systems and Competitions, Portugal, May 2-3, 2024: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern 201-206. Recognition, Canada, June 18-22, 2023: 8917-8926. [52] C Lauretti, C Tamantini, L Zollo. A new dmp scaling method for robot [30] J B Wu, Y X Zhang, Y J Zhang, et al. Research on object detection for ser- learning by demonstration and application to the agricultural domain. vice robots based on image feature. 2022 IEEE 5th Advanced Information IEEE Access, 2024, 12: 7661-7673. Management, Communicates, Electronic and Automation Control Confer- [53] Z Y Lu, N Wang, C G Yang. A dynamic movement primitives-based tool ence, China, December 16-18, 2022, 5: 1535-1539. use skill learning and transfer framework for robot manipulation. IEEE [31] Y S Yang, S S Yeh. Manipulator point teaching system design integrated Transactions on Automation Science and Engineering, 2024, 22: 1748-1763. with image processing and iterative learning control. Journal of Intelligent [54] Z Kingston, L E Kavraki. Robowflex: Robot motion planning with MoveIt & Robotic Systems, 2019, 96: 477-492. made easy. 2022 IEEE/RSJ International Conference on Intelligent Robots and [32] S Zhang, S Li, Y Li, et al. A visual imitation learning algorithm for the selec- Systems, Japan, October 23-27, 2022: 3108-3114. tion of robots’ grasping points. Robotics and Autonomous Systems, 2024, 55. N Vuković, M Mitić, Z Miljković. Trajectory learning and reproduction for 172: 104600. differential drive mobile robots based on GMM/HMM and dynamic time Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 20 of 20 warping using learning from demonstration framework. Engineering Sorbonne University, France, in 2021. His research interests include Applications of Artificial Intelligence, 2015, 45: 388-404. mobile manipulation, and whole-body motion generation of human- [56] C Ye, J Yang, H Ding. Bagging for Gaussian mixture regression in robot oid and quadruped robots. learning from demonstration. Journal of Intelligent Manufacturing, 2022, 33(3): 867-879. Yan Zhang born in 1982, is currently a PhD candidate at School of [57] H W Zhang, X N Han, W Zhang, et al. Complex sequential tasks learning Mechano-Electronic Engineering, Xidian University, China. He received with Bayesian inference and Gaussian mixture model. 2018 IEEE Interna- his master degree from Xidian University, China, in 2008. His research tional Conference on Robotics and Biomimetics, Malaysia, December 12-15, interests include intelligent robot and life science lab automation. 2018: 1927-1934. [58] X Wang, C Yang, Z Ju, et al. Robot manipulator self-identification for sur - rounding obstacle detection. Multimedia Tools and Applications, 2017, 76: 6495-6520. [59] B Johnen, B Kuhlenkoetter. A dynamic time warping algorithm for indus- trial robot motion analysis. 2016 Annual Conference on Information Science and Systems, USA, March 16-18, 2016: 18-23. [60] Y Yao, X Zhao, Y Wu, et al. Clustering driver behavior using dynamic time warping and hidden Markov model. Journal of Intelligent Transportation Systems, 2021, 25(3): 249-262. [61] C Li, C Yang, Z Ju, et al. An enhanced teaching interface for a robot using DMP and GMR. International Journal of Intelligent Robotics and Applica- tions, 2018, 2(1): 110-121. [62] W S Cleveland, S J Devlin. Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 1988, 83(403): 596-610. [63] Y M Ning, T J Li, W Q Du, et al. Inverse kinematics and planning/control co-design method of redundant manipulator for precision operation: Design and experiments. Robotics and Computer-Integrated Manufactur- ing, 2023, 80: 102457. [64] H R Kam, S H Lee, T Park, et al. Rviz: A toolkit for real domain data visuali- zation. Telecommunication Systems, 2015, 60: 337-345. [65] Y M Ning, T J Li, C Yao, et al. MT-RSL: A multitasking-oriented robot skill learning framework based on continuous dynamic movement primitives for improving efficiency and quality in robot-based intelligent operation. Robotics and Computer-Integrated Manufacturing, 2024, 90: 102817. [66] M Jiang, Z Wang, F Peng. Smart home robot based on 3D LiDAR and depth camera technology. 2024 3rd International Conference on Cloud Computing, Big Data Application and Software Engineering, China, October 11-13, 2024: 750-753. Yuming Ning born in 1997, is currently a PhD candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Xidian University, China, in 2019. His research interests include robot skill learning, motion planning, and intelligent control for autonomous vehicles. Tuanjie Li born in 1972, is currently a professor at School of Mech- ano-Electronic Engineering, Xidian University, China. He received his Ph.D. degree in mechanical engineering from Xi’an University of Tech- nology, China, in 1999. His research interests include space deploy- able antenna, intelligent control, and nonlinear dynamics. Yulin Zhang born in 1997, is currently a PhD candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Xidian University, China, in 2020. His research interests include robot skill learning, motion coordination, and intel- ligent multi-mode robots. Ziang Li born in 2000, is currently a master candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Inner Mongolia University of Technology, China, in 2020. His research interests include robot skill learning and motion control. Wenqian Du born in 1988, is currently a research associate at The University of Edinburgh, the United Kingdom. He received his Ph.D. degree from Institut des systèmes intelligents et de robotique (ISIR) of http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Chinese Journal of Mechanical Engineering Springer Journals http://www.deepdyve.com/lp/springer-journals/an-integrated-framework-of-grasp-detection-and-imitation-learning-for-d3pLDlHopx

Loading next page...

References (63)

C Li (2018)
An enhanced teaching interface for a robot using DMP and GMR
International journal of intelligent robotics and Applications, 2
Z Sheng (2024)
Sequence planning for on-orbit robotic assembly based on symbiotic organisms search with diversification strategy
Acta Astronautica, 219
Z Lu (2022)
DMPs-based skill learning for redundant dual-arm robotic synchronized cooperative manipulation
Complex & Intelligent Systems, 8
Transformer-based deep imitation learning for dual-arm robot manipulation
Robotic Grasping Pose Estimation based on Point Cloud accelerated by image feature correspondence
Z Xie (2023)
Learning-based robotic grasping: A review
Frontiers in Robotics and AI, 10
Study on Virtual Training Platform for Reinforcement Learning of Space Manipulator
ZY Lu (2024)

IEEE Transactions on Automation Science and Engineering, 22
WS Cleveland (1988)
Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting
Journal of the American Statistical Association, 83
A Novel Robust Imitation Learning Framework for Dual-Arm Object-Moving Tasks
Autonomous Robots for Space: Trajectory Learning and Adaptation Using Imitation
H Jahanshahi (2024)
Review of machine learning in robotic grasping control in space application
Acta Astronautica, 220
C Lauretti (2024)
A New DMP Scaling Method for Robot Learning by Demonstration and Application to the Agricultural Domain
IEEE Access, 12
Dynamic Movement Primitives for Movement Generation Using GMM-GMR Analytical Method
K Duan (2025)
Training of construction robots using imitation learning and environmental rewards
Computer-Aided Civil and Infrastructure Engineering, 40
K Hambuchen (2021)
A Review of NASA Human-Robot Interaction in Space
Current Robotics Reports, 2
CC Li (2024)

Knowledge-Based Systems, 297
MH Nair (2024)
Robotic Technologies for In-Orbit Assembly of a Large Aperture Space Telescope: A Review
Advances in Space Research, 74
Smart Home Robot Based on 3D LiDAR and Depth Camera Technology
Trajectory Learning for Robot Programming by Demonstration Using Hidden Markov Model and Dynamic Time Warping
N Vuković (2015)
Trajectory learning and reproduction for differential drive mobile robots based on GMM/HMM and dynamic time warping using learning from demonstration framework
Engineering Applications of Artificial Intelligence, 45
CJ Zhao (2024)
Space truss construction modeling based on on-orbit assembly motion feature
Chinese Journal of Aeronautics, 37
Robowflex: Robot Motion Planning with MoveIt Made Easy
YL Chen (2023)
Vision-Based Robotic Object Grasping—A Deep Reinforcement Learning Approach
Machines, 11
M Saveriano (2023)
Dynamic movement primitives in robotics: A tutorial survey
The International Journal of Robotics Research, 42
YM Ning (2024)

Robotics and Computer-Integrated Manufacturing, 90
GC Lin (2021)

Computers and Electronics in Agriculture, 188
S Kumra (2022)
GR-ConvNet v2: A Real-Time Multi-Grasp Detection Network for Robotic Grasping
Sensors, 22
B Sawik (2023)
Space Mission Risk, Sustainability and Supply Chain: Review, Multi-Objective Optimization Model and Practical Approach
Sustainability, 15
M Alizadeh (2024)
A comprehensive survey of space robotic manipulators for on-orbit servicing
Frontiers in Robotics and AI, 11
Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence
E Papadopoulos (2021)
Robotic Manipulation and Capture in Space: A Survey
Frontiers in Robotics and AI, 8
YS Yang (2019)
Manipulator Point Teaching System Design Integrated with Image Processing and Iterative Learning Control
Journal of Intelligent & Robotic Systems, 96
End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB
HR Kam (2015)
RViz: a toolkit for real domain data visualization
Telecommunication Systems, 60
CG Yang (2018)
Robot Learning System Based on Adaptive Neural Control and Dynamic Movement Primitives
IEEE Transactions on Neural Networks and Learning Systems, 30
Y Yao (2021)
Clustering driver behavior using dynamic time warping and hidden Markov model
Journal of Intelligent Transportation Systems, 25
S Zhang (2024)
A visual imitation learning algorithm for the selection of robots’ grasping points
Robotics and Autonomous Systems, 172
Autonomous Robot Planning System for In-Space Assembly of Reconfigurable Structures
F Liu (2023)
Hybrid Robotic Grasping With a Soft Multimodal Gripper and a Deep Multistage Learning Scheme
IEEE Transactions on Robotics, 39
Complex Sequential Tasks Learning with Bayesian Inference and Gaussian Mixture Model
C Li (2022)
Review on key technologies of space intelligent grasping robot
Journal of the Brazilian Society of Mechanical Sciences and Engineering, 44
An online trajectory guidance framework via imitation learning and interactive feedback in robot-assisted surgery
F Chen (2018)
Dexterous Grasping by Manipulability Selection for Mobile Manipulator With Visual Guidance
IEEE Transactions on Industrial Informatics, 15
B Ma (2023)
Advances in Space Robots for On‐Orbit Servicing: A Comprehensive Review
Advanced Intelligent Systems, 5
Workspace optimization of on-orbit assembly robotic system
A DMPs-based Approach for Human-like Robotic Movements
Deep Reinforcement Learning for Robotic Pushing and Picking in Cluttered Environment
JB Yi (2022)
Anthropomorphic Grasping of Complex-Shaped Objects Using Imitation Learning
Applied Sciences, 12
H Sekkat (2021)
Vision-Based Robotic Arm Control Algorithm Using Deep Reinforcement Learning for Autonomous Objects Grasping
Applied Sciences, 11
Real-time deep learning approach to visual servo control and grasp detection for autonomous robotic manipulation
A Nanjangud (2024)
Towards robotic on-orbit assembly of large space telescopes: Mission architectures, concepts, and analyses
Acta Astronautica, 224
EY Zhang (2024)

Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, 238
A Jonnavittula (2025)
View: visual imitation learning with waypoints
Autonomous Robots, 49
A Dynamic Time Warping algorithm for industrial robot motion analysis
C Ye (2022)
Bagging for Gaussian mixture regression in robot learning from demonstration
Journal of Intelligent Manufacturing, 33
Comparative Analysis of Robotic Gripping Solutions for Cooperative and Non-cooperative Targets
AI-Based Assembly Sequence Planning in a Robotic On-Orbit Assembly Application
Imitation Learning based Soft Robotic Grasping Control without Precise Estimation of Target Posture
X Wang (2017)
Robot manipulator self-identification for surrounding obstacle detection
Multimedia Tools and Applications, 76
Dynamics systems vs. optimal control — a unifying view
DexRepNet: Learning Dexterous Robotic Grasping Network with Geometric and Spatial Hand-Object Representations
Research on object detection for service robots based on image feature

Publisher: Springer Journals
Copyright: Copyright © The Author(s) 2025
ISSN: 1000-9345
eISSN: 2192-8258
DOI: 10.1186/s10033-025-01321-8
Publisher site: See Article on Publisher Site

Abstract

Robots are key to expanding the scope of space applications. The end-to-end training for robot vision-based detec- tion and precision operations is challenging owing to constraints such as extreme environments and high computa- tional overhead. This study proposes a lightweight integrated framework for grasp detection and imitation learning, named GD-IL; it comprises a grasp detection algorithm based on manipulability and Gaussian mixture model (manip- ulability–GMM), and a grasp trajectory generation algorithm based on a two-stage robot imitation learning algorithm ( TS-RIL). In the manipulability–GMM algorithm, we apply GMM clustering and ellipse regression to the object point cloud, propose two judgment criteria to generate multiple candidate grasp bounding boxes for the robot, and use manipulability as a metric for selecting the optimal grasp bounding box. The stages of the TS-RIL algorithm are grasp trajectory learning and robot pose optimization. In the first stage, the robot grasp trajectory is characterized using a second-order dynamic movement primitive model and Gaussian mixture regression (GMM). By adjusting the func- tion form of the forcing term, the robot closely approximates the target-grasping trajectory. In the second stage, a robot pose optimization model is built based on the derived pose error formula and manipulability metric. This model allows the robot to adjust its configuration in real time while grasping, thereby effectively avoiding singulari- ties. Finally, an algorithm verification platform is developed based on a Robot Operating System and a series of com- parative experiments are conducted in real-world scenarios. The experimental results demonstrate that GD-IL signifi- cantly improves the effectiveness and robustness of grasp detection and trajectory imitation learning, outperforming existing state-of-the-art methods in execution efficiency, manipulability, and success rate. Keywords Grasp detection, Robot imitation learning, Manipulability, Dynamic movement primitives, Gaussian mixture model and Gaussian mixture regression, Pose optimization 1 Introduction Robots have emerged as an ultimate solution for the on- orbit assembly and maintenance of ultra-large structures in space, which is a strategic high-ground contested by *Correspondence: Tuanjie Li major spacefaring nations worldwide [1–5]. To cope with [email protected] extreme environmental constraints [6] and reduce mis- State Key Laboratory of Electromechanical Integrated Manufacturing sion costs and risks [7], on-orbit servicing imposes higher of High-performance Electronic Equipments, Xidian University, Xi’an 710071, China requirements on the intelligence level of robots [8]. The School of Mechano-Electronic Engineering, Xidian University, traditional on-orbit operation mode relies on human– Xi’an 710071, China robot teleoperation [9, 10] or repetitive programming School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, United Kingdom to drive robotic actions [11, 12]; however, this does not Beijing MegaRobo Technologies Co., Ltd., Beijing 100085, China support unstructured on-orbit task scenarios and cannot © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 2 of 20 satisfy the growing demand for customized services. Fol- within 20 min, improving the planning efficiency by lowing this, grasp detection [13] and imitation learning 26.1% compared with traditional approaches. Kim et al. [14] are currently two important research and applica- [33] applied a transformer-based self-attention mecha- tion directions in the field of robot-based on-orbit ser - nism to deep imitation learning for dual-arm manipu- vicing [15], as well as key indicators of a robot’s ability lation tasks to improve performance by focusing on to achieve a high level of autonomy in unmanned envi- important sensory inputs and reducing distractions in ronments [16]. Grasp detection ensures that the robot real-world experiments. Cortes et al. [34] proposed an can accurately locate the target object, whereas imitation imitation learning algorithm for a soft gripper by com- learning enables the robot to generalize learned opera- bining a Mask RCNN for object localization and deep tional skills to different task scenarios [17, 18], thereby learning for robot grasping tasks. They demonstrated enhancing its adaptability to diverse environments. The high performance and grasping success in various object combination of grasp detection and imitation learning configurations and environments. Jonnavittula et al. provides valuable insights into exploring new models for [35] introduced VIEW, an algorithm that improves the on-orbit servicing, attracting widespread attention across human-to-robot visual imitation learning (VIL) efficiency various industries [14, 19]. by extracting condensed trajectories using agent-agnostic The real-time grasp detection performance of a robot rewards and segmenting tasks into phases. Wang et al. is a crucial factor affecting its efficiency [20]. Generally, [36] proposed a robust imitation learning framework robot grasping is achieved using vision-guided algo- for dual-arm tasks; it incorporated shared teleoperation, rithms, which can be categorized into two main types. coupled dynamical systems, mutual following, and reac- The first category includes methods such as reinforce - tive obstacle avoidance to improve generalization and ment learning (RL) [21, 22], artificial neural networks stability. Duan et al. [37] proposed a vision-based hand (ANNs) [23], and knowledge-based reasoning [24]. Lin gesture control system and a reinforcement learning et al. [25] proposed a fast and robust image-processing method that integrated demonstrations and environmen- algorithm that utilizes a recurrent deep deterministic tal rewards to accelerate imitation learning for construc- policy gradient (recurrent DDPG) to detect obstacles tion robots. Leading research teams worldwide have and predict a collision-free path based on the current actively explored intelligent grasp controls to enhance state. Ribeiro et al. [26] introduced a novel convolutional the intelligence of robotic on-orbit servicing. Xie et al. neural network (CNN) to improve the visual percep- [38] reviewed the progress in learning-based grasping, tion stage in robotic grasping tasks for accurately esti- highlighting the role of deep learning, 3D object segmen- mating the pose of the object to be grasped. Chen et al. tation, and tactile sensing in improving adaptability. In [27] developed an algorithm that utilizes point cloud space robotics, artificial intelligence (AI) techniques such data from multiple stereo vision systems for object pose as deep learning and reinforcement learning have enabled estimation and introduced an improved iterative clos- more autonomous and precise manipulation, replacing est point (ICP) method for real-world object pose esti- traditional grippers with algorithm-driven systems [12]. mation. However, these first-category algorithms suffer Liu et al. [39] proposed a deep reinforcement learning from long pretraining times and low motion robustness, framework with a multimodal gripper capable of diverse making them difficult to apply in on-orbit operations grasping modes. Jung et al. [40] introduced a physics- with limited computational resources. On the contrary, guided reward model to improve learning generalization. the second category employs online grasp detection Liu et al. [41] developed DexRepNet, which achieves high methods, such as the Oriented FAST and Rotated BRIFE grasp success through spatial hand-object representation (ORB) [28, 29], Scale-Invariant Feature Transform (SIFT) learning. Deng et al. [42] addressed grasping in cluttered [30], and Speeded-Up Robust Features (SURF) [31]; these scenes using a suction gripper system with affordance- methods offer better real-time performances than the based exploration. Although these methods enhance the first category algorithms with the constrained computa - autonomy of robot manipulation to a certain extent, they tional resources of a spacecraft. typically depend on large volumes of sample data and In addition to grasp detection, imitation learning is require extended training times, making them unsuit- a key technology enabling robots to achieve high lev- able for space-on-orbit assembly tasks. In addition, when els of autonomy. Currently, robot imitation learning is the operating environment and object configuration of generally classified into two categories: model-free and the robot are changed, the generalization ability of these model-based. For model-free methods, Zhang et al. [32] model-free methods significantly diminish. developed a CNN-based imitation learning framework Model-based methods are generally imitation learn- for efficient grasping point selection. This approach ena - ing methods based on trajectory representation, such bled robots to quickly learn the correct grasping postures as dynamic systems [43], movement primitives [44], the N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 3 of 20 Gaussian Mixture Model and Gaussian Mixture Regres- rithm and a TS-RIL-based grasp trajectory genera- sion (GMM/GMR) [45], and the Hidden Markov Model tion algorithm. (HMM) [46]. Among these, dynamic movement primi- (2) For the manipulability–GMM algorithm, we tives (DMPs) and their variants have become mainstream applied GMM clustering and ellipse regression to the methods in robot imitation learning owing to their sim- point cloud of the object, proposed two judgment ple modeling and low computational complexity [47]. criteria to generate multiple candidate grasp bound- The concept of DMPs was first introduced by Ijspeert ing boxes for the robot, and used manipulability as a et al. [48], and it involves the construction of a nonlinear metric for selecting the optimal grasp bounding box. system for robot control. Yi et al. [49] proposed an auton- (3) The TS-RIL algorithm performs grasp trajec - omous grasping approach for complex-shaped objects tory learning and robot pose optimization. In the using a high-DOF robotic hand, combining human dem- first stage, we use a second-order DMPs model and onstration data, 3D object reconstruction, and DMPs for GMM/GMR to characterize the robot’s grasp trajec- efficient grasping; they evaluated multiple objects and tory. The robot can closely approximate the target obtained promising results. Chen et al. [50] proposed grasp trajectory by adjusting the functional form of an online trajectory guidance framework for novice sur- the forcing term. In the second stage, we built a robot geons by using DMP-based imitation learning. They inte - pose optimization model based on the derived pose grated obstacle avoidance, augmented reality (AR), and error formula and manipulability metric. This model interactive feedback (IF) to improve the manipulation allows the robot to adjust its configuration in real performance during surgery. Coelho et al. [51] applied time while grasping, thereby effectively avoiding sin - learning from demonstration (LfD) to teach collaborative gularities. robot—human-like movements using DMPs and a covar- (4) We developed an algorithm verification platform iance matrix adaptation evolution strategy (CMA-ES) based on a robot operating system (ROS) and con- for skill transfer. Lauretti et al. [52] proposed a method ducted a series of comparative experiments in real- to scale the DMP parameters using two demonstrations, world scenarios to validate the effectiveness and improving the generalization in the reachable workspace robustness of the GD-IL framework. of the robot while ensuring a fast and efficient learning process. Lu et al. [53] introduced a framework for learn- The remainder of this paper is organized as follows. ing robot tool-use skills using DMPs, focusing on object In Section 2, we present the system architecture of the operation and tool-flipping skills, thereby enabling bet - GD-IL and describe its working principles. Section 3 ter generalization and adaptation to new tasks and tools. provides detailed design steps for the manipulability– These DMP variants effectively expanded the field of GMM-based grasp detection algorithm. Section 4 intro- robot imitation learning. duces the specific working principles of the two stages of However, the integration of lightweight robot grasp the TS-RIL algorithm: grasp trajectory learning and pose detection and imitation learning methods remains chal- optimization. Section 5 describes the development of an lenging. Firstly, many scholars continue to research grasp ROS-based algorithm verification platform and validates detection and imitation learning separately, rather than the effectiveness and robustness of the GD-IL algorithm integrating them; this reduces the smoothness and effi - through a series of prototype experiments. Finally, Sec- ciency of visually guided robot operations. Secondly, tion 6 concludes the study. existing methods rely on end-to-end training; however, as real-world robot tasks often involve a wide variety of 2 GD‑IL Framework motion trajectories, they require long pretraining times In this section, we introduce the system framework and and high computational power. This severely limits the working principle of GD-IL. Figure 1(a) shows the hard- application of robots in on-orbit space services. ware system and network architecture of GD-IL. At the To overcome the aforementioned challenges, we pro- beginning of the task, the host computer sends the ini- pose a lightweight integrated framework for grasp tialization parameters to the move_group [54] and depth detection and imitation learning, called GD-IL. This camera interfaces. Subsequently, the manipulability– framework enables precise online grasping under visual GMM algorithm, shown in Figure 1(b), performs point- guidance, without relying on large-sample training. The cloud clustering and ellipse regression on the target main contributions of this study are as follows. object, filtering the optimal grasp bounding box based on the manipulability metric. Finally, the TS-RIL algorithm, (1) We propose GD-IL—an integrated framework for as depicted in Figure 1(c), generates multiple subtrajec- grasp detection and imitation learning. It comprises tories for the robot grasping task using a second-order a manipulability–GMM-based grasp detection algo- DMPs model and GMM/GMR, and calculates the joint Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 4 of 20 Figure 1 System architecture of the proposed GD-IL angle profiles to drive the robot movement through the information is represented by six basic parameters: the pose optimizer for discretized trajectory points. 3D coordinates x , y , z , leng th W and width H of the t t t grasping bounding box, and angle θ between the bound- 3 Manipulability–GMM‑based Grasp Detection ing box and the X-axis, that is, Algorithm g = x , y , z , θ , H , W t t t t (1) In this section, we discuss the manipulability–GMM- based robot grasp detection algorithm. First, we intro- where, the five basic parameters x , y , θ , H , W can t t t duce a GMM to cluster the point clouds of the objects determine the size and pose of the grasping bounding to be grasped and perform ellipse fitting while propos - box in the plane, while z determines the grasping depth ing two judgment criteria to generate multiple candidate of the robot end-effector, as shown in Figure 2 grasping frames for the robot. Subsequently, based on the The GMM is a mixture of K Gaussian distributions and robot dynamic model, we establish manipulability as an has been widely applied in engineering fields such as defect evaluation metric to assess the quality of the candidate detection, pattern recognition, and clustering analysis [55]. detection results, ultimately determining the optimal When K = 1 , the GMM degenerates into a single Gauss- grasping box for the robot grasping task. ian distribution whose projection in the 2D space forms an Object grasping is a fundamental function of intelligent ellipse, as shown in Figure 3(b). In this section, we first use robots. For any object to be grasped, its grasp detection Figure 2 Graphical representation of robot grasp detection N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 5 of 20 Figure 3 Graphical representation of clustering and elliptical fitting of the point cloud of objects to be grasped using GMM: (a) Workpiece to be grasped, (b) Fitted ellipse, (c) Candidate grasping bounding box the GMM to cluster the point clouds of the objects to be π N(x |µ , � ) k n k k γ (n, k) = . grasped and perform ellipse fitting. The expression for a K (7) π N x µ , � j n j j GMM with K Gaussian distributions can be written as j=1 We then establish the expected log-likelihood function p(x) = π N(x|µ , � ), (2) k k k Q based on Eqs. (5) and (7), as expressed in Eq. (8). k=1 N K 1 1 T −1 { | } Q = γ (n, k) ln π N(x µ , � ) . k n k k (8) N(x|µ , � ) = exp − (x − µ ) � (x − µ ) , k k k k (2π) |� | n=1 k=1 (3) By computing the partial derivatives of Q with respect where, each Gaussian probability density function to µ and , and setting them to zero, we further derive k k N(x|µ , � ) is a component of p(x) , with its correspond- k k the updated formulas for parameter set Θ = {π ,µ, �} , a s ing mean and covariance denoted as µ and , resp e c- k k shown in Eqs. (9)−(12). tively. The coefficient π represents the mixture weight, which satisfies the normalization condition new µ = γ (n, k)x , n (9) K n,k n=1 π = 1, π ∈ [0, 1]. k k (4) k=1 new new new � = γ (n, k) x − µ x − µ , n n k k k n,k n=1 Based on the above analysis, the position and shape (10) of a GMM can be determined using the parameters n,k π = {π , π , ··· , π } , µ = {µ ,µ , ··· ,µ } , and new 1 2 K 1 2 K π = , (11) = { , , ··· , } . In this case, the parameter set 1 2 K Θ = {π ,µ, �} is estimated by constructing an objective where, function and using the maximum likelihood method [56], as shown in Eqs. (5) and (6). N = γ (n, k). (12) n,k N K n=1 L(X|Θ )= ln (p(X|π ,µ, � ))= ln π N(x |µ , � ) , k n k k n=1 k=1 Next, we use the previously derived Eqs. (9)–(12) to (5) cluster the point cloud of the target object and fit an ellipse (see Figure 3(b)). Furthermore, we compute and Θ = arg max L(X|Θ ), (6) generate candidate grasping bounding boxes for the robot end-effector based on the geometric dimensions where, X = (x , x , ··· , x ) represents a set of observed and rotation angle of the fitted ellipse; the length W and 1 2 N data, and for any data point x in X , its posterior prob- width H of the grasping bounding box can be obtained ability γ (n, k) is computed using Bayes’ theorem [57]: using Eq. (13). Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 6 of 20 Then, we use Criterion 1 to check whether W = f · 2 , s 2 (13) W ∈ [0,L ] . If this condition is satisfied, the process max H = · W , terminates, and the unique candidate grasping box is where f is the scaling factor for the long side of the selected as the optimal grasping box for the robot. If this grasping bounding box; we set f = 1.25 . and are the condition is not met, we set K = K + 1 . When K ≥ 2 , we s 1 2 eigenvalues of the covariance matrix , as shown in Fig further apply Criterion 2 to check whether there is any 3(c). overlap between multiple candidate grasping boxes. If However, grasp detection is more challenging than tra- an overlap exists, we increment K by 1 (see Figure 4(c)). ditional data clustering. The candidate grasping bound - Otherwise, each candidate grasping box is sequentially c c c c ing boxes generated based on the fitted ellipse may be output and recorded as a sequence g = g , g , ··· , g , 1 2 invalid when the grasping box length W exceeds the as shown in Figure 4(d). maximum opening distance of the robot end-effector. After obtaining multiple candidate grasping boxes for However, for more complex target objects, the ellipses the target object using the above steps, we establish an fitted by GMM and the corresponding candidate grasp - evaluation metric based on the robot dynamic model to ing boxes are often not unique ( K ≥ 2 ). Therefore, estab - evaluate the quality of the candidate detection results. In lishing evaluation metrics and selecting the most suitable particular, the fundamental dynamic model of the manip- grasping bounding box for a robot from among multiple ulator system can be expressed using Eq. (16). candidates are critical issues that need to be addressed. ¨ ˙ M(θ )θ + C θ, θ + G(θ ) = τ − J (θ ) F , (16) u e To address these challenges, we first define the fol - lowing criteria for generating candidate grasp bounding n×1 where, τ ∈ R represents the joint torques of the boxes: n×n manipulator, M θ ∈ R represents the inertia matrix ( ) Criterion 1: Check whether the length W of the gen- of the manipulator, C θ , θ ∈ R denotes the Coriolis erated candidate grasping box is smaller than the maxi- and centrifugal force vector, and G θ ∈ R represents ( ) mum opening distance L of the robot end effector. If max m×n the gravity vector. J θ ∈ R represents the Jacobian ( ) W ∈ [0,L ] , we set S = 1 ; otherwise, we set S = 0. max W W matrix of the manipulator, and F is the external force Criterion 2: If a target object has multiple candidate and torque acting on the robot end-effector. Here, m grasping boxes, we check for an overlap between them. denotes the degrees of freedom of the end effector, and If no overlap is detected, we set S = 1 ; otherwise, we set n denotes the degrees of freedom of the manipulator S = 0. ( m < n for a redundant manipulator). Considering the workpiece in Figure 4(a) as an exam- According to the robot dynamic model in Eq. (16) and ple, we initialized K = 1 . At this point, the GMM the mapping relationship between the joint space and the degenerates into a single Gaussian distribution (see Fig- operational space of the manipulator, the velocity vector ure 4(b)), and its parameters µ and are computed using of the robot end effector can be expressed as Eqs. (14)–(15). v = x˙ = J (θ )θ . (17) N ee ee µ = x , n (14) Furthermore, the acceleration vector can be expressed n=1 as ¨ ˙ v˙ = x ¨ = J (θ )θ + J (θ )θ . (18) 1 ee ee � = (x − µ )(x − µ ) . n n (15) n=1 At this point, we construct two new vectors, u and v , a s shown in Eq. (19). Figure 4 Graphical representation of generating candidate grasping bounding boxes for the robot using the above two judgment criteria N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 7 of 20 easier to evaluate which grasp point offers better u˜ = τ − J (θ ) F − C θ , θ − G(θ ), u e (19) dynamic performance), the end-effector acceleration ˙ ˙ v˜ = v˙ − J (θ )θ . ee vector v ˙ can be simplified as ee By combining Eqs. (16), (18), and (19), the relationship v˙ = v˜ + J (θ )θ = v˜ . (24) ee between the two new vectors u and v is obtained as −1 Combining Eqs. (22), (23), and (24), we obtain: v˜ = J (θ )M (θ )u˜ . (20) T T + + Dp MJ MJ Dp = 1. (25) The unit sphere of the joint driving torque in the joint ee ee space satisfies Eq. (25) can be further simplified as (21) u˜ u˜ = 1. ˙ ˙ v˙ = v˜ + J (θ )θ = v˜ , (26) ee Substituting Eq. (21) into Eq. (20), the expression for where D is a metric for evaluating the dynamic perfor- the acceleration ellipsoid in the operational space can be mance of the robot during the target grasping process. written as: Here, we refer to this as manipulability, as shown in + + Figure 5. v˜ MJ MJ v˜ = 1, (22) Using Eq. (26), we sequentially compute the manipu- where, J is the pseudo-inverse matrix. lability D (i = 1, 2, ··· , m) of the robot corresponding to Furthermore, the acceleration vector of the robot end- each of the m candidate grasping boxes, and determine effector v ˙ can be expressed as ee the maximum value as D = max{D ,D , ··· ,D } . At max 1 2 m this point, the candidate grasping box corresponding to v˙ = Dp , ee (23) ee D is the optimal grasping box, which is denoted as max g . where, p = cosγ , cosγ , cosγ ∈ R represents the [ ] opt 1 2 3 ee Through the above GMM-based point cloud cluster - three directional components of the acceleration vec- ing and the manipulability-based candidate grasping tor of the robot end-effector. γ , γ , γ denote the angles 1 2 3 box selection strategy, we determine the optimal grasp- between the acceleration vector and the positive direc- ing box g in the plane for the robot, along with its tions of the X, Y, and Z coordinate axes, respectively. opt five geometric parameters x , y , θ , H , W , as shown in Next, we sequentially set the centers of the m can- t t t Figure 6. didate grasping boxes g (i = 1,2, ··· , m) as the Based on the mapping relationship between the 3D center of the robot acceleration ellipsoid, denoted as point cloud of the object to be grasped and the RGB p = x , y . By setting θ = 0 (while this indicates t t t init i i i image [58], we obtain the depth information of the tar- that the robotic arm initiates the grasping motion from get object as shown in Eq. (27) and in Figure 7. a stationary state, by setting candidate grasp points as the centers of the acceleration ellipsoid, it becomes Figure 5 Graphical representation of the robot manipulability D Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 8 of 20 Figure 6 Grasp detection results for four typical workpieces using the manipulability–GMM algorithm: (a) 2D point clouds of four target objects, (b) Candidate grasping boxes, (c) Optimal grasping box Figure 7 Coordinate transformation between the point within an RGB image and its corresponding point in the 3D point cloud N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 9 of 20  � �� P c corresponding trajectory points along the path, thereby x = d x − x f , i c x i i � �� P c ensuring that the two time series exhibit the highest pos- y = d y − y f , (27) i c y i i sible similarity. z = d , For example, suppose we obtain two dem- c c c where p x , y denote arbitrary points in the RGB onstration trajectories manually, denoted as i i i image, and their corresponding points in the 3D point Ŵ , Ŵ ∈ Ŵ . The discrete trajectory demo,1 demo,2 demo P P P points contained in each trajectory are represented cloud are denoted as P x , y , z . d denotes the depth i i i i i as Ŵ = Ŵ , Ŵ , ··· , Ŵ ∈ Ŵ and value at point P . x , y denote the optical center positions tp tp ,1 tp ,2 tp ,N demo,1 i c c 1 1 1 1 1 of the RGB-D camera in the pixel coordinate system. Ŵ = Ŵ , Ŵ , ··· , Ŵ ∈ Ŵ . We define tp tp ,1 tp ,2 tp ,N demo,2 2 2 2 2 f , f are the internal parameters of the camera. x y a time warping sequence T (k) to record the time indices By combining the depth information z obtained t,opt mapping between discrete trajectory points Ŵ ∈ Ŵ tp ,i tp 1 1 from Eq. (27) with the 2D pose of the optimal grasping and Ŵ ∈ Ŵ , as shown in Eq. (28). tp ,j tp 2 2 box x , y , θ , the robot motion planner can t,opt t,opt t,opt T (k) = Ŵ (k), Ŵ (k) , tp tp (28) effectively execute the grasping planning instructions 1 2 and determine the opening distance L of the end-effector where, Ŵ k and Ŵ k represent the time points cor- ( ) ( ) tp tp 1 2 based on the short side length H of the optimal grasping responding to the discrete trajectory points of the aligned box. trajectories Ŵ and Ŵ , resp e ctively . demo,1 demo,2 We then define the distance function D T to compute ( ) the Euclidean distance between the corresponding points 4 TS‑RIL‑Based Grasp Trajectory Generation on the two trajectories, as shown in Eq. (29). Algorithm In this section, we propose a two-stage robot imitation D(T) = D Ŵ (1 : N ), Ŵ (1 : N ) tp 1 tp 2 1 2 learning algorithm, TS-RIL; its achieves robot grasp tra- = D i, j (29) jectory generation in two stages: grasp trajectory learn- = Ŵ − Ŵ . ing and pose optimization. First, we introduce the DTW tp ,i tp ,j 1 2 [59] to align the collected multiple-grasp demonstration Furthermore, we use dynamic programming [60] to trajectories, ensuring that they have the same time steps. compute the minimum cost path, as shown in Eq. (30). Subsequently, in the grasp trajectory learning stage, we introduce a second-order DMPs model and GMM/ p i, j = D i, j + min p i − 1, j , p i, j − 1 , p i − 1, j − 1 , cost cost cost cost GMR to model these trajectories. The robot can closely (30) approximate the target grasp trajectory by adjusting the where p i, j represents the minimum-cost path from cost functional form of the forcing term. Finally, in the pose the origin (1,1) to i, j . optimization stage, we establish a robot grasp pose opti- Figure 8 shows a graphical representation of the DTW mization model by integrating the pose error formula alignment of multiple trajectories. We take two trajec- and manipulability derived in Section 3. This model tories with time steps of 5 (blue line) and 7 (red line) enables real-time adjustment of the robot configuration as examples and view them as two vectors of lengths 5 during the grasping task, thereby effectively avoiding and 7, respectively. From this, an accumulated distance singularities. matrix D can be obtained, where each element is 5×7 computed using Eq. (30). After applying the DTW-based alignment processing to multiple trajectories, all demon- 4.1 Grasp Trajectory Preprocessing stration trajectories had the same time steps. Trajectory data acquisition is a critical step in the process of robot imitation and grasp execution. However, owing to the difficulty in ensuring consistency in both time and 4.2 TS‑RIL Algorithm space through manual demonstrations, the acquired tra- 4.2.1 Gr asp Trajectory Learning Stage jectory data are highly susceptible to distortion. Ijspeert et al. [48] were the first to propose the use of In this section, we first introduce DTW to perform a nonlinear system—dynamic movement primitives time-series alignment on the demonstration trajectory (DMPs)—to represent the trajectory of a robot end-effec - data. The fundamental concept of DTW is to measure tor. By dynamically adjusting the forcing term, the robot the similarity between two time series by computing can generalize new trajectories that satisfy the task exe- their distances and then aligning them along an opti- cution requirements in similar scenarios. A basic DMPs mal path. This alignment involves local scaling along model can be described as: the time axis to minimize the total distance between the Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 10 of 20 Figure 8 Graphical representation of DTW aligning multiple trajectories: (a) Warping between two trajectories, (b) Accumulated distance matrix 5×7 τ z˙ = α β g − y − z + f , z z 0 τ s˙ =−α s,0 < s ≤ 1, s (32) (31) τ y˙ = z, where, s is the phase variable, and α is a constant. where, z is an auxiliary variable, α and β are two time- By combining Eqs. (31) and (32), the general form of the z z dependent constants, and α = 4β . τ is a temporal scal- second-order DMPs model for the manipulator can be z z ing factor, g is a point attractor, and y is the reference derived, as shown in Eq. (33). A graphical representation of trajectory generated by transformation system. f repre- the DMPs model is presented in Figure 9. sents the forcing term, which comprises a series of kernel     � � � � v˙ α β g − y − v + f (s) z z 0 functions.     y˙ = · v , (33) To overcome the jump in the acceleration profiles dur - s˙ −α s ing the initial movement of the manipulator, we intro- duce an exponential decay function to alleviate the where f (s) represents the forcing term, which consists of abrupt motions of each robot joint, as shown in Eq. (32). a series of kernel functions as shown in Eq. (34). s is the Figure 9 Graphical representation of the second-order DMPs model for the manipulator N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 11 of 20 phase variable, v is the velocity of the robot end effector, 1 N s, f µ , � = and α is a constant. (2π) |� | T T −1 exp − [s, f ] − µ � [s, f ] − µ , n n n ψ (s) · w i i i=1 (41) f (s) = � (s) · s g − y = · s g − y . θ 0 0 The conditional probability can be computed using ψ (s) the GMR [61], as shown in Eq. (42). i=1 (34) P f |s ∼ φ N f |s ;ˆη , σˆ , (42) n n Here, ψ is the kernel function and w is its correspond- i i n=1 ing weight, as shown in Eq. (35). where, ψ (s) = exp −h (s − c ) , i i i (35) −1 ηˆ = µ + � � s − µ , (43) n f ,n fs,n s,n s,n where, h is the variance of this series of kernel functions, −1 and c is the center point corresponding to ψ . 2 i i σˆ = � − � � � , (44) f ,n fs,n s,n sf ,n Next, DTW is applied to align multiple end-effector demonstration trajectories obtained through man- π N s µ , � ual teaching for the robot grasping task, denoted as n s,n s,n φ = . t ,N ˙ ¨ Ŵ = Ŵ , Ŵ , Ŵ . Here, Ŵ represents (45) DTW t,n t,n t,n t ,n t=0,n=1 π N s µ , � j s,n s,n the position profiles of the robot end-effector, t denotes j=1 the discrete time segments of each DTW-processed Eq. (42) can be further simplified as trajectory, and N is the number of demonstration tra- jectories. Using Eq. (34), the corresponding f (t ) target n P f |s ∼ N ηˆ, σˆ , for each grasp trajectory can be computed using Eqs. (46) (36) and (37). where, f (t ) = τ y¨ − α β g − y − τ y˙ ,1 ≤ n ≤ N, target n DTW z z DTW DTW N N (36) 2 2 2 ηˆ = φ ηˆ , σˆ = φ σˆ . (47) n n n n f (t ) f (t ) ··· f (t ) f = . target 1 target 2 target N target n=1 n=1 (37) Through Eqs. (43)–(47), we represent the multiple Subsequently, following the GMM approach men- demonstration trajectories f (t ) contained in Eq. target n tioned in Section 3, we model the demonstration trajec- (36) as Eq. (48). tories. Specifically, we use the joint distribution P s, f of N trajectories to encode the nonlinear function f , a s target −1 f (t ) =ˆη = φ µ + � � s − µ . target n n s,n s,n f ,n fs,n shown in Eq. (38). n=1 (48) P s, f = π N s, f µ ,� , Using a robot grasping motion as an example, we first n n (38) n=1 obtain six demonstration trajectories through human teaching. Then, we use DTW to align these trajectories, where, ensuring that they had equal time steps. Next, we model the DTW-processed demonstration trajectories using π = 1, π ∈ [0, 1], n n (39) GMM and generate multiple Gaussian distribution ellip- n=1 ses. Finally, the GMR uses the time index as the input for the regression computation and generates a smooth rep- resentative trajectory that preserves the characteristics µ � � s,n s,n sf ,n µ = , � = . n (40) µ of the demonstration trajectories, as shown in Figure 10. � � f ,n fs,n f ,n GMR not only enables smoothing of a single robot tra- At this point, the Gaussian probability distribution jectory, but also allows multiple fitted ellipses generated N s, f µ ,� can be expressed as n by the GMM to be regressed into a new trajectory, which can be directly used in the second-order DMPs model. Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 12 of 20 Figure 10 Robot grasp trajectory learning process: (a) Collected multiple demonstration trajectories, (b) Demonstration trajectories after DTW processing with the same time steps, (c) Modeling the DTW-processed trajectories using GMM, (d) Performing regression using GMR to generate a smooth representative trajectory We sequentially performed the above computation function approximation problem. By adjusting the process for multiple discrete robot grasping actions and function form of the forcing term, the actual grasp recorded the execution orders of the final representa - trajectory converges to the target trajectory as much tive trajectories. This enables the robot to generate all as possible. The adjustable weight w corresponding the execution trajectories corresponding to a complete to each kernel function can then be determined using grasping task, as shown in Figure 11. Locally Weighted Regression (LWR) [62]. Specifically, Finally, by inputting smooth representative trajec- for any dimension of the robot motion, we expect to tories into the second-order DMPs model, we trans- achieve f ≈ f . target formed the grasp trajectory learning problem into a Figure 11 Generating all representative trajectories for a complete robot grasping task using GMM/GMR and second-order DMPs model N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 13 of 20   The mathematical model of the function-approxima - r r r p 11 12 13 � �   0 0 tion problem is described using Eq. (49). r r r p R (θ ) P (θ )  y  0 i−1 21 22 23 7 7 T (θ ) = T = = ,   7 i  r r r p  0 1 31 32 33 i=1 � � � 2 w = arg min f (t ) − f (t ) , n target n (51) w (1≤i≤N) n=1 i−1 � � � �� where T represents the pose transformation matrix  −1  ˆ f (t ) = φ µ + � � s − µ , target n n f ,n fs,n s,n s,n 0 0 between adjacent joints of the robot. R (θ ) and P (θ ) n=1 7 7 � � s.t. represent the robot posture matrix and end-effector posi -  ψ (s) = exp −h (s − c ) , i i i tion matrix, respectively. τ s˙ = −α s,0 < s ≤ 1, Subsequently, we integrate the manipulability D from Eq. (49) (26) and the forward kinematics equation from Eq. (52) to where f (t ) represents the forcing term of the target n construct the robot pose optimization model. Its expres- desired robot grasp trajectory processed by the GMM/ sion is shown in Eq. (52). GMR, and f (t ) represents the forcing term of the actual robot grasp trajectory. θ = arg min F(θ), Gen.i θ (1≤i≤7) By solving Eq. (49), we obtain the expression for w , a s F(θ) = w F (θ) + w F (θ), w = w = 0.5, shown in Eq. (50). F 1 F 2 F F  1 2 1 2 � � −1 2  � � � � /  T + + F (θ) = p MJ MJ p ,  1 T ee ee w w ··· w   w = . (50) 1 2 N  1 2  /  3 9 � �  � � � �  ∗ ∗   F θ = p − p + r − r , ( ) 2 ξ kj ξ kj Next, we characterize and learn multi-segment dem- s.t. ξ k,j onstration trajectories Ŵ using the second-order DTW   �  0 i−1 T (θ) = T , forward kinematic equation, DMPs model and GMM/GMR, respectively, and gen- 7 i  i=1 eralize a new grasp trajectory Ŵ for the robot in a  Gen  θ ∈ [θ , θ ], i = 1, 2, ··· , 7, i i,min i,max � � real-world operating scenario by calculating the weight ξ = x, y, z , k, j = {1, 2, 3}, matrix w online. (52) where F (θ ) and F (θ ) are objective functions formulated 1 2 based on robot pose error and robot manipulability, 4.2.2 Pose Optimization Stage respectively. To ensure that their values remain within In the second stage of TS-RIL, we discretize the tra- the same order of magnitude, we normalized them using jectory Ŵ into m trajectory points, denoted as Gen Eqs. (53) and (54). The corresponding weights, w and Ŵ ∈ Ŵ , i = 1, 2, ··· , m , and further design a pose Gen,i Gen w , are assigned to F (θ ) and F (θ ) , respectively. Here, F 1 2 optimizer to calculate the joint position profiles during i−1 we set w = w = 0.5 . T represents the pose trans- F F 1 2 i robot trajectory execution. Specifically, considering the formation matrix between adjacent joints of the robot, 7-DoF manipulator shown in Figure 12 as an example, θ is the joint angle sequence corresponding to the ith Gen,i we describe the forward kinematics equation using Eq. trajectory point Ŵ ∈ Ŵ , i = 1, 2, ··· , m . p and p Gen,i Gen ξ (51). represent the actual position and target position of the Figure 12 The 7-DoF manipulator: (a) Integrated robot hand-eye system, (b) Virtual prototype model, (c) D-H coordinate system Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 14 of 20 robot end-effector, respectively. r and r represent the the beginning of the task, the host computer first sends kj kj the initialization parameter information to both move_ actual position and target posture of the robot end-effec - group and GD-IL. Subsequently, the GD-IL sequentially tor, respectively. generates multiple subtrajectories for the robot grasping F (θ ) − F (θ ) 1,max 1 task and publishes discretized trajectory points. At this F (θ ) = , 1 (53) F (θ ) − F (θ ) 1,max 1,min point, move_group subscribes to a topic containing these discrete trajectory points, and drives the robot to move according to the specified poses. Finally, the final results F (θ ) − F (θ ) 2,max 2 F (θ ) = . 2 (54) of robot grasp detection and motion execution are dis- F (θ ) − F (θ ) 2,max 2,min played using the 3D visualization software Rviz [64]. Next, we employ a conventional optimization algo- Subsequently, we designed a multi-object pick-and- rithm, such as the Particle Swarm Optimization (PSO) place task scenario, as shown in Figure 14(a); the experi- algorithm [63], to solve the model in Eq. (52). This allows mental workspace was limited to a 0.9 × 0.6 m tabletop. us to obtain an optimal singularity-free robot configura - This experiment aimed to validate the generalization tion with the best dynamic performance, along with the capability of the GD-IL algorithm through a representa- corresponding joint angles. tive irregular object-grasping task, while laying both theoretical and engineering foundations for the future ground-based intelligent assembly of modular anten- 5 Experiments and Results nas. We employed the Adaptive PSO (A-PSO) algo- In this section, we describe the development of an ROS- rithm [65] to solve the robot pose optimization model based experimental testing platform comprising an inte- in Eq. (52), and set the number of particles m = 80 , grated robot hand-eye system, a force-feedback-enabled initial inertia weight ω = 0.3 , initial learning fac- end effector, a host computer, a network architecture, the max tors c = 1.5, c = 1.5 , and approximation coefficient GD-IL system framework, and a 3D visualization simula- 1 2 −3 tion platform. A graphical representation of this is shown δ = 10 . Considering the randomness of the prototype in Figure 13. We effectively integrated the move_group experiments, each set of experiments was repeated 20 interface in ROS with the proposed GD-IL framework. At times. All comparative experiments were conducted with Figure 13 System architecture of ROS-based robot semi-physical simulation and testing platform N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 15 of 20 Figure 14 Experimental process of the robot performing grasp detection and imitation learning using GD-IL: (a) Output the depth information and 3D point clouds, (b) Calculate the 3D object coordinates, (c) TS-RIL generalizes executable grasp sub-trajectories for the robot identical initial parameters to ensure consistent software shown in Figure 14(b). Finally, the TS-RIL algorithm uti- and hardware configurations across the experiments lizes the 3D object coordinates obtained in the previous (hardware configuration: Intel Core i7-10850H CPU @ step to generalize the executable grasp subtrajectories for 2.60 GHz, 16 GB RAM; software version: Ubuntu 18.04 the manipulator, as shown in Figure 14(c). LTS, ROS Melodic). As shown in Figure 15, when the robot performs grasp In the multi-object pick-and-place task shown in Fig- detection and imitation learning in the real-world sce- ure 14(a), the manipulator sequentially picked each nario using GD-IL, it first applies the manipulability– object from the tabletop and smoothly placed it into the GMM algorithm to compute the dynamic performance κ c blue basket on the right. During this process, the robot index D for each candidate grasp bounding box g of i κ ,i first uses a depth camera D435i [66] on its end-effector the κ objects to be grasped. Next, it sequentially outputs to detect objects on the tabletop, outputting the number D and sorts these values in descending order to gen- max of objects, their corresponding depth information, and erate the robot grasp task sequence. Finally, the TS-RIL 3D point clouds. The manipulability–GMM algorithm is algorithm generates grasp subtrajectories for each object. then applied to sequentially compute the optimal grasp- To verify the superior performance of GD-IL in ing bounding box for each object, with the 3D coordi- handling grasp detection and grasp trajectory imita- nates of the objects displayed on the host computer, as tion learning, multiple comparative experiments were Figure 15 Screenshots of the robot performing grasp detection and imitation learning using GD-IL in a real-world scenario Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 16 of 20 conducted using imitating CNN [32], Transformer- deviation between the actual and target robot motion DIL [33], IL-RGC [34], IL-ARG [49], vision IL [50], and trajectories. GD-IL (ours). The following performance metrics were Figures 16 and 17 illustrate the results of grasp trajec- used for evaluation: average computation time (CT), tory imitation learning using GD-IL. After the two pro- average execution time (ET), average manipulability, root cessing stages (grasp trajectory learning and robot pose mean square error (RMSE), and success rate (SR); the optimization) in the TS-RIL, the position profiles of the mathematical expression for RMSE is given in Eq. (55). robot joints became significantly smoother, as shown in Figure 16. The position and velocity curves of the robot wp end effector exhibited no abrupt changes or discontinui - (55) RMSE = p − p , wp,i target,i ties in the X-, Y-, and Z-directions, as shown in Figure 17. wp i=1 Additionally, during the preprocessing phase, the com- plete grasping task was segmented into multiple grasping where N represents the number of discrete waypoints wp subtasks, each corresponding to a time-dependent robot in the robot motion trajectory, and p and p wp,i target,i motion trajectory. The TS-RIL then sequentially trains denote the actual and target waypoints of the robot end these subtrajectories, enhancing the granularity of the effector, respectively. RMSE was used to measure the Figure 16 Joint angle profiles during the sequential robot grasping of multiple objects Figure 17 Motion trajectories for robot grasping generated using GD-IL in a real-world scenario: (a) Grasping motion trajectories of the robot end-effector, (b)−(d) the position and velocity profiles of the robot end-effector in the X, Y, and Z directions N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 17 of 20 robot’s grasping motion, enabling the robot to execute imitating CNN and transformer-DIL achieves shorter and manage motion commands with higher precision. computation times owing to their prolonged early stage Table 1 and Figure 18 present the comparative results training. However, compared to these pre-training- of the six algorithms used in the robot grasping experi- dependent machine learning algorithms, GD-IL incurs ments. As shown in Table 1, the proposed GD-IL out- only negligible computational overhead, losing less performs the other five algorithms in terms of efficiency, than 10% of the processing time. Therefore, the GD-IL robot manipulability, and success rate for both grasp remained within an engineering-acceptable range in detection and trajectory imitation learning. This advan - terms of computational efficiency. The manipulability tage stems from the integration of the manipulabil- increased by 47.1%, 67.7%, 135.2%, 56.3%, and 48.1%, ity–GMM and TS-RIL algorithms within the GD-IL while RMSE decreased by 74.3%, 28.4%, 17.4%, 84.1%, framework. These algorithms not only enable the robot and 55.0%, respectively. These comparative results indi - to rapidly compute the 3D coordinates of target objects cate that GD-IL outperforms the other five algorithms in without requiring extensive pre-training but also quickly terms of avoiding singular configurations and joint limi - converge the actual grasping trajectory to the target tations. This is because of the pose optimizer detailed in trajectory by adjusting the function form of the forcing Section 4.2.2, which effectively guides the robot out of term. Specifically, compared to CNN, IL-ARG, trans - the local optima and quickly searches for the optimal former-DIL, IL-RGC, and vision IL, the ET of GD-IL was singularity-free configuration. In addition, the results reduced by approximately 10.6%, 17.9%, 13.3%, 20.0%, qualitatively reflect the relationship between the manipu - and 15.5%, respectively. This demonstrates that GD-IL lability and RMSE. Specifically, the higher the manipula - maintains a leading execution time without relying on bility during task execution, the better the manipulator extensive pretraining or high computational power. The approaches the target trajectory. CT is reduced by approximately −7.4%, 24.2%, −10.2%, Finally, the GD-IL achieved a significant improve - 22.6%, and 34.3%, respectively. It is worth noting that ment in SR across multiple prototype experiments. Table 1 Comparative results of six imitation learning algorithms in robot grasping experiments Algorithm Short pre‑training Pose optimizer CT (s) ET (s) Manipulability RMSE SR (%) period −1 imitating CNN × × 2.8234 229.4388 0.5067 83.3 3.057 × 10 −1 IL-ARG × × 4.0028 249.7093 0.4445 66.6 1.097 × 10 −2 Transformer-DIL × × 2.7517 236.3922 0.3170 73.3 9.502 × 10 −1 IL-RGC × × 3.9220 256.3190 0.4769 63.3 4.948 × 10 −1 vision IL √ × 4.6169 242.6557 0.5035 80.0 1.743 × 10 −2 GD-IL (ours) √ √ 3.0337 205.0108 0.7455 93.3 7.851 × 10 CT and ET denote the average computation time and the average execution time, respectively. Figure 18 Relative ratios of six imitation learning algorithms in robot grasping experiments Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 18 of 20 Compared to the other five algorithms, the SR of GD-IL operational capabilities of robotic systems in complex increased by approximately 12.0%, 40.1%, 27.3%, 47.4%, space environments. and 16.6%, respectively. The experimental results dem - onstrate that the proposed GD-IL framework is both Supplementary Information efficient and reliable for multi-object grasp detection The online version contains supplementary material available at https:// doi. org/ 10. 1186/ s10033- 025- 01321-8. and trajectory imitation learning in complex real-world environments. In addition, through multiple multi- Additional file 1. object pick-and-place experiments, it was observed that manipulability is a critical factor affecting the Acknowledgements success rate of multi-object grasping tasks. Therefore, Not applicable after deploying the robot pose optimizer within the Authors’ Contributions GD-IL framework, the success rate of robot trajectory Yuming Ning and Tuanjie Li were responsible for the formal analysis, visualiza- imitation learning and motion execution was further tion, and editing; Yulin Zhang and Ziang Li contributed to the investigation, improved. conceptualization, and conducting of the experiments; and Wenqian Du and Yan Zhang assisted with methodology and resources. All authors read and approved the final manuscript. 6 Conclusions Funding Supported by National Natural Science Foundation of China (Grant No. In this study, we proposed a lightweight integrated 52475280), and Shaanxi Provincial Natural Science Basic Research Program framework for grasp detection and imitation learning, (Grant No. 2025SYSSYSZD-105). called GD-IL, which includes the following key compo- Data Availability nents and findings: The datasets and materials used and/or analyzed during the current study are available from the corresponding author on reasonable request. (1) We designed a manipulability–GMM-based grasp detection algorithm that applies GMM clus- Declarations tering and ellipse regression to the point cloud of Competing Interests an object. Two selection criteria were introduced to The authors declare no competing financial interests. generate multiple candidate grasp bounding boxes, and the optimal one was selected based on the Received: 28 February 2025 Revised: 21 June 2025 Accepted: 2 July 2025 manipulability metric. (2) We proposed a two-stage robot imitation learning method (TS-RIL). In Stage 1, a second-order DMPs model combined with the GMM/GMR was used to References model the grasping trajectory of the robot. In Stage [1] E Y Zhang, H Y Sai, Y H Li, et al. Modular robotic manipulator and ground 2, a robot pose optimization model was constructed assembly system for on-orbit assembly of space telescopes. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical using a derived pose error formulation and manip- Engineering Science, 2024, 238(6): 2283-2293. ulability index, allowing real-time configuration [2] C J Zhao, W Z Guo, M Chen, et al. Space truss construction modeling adjustments to avoid singularities during grasping. based on on-orbit assembly motion feature. Chinese Journal of Aeronaut- ics, 2024, 37(3): 365-379. (3) An ROS-based experimental platform was devel- [3] M H Nair, M C Rai, M Poozhiyil, et al. Robotic technologies for in-orbit oped to verify the GD-IL algorithm. Real-world assembly of a large aperture space telescope: A review. Advances in Space experiments were conducted to evaluate their per- Research, 2024, 74(10): 5118-5141. [4] D Timmermann, C Plasberg, F Graaf, et al. AI-based assembly sequence formance in comparison with existing methods. The planning in a robotic on-orbit assembly application. 10th International results showed that the GD-IL significantly enhanced Conference on Automation, Robotics and Applications, Greece, February the grasp detection and trajectory learning perfor- 22-24, 2024: 69-74. [5] A Nanjangud, C Underwood, C M Rai, et al. Towards robotic on-orbit mances. Specifically, the average computation time, assembly of large space telescopes: Mission architectures, concepts, and execution time, and RMSE reduced by more than analyses. Acta Astronautica, 2024, 224: 379-396. 20%, 15%, and 30%, respectively, and the average [6] Z Sheng, W Chen, Z Chen, et al. Sequence planning for on-orbit robotic assembly based on symbiotic organisms search with diversification manipulability and grasp success rate improved by strategy. Acta Astronautica, 2024, 219: 941-951. more than 50% and 15%, respectively. [7] B Sawik. Space mission risk, sustainability and supply chain: review, multi- objective optimization model and practical approach. Sustainability, 2023, 15(14): 11002. In future work, we will further investigate imita- [8] I Rodríguez, A S Bauer, K Nottensteiner, et al. Autonomous robot planning tion learning for time-varying tasks and in grasp- system for in-space assembly of reconfigurable structures. 2021 IEEE ing moving objects with the aim of enhancing the Aerospace Conference, USA, March 06-13, 2021: 1-17. N ing et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 19 of 20 [9] K Hambuchen, J Marquez, T Fong. A review of NASA human-robot inter- [33] H Kim, Y Ohmura, Y Kuniyoshi. Transformer-based deep imitation learning action in space. Current Robotics Reports, 2021, 2(3): 265-272. for dual-arm robot manipulation. 2021 IEEE/RSJ International Conference [10] B Ma, Z Jiang, Y Liu, et al. Advances in space robots for on-orbit servic- on Intelligent Robots and Systems, September 27-30, 2021: 8965-8972. ing: A comprehensive review. Advanced Intelligent Systems, 2023, 5(8): [34] D S D Cortes, G Hwang, K U Kyung. Imitation learning based soft robotic 2200397. grasping control without precise estimation of target posture. 2021 [11] B Y Cheng, Q G Sun, L B Zeng, et al. Workspace optimization of on-orbit IEEE 4th International Conference on Soft Robotics, USA, April 12-16, 2021: assembly robotic system. Proceedings of the 2023 3rd International Confer- 149-154. ence on Robotics and Control Engineering, USA, May 12-14, 2023: 90-95. [35] A Jonnavittula, S Parekh, D P Losey. View: Visual imitation learning with [12] H Jahanshahi, Z H Zhu. Review of machine learning in robotic grasping waypoints. Autonomous Robots, 2025, 49(1): 1-26. control in space application. Acta Astronautica, 2024, 220: 37-61. [36] W Wang, C Zeng, Z Lu, et al. A novel robust imitation learning framework [13] R B Ashith Shyam, Z Hao, U Montanaro, et al. Autonomous robots for for dual-arm object-moving tasks. IEEE Transactions on Industrial Electron- space: Trajectory learning and adaptation using imitation. Frontiers in ics, 2024. Robotics and AI, 2021, 8: 638849. [37] K Duan, Z Zou, T Y Yang. Training of construction robots using imitation [14] M Alizadeh, Z H Zhu. A comprehensive survey of space robotic manipula- learning and environmental rewards. Computer-Aided Civil and Infrastruc- tors for on-orbit servicing. Frontiers in Robotics and AI, 2024, 11: 1470950. ture Engineering, 2025, 40(9): 1150-1165. [15] Z Huang, L L Shi, L J Fan, et al. Multirobot assembly sequence planning of [38] Z Xie, X Liang, C Roberto. Learning-based robotic grasping: A review. a large space telescope. Journal of Spacecraft and Rockets, 2024: 1-13. Frontiers in Robotics and AI, 2023, 10: 1038658. [16] S B Yang, S Lu, M X Li, et al. Study on virtual training platform for reinforce- [39] F Liu, F Sun, B Fang, et al. Hybrid robotic grasping with a soft multimodal ment learning of space manipulator. 2023 International Conference on gripper and a deep multistage learning scheme. IEEE Transactions on Service Robotics, China, July 21-23, 2023: 73-77. Robotics, 2023, 39(3): 2379-2399. [17] K Fujii, I Rodríguez, M Schedl, et al. Comparative analysis of robotic grip- [40] Y Jung, L Tao, M Bowman, et al. Physics-guided hierarchical reward ping solutions for cooperative and non-cooperative targets. 2024 IEEE mechanism for learning-based robotic grasping. arXiv preprint, 2022: Aerospace Conference, USA, March 2-9, 2024: 1-14. 2205.13561. https:// arxiv. org/ abs/ 2205. 13561 [18] C Li, J Y Yang, S Chang. Review on key technologies of space intelligent [41] Q T Liu, Y Cui, Q Ye, et al. Dexrepnet: Learning dexterous robotic grasping grasping robot. Journal of the Brazilian Society of Mechanical Sciences and network with geometric and spatial hand-object representations. 2023 Engineering, 2022, 44(2): 64. IEEE/RSJ International Conference on Intelligent Robots and Systems, USA, [19] E Papadopoulos, F Aghili, O Ma, et al. Robotic manipulation and capture October 1-5, 2023: 3153-3160. in space: A survey. Frontiers in Robotics and AI, 2021, 8: 686723. [42] Y Deng, X Guo, Y Wei, et al. Deep reinforcement learning for robotic [20] S Kumra, S Joshi, F Sahin. Gr-convnet v2: A real-time multi-grasp detec- pushing and picking in cluttered environment. 2019 IEEE/RSJ International tion network for robotic grasping. Sensors, 2022, 22(16): 6208. Conference on Intelligent Robots and Systems, China, November 3-8, 2019: [21] Y L Chen, Y R Cai, M Y Cheng. Vision-based robotic object grasping—a 619-626. deep reinforcement learning approach. Machines, 2023, 11(2): 275. [43] M Saveriano, F J Abu-Dakka, A Kramberger, et al. Dynamic movement [22] H Sekkat, S Tigani, R Saadane, et al. Vision-based robotic arm control primitives in robotics: A tutorial survey. The International Journal of Robot- algorithm using deep reinforcement learning for autonomous objects ics Research, 2023, 42(13): 1133-1184. grasping. Applied Sciences, 2021, 11(17): 7917. [44] C G Yang, C Z Chen, W He, et al. Robot learning system based on adaptive [23] S Ainetter, F Fraundorfer. End-to-end trainable deep neural network for neural control and dynamic movement primitives. IEEE Transactions on robotic grasp detection and semantic segmentation from rgb. 2021 IEEE Neural Networks and Learning Systems, 2018, 30(3): 777-787. International Conference on Robotics and Automation, China, June 03-05, [45] B Ti, Y Gao, Q Li, et al. Dynamic movement primitives for movement gen- 2021: 13452-13458. eration using GMM-GMR analytical method. 2019 IEEE 2nd International [24] C C Li, G H Tian, M Y Zhang. A semantic knowledge-based method for Conference on Information and Computer Technologies, USA, March 14-17, home service robot to grasp an object. Knowledge-Based Systems, 2024, 2019: 250-254. 297: 111947. [46] A Vakanski, I Mantegh, A Irish, et al. Trajectory learning for robot program- [25] G C Lin, L X Zhu, J H Li, et al. Collision-free path planning for a guava- ming by demonstration using hidden Markov model and dynamic harvesting robot based on recurrent deep reinforcement learning. time warping. IEEE Transactions on Systems, Man, and Cybernetics, Part B Computers and Electronics in Agriculture, 2021, 188: 106350. (Cybernetics), 2012, 42(4): 1039-1052. [26] E G Ribeiro, R de Queiroz Mendes, V Grassi Jr. Real-time deep learning [47] Z Lu, N Wang, D Shi. DMPs-based skill learning for redundant dual-arm approach to visual servo control and grasp detection for autonomous robotic synchronized cooperative manipulation. Complex & Intelligent robotic manipulation. Robotics and Autonomous Systems, 2021, 139: Systems, 2022, 8(4): 2873-2882. 103757. [48] S Schaal, P Mohajerian, A Ijspeert. Dynamics systems vs. optimal [27] F Chen, M Selvaggio, D G Caldwell. Dexterous grasping by manipulability control—a unifying view. Progress in Brain Research, 2007, 165: 425-445. selection for mobile manipulator with visual guidance. IEEE Transactions [49] J B Yi, J Kim, T Kang, et al. Anthropomorphic grasping of complex-shaped on Industrial Informatics, 2018, 15(2): 1202-1210. objects using imitation learning. Applied Sciences, 2022, 12(24): 12861. [28] F F Zheng, Z Y Gong, B Tao. Robotic grasping pose estimation based on [50] Z Chen, K Fan. An online trajectory guidance framework via imitation point cloud accelerated by image feature correspondence. 2023 IEEE learning and interactive feedback in robot-assisted surgery. Neural Net- International Conference on Real-time Computing and Robotics, China, July works, 2025: 107197. 17-20, 2023: 334-339. [51] L Coelho, S M Cerqueira, V Martins, et al. A DMPs-based approach for [29] Y Tian, J Y Zhang, Z K Yin, et al. Robot structure prior guided temporal human-like robotic movements. 2024 IEEE International Conference on attention for camera-to-robot pose estimation from image sequence. Autonomous Robot Systems and Competitions, Portugal, May 2-3, 2024: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern 201-206. Recognition, Canada, June 18-22, 2023: 8917-8926. [52] C Lauretti, C Tamantini, L Zollo. A new dmp scaling method for robot [30] J B Wu, Y X Zhang, Y J Zhang, et al. Research on object detection for ser- learning by demonstration and application to the agricultural domain. vice robots based on image feature. 2022 IEEE 5th Advanced Information IEEE Access, 2024, 12: 7661-7673. Management, Communicates, Electronic and Automation Control Confer- [53] Z Y Lu, N Wang, C G Yang. A dynamic movement primitives-based tool ence, China, December 16-18, 2022, 5: 1535-1539. use skill learning and transfer framework for robot manipulation. IEEE [31] Y S Yang, S S Yeh. Manipulator point teaching system design integrated Transactions on Automation Science and Engineering, 2024, 22: 1748-1763. with image processing and iterative learning control. Journal of Intelligent [54] Z Kingston, L E Kavraki. Robowflex: Robot motion planning with MoveIt & Robotic Systems, 2019, 96: 477-492. made easy. 2022 IEEE/RSJ International Conference on Intelligent Robots and [32] S Zhang, S Li, Y Li, et al. A visual imitation learning algorithm for the selec- Systems, Japan, October 23-27, 2022: 3108-3114. tion of robots’ grasping points. Robotics and Autonomous Systems, 2024, 55. N Vuković, M Mitić, Z Miljković. Trajectory learning and reproduction for 172: 104600. differential drive mobile robots based on GMM/HMM and dynamic time Ning et al. Chinese Journal of Mechanical Engineering (2025) 38:139 Page 20 of 20 warping using learning from demonstration framework. Engineering Sorbonne University, France, in 2021. His research interests include Applications of Artificial Intelligence, 2015, 45: 388-404. mobile manipulation, and whole-body motion generation of human- [56] C Ye, J Yang, H Ding. Bagging for Gaussian mixture regression in robot oid and quadruped robots. learning from demonstration. Journal of Intelligent Manufacturing, 2022, 33(3): 867-879. Yan Zhang born in 1982, is currently a PhD candidate at School of [57] H W Zhang, X N Han, W Zhang, et al. Complex sequential tasks learning Mechano-Electronic Engineering, Xidian University, China. He received with Bayesian inference and Gaussian mixture model. 2018 IEEE Interna- his master degree from Xidian University, China, in 2008. His research tional Conference on Robotics and Biomimetics, Malaysia, December 12-15, interests include intelligent robot and life science lab automation. 2018: 1927-1934. [58] X Wang, C Yang, Z Ju, et al. Robot manipulator self-identification for sur - rounding obstacle detection. Multimedia Tools and Applications, 2017, 76: 6495-6520. [59] B Johnen, B Kuhlenkoetter. A dynamic time warping algorithm for indus- trial robot motion analysis. 2016 Annual Conference on Information Science and Systems, USA, March 16-18, 2016: 18-23. [60] Y Yao, X Zhao, Y Wu, et al. Clustering driver behavior using dynamic time warping and hidden Markov model. Journal of Intelligent Transportation Systems, 2021, 25(3): 249-262. [61] C Li, C Yang, Z Ju, et al. An enhanced teaching interface for a robot using DMP and GMR. International Journal of Intelligent Robotics and Applica- tions, 2018, 2(1): 110-121. [62] W S Cleveland, S J Devlin. Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 1988, 83(403): 596-610. [63] Y M Ning, T J Li, W Q Du, et al. Inverse kinematics and planning/control co-design method of redundant manipulator for precision operation: Design and experiments. Robotics and Computer-Integrated Manufactur- ing, 2023, 80: 102457. [64] H R Kam, S H Lee, T Park, et al. Rviz: A toolkit for real domain data visuali- zation. Telecommunication Systems, 2015, 60: 337-345. [65] Y M Ning, T J Li, C Yao, et al. MT-RSL: A multitasking-oriented robot skill learning framework based on continuous dynamic movement primitives for improving efficiency and quality in robot-based intelligent operation. Robotics and Computer-Integrated Manufacturing, 2024, 90: 102817. [66] M Jiang, Z Wang, F Peng. Smart home robot based on 3D LiDAR and depth camera technology. 2024 3rd International Conference on Cloud Computing, Big Data Application and Software Engineering, China, October 11-13, 2024: 750-753. Yuming Ning born in 1997, is currently a PhD candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Xidian University, China, in 2019. His research interests include robot skill learning, motion planning, and intelligent control for autonomous vehicles. Tuanjie Li born in 1972, is currently a professor at School of Mech- ano-Electronic Engineering, Xidian University, China. He received his Ph.D. degree in mechanical engineering from Xi’an University of Tech- nology, China, in 1999. His research interests include space deploy- able antenna, intelligent control, and nonlinear dynamics. Yulin Zhang born in 1997, is currently a PhD candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Xidian University, China, in 2020. His research interests include robot skill learning, motion coordination, and intel- ligent multi-mode robots. Ziang Li born in 2000, is currently a master candidate at School of Mechano-Electronic Engineering, Xidian University, China. He received his bachelor degree from Inner Mongolia University of Technology, China, in 2020. His research interests include robot skill learning and motion control. Wenqian Du born in 1988, is currently a research associate at The University of Edinburgh, the United Kingdom. He received his Ph.D. degree from Institut des systèmes intelligents et de robotique (ISIR) of

Journal

Chinese Journal of Mechanical Engineering – Springer Journals

Published: Aug 5, 2025

Keywords: Grasp detection; Robot imitation learning; Manipulability; Dynamic movement primitives; Gaussian mixture model and Gaussian mixture regression; Pose optimization

Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 7-Day Trial for You or Your Team.

Learn More →

An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 7-Day Trial for You or Your Team.

Learn More →

An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

An Integrated Framework of Grasp Detection and Imitation Learning for Space Robotics Applications

References (63)

Abstract

Journal

Recommended Articles

There are no references for this article.

Our policy towards the use of cookies