Hi Dr. Mousavian,
I am an undergraduate student very interested in your 6DOF Graspnet project. I have read your paper and am currently reading through your code. I have a few questions about the code that I hope you could spend some time helping me answer. Thank you very much in advance.
First, in the folder demo/data you provided a few data files (.npy) to run the demo. I am wondering whether these .npy files contain the 3D point clouds of the test objects or something else. If they are point clouds, could you tell me how you created them? What kind of vision sensor did you use? How did you process the sensor data to generate those .npy files? How did you filter the measured point clouds and remove the table plane? I am trying to run your code on my Panda robot, so I would appreciate some instructions on how a robot can, at experiment time, extract a 3D point cloud of an object and feed it to the decoder in a format compatible with your code.
Second, in many places in your code you mention the Panda gripper control points. Could you help me understand what these control points are? Specifically, in utils/utils.py, what is the purpose of the functions transform_control_points, transform_control_points_numpy, and control_points_from_rot_and_trans?
I saw that you use one of those functions in grasp_sampling_data.py, where meta['target_cps'] (lines 54, 68) is computed from output_grasps (line 44). I noticed that 'target_cps', 'grasp_rt', and 'pc' are the data from the dataset that are later used to train the VAE. Could you tell me what 'target_cps' is, and what the difference is between 'target_cps' and 'grasp_rt'? From my understanding, 'grasp_rt' is the grasp g that we feed to the encoder, and 'target_cps' is the reconstructed g_hat that comes out of the decoder. Should they then be the same thing? Why is 'grasp_rt' computed directly from output_grasps (line 58), while 'target_cps' has to go through the control points?
Thanks a lot for your answer. Best regards.
Getting a point cloud from depth images: given the intrinsic matrix and the depth image, you can easily compute the point cloud. You can find plenty of code snippets for it, and it is covered in many computer vision lecture slides on projection and how the 3D world is mapped onto the 2D image plane.
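For reference, here is a minimal sketch (not from the repo) of the back-projection described above, assuming a pinhole camera with intrinsic matrix K and a depth image in meters; the variable names are illustrative.

```python
import numpy as np

def depth_to_point_cloud(depth, K):
    """Back-project an HxW depth image (meters) into an Nx3 point cloud."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    h, w = depth.shape
    # Pixel grid: u is the column index, v is the row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Drop invalid measurements (zero depth) before further processing.
    return points[points[:, 2] > 0]
```

For removing the table plane, one common approach (not necessarily what the authors did) is to fit a plane with RANSAC, e.g. Open3D's segment_plane, and keep only the points above it, then crop or segment the remaining points around the target object.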
Regarding control points: these are points sampled on the gripper body and fingers to represent the gripper; they are used for computing losses and also in the evaluator model.
grasp_rt is the 4x4 matrix that contains the rotation and translation of the grasp. grasp_cps are the control points of the gripper transformed to the pose of the grasp. I would recommend reading the paper in more detail to understand this; the implementation follows the paper closely and vice versa.
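To make the relationship concrete, here is an illustrative sketch (not the repo's exact implementation) of what "control points transformed to the pose of the grasp" means: the fixed gripper-frame control points are mapped through the 4x4 grasp matrix.

```python
import numpy as np

def transform_control_points(control_points, grasp_rt):
    """Map Nx3 gripper-frame control points into the scene frame using a 4x4 grasp pose."""
    n = control_points.shape[0]
    # Convert to homogeneous coordinates so rotation and translation apply in one step.
    homogeneous = np.concatenate([control_points, np.ones((n, 1))], axis=1)  # Nx4
    transformed = homogeneous @ grasp_rt.T
    return transformed[:, :3]
```

So grasp_rt encodes the grasp pose itself, while target_cps is the set of gripper control points placed at that pose, which is a convenient representation for computing reconstruction losses between grasps.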