Hard to run eval_video_official.py & some training questions #8
I found that it's the annotation_path that matters: I was running preprocess.py from inside the data/ folder, but annotation_path already starts with data/, so the path did not resolve correctly.
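For anyone hitting the same issue, here is a minimal sketch (folder and variable names are illustrative assumptions, not taken from the repo) of making the annotation path independent of the working directory:

```python
import os

# Hedged sketch: resolve the annotation path relative to this script's own
# location, so preprocess.py behaves the same whether it is launched from
# the repo root or from inside data/. 'outf_all' is an assumed folder name.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
annotation_path = os.path.join(SCRIPT_DIR, 'outf_all')
```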
I followed the tutorial in the Readme inside the objectron_eval folder; however, it did not work out of the box. After I modified the code, I figured out that I needed to put the tfrecord data folder inside the path mentioned above. Then it runs a batch of videos, but some debug windows appear and get stuck. It's hard to run the evaluation-related scripts; please update the tutorial readme with more details. Thank you very much!
Sorry for the inconvenience. I may update the tutorial to make it more user-friendly when I have more time.
We ran into some problems (annotation inconsistencies or missing videos) when processing the raw data from the Objectron dataset for training purposes, so we skip those clips. You can find the related code in the preprocessing script.
Thanks for the reply. I encountered some other problems while studying your code. The variable names confused me: even with your comments, I still wonder why 'hp' stands for 'keypoints' instead of 'kp'. My second question is about the number of channels of the keypoint head when I start to train the code.
Hi, I have another question about https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/networks/pose_dla_dcn.py#L253: what are opt.pre_img, opt.pre_hm, and opt.pre_hm_hp? I cannot find any comments about these parameters.
That's for CenterPoseTrack. Its pipeline plot may give you some ideas.
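In case it helps others: these options follow CenterTrack-style conditioning, where the network also receives the previous frame and heatmaps rendered from the previous predictions. A hedged sketch of the idea (tensor shapes and names are illustrative assumptions, not taken from the repo):

```python
import torch

# Hedged sketch of CenterTrack-style inputs for CenterPoseTrack.
# pre_img:   the previous RGB frame
# pre_hm:    a single-channel heatmap rendered from the previous object centers
# pre_hm_hp: heatmaps rendered from the previous keypoint predictions
img       = torch.zeros(1, 3, 512, 512)   # current frame
pre_img   = torch.zeros(1, 3, 512, 512)   # previous frame
pre_hm    = torch.zeros(1, 1, 512, 512)   # previous center heatmap
pre_hm_hp = torch.zeros(1, 8, 512, 512)   # previous keypoint heatmaps (8 corners)

# Conceptually, the tracker conditions on all of these; a plain concatenation
# here stands in for the extra input convolutions used in CenterTrack.
x = torch.cat([img, pre_img, pre_hm, pre_hm_hp], dim=1)
```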
Sorry for the confusion. I had some other thoughts about the name before but used "keypoints" in the end.
We have 8 keypoints, and each of them has 2 channels, so the total number of channels is 16.
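A minimal sketch of how that channel count falls out, following the CenterNet-style head convention that CenterPose builds on (the exact dictionary keys and head set here are illustrative assumptions):

```python
num_keypoints = 8  # the 8 corners of the 3D bounding box

# Hedged sketch: each keypoint is regressed as a 2D offset (dx, dy) from the
# object center, so the keypoint-offset head needs 2 channels per keypoint.
heads = {
    'hm': 1,                   # object center heatmap
    'wh': 2,                   # 2D box width/height
    'hps': num_keypoints * 2,  # keypoint offsets -> 16 channels
    'hm_hp': num_keypoints,    # optional per-keypoint heatmaps
}
assert heads['hps'] == 16
```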
Thanks for the reply. I ran into another problem: if I remove the 'wh' head for training, then I get an error at inference time. https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L102 As you can see, if I don't have the 'wh' head, I run into an error there. Could you explain the decode processing in more detail, please? Or point me to some reference papers? I got confused when trying to understand the decoding part. Thank you very much!
More questions: I tried to remove the 'scale' head and set opt.obj_scale = False, and demo.py crashes where the code asks for the scale variable. I tried setting the scale to a constant 1, but the visualization just becomes a square cuboid, which is not correct. Can I get the pose without predicting the scale? Is it possible to use PnP to get the output 3D points and infer the scale from the predicted 3D points?
Sorry about the confusion. I once tried to see the impact of removing the 'wh' option for CenterPose. I found that it might not be a good choice, so I did not go further. Then, when I developed CenterPoseTrack with some additional parameters, e.g., kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height, I assumed the 'wh' option was already enabled. In your case, you probably have to give kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height some default values there (if you do not care about tracking). As for references, our implementation is based on CenterNet. I do not think detailed documentation is available right now, as most of the code is too low-level to describe in a paper.
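A hedged sketch of the kind of guard the reply suggests; the key names come from the reply above, but the surrounding decode context and tensor shapes are assumptions:

```python
import torch

def get_with_default(output, key, like):
    # Hedged sketch: fall back to zeros when a tracking-only output is
    # missing, e.g. when kps_heatmap_std / kps_heatmap_mean /
    # kps_heatmap_height were never produced because 'wh' was disabled.
    return output[key] if key in output else torch.zeros_like(like)

# Illustrative usage inside a decode step (names are assumptions):
# kps_heatmap_mean = get_with_default(output, 'kps_heatmap_mean', ref_tensor)
```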
In my implementation, the 'scale' head predicts the ratio between width/height/length. If the absolute height is not given from somewhere else, its prediction is width/1/length (we call this the relative scale). PnP is used to get the pose given the 2D keypoints on the image and the 3D keypoints (prior information). In our case, width/1/length or width/height/length can be used to construct the 3D keypoints (as we work on a 3D bounding box). I think your so-called "predicted 3D points" are really the 3D keypoints (prior information) transformed into camera space by the pose computed from PnP. In other words, you cannot get the pose without predicting the scale.
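To make the PnP step concrete, here is a minimal sketch under stated assumptions (OpenCV as the solver; the corner ordering and the intrinsics K are illustrative, not necessarily the repo's exact convention):

```python
import numpy as np
import cv2

def pose_from_relative_scale(kps_2d, rel_scale, K):
    """Recover the object pose from 8 predicted 2D corners via PnP.

    kps_2d:    (8, 2) predicted 2D keypoints on the image
    rel_scale: (w, 1.0, l) relative scale predicted by the 'scale' head
    K:         (3, 3) camera intrinsic matrix
    """
    w, h, l = rel_scale
    # 3D box prior: 8 corners of a box with the given relative dimensions,
    # centered at the origin (the corner ordering here is an assumption).
    corners_3d = np.array([[sx * w / 2, sy * h / 2, sz * l / 2]
                           for sx in (-1, 1)
                           for sy in (-1, 1)
                           for sz in (-1, 1)], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(corners_3d, np.asarray(kps_2d, np.float64),
                                  K, None, flags=cv2.SOLVEPNP_EPNP)
    return rvec, tvec  # pose is correct up to the unknown absolute height
```

As the reply notes, without the relative scale there is no 3D prior to feed into PnP, which is why the pose cannot be recovered when the 'scale' head is removed.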
Thanks for the clarification on the questions above. I have another question: how do I train the cup category?
Did you succeed with the test? Some errors occur when I run the test code "python eval_video_official.py".
Hi, I encountered some problems when using the code.
I followed data/Readme.md, downloaded the chair data, and then preprocessed it. After that, I got nothing inside the output folder (e.g., outf_all) except another empty folder named 'chair_train'.
I got a 'bug_lists.txt' file, which seems to include the names of all the 'chair' category videos after preprocessing.
I see there's also a 'bug_list.txt' in the 'label' folder. What do you mean by 'bug_lists'?
Please help me figure out what's going on. Thank you very much!
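For debugging, a quick hedged sketch for inspecting the skip list (the file name follows the 'bug_lists.txt' mentioned above; the path is an assumption):

```python
# Hedged sketch: count and preview the clips recorded in bug_lists.txt,
# which (per the maintainer's earlier reply) lists clips skipped due to
# annotation inconsistencies or missing videos.
from pathlib import Path

skipped = [line for line in Path('bug_lists.txt').read_text().splitlines()
           if line.strip()]
print(f'{len(skipped)} clips skipped; first few: {skipped[:3]}')
```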