verityw committed on
Commit b1c8339 · 1 Parent(s): 738a16b
Files changed (4)
  1. README.md +63 -2
  2. camera_serials.json +3 -0
  3. episode_id_to_path.json +3 -0
  4. intrinsics.json +3 -0
README.md CHANGED
@@ -1,5 +1,4 @@
  # DROID Annotations
-
  This repo contains additional annotation data for the DROID dataset which we completed after the initial dataset release.
 
  Concretely, it contains the following information:
@@ -19,7 +18,60 @@ for a subset of the DROID episodes. Concretely, we provide the following three c
  - `cam2base_extrinsics.json`: Contains ~36k entries with either the left or right camera calibrated with respect to base.
  - `cam2cam_extrinsics.json`: Contains ~90k entries with cam2cam relative poses and camera parameters for all of DROID.
  - `cam2base_extrinsic_superset.json`: Contains ~24k unique entries, total ~48k poses for both left and right camera calibrated with respect to the base.
 
 
  ## Accessing Annotation Data
 
@@ -35,4 +87,13 @@ import tensorflow as tf
  episode_paths = tf.io.gfile.glob("gs://gresearch/robotics/droid_raw/1.0.1/*/success/*/*/metadata_*.json")
  for p in episode_paths:
      episode_id = p[:-5].split("/")[-1].split("_")[-1]
- ```
  # DROID Annotations
  This repo contains additional annotation data for the DROID dataset which we completed after the initial dataset release.
 
  Concretely, it contains the following information:

  - `cam2base_extrinsics.json`: Contains ~36k entries with either the left or right camera calibrated with respect to base.
  - `cam2cam_extrinsics.json`: Contains ~90k entries with cam2cam relative poses and camera parameters for all of DROID.
  - `cam2base_extrinsic_superset.json`: Contains ~24k unique entries, total ~48k poses for both left and right camera calibrated with respect to the base.
+ Each of these files maps an episode's unique ID (see Accessing Annotation Data below) to a dictionary containing metadata (e.g., detection quality metrics; see Appendix G of the paper) and a map from camera ID to extrinsics. Each extrinsics value is a 6-element list of floats: the first three elements are the translation and the last three are the rotation as Euler angles. It can be converted into a homogeneous pose matrix as follows:
+ ```
+ import numpy as np
+ from scipy.spatial.transform import Rotation as R
+
+ # Assume extrinsics is that 6-element list
+ pos = extrinsics[0:3]
+ # Euler angles ("xyz" order, radians by default) -> 3 x 3 rotation matrix
+ rot_mat = R.from_euler("xyz", extrinsics[3:6]).as_matrix()
+
+ # Make homogeneous transformation matrix
+ cam_to_target_extrinsics_matrix = np.eye(4)
+ cam_to_target_extrinsics_matrix[:3, :3] = rot_mat
+ cam_to_target_extrinsics_matrix[:3, 3] = pos
+ ```
+ This matrix transforms points from the camera's frame to the target frame. Inverting it gives the transformation from the target frame to the camera frame, which is usually what one wants (e.g., to project a point in the robot frame into the camera frame).
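Because the pose matrix is a rigid transform (a rotation plus a translation), its inverse can also be written in closed form instead of calling a general matrix inverse. The following is a minimal sketch of that idea; the function name `invert_rigid_transform` and the example pose are illustrative assumptions, not part of the dataset or its tooling.

```python
import numpy as np


def invert_rigid_transform(transform):
    """Invert a 4 x 4 rigid transform [R | t] using R^T and -R^T t."""
    transform = np.asarray(transform, dtype=float)
    rot = transform[:3, :3]
    pos = transform[:3, 3]

    inverse = np.eye(4)
    inverse[:3, :3] = rot.T        # the inverse of a rotation is its transpose
    inverse[:3, 3] = -rot.T @ pos  # translation expressed in the new frame
    return inverse


# Made-up pose for illustration: 90-degree rotation about z plus a translation.
cam_to_target = np.array([
    [0.0, -1.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, -0.2],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
])
target_to_cam = invert_rigid_transform(cam_to_target)
```

For a valid rigid transform this should agree with `np.linalg.inv(cam_to_target)` up to floating-point error.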
+
+ As the raw DROID video files were recorded on ZED cameras and saved in SVO format, they contain camera intrinsics which can be used in conjunction with the above. For convenience, we have extracted these intrinsics and saved them to `intrinsics.json` (~72k entries). This JSON file has the following format (note the order fx, cx, fy, cy):
+ ```
+ <episode ID>:
+     <external camera 1's serial>: [fx, cx, fy, cy for camera 1]
+     <external camera 2's serial>: [fx, cx, fy, cy for camera 2]
+     <wrist camera's serial>: [fx, cx, fy, cy for wrist camera]
+ ```
+ One can thus convert the list for a particular camera to a 3 x 3 intrinsics matrix via the following:
+ ```
+ import numpy as np
+
+ # Assume intrinsics is that 4-element list
+ fx, cx, fy, cy = intrinsics
+ intrinsics_matrix = np.array([
+     [fx, 0, cx],
+     [0, fy, cy],
+     [0, 0, 1]
+ ])
+ ```
+ Note that the intrinsics tend not to change much between episodes, but using the values for the specific episode tends to give the best results.
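As a sketch of how the per-episode lookup might be wrapped, the snippet below builds one intrinsics matrix per camera for a given episode. It assumes the nested layout described above (episode ID, then camera serial, then `[fx, cx, fy, cy]`); the helper names, episode ID, serials, and numeric values are placeholders invented for illustration.

```python
import numpy as np


def intrinsics_list_to_matrix(intrinsics):
    """Convert [fx, cx, fy, cy] (note the order) to a 3 x 3 intrinsics matrix."""
    fx, cx, fy, cy = intrinsics
    return np.array([
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ])


def episode_intrinsics_matrices(intrinsics_by_episode, episode_id):
    """Return {camera serial: 3 x 3 matrix} for one episode."""
    cameras = intrinsics_by_episode[episode_id]
    return {serial: intrinsics_list_to_matrix(values)
            for serial, values in cameras.items()}


# Placeholder data standing in for the contents of intrinsics.json.
example_intrinsics = {
    "example_episode": {
        "11111111": [520.0, 640.0, 521.0, 360.0],
        "22222222": [530.0, 642.0, 531.0, 358.0],
    }
}
matrices = episode_intrinsics_matrices(example_intrinsics, "example_episode")
```

In practice, `example_intrinsics` would be replaced by the dictionary obtained from `json.load` on `intrinsics.json`.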
+
+ ## Example Calibration Use Case
+ Using the calibration information, one can project points in the robot's frame into pixel coordinates for the cameras. Below, we demonstrate how to map the robot gripper position to pixel coordinates for an external camera whose extrinsics are in `cam2base_extrinsics.json`; see <TODO> for the full code.
+ ```
+ gripper_position_base = <Homogeneous gripper position in the base frame, as obtained from the TFDS episode. Shape 4 x 1>
+ cam_to_base_extrinsics_matrix = <extrinsics matrix for some camera>
+ intrinsics_matrix = <intrinsics matrix for that same camera>
+
+ # Invert to get transform from base to camera frame
+ base_to_cam_extrinsics_matrix = np.linalg.inv(cam_to_base_extrinsics_matrix)
+
+ # Transform gripper position to camera frame, then remove homogeneous component
+ robot_gripper_position_cam = base_to_cam_extrinsics_matrix @ gripper_position_base
+ robot_gripper_position_cam = robot_gripper_position_cam[:3] # Now 3 x 1
 
+ # Project into pixel coordinates, dividing by depth
+ pixel_positions = intrinsics_matrix @ robot_gripper_position_cam
+ pixel_positions = pixel_positions[:2] / pixel_positions[2] # Shape 2 x 1
+ ```
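The same projection can be sketched as a runnable function that handles several points at once and flags points behind the camera, for which the division by depth is not meaningful. This assumes the usual pinhole convention in which the camera looks along its +z axis; the function name and all calibration values below are made up for illustration and do not come from the dataset.

```python
import numpy as np


def project_points(points_base, cam_to_base, intrinsics_matrix):
    """Project N x 3 base-frame points to N x 2 pixel coordinates.

    Returns (pixels, in_front); rows of `pixels` for points with
    non-positive depth are left as NaN.
    """
    points_base = np.asarray(points_base, dtype=float)
    ones = np.ones((points_base.shape[0], 1))
    points_h = np.hstack([points_base, ones])          # N x 4, homogeneous

    base_to_cam = np.linalg.inv(cam_to_base)
    points_cam = (base_to_cam @ points_h.T).T[:, :3]   # N x 3, camera frame

    in_front = points_cam[:, 2] > 0                    # assumed +z viewing axis
    pixels = np.full((points_base.shape[0], 2), np.nan)
    projected = (intrinsics_matrix @ points_cam[in_front].T).T
    pixels[in_front] = projected[:, :2] / projected[:, 2:3]
    return pixels, in_front


# Made-up calibration: camera frame coincides with the base frame.
cam_to_base = np.eye(4)
intrinsics_matrix = np.array([
    [500.0, 0.0, 320.0],
    [0.0, 500.0, 240.0],
    [0.0, 0.0, 1.0],
])
points = np.array([
    [0.0, 0.0, 1.0],    # on the optical axis
    [0.1, -0.2, 2.0],   # off-axis, in front of the camera
    [0.0, 0.0, -1.0],   # behind the camera
])
pixels, in_front = project_points(points, cam_to_base, intrinsics_matrix)
```

Checking `in_front` (and, if needed, the image bounds) before drawing avoids plotting points that the camera could not have seen.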
 
  ## Accessing Annotation Data
 

  episode_paths = tf.io.gfile.glob("gs://gresearch/robotics/droid_raw/1.0.1/*/success/*/*/metadata_*.json")
  for p in episode_paths:
      episode_id = p[:-5].split("/")[-1].split("_")[-1]
+ ```
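To make the string handling in the loop above easier to follow, here is a small sketch that extracts the ID from a single metadata path by stripping the `metadata_` prefix and `.json` suffix. The function name and the example path (bucket, folders, and ID) are hypothetical; the two approaches would differ only if an ID itself contained an underscore.

```python
def episode_id_from_metadata_path(path):
    """Return the episode ID embedded in a .../metadata_<ID>.json path."""
    filename = path.split("/")[-1]
    prefix, suffix = "metadata_", ".json"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError(f"Not a metadata file path: {path!r}")
    return filename[len(prefix):-len(suffix)]


# Hypothetical path, shaped like the glob pattern above.
example_path = (
    "gs://example-bucket/lab/success/2023-01-01/run/"
    "metadata_LAB+abc123+2023-01-01-00h-00m-00s.json"
)
example_id = episode_id_from_metadata_path(example_path)
```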
+
+ Using the above annotations requires these episode IDs, but the TFDS dataset only contains paths, so we have included `episode_id_to_path.json` for convenience. The code snippet below loads this JSON file, then builds the inverse mapping from episode paths to IDs.
+ ```
+ import json
+ episode_id_to_path_path = "<path/to/episode_id_to_path.json>"
+ with open(episode_id_to_path_path, "r") as f:
+     episode_id_to_path = json.load(f)
+ episode_path_to_id = {v: k for k, v in episode_id_to_path.items()}
+ ```
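A dictionary comprehension silently keeps only the last key when two IDs share the same path. If that matters, the inversion can be done with an explicit uniqueness check, as in the sketch below. The helper name `invert_unique` and the example entries are illustrative assumptions; whether duplicate paths actually occur in `episode_id_to_path.json` is not stated here.

```python
def invert_unique(mapping):
    """Invert a dict, raising if two keys share the same value."""
    inverse = {}
    for key, value in mapping.items():
        if value in inverse:
            raise ValueError(
                f"{value!r} is mapped to by both {inverse[value]!r} and {key!r}"
            )
        inverse[value] = key
    return inverse


# Placeholder entries standing in for the loaded JSON.
example_id_to_path = {
    "episode_a": "path/to/a",
    "episode_b": "path/to/b",
}
example_path_to_id = invert_unique(example_id_to_path)
```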
camera_serials.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c8d346c51dcef71248e280e44dcd7985a94433f6911460b31dcc098cab30acc4
+ size 12743876
episode_id_to_path.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e88ee7da94a40602cde4aacf22f2b48068f4f582c8ab38cf1888e06162a8085
+ size 7237770
intrinsics.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:78c76755b075ae53e74a28c543bb1b185c50aa976458e95fbc9ba880a8cd2d51
+ size 125812944