
Model Card for Hume-Libero_Object

A Dual-System Visual-Language-Action model with System-2 thinking trained on Libero-Object.

Uses

  • To reproduce the results in the paper, follow the instructions.
  • To use the model directly:
    from hume import HumePolicy
    import numpy as np
    
    # load policy
    hume = HumePolicy.from_pretrained("/path/to/checkpoints")
    
    # configure the test-time computing arguments
    hume.init_infer(
        infer_cfg=dict(
            replan_steps=8,
            s2_replan_steps=16,
            s2_candidates_num=5,
            noise_temp_lower_bound=1.0,
            noise_temp_upper_bound=1.0,
            time_temp_lower_bound=0.9,
            time_temp_upper_bound=1.0,
            post_process_action=True,
            device="cuda",
        )
    )
    
    # prepare observations
    observation = {
        "observation.images.image": np.zeros((1, 224, 224, 3), dtype=np.uint8),  # (B, H, W, C)
        "observation.images.wrist_image": np.zeros((1, 224, 224, 3), dtype=np.uint8),  # (B, H, W, C)
        "observation.state": np.zeros((1, 7)),  # (B, state_dim)
        "task": ["Lift the papper"],
    }
    
    # Infer the action
    action = hume.infer(observation) # (B, action_dim)
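
The observation dictionary above is batched, so raw per-step inputs need a leading batch axis before they are passed to the policy. The following is a minimal sketch of a packing helper. `make_observation` is a hypothetical name and is not part of the `hume` package. The 224x224 image size and the 7-dimensional state are copied from the example above, and treating them as hard requirements is an assumption. The task string is a placeholder.

```python
import numpy as np

# Shapes copied from the usage example above; treating them as
# required values is an assumption, not documented behavior.
IMAGE_SHAPE = (224, 224, 3)  # (H, W, C)
STATE_DIM = 7


def make_observation(image, wrist_image, state, task):
    """Pack one timestep of raw inputs into a batch-of-one observation dict.

    Hypothetical helper, not part of the hume package.
    """
    image = np.asarray(image)
    wrist_image = np.asarray(wrist_image)
    state = np.asarray(state, dtype=np.float64)

    # Validate both camera frames before adding the batch axis.
    for name, img in (("image", image), ("wrist_image", wrist_image)):
        if img.shape != IMAGE_SHAPE:
            raise ValueError(f"{name} must have shape {IMAGE_SHAPE}, got {img.shape}")
        if img.dtype != np.uint8:
            raise ValueError(f"{name} must be uint8, got {img.dtype}")
    if state.shape != (STATE_DIM,):
        raise ValueError(f"state must have shape ({STATE_DIM},), got {state.shape}")

    # Indexing with None prepends a batch axis of size 1.
    return {
        "observation.images.image": image[None],              # (1, H, W, C)
        "observation.images.wrist_image": wrist_image[None],  # (1, H, W, C)
        "observation.state": state[None],                     # (1, state_dim)
        "task": [task],
    }


observation = make_observation(
    image=np.zeros(IMAGE_SHAPE, dtype=np.uint8),
    wrist_image=np.zeros(IMAGE_SHAPE, dtype=np.uint8),
    state=np.zeros(STATE_DIM),
    task="pick up the object",  # placeholder instruction
)
```

The resulting dictionary has the same keys and shapes as the hand-written one, so it can be passed to `hume.infer(observation)` in the same way.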
    

Citation

@article{song2025hume,
  title={Hume: Introducing System-2 Thinking in Visual-Language-Action Model},
  author={Anonymous Authors},
  journal={arXiv preprint arXiv:2505.21432},
  year={2025}
}
Model size: 3.99B params (Safetensors; tensor types F32, BF16)