# 🦾 Diffusion Policy for Push-T (200k Steps)
Summary: This model demonstrates the capabilities of Diffusion Policy on the precision-demanding Push-T task. It was trained using the LeRobot framework as part of a thesis research project benchmarking Imitation Learning algorithms.
- 🧩 Task: Push-T (Simulated)
- 🧠 Algorithm: Diffusion Policy (DDPM)
- 📈 Training Steps: 200,000 (Fine-tuned via Resume)
- 👤 Author: Graduate Student, UESTC (University of Electronic Science and Technology of China)
## 🔬 Benchmark Results (vs ACT)
Compared to the ACT baseline (which achieved a 0% success rate in our controlled experiments), this Diffusion Policy model demonstrates significantly better control precision and trajectory stability.
### 📊 Evaluation Metrics (50 Episodes)
| Metric | Value | Comparison to ACT Baseline | Status |
|---|---|---|---|
| Success Rate | 14.0% | Significant improvement (ACT: 0%) | 🏆 |
| Avg Max Reward | 0.81 | +58% higher precision (ACT: ~0.51) | 📈 |
| Avg Sum Reward | 130.46 | +147% more stable (ACT: ~52.7) | ✅ |
Note: The Push-T environment requires >95% target coverage for success. An average max reward of `0.81` indicates the policy consistently moves the block very close to the target position, demonstrating strong manipulation capability despite the strict success threshold.
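To make the three metrics concrete, here is a minimal sketch of how they relate, using placeholder reward traces; in a real evaluation the success flags would come from the environment's `info["is_success"]` signal:

```python
import numpy as np

# Placeholder per-episode reward traces, shape (n_episodes, n_steps).
# In gym-pusht the per-step reward reflects how much of the goal region
# the T block covers, so the episode maximum tracks the closest approach.
rewards = np.random.rand(50, 300)
successes = np.zeros(50, dtype=bool)  # from env info["is_success"] in practice

avg_max_reward = rewards.max(axis=1).mean()  # "Avg Max Reward" above
avg_sum_reward = rewards.sum(axis=1).mean()  # "Avg Sum Reward" above
success_rate = successes.mean()              # "Success Rate" above

print(f"Success rate:   {success_rate:.1%}")
print(f"Avg max reward: {avg_max_reward:.2f}")
print(f"Avg sum reward: {avg_sum_reward:.2f}")
```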
## ⚙️ Model Details
| Parameter | Value |
|---|---|
| Architecture | ResNet18 (Vision Backbone) + U-Net (Diffusion Head) |
| Prediction Horizon | 16 steps |
| Observation History | 2 steps |
| Action Steps | 8 steps |
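These three horizon parameters form a receding-horizon loop: the policy conditions on the last 2 observations, predicts 16 actions in one diffusion pass, executes the first 8, then re-plans. A minimal sketch of that loop, where `HorizonConfig` and `policy.predict` are illustrative stand-ins rather than LeRobot's actual API:

```python
from dataclasses import dataclass

@dataclass
class HorizonConfig:
    # Illustrative stand-in mirroring the table above
    # (not LeRobot's actual config class).
    n_obs_steps: int = 2     # observation history fed to the policy
    horizon: int = 16        # actions predicted per diffusion pass
    n_action_steps: int = 8  # actions executed before re-planning

def receding_horizon_rollout(policy, env, cfg: HorizonConfig, max_steps: int = 300):
    """Predict `horizon` actions, execute the first `n_action_steps`,
    then re-plan from the freshest observations."""
    obs, _ = env.reset()
    obs_history = [obs] * cfg.n_obs_steps
    for _ in range(max_steps // cfg.n_action_steps):
        actions = policy.predict(obs_history)  # shape: (horizon, action_dim)
        for action in actions[: cfg.n_action_steps]:
            obs, reward, terminated, truncated, info = env.step(action)
            obs_history = obs_history[1:] + [obs]  # slide the history window
            if terminated or truncated:
                return
```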
- Training Strategy:
  - Phase 1: Initial training (100,000 steps) -> Model: `Lemon-03/DP_PushT_test`
  - Phase 2: Resume/fine-tuning (+100,000 steps) -> Model: `Lemon-03/DP_PushT_test_Resume`
  - Total: 200,000 steps
## 🔧 Training Configuration (Reference)
For reproducibility, here are the key parameters used during the training session:
- Batch Size: 64
- Optimizer: AdamW (`lr=1e-4`)
- Scheduler: Cosine with warmup
- Vision: ResNet18 with random crop (84x84)
- Precision: Mixed Precision (AMP) enabled
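A minimal PyTorch sketch of this optimizer/scheduler/AMP setup; the warmup length and the stand-in `policy` module are assumptions for illustration, not values taken from the actual run:

```python
import math
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

device = "cuda" if torch.cuda.is_available() else "cpu"
policy = torch.nn.Linear(8, 2).to(device)  # stand-in for the diffusion policy
optimizer = AdamW(policy.parameters(), lr=1e-4)

warmup_steps, total_steps = 500, 100_000  # warmup length is an assumption

def cosine_with_warmup(step: int) -> float:
    # Linear warmup, then cosine decay of the LR multiplier toward zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, cosine_with_warmup)
scaler = torch.cuda.amp.GradScaler(enabled=device == "cuda")  # mixed precision

# One training step under autocast (batch size 64, as listed above):
obs = torch.randn(64, 8, device=device)
target = torch.randn(64, 2, device=device)
with torch.autocast(device_type=device, enabled=device == "cuda"):
    loss = torch.nn.functional.mse_loss(policy(obs), target)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
scheduler.step()
```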
### Original Training Command (Resume Mode)
```bash
python -m lerobot.scripts.lerobot_train \
    --policy.type diffusion \
    --env.type pusht \
    --dataset.repo_id lerobot/pusht \
    --wandb.enable true \
    --eval.batch_size 8 \
    --job_name DP_PushT_Resume \
    --policy.repo_id Lemon-03/DP_PushT_test_Resume \
    --policy.pretrained_path outputs/train/2025-12-02/14-33-35_DP_PushT/checkpoints/last/pretrained_model \
    --steps 100000
```
## 📊 Evaluation
Run the following command to evaluate a local training checkpoint for 50 episodes and save the visualization videos:
```bash
python -m lerobot.scripts.lerobot_eval \
    --policy.type diffusion \
    --policy.pretrained_path outputs/train/2025-12-04/14-47-37_DP_PushT_Resume/checkpoints/last/pretrained_model \
    --eval.n_episodes 50 \
    --eval.batch_size 10 \
    --env.type pusht \
    --env.task PushT-v0
```
To evaluate the published checkpoint directly from the Hugging Face Hub instead, run:
```bash
python -m lerobot.scripts.lerobot_eval \
    --policy.type diffusion \
    --policy.pretrained_path Lemon-03/DP_PushT_test_Resume \
    --eval.n_episodes 50 \
    --eval.batch_size 10 \
    --env.type pusht \
    --env.task PushT-v0
```
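As an alternative to the CLI, the checkpoint can also be loaded in Python and rolled out in `gym-pusht` directly. This sketch follows LeRobot's pretrained-policy evaluation example; import paths have moved between LeRobot versions, so adjust them to your install:

```python
import gymnasium as gym
import gym_pusht  # noqa: F401  (registers gym_pusht/PushT-v0)
import torch
# Import path assumes a recent LeRobot layout; may differ by version.
from lerobot.common.policies.diffusion.modeling_diffusion import DiffusionPolicy

policy = DiffusionPolicy.from_pretrained("Lemon-03/DP_PushT_test_Resume")
policy.eval()
policy.reset()  # clear the policy's internal action queue

env = gym.make(
    "gym_pusht/PushT-v0", obs_type="pixels_agent_pos", max_episode_steps=300
)
obs, _ = env.reset(seed=0)

done = False
while not done:
    # Pack the numpy observation into the batch format the policy expects.
    state = torch.from_numpy(obs["agent_pos"]).float().unsqueeze(0)
    image = torch.from_numpy(obs["pixels"]).float().permute(2, 0, 1) / 255
    batch = {"observation.state": state, "observation.image": image.unsqueeze(0)}
    with torch.no_grad():
        action = policy.select_action(batch)
    obs, reward, terminated, truncated, info = env.step(action.squeeze(0).numpy())
    done = terminated or truncated
```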