Commit History
draft commit for cpu_offload (#23) 10848ab unverified
Replace toy PP tests with real-model-based pipeline tests [skip-build] 67f7e11
Add correctness verification to PP tests using fully_shard [skip-build] a4d1f34
Remove correctness check from PP tests, focus on deadlock detection [skip-build] c0bbf2e
Add PP + dp_replicate deadlock regression tests [skip-build] cd587a6
Apply pre-commit formatting (isort) [skip-build] 96b287c
Add MoE uneven shard test with mixed expert and non-expert params [skip-build] bdada12
Add uneven shard correctness test [skip-build] 1a97671
Update tests for MoE and parallel optimizations [skip-build] 81f49fe
Add torch.compile, CUDA graph, and compiled momentum [skip-build] e74d98f
Apply suggestions from code review cdaaf4f
TaehyunKim Copilot committed on