Commit History

Replace cpu_offload constructor param with turn_on/turn_off API (#26)
05a75f1
unverified

wyldecat, Claude Opus 4.6 (1M context), github-actions[bot] committed on

Merge pull request #25 from MotifTechnologies/fix/invalidate_cache_adamw
b61425a
unverified

TaehyunKim committed on

Invalidate AdamW tensor caches on load_state_dict [skip-build]
89b6099

ca1207, Claude Opus 4.6 (1M context) committed on

draft commit for cpu_offload (#23)
10848ab
unverified

TaehyunKim, github-actions[bot], wyldecat, Claude Opus 4.6 (1M context) committed on

Replace toy PP tests with real-model-based pipeline tests [skip-build]
67f7e11

wyldecat, Claude Opus 4.6 committed on

Add correctness verification to PP tests using fully_shard [skip-build]
a4d1f34

wyldecat, Claude Opus 4.6 committed on

Remove correctness check from PP tests, focus on deadlock detection [skip-build]
c0bbf2e

wyldecat, Claude Opus 4.6 committed on

Add PP + dp_replicate deadlock regression tests [skip-build]
cd587a6

wyldecat, Claude Opus 4.6 committed on

Update fast path comment to reflect current behavior [skip-build]
7e33533

wyldecat, Claude Opus 4.6 committed on

Update comment to reflect use_local_synchronization behavior [skip-build]
3f5cf49

wyldecat, Claude Opus 4.6 committed on

Fix deadlock in construct_shard_mesh with PP + dp_replicate > 1
da7e5da

wyldecat, Claude Opus 4.6 committed on

Apply pre-commit formatting (isort) [skip-build]
96b287c

wyldecat, Claude Opus 4.6 committed on

Add MoE uneven shard test with mixed expert and non-expert params [skip-build]
bdada12

wyldecat, Claude Opus 4.6 committed on

Add uneven shard correctness test [skip-build]
1a97671

wyldecat, Claude Opus 4.6 committed on

Add optimization docs and update implementation guide [skip-build]
14040eb

wyldecat, Claude Opus 4.6 committed on

Update tests for MoE and parallel optimizations [skip-build]
81f49fe

wyldecat, Claude Opus 4.6 committed on

Muon optimizer: expert batching, parallel caching, A2A overlap [skip-build]
0f37d63

wyldecat, Claude Opus 4.6 committed on

Optimize pipeline: batched update, zero-copy scatter, prelaunch gather [skip-build]
2816b64

wyldecat, Claude Opus 4.6 committed on

Cache AdamW placement grouping and tensor lists [skip-build]
8ca2492

wyldecat, Claude Opus 4.6 committed on

Add torch.compile, CUDA graph, and compiled momentum [skip-build]
e74d98f

wyldecat, Claude Opus 4.6 committed on

Apply suggestions from code review
cdaaf4f

TaehyunKim, Copilot committed on

Add mhc_attn, mhc_ffn, lambda_proj to skip_keys
ba293d0

wyldecat, Claude Opus 4.6 committed on

Remove verbose param_groups summary logging
24f0957

wyldecat, Claude Opus 4.6 committed on

Support multi-component expert_keys (e.g. "experts.w1")
5a99e12

wyldecat, Claude Opus 4.6 committed on

Extract is_expert_param() helper to consolidate expert key matching
e615b1c

wyldecat, Claude Opus 4.6 committed on

Include original (pre-normalize) FQN in is_muon logging
135fc66

wyldecat, Claude Opus 4.6 committed on

Add info-level logging for param group classification (Muon vs AdamW)
1118752

wyldecat, Claude Opus 4.6 committed on

Use component-level matching for expert_keys to avoid shared_experts collision
f008017

wyldecat, Claude Opus 4.6 committed on

Normalize parameter FQNs to handle torch.compile / checkpoint wrappers
95a620f

wyldecat, Claude Opus 4.6 committed on

Merge pull request #17 from MotifTechnologies/optimal-ns-coefficients
b220459
unverified

dongseokmotif committed on

Apply pre-commit formatting (yapf) [skip-build]
bf30b9b

dongseokmotif, Claude Sonnet 4.6 committed on

Add max_iter cap and non-finite checks to _optimal_quintic [skip-build]
206b280

dongseokmotif committed on

Apply pre-commit formatting (yapf, isort) [skip-build]
aff01db

dongseokmotif committed on

Add comment explaining _coeffs_list and Polar Express vs former NS [skip-build]
abaa449

dongseokmotif, Claude Sonnet 4.6 committed on

Replace hardcoded NS coefficients with analytically optimal ones [skip-build]
573242f

dongseokmotif, Claude Sonnet 4.6 committed on

Refactor pipeline to async generator pattern (#16)
33929c0
unverified

wyldecat, github-actions[bot] committed on

Support mHC (#15)
ae32572
unverified

wyldecat, github-actions[bot] committed on

Update arxiv URL
fa059da

wyldecat committed on

Support param group with various placements (#13)
e2b41e5
unverified

wyldecat, github-actions[bot] committed on

Merge pull request #14 from MotifTechnologies/fix_bug_in_fsdp
5458c82
unverified

TaehyunKim committed on

Add built binary [skip-build]
6ec5093

github-actions[bot] committed on

fix bug in fsdp
811726c

ca1207 committed on

feat(workflow): add Slack notifications for build start, success, and failure [skip-build] (#12)
0b8d958
unverified

wyldecat committed on

Merge pull request #11 from MotifTechnologies/ca1207-patch-1
53deea3
unverified

TaehyunKim committed on

Add built binary [skip-build]
de5bead

github-actions[bot] committed on

Update torch-ext/optimizer/muon.py
b0230e7
unverified

TaehyunKim committed on

Update torch-ext/optimizer/muon.py
ff2fcfb
unverified

TaehyunKim committed on

Update muon.py
c16b438
unverified

TaehyunKim committed on

Merge pull request #10 from MotifTechnologies/fix_a2a_gs_assert
4f71bc9
unverified

TaehyunKim committed on