AI & ML interests

Fundamental AI Architectures, Mixture-of-Experts (MoE), Mixture-of-Collaboration (MoC), Efficient Language Models, Emergent Reasoning, Large-Scale Training, Open-Source AI, Computational Efficiency.


Organization Card

Architecting the future of efficient, reasoning-driven AI.


Our Philosophy

The path to greater artificial intelligence cannot be paved solely with more parameters and more data. While scale is a powerful tool, true progress lies in creating architectures that are fundamentally smarter, not just bigger.

At Auren Research, our mission is to explore and develop novel AI architectures that treat reasoning not as an emergent property of brute-force scale, but as a core computational capability. We focus on building systems that think more deeply and efficiently, making state-of-the-art AI more accessible and sustainable.


Research Focus

Our work is centered on three technical pillars:

Mixture-of-Collaboration (MoC) Our core architectural contribution. In standard Mixture-of-Experts, expert sub-networks process tokens independently and their outputs are merged through a weighted sum. In MoC, experts collaborate through a learned mediator that aggregates cross-expert information and optionally feeds refined signals back before output fusion. This enables richer expert interaction at O(K) cost per token, where K is the number of active experts.
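As a rough illustration, the mediator step can be sketched in pure Python on scalar hidden states. The names here (`MoCLayer`, `mediator`) and the mean-based aggregation are illustrative assumptions, not the Lunaris implementation:

```python
import math

# Toy Mixture-of-Collaboration step on scalar hidden states.
# The class/function names and the mean-mediator are illustrative only.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class MoCLayer:
    def __init__(self, experts, k=2):
        self.experts = experts  # callables: hidden state -> hidden state
        self.k = k              # number of active experts per token

    def __call__(self, x, router_logits):
        weights = softmax(router_logits)
        # Route: keep the top-K experts by router weight.
        topk = sorted(range(len(weights)), key=lambda i: -weights[i])[: self.k]
        # Independent expert proposals (standard MoE stops here).
        proposals = {i: self.experts[i](x) for i in topk}
        # Mediator aggregates cross-expert information (here: a plain mean).
        mediator = sum(proposals.values()) / len(proposals)
        # Feedback: each expert's proposal is refined with the mediator signal.
        refined = {i: 0.5 * (proposals[i] + mediator) for i in topk}
        # Weighted fusion over the renormalized top-K routing weights.
        z = sum(weights[i] for i in topk)
        return sum(weights[i] / z * refined[i] for i in topk)

experts = [lambda x: x + 1.0, lambda x: 3.0 * x, lambda x: -x]
layer = MoCLayer(experts, k=2)
y = layer(1.0, router_logits=[2.0, 1.0, -1.0])  # only experts 0 and 1 fire
```

Because only K experts run and the mediator touches only those K outputs, the extra collaboration work stays O(K) per token, matching the cost claim above.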

Iterative Reasoning Layers (IRL) A mechanism that increases effective network depth on a per-token basis without adding parameters. Each expert refines its representation across multiple forward passes through shared weights, allowing the model to allocate more computation to tokens that require it. Combined with adaptive gating, the model learns to skip unnecessary iterations for trivial tokens.
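A minimal sketch of the per-token iteration idea, assuming a halting gate of the rough shape described above (`step` and `should_halt` are stand-ins for the shared-weight refinement and the learned gate, not the actual model components):

```python
# Toy Iterative Reasoning Layer: one shared step function applied a
# variable number of times per input; names are illustrative only.

def iterative_reasoning(x, step, should_halt, max_iters=8):
    for i in range(1, max_iters + 1):
        nxt = step(x)            # same weights reused on every pass
        if should_halt(x, nxt):  # adaptive gate: stop early on easy inputs
            return nxt, i
        x = nxt
    return x, max_iters

# A contraction toward the fixed point 2.0 stands in for "refinement".
step = lambda x: 0.5 * x + 1.0
halt = lambda prev, cur: abs(cur - prev) < 0.05

easy, easy_iters = iterative_reasoning(2.1, step, halt)   # already near the answer
hard, hard_iters = iterative_reasoning(10.0, step, halt)  # far from it
```

The easy input halts after a couple of passes while the hard one uses the full budget, which is the behavior the adaptive gating is meant to learn.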

Adaptive Compute Allocation Both reasoning depth and collaboration rounds are governed by learned gates that decide per token how much computation to allocate. This produces models that match fixed-depth performance while using significantly less average compute — the model discovers on its own which tokens need deep processing and which do not.
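One way to picture the allocation step, as a sketch: a gate score in [0, 1] per token is mapped to a discrete step count. The difficulty signal here (token length) is a stub purely for illustration; in the model the gate is learned:

```python
# Toy adaptive compute allocation: a per-token gate score in [0, 1]
# chooses how many reasoning/collaboration steps that token receives.

def allocate_steps(gate_score, min_steps=1, max_steps=4):
    steps = min_steps + round(gate_score * (max_steps - min_steps))
    return max(min_steps, min(max_steps, steps))

tokens = ["the", "a", "electroencephalography", "of", "thermodynamics"]
gate = lambda tok: min(len(tok) / 20.0, 1.0)   # stand-in difficulty signal
budget = [allocate_steps(gate(t)) for t in tokens]

avg_steps = sum(budget) / len(budget)  # well below the fixed maximum of 4
```

Averaged over a corpus, most tokens land near the minimum, so mean compute sits well under the fixed-depth ceiling while hard tokens still get the full budget.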


Open Research

We believe that fundamental progress is accelerated through open collaboration and rigorous community review. Our foundational work — from datasets to model architectures to full experiment logs — is shared publicly.

Active Projects

| Project | Description | Status |
|---|---|---|
| Lunaris | Decoder-only Transformer featuring MoC and IRL. Full architecture, training pipeline, and reproducible experiment configs. | v0.1.0-beta |

Public Datasets

| Dataset | Description |
|---|---|
| pretrain-mix-150b | 150B-token pre-training corpus. Curated mix of educational web text (67%), source code (18%), and mathematical documents (15%). |
| lunaris-sft | 115K+ instruction-following examples for supervised fine-tuning. Multi-model generated with AI-based quality filtering. |

Experiment Logs

All training runs are logged publicly on Weights & Biases for full transparency and reproducibility: wandb.ai/lunaris-moc-validation


Validated Results

Initial controlled experiments at 64M parameters, same data, same seed, same compute budget:

| Architecture | Active Params/Token | Validation Perplexity |
|---|---|---|
| Dense Transformer | 35.8M | 72.24 |
| Standard MoE (top-2) | 50.0M | 62.89 |
| MoC v1 | 50.0M | 60.28 |
| MoC vNext (adaptive) | 50.0M | 59.97 |

MoC vNext achieves 4.6% lower validation perplexity than standard MoE under identical conditions (4.2% for MoC v1). Full ablation methodology and results are available in the Lunaris repository.
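As a quick sanity check, the headline number follows directly from the table:

```python
# Relative perplexity reduction vs. the standard MoE baseline,
# using the validation perplexities from the table above.
moe, moc_v1, moc_vnext = 62.89, 60.28, 59.97

def reduction(baseline, value):
    return (baseline - value) / baseline * 100.0

v1_gain = reduction(moe, moc_v1)        # ~4.2% for MoC v1
vnext_gain = reduction(moe, moc_vnext)  # ~4.6% for the adaptive variant
```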


Objective

Design, train, and release open-source foundational models that demonstrate how architectural innovation — not just scale — can advance reasoning capabilities in language models. Our near-term goal is validating MoC at 600M to 1B parameters.


Independent AI research from Brazil.
