v2_vi2en - Vietnamese-English Translation

Model Description

Improved Vi→En training with label smoothing and AdamW

This model was trained from scratch using the Transformer architecture for machine translation.

Model Details

  • Language pair: Vietnamese → English
  • Architecture: Transformer (Encoder-Decoder)
  • Parameters (a rough size estimate is sketched after this list):
    • d_model: 512
    • n_heads: 8
    • n_encoder_layers: 6
    • n_decoder_layers: 6
    • d_ff: 2048
    • dropout: 0.1
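
As a rough sanity check on model size, these hyperparameters imply on the order of 44M weight-matrix parameters before embeddings (the embedding count depends on the vocabulary sizes, which are not listed here). A minimal back-of-the-envelope sketch, counting only the attention and feed-forward weight matrices:

# Back-of-the-envelope parameter count for the configuration above.
# Biases, layer norms, and embeddings are excluded, so this is a
# lower-bound estimate, not the exact model size.
d_model, d_ff = 512, 2048
n_enc_layers, n_dec_layers = 6, 6

attn = 4 * d_model * d_model      # Q, K, V, and output projections
ffn = 2 * d_model * d_ff          # the two feed-forward matrices

enc_layer = attn + ffn            # self-attention + FFN
dec_layer = 2 * attn + ffn        # self-attention + cross-attention + FFN

total = n_enc_layers * enc_layer + n_dec_layers * dec_layer
print(f"~{total / 1e6:.0f}M parameters (weight matrices only)")  # ~44M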

Training Details

  • Optimizer: AdamW
  • Learning Rate: 0.0001
  • Batch Size: 32
  • Label Smoothing: 0.1
  • Scheduler: learning-rate warmup (see the sketch after this list)
  • Dataset: IWSLT 2015 Vi-En
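
In PyTorch, this setup maps onto AdamW plus a label-smoothed cross-entropy loss. The following is a minimal sketch, not the project's actual training script: the warmup step count and weight-decay value are assumptions, since the card lists only "warmup" and "AdamW".

import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(8, 8)  # stand-in; in practice, the Transformer from the Usage section

warmup_steps = 4000  # assumption: not stated on the card

# AdamW with the listed learning rate; the weight-decay value is an assumption
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)

# Cross-entropy with the listed label smoothing; padding (pad_idx=0 per Model Details) is ignored
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=0)

# "warmup" scheduler, assumed here to ramp linearly to the target LR, then hold
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps))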

Performance Improvements

  • Label smoothing (0.1)
  • AdamW optimizer with weight decay
  • Beam search (size=5)
  • Gradient accumulation (sketched below)
  • Early stopping (sketched below)
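
Gradient accumulation and early stopping follow the standard PyTorch pattern. A minimal sketch under assumed values (the accumulation factor of 4 and patience of 3 epochs are illustrative, not taken from the actual run):

import torch

model = torch.nn.Linear(8, 8)                 # stand-in for the Transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()                # stand-in loss for the sketch
accum_steps = 4                               # assumption: effective batch = 32 * 4
best_val, patience, bad_epochs = float('inf'), 3, 0

for epoch in range(100):
    for step in range(64):                    # stand-in data loop
        x, y = torch.randn(32, 8), torch.randn(32, 8)
        loss = criterion(model(x), y) / accum_steps  # scale so accumulated grads average
        loss.backward()                       # gradients add up across micro-batches
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

    val_loss = criterion(model(torch.randn(32, 8)), torch.randn(32, 8)).item()
    if val_loss < best_val:                   # early stopping on validation loss
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break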

Usage

# Load model and translate
from src.models.transformer import Transformer
from src.inference.translator import Translator
from src.data.vocabulary import Vocabulary
import torch

# Load vocabularies
src_vocab = Vocabulary.load('src_vocab.json')
tgt_vocab = Vocabulary.load('tgt_vocab.json')

# Load model
model = Transformer(
    src_vocab_size=len(src_vocab),
    tgt_vocab_size=len(tgt_vocab),
    d_model=512,
    n_heads=8,
    n_encoder_layers=6,
    n_decoder_layers=6,
    d_ff=2048,
    dropout=0.1,
    max_seq_length=512,
    pad_idx=0
)

# Load trained weights
checkpoint = torch.load('best_model.pt')
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # disable dropout for inference

# Create translator
translator = Translator(
    model=model,
    src_vocab=src_vocab,
    tgt_vocab=tgt_vocab,
    device='cuda',
    decoding_method='beam',
    beam_size=5
)

# Translate
vietnamese_text = "Xin chào, bạn khỏe không?"  # "Hello, how are you?"
translation = translator.translate(vietnamese_text)
print(translation)
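
For a quick speed/quality trade-off, the same Translator can be rebuilt with greedy decoding. Note that decoding_method='greedy' is an assumption about this project's API; the card itself only shows 'beam':

# Assumption: the Translator also accepts decoding_method='greedy'
greedy_translator = Translator(
    model=model,
    src_vocab=src_vocab,
    tgt_vocab=tgt_vocab,
    device='cuda',
    decoding_method='greedy'
)
print(greedy_translator.translate(vietnamese_text))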

Training Data

  • Dataset: IWSLT 2015 Vietnamese-English parallel corpus
  • Training pairs: ~500,000 sentence pairs
  • Validation pairs: ~50,000 sentence pairs
  • Test pairs: ~3,000 sentence pairs

Limitations

  • Trained specifically for Vietnamese to English translation
  • Performance may vary on out-of-domain text
  • Medical/technical domains may require fine-tuning

Citation

@misc{nlp-transformer-mt,
  author = {MothMalone},
  title = {Transformer Machine Translation Vi-En},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/MothMalone}}
}