# GECToR Base 2020 (ONNX)
Quantized ONNX version of Grammarly's original GECToR model, for browser-based grammatical error correction with Transformers.js.
## Original Model
- Source: Grammarly GECToR
- Paper: GECToR – Grammatical Error Correction: Tag, Not Rewrite (BEA Workshop 2020)
- Architecture: RoBERTa-Base + token classification head
- Parameters: ~125M
## Conversion Details
- Format: ONNX
- Quantization: INT8 (dynamic quantization)
- Size: ~125MB
- Conversion method: Manual export from PyTorch (AllenNLP format)
## How It Works
GECToR uses a token classification approach: instead of generating corrected text, it predicts an edit operation for each token:

- `$KEEP`: keep the token unchanged
- `$DELETE`: remove the token
- `$REPLACE_word`: replace the token with a specific word
- `$APPEND_word`: append a word after the token
- `$TRANSFORM_*`: apply a transformation (case, verb form, etc.)
The model runs iteratively (typically 2-3 passes) until no more edits are predicted.
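To make the tag semantics concrete, here is a minimal sketch of applying one pass of predictions to a whitespace-tokenized sentence. `applyEdits` is a hypothetical helper, and the exact tag strings are assumed from the list above; the labels this model actually emits may differ.

```js
// Minimal sketch: apply one pass of GECToR-style edit tags to a token array.
// Tag formats are assumed from the list above; treat as illustrative.
function applyEdits(tokens, tags) {
  const out = [];
  tokens.forEach((token, i) => {
    const tag = tags[i];
    if (tag === '$KEEP') {
      out.push(token);
    } else if (tag === '$DELETE') {
      // drop the token
    } else if (tag.startsWith('$REPLACE_')) {
      out.push(tag.slice('$REPLACE_'.length));
    } else if (tag.startsWith('$APPEND_')) {
      out.push(token, tag.slice('$APPEND_'.length));
    } else {
      // $TRANSFORM_* (case changes, verb forms, etc.) omitted for brevity
      out.push(token);
    }
  });
  return out;
}

// "He go to school yesterday ." -> "He went to school yesterday ."
applyEdits(
  ['He', 'go', 'to', 'school', 'yesterday', '.'],
  ['$KEEP', '$REPLACE_went', '$KEEP', '$KEEP', '$KEEP', '$KEEP']
);
```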
## Usage with Transformers.js
```js
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'token-classification',
  'YOUR_USERNAME/gector-base-2020',
  { dtype: 'q8' },
);

const result = await classifier('He go to school yesterday.');
// Returns per-token predictions with edit tags
```
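Because the model outputs edits rather than corrected text, inference typically loops until a pass predicts only `$KEEP`. A hedged sketch of that loop, reusing the hypothetical `applyEdits` helper from above; the `entity` field name is an assumption about the pipeline's output shape, so verify it against the actual results:

```js
// Sketch of the iterative correction loop. Assumes each prediction exposes
// its tag as `entity` (an assumption; check the real field names). Real
// GECToR inference also aligns subword predictions back to whitespace
// tokens, which is glossed over here.
async function correct(classifier, text, maxPasses = 3) {
  let tokens = text.split(' ');
  for (let pass = 0; pass < maxPasses; pass++) {
    const preds = await classifier(tokens.join(' '));
    const tags = preds.map((p) => p.entity);
    if (tags.every((t) => t === '$KEEP')) break; // converged: no edits left
    tokens = applyEdits(tokens, tags);
  }
  return tokens.join(' ');
}
```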
## Performance
Faster than the 2024 version, with slightly lower accuracy; a good balance of speed and quality.
## License

Apache 2.0 (following the original model's license).