2.13.4

@xenova released this 04 Jan 17:31

What's new?

  • Add support for cross-encoder models (+fix token type ids) (#501)

    Example: Information retrieval with Xenova/ms-marco-TinyBERT-L-2-v2.

    import { AutoTokenizer, AutoModelForSequenceClassification } from '@xenova/transformers';

    // Load the cross-encoder and its matching tokenizer.
    const model = await AutoModelForSequenceClassification.from_pretrained('Xenova/ms-marco-TinyBERT-L-2-v2');
    const tokenizer = await AutoTokenizer.from_pretrained('Xenova/ms-marco-TinyBERT-L-2-v2');

    // Pair the same query with each candidate passage via `text_pair`.
    const features = tokenizer(
        ['How many people live in Berlin?', 'How many people live in Berlin?'],
        {
            text_pair: [
                'Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
                'New York City is famous for the Metropolitan Museum of Art.',
            ],
            padding: true,
            truncation: true,
        }
    );

    // One logit per (query, passage) pair; a higher value means a more relevant passage.
    const { logits } = await model(features);
    console.log(logits.data);
    // quantized:   [ 7.210887908935547, -11.559350967407227 ]
    // unquantized: [ 7.235750675201416, -11.562294006347656 ]

    Check out the list of pre-converted models here. We also put out a demo for you to try out.
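
The logits in the example above are raw relevance scores, so a typical next step is to rank the passages by them. The following is a minimal sketch of that step, not part of the library: `rankPassages` and `sigmoid` are hypothetical helpers, the logits are the quantized values quoted above, and mapping them through a sigmoid assumes the model emits a single relevance logit per pair.

```javascript
// Hypothetical helper (not part of @xenova/transformers): pair each passage
// with its cross-encoder logit and sort from most to least relevant.
function rankPassages(passages, logits) {
    return Array.from(logits)
        .map((score, i) => ({ text: passages[i], score }))
        .sort((a, b) => b.score - a.score);
}

// Logistic function, mapping a raw logit into the (0, 1) range.
// Assumption: one relevance logit per (query, passage) pair.
function sigmoid(x) {
    return 1 / (1 + Math.exp(-x));
}

const passages = [
    'Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
    'New York City is famous for the Metropolitan Museum of Art.',
];

// Quantized logits copied from the release example; in practice, pass `logits.data`.
const logits = [7.210887908935547, -11.559350967407227];

const ranked = rankPassages(passages, logits);
const probabilities = ranked.map(({ score }) => sigmoid(score));

console.log(ranked[0].text);
// Prints the Berlin passage, which has the higher logit.
```

Sorting on the raw logits already gives the ranking; the sigmoid is only useful when a bounded score is needed, for example to apply a relevance threshold.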

Full Changelog: 2.13.3...2.13.4