 

Difference between AutoModelForSeq2SeqLM and AutoModelForCausalLM

As per the title, how do these two Auto Classes from Hugging Face differ from each other? I read the documentation but did not find anything that distinguishes them.

mm256 asked Dec 30 '25 14:12



1 Answer

Intuitively, AutoModelForSeq2SeqLM is used for language models with an encoder-decoder architecture, like T5 and BART, while AutoModelForCausalLM is used for decoder-only, auto-regressive language models, like the GPT models.
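To make the architectural difference a bit more concrete, here is a toy sketch of which positions may attend to which in each case. It is plain Python, not Hugging Face code, and the helper names (causal_mask, full_mask, show) are made up for this illustration.

```python
def causal_mask(n):
    """Causal (decoder-only) self-attention: token i may look at tokens 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]


def full_mask(rows, cols):
    """Unrestricted attention: every query position may look at every key position."""
    return [[True] * cols for _ in range(rows)]


def show(title, mask):
    """Print a mask as a grid: 'x' = may attend, '.' = blocked."""
    print(title)
    for row in mask:
        print("  " + " ".join("x" if allowed else "." for allowed in row))


# A causal LM reads prompt and continuation as one sequence, left to right.
show("Causal LM self-attention (4 tokens):", causal_mask(4))

# An encoder-decoder model first encodes the whole input bidirectionally...
show("Seq2seq encoder self-attention (3 input tokens):", full_mask(3, 3))

# ...then its decoder generates the output causally, while every output
# position can also look at the full encoded input via cross-attention.
show("Seq2seq decoder self-attention (2 output tokens):", causal_mask(2))
show("Seq2seq cross-attention (2 output -> 3 input tokens):", full_mask(2, 3))
```

In short, a causal LM treats everything as one left-to-right sequence, whereas a seq2seq model keeps the input and the generated output in separate stacks.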

These two classes are convenience APIs that automatically resolve the concrete model class for each of the two model types. For example, AutoModelForCausalLM.from_pretrained('gpt2') returns a GPT2LMHeadModel. You can see every model class they can resolve to in the source code (MODEL_FOR_CAUSAL_LM_MAPPING and MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING).
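The dispatch idea can be sketched in a few lines. This is a simplified toy, not the actual transformers implementation: every name starting with Toy or TOY_ is hypothetical, and the real library looks up the model class from the checkpoint's config rather than from a plain string.

```python
# Toy stand-ins for concrete model classes (hypothetical, no weights involved).
class ToyGPT2LMHeadModel:
    pass


class ToyT5ForConditionalGeneration:
    pass


class ToyBartForConditionalGeneration:
    pass


# One registry per task, keyed by model type, playing the role of the
# per-task mappings mentioned above.
TOY_CAUSAL_LM_MAPPING = {"gpt2": ToyGPT2LMHeadModel}
TOY_SEQ2SEQ_LM_MAPPING = {
    "t5": ToyT5ForConditionalGeneration,
    "bart": ToyBartForConditionalGeneration,
}


class ToyAutoModel:
    """Base class: look up the concrete class for a model type and build it."""

    mapping = {}

    @classmethod
    def from_model_type(cls, model_type):
        try:
            model_cls = cls.mapping[model_type]
        except KeyError:
            raise ValueError(
                f"{cls.__name__} has no model registered for {model_type!r}"
            ) from None
        return model_cls()


class ToyAutoModelForCausalLM(ToyAutoModel):
    mapping = TOY_CAUSAL_LM_MAPPING


class ToyAutoModelForSeq2SeqLM(ToyAutoModel):
    mapping = TOY_SEQ2SEQ_LM_MAPPING


print(type(ToyAutoModelForCausalLM.from_model_type("gpt2")).__name__)
print(type(ToyAutoModelForSeq2SeqLM.from_model_type("t5")).__name__)
```

Asking the wrong auto class for a model type, e.g. ToyAutoModelForCausalLM.from_model_type("t5"), raises a ValueError in this sketch, because each auto class only knows the architectures registered for its own task.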

Xinzhe Li answered Jan 05 '26 09:01



