I'm trying to load a model with 8-bit quantization like this:
from transformers import LlamaForCausalLM, BitsAndBytesConfig

model_path = '/model/'
model = LlamaForCausalLM.from_pretrained(
    model_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
but I get the following error:
ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes `pip install -i https://test.pypi.org/simple/ bitsandbytes` or pip install bitsandbytes`
But I've installed both, and I still get the same error, even after shutting down and restarting the Jupyter kernel I was using.
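One common cause of this error is that pip installed the packages into a different Python environment than the one the Jupyter kernel runs in. A minimal sketch to check this (the package names are the real ones from the error; the check itself is just a diagnostic, not part of any library API):

```python
# Diagnostic sketch: confirm accelerate and bitsandbytes are visible
# to the same interpreter the Jupyter kernel is using.
import sys
import importlib.util

print(sys.executable)  # path of the interpreter running this kernel

for pkg in ("accelerate", "bitsandbytes"):
    spec = importlib.util.find_spec(pkg)  # None if the package is not importable
    print(pkg, "found" if spec else "NOT found")
```

If either package prints "NOT found", install it with that exact interpreter, e.g. `sys.executable -m pip install accelerate bitsandbytes`, then restart the kernel.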
What fixed it for me was downgrading the transformers library to version 4.30 using the following command:

pip install transformers==4.30
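After downgrading, it's worth confirming which transformers version the environment actually resolves to before restarting the kernel. A small sketch using the standard library's `importlib.metadata` (this avoids importing transformers itself):

```python
# Check the installed transformers version without importing the library.
from importlib.metadata import version, PackageNotFoundError

try:
    print("transformers", version("transformers"))  # expect a 4.30.x version
except PackageNotFoundError:
    print("transformers is not installed in this environment")
```

Remember to restart the Jupyter kernel afterwards; an already-imported copy of the old version stays loaded until the kernel restarts.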