Description
(Model: ChatGLM3-6B. The conda env is the old ChatGLM2 one, so please ignore the environment name.)

In both cli_demo.py and web_demo2.py I made the following change to switch to dual GPUs, and both report the error below. Environment: Ubuntu, dual RTX 4090:
```python
def get_model():
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
    # Multi-GPU support: use the two lines below instead of the line above,
    # and change num_gpus to your actual number of GPUs
    from utils import load_model_on_gpus
    model = load_model_on_gpus(model_path, num_gpus=2)
    model = model.eval()
    return tokenizer, model
```
```
2023-10-30 16:04:08.341 Uncaught app exception
Traceback (most recent call last):
  File "/home/rd/anaconda3/envs/chatglm2/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
    exec(code, module.__dict__)
  File "/home/rd/LLMs/ChatGLM3/web_demo2.py", line 72, in <module>
    for response, history, past_key_values in model.stream_chat(
  File "/home/rd/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/rd/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1077, in stream_chat
    response, new_history = self.process_response(response, history)
  File "/home/rd/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1003, in process_response
    metadata, content = response.split("\n", maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
```
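For reference, the unpack error itself can be reproduced without loading the model: `process_response` in modeling_chatglm.py assumes the generated response contains a `"\n"` separating a metadata line from the content, and `str.split(..., maxsplit=1)` returns a single-element list when the separator is absent, so the 2-tuple unpacking fails. A minimal sketch, where the guard at the end is only a hypothetical workaround on the caller's side, not the upstream fix:

```python
# str.split with maxsplit=1 yields one element when no "\n" is present,
# so unpacking into (metadata, content) raises ValueError.
response = "hello"  # a generated response with no metadata separator
try:
    metadata, content = response.split("\n", maxsplit=1)
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 1)

# Hypothetical defensive guard: treat the whole string as content
# when no metadata line is present.
if "\n" in response:
    metadata, content = response.split("\n", maxsplit=1)
else:
    metadata, content = "", response
print((metadata, content))  # ('', 'hello')
```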