chatglm-6b: Manual Download and Local Deployment

Post author: xfxia
Post published: September 18, 2023
Post category: Other

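ChatGLM-6B is an open-source conversational language model jointly developed by Tsinghua University and Zhipu AI. It supports question answering in both Chinese and English and is specifically optimized for Chinese.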

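The model is based on the General Language Model (GLM) architecture and has 6.2 billion parameters. With model quantization, it can be deployed locally on consumer-grade graphics cards; at the INT4 quantization level it needs as little as 6 GB of GPU memory. ChatGLM-6B uses the same technology as ChatGLM and is optimized for Chinese question answering and dialogue. After bilingual Chinese-English training on about 1T tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align quite well with human preferences.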
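To put the memory figures in perspective, here is a rough back-of-the-envelope sketch of how much space the weights alone occupy at different precisions. It assumes every one of the 6.2 billion parameters is stored at the given bit width, which is a simplification; the helper name `weight_gib` is made up for this illustration. The 6 GB quoted for INT4 is larger than the raw weight size because inference also needs room for activations and other runtime overhead.

```python
PARAMS = 6.2e9  # 6.2 billion parameters, as stated in the article

# Bits used to store one weight at each precision level.
BITS_PER_WEIGHT = {"FP16": 16, "INT8": 8, "INT4": 4}


def weight_gib(num_params: float, bits: int) -> float:
    """Approximate size of the weights alone, in GiB (illustrative helper)."""
    return num_params * bits / 8 / 2**30


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: {weight_gib(PARAMS, bits):.1f} GiB")
# FP16: 11.5 GiB
# INT8: 5.8 GiB
# INT4: 2.9 GiB
```

Halving the bit width halves the weight footprint, which is why INT4 fits on cards where FP16 does not.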

Model demo

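>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])  # prompt: "Hello"
>>> print(response)
你好👋！我是人工智能助手 ChatGLM-6B，很高兴见到你，欢迎问我任何问题。
(English: Hello 👋! I am the AI assistant ChatGLM-6B. Nice to meet you, and feel free to ask me any questions.)
>>> response, history = model.chat(tokenizer, "介绍一下你自己", history=history)  # prompt: "Introduce yourself"
>>> print(response)
我是人工智能助手 ChatGLM-6B，由清华大学 KEG 实验室和智谱 AI 公司于 2023年 共同训练开发而成，
具有语言理解、文本生成和对话等功能。我的任务是针对用户的问题和要求提供适当的答复和支持。
(English: I am the AI assistant ChatGLM-6B, jointly trained and developed in 2023 by the KEG Lab at Tsinghua University and Zhipu AI. I offer language understanding, text generation, and dialogue capabilities. My task is to provide appropriate answers and support in response to users' questions and requests.)
>>> response, history = model.chat(tokenizer, "推荐一部电影", history=history)  # prompt: "Recommend a movie"
>>> print(response)
如果喜欢动作片，可以推荐《速度与激情》系列。
这是一部非常经典的电影，讲述了一群高速赛车手在赛车世界中的奋斗和冒险，同时也展现了他们的友谊、家庭和人生价值观。
这部电影非常引人入胜，具有非常强烈的情感色彩，同时也具有非常精彩的动作场面。
(English: If you like action films, I can recommend the Fast & Furious series. It is a real classic that follows a group of high-speed racers through their struggles and adventures in the racing world, while also showing their friendship, family, and values. The films are highly engaging, emotionally intense, and full of spectacular action scenes.)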
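The demo keeps the conversation going by feeding the `history` returned from each call into the next one. The sketch below isolates that pattern so it can be run without the model. `run_conversation` and `echo_chat` are hypothetical names introduced here, `echo_chat` is only a stand-in for the real model, and the history is assumed to be a list of (prompt, response) pairs.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]
ChatFn = Callable[[str, History], Tuple[str, History]]


def run_conversation(chat_fn: ChatFn, prompts: List[str]) -> History:
    """Send prompts one by one, passing the returned history into the next turn."""
    history: History = []
    for prompt in prompts:
        response, history = chat_fn(prompt, history)
        print(f"User: {prompt}\nBot:  {response}")
    return history


def echo_chat(prompt: str, history: History) -> Tuple[str, History]:
    """Hypothetical stand-in for the model: echoes the prompt and extends history."""
    response = f"(turn {len(history) + 1}) you said: {prompt}"
    return response, history + [(prompt, response)]


final_history = run_conversation(echo_chat, ["Hello", "Recommend a movie"])
```

With the real model loaded, the same loop could be driven by `lambda p, h: model.chat(tokenizer, p, history=h)` in place of `echo_chat`.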

Steps

Step 1

Download the model from

https://huggingface.co/THUDM/chatglm-6b/tree/main

Download every file listed there.


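Step 2

If downloading from Hugging Face is slow, some of the files can be downloaded from here instead:

https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/

Download all of the model parameter files and use them to replace the corresponding files in the folder from Step 1.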
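Since the files come from two different sources, it can be worth confirming that no weight shard is missing before trying to load the model. The sketch below assumes the folder contains the standard Transformers shard index `pytorch_model.bin.index.json`, whose `weight_map` maps each parameter name to the shard file that stores it; whether this particular repository ships that exact file is an assumption, and `missing_weight_files` is a hypothetical helper name.

```python
import json
from pathlib import Path


def missing_weight_files(model_dir, index_name="pytorch_model.bin.index.json"):
    """Return the shard files named in the index that are absent from model_dir."""
    model_dir = Path(model_dir)
    index_path = model_dir / index_name
    if not index_path.is_file():
        # Without the index nothing can be checked, so report it as missing.
        return [index_name]
    with index_path.open(encoding="utf-8") as f:
        weight_map = json.load(f)["weight_map"]
    # Many parameters share a shard, so deduplicate the file names.
    needed = sorted(set(weight_map.values()))
    return [name for name in needed if not (model_dir / name).is_file()]
```

Calling `missing_weight_files("D:\\chatglm-6b")` should return an empty list once everything is in place. Note that this only checks that the files exist, not that they downloaded completely.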


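Step 3

Save all of the files above in a folder named chatglm-6b.

To use the model, run the code below, changing the path in the code to your own local path.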

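GPU version

from transformers import AutoTokenizer, AutoModel
# Load the tokenizer and model from the local folder instead of the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("D:\\chatglm-6b", trust_remote_code=True)
# .half() converts the weights to FP16; .cuda() moves the model to the GPU.
model = AutoModel.from_pretrained("D:\\chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()  # switch to inference mode
response, history = model.chat(tokenizer, "你好", history=[])  # prompt: "Hello"
print(response)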

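CPU version

from transformers import AutoTokenizer, AutoModel
# Load the tokenizer and model from the local folder instead of the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("D:\\chatglm-6b", trust_remote_code=True)
# .float() keeps the weights in FP32 and the model stays on the CPU.
model = AutoModel.from_pretrained("D:\\chatglm-6b", trust_remote_code=True).float()
model = model.eval()  # switch to inference mode
response, history = model.chat(tokenizer, "你好", history=[])  # prompt: "Hello"
print(response)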

The code runs successfully.
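The GPU and CPU versions differ only in the conversion applied after loading, so they can be folded into one helper. This is a minimal sketch rather than part of the original article: `choose_mode` and `load_chatglm` are hypothetical names, and `transformers` is imported inside the function so that the mode-selection logic can be exercised on a machine without it.

```python
def choose_mode(cuda_available: bool) -> str:
    """Return "gpu" when CUDA is usable, otherwise "cpu"."""
    return "gpu" if cuda_available else "cpu"


def load_chatglm(model_dir: str, mode: str):
    """Load tokenizer and model from a local folder in the requested mode."""
    if mode not in ("gpu", "cpu"):
        raise ValueError(f"mode must be 'gpu' or 'cpu', got {mode!r}")

    # Imported lazily so the helpers work without transformers installed.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True)
    if mode == "gpu":
        model = model.half().cuda()  # FP16 on the GPU, as in the GPU version
    else:
        model = model.float()        # FP32 on the CPU, as in the CPU version
    return tokenizer, model.eval()
```

With PyTorch installed, a call such as `load_chatglm(r"D:\chatglm-6b", choose_mode(torch.cuda.is_available()))` would pick the variant automatically. The raw string `r"D:\chatglm-6b"` is the same path as `"D:\\chatglm-6b"`.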

Original article: https://blog.csdn.net/weixin_45751925/article/details/131589057
