Text2SQL Fine-Tuning of the Qwen2 Model and GGUF Quantization

Fine-Tuning Qwen2-1.5B

Preparing the Python environment

conda create --name llama_factory python=3.11
conda activate llama_factory

Deploying LLaMA-Factory

git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip3 install -e ".[torch,metrics]"
# To enable quantized LoRA (QLoRA) on Windows, install the prebuilt bitsandbytes wheel
pip3 install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
# Install PyTorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# If startup fails with ImportError: cannot import name 'get_full_repo_name' from 'huggingface_hub', install chardet
pip3 install chardet

Run python src/webui.py to launch the web UI.

Text2SQL Fine-Tuning

(1) Prepare the sample dataset, e.g.:

{"db_id": "department_management","instruction": "I want you to act as a SQL terminal in front of an example database, you need only to return the sql command to me.Below is an instruction that describes a task, Write a response that appropriately completes the request.\n\"\n##Instruction:\ndepartment_management contains tables such as department, head, management. Table department has columns such as Department_ID, Name, Creation, Ranking, Budget_in_Billions, Num_Employees. Department_ID is the primary key.\nTable head has columns such as head_ID, name, born_state, age. head_ID is the primary key.\nTable management has columns such as department_ID, head_ID, temporary_acting. department_ID is the primary key.\nThe head_ID of management is the foreign key of head_ID of head.\nThe department_ID of management is the foreign key of Department_ID of department.\n\n","input": "###Input:\nHow many heads of the departments are older than 56 ?\n\n###Response:","output": "SELECT count(*) FROM head WHERE age  >  56","history": []
}
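For convenience, here is a minimal Python sketch of how a record in this format could be assembled and written out. The helper function and the schema text below are illustrative assumptions, following the Alpaca-style fields in the sample above:

import json

# Minimal sketch: build one Alpaca-style Text2SQL record in the format shown above.
def make_record(db_id, schema_desc, question, sql):
    instruction = (
        "I want you to act as a SQL terminal in front of an example database, "
        "you need only to return the sql command to me."
        "Below is an instruction that describes a task, "
        "Write a response that appropriately completes the request.\n\"\n"
        "##Instruction:\n" + schema_desc + "\n\n"
    )
    return {
        "db_id": db_id,
        "instruction": instruction,
        "input": "###Input:\n" + question + "\n\n###Response:",
        "output": sql,
        "history": [],
    }

# The schema description is truncated here for brevity.
records = [make_record(
    "department_management",
    "department_management contains tables such as department, head, management. ...",
    "How many heads of the departments are older than 56 ?",
    "SELECT count(*) FROM head WHERE age  >  56",
)]

with open("text2sql_train.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)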

(2) Register the dataset. Copy the dataset JSON files into the LLaMA-Factory/data directory and add the following entries to dataset_info.json:

"text2sql_train": {"file_name": "text2sql_train.json","columns": {"prompt": "instruction","query": "input","response": "output","history": "history"}
},
"text2sql_dev": {"file_name": "text2sql_dev.json","columns": {"prompt": "instruction","query": "input","response": "output","history": "history"}
}
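Optionally, a quick sanity check can confirm that every record carries the mapped columns before training. This is a sketch under the assumption that it is run from the LLaMA-Factory/data directory, with the file names as registered above:

import json

# Sketch: confirm each record contains the columns mapped in dataset_info.json.
for name in ("text2sql_train.json", "text2sql_dev.json"):
    with open(name, encoding="utf-8") as f:
        data = json.load(f)
    bad = [i for i, rec in enumerate(data)
           if not all(k in rec for k in ("instruction", "input", "output", "history"))]
    print(f"{name}: {len(data)} records, {len(bad)} with missing mapped columns")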

(3) Configure the parameters in the LLaMA-Factory web UI. The previewed command looks like the following (adjust the parameters to your own setup):

llamafactory-cli train `
  --stage sft `
  --do_train True `
  --model_name_or_path D:\\LLM\\Qwen2-1.5B-Instruct `
  --preprocessing_num_workers 16 `
  --finetuning_type lora `
  --quantization_method bitsandbytes `
  --template qwen `
  --flash_attn auto `
  --dataset_dir D:\\python_project\\LLaMA-Factory\\data `
  --dataset text2sql_train `
  --cutoff_len 1024 `
  --learning_rate 0.0001 `
  --num_train_epochs 2.0 `
  --max_samples 100000 `
  --per_device_train_batch_size 2 `
  --gradient_accumulation_steps 4 `
  --lr_scheduler_type cosine `
  --max_grad_norm 1.0 `
  --logging_steps 20 `
  --save_steps 500 `
  --warmup_steps 0 `
  --optim adamw_torch `
  --packing False `
  --report_to none `
  --output_dir saves\Qwen2-1.5B-Chat\lora\train_2024-07-19-19-45-59 `
  --bf16 True `
  --plot_loss True `
  --ddp_timeout 180000000 `
  --include_num_input_tokens_seen True `
  --lora_rank 8 `
  --lora_alpha 16 `
  --lora_dropout 0 `
  --lora_target all

(4) After fine-tuning, run inference evaluation on the test set:

{"predict_bleu-4": 88.43791015473889,"predict_rouge-1": 92.31425483558995,"predict_rouge-2": 85.43010570599614,"predict_rouge-l": 89.06327794970986,"predict_runtime": 1027.4111,"predict_samples_per_second": 1.006,"predict_steps_per_second": 0.503
}

Judging from these metrics, the model performs quite well; you can tune the parameters for your own sample data. The evaluation metrics are interpreted as follows:

1. predict_bleu-4: BLEU (Bilingual Evaluation Understudy) is a metric commonly used to evaluate machine-translation quality. BLEU-4 is the 4-gram BLEU score, measuring the degree of n-gram overlap (n=4) between the generated text and the reference text. Higher values mean the generated text is more similar to the reference; the maximum is 100.
2. predict_rouge-1 and predict_rouge-2: ROUGE (Recall-Oriented Understudy for Gisting Evaluation) evaluates automatic summarization and text-generation models. ROUGE-1 (unigram) and ROUGE-2 (bigram) measure the overlap of single words and two-word sequences, respectively, between generated and reference text. Higher values mean greater similarity; the maximum is 100.
3. predict_rouge-l: ROUGE-L measures the longest-common-subsequence (LCS) match between the generated text and the reference text. Higher values mean greater similarity; the maximum is 100.
4. predict_runtime: the total time the model took to generate the batch of samples, usually in seconds.
5. predict_samples_per_second: the number of samples the model generates per second, typically used to gauge inference speed.
6. predict_steps_per_second: the number of steps the model executes per second; for generative models, this generally means the number of generation steps per second.
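For intuition, here is an illustrative sketch of scoring a single prediction/reference pair with the nltk and rouge-score packages. Both packages are assumptions chosen for illustration; LLaMA-Factory computes these metrics with its own internal implementation:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

pred = "SELECT count(*) FROM head WHERE age > 56"
ref = "SELECT count(*) FROM head WHERE age > 56"

# 4-gram BLEU with smoothing, scaled to 0-100 like predict_bleu-4.
bleu4 = sentence_bleu([ref.split()], pred.split(),
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=SmoothingFunction().method3)

# ROUGE-L F1, scaled to 0-100 like predict_rouge-l.
rougeL = rouge_scorer.RougeScorer(["rougeL"]).score(ref, pred)["rougeL"].fmeasure

print(f"BLEU-4: {bleu4 * 100:.2f}, ROUGE-L: {rougeL * 100:.2f}")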

(5) If the evaluation meets your fine-tuning expectations, you can export the fine-tuned model directly (be sure to select the checkpoint produced by fine-tuning).
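Before the GGUF conversion, the exported model can be smoke-tested with transformers. This is a minimal sketch, assuming the export directory used in the conversion step below:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed export directory; adjust to where you exported the merged model.
path = "D:/python_project/LLaMA-Factory/output_model/model2"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype="auto", device_map="auto")

messages = [{"role": "user",
             "content": "How many heads of the departments are older than 56 ?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))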

GGUF Model

The export above produces a safetensors model; we can use llama.cpp to convert it into a GGUF model. On Windows, building llama.cpp with the w64devkit + make approach is recommended.

Installing make

Download from: https://gnuwin32.sourceforge.net/packages/make.htm

Download and install it from that page, then add the install path to the PATH environment variable. In a terminal, run the following command; the Make version information should appear:

(llama_factory) D:\LLM\llama.cpp>make -v
GNU Make 3.81
Copyright (C) 2006  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for i386-pc-mingw32

Installing MinGW

Download the distribution and add mingw64/bin to the PATH environment variable.

Installing w64devkit

Download from: https://github.com/skeeto/w64devkit/releases

Download w64devkit-fortran-1.23.0.zip and extract it. Note: the path must not contain Chinese characters.

Deploying llama.cpp

Download from: https://github.com/ggerganov/llama.cpp

Open the llama.cpp directory in the w64devkit.exe shell we just downloaded and run make; on success you will get a set of exe files.

~ $ cd D:
D:/ $ cd /LLM/llama.cpp
D:/LLM/llama.cpp $ make

After make succeeds, install the Python dependencies:

conda create --name llama_cpp python=3.11
conda activate llama_cpp
pip3 install -r requirements.txt

Converting to GGUF FP16 Format

# model_path/mymodel is the actual path to our model
python convert_hf_to_gguf.py model_path/mymodel

Run log:

(llama_cpp) D:\LLM\llama.cpp>python convert_hf_to_gguf.py D:/python_project/LLaMA-Factory/output_model/model2
INFO:hf-to-gguf:Loading model: model2
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00002.safetensors'
INFO:hf-to-gguf:token_embd.weight,         torch.bfloat16 --> F16, shape = {1536, 151936}
INFO:hf-to-gguf:blk.0.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.0.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.0.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.0.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.0.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.0.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.0.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.0.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.0.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.0.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.0.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.0.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.1.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.1.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.1.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.1.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.1.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.1.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.1.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.1.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.1.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.1.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.1.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.1.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.10.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.10.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.10.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.10.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.10.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.10.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.10.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.10.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.10.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.10.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.10.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.11.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.11.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.11.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.11.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.11.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.11.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.11.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.11.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.11.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.11.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.11.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.12.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.12.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.12.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.12.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.12.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.12.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.12.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.12.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.12.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.12.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.12.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.13.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.13.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.13.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.13.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.13.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.13.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.13.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.13.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.13.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.13.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.13.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.14.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.14.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.14.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.14.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.14.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.14.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.14.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.14.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.14.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.14.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.14.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.15.attn_norm.weight,   torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.15.ffn_down.weight,    torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.15.ffn_gate.weight,    torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.15.ffn_up.weight,      torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.15.ffn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.15.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.15.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.15.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.15.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.15.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.15.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.16.attn_k.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.16.attn_k.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.16.attn_q.bias,        torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.16.attn_q.weight,      torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.16.attn_v.bias,        torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.16.attn_v.weight,      torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.2.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.2.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.2.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.2.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.2.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.2.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.2.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.2.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.2.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.2.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.2.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.2.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.3.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.3.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.3.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.3.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.3.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.3.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.3.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.3.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.3.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.3.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.3.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.3.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.4.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.4.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.4.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.4.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.4.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.4.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.4.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.4.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.4.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.4.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.4.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.4.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.5.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.5.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.5.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.5.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.5.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.5.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.5.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.5.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.5.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.5.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.5.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.5.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.6.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.6.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.6.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.6.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.6.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.6.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.6.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.6.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.6.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.6.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.6.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.6.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.7.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.7.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.7.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.7.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.7.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.7.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.7.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.7.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.7.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.7.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.7.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.7.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.8.attn_norm.weight,    torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.8.ffn_down.weight,     torch.bfloat16 --> F16, shape = {8960, 1536}
INFO:hf-to-gguf:blk.8.ffn_gate.weight,     torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.8.ffn_up.weight,       torch.bfloat16 --> F16, shape = {1536, 8960}
INFO:hf-to-gguf:blk.8.ffn_norm.weight,     torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.8.attn_k.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.8.attn_k.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
INFO:hf-to-gguf:blk.8.attn_output.weight,  torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.8.attn_q.bias,         torch.bfloat16 --> F32, shape = {1536}
INFO:hf-to-gguf:blk.8.attn_q.weight,       torch.bfloat16 --> F16, shape = {1536, 1536}
INFO:hf-to-gguf:blk.8.attn_v.bias,         torch.bfloat16 --> F32, shape = {256}
INFO:hf-to-gguf:blk.8.attn_v.weight,       torch.bfloat16 --> F16, shape = {1536, 256}
...
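When the conversion finishes, a GGUF (FP16) file is written into the model directory. As an optional smoke test, the file can be loaded with the llama-cpp-python package. This is a sketch under two assumptions: llama-cpp-python is an extra dependency (pip install llama-cpp-python), and the .gguf file name below is illustrative — use whatever convert_hf_to_gguf.py actually produced:

from llama_cpp import Llama

# The model path is an assumption; point it at the .gguf produced above.
llm = Llama(
    model_path="D:/python_project/LLaMA-Factory/output_model/model2/model2-F16.gguf",
    n_ctx=1024,
)
result = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "How many heads of the departments are older than 56 ?"}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])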
