
Enabling a 128K context for DeepSeek in MindIE

Two places need to be changed:

1. Startup script:

source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh
source /usr/local/Ascend/atb-models/set_env.sh
source /usr/local/Ascend/mindie/set_env.sh
export RANK_TABLE_FILE="/app1/scripts/ranktable.json"
export MIES_CONTAINER_IP="192.168.1.234"
export MASTER_IP="192.168.1.234"
export WORLD_SIZE=16
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export HCCL_OP_EXPANSION_MODE="AIV"
export NPU_MEMORY_FRACTION=0.96
export ATB_LLM_HCCL_ENABLE=1
#export INF_NAN_MODE_ENABLE=1
export OMP_NUM_THREADS=10
#export TASK_QUEUE_ENABLE=2

export MINDIE_ASYNC_SCHEDULING_ENABLE=1
export ATB_OPERATION_EXECUTE_ASYNC=1
export ATB_LLM_ENABLE_AUTO_TRANSPOSE=0
export HCCL_BUFFSIZE=64
export ATB_WORKSPACE_MEM_ALLOC_ALG_TYPE=3
export ATB_WORKSPACE_MEM_ALLOC_GLOBAL=1
export ATB_LAYER_INTERNAL_TENSOR_REUSE=1
export LD_PRELOAD="/usr/lib64/libjemalloc.so.2:$LD_PRELOAD"
export HCCL_ALGO="level0:NA;level1:pipeline"
for var in $(compgen -e | grep 'STDOUT$'); do export "$var=0"; done
for var in $(compgen -e | grep 'LOG_TO_FILE$'); do export "$var=0"; done
export HCCL_CONNECT_TIMEOUT=3600
export HCCL_EXEC_TIMEOUT=0
export MINDIE_LOG_LEVEL=info
export MINDIE_LOG_TO_STDOUT=1
cd /usr/local/Ascend/mindie/latest/mindie-service/
./bin/mindieservice_daemon

 

2. Configuration file (MindIE's /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json):

{
  "Version": "1.0.0",
  "ServerConfig": {
    "ipAddress": "192.168.1.234",
    "managementIpAddress": "192.168.1.234",
    "port": 1025,
    "managementPort": 1026,
    "metricsPort": 1027,
    "allowAllZeroIpListening": false,
    "maxLinkNum": 1000,
    "httpsEnabled": false,
    "fullTextEnabled": false,
    "tlsCaPath": "security/ca/",
    "tlsCaFile": ["ca.pem"],
    "tlsCert": "security/certs/server.pem",
    "tlsPk": "security/keys/server.key.pem",
    "tlsPkPwd": "security/pass/key_pwd.txt",
    "tlsCrlPath": "security/certs/",
    "tlsCrlFiles": ["server_crl.pem"],
    "managementTlsCaFile": ["management_ca.pem"],
    "managementTlsCert": "security/certs/management/server.pem",
    "managementTlsPk": "security/keys/management/server.key.pem",
    "managementTlsPkPwd": "security/pass/management/key_pwd.txt",
    "managementTlsCrlPath": "security/management/certs/",
    "managementTlsCrlFiles": ["server_crl.pem"],
    "kmcKsfMaster": "tools/pmt/master/ksfa",
    "kmcKsfStandby": "tools/pmt/standby/ksfb",
    "inferMode": "standard",
    "interCommTLSEnabled": false,
    "interCommPort": 1121,
    "interCommTlsCaPath": "security/grpc/ca/",
    "interCommTlsCaFiles": ["ca.pem"],
    "interCommTlsCert": "security/grpc/certs/server.pem",
    "interCommPk": "security/grpc/keys/server.key.pem",
    "interCommPkPwd": "security/grpc/pass/key_pwd.txt",
    "interCommTlsCrlPath": "security/grpc/certs/",
    "interCommTlsCrlFiles": ["server_crl.pem"],
    "openAiSupport": "vllm",
    "tokenTimeout": 3600,
    "e2eTimeout": 3600,
    "distDPServerEnabled": false
  },
  "BackendConfig": {
    "backendName": "mindieservice_llm_engine",
    "modelInstanceNumber": 1,
    "npuDeviceIds": [[0, 1, 2, 3, 4, 5, 6, 7]],
    "tokenizerProcessNumber": 8,
    "multiNodesInferEnabled": true,
    "multiNodesInferPort": 1120,
    "interNodeTLSEnabled": false,
    "interNodeTlsCaPath": "security/grpc/ca/",
    "interNodeTlsCaFiles": ["ca.pem"],
    "interNodeTlsCert": "security/grpc/certs/server.pem",
    "interNodeTlsPk": "security/grpc/keys/server.key.pem",
    "interNodeTlsPkPwd": "security/grpc/pass/mindie_server_key_pwd.txt",
    "interNodeTlsCrlPath": "security/grpc/certs/",
    "interNodeTlsCrlFiles": ["server_crl.pem"],
    "interNodeKmcKsfMaster": "tools/pmt/master/ksfa",
    "interNodeKmcKsfStandby": "tools/pmt/standby/ksfb",
    "ModelDeployConfig": {
      "maxSeqLen": 131072,
      "maxInputTokenLen": 131072,
      "truncation": false,
      "ModelConfig": [
        {
          "modelInstanceType": "Standard",
          "modelName": "DeepSeek-R1",
          "modelWeightPath": "/app1/models/DeepSeek-R1-0528-w8a8",
          "worldSize": 8,
          "cpuMemSize": 5,
          "npuMemSize": -1,
          "backendType": "atb",
          "trustRemoteCode": false,
          "moe_ep": 16,
          "moe_tp": 1,
          "sp": 8,
          "cp": 2,
          "tp": 8,
          "dp": 1,
          "ignore_eos": true,
          "async_scheduler_wait_time": 120,
          "kv_trans_timeout": 10,
          "kv_link_timeout": 1080,
          "models": {
            "deepseekv2": {
              "ep_level": 1,
              "enable_init_routing_cutoff": true,
              "topk_scaling_factor": 0.25
            }
          }
        }
      ]
    },
    "ScheduleConfig": {
      "templateType": "Standard",
      "templateName": "Standard_LLM",
      "cacheBlockSize": 128,
      "maxPrefillBatchSize": 50,
      "maxPrefillTokens": 131072,
      "prefillTimeMsPerReq": 150,
      "prefillPolicyType": 0,
      "decodeTimeMsPerReq": 50,
      "decodePolicyType": 0,
      "maxBatchSize": 200,
      "maxIterTimes": 131072,
      "maxPreemptCount": 0,
      "supportSelectBatch": false,
      "maxQueueDelayMicroseconds": 5000
    }
  }
}
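
In the configuration above, four fields are set to 131072, which is 128 × 1024 tokens: maxSeqLen, maxInputTokenLen, maxPrefillTokens and maxIterTimes. The post does not say which fields were changed from their defaults, so treating these four as the 128K-related settings is an assumption based only on their values. The Python sketch below reads a config.json with this layout and reports any of the four that differ from the target. The helper names (length_settings, mismatched, check_config_file) are hypothetical and are not part of MindIE.

```python
import json

# 128K tokens, i.e. the 131072 that appears in the config above.
TARGET_TOKENS = 128 * 1024


def length_settings(config: dict) -> dict:
    """Collect the four length-related fields (assumed to be the 128K knobs)."""
    backend = config["BackendConfig"]
    deploy = backend["ModelDeployConfig"]
    schedule = backend["ScheduleConfig"]
    return {
        "maxSeqLen": deploy["maxSeqLen"],
        "maxInputTokenLen": deploy["maxInputTokenLen"],
        "maxPrefillTokens": schedule["maxPrefillTokens"],
        "maxIterTimes": schedule["maxIterTimes"],
    }


def mismatched(config: dict, target: int = TARGET_TOKENS) -> list:
    """Return the names of the fields whose value differs from the target."""
    return [name for name, value in length_settings(config).items()
            if value != target]


def check_config_file(path: str, target: int = TARGET_TOKENS) -> list:
    """Load a config.json from disk and return the mismatched field names."""
    with open(path, encoding="utf-8") as fh:
        return mismatched(json.load(fh), target)
```

Calling check_config_file("/usr/local/Ascend/mindie/latest/mindie-service/conf/config.json") on each node before starting the daemon should return an empty list when all four values are 131072.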

 

