LLM - 钛刻 - Tech Bellwether - Charting technology trends in depth, leading the digital future - Page 7 - 钛刻科技 | TCTI.cn

钛刻 (TCTI.cn) brings you the latest hard-core tech news, in-depth reviews, and analysis of future technology trends.

214 related articles · Page 7 of 11

How do LLMs store so much knowledge?

Seeing that the latest "D牢师" has world knowledge second only to Gemini 3.1 Pro, I can't help marveling at how these LLMs manage to store so much. 20 posts - 15 participants

tech linux.do 2026-04-25 23:54:06+08:00

[Local LLM] A question about host configuration for model training

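The machine is mainly for deploying YOLO26 for dataset training and object detection or tracking. The image set is tentatively 5,000 images (there is actually much more data, but the cap on training data is tentatively 5,000). I currently have an RX6600xt, but DirectML doesn't seem to let this card take part in training computation either. From what I found online, it seems that for the 7000 series and…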

tech v2ex.com 2026-04-25 23:02:26+08:00

[Open Source] LLMRelayService: a personal LLM gateway deployed with Docker

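This post uses the community's open-source promotion channel and meets the promotion requirements. I declare and comply with the following community requirements: my post carries the open-source promotion tag: yes; my open-source project is fully open source, with no closed-source parts: yes; my open-source project links to and acknowledges the LINUX DO community: link added to the README; for the project introduction in my post, screenshots of the AI-generated or AI-polished parts have been posted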

tech linux.do 2026-04-25 22:12:31+08:00

[Share & Create] Added a Wikipedia-style web page to my personal LLM Wiki

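A while back I built my own setup based on Karpathy's LLM Wiki idea and have been using it for some time. Recently I found time to add a purely static, Wikipedia-style web page. I have never really been in the habit of taking notes, since writing by hand is too much trouble. I do tend to save things, though: when I see a good article I bookmark it, but after a long while many of those bookmarks go dead, especially personal blogs…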

tech v2ex.com 2026-04-25 08:42:26+08:00

[Local LLM] I built a tool that takes a 30B model on an 8GB GPU from 3 tok/s to 21 tok/s; notes on the technical findings

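I've been tinkering with local LLMs lately and found a core problem: Ollama and LM Studio can get a model running, but the parameters are all guesswork: context length, KV cache type, where to place the MoE experts, how large ubatch should be… Running with the defaults is basically wasting the GPU. So I built a tool that automatically finds the optimal configuration. I hit quite a few pitfalls along the way and am writing them down…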

tech v2ex.com 2026-04-25 03:39:55+08:00
