Graphics Cards - 钛刻 - Tech Weathervane - Portraying technology trends in depth, leading the digital future - Page 4 - 钛刻科技

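I built a tool that takes a 30B model on an 8GB graphics card from 3 tok/s to 21 tok/s; here are my technical findings

I've been tinkering with local LLMs lately and found one core problem: Ollama and LM Studio will get a model running, but the parameters are pure guesswork: context length, KV cache type, where to place the MoE experts, how large ubatch should be... Running with the defaults basically wastes the graphics card. So I built a tool that automatically finds the best configuration, hit quite a few pitfalls along the way, and am writing down a…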

tech www.v2ex.com 2026-04-25 20:20:13+08:00

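Man orders a 1,600-yuan graphics card online and receives 3 boxes of chrysanthemum tea; merchant identifies the cause, apologizes, and refunds

Recently, Mr. Sun of Huanggang, Hubei, bought a graphics card worth more than 1,600 yuan on an e-commerce platform. On unboxing it, he found only 3 boxes of chrysanthemum tea in the package and no graphics card at all, and the incident drew wide attention. The merchant has since traced the courier logistics and warehouse intake records and confirmed that the box already contained chrysanthemum tea when the package entered the warehouse, so the merchant had not shipped the wrong item. Having established responsibility, the merchant proactively contacted Mr. Sun, apologized, and completed…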

tech plink.anyfeeder.com 2026-04-25 15:06:36+08:00
