How on earth do I reduce the thinking of Qwen3.5-35B-A3B?
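Hi everyone, I recently deployed Qwen3.5-35B-A3B as a vLLM model on 8 L40 GPUs using the latest gpustack release, 2.1.2. The official documentation says the thinking_budget parameter can be used to adjust the length of the thinking, but it doesn't work at all. Writing a system-role prompt that asks the model to keep its thinking concise doesn't help either. I'm out of ideas.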