How does this look?
(base) PS E:\kaiwu-windows-amd64> .\kaiwu.exe run Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf
██╗  ██╗ █████╗ ██╗██╗    ██╗██╗   ██╗
██║ ██╔╝██╔══██╗██║██║    ██║██║   ██║
█████╔╝ ███████║██║██║ █╗ ██║██║   ██║
██╔═██╗ ██╔══██║██║██║███╗██║██║   ██║
██║  ██╗██║  ██║██║╚███╔███╔╝╚██████╔╝
╚═╝  ╚═╝╚═╝  ╚═╝╚═╝ ╚══╝╚══╝  ╚═════╝
Local LLM deployer v0.1.1 · llama.cpp b8864
by
llmbbs.ai · local AI tech community
[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 4060 Laptop GPU (SM89, 8188 MB VRAM, 0 GB/s)
RAM: 63 GB DDR5
OS: windows amd64
[2/6] Selecting configuration...
Model: Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive (moe, 36B total / 1B active)
Quant: Q4_K_M (19.7 GB)
Mode: moe_offload (experts on CPU)
Accel: Flash Attention
[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Model: Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf [cached]
[4/6] Preflight check...
✓ VRAM sufficient
[5/6] Warmup benchmark...
Probe 1: ctx=256K ... 22.1 tok/s
Tune ubatch: ub=128 → 22.3 tok/s; ub=512 → 20.7 tok/s;
✓ 22.3 tok/s @ 256K ctx
Saved profile: C:\Users\pzz\.kaiwu\profiles\qwen3.6-35b-a3b-uncensored-hauhaucs-aggressive-q4_k_m_sm89_8188mb_ddr5.json
✓ 22.3 tok/s
[6/6] Starting server...
Waiting for llama-server to be ready (port 11434)...
llama-server started (PID 49380, port 11434)
Kaiwu proxy started (port 11435)
2026/04/24 22:03:23 Kaiwu proxy listening on :11435 → llama-server :11434
┌─────────────────────────────────────────────────────────────────────┐
│ Ready — Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive @ 22.3 tok/s │
│ API: http://127.0.0.1:11435/v1/chat/completions                     │
│ Model folder: E:\model                                              │
└─────────────────────────────────────────────────────────────────────┘
Run kaiwu inject to hook into your IDE · Ctrl+C to stop
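
To sanity-check the endpoint shown in the box, here is a small Go client I would point at it. Two assumptions on my part: llama-server normally accepts any string in the "model" field and needs no API key, and I have not checked whether the kaiwu proxy enforces either.

// Quick smoke test against the proxy's OpenAI-compatible endpoint.
// Assumptions: no API key required; the "model" value is arbitrary.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := []byte(`{
		"model": "qwen3.6-35b-a3b",
		"messages": [{"role": "user", "content": "Say hello in one sentence."}],
		"max_tokens": 64
	}`)

	resp, err := http.Post(
		"http://127.0.0.1:11435/v1/chat/completions",
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(out))
}
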
─ Live monitor · idle ─────────────────── refresh every 2s ─
reuse:1024 · KV:q8_0 · 256K ctx · ub128 · mlock
   Speed         VRAM          RAM          GPU         Temp
  — tok/s      5.5/8 GB    47.0/64 GB       2%          58°C
[..........] [======....] [=======...] [..........] [=====.....]
─────────────────────────────────────────────────────────
Context [....................] 0.0K / 256K   256.0K free
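
The monitor header (reuse:1024 · KV:q8_0 · 256K ctx · ub128 · mlock) plus the moe_offload mode from step [2/6] presumably map onto llama-server flags roughly like the launcher sketch below. This is my guess, not kaiwu's actual command line: the -ngl value, the expert-tensor regex for -ot, and reading reuse:1024 as --cache-reuse are all assumptions, and the Flash Attention flag syntax has changed between llama.cpp builds, so check llama-server --help before copying any of it.

// Rough guess at what kaiwu's "moe_offload" profile might pass to llama-server.
// Every flag value below is an assumption inferred from the transcript.
package main

import (
	"os"
	"os/exec"
)

func main() {
	cmd := exec.Command("llama-server-cuda.exe",
		"-m", `E:\model\Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf`,
		"--port", "11434",
		"-c", "262144", // 256K context window
		"-ub", "128", // ubatch size tuned by the warmup benchmark
		"-ctk", "q8_0", "-ctv", "q8_0", // quantized KV cache
		"--mlock", // pin model pages in RAM
		"-ngl", "99", // assumption: dense/attention layers on GPU
		"-ot", ".ffn_.*_exps.=CPU", // assumption: keep MoE expert tensors in system RAM
		"--cache-reuse", "1024", // assumption: what "reuse:1024" refers to
	)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
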
Stopping services...
✓ llama-server stopped
✓ Kaiwu proxy stopped
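
For anyone wondering about the second port: kaiwu fronts llama-server (:11434) with its own proxy (:11435), which is presumably where the live monitor stats come from. Conceptually it is just a reverse proxy; a minimal stand-in using Go's standard library might look like the sketch below. Kaiwu's real proxy no doubt does more (request accounting for the panel, for instance), none of which is reproduced here.

// Minimal reverse proxy in the spirit of the kaiwu proxy: forward :11435 to
// llama-server on :11434. Illustration only, not kaiwu's implementation.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	target, err := url.Parse("http://127.0.0.1:11434")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(target)

	log.Println("proxy listening on :11435 -> llama-server :11434")
	log.Fatal(http.ListenAndServe(":11435", proxy))
}
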