Has anyone tested gemma4 31b 16 on a Mac Studio?

    wali77 · Apr 7 · 1200 views
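    I'm currently running "Lobster" (龙虾) on a MacBook Air and found that the API is really expensive: just a few short sentences cost me 150...

    I'm thinking of running gemma4 on a Mac Studio so that most tasks wouldn't need to go through the API at all.

    Has anyone tried this?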
    4 replies    2026-04-07 17:11:44 +08:00
    #1 · wali77 (OP) · Apr 7
    Or should I just get a Mac mini first?
    #2 · wali77 (OP) · Apr 7
    Come on, folks, let's talk about it. It feels like a lot of tasks could be automated with Lobster.
    #3 · gotoschool · Apr 7 · ❤️ 1
    Yes, I've tested it, and it works!
    #4 · nrtEBH · Apr 7
    M4 Max 64 GB + oMLX 0.3.1, gemma-4-31b-it-4bit-mlx build. I ran this casually without freeing up memory first.

    ## Single Request Results

    | Test | TTFT (ms) | TPOT (ms/tok) | pp TPS | tg TPS | E2E Latency | Throughput | Peak Mem |
    |---|---:|---:|---:|---:|---:|---:|---:|
    | pp1024/tg128 | 5558.0 | 52.03 | 184.2 tok/s | 19.4 tok/s | 12.166s | 94.7 tok/s | 18.86 GB |
    | pp4096/tg128 | 26818.7 | 59.03 | 152.7 tok/s | 17.1 tok/s | 34.316s | 123.1 tok/s | 20.51 GB |
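
For anyone wondering how the columns above relate to each other, here is a minimal sketch that recomputes pp TPS, tg TPS, TPOT and throughput from just the prompt length, generation length, TTFT and E2E latency. The formulas are an assumption inferred from the numbers in the table (they reproduce both rows to the printed precision), not something confirmed from the oMLX source, and `derive_metrics` is a hypothetical helper name.

```python
def derive_metrics(prompt_tokens, gen_tokens, ttft_ms, e2e_s):
    """Recompute the derived benchmark columns from the raw timings.

    Assumed relations (inferred from the table, not from oMLX itself):
      pp TPS     = prompt tokens / TTFT
      tg TPS     = generated tokens / (E2E - TTFT)
      TPOT       = (E2E - TTFT) / (generated tokens - 1)
      Throughput = (prompt + generated tokens) / E2E
    """
    ttft_s = ttft_ms / 1000.0
    decode_s = e2e_s - ttft_s  # time spent generating after the first token
    return {
        "pp_tps": prompt_tokens / ttft_s,
        "tg_tps": gen_tokens / decode_s,
        "tpot_ms": decode_s / (gen_tokens - 1) * 1000.0,
        "throughput_tps": (prompt_tokens + gen_tokens) / e2e_s,
    }


# Raw values copied from the two rows of the table above.
runs = [
    ("pp1024/tg128", 1024, 128, 5558.0, 12.166),
    ("pp4096/tg128", 4096, 128, 26818.7, 34.316),
]

for name, prompt, gen, ttft_ms, e2e_s in runs:
    m = derive_metrics(prompt, gen, ttft_ms, e2e_s)
    print(
        f"{name}: pp {m['pp_tps']:.1f} tok/s, tg {m['tg_tps']:.1f} tok/s, "
        f"TPOT {m['tpot_ms']:.2f} ms/tok, throughput {m['throughput_tps']:.1f} tok/s"
    )
```

Read this way, TTFT is essentially the prefill time, so at a 4096-token prompt almost 27 of the 34 seconds are spent before the first output token appears.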

    ## Continuous Batching

    ### pp1024 / tg128

    | Batch Size | tg TPS | Speedup | pp TPS | pp TPS/req | Avg TTFT (ms) | E2E Latency |
    |---|---:|---:|---:|---:|---:|---:|
    | 1x (baseline) | 19.4 tok/s | 1.00x | 184.2 tok/s | 184.2 tok/s | 5558.0 | 12.166s |
    | 2x | 24.9 tok/s | 1.28x | 140.9 tok/s | 70.5 tok/s | 14531.3 | 24.829s |
    | 4x | 19.1 tok/s | 0.98x | 133.9 tok/s | 33.5 tok/s | 30593.7 | 57.345s |
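
A quick sketch of how the batching columns can be read. Speedup is the batch's tg TPS divided by the 1x baseline, and pp TPS/req is pp TPS divided by the batch size. The per-request decode speed is not in the table; it is derived here under the assumption that tg TPS is the aggregate across all requests in the batch, which is consistent with the E2E latencies shown. `batch_summary` is a hypothetical helper name.

```python
BASELINE_TG_TPS = 19.4  # tg TPS of the 1x row in the table above


def batch_summary(batch_size, tg_tps, pp_tps, baseline_tg_tps=BASELINE_TG_TPS):
    """Derive per-batch ratios from the aggregate figures of one table row."""
    return {
        # Aggregate decode speed relative to a single request.
        "speedup": tg_tps / baseline_tg_tps,
        # Prefill speed each request effectively sees.
        "pp_tps_per_req": pp_tps / batch_size,
        # Assumption: tg TPS is aggregate, so each request decodes at a share of it.
        "tg_tps_per_req": tg_tps / batch_size,
    }


# (batch size, tg TPS, pp TPS) copied from the table above.
rows = [(1, 19.4, 184.2), (2, 24.9, 140.9), (4, 19.1, 133.9)]

for batch_size, tg_tps, pp_tps in rows:
    s = batch_summary(batch_size, tg_tps, pp_tps)
    print(
        f"{batch_size}x: speedup {s['speedup']:.2f}x, "
        f"pp {s['pp_tps_per_req']:.1f} tok/s per request, "
        f"tg {s['tg_tps_per_req']:.1f} tok/s per request"
    )
```

On these numbers, batching gives little aggregate gain for this model on this machine: 2x adds about 28% total decode speed, while 4x falls back to roughly the baseline and each request waits much longer for its first token.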