<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: glm</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/glm.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-07T21:25:14+00:00</updated><author><name>Simon Willison</name></author><entry><title>GLM-5.1: Towards Long-Horizon Tasks</title><link href="https://simonwillison.net/2026/Apr/7/glm-51/#atom-tag" rel="alternate"/><published>2026-04-07T21:25:14+00:00</published><updated>2026-04-07T21:25:14+00:00</updated><id>https://simonwillison.net/2026/Apr/7/glm-51/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://z.ai/blog/glm-5.1"&gt;GLM-5.1: Towards Long-Horizon Tasks&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Chinese AI lab Z.ai's latest model is a giant 754B parameter, 1.51TB (on &lt;a href="https://huggingface.co/zai-org/GLM-5.1"&gt;Hugging Face&lt;/a&gt;) MIT-licensed monster - the same size as their previous GLM-5 release, and sharing the &lt;a href="https://huggingface.co/papers/2602.15763"&gt;same paper&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It's available &lt;a href="https://openrouter.ai/z-ai/glm-5.1"&gt;via OpenRouter&lt;/a&gt; so I asked it to draw me a pelican:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-openrouter
llm -m openrouter/z-ai/glm-5.1 'Generate an SVG of a pelican on a bicycle'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And something new happened... unprompted, the model &lt;a href="https://gist.github.com/simonw/af7170f54256cc007ef28a8721564be8"&gt;decided to give me&lt;/a&gt; an HTML page that included both the SVG and a separate set of CSS animations!&lt;/p&gt;
&lt;p&gt;The SVG was excellent, and might be my new favorite from an open weights model:&lt;/p&gt;
&lt;p&gt;&lt;img alt="The bicycle is red and has a frame the correct shape and wheels with spokes. The pelican is a perky little fella." src="https://static.simonwillison.net/static/2026/glm-5.1-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;But the animation &lt;a href="https://gisthost.github.io/?73bb6808b18c2482f66e5f082c75f36e"&gt;broke it&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animation - the wheels and pedals rotate, the clouds move... and the pelican has vanished, but there is a little blob bobbing up and down in the top left corner." src="https://static.simonwillison.net/static/2026/glm-5.1-broken-light-lossy.gif" /&gt;&lt;/p&gt;
&lt;p&gt;That's the pelican, floating up in the top left corner.&lt;/p&gt;
&lt;p&gt;I usually don't do follow-up prompts for the pelican test, but in this case I made an exception:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -c 'the animation is a bit broken, the pelican ends up positioned off the screen at the top right'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;GLM 5.1 replied:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The issue is that CSS &lt;code&gt;transform&lt;/code&gt; animations on SVG elements override the SVG &lt;code&gt;transform&lt;/code&gt; attribute used for positioning, causing the pelican to lose its placement and fly off to the top-right. The fix is to separate positioning (SVG attribute) from animation (inner group) and use &lt;code&gt;&amp;lt;animateTransform&amp;gt;&lt;/code&gt; for SVG rotations since it handles coordinate systems correctly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And spat out &lt;a href="https://static.simonwillison.net/static/2026/glm-5.1-pelican-fixed.html"&gt;fresh HTML&lt;/a&gt; which fixed the problem!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Now everything is right - the bicycle rotates correctly, the pelican sits on it and bobs up and down, and its lower beak moves slightly as well." src="https://static.simonwillison.net/static/2026/glm-5.1-pelican-fixed-medium-lossy.gif" /&gt;&lt;/p&gt;
&lt;p&gt;I particularly like the animation of the beak, which is described in the SVG comments like so:&lt;/p&gt;
&lt;div class="highlight highlight-text-xml-svg"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;&amp;lt;!--&lt;/span&gt; Pouch (lower beak) with wobble &lt;span class="pl-c"&gt;--&amp;gt;&lt;/span&gt;&lt;/span&gt;
&amp;lt;&lt;span class="pl-ent"&gt;g&lt;/span&gt;&amp;gt;
    &amp;lt;&lt;span class="pl-ent"&gt;path&lt;/span&gt; &lt;span class="pl-e"&gt;d&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;M42,-58 Q43,-50 48,-42 Q55,-35 62,-38 Q70,-42 75,-60 L42,-58 Z&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;fill&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;url(#pouchGrad)&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;stroke&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;#b06008&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;stroke-width&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;opacity&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.9&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;/&amp;gt;
    &amp;lt;&lt;span class="pl-ent"&gt;path&lt;/span&gt; &lt;span class="pl-e"&gt;d&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;M48,-50 Q55,-46 60,-52&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;fill&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;none&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;stroke&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;#c06a08&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;stroke-width&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.8&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;opacity&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.6&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;/&amp;gt;
    &amp;lt;&lt;span class="pl-ent"&gt;animateTransform&lt;/span&gt; &lt;span class="pl-e"&gt;attributeName&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;transform&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;type&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;scale&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-e"&gt;values&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;1,1; 1.03,0.97; 1,1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;dur&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.75s&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-e"&gt;repeatCount&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;indefinite&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-e"&gt;additive&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;sum&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;/&amp;gt;
&amp;lt;/&lt;span class="pl-ent"&gt;g&lt;/span&gt;&amp;gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: On Bluesky &lt;a href="https://bsky.app/profile/charles.capps.me/post/3miwrn42mjc2t"&gt;@charles.capps.me suggested&lt;/a&gt; a "NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER" and...&lt;/p&gt;
&lt;p&gt;&lt;img alt="This is so great. It's dark, the possum is clearly a possum, it's riding an escooter, lovely animation, tail bobbing up and down, caption says NORTH VIRGINIA OPOSSUM, CRUISING THE COMMONWEALTH SINCE DUSK - only glitch is that it occasionally blinks and the eyes fall off the face" src="https://static.simonwillison.net/static/2026/glm-possum-escooter.gif.gif" /&gt;&lt;/p&gt;
&lt;p&gt;The HTML+SVG comments on that one include &lt;code&gt;/* Earring sparkle */, &amp;lt;!-- Opossum fur gradient --&amp;gt;, &amp;lt;!-- Distant treeline silhouette - Virginia pines --&amp;gt;, &amp;lt;!-- Front paw on handlebar --&amp;gt;&lt;/code&gt; - here's &lt;a href="https://gist.github.com/simonw/1864b89f5304eba03c3ded4697e156c4"&gt;the transcript&lt;/a&gt; and the &lt;a href="https://static.simonwillison.net/static/2026/glm-possum-escooter.html"&gt;HTML result&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/css"&gt;css&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/svg"&gt;svg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;



</summary><category term="css"/><category term="svg"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="ai-in-china"/><category term="glm"/></entry><entry><title>GLM-5: From Vibe Coding to Agentic Engineering</title><link href="https://simonwillison.net/2026/Feb/11/glm-5/#atom-tag" rel="alternate"/><published>2026-02-11T18:56:14+00:00</published><updated>2026-02-11T18:56:14+00:00</updated><id>https://simonwillison.net/2026/Feb/11/glm-5/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://z.ai/blog/glm-5"&gt;GLM-5: From Vibe Coding to Agentic Engineering&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is a &lt;em&gt;huge&lt;/em&gt; new MIT-licensed model: 744B parameters and &lt;a href="https://huggingface.co/zai-org/GLM-5"&gt;1.51TB on Hugging Face&lt;/a&gt; - twice the size of &lt;a href="https://huggingface.co/zai-org/GLM-4.7"&gt;GLM-4.7&lt;/a&gt;, which was 368B and 717GB (4.5 and 4.6 were around that size too).&lt;/p&gt;
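&lt;p&gt;Those checkpoint sizes work out to roughly two bytes per parameter, consistent with 16-bit weights. Here's a back-of-envelope sketch - the sizes come from the release, while the per-parameter arithmetic and the decimal TB/GB interpretation are my assumptions:&lt;/p&gt;

```python
# Rough bytes-per-parameter estimate from the published checkpoint sizes.
# Sizes are from the post; treating TB/GB as decimal units is an assumption.
def bytes_per_param(size_bytes, params):
    return size_bytes / params

glm_5 = bytes_per_param(1.51e12, 744e9)   # GLM-5: 1.51TB, 744B params
glm_4_7 = bytes_per_param(717e9, 368e9)   # GLM-4.7: 717GB, 368B params

print(round(glm_5, 2))    # ~2 bytes per parameter, i.e. 16-bit weights
print(round(glm_4_7, 2))  # ~2 bytes per parameter as well
```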
&lt;p&gt;It's interesting to see Z.ai take a position on what we should call professional software engineers building with LLMs - I've seen &lt;strong&gt;Agentic Engineering&lt;/strong&gt; show up in a few other places recently, most notably &lt;a href="https://twitter.com/karpathy/status/2019137879310836075"&gt;from Andrej Karpathy&lt;/a&gt; and &lt;a href="https://addyosmani.com/blog/agentic-engineering/"&gt;Addy Osmani&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I ran my "Generate an SVG of a pelican riding a bicycle" prompt through GLM-5 via &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt; and got back &lt;a href="https://gist.github.com/simonw/cc4ca7815ae82562e89a9fdd99f0725d"&gt;a very good pelican on a disappointing bicycle frame&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="The pelican is good and has a well defined beak. The bicycle frame is a wonky red triangle. Nice sun and motion lines." src="https://static.simonwillison.net/static/2026/glm-5-pelican.png" /&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=46977210"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;



</summary><category term="definitions"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="vibe-coding"/><category term="openrouter"/><category term="ai-in-china"/><category term="glm"/><category term="agentic-engineering"/></entry><entry><title>Introducing SWE-1.5: Our Fast Agent Model</title><link href="https://simonwillison.net/2025/Oct/29/swe-15/#atom-tag" rel="alternate"/><published>2025-10-29T23:59:20+00:00</published><updated>2025-10-29T23:59:20+00:00</updated><id>https://simonwillison.net/2025/Oct/29/swe-15/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://cognition.ai/blog/swe-1-5"&gt;Introducing SWE-1.5: Our Fast Agent Model&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's the second fast coding model released by a coding agent IDE on the same day - the first was &lt;a href="https://simonwillison.net/2025/Oct/29/cursor-composer/"&gt;Composer-1 by Cursor&lt;/a&gt;. This time it's Windsurf releasing SWE-1.5:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Today we’re releasing SWE-1.5, the latest in our family of models optimized for software engineering. It is a frontier-size model with hundreds of billions of parameters that achieves near-SOTA coding performance. It also sets a new standard for speed: we partnered with Cerebras to serve it at up to 950 tok/s – 6x faster than Haiku 4.5 and 13x faster than Sonnet 4.5.&lt;/p&gt;
&lt;/blockquote&gt;
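&lt;p&gt;Taking those multipliers at face value, you can back out the implied speeds of the comparison models - this is my arithmetic from the quoted figures, not numbers from the post:&lt;/p&gt;

```python
# Implied comparison speeds from the quoted numbers: 950 tok/s,
# "6x faster than Haiku 4.5 and 13x faster than Sonnet 4.5".
swe_speed = 950  # tokens/second, from the announcement

haiku_implied = swe_speed / 6    # ~158 tok/s for Haiku 4.5
sonnet_implied = swe_speed / 13  # ~73 tok/s for Sonnet 4.5

print(round(haiku_implied), round(sonnet_implied))
```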
&lt;p&gt;Like Composer-1, it's only available via their editor - no separate API yet. Also like Composer-1, they don't appear willing to share details of the "leading open-source base model" they based their new model on.&lt;/p&gt;
&lt;p&gt;I asked it to generate an SVG of a pelican riding a bicycle and got this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Bicycle has a red upside down Y shaped frame, pelican is a bit dumpy, it does at least have a long sharp beak." src="https://static.simonwillison.net/static/2025/swe-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;This one felt &lt;em&gt;really fast&lt;/em&gt;. Partnering with Cerebras for inference is a very smart move.&lt;/p&gt;
&lt;p&gt;They share a lot of details about their training process in the post:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;SWE-1.5 is trained on our state-of-the-art cluster of thousands of GB200 NVL72 chips. We believe SWE-1.5 may be the first public production model trained on the new GB200 generation. [...]&lt;/p&gt;
&lt;p&gt;Our RL rollouts require high-fidelity environments with code execution and even web browsing. To achieve this, we leveraged our VM hypervisor &lt;code&gt;otterlink&lt;/code&gt; that  allows us to scale &lt;strong&gt;Devin&lt;/strong&gt; to tens of thousands of concurrent machines (learn more about &lt;a href="https://cognition.ai/blog/blockdiff#why-incremental-vm-snapshots"&gt;blockdiff&lt;/a&gt;). This enabled us to smoothly support very high concurrency and ensure the training environment is aligned with our Devin production environments.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's &lt;em&gt;another&lt;/em&gt; similarity to Cursor's Composer-1! Cursor talked about how they ran "hundreds of thousands of concurrent sandboxed coding environments in the cloud" in &lt;a href="https://cursor.com/blog/composer"&gt;their description of their RL training&lt;/a&gt; as well.&lt;/p&gt;
&lt;p&gt;This is a notable trend: if you want to build a really great agentic coding tool there's clearly a lot to be said for using reinforcement learning to fine-tune a model against your own custom set of tools using large numbers of sandboxed simulated coding environments as part of that process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: &lt;a href="https://x.com/zai_org/status/1984076614951420273"&gt;I think it's built on GLM&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://x.com/cognition/status/1983662838955831372"&gt;@cognition&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-performance"&gt;llm-performance&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="coding-agents"/><category term="glm"/><category term="llm-performance"/></entry><entry><title>Two more Chinese pelicans</title><link href="https://simonwillison.net/2025/Oct/1/two-pelicans/#atom-tag" rel="alternate"/><published>2025-10-01T23:39:07+00:00</published><updated>2025-10-01T23:39:07+00:00</updated><id>https://simonwillison.net/2025/Oct/1/two-pelicans/#atom-tag</id><summary type="html">
    &lt;p&gt;Two new models from Chinese AI labs in the past few days. I tried them both out using &lt;a href="https://github.com/simonw/llm-openrouter"&gt;llm-openrouter&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DeepSeek-V3.2-Exp&lt;/strong&gt; from DeepSeek. &lt;a href="https://api-docs.deepseek.com/news/news250929"&gt;Announcement&lt;/a&gt;, &lt;a href="https://github.com/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/DeepSeek_V3_2.pdf"&gt;Tech Report&lt;/a&gt;, &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp"&gt;Hugging Face&lt;/a&gt; (690GB, MIT license).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This one felt &lt;em&gt;very slow&lt;/em&gt; when I accessed it via OpenRouter - I probably got routed to &lt;a href="https://openrouter.ai/deepseek/deepseek-v3.2-exp/providers"&gt;one of the slower providers&lt;/a&gt;. Here's &lt;a href="https://gist.github.com/simonw/659966a678dedd9d4e55a01a4256ac56"&gt;the pelican&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Claude Sonnet 4.5 says: Minimalist line drawing illustration of a stylized bird riding a bicycle, with clock faces as wheels showing approximately 10:10, orange beak and pedal accents, on a light gray background with a dashed line representing the ground." src="https://static.simonwillison.net/static/2025/deepseek-v3.2-exp.png" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GLM-4.6 from Z.ai&lt;/strong&gt;. &lt;a href="https://z.ai/blog/glm-4.6"&gt;Announcement&lt;/a&gt;, &lt;a href="https://huggingface.co/zai-org/GLM-4.6"&gt;Hugging Face&lt;/a&gt; (714GB, MIT license).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The context window has been expanded from 128K to 200K tokens [...] higher scores on code benchmarks [...] GLM-4.6 exhibits stronger performance in tool using and search-based agents.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/5cf05165fc721b5f7eac3b10eeff20d5"&gt;the pelican&lt;/a&gt; for that:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Claude Sonnet 4.5 says: Illustration of a white seagull with an orange beak and yellow feet riding a bicycle against a light blue sky background with white clouds and a yellow sun." src="https://static.simonwillison.net/static/2025/glm-4.6.png" /&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deepseek"&gt;deepseek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="pelican-riding-a-bicycle"/><category term="deepseek"/><category term="llm-release"/><category term="openrouter"/><category term="ai-in-china"/><category term="glm"/></entry><entry><title>The best available open weight LLMs now come from China</title><link href="https://simonwillison.net/2025/Jul/30/chinese-models/#atom-tag" rel="alternate"/><published>2025-07-30T16:18:38+00:00</published><updated>2025-07-30T16:18:38+00:00</updated><id>https://simonwillison.net/2025/Jul/30/chinese-models/#atom-tag</id><summary type="html">
    &lt;p&gt;Something that has become undeniable this month is that the best available open weight models now come from the Chinese AI labs.&lt;/p&gt;
&lt;p&gt;I continue to have a lot of love for Mistral, Gemma and Llama but my feeling is that Qwen, Moonshot and Z.ai have positively &lt;em&gt;smoked them&lt;/em&gt; over the course of July.&lt;/p&gt;
&lt;p&gt;Here's what came out this month, with links to my notes on each one:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Moonshot &lt;a href="https://simonwillison.net/2025/Jul/11/kimi-k2/"&gt;Kimi-K2-Instruct&lt;/a&gt; - 11th July, 1 trillion parameters&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/22/qwen3-235b-a22b-instruct-2507/"&gt;Qwen3-235B-A22B-Instruct-2507&lt;/a&gt; - 21st July, 235 billion&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/22/qwen3-coder/"&gt;Qwen3-Coder-480B-A35B-Instruct&lt;/a&gt; - 22nd July, 480 billion&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/25/qwen3-235b-a22b-thinking-2507/"&gt;Qwen3-235B-A22B-Thinking-2507&lt;/a&gt; - 25th July, 235 billion&lt;/li&gt;
&lt;li&gt;Z.ai &lt;a href="https://simonwillison.net/2025/Jul/28/glm-45/"&gt;GLM-4.5 and GLM-4.5 Air&lt;/a&gt; - 28th July, 355 and 106 billion&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/29/qwen3-30b-a3b-instruct-2507/"&gt;Qwen3-30B-A3B-Instruct-2507&lt;/a&gt; - 29th July, 30 billion&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/30/qwen3-30b-a3b-thinking-2507/"&gt;Qwen3-30B-A3B-Thinking-2507&lt;/a&gt; - 30th July, 30 billion&lt;/li&gt;
&lt;li&gt;Qwen &lt;a href="https://simonwillison.net/2025/Jul/31/qwen3-coder-flash/"&gt;Qwen3-Coder-30B-A3B-Instruct&lt;/a&gt; - 31st July, 30 billion (released after I first posted this note)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;small&gt;Notably absent from this list is DeepSeek, but that's only because their last model release was &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-0528"&gt;DeepSeek-R1-0528&lt;/a&gt; back in May.&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;The only janky license among them is Kimi K2, which uses a non-OSI-compliant modified MIT. Qwen's models are all Apache 2 and Z.ai's are MIT.&lt;/p&gt;
&lt;p&gt;The larger Chinese models all offer their own APIs and are increasingly available from other providers.  I've been able to run versions of the Qwen 30B and GLM-4.5 Air 106B models on my own laptop.&lt;/p&gt;
&lt;p&gt;I can't help but wonder if part of the reason for the delay in release of OpenAI's open weights model comes from a desire to be notably better than this truly impressive lineup of Chinese models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update August 5th 2025&lt;/strong&gt;: The OpenAI open weight models came out and &lt;a href="https://simonwillison.net/2025/Aug/5/gpt-oss/"&gt;they are very impressive&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-oss"&gt;gpt-oss&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/moonshot"&gt;moonshot&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kimi"&gt;kimi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/janky-licenses"&gt;janky-licenses&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;



</summary><category term="open-source"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="qwen"/><category term="ai-in-china"/><category term="gpt-oss"/><category term="moonshot"/><category term="kimi"/><category term="janky-licenses"/><category term="glm"/></entry><entry><title>My 2.5 year old laptop can write Space Invaders in JavaScript now, using GLM-4.5 Air and MLX</title><link href="https://simonwillison.net/2025/Jul/29/space-invaders/#atom-tag" rel="alternate"/><published>2025-07-29T13:02:39+00:00</published><updated>2025-07-29T13:02:39+00:00</updated><id>https://simonwillison.net/2025/Jul/29/space-invaders/#atom-tag</id><summary type="html">
    &lt;p&gt;I wrote about the new &lt;a href="https://simonwillison.net/2025/Jul/28/glm-45/"&gt;GLM-4.5&lt;/a&gt; model family yesterday - new open weight (MIT licensed) models from &lt;a href="https://z.ai/"&gt;Z.ai&lt;/a&gt; in China whose benchmarks claim they score highly at coding even against models such as Claude Sonnet 4.&lt;/p&gt;
&lt;p&gt;The models are pretty big - the smaller GLM-4.5 Air model is still 106 billion total parameters, which &lt;a href="https://huggingface.co/zai-org/GLM-4.5-Air"&gt;is 205.78GB&lt;/a&gt; on Hugging Face.&lt;/p&gt;
&lt;p&gt;Ivan Fioravanti &lt;a href="https://x.com/ivanfioravanti/status/1949911755028910557"&gt;built&lt;/a&gt; this &lt;a href="https://huggingface.co/mlx-community/GLM-4.5-Air-3bit"&gt;44GB 3bit quantized version for MLX&lt;/a&gt;, specifically sized so people with 64GB machines could have a chance of running it. I tried it out... and it works &lt;em&gt;extremely well&lt;/em&gt;.&lt;/p&gt;
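&lt;p&gt;That 44GB figure is about what you'd expect for a 3-bit quantization of a 106B model. A naive estimate - mine, and it ignores quantization metadata and any layers kept at higher precision, which is why the file on disk is somewhat larger:&lt;/p&gt;

```python
# Naive size estimate for an N-bit quantized model: params * bits / 8.
params = 106e9  # GLM-4.5 Air total parameters
bits = 3        # 3-bit quantization

size_gb = params * bits / 8 / 1e9
print(round(size_gb, 1))  # ~39.8GB - the 44GB on disk includes extra overhead
```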
&lt;p&gt;I fed it the following prompt:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;code&gt;Write an HTML and JavaScript page implementing space invaders&lt;/code&gt;&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;And it churned away for a while and produced &lt;a href="https://tools.simonwillison.net/space-invaders-GLM-4.5-Air-3bit"&gt;the following&lt;/a&gt;:&lt;/p&gt;

&lt;div style="max-width: 100%; margin-bottom: 0.4em"&gt;
    &lt;video controls="controls" preload="none" aria-label="Space Invaders" poster="https://static.simonwillison.net/static/2025/space-invaders.jpg" loop="loop" style="width: 100%; height: auto;" muted="muted"&gt;
        &lt;source src="https://static.simonwillison.net/static/2025/space-invaders.mp4" type="video/mp4" /&gt;
    &lt;/video&gt;
&lt;/div&gt;

&lt;p&gt;Clearly this isn't a particularly novel example, but I still think it's noteworthy that a model running on my 2.5 year old laptop (a 64GB MacBook Pro M2) is able to produce code like this - especially code that worked first time with no further edits needed.&lt;/p&gt;

&lt;h4 id="how-i-ran-the-model"&gt;How I ran the model&lt;/h4&gt;

&lt;p&gt;I had to run it using the current &lt;code&gt;main&lt;/code&gt; branch of the &lt;a href="https://github.com/ml-explore/mlx-lm"&gt;mlx-lm&lt;/a&gt; library (to ensure I had &lt;a href="https://github.com/ml-explore/mlx-lm/commit/489e63376b963ac02b3b7223f778dbecc164716b"&gt;this commit&lt;/a&gt; adding &lt;code&gt;glm4_moe&lt;/code&gt; support). I ran that using &lt;a href="https://github.com/astral-sh/uv"&gt;uv&lt;/a&gt; like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run \
  --with &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://github.com/ml-explore/mlx-lm/archive/489e63376b963ac02b3b7223f778dbecc164716b.zip&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  python&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then in that Python interpreter I used the standard recipe for running MLX models:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;mlx_lm&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;load&lt;/span&gt;, &lt;span class="pl-s1"&gt;generate&lt;/span&gt;
&lt;span class="pl-s1"&gt;model&lt;/span&gt;, &lt;span class="pl-s1"&gt;tokenizer&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;load&lt;/span&gt;(&lt;span class="pl-s"&gt;"mlx-community/GLM-4.5-Air-3bit"&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;That downloaded 44GB of model weights to my  &lt;code&gt;~/.cache/huggingface/hub/models--mlx-community--GLM-4.5-Air-3bit&lt;/code&gt; folder.&lt;/p&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;prompt&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"Write an HTML and JavaScript page implementing space invaders"&lt;/span&gt;
&lt;span class="pl-s1"&gt;messages&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [{&lt;span class="pl-s"&gt;"role"&lt;/span&gt;: &lt;span class="pl-s"&gt;"user"&lt;/span&gt;, &lt;span class="pl-s"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s1"&gt;prompt&lt;/span&gt;}]
&lt;span class="pl-s1"&gt;prompt&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;tokenizer&lt;/span&gt;.&lt;span class="pl-c1"&gt;apply_chat_template&lt;/span&gt;(
    &lt;span class="pl-s1"&gt;messages&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;add_generation_prompt&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;
)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;generate&lt;/span&gt;(
    &lt;span class="pl-s1"&gt;model&lt;/span&gt;, &lt;span class="pl-s1"&gt;tokenizer&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;prompt&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;prompt&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;verbose&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;max_tokens&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;8192&lt;/span&gt;
)&lt;/pre&gt;
&lt;p&gt;The response started like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt;The user wants me to create a Space Invaders game using HTML, CSS, and JavaScript. I need to create a complete, functional game with the following features:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Player spaceship that can move left/right and shoot&lt;/li&gt;
&lt;li&gt;Enemy invaders that move in formation and shoot back&lt;/li&gt;
&lt;li&gt;Score tracking&lt;/li&gt;
&lt;li&gt;Lives/health system&lt;/li&gt;
&lt;li&gt;Game over conditions [...]&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;Followed by the HTML and this debugging output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Prompt: 14 tokens, 14.095 tokens-per-sec
Generation: 4193 tokens, 25.564 tokens-per-sec
Peak memory: 47.687 GB
&lt;/code&gt;&lt;/pre&gt;
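&lt;p&gt;Those throughput figures work out to a little under three minutes of wall-clock generation time. A quick sketch of the arithmetic (my own, using the numbers from the output above):&lt;/p&gt;

```python
# Back-of-envelope wall-clock time from the mlx-lm throughput figures above.
gen_tokens = 4193    # tokens generated
gen_tps = 25.564     # generation tokens-per-sec
seconds = gen_tokens / gen_tps
print(f"~{seconds:.0f} seconds (~{seconds / 60:.1f} minutes)")
```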
&lt;p&gt;You can see &lt;a href="https://gist.github.com/simonw/9f515c8e32fb791549aeb88304550893#file-space_invaders-txt-L61"&gt;the full transcript here&lt;/a&gt;, or view &lt;a href="https://github.com/simonw/tools/blob/9e04fd9895fae1aa9ac78b8e62d2833831fe0544/space-invaders-GLM-4.5-Air-3bit.html"&gt;the source on GitHub&lt;/a&gt;, or &lt;a href="https://tools.simonwillison.net/space-invaders-GLM-4.5-Air-3bit"&gt;try it out in your browser&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id="pelican"&gt;A pelican for good measure&lt;/h4&gt;

&lt;p&gt;I ran &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;my pelican benchmark&lt;/a&gt; against the full-sized models &lt;a href="https://simonwillison.net/2025/Jul/28/glm-45/"&gt;yesterday&lt;/a&gt;, but I couldn't resist trying it against this smaller 3bit model. Here's what I got for &lt;code&gt;"Generate an SVG of a pelican riding a bicycle"&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/glm-4.5-air-3b-pelican.png" alt="Blue background, pelican looks like a cloud with an orange bike, bicycle is recognizable as a bicycle if not quite the right geometry." /&gt;&lt;/p&gt;

&lt;p&gt;Here's the &lt;a href="https://gist.github.com/simonw/fe428f7cead72ad754f965a81117f5df"&gt;transcript for that&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In both cases the model used around 48GB of RAM at peak, leaving me with just 16GB for everything else - I had to quit quite a few apps to get the model to run, but the speed was pretty good once it got going.&lt;/p&gt;

&lt;h4 id="local-coding-models"&gt;Local coding models are really good now&lt;/h4&gt;

&lt;p&gt;It's interesting how almost every model released in 2025 has specifically targeted coding. That focus has clearly been paying off: these coding models are getting &lt;em&gt;really good&lt;/em&gt; now.&lt;/p&gt;

&lt;p&gt;Two years ago when I &lt;a href="https://simonwillison.net/2023/Mar/11/llama/"&gt;first tried LLaMA&lt;/a&gt; I never &lt;em&gt;dreamed&lt;/em&gt; that the same laptop I was using then would one day be able to run models with capabilities as strong as what I'm seeing from GLM 4.5 Air - and Mistral Small 3.2, and Gemma 3, and Qwen 3, and a host of other high quality models that have emerged over the past six months.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/space-invaders"&gt;space-invaders&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ivan-fioravanti"&gt;ivan-fioravanti&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="ai-assisted-programming"/><category term="uv"/><category term="mlx"/><category term="pelican-riding-a-bicycle"/><category term="ai-in-china"/><category term="space-invaders"/><category term="ivan-fioravanti"/><category term="glm"/></entry><entry><title>GLM-4.5: Reasoning, Coding, and Agentic Abililties</title><link href="https://simonwillison.net/2025/Jul/28/glm-45/#atom-tag" rel="alternate"/><published>2025-07-28T16:56:42+00:00</published><updated>2025-07-28T16:56:42+00:00</updated><id>https://simonwillison.net/2025/Jul/28/glm-45/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://z.ai/blog/glm-4.5"&gt;GLM-4.5: Reasoning, Coding, and Agentic Abililties&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Another day, another significant new open weight model release from a Chinese frontier AI lab.&lt;/p&gt;
&lt;p&gt;This time it's Z.ai - who rebranded (at least in English) from &lt;a href="https://en.wikipedia.org/wiki/Zhipu_AI"&gt;Zhipu AI&lt;/a&gt; a few months ago. They just dropped &lt;a href="https://huggingface.co/zai-org/GLM-4.5-Base"&gt;GLM-4.5-Base&lt;/a&gt;, &lt;a href="https://huggingface.co/zai-org/GLM-4.5"&gt;GLM-4.5&lt;/a&gt; and &lt;a href="https://huggingface.co/zai-org/GLM-4.5-Air"&gt;GLM-4.5 Air&lt;/a&gt; on Hugging Face, all under an MIT license.&lt;/p&gt;
&lt;p&gt;These are MoE hybrid reasoning models with thinking and non-thinking modes, similar to Qwen 3. GLM-4.5 has 355 billion total parameters with 32 billion active; GLM-4.5-Air has 106 billion total parameters with 12 billion active.&lt;/p&gt;
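&lt;p&gt;A rough back-of-envelope calculation (my own arithmetic, ignoring KV cache, activations, and runtime overhead) shows what those parameter counts mean for local use - weight storage scales linearly with bits per parameter:&lt;/p&gt;

```python
# Approximate weight storage in GB: parameters x bits-per-parameter / 8.
# Ignores KV cache, activations, and runtime overhead.
def weight_gb(total_params, bits_per_param):
    return total_params * bits_per_param / 8 / 1e9

for name, params in [("GLM-4.5", 355e9), ("GLM-4.5-Air", 106e9)]:
    for bits in (16, 4, 3):
        print(f"{name} at {bits}-bit: ~{weight_gb(params, bits):.0f} GB")
```

&lt;p&gt;At 3 bits GLM-4.5-Air's weights come to roughly 40GB, which is consistent with quantized versions that can run within 48GB of RAM.&lt;/p&gt;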
&lt;p&gt;They started using MIT a few months ago for their &lt;a href="https://huggingface.co/collections/zai-org/glm-4-0414-67f3cbcb34dd9d252707cb2e"&gt;GLM-4-0414&lt;/a&gt; models - their older releases used a janky non-open-source custom license.&lt;/p&gt;
&lt;p&gt;Z.ai's own benchmarking (across 12 common benchmarks) ranked their GLM-4.5 3rd behind o3 and Grok-4 and just ahead of Claude Opus 4. They ranked GLM-4.5 Air in 6th place, just ahead of Claude 4 Sonnet. I haven't seen any independent benchmarks yet.&lt;/p&gt;
&lt;p&gt;The other models they included in their own benchmarks were o4-mini (high), Gemini 2.5 Pro, Qwen3-235B-Thinking-2507, DeepSeek-R1-0528, Kimi K2, GPT-4.1, DeepSeek-V3-0324. Notably absent: any of Meta's Llama models, or any of Mistral's. Did they deliberately only compare themselves to open weight models from other Chinese AI labs?&lt;/p&gt;
&lt;p&gt;Both models have a 128,000 context length and are trained for tool calling, which honestly feels like table stakes for any model released in 2025 at this point.&lt;/p&gt;
&lt;p&gt;It's interesting to see them use Claude Code to run their own coding benchmarks:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To assess GLM-4.5's agentic coding capabilities, we utilized Claude Code to evaluate performance against Claude-4-Sonnet, Kimi K2, and Qwen3-Coder across 52 coding tasks spanning frontend development, tool development, data analysis, testing, and algorithm implementation. [...] The empirical results demonstrate that GLM-4.5 achieves a 53.9% win rate against Kimi K2 and exhibits dominant performance over Qwen3-Coder with an 80.8% success rate. While GLM-4.5 shows competitive performance, further optimization opportunities remain when compared to Claude-4-Sonnet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They published the dataset for that benchmark as &lt;a href="https://huggingface.co/datasets/zai-org/CC-Bench-trajectories"&gt;zai-org/CC-Bench-trajectories&lt;/a&gt; on Hugging Face. I think they're using the word "trajectory" for what I would call a chat transcript.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Unlike DeepSeek-V3 and Kimi K2, we reduce the width (hidden dimension and number of routed experts) of the model while increasing the height (number of layers), as we found that deeper models exhibit better reasoning capacity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They pre-trained on 15 trillion tokens, then an additional 7 trillion for code and reasoning:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Our base model undergoes several training stages. During pre-training, the model is first trained on 15T tokens of a general pre-training corpus, followed by 7T tokens of a code &amp;amp; reasoning corpus. After pre-training, we introduce additional stages to further enhance the model's performance on key downstream domains.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They also open sourced their post-training reinforcement learning harness, which they've called &lt;strong&gt;slime&lt;/strong&gt;. That's available at &lt;a href="https://github.com/THUDM/slime"&gt;THUDM/slime&lt;/a&gt; on GitHub - THUDM is the Knowledge Engineering Group @ Tsinghua University, the university from which Zhipu AI spun out as an independent company.&lt;/p&gt;
&lt;p&gt;This time I ran my &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;pelican benchmark&lt;/a&gt; using the &lt;a href="https://chat.z.ai/"&gt;chat.z.ai&lt;/a&gt; chat interface, which offers free access (no account required) to both GLM 4.5 and GLM 4.5 Air. I had reasoning enabled for both.&lt;/p&gt;
&lt;p&gt;Here's what I got for "Generate an SVG of a pelican riding a bicycle" on &lt;a href="https://chat.z.ai/s/014a8c13-7b73-40e8-bbf9-6a94482caa2e"&gt;GLM 4.5&lt;/a&gt;. I like how the pelican has its wings on the handlebars:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Description by Claude Sonnet 4: This is a whimsical illustration of a white duck or goose riding a red bicycle. The bird has an orange beak and is positioned on the bike seat, with its orange webbed feet gripping what appears to be chopsticks or utensils near the handlebars. The bicycle has a simple red frame with two wheels, and there are motion lines behind it suggesting movement. The background is a soft blue-gray color, giving the image a clean, minimalist cartoon style. The overall design has a playful, humorous quality to it." src="https://static.simonwillison.net/static/2025/glm-4.5-pelican.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;And &lt;a href="https://chat.z.ai/s/e772675c-3445-4cff-903c-6faa3d6b9524"&gt;GLM 4.5 Air&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Description by Claude Sonnet 4: This image shows a cute, minimalist illustration of a snowman riding a bicycle. The snowman has a simple design with a round white body, small black dot for an eye, and an orange rectangular nose (likely representing a carrot). The snowman appears to be in motion on a black bicycle with two wheels, with small orange arrows near the pedals suggesting movement. There are curved lines on either side of the image indicating motion or wind. The overall style is clean and whimsical, using a limited color palette of white, black, orange, and gray against a light background." src="https://static.simonwillison.net/static/2025/glm-4.5-air-pelican.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Ivan Fioravanti &lt;a href="https://x.com/ivanfioravanti/status/1949854575902523399"&gt;shared a video&lt;/a&gt; of the &lt;a href="https://huggingface.co/mlx-community/GLM-4.5-Air-4bit"&gt;mlx-community/GLM-4.5-Air-4bit&lt;/a&gt; quantized model running on a M4 Mac with 128GB of RAM, and it looks like a very strong contender for a local model that can write useful code. The cheapest 128GB Mac Studio costs around $3,500 right now, so genuinely great open weight coding models are creeping closer to being affordable on consumer machines.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: Ivan released a 3 bit quantized version of GLM-4.5 Air which runs using 48GB of RAM on my laptop. I tried it and was &lt;em&gt;really&lt;/em&gt; impressed, see &lt;a href="https://simonwillison.net/2025/Jul/29/space-invaders/"&gt;My 2.5 year old laptop can write Space Invaders in JavaScript now&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ivan-fioravanti"&gt;ivan-fioravanti&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="mlx"/><category term="pelican-riding-a-bicycle"/><category term="llm-reasoning"/><category term="llm-release"/><category term="ai-in-china"/><category term="ivan-fioravanti"/><category term="glm"/></entry><entry><title>Quoting The GLM-130B License</title><link href="https://simonwillison.net/2023/Jan/10/the-glm-130b-license/#atom-tag" rel="alternate"/><published>2023-01-10T22:45:21+00:00</published><updated>2023-01-10T22:45:21+00:00</updated><id>https://simonwillison.net/2023/Jan/10/the-glm-130b-license/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://github.com/THUDM/GLM-130B/blob/main/MODEL_LICENSE"&gt;&lt;p&gt;You will not use the Software for any act that may undermine China's national security and national unity, harm the public interest of society, or infringe upon the rights and interests of human beings.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://github.com/THUDM/GLM-130B/blob/main/MODEL_LICENSE"&gt;The GLM-130B License&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/licensing"&gt;licensing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/glm"&gt;glm&lt;/a&gt;&lt;/p&gt;



</summary><category term="licensing"/><category term="machine-learning"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-in-china"/><category term="glm"/></entry></feed>