<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: apple</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/apple.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-15T17:13:57+00:00</updated><author><name>Simon Willison</name></author><entry><title>Quoting John Gruber</title><link href="https://simonwillison.net/2026/Apr/15/john-gruber/#atom-tag" rel="alternate"/><published>2026-04-15T17:13:57+00:00</published><updated>2026-04-15T17:13:57+00:00</updated><id>https://simonwillison.net/2026/Apr/15/john-gruber/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://daringfireball.net/2026/04/piece_android_iphone_apps"&gt;&lt;p&gt;The real goldmine isn’t that Apple gets a cut of every App Store transaction. It’s that Apple’s platforms have the best apps, and users who are drawn to the best apps are thus drawn to the iPhone, Mac, and iPad. That edge is waning. Not because software on other platforms is getting better, but because third-party software on iPhone, Mac, and iPad is regressing to the mean, &lt;em&gt;to some extent&lt;/em&gt;, because fewer developers feel motivated — artistically, financially, or both — to create well-crafted idiomatic native apps exclusively for Apple’s platforms.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://daringfireball.net/2026/04/piece_android_iphone_apps"&gt;John Gruber&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="john-gruber"/></entry><entry><title>Quoting Guilherme Rambo</title><link href="https://simonwillison.net/2026/Mar/16/guilherme-rambo/#atom-tag" rel="alternate"/><published>2026-03-16T20:34:13+00:00</published><updated>2026-03-16T20:34:13+00:00</updated><id>https://simonwillison.net/2026/Mar/16/guilherme-rambo/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://daringfireball.net/2026/03/apple_enclaves_neo_camera_indicator"&gt;&lt;p&gt;Tidbit: the software-based camera indicator light in the MacBook Neo runs in the secure exclave¹ part of the chip, so it is almost as secure as the hardware indicator light. What that means in practice is that even a kernel-level exploit would not be able to turn on the camera without the light appearing on screen. It runs in a privileged environment separate from the kernel and blits the light directly onto the screen hardware.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://daringfireball.net/2026/03/apple_enclaves_neo_camera_indicator"&gt;Guilherme Rambo&lt;/a&gt;, in a text message to John Gruber&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/hardware"&gt;hardware&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;&lt;/p&gt;



</summary><category term="hardware"/><category term="apple"/><category term="privacy"/><category term="john-gruber"/></entry><entry><title>It’s hard to justify Tahoe icons</title><link href="https://simonwillison.net/2026/Jan/5/its-hard-to-justify-tahoe-icons/#atom-tag" rel="alternate"/><published>2026-01-05T19:30:24+00:00</published><updated>2026-01-05T19:30:24+00:00</updated><id>https://simonwillison.net/2026/Jan/5/its-hard-to-justify-tahoe-icons/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tonsky.me/blog/tahoe-icons/"&gt;It’s hard to justify Tahoe icons&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Devastating critique of the new menu icons in macOS Tahoe by Nikita Prokopov, who starts by quoting the 1992 Apple HIG rule to not "overload the user with complex icons" and then provides comprehensive evidence of Tahoe doing exactly that.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In my opinion, Apple took on an impossible task: to add an icon to every menu item. There are just not enough good metaphors to do something like that.&lt;/p&gt;
&lt;p&gt;But even if there were, the premise itself is questionable: if everything has an icon, it doesn’t mean users will find what they are looking for faster.&lt;/p&gt;
&lt;p&gt;And even if the premise was solid, I still wish I could say: they did the best they could, given the goal. But that’s not true either: they did a poor job consistently applying the metaphors and designing the icons themselves.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=46497712"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design"&gt;design&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/macos"&gt;macos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/usability"&gt;usability&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="design"/><category term="macos"/><category term="usability"/></entry><entry><title>NVIDIA DGX Spark + Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0</title><link href="https://simonwillison.net/2025/Oct/16/nvidia-dgx-spark-apple-mac-studio/#atom-tag" rel="alternate"/><published>2025-10-16T05:34:41+00:00</published><updated>2025-10-16T05:34:41+00:00</updated><id>https://simonwillison.net/2025/Oct/16/nvidia-dgx-spark-apple-mac-studio/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.exolabs.net/nvidia-dgx-spark"&gt;NVIDIA DGX Spark + Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;EXO Labs wired a 256GB M3 Ultra Mac Studio up to an NVIDIA DGX Spark and got a 2.8x performance boost serving Llama-3.1 8B (FP16) with an 8,192 token prompt.&lt;/p&gt;
&lt;p&gt;Their detailed explanation taught me a lot about LLM performance.&lt;/p&gt;
&lt;p&gt;There are two key steps in executing a prompt. The first is the &lt;strong&gt;prefill&lt;/strong&gt; phase that reads the incoming prompt and builds a KV cache for each of the transformer layers in the model. This is compute-bound as it needs to process every token in the input and perform large matrix multiplications across all of the layers to initialize the model's internal state.&lt;/p&gt;
&lt;p&gt;Performance in the prefill stage influences TTFT - time‑to‑first‑token.&lt;/p&gt;
&lt;p&gt;The second step is the &lt;strong&gt;decode&lt;/strong&gt; phase, which generates the output one token at a time. This part is limited by memory bandwidth - there's less arithmetic, but each token needs to consider the entire KV cache.&lt;/p&gt;
&lt;p&gt;Decode performance influences TPS - tokens per second.&lt;/p&gt;
&lt;p&gt;EXO noted that the Spark has 100 TFLOPS but only 273GB/s of memory bandwidth, making it a better fit for prefill. The M3 Ultra has 26 TFLOPS but 819GB/s of memory bandwidth, making it ideal for the decode phase.&lt;/p&gt;
&lt;p&gt;They run prefill on the Spark, streaming the KV cache to the Mac over 10Gb Ethernet. They can start streaming earlier layers while the later layers are still being calculated. Then the Mac runs the decode phase, returning tokens faster than if the Spark had run the full process end-to-end.&lt;/p&gt;
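&lt;p&gt;As a rough sanity check on those numbers, here's a back-of-envelope calculation. This is my own sketch, not from the EXO post: it uses the standard approximations that prefill costs about 2 FLOPs per parameter per input token and that decode must re-read the full FP16 weights for every generated token, and it ignores KV cache reads and network overhead entirely:&lt;/p&gt;

```python
# Back-of-envelope: why prefill suits the Spark and decode suits the Mac.
# Hardware figures are from the post; the cost models are rough estimates.

PARAMS = 8e9            # Llama-3.1 8B
BYTES_PER_PARAM = 2     # FP16
PROMPT_TOKENS = 8192

def prefill_seconds(tflops: float) -> float:
    # Prefill is compute-bound: ~2 FLOPs per parameter per input token.
    return 2 * PARAMS * PROMPT_TOKENS / (tflops * 1e12)

def decode_tokens_per_second(gb_per_s: float) -> float:
    # Decode is bandwidth-bound: every generated token re-reads the weights.
    return gb_per_s * 1e9 / (PARAMS * BYTES_PER_PARAM)

for name, tflops, bandwidth in [("DGX Spark", 100, 273), ("M3 Ultra", 26, 819)]:
    print(f"{name}: prefill ~{prefill_seconds(tflops):.1f}s, "
          f"decode ~{decode_tokens_per_second(bandwidth):.0f} tok/s")
```

&lt;p&gt;Even this crude model puts the Spark several times faster at prefill and the M3 Ultra roughly three times faster at decode, which is exactly the asymmetry EXO is exploiting.&lt;/p&gt;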

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/exolabs/status/1978525767739883736"&gt;@exolabs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nvidia"&gt;nvidia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nvidia-spark"&gt;nvidia-spark&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="nvidia"/><category term="nvidia-spark"/></entry><entry><title>Why NetNewsWire Is Not a Web App</title><link href="https://simonwillison.net/2025/Oct/8/why-netnewswire-is-not-a-web-app/#atom-tag" rel="alternate"/><published>2025-10-08T16:12:14+00:00</published><updated>2025-10-08T16:12:14+00:00</updated><id>https://simonwillison.net/2025/Oct/8/why-netnewswire-is-not-a-web-app/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://inessential.com/2025/10/04/why-netnewswire-is-not-web-app.html"&gt;Why NetNewsWire Is Not a Web App&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In the wake of Apple &lt;a href="https://daringfireball.net/2025/10/iceblock_removed_from_app_store"&gt;removing ICEBlock from the App Store&lt;/a&gt;, Brent Simmons talks about why he still thinks his veteran (and actively maintained) &lt;a href="https://netnewswire.com/"&gt;NetNewsWire&lt;/a&gt; feed reader app should remain a native application.&lt;/p&gt;
&lt;p&gt;Part of the reason is cost - NetNewsWire is free these days (&lt;a href="https://github.com/Ranchero-Software/NetNewsWire/blob/main/LICENSE"&gt;MIT licensed in fact&lt;/a&gt;) and the cost to Brent is an annual Apple developer subscription:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If it were a web app instead, I could drop the developer membership, but I’d have to pay way more money for web and database hosting. [...] I could charge for NetNewsWire, but that would go against my political goal of making sure there’s a good and &lt;em&gt;free&lt;/em&gt; RSS reader available to everyone.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A bigger reason is around privacy and protecting users:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Second issue. Right now, if law enforcement comes to me and demands I turn over a given user’s subscriptions list, I can’t. Literally can’t. I don’t have an encrypted version, even — I have nothing at all. The list lives on their machine (iOS or macOS).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And finally it's about the principle of what a personal computing device should mean:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My computer is &lt;em&gt;not&lt;/em&gt; a terminal. It’s a world I get to control, and I can use — and, especially, &lt;em&gt;make&lt;/em&gt; — whatever I want. I’m not stuck using just what’s provided to me on some other machines elsewhere: I’m not dialing into a mainframe or doing the modern equivalent of using only websites that other people control.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brent-simmons"&gt;brent-simmons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/macos"&gt;macos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/netnewswire"&gt;netnewswire&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ios"&gt;ios&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="brent-simmons"/><category term="macos"/><category term="netnewswire"/><category term="ios"/></entry><entry><title>Locally AI</title><link href="https://simonwillison.net/2025/Sep/21/locally-ai/#atom-tag" rel="alternate"/><published>2025-09-21T23:56:14+00:00</published><updated>2025-09-21T23:56:14+00:00</updated><id>https://simonwillison.net/2025/Sep/21/locally-ai/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://apps.apple.com/us/app/locally-ai-local-ai-chat/id6741426692"&gt;Locally AI&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Handy new iOS app by Adrien Grondin for running local LLMs on your phone. It just added support for the new iOS 26 Apple Foundation model, so you can install this app and instantly start a conversation with that model without any additional download.&lt;/p&gt;
&lt;p&gt;The app can also run a variety of other models using MLX, including members of the Gemma, Llama 3.2, and Qwen families.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ios"&gt;ios&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ios"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="mlx"/></entry><entry><title>Quoting Apple Security Engineering and Architecture</title><link href="https://simonwillison.net/2025/Sep/9/memory-integrity-enforcement/#atom-tag" rel="alternate"/><published>2025-09-09T21:32:55+00:00</published><updated>2025-09-09T21:32:55+00:00</updated><id>https://simonwillison.net/2025/Sep/9/memory-integrity-enforcement/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://security.apple.com/blog/memory-integrity-enforcement/"&gt;&lt;p&gt;There has never been a successful, widespread malware attack against iPhone. The only system-level iOS attacks we observe in the wild come from mercenary spyware, which is vastly more complex than regular cybercriminal activity and consumer malware. Mercenary spyware is historically associated with state actors and uses exploit chains that cost millions of dollars to target a very small number of specific individuals and their devices. [...] Known mercenary spyware chains used against iOS share a common denominator with those targeting Windows and Android: they exploit memory safety vulnerabilities, which are interchangeable, powerful, and exist throughout the industry.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://security.apple.com/blog/memory-integrity-enforcement/"&gt;Apple Security Engineering and Architecture&lt;/a&gt;, introducing Memory Integrity Enforcement for iPhone 17&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="privacy"/><category term="security"/></entry><entry><title>Quoting Steve Wozniak</title><link href="https://simonwillison.net/2025/Aug/15/steve-wozniak/#atom-tag" rel="alternate"/><published>2025-08-15T16:06:23+00:00</published><updated>2025-08-15T16:06:23+00:00</updated><id>https://simonwillison.net/2025/Aug/15/steve-wozniak/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://slashdot.org/comments.pl?sid=23765914&amp;amp;cid=65583466"&gt;&lt;p&gt;I gave all my Apple wealth away because wealth and power are not what I live for. I have a lot of fun and happiness. I funded a lot of important museums and arts groups in San Jose, the city of my birth, and they named a street after me for being good. I now speak publicly and have risen to the top. I have no idea how much I have but after speaking for 20 years it might be $10M plus a couple of homes. I never look for any type of tax dodge. I earn money from my labor and pay something like 55% combined tax on it. I am the happiest person ever. Life to me was never about accomplishment, but about Happiness, which is Smiles minus Frowns. I developed these philosophies when I was 18-20 years old and I never sold out.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://slashdot.org/comments.pl?sid=23765914&amp;amp;cid=65583466"&gt;Steve Wozniak&lt;/a&gt;, in a comment on Slashdot&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/careers"&gt;careers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/slashdot"&gt;slashdot&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="careers"/><category term="slashdot"/></entry><entry><title>Seven replies to the viral Apple reasoning paper – and why they fall short</title><link href="https://simonwillison.net/2025/Jun/15/viral-apple-reasoning-paper/#atom-tag" rel="alternate"/><published>2025-06-15T06:17:42+00:00</published><updated>2025-06-15T06:17:42+00:00</updated><id>https://simonwillison.net/2025/Jun/15/viral-apple-reasoning-paper/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://garymarcus.substack.com/p/seven-replies-to-the-viral-apple"&gt;Seven replies to the viral Apple reasoning paper – and why they fall short&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A few weeks ago Apple Research released a new paper &lt;a href="https://machinelearning.apple.com/research/illusion-of-thinking"&gt;The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I skimmed the paper and it struck me as a more thorough example of the many other trick questions that expose failings in LLMs - this time involving puzzles such as the Tower of Hanoi that can have their difficulty level increased to the point that even "reasoning" LLMs run out of output tokens and fail to complete them.&lt;/p&gt;
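&lt;p&gt;To see how quickly Tower of Hanoi can outgrow an output budget: the optimal solution takes 2&lt;sup&gt;n&lt;/sup&gt; - 1 moves for n disks, so the required output grows exponentially with puzzle size. A quick sketch (the tokens-per-move figure is my own rough assumption, not from the paper):&lt;/p&gt;

```python
# The minimal Tower of Hanoi solution is 2**n - 1 moves for n disks, so a
# model asked to write out the full move list faces exponentially growing
# output. TOKENS_PER_MOVE is an illustrative assumption.

def hanoi_moves(n: int) -> int:
    """Minimum number of moves to solve n-disk Tower of Hanoi."""
    return 2**n - 1

TOKENS_PER_MOVE = 10  # rough guess for e.g. "move disk 3 from A to C"

for n in (7, 10, 15, 20):
    moves = hanoi_moves(n)
    print(f"{n} disks: {moves:,} moves, ~{moves * TOKENS_PER_MOVE:,} tokens")
```

&lt;p&gt;By 20 disks the full move list alone needs on the order of ten million tokens - no current model can emit that, however well it "reasons".&lt;/p&gt;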
&lt;p&gt;I thought this paper got &lt;em&gt;way&lt;/em&gt; more attention than it warranted - the title "The Illusion of Thinking" captured the attention of the "LLMs are over-hyped junk" crowd.  I saw enough well-reasoned rebuttals that I didn't feel it worth digging into.&lt;/p&gt;
&lt;p&gt;And now, notable LLM skeptic Gary Marcus has saved me some time by aggregating the best of those rebuttals &lt;a href="https://garymarcus.substack.com/p/seven-replies-to-the-viral-apple"&gt;together in one place&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Gary rebuts those rebuttals, but given that his previous headline concerning this paper was &lt;a href="https://garymarcus.substack.com/p/a-knockout-blow-for-llms"&gt;a knockout blow for LLMs?&lt;/a&gt; it's not surprising that he finds those arguments unconvincing. From that previous piece:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The vision of AGI I have always had is one that &lt;em&gt;combines&lt;/em&gt; the strengths of humans with the strength of machines, overcoming the weaknesses of humans. I am not interested in a “AGI” that can’t do arithmetic, and I certainly wouldn’t want to entrust global infrastructure or the future of humanity to such a system.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then from his new post:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The paper is not news; we already knew these models generalize poorly.&lt;/strong&gt; True! (I personally have been trying to tell people this for almost thirty years; Subbarao Rao Kambhampati has been trying his best, too). But then why do we think these models are the royal road to AGI?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And therein lies my disagreement. I'm not interested in whether or not LLMs are the "road to AGI". I continue to care only about whether they have useful applications today, once you've understood their limitations.&lt;/p&gt;
&lt;p&gt;Reasoning LLMs are a relatively new and interesting twist on the genre. They are demonstrably able to solve a whole bunch of problems that previous LLMs were unable to handle, hence why we've seen &lt;a href="https://simonwillison.net/tags/llm-reasoning/"&gt;a rush of new models&lt;/a&gt; from OpenAI and Anthropic and Gemini and DeepSeek and Qwen and Mistral.&lt;/p&gt;
&lt;p&gt;They get even more interesting when you &lt;a href="https://simonwillison.net/2025/Jun/6/six-months-in-llms/#ai-worlds-fair-2025-43.jpeg"&gt;combine them with tools&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;They're already useful to me today, whether or not they can reliably solve the Tower of Hanoi or River Crossing puzzles.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update:&lt;/strong&gt; Gary &lt;a href="https://twitter.com/GaryMarcus/status/1935088453684990204"&gt;clarifies&lt;/a&gt; that "the existence of some utility does not mean I can’t also address the rampant but misguided claims of imminent AGI".&lt;/em&gt;&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44278403"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-reasoning"/></entry><entry><title>WWDC: Apple supercharges its tools and technologies for developers</title><link href="https://simonwillison.net/2025/Jun/9/apple-wwdc/#atom-tag" rel="alternate"/><published>2025-06-09T19:42:10+00:00</published><updated>2025-06-09T19:42:10+00:00</updated><id>https://simonwillison.net/2025/Jun/9/apple-wwdc/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.apple.com/newsroom/2025/06/apple-supercharges-its-tools-and-technologies-for-developers/"&gt;WWDC: Apple supercharges its tools and technologies for developers&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's the Apple press release for today's WWDC announcements. Two things that stood out to me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Foundation Models Framework&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With the Foundation Models framework, developers will be able to build on Apple Intelligence to bring users new experiences that are intelligent, available when they’re offline, and that protect their privacy, using AI inference that is free of cost.
The framework has native support for Swift, so developers can easily access the Apple Intelligence model with as few as three lines of code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's new documentation on &lt;a href="https://developer.apple.com/documentation/FoundationModels/generating-content-and-performing-tasks-with-foundation-models"&gt;Generating content and performing tasks with Foundation Models&lt;/a&gt; - the Swift code looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;session&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;LanguageModelSession&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    instructions&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-s"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;Reply with step by step instructions&lt;/span&gt;&lt;span class="pl-s"&gt;"&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;prompt&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;Rum old fashioned cocktail&lt;/span&gt;&lt;span class="pl-s"&gt;"&lt;/span&gt;
&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;&lt;span class="pl-k"&gt;try&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; session&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;respond&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    to&lt;span class="pl-kos"&gt;:&lt;/span&gt; prompt&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    options&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-en"&gt;GenerationOptions&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;temperature&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;2.0&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;There's also a &lt;a href="https://developer.apple.com/videos/play/wwdc2025/286/"&gt;23 minute Meet the Foundation Models framework&lt;/a&gt; video from the conference, which clarifies that this is a 3 billion parameter model with 2 bit quantization. The model is trained for both tool-calling and structured output, which they call "guided generation" and describe as taking advantage of constrained decoding.&lt;/p&gt;
&lt;p&gt;I'm also &lt;em&gt;very&lt;/em&gt; excited about this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Containerization Framework&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Containerization framework enables developers to create, download, or run Linux container images directly on Mac. It’s built on an open-source framework optimized for Apple silicon and provides secure isolation between container images. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I continue to seek the ideal sandboxing solution for running untrusted code - both from other humans and written for me by LLMs - on my own machines. This looks like it could be a really great option for that going forward.&lt;/p&gt;
&lt;p&gt;It looks like &lt;a href="https://github.com/apple/container"&gt;apple/container&lt;/a&gt; on GitHub is part of this new feature. From the &lt;a href="https://github.com/apple/container/blob/main/docs/technical-overview.md"&gt;technical overview&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On macOS, the typical way to run Linux containers is to launch a Linux virtual machine (VM) that hosts all of your containers.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;container&lt;/code&gt; runs containers differently. Using the open source &lt;a href="https://github.com/apple/containerization"&gt;Containerization&lt;/a&gt; package, it runs a lightweight VM for each container that you create. [...]&lt;/p&gt;
&lt;p&gt;Since &lt;code&gt;container&lt;/code&gt; consumes and produces standard OCI images, you can easily build with and run images produced by other container applications, and the images that you build will run everywhere.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/macos"&gt;macos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/swift"&gt;swift&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="macos"/><category term="sandboxing"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-release"/><category term="swift"/></entry><entry><title>Apple’s Siri Chief Calls AI Delays Ugly and Embarrassing, Promises Fixes</title><link href="https://simonwillison.net/2025/Mar/14/ai-delays/#atom-tag" rel="alternate"/><published>2025-03-14T21:35:02+00:00</published><updated>2025-03-14T21:35:02+00:00</updated><id>https://simonwillison.net/2025/Mar/14/ai-delays/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.bloomberg.com/news/articles/2025-03-14/apple-s-siri-chief-calls-ai-delays-ugly-and-embarrassing-promises-fixes"&gt;Apple’s Siri Chief Calls AI Delays Ugly and Embarrassing, Promises Fixes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Mark Gurman reports on some leaked details from internal Apple meetings concerning the delays in shipping personalized Siri. This note in particular stood out to me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Walker said the decision to delay the features was made because of quality issues and that the company has found the technology only works properly up to two-thirds to 80% of the time. He said the group “can make more progress to get those percentages up, so that users get something they can really count on.” [...]&lt;/p&gt;
&lt;p&gt;But Apple wants to maintain a high bar and only deliver the features when they’re polished, he said. “These are not quite ready to go to the general public, even though our competitors might have launched them in this state or worse.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I imagine it's a lot harder to get reliable results out of small, local LLMs that run on an iPhone. Features that fail 1/3 to 1/5 of the time are unacceptable for a consumer product like this.&lt;/p&gt;
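&lt;p&gt;Those per-request numbers compound badly once a feature chains several steps together. A quick illustration, assuming (simplistically) that each step succeeds or fails independently:&lt;/p&gt;

```python
# If each request succeeds 67-80% of the time (the reported figures),
# the end-to-end success rate of a multi-step task drops off fast.
# Independence between steps is a simplifying assumption.

def chain_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` independent steps succeed."""
    return per_step ** steps

for p in (0.67, 0.80):
    for steps in (1, 3, 5):
        print(f"per-step {p:.0%}, {steps} steps: "
              f"{chain_success(p, steps):.0%} end-to-end")
```

&lt;p&gt;Even at 80% per step, a five-step task succeeds end-to-end only about a third of the time.&lt;/p&gt;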

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=43365517"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="apple-intelligence"/></entry><entry><title>Something Is Rotten in the State of Cupertino</title><link href="https://simonwillison.net/2025/Mar/14/something-is-rotten/#atom-tag" rel="alternate"/><published>2025-03-14T20:15:54+00:00</published><updated>2025-03-14T20:15:54+00:00</updated><id>https://simonwillison.net/2025/Mar/14/something-is-rotten/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://daringfireball.net/2025/03/something_is_rotten_in_the_state_of_cupertino"&gt;Something Is Rotten in the State of Cupertino&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;John Gruber's blazing takedown of Apple's failure to ship many of the key Apple Intelligence features they've been actively promoting for the past twelve months.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The fiasco here is not that Apple is late on AI. It's also not that they had to announce an embarrassing delay on promised features last week. Those are problems, not fiascos, and problems happen. They're inevitable. [...] The fiasco is that Apple pitched a story that wasn't true, one that &lt;em&gt;some&lt;/em&gt; people within the company surely understood wasn't true, and they set a course based on that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;John divides the Apple Intelligence features into the ones that were demonstrated to members of the press (including himself) at various events over the past year compared to things like "personalized Siri" that were only ever shown as concept videos. The ones that were demonstrated have all shipped. The concept video features are &lt;a href="https://simonwillison.net/2025/Mar/8/delaying-personalized-siri/"&gt;indeterminably delayed&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="john-gruber"/><category term="ai"/><category term="apple-intelligence"/></entry><entry><title>Apple Is Delaying the ‘More Personalized Siri’ Apple Intelligence Features</title><link href="https://simonwillison.net/2025/Mar/8/delaying-personalized-siri/#atom-tag" rel="alternate"/><published>2025-03-08T05:39:25+00:00</published><updated>2025-03-08T05:39:25+00:00</updated><id>https://simonwillison.net/2025/Mar/8/delaying-personalized-siri/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://daringfireball.net/2025/03/apple_is_delaying_the_more_personalized_siri_apple_intelligence_features"&gt;Apple Is Delaying the ‘More Personalized Siri’ Apple Intelligence Features&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Apple told John Gruber (and other Apple press) this about the new "personalized" Siri:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It’s going to take us longer than we thought to deliver on these features and we anticipate rolling them out in the coming year.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I have a hunch that this delay might relate to security.&lt;/p&gt;
&lt;p&gt;These new Apple Intelligence features involve Siri responding to requests to access information in applications and then performing actions on the user's behalf.&lt;/p&gt;
&lt;p&gt;This is the worst possible combination for &lt;a href="https://simonwillison.net/tags/prompt-injection/"&gt;prompt injection&lt;/a&gt; attacks! Any time an LLM-based system has access to private data, tools it can call, and exposure to potentially malicious instructions (like emails and text messages from untrusted strangers) there's a significant risk that an attacker might subvert those tools and use them to damage or exfiltrate a user's data.&lt;/p&gt;
&lt;p&gt;I published &lt;a href="https://simonwillison.net/2023/Nov/27/prompt-injection-explained/"&gt;this piece&lt;/a&gt; about the risk of prompt injection to personal digital assistants back in November 2023, and nothing has changed since then to make me think this is any less of an open problem.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="john-gruber"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="apple-intelligence"/></entry><entry><title>The Graphing Calculator Story</title><link href="https://simonwillison.net/2025/Mar/5/the-graphing-calculator-story/#atom-tag" rel="alternate"/><published>2025-03-05T23:36:54+00:00</published><updated>2025-03-05T23:36:54+00:00</updated><id>https://simonwillison.net/2025/Mar/5/the-graphing-calculator-story/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.pacifict.com/story/"&gt;The Graphing Calculator Story&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Utterly delightful story from Ron Avitzur in 2004 about the origins of the Graphing Calculator app that shipped with many versions of macOS. Ron's contract with Apple had ended but his badge kept working, so he kept on letting himself in to work on the project. He even grew a small team:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I asked my friend Greg Robbins to help me. His contract in another division at Apple had just ended, so he told his manager that he would start reporting to me. She didn't ask who I was and let him keep his office and badge. In turn, I told people that I was reporting to him. Since that left no managers in the loop, we had no meetings and could be extremely productive.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://laughingmeme.org/links/"&gt;Kellan&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/computer-history"&gt;computer-history&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="computer-history"/></entry><entry><title>Run LLMs on macOS using llm-mlx and Apple's MLX framework</title><link href="https://simonwillison.net/2025/Feb/15/llm-mlx/#atom-tag" rel="alternate"/><published>2025-02-15T19:13:29+00:00</published><updated>2025-02-15T19:13:29+00:00</updated><id>https://simonwillison.net/2025/Feb/15/llm-mlx/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt; is a brand new plugin for my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; Python Library and CLI utility which builds on top of Apple's excellent &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX array framework library&lt;/a&gt; and &lt;a href="https://pypi.org/project/mlx-lm/"&gt;mlx-lm&lt;/a&gt; package. If you're a terminal user or Python developer with a Mac this may be the new easiest way to start exploring local Large Language Models.&lt;/p&gt;
&lt;h4 id="running-llama-3-2-3b-using-llm-mlx"&gt;Running Llama 3.2 3B using llm-mlx&lt;/h4&gt;
&lt;p&gt;If you haven't already got LLM installed you'll need &lt;a href="https://llm.datasette.io/en/stable/setup.html"&gt;to install it&lt;/a&gt; - you can do that in a bunch of different ways - in order of preference I like &lt;code&gt;uv tool install llm&lt;/code&gt; or &lt;code&gt;pipx install llm&lt;/code&gt; or &lt;code&gt;brew install llm&lt;/code&gt; or &lt;code&gt;pip install llm&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Next, install the new plugin (macOS only):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-mlx&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now download and register a model. Llama 3.2 3B is an excellent first choice - it's pretty small (a 1.8GB download) but is a surprisingly capable starter model.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will download 1.8GB of model weights from &lt;a href="https://huggingface.co/mlx-community/Llama-3.2-3B-Instruct-4bit"&gt;mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/a&gt; on Hugging Face and store them here:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~/.cache/huggingface/hub/models--mlx-community--Llama-3.2-3B-Instruct-4bit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can start running prompts:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m mlx-community/Llama-3.2-3B-Instruct-4bit &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Python code to traverse a tree, briefly&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which &lt;a href="https://gist.github.com/simonw/85ce6a0faf1a58a243f9e5f4b2daeb3d"&gt;output this&lt;/a&gt; for me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here's a brief example of how to traverse a tree in Python:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Node&lt;/span&gt;:
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;__init__&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-s1"&gt;value&lt;/span&gt;):
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;value&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;value&lt;/span&gt;
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; []

&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;node&lt;/span&gt;):
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;node&lt;/span&gt; &lt;span class="pl-c1"&gt;is&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt;
    
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;node&lt;/span&gt;.&lt;span class="pl-c1"&gt;value&lt;/span&gt;)
    
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;child&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;node&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;:
        &lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;child&lt;/span&gt;)

&lt;span class="pl-c"&gt;# Example usage:&lt;/span&gt;
&lt;span class="pl-s1"&gt;root&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"A"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"B"&lt;/span&gt;), &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"C"&lt;/span&gt;)]
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;[&lt;span class="pl-c1"&gt;0&lt;/span&gt;].&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"D"&lt;/span&gt;), &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"E"&lt;/span&gt;)]
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;[&lt;span class="pl-c1"&gt;1&lt;/span&gt;].&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"F"&lt;/span&gt;)]

&lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;root&lt;/span&gt;)  &lt;span class="pl-c"&gt;# Output: A, B, D, E, C, F&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;In this example, we define a &lt;code&gt;Node&lt;/code&gt; class to represent each node in the tree, with a &lt;code&gt;value&lt;/code&gt; attribute and a list of &lt;code&gt;children&lt;/code&gt;. The &lt;code&gt;traverse_tree&lt;/code&gt; function recursively visits each node in the tree, printing its value.&lt;/p&gt;
&lt;p&gt;This is a basic example of a tree traversal, and there are many variations and optimizations depending on the specific use case.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That generation ran at an impressive &lt;a href="https://gist.github.com/simonw/85ce6a0faf1a58a243f9e5f4b2daeb3d?permalink_comment_id=5440701#gistcomment-5440701"&gt;152 tokens per second&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;That command was a bit of a mouthful, so let's assign an alias to the model:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm aliases &lt;span class="pl-c1"&gt;set&lt;/span&gt; l32 mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now we can use that shorter alias instead:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;a joke about a haggis buying a car&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The &lt;a href="https://gist.github.com/simonw/af322b399ce448f766d9349d9dc1cb5d#response"&gt;joke&lt;/a&gt; isn't very good.)&lt;/p&gt;
&lt;p&gt;As with other models supported by LLM, you can also pipe things to it. Here's how to get it to explain a piece of Python code (in this case &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/llm_mlx.py"&gt;itself&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;cat llm_mlx.py &lt;span class="pl-k"&gt;|&lt;/span&gt; llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;explain this code&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The response started like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This code is a part of a larger project that uses the Hugging Face Transformers library to create a text-to-text conversational AI model. The code defines a custom model class &lt;code&gt;MlxModel&lt;/code&gt; and a set of command-line interface (CLI) commands for working with MLX models. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/a577c820b18591d4825b70679cee3f9c"&gt;the rest of the response&lt;/a&gt;. I'm pretty amazed at how well it did for a tiny 1.8GB model!&lt;/p&gt;

&lt;p&gt;This plugin can only run models that have been converted to work with Apple's MLX framework. Thankfully the &lt;a href="https://huggingface.co/mlx-community"&gt;mlx-community&lt;/a&gt; organization on Hugging Face has published over 1,000 of these. A few I've tried successfully:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Qwen2.5-0.5B-Instruct-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Qwen2.5-0.5B-Instruct-4bit"&gt;278 MB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Mistral-7B-Instruct-v0.3-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Mistral-7B-Instruct-v0.3-4bit"&gt;4.08 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Mistral-Small-24B-Instruct-2501-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Mistral-Small-24B-Instruct-2501-4bit"&gt;13.26 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit"&gt;18.5 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Llama-3.3-70B-Instruct-4bit"&gt;40 GB&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Mistral-Small-24B really does feel GPT-4 quality despite only needing around 12GB of RAM to run - so it's a good default model if you want to leave space to run other apps.&lt;/p&gt;

&lt;p&gt;DeepSeek-R1-Distill-Qwen-32B-4bit is a great model if you want to try out R1 style "reasoning" which outputs a &lt;code&gt;&amp;lt;think&amp;gt;...&amp;lt;/think&amp;gt;...&lt;/code&gt; block at the start of each response.&lt;/p&gt;
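If you want to handle those reasoning traces programmatically, here's a minimal sketch - a hypothetical helper of my own, not part of llm-mlx - that splits the think block from the final answer:

```python
import re


def split_think(response: str):
    """Split an R1-style response into (reasoning, answer).

    Assumes the <think>...</think> block, when present, appears
    exactly once at the start of the response.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*", response, re.DOTALL)
    if match:
        return match.group(1).strip(), response[match.end():].strip()
    # No reasoning block: the whole response is the answer
    return None, response.strip()
```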

&lt;p&gt;That last Llama 70B one needs a machine with 64GB of RAM to run but is &lt;em&gt;very&lt;/em&gt; capable - I wrote about trying out the Ollama version of Llama 3.3 70B in &lt;a href="https://simonwillison.net/2024/Dec/9/llama-33-70b/"&gt;I can now run a GPT-4 class model on my laptop&lt;/a&gt; back in December.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm mlx download-model mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/mlx-download-llama-33.jpg" alt="Screenshot of the download-model command running against the Llama 3.3 70B model, it shows fetching 13 files and displays progress bars for 8 different safetensors files each weighing between 2.6 and 3.3GB." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;For larger models like that I recommend using the &lt;code&gt;llm chat&lt;/code&gt; command as that ensures they stay in memory for multiple rounds of the conversation:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm chat -m mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Activity Monitor shows the process using 37.5GB of RAM while the model is running.&lt;/p&gt;
&lt;p&gt;Some of these jokes are actually pretty good!&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;Chatting with mlx-community/Llama-3.3-70B-Instruct-4bit
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
&amp;gt; hi
How's your day going so far? Is there something I can help you with or would you like to chat?
&amp;gt; jokes about weird animals please
Here are a few:

1. Why did the axolotl go to therapy? Because it was struggling to come to terms with its identity crisis... and also because it had a lot of "regrowing" to do!
2. What did the aye-aye say to the other aye-aye when they were out on a date? "You've got a finger on the pulse of my heart!"
3. Why did the naked mole rat go to the party? Because it was a "hair-raising" experience and it wanted to get to the "root" of the fun!
4. What do you call a group of flamingos playing instruments? The Pink Floyd... of the avian world, of course!
5. Why did the blobfish go to the doctor? It had a "gut" feeling something was wrong... and also because it was feeling a little "deflated"!

Want more? I've got a whole zoo of 'em!
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;That ran at &lt;a href="https://gist.github.com/simonw/ec99b6cf65ae61256a34f3ac269951dc"&gt;8.8 tokens per second&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As with every LLM plugin, all prompts and responses are logged to &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;a SQLite database&lt;/a&gt; that can be accessed using the &lt;code&gt;llm logs&lt;/code&gt; command.&lt;/p&gt;
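Since it's a plain SQLite file you can also query the logs directly. Here's a sketch using Python's sqlite3 module - the `responses` table with `model`, `prompt` and `response` columns is an assumed simplification of the real schema, and I'm using an in-memory stand-in rather than the actual log file (find that with `llm logs path`):

```python
import sqlite3

# In-memory stand-in for the LLM log database; the real file lives at the
# path printed by `llm logs path`. The schema here is a simplified assumption.
db = sqlite3.connect(":memory:")
db.execute("create table responses (model text, prompt text, response text)")
db.execute(
    "insert into responses values (?, ?, ?)",
    ("mlx-community/Llama-3.2-3B-Instruct-4bit", "hi", "Hello! How can I help?"),
)

# Most recent prompts, newest first
rows = db.execute(
    "select model, prompt from responses order by rowid desc limit 3"
).fetchall()
```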

&lt;p&gt;You can use LLM models from your Python code as well. The README includes &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2.1/README.md#using-models-from-python"&gt;notes on how to do this&lt;/a&gt;, as does the &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;LLM Python API documentation&lt;/a&gt;. TLDR:&lt;/p&gt;

&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;
&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"mlx-community/Llama-3.2-3B-Instruct-4bit"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"hi"&lt;/span&gt;).&lt;span class="pl-c1"&gt;text&lt;/span&gt;())&lt;/pre&gt;

&lt;h4 id="mlx-and-mlx-lm"&gt;MLX and mlx-lm&lt;/h4&gt;
&lt;p&gt;The secret sauce behind this new plugin is Apple's &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX library&lt;/a&gt;, which has been improving at an extraordinary pace over the past year - it first launched on GitHub in November 2023.&lt;/p&gt;
&lt;p&gt;The core library is C++, but the MLX team at Apple also maintain a &lt;a href="https://github.com/ml-explore/mlx-examples"&gt;mlx-examples&lt;/a&gt; repo with a wealth of Python examples, including the &lt;a href="https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md"&gt;mlx-lm&lt;/a&gt; and &lt;a href="https://github.com/ml-explore/mlx-examples/tree/main/whisper"&gt;mlx-whisper&lt;/a&gt; packages.&lt;/p&gt;
&lt;p&gt;The community around it is growing too. I'm particularly impressed by Prince Canuma's &lt;a href="https://github.com/Blaizzy/mlx-vlm"&gt;mlx-vlm&lt;/a&gt; library for accessing multi-modal vision LLMs - I'd love to get that working as an LLM plugin once I figure out &lt;a href="https://github.com/Blaizzy/mlx-vlm/issues/68"&gt;how to use it for conversations&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've used MLX for &lt;a href="https://simonwillison.net/tags/mlx/"&gt;a few experiments&lt;/a&gt; in the past, but &lt;a href="https://twitter.com/awnihannun/status/1890524526821126620"&gt;this tweet&lt;/a&gt; from MLX core developer Awni Hannun finally convinced me to wrap it up as an LLM plugin:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the latest MLX small LLMs are a lot faster.&lt;/p&gt;
&lt;p&gt;On M4 Max 4-bit Qwen 0.5B generates 1k tokens at a whopping 510 toks/sec. And runs at over 150 tok/sec on iPhone 16 pro.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is &lt;em&gt;really&lt;/em&gt; good software. This small team at Apple appear to be almost single-handedly giving NVIDIA's CUDA a run for its money!&lt;/p&gt;
&lt;h4 id="building-the-plugin"&gt;Building the plugin&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt; plugin came together pretty quickly. The &lt;a href="https://github.com/simonw/llm-mlx/blob/0.1/llm_mlx.py"&gt;first version&lt;/a&gt; was ~100 lines of Python, much of it repurposed from my existing &lt;a href="https://github.com/simonw/llm-gguf"&gt;llm-gguf&lt;/a&gt; plugin.&lt;/p&gt;
&lt;p&gt;The hardest problem was figuring out how to hide the Hugging Face progress bars that displayed every time it attempted to access a model!&lt;/p&gt;
&lt;p&gt;I eventually found the &lt;code&gt;from huggingface_hub.utils import disable_progress_bars&lt;/code&gt; utility function by piping library code through Gemini 2.0.&lt;/p&gt;
&lt;p&gt;I then added &lt;a href="https://github.com/simonw/llm-mlx/issues/3"&gt;model options support&lt;/a&gt; allowing you to pass options like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;a greeting&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -o temperature 1.0 -o seed 2&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So far using a fixed seed &lt;a href="https://github.com/simonw/llm-mlx/issues/3#issuecomment-2661025591"&gt;appears to produce repeatable results&lt;/a&gt;, which is exciting for writing more predictable tests in the future.&lt;/p&gt;

&lt;p&gt;For the &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/tests/test_mlx.py"&gt;automated tests&lt;/a&gt; that run in GitHub Actions I decided to use a small model - I went with the tiny 75MB &lt;a href="https://huggingface.co/mlx-community/SmolLM-135M-Instruct-4bit/tree/main"&gt;mlx-community/SmolLM-135M-Instruct-4bit&lt;/a&gt; (explored &lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/"&gt;previously&lt;/a&gt;). I configured GitHub Actions to cache the model in between CI runs by adding the following YAML to my &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/.github/workflows/test.yml"&gt;.github/workflows/test.yml&lt;/a&gt; file:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Cache models&lt;/span&gt;
      &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/cache@v4&lt;/span&gt;
      &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;path&lt;/span&gt;: &lt;span class="pl-s"&gt;~/.cache/huggingface&lt;/span&gt;
        &lt;span class="pl-ent"&gt;key&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ runner.os }}-huggingface-&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prince-canuma"&gt;prince-canuma&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apple"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="github-actions"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="qwen"/><category term="mlx"/><category term="prince-canuma"/><category term="ai-in-china"/></entry><entry><title>Quoting John Gruber</title><link href="https://simonwillison.net/2025/Jan/22/john-gruber/#atom-tag" rel="alternate"/><published>2025-01-22T23:59:05+00:00</published><updated>2025-01-22T23:59:05+00:00</updated><id>https://simonwillison.net/2025/Jan/22/john-gruber/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://daringfireball.net/linked/2025/01/22/apple-red-cross-socal-fire-relief"&gt;&lt;p&gt;When I give money to a charitable cause, I always look for the checkboxes to opt out of being contacted by them in the future. When it happens anyway, I get annoyed, and I become reluctant to give to that charity again. [...]&lt;/p&gt;
&lt;p&gt;When you donate to the Red Cross via Apple, that concern is off the table. Apple won’t emphasize that aspect of this, because they don’t want to throw the Red Cross under the proverbial bus, but I will. An underrated aspect of privacy is the desire simply not to be annoyed.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://daringfireball.net/linked/2025/01/22/apple-red-cross-socal-fire-relief"&gt;John Gruber&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="privacy"/><category term="john-gruber"/></entry><entry><title>Run DeepSeek R1 or V3 with MLX Distributed</title><link href="https://simonwillison.net/2025/Jan/22/mlx-distributed/#atom-tag" rel="alternate"/><published>2025-01-22T04:15:45+00:00</published><updated>2025-01-22T04:15:45+00:00</updated><id>https://simonwillison.net/2025/Jan/22/mlx-distributed/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://gist.github.com/awni/ec071fd27940698edd14a4191855bba6"&gt;Run DeepSeek R1 or V3 with MLX Distributed&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Handy detailed instructions from Awni Hannun on running the enormous DeepSeek R1 or V3 models on a cluster of Macs using the &lt;a href="https://ml-explore.github.io/mlx/build/html/usage/distributed.html"&gt;distributed communication&lt;/a&gt; feature of Apple's MLX library.&lt;/p&gt;
&lt;p&gt;DeepSeek R1 quantized to 4-bit requires 450GB in aggregate RAM, which can be achieved by a cluster of three 192GB M2 Ultras ($16,797 will buy you three 192GB Apple M2 Ultra Mac Studios at $5,599 each).&lt;/p&gt;
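The cluster sizing is straightforward arithmetic - divide the aggregate RAM requirement by per-machine RAM and round up, using the figures quoted above:

```python
import math

model_ram_gb = 450    # DeepSeek R1 at 4-bit, aggregate RAM needed
machine_ram_gb = 192  # RAM in one M2 Ultra Mac Studio
price_usd = 5599      # quoted price per 192GB Mac Studio

machines = math.ceil(model_ram_gb / machine_ram_gb)  # 3 machines
total_cost_usd = machines * price_usd                # $16,797
```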

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/awnihannun/status/1881915166922863045"&gt;@awnihannun&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deepseek"&gt;deepseek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="mlx"/><category term="deepseek"/><category term="ai-in-china"/></entry><entry><title>I Live My Life a Quarter Century at a Time</title><link href="https://simonwillison.net/2025/Jan/4/i-live-my-life-a-quarter-century-at-a-time/#atom-tag" rel="alternate"/><published>2025-01-04T23:00:36+00:00</published><updated>2025-01-04T23:00:36+00:00</updated><id>https://simonwillison.net/2025/Jan/4/i-live-my-life-a-quarter-century-at-a-time/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tla.systems/blog/2025/01/04/i-live-my-life-a-quarter-century-at-a-time/"&gt;I Live My Life a Quarter Century at a Time&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Delightful Steve Jobs-era Apple story from James Thomson, who built the first working prototype of the macOS Dock.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/wraaxu/i_live_my_life_quarter_century_at_time"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/history"&gt;history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/steve-jobs"&gt;steve-jobs&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="history"/><category term="steve-jobs"/></entry><entry><title>I still don't think companies serve you ads based on spying through your microphone</title><link href="https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/#atom-tag" rel="alternate"/><published>2025-01-02T23:43:31+00:00</published><updated>2025-01-02T23:43:31+00:00</updated><id>https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/#atom-tag</id><summary type="html">
    &lt;p&gt;One of my weirder hobbies is trying to convince people that the idea that companies are listening to you through your phone's microphone and serving you targeted ads is a conspiracy theory that isn't true. I wrote about this previously: &lt;a href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#facebook-dont-spy-microphone"&gt;Facebook don’t spy on you through your microphone&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(Convincing people of this is basically impossible. It doesn't matter how good your argument is: if someone has ever seen an ad that relates to their previous voice conversation they are likely convinced, and there's nothing you can do to talk them out of it. Gimlet Media did &lt;a href="https://gimletmedia.com/amp/shows/reply-all/z3hlwr"&gt;a great podcast episode&lt;/a&gt; about how impossible this is back in 2017.)&lt;/p&gt;
&lt;p&gt;This is about to get even harder thanks to this proposed settlement: &lt;a href="https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-pay-95m-delete-private-conversations-siri-recorded/"&gt;Siri “unintentionally” recorded private convos; Apple agrees to pay $95M&lt;/a&gt; (Ars Technica).&lt;/p&gt;
&lt;p&gt;Apple are spending $95m (nine hours of profit), agreeing to settle while "denying wrongdoing".&lt;/p&gt;
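That "nine hours" figure is easy to sanity-check with back-of-envelope arithmetic, assuming roughly $94 billion in annual net income for Apple - my ballpark assumption, not a number from the story:

```python
settlement_usd = 95_000_000
annual_net_income_usd = 94_000_000_000  # assumed ballpark figure

profit_per_hour = annual_net_income_usd / (365 * 24)
hours_of_profit = settlement_usd / profit_per_hour  # just under nine hours
```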
&lt;p&gt;What actually happened is that Apple were capturing snippets of audio surrounding the "Hey Siri" wake word, sending those back to their servers and occasionally using them for QA, without informing users that they were doing this. This is bad.&lt;/p&gt;
&lt;p&gt;The Reuters 2021 story &lt;a href="https://www.reuters.com/technology/apple-must-face-siri-voice-assistant-privacy-lawsuit-us-judge-2021-09-02/"&gt;Apple must face Siri voice assistant privacy lawsuit -U.S. judge&lt;/a&gt; reported that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One Siri user said his private discussions with his doctor about a "brand name surgical treatment" caused him to receive targeted ads for that treatment, while two others said their discussions about Air Jordan sneakers, Pit Viper sunglasses and "Olive Garden" caused them to receive ads for those products.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The claim from that story was then repeated in &lt;a href="https://www.reuters.com/legal/apple-pay-95-million-settle-siri-privacy-lawsuit-2025-01-02/"&gt;the 2025 Reuters story&lt;/a&gt; about the settlement.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-pay-95m-delete-private-conversations-siri-recorded/"&gt;Ars Technica story&lt;/a&gt; reframes that like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The only clue that users seemingly had of Siri's alleged spying was eerily accurate targeted ads that appeared after they had just been talking about specific items like Air Jordans or brands like Olive Garden, Reuters &lt;a href="https://www.reuters.com/legal/apple-pay-95-million-settle-siri-privacy-lawsuit-2025-01-02/"&gt;noted&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Crucially, this was never &lt;em&gt;proven in court&lt;/em&gt;. And if Apple settle the case it never will be.&lt;/p&gt;
&lt;p&gt;Let’s think this through. For the accusation to be true, Apple would need to be recording those wake word audio snippets and transmitting them back to their servers for additional processing (likely true), but then they would need to be feeding those snippets &lt;em&gt;in almost real time&lt;/em&gt; into a system which forwards them onto advertising partners who then feed that information into targeting networks such that next time you view an ad on your phone the information is available to help select the relevant ad.&lt;/p&gt;
&lt;p&gt;That is &lt;em&gt;so far fetched&lt;/em&gt;. Why would Apple do that? Especially given their brand and reputation as a privacy-first company, and the large amount of product design and engineering work they’ve put into preventing apps from doing exactly this kind of thing: enforcing permission-based capabilities &lt;em&gt;and&lt;/em&gt; ensuring a “microphone active” icon is visible at all times when an app is listening in.&lt;/p&gt;
&lt;p&gt;I really don't think this is happening - in particular for Siri wake words!&lt;/p&gt;

&lt;p id="argued-these-points"&gt;I've &lt;a href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#facebook-dont-spy-microphone"&gt;argued these points before&lt;/a&gt;, but I'll do it again here for good measure.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;You don't notice the hundreds of times a day you say something and &lt;em&gt;don't&lt;/em&gt; see a relevant advert a short time later. You see thousands of ads a day; can you remember what &lt;em&gt;any&lt;/em&gt; of them are?&lt;/li&gt;
&lt;li&gt;The tiny fraction of times when you see an ad that's relevant to something you've just said (hence breaking through the filter that prevents you from registering most ads at all) sticks in your head.&lt;/li&gt;
&lt;li&gt;Human beings are pattern matching machines with a huge bias towards personal anecdotes. If we've seen direct evidence of something ourselves, good luck talking us out of it!&lt;/li&gt;
&lt;/ol&gt;
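&lt;p&gt;Point 1 can be sketched as back-of-envelope arithmetic. Every number below is an invented, illustrative assumption (not a measurement), but the shape of the result holds for any plausible values:&lt;/p&gt;

```python
# Hypothetical base rates: none of these numbers are measured,
# they exist purely to illustrate the coincidence arithmetic.
topics_mentioned_per_day = 100   # distinct products/brands spoken about
ads_seen_per_day = 1000          # ads encountered across apps and sites
ad_topic_pool = 10_000           # universe of commonly advertised topics

# Probability a single ad coincidentally matches something said today,
# assuming ads are served independently of any conversation.
p_match = topics_mentioned_per_day / ad_topic_pool

# Expected purely coincidental matches per day and per year.
matches_per_day = ads_seen_per_day * p_match
matches_per_year = matches_per_day * 365

print(round(matches_per_day), round(matches_per_year))
```

&lt;p&gt;Under those (made-up) assumptions you'd expect around ten coincidental matches every single day; only the handful you happen to notice need to stick for the anecdote to feel like proof.&lt;/p&gt;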
&lt;p&gt;I think the truth of the matter here is much more pedestrian: the quality of ad targeting that's possible just through apps sharing data on your regular actions within those apps is shockingly high... combined with the fact that it turns out just knowing "male, 40s, NYC" is often more than enough - we're all pretty basic!&lt;/p&gt;
&lt;p&gt;I fully expect that this Apple story will be used as "proof" by conspiracy theorists effectively forever.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/conspiracy"&gt;conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/misinformation"&gt;misinformation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microphone-ads-conspiracy"&gt;microphone-ads-conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apple"/><category term="conspiracy"/><category term="privacy"/><category term="misinformation"/><category term="microphone-ads-conspiracy"/><category term="digital-literacy"/></entry><entry><title>BBC complains to Apple over misleading shooting headline</title><link href="https://simonwillison.net/2024/Dec/14/bbc-complains-to-apple-over-misleading-shooting-headline/#atom-tag" rel="alternate"/><published>2024-12-14T00:06:44+00:00</published><updated>2024-12-14T00:06:44+00:00</updated><id>https://simonwillison.net/2024/Dec/14/bbc-complains-to-apple-over-misleading-shooting-headline/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.bbc.com/news/articles/cd0elzk24dno"&gt;BBC complains to Apple over misleading shooting headline&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is bad: the Apple Intelligence feature that uses (on-device) LLMs to present a condensed, summarized set of notifications misrepresented a BBC headline as "Luigi Mangione shoots himself".&lt;/p&gt;
&lt;p&gt;Ken Schwencke &lt;a href="https://bsky.app/profile/schwanksta.com/post/3lbi6rxhigc2r"&gt;caught that same feature&lt;/a&gt; incorrectly condensing a New York Times headline about an ICC arrest warrant for Netanyahu as "Netanyahu arrested".&lt;/p&gt;
&lt;p&gt;My understanding is that these notification summaries are generated directly on-device, using Apple's own custom &lt;a href="https://simonwillison.net/2024/Jun/11/apples-on-device-and-server-foundation-models/"&gt;3B parameter model&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The main lesson I think this illustrates is that it's not responsible to outsource headline summarization to an LLM without incorporating human review: there are way too many ways this could result in direct misinformation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update 16th January 2025&lt;/strong&gt;: &lt;a href="https://www.nytimes.com/2025/01/16/technology/apple-ai-news-notifications.html"&gt;Apple plans to disable A.I. features summarizing news notifications&lt;/a&gt;, by Tripp Mickle for the New York Times.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/journalism"&gt;journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ethics"/><category term="journalism"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="apple-intelligence"/><category term="ai-ethics"/></entry><entry><title>Apple's Knowledge Navigator concept video (1987)</title><link href="https://simonwillison.net/2024/Oct/22/knowledge-navigator/#atom-tag" rel="alternate"/><published>2024-10-22T04:40:49+00:00</published><updated>2024-10-22T04:40:49+00:00</updated><id>https://simonwillison.net/2024/Oct/22/knowledge-navigator/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.youtube.com/watch?v=-jiBLQyUi38"&gt;Apple&amp;#x27;s Knowledge Navigator concept video (1987)&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I learned about this video today while &lt;a href="https://twitter.com/simonw/status/1848360857815949551"&gt;engaged in my irresistible bad habit&lt;/a&gt; of arguing about whether or not "agents" means anything useful.&lt;/p&gt;
&lt;p&gt;It turns out CEO John Sculley's Apple in 1987 promoted a concept called &lt;a href="https://en.wikipedia.org/wiki/Knowledge_Navigator"&gt;Knowledge Navigator&lt;/a&gt; (incorporating input from Alan Kay) which imagined a future where computers hosted intelligent "agents" that could speak directly to their operators and perform tasks such as research and calendar management.&lt;/p&gt;
&lt;p&gt;This video was produced for John Sculley's keynote at the 1987 Educom higher education conference imagining a tablet-style computer with an agent called "Phil".&lt;/p&gt;
&lt;p&gt;&lt;lite-youtube videoid="-jiBLQyUi38" title="Apple's Knowledge Navigator concept video (1987)" playlabel="Play: Apple's Knowledge Navigator concept video (1987)"&gt;&lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;It's fascinating how close we are getting to this nearly 40-year-old concept with the most recent demos from AI labs like OpenAI. Their &lt;a href="https://www.youtube.com/watch?v=DQacCB9tDaw"&gt;Introducing GPT-4o&lt;/a&gt; video feels very similar in all sorts of ways.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/riley_stews/status/1493663143054102528"&gt;@riley_stews&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/computer-history"&gt;computer-history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-history"&gt;ai-history&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="computer-history"/><category term="youtube"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-agents"/><category term="ai-history"/></entry><entry><title>mlx-vlm</title><link href="https://simonwillison.net/2024/Sep/29/mlx-vlm/#atom-tag" rel="alternate"/><published>2024-09-29T21:38:46+00:00</published><updated>2024-09-29T21:38:46+00:00</updated><id>https://simonwillison.net/2024/Sep/29/mlx-vlm/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Blaizzy/mlx-vlm"&gt;mlx-vlm&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The MLX ecosystem of libraries for running machine learning models on Apple Silicon continues to expand. Prince Canuma is actively developing this library for running vision models such as Qwen2-VL, Pixtral and LLaVA using Python on a Mac.&lt;/p&gt;
&lt;p&gt;I used &lt;a href="https://docs.astral.sh/uv/"&gt;uv&lt;/a&gt; to run it against &lt;a href="https://static.simonwillison.net/static/2024/django-roadmap.png"&gt;this image&lt;/a&gt; with this shell one-liner:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uv run --with mlx-vlm \
  python -m mlx_vlm.generate \
  --model Qwen/Qwen2-VL-2B-Instruct \
  --max-tokens 1000 \
  --temp 0.0 \
  --image https://static.simonwillison.net/static/2024/django-roadmap.png \
  --prompt "Describe image in detail, include all text"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;--image&lt;/code&gt; option works equally well with a URL or a path to a local file on disk.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Django roadmap diagram, described below" src="https://static.simonwillison.net/static/2024/django-roadmap.png" /&gt;&lt;/p&gt;
&lt;p&gt;This first downloaded 4.1GB to my &lt;code&gt;~/.cache/huggingface/hub/models--Qwen--Qwen2-VL-2B-Instruct&lt;/code&gt; folder and then output &lt;a href="https://gist.github.com/simonw/9e02d425cacb902260ec1307e0671e17"&gt;this result&lt;/a&gt;, which starts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The image is a horizontal timeline chart that represents the release dates of various software versions. The timeline is divided into years from 2023 to 2029, with each year represented by a vertical line. The chart includes a legend at the bottom, which distinguishes between different types of software versions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Legend&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mainstream Support:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4.2 (2023)&lt;/li&gt;
&lt;li&gt;5.0 (2024)&lt;/li&gt;
&lt;li&gt;5.1 (2025)&lt;/li&gt;
&lt;li&gt;5.2 (2026)&lt;/li&gt;
&lt;li&gt;6.0 (2027) [...]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://mastodon.social/@zubakskees/113221293869864076"&gt;Chris Zubak-Skees&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prince-canuma"&gt;prince-canuma&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="vision-llms"/><category term="uv"/><category term="qwen"/><category term="mlx"/><category term="prince-canuma"/><category term="ai-in-china"/></entry><entry><title>mlx-whisper</title><link href="https://simonwillison.net/2024/Aug/13/mlx-whisper/#atom-tag" rel="alternate"/><published>2024-08-13T16:15:28+00:00</published><updated>2024-08-13T16:15:28+00:00</updated><id>https://simonwillison.net/2024/Aug/13/mlx-whisper/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://pypi.org/project/mlx-whisper/"&gt;mlx-whisper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Apple's &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX framework&lt;/a&gt; for running GPU-accelerated machine learning models on Apple Silicon keeps growing &lt;a href="https://github.com/ml-explore/mlx-examples"&gt;new examples&lt;/a&gt;. &lt;code&gt;mlx-whisper&lt;/code&gt; is a Python package for running OpenAI's Whisper speech-to-text model. It's really easy to use:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install mlx-whisper
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then in a Python console:&lt;/p&gt;
&lt;div class="highlight highlight-text-python-console"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; mlx_whisper
&amp;gt;&amp;gt;&amp;gt; result &lt;span class="pl-k"&gt;=&lt;/span&gt; mlx_whisper.transcribe(
...    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/tmp/recording.mp3&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
...     path_or_hf_repo&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;mlx-community/distil-whisper-large-v3&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)
.gitattributes: 100%|███████████| 1.52k/1.52k [00:00&amp;lt;00:00, 4.46MB/s]
config.json: 100%|██████████████| 268/268 [00:00&amp;lt;00:00, 843kB/s]
README.md: 100%|████████████████| 332/332 [00:00&amp;lt;00:00, 1.95MB/s]
Fetching 4 files:  50%|████▌    | 2/4 [00:01&amp;lt;00:01,  1.26it/s]
weights.npz:  63%|██████████  ▎ | 944M/1.51G [02:41&amp;lt;02:15, 4.17MB/s]
&amp;gt;&amp;gt;&amp;gt; result.keys()
dict_keys(['text', 'segments', 'language'])
&amp;gt;&amp;gt;&amp;gt; result[&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;language&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;]
'en'
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-c1"&gt;len&lt;/span&gt;(result[&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;text&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;])
100105
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-c1"&gt;print&lt;/span&gt;(result[&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;text&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;][:&lt;span class="pl-c1"&gt;3000&lt;/span&gt;])
 This is so exciting. I have to tell you, first of all ...&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here's Activity Monitor confirming that the Python process is using the GPU for the transcription:&lt;/p&gt;
&lt;p&gt;&lt;img alt="python3.10 is using 549% CPU, 44.20 CPU time, 9 threads, 90.8% GPU, 42.53 GPU time" src="https://static.simonwillison.net/static/2024/mlx-whisper-gpu.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This example downloaded a 1.5GB model &lt;a href="https://huggingface.co/mlx-community/distil-whisper-large-v3/tree/main"&gt;from Hugging Face&lt;/a&gt; and stashed it in my &lt;code&gt;~/.cache/huggingface/hub/models--mlx-community--distil-whisper-large-v3&lt;/code&gt; folder.&lt;/p&gt;
&lt;p&gt;Calling &lt;code&gt;.transcribe(filepath)&lt;/code&gt; without the &lt;code&gt;path_or_hf_repo&lt;/code&gt; argument uses the much smaller (74.4 MB) &lt;a href="https://huggingface.co/mlx-community/whisper-tiny-mlx/tree/main"&gt;whisper-tiny-mlx&lt;/a&gt; model.&lt;/p&gt;
&lt;p&gt;A few people asked how this compares to &lt;code&gt;whisper.cpp&lt;/code&gt;. Bill Mill &lt;a href="https://notes.billmill.org/link_blog/2024/08/mlx-whisper.html"&gt;compared the two&lt;/a&gt; and found &lt;code&gt;mlx-whisper&lt;/code&gt; to be about 3x faster on an M1 Max.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: this note &lt;a href="https://twitter.com/josh_m/status/182411061314206529"&gt;from Josh Marshall&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;That '3x' comparison isn't fair; completely different models. I ran a test (14" M1 Pro) with the full (non-distilled) large-v2 model quantised to 8 bit (which is my pick), and whisper.cpp was 1m vs 1m36 for mlx-whisper.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://twitter.com/josh_m/status/1824240282554208425"&gt;Then later&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I've now done a better test, using the MLK audio, multiple runs and 2 models (distil-large-v3, large-v2-8bit)... and mlx-whisper is indeed 30-40% faster&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/awnihannun/status/1822744609241682077"&gt;@awnihannun&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/whisper"&gt;whisper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/speech-to-text"&gt;speech-to-text&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="python"/><category term="ai"/><category term="openai"/><category term="whisper"/><category term="mlx"/><category term="speech-to-text"/></entry><entry><title>Here Are All of the Apple Intelligence Features in the iOS 18.1 Developer Beta</title><link href="https://simonwillison.net/2024/Jul/30/apple-intelligence-beta/#atom-tag" rel="alternate"/><published>2024-07-30T04:22:20+00:00</published><updated>2024-07-30T04:22:20+00:00</updated><id>https://simonwillison.net/2024/Jul/30/apple-intelligence-beta/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.macrumors.com/2024/07/29/ios-18-1-apple-intelligence-features/"&gt;Here Are All of the Apple Intelligence Features in the iOS 18.1 Developer Beta&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful rundown from Juli Clover at MacRumors of the Apple Intelligence features that are available in the brand new iOS 18.1 beta, available to developer account holders with an iPhone 15 Pro or iPhone 15 Pro Max, or an Apple Silicon iPad.&lt;/p&gt;
&lt;p&gt;I've been trying this out today. It's still clearly very early, and the on-device model that powers Siri is significantly weaker than the models I've become used to over the past two years. As with old Siri, I find myself trying to figure out the sparse, undocumented incantations that reliably work for the things I might want my voice assistant to do for me.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://twitter.com/emollick/status/1818106202733060527"&gt;Ethan Mollick&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My early Siri AI experience has just underlined the fact that, while there is a lot of practical, useful things that can be done with small models, they really lack the horsepower to do anything super interesting.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://daringfireball.net/linked/2024/07/29/apple-intelligence-os-betas"&gt;John Gruber&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ethan-mollick"&gt;ethan-mollick&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ethan-mollick"/><category term="apple-intelligence"/></entry><entry><title>Quoting Apple Intelligence Foundation Language Models</title><link href="https://simonwillison.net/2024/Jul/29/apple-foundation-language-models/#atom-tag" rel="alternate"/><published>2024-07-29T22:39:33+00:00</published><updated>2024-07-29T22:39:33+00:00</updated><id>https://simonwillison.net/2024/Jul/29/apple-foundation-language-models/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://machinelearning.apple.com/papers/apple_intelligence_foundation_language_models.pdf"&gt;&lt;p&gt;The [Apple Foundation Model] pre-training dataset consists of a diverse and high quality data mixture. This includes data we have licensed from publishers, curated publicly-available or open-sourced datasets, and publicly available information crawled by our web-crawler, Applebot. We respect the right of webpages to opt out of being crawled by Applebot, using standard robots.txt directives.&lt;/p&gt;
&lt;p&gt;Given our focus on protecting user privacy, we note that no private Apple user data is included in the data mixture. Additionally, extensive efforts have been made to exclude profanity, unsafe material, and personally identifiable information from publicly available data (see Section 7 for more details). Rigorous decontamination is also performed against many common evaluation benchmarks.&lt;/p&gt;
&lt;p&gt;We find that data quality, much more so than quantity, is the key determining factor of downstream model performance.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://machinelearning.apple.com/papers/apple_intelligence_foundation_language_models.pdf"&gt;Apple Intelligence Foundation Language Models&lt;/a&gt;, PDF&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-intelligence"&gt;apple-intelligence&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="generative-ai"/><category term="training-data"/><category term="ai"/><category term="llms"/><category term="apple-intelligence"/></entry><entry><title>Python 3.12 change results in Apple App Store rejection</title><link href="https://simonwillison.net/2024/Jun/27/python-312-app-store-rejection/#atom-tag" rel="alternate"/><published>2024-06-27T23:17:15+00:00</published><updated>2024-06-27T23:17:15+00:00</updated><id>https://simonwillison.net/2024/Jun/27/python-312-app-store-rejection/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/python/cpython/issues/120522"&gt;Python 3.12 change results in Apple App Store rejection&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Such a frustrating demonstration of the very worst of Apple's opaque App Store review process. The Python 3.12 standard library &lt;code&gt;urllib&lt;/code&gt; package includes the string &lt;code&gt;itms-services&lt;/code&gt;, and after much investigation Eric Froemling managed to determine that Apple use a scanner and reject any app that has that string mentioned anywhere within their bundle.&lt;/p&gt;
&lt;p&gt;Russell Keith-Magee has &lt;a href="https://discuss.python.org/t/handling-incompatibilities-with-app-store-review-processes/56011"&gt;a thread&lt;/a&gt; on the Python forum discussing solutions. He doesn't think attempts to collaborate with Apple are likely to help:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;That definitely sounds appealing as an approach - but in this case, it’s going to be screaming into the void. There’s barely even an appeals process for app rejection on Apple’s App Store. We definitely don’t have any sort of channel to raise a complaint that we could reasonably believe would result in a change of policy.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lwn.net/SubscriberLink/979671/4fb7c1827536d1ae/"&gt;LWN&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/appstore"&gt;appstore&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/russell-keith-magee"&gt;russell-keith-magee&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="appstore"/><category term="python"/><category term="russell-keith-magee"/></entry><entry><title>Quoting John Gruber</title><link href="https://simonwillison.net/2024/Jun/24/john-gruber/#atom-tag" rel="alternate"/><published>2024-06-24T18:00:26+00:00</published><updated>2024-06-24T18:00:26+00:00</updated><id>https://simonwillison.net/2024/Jun/24/john-gruber/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://daringfireball.net/2024/06/wwdc24_apple_intelligence"&gt;&lt;p&gt;&lt;a href="https://www.apple.com/newsroom/2024/06/introducing-apple-intelligence-for-iphone-ipad-and-mac/" title="Apple News: &amp;quot;Introducing Apple Intelligence, the Personal Intelligence System That Puts Powerful Generative Models at the Core of iPhone, iPad, and Mac&amp;quot;"&gt;What Apple unveiled&lt;/a&gt; last week with &lt;a href="https://www.apple.com/apple-intelligence/"&gt;Apple Intelligence&lt;/a&gt; wasn't so much new products, but new features—a slew of them—for existing products, powered by generative AI.&lt;/p&gt;
&lt;p&gt;[...] These aren't new apps or new products. They're the most used, most important apps Apple makes, the core apps that define the Apple platforms ecosystem, and Apple is using generative AI to make them better and more useful—without, in any way, rendering them unfamiliar.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://daringfireball.net/2024/06/wwdc24_apple_intelligence"&gt;John Gruber&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="llms"/><category term="ai"/><category term="generative-ai"/><category term="john-gruber"/></entry><entry><title>Transcripts on Apple Podcasts</title><link href="https://simonwillison.net/2024/Jun/13/transcripts-on-apple-podcasts/#atom-tag" rel="alternate"/><published>2024-06-13T20:14:19+00:00</published><updated>2024-06-13T20:14:19+00:00</updated><id>https://simonwillison.net/2024/Jun/13/transcripts-on-apple-podcasts/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://podcasters.apple.com/support/5316-transcripts-on-apple-podcasts"&gt;Transcripts on Apple Podcasts&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I missed this when it launched &lt;a href="https://www.apple.com/newsroom/2024/03/apple-introduces-transcripts-for-apple-podcasts/"&gt;back in March&lt;/a&gt;: the Apple Podcasts app now features searchable transcripts, including the ability to tap on text and jump to that point in the audio.&lt;/p&gt;
&lt;p&gt;Confusingly, you can only tap to navigate using the view of the transcript that comes up when you hit the quote mark icon during playback - if you click the Transcript link from the episode listing page you get a static transcript without the navigation option.&lt;/p&gt;
&lt;p&gt;Transcripts are created automatically server-side by Apple, or podcast authors can upload their own edited transcript using Apple Podcasts Connect.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1801316274959749225"&gt;A few people on Twitter told me about this&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/accessibility"&gt;accessibility&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/podcasts"&gt;podcasts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/transcripts"&gt;transcripts&lt;/a&gt;&lt;/p&gt;



</summary><category term="accessibility"/><category term="apple"/><category term="podcasts"/><category term="transcripts"/></entry><entry><title>Quoting Ben Thompson</title><link href="https://simonwillison.net/2024/Jun/12/ben-thompson/#atom-tag" rel="alternate"/><published>2024-06-12T21:29:38+00:00</published><updated>2024-06-12T21:29:38+00:00</updated><id>https://simonwillison.net/2024/Jun/12/ben-thompson/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://stratechery.com/2024/wwdc-apple-intelligence-apple-aggregates-ai/"&gt;&lt;p&gt;Contrast [Apple Intelligence] to what OpenAI is trying to accomplish with its GPT models, or Google with Gemini, or Anthropic with Claude: those large language models are trying to incorporate all of the available public knowledge to know everything; it’s a dramatically larger and more difficult problem space, which is why they get stuff wrong. There is also a lot of stuff that they don’t know because that information is locked away — like all of the information on an iPhone.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://stratechery.com/2024/wwdc-apple-intelligence-apple-aggregates-ai/"&gt;Ben Thompson&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="gemini"/><category term="claude"/><category term="openai"/><category term="ai"/><category term="llms"/></entry><entry><title>Quoting Matt Webb</title><link href="https://simonwillison.net/2024/Jun/11/matt-webb/#atom-tag" rel="alternate"/><published>2024-06-11T17:26:00+00:00</published><updated>2024-06-11T17:26:00+00:00</updated><id>https://simonwillison.net/2024/Jun/11/matt-webb/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://interconnected.org/home/2024/06/11/siri"&gt;&lt;p&gt;Apple’s terminology distinguishes between “personal intelligence,” on-device and under their control, and “world knowledge,” which is prone to hallucinations – but is also what consumers expect when they use AI, and it’s what may replace Google search as the “point of first intent” one day soon.&lt;/p&gt;
&lt;p&gt;It’s wise for them to keep world knowledge separate, behind a very clear gate, but still engage with it. Protects the brand and hedges their bets.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://interconnected.org/home/2024/06/11/siri"&gt;Matt Webb&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/matt-webb"&gt;matt-webb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hallucinations"&gt;hallucinations&lt;/a&gt;&lt;/p&gt;



</summary><category term="apple"/><category term="llms"/><category term="ai"/><category term="generative-ai"/><category term="matt-webb"/><category term="hallucinations"/></entry></feed>