<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: facebook</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/facebook.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-08T23:07:44+00:00</updated><author><name>Simon Willison</name></author><entry><title>Meta's new model is Muse Spark, and meta.ai chat has some interesting tools</title><link href="https://simonwillison.net/2026/Apr/8/muse-spark/#atom-tag" rel="alternate"/><published>2026-04-08T23:07:44+00:00</published><updated>2026-04-08T23:07:44+00:00</updated><id>https://simonwillison.net/2026/Apr/8/muse-spark/#atom-tag</id><summary type="html">
    &lt;p&gt;Meta &lt;a href="https://ai.meta.com/blog/introducing-muse-spark-msl/"&gt;announced Muse Spark&lt;/a&gt; today, their first model release since Llama 4 &lt;a href="https://simonwillison.net/2025/Apr/5/llama-4-notes/"&gt;almost exactly a year ago&lt;/a&gt;. It's hosted, not open weights, and the API is currently "a private API preview to select users", but you can try it out today on &lt;a href="https://meta.ai/"&gt;meta.ai&lt;/a&gt; (Facebook or Instagram login required).&lt;/p&gt;
&lt;p&gt;Meta's self-reported results show it competitive with Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 on selected benchmarks, though notably behind on Terminal-Bench 2.0. Meta themselves say they "continue to invest in areas with current performance gaps, such as long-horizon agentic systems and coding workflows".&lt;/p&gt;
&lt;p&gt;The model is exposed as two different modes on &lt;a href="https://meta.ai/"&gt;meta.ai&lt;/a&gt; - "Instant" and "Thinking". Meta promise a "Contemplating" mode in the future which they say will offer much longer reasoning time and should behave more like Gemini Deep Think or GPT-5.4 Pro.&lt;/p&gt;
&lt;h5 id="a-couple-of-pelicans"&gt;A couple of pelicans&lt;/h5&gt;
&lt;p&gt;I prefer to run &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;my pelican test&lt;/a&gt; via API to avoid being influenced by any invisible system prompts, but since that's not an option I ran it against the chat UI directly.&lt;/p&gt;
&lt;p&gt;Here's the pelican I got for "Instant":&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/muse-spark-instant-pelican.jpg" alt="This is a pretty basic pelican. The bicycle is mangled, the pelican itself has a rectangular beak albeit with a hint of pouch curve below it. Not a very good one." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;And this one for "Thinking":&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/muse-spark-thinking-pelican.png" alt="Much better. Clearly a pelican. Bicycle is the correct shape. Pelican is wearing a blue cycling helmet (albeit badly rendered). Not a bad job at all." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Both SVGs were rendered inline by the Meta AI interface. Interestingly, the Instant model &lt;a href="https://gist.github.com/simonw/ea7466204f1001b7d67afcb5d0532f6f"&gt;output an SVG directly&lt;/a&gt; (with code comments) whereas the Thinking model &lt;a href="https://gist.github.com/simonw/bc911a56006ba44b0bf66abf0f872ab2"&gt;wrapped it in a thin HTML shell&lt;/a&gt; with some unused &lt;code&gt;Playables SDK v1.0.0&lt;/code&gt; JavaScript libraries.&lt;/p&gt;
&lt;p&gt;Which got me curious...&lt;/p&gt;
&lt;h5 id="poking-around-with-tools"&gt;Poking around with tools&lt;/h5&gt;
&lt;p&gt;Clearly Meta's chat harness has some tools wired up to it - at the very least it can render SVG and HTML as embedded frames, Claude Artifacts style.&lt;/p&gt;
&lt;p&gt;But what else can it do?&lt;/p&gt;
&lt;p&gt;I asked it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;what tools do you have access to?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And then:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want the exact tool names, parameter names and tool descriptions, in the original format&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It spat out detailed descriptions of 16 different tools. You can see &lt;a href="https://gist.github.com/simonw/e1ce0acd70443f93dcd6481e716c4304#response-1"&gt;the full list I got back here&lt;/a&gt; - credit to Meta for not telling their bot to hide these, since it's far less frustrating if I can get them out without having to mess around with jailbreaks.&lt;/p&gt;
&lt;p&gt;Here are highlights derived from that response:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Browse and search&lt;/strong&gt;. &lt;code&gt;browser.search&lt;/code&gt; can run a web search through an undisclosed search engine, &lt;code&gt;browser.open&lt;/code&gt; can load the full page from one of those search results and &lt;code&gt;browser.find&lt;/code&gt; can run pattern matches against the returned page content.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Meta content search&lt;/strong&gt;. &lt;code&gt;meta_1p.content_search&lt;/code&gt; can run "Semantic search across Instagram, Threads, and Facebook posts" - but only for posts the user has access to view which were created since 2025-01-01. This tool has some powerful looking parameters, including &lt;code&gt;author_ids&lt;/code&gt;, &lt;code&gt;key_celebrities&lt;/code&gt;, &lt;code&gt;commented_by_user_ids&lt;/code&gt;, and &lt;code&gt;liked_by_user_ids&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;"Catalog search"&lt;/strong&gt; - &lt;code&gt;meta_1p.meta_catalog_search&lt;/code&gt; can "Search for products in Meta's product catalog", presumably for the "Shopping" option in the Meta AI model selector.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Image generation&lt;/strong&gt;. &lt;code&gt;media.image_gen&lt;/code&gt; generates images from prompts, and "returns a CDN URL and saves the image to the sandbox". It has modes "artistic" and "realistic" and can return "square", "vertical" or "landscape" images.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;container.python_execution&lt;/strong&gt; - yes! It's &lt;a href="https://simonwillison.net/tags/code-interpreter/"&gt;Code Interpreter&lt;/a&gt;, my favourite feature of both ChatGPT and Claude.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Execute Python code in a remote sandbox environment. Python 3.9 with pandas, numpy, matplotlib, plotly, scikit-learn, PyMuPDF, Pillow, OpenCV, etc. Files persist at &lt;code&gt;/mnt/data/&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Python 3.9 &lt;a href="https://devguide.python.org/versions/"&gt;is EOL&lt;/a&gt; these days but the library collection looks useful.&lt;/p&gt;
&lt;p&gt;I prompted "use python code to confirm sqlite version and python version" and got back Python 3.9.25 and SQLite 3.34.1 (from &lt;a href="https://sqlite.org/releaselog/3_34_1.html"&gt;January 2021&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;container.create_web_artifact&lt;/strong&gt; - we saw this earlier with the HTML wrapper around the pelican: Meta AI can create HTML+JavaScript files in its container which can then be served up as secure sandboxed iframe interactives. "Set kind to &lt;code&gt;html&lt;/code&gt; for websites/apps or &lt;code&gt;svg&lt;/code&gt; for vector graphics."&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;container.download_meta_1p_media&lt;/strong&gt; is interesting: "Download media from Meta 1P sources into the sandbox. Use post_id for Instagram/Facebook/Threads posts, or &lt;code&gt;catalog_search_citation_id&lt;/code&gt; for catalog product images". So it looks like you can pull in content from other parts of Meta and then do fun Code Interpreter things to it in the sandbox.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;container.file_search&lt;/strong&gt; - "Search uploaded files in this conversation and return relevant excerpts" - I guess for digging through PDFs and similar?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tools for editing files in the container&lt;/strong&gt; - &lt;code&gt;container.view&lt;/code&gt;, &lt;code&gt;container.insert&lt;/code&gt; (with &lt;code&gt;new_str&lt;/code&gt; and &lt;code&gt;insert_line&lt;/code&gt;), &lt;code&gt;container.str_replace&lt;/code&gt;. These look similar to Claude's &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/text-editor-tool#text-editor-tool-commands"&gt;text editor tool commands&lt;/a&gt; - these are becoming a common pattern across any file-equipped agent harness.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;container.visual_grounding&lt;/strong&gt; - see below, this one is &lt;em&gt;fun&lt;/em&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;subagents.spawn_agent&lt;/strong&gt; - the &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/subagents/"&gt;sub-agent as a tool&lt;/a&gt; pattern. "Spawn an independent sub-agent for research, analysis, or delegation. It returns its final text response."&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;third_party.link_third_party_account&lt;/strong&gt; - "Initiate account linking for a third-party service", currently Google Calendar, Outlook Calendar, Gmail, or Outlook.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
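&lt;p&gt;The version check I mentioned under &lt;code&gt;container.python_execution&lt;/code&gt; is easy to reproduce in any Python environment. Here's a minimal sketch using only the standard library - it is not necessarily the code Meta AI ran, and &lt;code&gt;runtime_versions&lt;/code&gt; is a name I've made up for illustration:&lt;/p&gt;
&lt;pre&gt;
```python
import platform
import sqlite3


def runtime_versions():
    """Return the interpreter and bundled SQLite library versions as strings."""
    return {
        "python": platform.python_version(),
        # sqlite_version is the version of the SQLite C library that this
        # Python build is linked against, not the version of the sqlite3 module.
        "sqlite": sqlite3.sqlite_version,
    }


if __name__ == "__main__":
    for name, version in runtime_versions().items():
        print(f"{name}: {version}")
```
&lt;/pre&gt;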
&lt;h4 id="image-analysis-in-the-container"&gt;Image analysis in the container&lt;/h4&gt;
&lt;p&gt;Let's talk about that &lt;code&gt;container.visual_grounding&lt;/code&gt; one. Here's the description in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Description:&lt;/strong&gt; Visual grounding tool that analyzes the image, identifies and labels objects, locates regions, or counts objects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Parameters:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;object_names&lt;/strong&gt; (array, required)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;image_path&lt;/strong&gt; (string, default "")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;format_type&lt;/strong&gt; ("bbox"|"point"|"count"|null, default "bbox")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;title&lt;/strong&gt; (string|null)[string]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;So it takes a required list of object names plus an optional image path, and returns results in one of three formats: &lt;code&gt;bbox&lt;/code&gt;, &lt;code&gt;point&lt;/code&gt; or &lt;code&gt;count&lt;/code&gt;.&lt;/p&gt;
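&lt;p&gt;To make the parameter list concrete, here's a hypothetical sketch of assembling arguments for this tool in Python. The parameter names, defaults and allowed &lt;code&gt;format_type&lt;/code&gt; values come from the description above; the &lt;code&gt;build_grounding_args&lt;/code&gt; helper, the empty-list check and the dictionary shape are my own assumptions, since the actual wire format of the tool call isn't visible:&lt;/p&gt;
&lt;pre&gt;
```python
# Allowed values copied from the quoted tool description.
ALLOWED_FORMATS = ("bbox", "point", "count", None)


def build_grounding_args(object_names, image_path="", format_type="bbox", title=None):
    """Assemble an arguments dict mirroring the quoted parameter list.

    Hypothetical helper: the real tool-call format is not known.
    """
    # Assumption: "required" is treated here as "must be a non-empty list".
    if not object_names:
        raise ValueError("object_names is required and must not be empty")
    if format_type not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format_type: {format_type!r}")
    return {
        "object_names": list(object_names),
        "image_path": image_path,
        "format_type": format_type,
        "title": title,
    }


args = build_grounding_args(["raccoon whisker", "eye"], format_type="count")
print(args)
```
&lt;/pre&gt;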
&lt;p&gt;This sounded to me like Meta's &lt;a href="https://github.com/facebookresearch/segment-anything"&gt;Segment Anything&lt;/a&gt; (&lt;em&gt;I later heard that it's not: this is a native feature of the model itself, via a tool call with a custom system prompt&lt;/em&gt;). I decided to take it for a spin.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompt&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;generate a photo of a raccoon sitting on a trash can wearing trash as a hat&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Response&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here you go — king of the alley, in full regalia:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/meta-raccoon-trash-hat.jpg" alt="It's a photo of a raccoon wearing a disposable coffee cup and some newspaper and banana peel as a hat. It's pretty good" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Coffee cup crown, banana peel brim, newspaper feather. Peak raccoon fashion. Want me to make a wilder version — maybe night-vision, or wearing sunglasses too?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My guess is that image generation in Meta AI is powered by their &lt;a href="https://arxiv.org/abs/2309.15807"&gt;Emu model&lt;/a&gt;, or an updated version of it. This is the same pattern as ChatGPT and Gemini image generation, where images are generated by a tool call (here that's &lt;code&gt;media.image_gen&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;image_gen&lt;/code&gt; tool description said that it "saves the image to the sandbox", so my guess was that I'd then be able to run Python code against it. I tried:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;use python OpenCV to analyze that image and find out neat things about it&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Annoyingly it did &lt;em&gt;not&lt;/em&gt; show me the code it ran, but it did create this image:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/raccoon_analysis.png" alt="OpenCV image analysis dashboard titled &amp;quot;OpenCV Analysis: Raccoon Trash Hat&amp;quot; with four panels: top-left shows the original 1280x1280 photo of a raccoon wearing a trash-lid hat with a coffee cup and newspaper on top in an alley; top-right shows a Canny edge detection version with white edges on black background labeled &amp;quot;Canny Edges - 4.6% density&amp;quot;; bottom-left shows &amp;quot;Top 5 Dominant Colors&amp;quot; as swatches with values RGB(20, 19, 24): 35.3%, RGB(43, 43, 47): 34.2%, RGB(84, 80, 79): 18.3%, RGB(139, 135, 131): 6.3%, RGB(210, 213, 208): 5.8%; bottom-right shows &amp;quot;Color Histograms&amp;quot; line chart with Red, Green, and Blue channels plotted from 0-250 on x-axis and 0-35000 on y-axis, peaking around value 25-50." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Pretty neat! We can both generate images and then use them with Python tools.&lt;/p&gt;
&lt;p&gt;Next, the &lt;code&gt;visual_grounding&lt;/code&gt; tool. Knowing the names of the tools makes it easy to refer to them in prompts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Now try the container.visual_grounding tool&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The response (&lt;a href="https://gist.github.com/simonw/72d64b325d33e5882b9818d825551b44"&gt;full response here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Done — visual_grounding pinpointed every piece of the outfit with pixel-level precision:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/meta-grounding.jpg" alt="Visual grounding analysis of a raccoon photo titled &amp;quot;Visual Grounding: Found 8 objects&amp;quot; with colored dot annotations on the image marking each detected object: coffee cup (515,205) in yellow, banana peel (569,261) in light yellow, newspaper (399,279) in blue, right eye (461,396) in magenta, left eye (582,397) in pink, nose (514,472) in cyan, raccoon (553,548) in green, and trash can lid (550,892) in red. A legend at the bottom lists all objects with their normalized 0-1000 coordinates. Footer text reads &amp;quot;Coordinates are 0-1000 normalized. visual_grounding returned precise points for each component of the trash hat ensemble.&amp;quot; Watermark reads &amp;quot;Meta AI&amp;quot;." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;[...]&lt;/p&gt;
&lt;p&gt;Want me to run it in bbox mode next to draw boxes around the fur texture, or count mode to see how many individual whiskers it can detect?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Once again, Meta AI wrapped the results from the tool in a custom HTML visualization. That's clearly &lt;code&gt;point&lt;/code&gt; mode, but how about the others?&lt;/p&gt;
&lt;p&gt;I prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;yes draw bounding boxes, can it do masks too?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Response:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Yes — bounding boxes work great. Masks? Not directly in &lt;code&gt;visual_grounding&lt;/code&gt;; it only returns three formats: point, bbox, and count. No pixel-level segmentation masks.&lt;/p&gt;
&lt;p&gt;Here's the bbox output:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/meta-bbox.jpg" alt="Bounding box object detection visualization titled &amp;quot;Bounding Boxes (visual_grounding)&amp;quot; with subtitle &amp;quot;8 objects detected — coordinates are 0-1000 normalized&amp;quot; showing a raccoon photo with colored rectangular bounding boxes around detected objects: coffee cup in yellow [424,115,610,273] 186×158, banana peel in yellow [472,215,667,291] 195×76, newspaper in blue [305,223,548,315] 243×92, raccoon in green [270,240,892,845] 622×605, raccoon's face in purple [305,283,750,537] 445×254, right eye in magenta [442,379,489,413] 47×34, left eye in magenta [565,379,605,416] 40×37, and trash can lid in red [107,800,1000,998] 893×198. A legend at the bottom shows each object's name, coordinates, and pixel dimensions in colored cards. Watermark reads &amp;quot;Meta AI&amp;quot;." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;What you’re seeing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;raccoon&lt;/strong&gt;: big box from (270,240) to (892,845) — covers 62% of the width, basically the whole animal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;coffee cup&lt;/strong&gt;: (424,115)-(610,273) — sits highest, 158px tall in the 0-1000 space&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;banana peel&lt;/strong&gt;: (472,215)-(667,291) — overlaps the cup's base, confirming it's draped over the side&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;newspaper&lt;/strong&gt;: (305,223)-(548,315) — left ear coverage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;trash can lid&lt;/strong&gt;: (107,800)-(1000,998) — spans almost the full image width at the bottom&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;face&lt;/strong&gt;: (305,283)-(750,537) — tight crop around the mask&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eyes&lt;/strong&gt;: both ~40×37 pixel boxes, perfectly level at y=379&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Notice how the boxes nest: face is inside raccoon, eyes inside face, hat pieces overlap each other but stay above the face box. The tool is doing real object localization, not just guessing centers.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;strong&gt;About masks:&lt;/strong&gt; since visual_grounding doesn't output them, I can fake them with OpenCV in the container. Two quick options:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;GrabCut&lt;/strong&gt; — I give it the raccoon bbox, it iteratively segments fur from background and returns a binary mask&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SAM-style color threshold&lt;/strong&gt; — use the bbox plus k-means to isolate the coffee cup or banana peel&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Want me to generate an actual alpha mask for the raccoon (cut it out from the alley), or for the trash hat pieces separately? Takes about 10 seconds and I can give you a transparent PNG.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://static.simonwillison.net/static/2026/meta-bbox.html"&gt;the HTML for that bbox visualization&lt;/a&gt;.&lt;/p&gt;
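&lt;p&gt;Those 0-1000 normalized coordinates are straightforward to work with. This sketch scales a box to pixels and checks the nesting claim from the response, using boxes copied from the visualization above. The helper names are mine, and the 1280x1280 image size is an assumption taken from the OpenCV dashboard earlier:&lt;/p&gt;
&lt;pre&gt;
```python
# Boxes copied from the bbox visualization: (x1, y1, x2, y2) on a 0-1000 scale.
BOXES = {
    "raccoon": (270, 240, 892, 845),
    "face": (305, 283, 750, 537),
    "right eye": (442, 379, 489, 413),
    "left eye": (565, 379, 605, 416),
}


def to_pixels(box, width, height):
    """Scale a 0-1000 normalized box to integer pixel coordinates."""
    x1, y1, x2, y2 = box
    return (
        round(x1 * width / 1000),
        round(y1 * height / 1000),
        round(x2 * width / 1000),
        round(y2 * height / 1000),
    )


def intersection(a, b):
    """Return the overlapping region of two boxes (may be degenerate)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def contains(outer, inner):
    """True when inner lies entirely within outer (shared edges count)."""
    return intersection(outer, inner) == inner


# Assumed image size: 1280x1280, as reported by the OpenCV dashboard.
print(to_pixels(BOXES["face"], 1280, 1280))
print(contains(BOXES["raccoon"], BOXES["face"]))
print(contains(BOXES["face"], BOXES["right eye"]))
```
&lt;/pre&gt;
&lt;p&gt;The containment test treats a box as nested when intersecting it with the outer box leaves it unchanged, which avoids writing out four separate comparisons.&lt;/p&gt;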
&lt;p&gt;Finally I had it dump out the raw JSON for point mode, bbox mode and count mode. You can &lt;a href="https://gist.github.com/simonw/26bfdb22d7d73450c13f49abbf0c251a"&gt;see that all here&lt;/a&gt;. The count mode is particularly interesting - here's that in full:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[
  {
    &lt;span class="pl-ent"&gt;"object_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;raccoon whisker&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: [
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;403&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;445&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;397&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;458&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;392&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;472&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;391&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;485&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;398&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;497&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;415&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;512&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;618&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;446&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;622&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;461&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;625&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;475&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;628&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;488&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;635&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;501&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;651&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;518&lt;/span&gt;}
    ],
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;12&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"object_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;raccoon paw claw&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: [
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;406&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;845&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;424&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;847&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;441&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;842&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;473&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;833&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;596&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;830&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;635&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;844&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;660&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;843&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;679&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;838&lt;/span&gt;}
    ],
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;8&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"object_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;trash item on head&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: [
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;514&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;191&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;568&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;263&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;392&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;274&lt;/span&gt;}
    ],
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;3&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"object_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;eye&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: [
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;462&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;395&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;582&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;398&lt;/span&gt;}
    ],
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;2&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"object_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;ear&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: [
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;359&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;313&lt;/span&gt;},
      {&lt;span class="pl-ent"&gt;"x"&lt;/span&gt;: &lt;span class="pl-c1"&gt;680&lt;/span&gt;, &lt;span class="pl-ent"&gt;"y"&lt;/span&gt;: &lt;span class="pl-c1"&gt;294&lt;/span&gt;}
    ],
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;2&lt;/span&gt;
  }
]&lt;/pre&gt;&lt;/div&gt;
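&lt;p&gt;One quick sanity check on output like this is whether each reported &lt;code&gt;count&lt;/code&gt; matches the number of points returned. A minimal sketch, using an abridged copy of the JSON above (three of the five entries) and helper names I've made up:&lt;/p&gt;
&lt;pre&gt;
```python
import json

# Abridged copy of the count-mode output above (three of the five entries).
RAW = """
[
  {"object_name": "trash item on head",
   "points": [{"x": 514, "y": 191}, {"x": 568, "y": 263}, {"x": 392, "y": 274}],
   "count": 3},
  {"object_name": "eye",
   "points": [{"x": 462, "y": 395}, {"x": 582, "y": 398}],
   "count": 2},
  {"object_name": "ear",
   "points": [{"x": 359, "y": 313}, {"x": 680, "y": 294}],
   "count": 2}
]
"""


def summarize(entries):
    """Map each object name to (reported count, number of points returned)."""
    return {e["object_name"]: (e["count"], len(e["points"])) for e in entries}


def mismatches(entries):
    """List the object names whose reported count differs from their points."""
    return [name for name, (count, n) in summarize(entries).items() if count != n]


entries = json.loads(RAW)
print(summarize(entries))
print(mismatches(entries))
```
&lt;/pre&gt;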
&lt;p&gt;So Meta AI has the ability to count a raccoon's whiskers baked into the default set of tools.&lt;/p&gt;
&lt;p&gt;Which means... it can count pelicans too!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/meta-count-pelicans.jpg" alt="Screenshot of a Meta AI conversation where the user uploaded a photo of pelicans on rocks and asked &amp;quot;Count the pelicans&amp;quot;. The response reads: &amp;quot;That rocky haul-out is packed — I count 25 brown pelicans in your photo. I went bird by bird, including the ones tucked behind rocks, heads turned away, or just peeking out at the edges. They're all California brown pelicans, with those pale yellow heads and long bills, clustered on the riprap right at the waterline. To make it easy to verify, I've numbered each one on the image:&amp;quot; followed by an annotated version of the photo with red numbered circles (1–25) placed on each pelican. The annotated image has a downward chevron and an &amp;quot;Open&amp;quot; button at the bottom right." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Here's that overlay &lt;a href="https://static.simonwillison.net/static/2026/meta-count-pelicans.html"&gt;exported as HTML&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update&lt;/strong&gt;: Meta's &lt;a href="https://twitter.com/jacktripleu/status/2042050863800447387"&gt;Jack Wu confirms&lt;/a&gt; that these tools are part of the new harness they launched alongside the new model.&lt;/em&gt;&lt;/p&gt;

&lt;h4 id="maybe-open-weights-in-the-future-"&gt;Maybe open weights in the future?&lt;/h4&gt;
&lt;p&gt;On Twitter &lt;a href="https://twitter.com/alexandr_wang/status/2041909388852748717"&gt;Alexandr Wang said&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;this is step one. bigger models are already in development with infrastructure scaling to match. private api preview open to select partners today, with plans to open-source future versions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really hope they do go back to open-sourcing their models. Llama 3.1/3.2/3.3 were excellent laptop-scale model families, and the introductory blog post for Muse Spark had this to say about efficiency:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...] we can reach the same capabilities with over an order of magnitude less compute than our previous model, Llama 4 Maverick. This improvement also makes Muse Spark significantly more efficient than the leading base models available for comparison.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So are Meta back in the frontier model game? &lt;a href="https://twitter.com/ArtificialAnlys/status/2041913043379220801"&gt;Artificial Analysis&lt;/a&gt; think so - they scored Muse Spark at 52, "behind only Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6". Last year's Llama 4 Maverick and Scout scored 18 and 13 respectively.&lt;/p&gt;
&lt;p&gt;I'm waiting for API access - while the tool collection on &lt;a href="https://meta.ai/"&gt;meta.ai&lt;/a&gt; is quite strong, the real test of a model like this is still what we can build on top of it.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/code-interpreter"&gt;code-interpreter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="facebook"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="code-interpreter"/><category term="llm-tool-use"/><category term="meta"/><category term="pelican-riding-a-bicycle"/><category term="llm-reasoning"/><category term="llm-release"/></entry><entry><title>Quoting Mark Zuckerberg</title><link href="https://simonwillison.net/2025/Jan/30/mark-zuckerberg/#atom-tag" rel="alternate"/><published>2025-01-30T13:41:49+00:00</published><updated>2025-01-30T13:41:49+00:00</updated><id>https://simonwillison.net/2025/Jan/30/mark-zuckerberg/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://m.facebook.com/story.php?story_fbid=pfbid02oRRTPrY1mvbqBZT4QueimeBrKcVXG4ySxFscRLiEU6QtGxbLi9U4TBojiC9aa19fl&amp;amp;id=4&amp;amp;mibextid=wwXIfr"&gt;&lt;p&gt;Llama 4 is making great progress in training. Llama 4 mini is done with pre-training and our reasoning models and larger model are looking good too. Our goal with Llama 3 was to make open source competitive with closed models, and our goal for Llama 4 is to lead. Llama 4 will be natively multimodal -- it's an omni-model -- and it will have agentic capabilities, so it's going to be novel and it's going to unlock a lot of new use cases.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://m.facebook.com/story.php?story_fbid=pfbid02oRRTPrY1mvbqBZT4QueimeBrKcVXG4ySxFscRLiEU6QtGxbLi9U4TBojiC9aa19fl&amp;amp;id=4&amp;amp;mibextid=wwXIfr"&gt;Mark Zuckerberg&lt;/a&gt;, on Meta's quarterly earnings report&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-zuckerberg"&gt;mark-zuckerberg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/multi-modal-output"&gt;multi-modal-output&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;&lt;/p&gt;



</summary><category term="vision-llms"/><category term="llama"/><category term="ai"/><category term="llms"/><category term="meta"/><category term="generative-ai"/><category term="facebook"/><category term="mark-zuckerberg"/><category term="multi-modal-output"/><category term="llm-reasoning"/></entry><entry><title>In Leak, Facebook Partner Brags About Listening to Your Phone’s Microphone to Serve Ads for Stuff You Mention</title><link href="https://simonwillison.net/2024/Sep/2/facebook-cmg/#atom-tag" rel="alternate"/><published>2024-09-02T23:56:44+00:00</published><updated>2024-09-02T23:56:44+00:00</updated><id>https://simonwillison.net/2024/Sep/2/facebook-cmg/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://futurism.com/the-byte/facebook-partner-phones-listening-microphone"&gt;In Leak, Facebook Partner Brags About Listening to Your Phone’s Microphone to Serve Ads for Stuff You Mention&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;small&gt;(I've repurposed some of my &lt;a href="https://lobste.rs/s/mf7guc/leak_facebook_partner_brags_about"&gt;comments on Lobsters&lt;/a&gt; into this commentary on this article. See also &lt;a href="https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/"&gt;I still don’t think companies serve you ads based on spying through your microphone&lt;/a&gt;.)&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;Which is more likely?&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;All of the conspiracy theories are real! The industry managed to keep the evidence from us for decades, but finally a marketing agency of a local newspaper chain has blown the lid off the whole thing, in a bunch of blog posts and PDFs and on a podcast.&lt;/li&gt;
&lt;li&gt;Everyone believed that their phone was listening to them even when it wasn’t. The marketing agency of a local newspaper chain were the first group to be caught taking advantage of that widespread paranoia and using it to try and dupe people into spending money with them, despite the tech not actually working like that.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;My money continues to be on number 2.&lt;/p&gt;
&lt;p&gt;Here’s their pitch deck. My “this is a scam” sense is vibrating like crazy reading it: &lt;a href="https://www.documentcloud.org/documents/25051283-cmg-pitch-deck-on-voice-data-advertising-active-listening"&gt;CMG Pitch Deck on Voice-Data Advertising 'Active Listening'&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It does not read to me like the deck of a company that has actually shipped their own app that tracks audio and uses it for even the most basic version of ad targeting.&lt;/p&gt;
&lt;p&gt;They give the game away on the last two slides:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Prep work:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create buyer personas by uploading past consumer data into the platform&lt;/li&gt;
&lt;li&gt;Identify top performing keywords relative to your products and services by
analyzing keyword data and past ad campaigns&lt;/li&gt;
&lt;li&gt;Ensure tracking is set up via a tracking pixel placed on your site or landing
page&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Now that preparation is done:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Active listening begins in your target geo and buyer behavior is detected
across 470+ data sources […]&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Our technology analyzes over 1.9 trillion behaviors daily and collects opt-in customer behavior data from hundreds of popular websites that offer top display, video platforms, social applications, and mobile marketplaces that allow laser-focused media buying.&lt;/p&gt;
&lt;p&gt;Sources include: Google, LinkedIn, Facebook, Amazon and many more&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That’s not describing anything ground-breaking or different. That’s how every ad targeting platform works: you upload a bunch of “past consumer data”, identify top keywords and set up a tracking pixel.&lt;/p&gt;
&lt;p&gt;I think &lt;strong&gt;active listening&lt;/strong&gt; is the term that the team came up with for “something that sounds fancy but really just means the way ad targeting platforms work already”. Then they got over-excited about the new metaphor and added those first couple of slides that talk about “voice data”, without really understanding how the tech works or what kind of a shitstorm that could kick off when people who DID understand technology started paying attention to their marketing.&lt;/p&gt;
&lt;p&gt;TechDirt's story &lt;a href="https://www.techdirt.com/2024/08/29/cox-caught-again-bragging-it-spies-on-users-with-embedded-device-microphones-to-sell-ads/"&gt;Cox Media Group Brags It Spies On Users With Device Microphones To Sell Targeted Ads, But It’s Not Clear They Actually Can&lt;/a&gt; included a quote with a clarification from Cox Media Group:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;CMG businesses do not listen to any conversations or have access to anything beyond a third-party aggregated, anonymized and fully encrypted data set that can be used for ad placement. We regret any confusion and we are committed to ensuring our marketing is clear and transparent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="not-ok"&gt;Why I don't buy the argument that it's OK for people to believe this&lt;/h4&gt;

&lt;p&gt;I've seen variants of this argument before: phones do creepy things to target ads, but it’s not exactly “listen through your microphone” - but there’s no harm in people believing that if it helps them understand that there’s creepy stuff going on generally.&lt;/p&gt;
&lt;p&gt;I don’t buy that. Privacy is important. People who are sufficiently engaged need to be able to understand exactly what’s going on, so they can e.g. campaign for legislators to rein in the most egregious abuses.&lt;/p&gt;
&lt;p&gt;I think it’s harmful letting people continue to believe things about privacy that are not true, when we should instead be helping them understand the things that &lt;em&gt;are&lt;/em&gt; true.&lt;/p&gt;
&lt;p&gt;This discussion thread is full of technically minded, engaged people who still believe an inaccurate version of what their devices are doing. Those are the people that need to have an accurate understanding, because those are the people that can help explain it to others and can hopefully drive meaningful change.&lt;/p&gt;
&lt;p&gt;This is such a damaging conspiracy theory.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It’s causing some people to stop trusting their most important piece of personal technology: their phone.&lt;/li&gt;
&lt;li&gt;We risk people ignoring REAL threats because they’ve already decided to tolerate made up ones.&lt;/li&gt;
&lt;li&gt;If people believe this and see society doing nothing about it, that’s horrible. That leads to a cynical “nothing can be fixed, I guess we will just let bad people get away with it” attitude. People need to believe that humanity can prevent this kind of abuse from happening.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The fact that nobody has successfully produced an experiment showing that this is happening is one of the main reasons I don’t believe it to be happening.&lt;/p&gt;
&lt;p&gt;It’s like James Randi’s &lt;a href="https://en.wikipedia.org/wiki/One_Million_Dollar_Paranormal_Challenge"&gt;One Million Dollar Paranormal Challenge&lt;/a&gt; - the very fact that nobody has been able to demonstrate it is enough for me not to believe in it.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/conspiracy"&gt;conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microphone-ads-conspiracy"&gt;microphone-ads-conspiracy&lt;/a&gt;&lt;/p&gt;



</summary><category term="conspiracy"/><category term="facebook"/><category term="privacy"/><category term="microphone-ads-conspiracy"/></entry><entry><title>Where Facebook's AI Slop Comes From</title><link href="https://simonwillison.net/2024/Aug/10/where-facebooks-ai-slop-comes-from/#atom-tag" rel="alternate"/><published>2024-08-10T00:26:22+00:00</published><updated>2024-08-10T00:26:22+00:00</updated><id>https://simonwillison.net/2024/Aug/10/where-facebooks-ai-slop-comes-from/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.404media.co/where-facebooks-ai-slop-comes-from/"&gt;Where Facebook&amp;#x27;s AI Slop Comes From&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Jason Koebler continues to provide the most insightful coverage of Facebook's weird ongoing problem with AI slop (&lt;a href="https://simonwillison.net/2024/Jul/15/facebook-is-the-zombie-internet/"&gt;previously&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Who's creating this stuff? It looks to primarily come from individuals in countries like India and the Philippines, inspired by get-rich-quick YouTube influencers, who are gaming Facebook's &lt;a href="https://creators.facebook.com/programs/bonuses/"&gt;Creator Bonus Program&lt;/a&gt; and flooding the platform with AI-generated images.&lt;/p&gt;
&lt;p&gt;Jason highlights this &lt;a href="https://www.youtube.com/watch?v=XtrlZwG8txc"&gt;YouTube video by YT Gyan Abhishek&lt;/a&gt; (136,000 subscribers) and describes it like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;He pauses on another image of a man being eaten by bugs. “They are getting so many likes,” he says. “They got 700 likes within 2-4 hours. They must have earned $100 from just this one photo. Facebook now pays you $100 for 1,000 likes … you must be wondering where you can get these images from. Don’t worry. I’ll show you how to create images with the help of AI.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That video is in Hindi but you can request auto-translated English subtitles in the YouTube video settings. The image generator demonstrated in the video is &lt;a href="https://ideogram.ai/"&gt;Ideogram&lt;/a&gt;, which offers a free plan. (Here's &lt;a href="https://gist.github.com/simonw/a96f1b74b192528cdef7098fcbb5f84c?permalink_comment_id=5149573#gistcomment-5149573"&gt;pelicans having a tea party on a yacht&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a YouTube influencer demonstrating Ideogram generating &amp;quot;BMPoor people with thin body&amp;quot; - caption reads along with this you can also see this image" src="https://static.simonwillison.net/static/2024/ideogram-ai-images.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Jason's reporting here runs deep - he goes as far as buying FewFeed, dedicated software for scraping and automating Facebook, and running his own (unsuccessful) page using prompts from YouTube tutorials like:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;an elderly woman celebrating her 104th birthday with birthday cake realistic family realistic jesus celebrating with her&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I signed up for a $10/month 404 Media subscription to read this and it was absolutely worth the money.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/slop"&gt;slop&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jason-koebler"&gt;jason-koebler&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="ai"/><category term="slop"/><category term="jason-koebler"/><category term="meta"/><category term="ai-ethics"/><category term="ai-misuse"/></entry><entry><title>Quoting Mark Zuckerberg</title><link href="https://simonwillison.net/2024/Jul/23/mark-zuckerberg/#atom-tag" rel="alternate"/><published>2024-07-23T16:52:07+00:00</published><updated>2024-07-23T16:52:07+00:00</updated><id>https://simonwillison.net/2024/Jul/23/mark-zuckerberg/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/"&gt;&lt;p&gt;I believe the Llama 3.1 release will be an inflection point in the industry where most developers begin to primarily use open source, and I expect that approach to only grow from here.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/"&gt;Mark Zuckerberg&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-zuckerberg"&gt;mark-zuckerberg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;&lt;/p&gt;



</summary><category term="meta"/><category term="open-source"/><category term="generative-ai"/><category term="facebook"/><category term="mark-zuckerberg"/><category term="ai"/><category term="llms"/><category term="llama"/></entry><entry><title>Introducing Llama 3.1: Our most capable models to date</title><link href="https://simonwillison.net/2024/Jul/23/introducing-llama-31/#atom-tag" rel="alternate"/><published>2024-07-23T15:40:47+00:00</published><updated>2024-07-23T15:40:47+00:00</updated><id>https://simonwillison.net/2024/Jul/23/introducing-llama-31/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.meta.com/blog/meta-llama-3-1/"&gt;Introducing Llama 3.1: Our most capable models to date&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We've been waiting for the largest release of the Llama 3 model for a few months, and now we're getting a whole new model family instead.&lt;/p&gt;
&lt;p&gt;Meta are calling Llama 3.1 405B "the first frontier-level open source AI model" and it really is benchmarking in that GPT-4+ class, competitive with both GPT-4o and Claude 3.5 Sonnet.&lt;/p&gt;
&lt;p&gt;I'm equally excited by the new 8B and 70B 3.1 models - both of which now support a 128,000 token context and benchmark significantly higher than their Llama 3 equivalents. Same-sized models getting more powerful and capable is a very reassuring trend. I expect the 8B model (or variants of it) to run comfortably on an array of consumer hardware, and I've run a 70B model on a 64GB M2 in the past.&lt;/p&gt;
&lt;p&gt;The 405B model can at least be run on a single server-class node:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To support large-scale production inference for a model at the scale of the 405B, we quantized our models from 16-bit (BF16) to 8-bit (FP8) numerics, effectively lowering the compute requirements needed and allowing the model to run within a single server node.&lt;/p&gt;
&lt;/blockquote&gt;
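&lt;p&gt;To make the arithmetic behind that quote concrete, here is a rough sketch of weight memory at each precision. It counts weights only and ignores KV cache, activations and runtime overhead. The 8 x 80 GB node size is an illustrative assumption, not a figure from Meta's post.&lt;/p&gt;

```python
# Rough weight-memory estimate for a 405B-parameter model.
# Assumption: weights only (no KV cache, activations or runtime overhead).
PARAMS = 405e9
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}

# Hypothetical node: 8 accelerators with 80 GB each (illustrative, not from the source).
NODE_GB = 8 * 80


def weight_gb(params, bytes_per_param):
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    gb = weight_gb(PARAMS, size)
    # Negative headroom means the weights alone exceed the hypothetical node.
    print(f"{name}: {gb:.0f} GB of weights, headroom {NODE_GB - gb:.0f} GB")
```

&lt;p&gt;Under those assumptions the BF16 weights come to about 810 GB and the FP8 weights to about 405 GB, which is the kind of gap that could decide whether a single node is enough.&lt;/p&gt;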
&lt;p&gt;Meta also made a significant &lt;a href="https://twitter.com/aiatmeta/status/1815766335219249513"&gt;change to the license&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;We’ve also updated our license&lt;/strong&gt; to allow developers to use the outputs from Llama models — including 405B — to improve other models for the first time.&lt;/p&gt;
&lt;p&gt;We’re excited about how this will &lt;strong&gt;enable new advancements in the field through synthetic data generation and model distillation workflows&lt;/strong&gt;, capabilities that have never been achieved at this scale in open source.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm really pleased to see this. Using models to help improve other models has been a crucial technique in LLM research for over a year now, especially for fine-tuned community models released on Hugging Face. Researchers have mostly been ignoring this restriction, so it's reassuring to see the uncertainty around that finally cleared up.&lt;/p&gt;
&lt;p&gt;Lots more details about the new models in the paper &lt;a href="https://ai.meta.com/research/publications/the-llama-3-herd-of-models/"&gt;The Llama 3 Herd of Models&lt;/a&gt; including this somewhat opaque note about the 15 trillion token training data:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Our final data mix contains roughly 50% of tokens corresponding to general knowledge, 25% of mathematical and reasoning tokens, 17% code tokens, and 8% multilingual tokens.&lt;/p&gt;
&lt;/blockquote&gt;
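&lt;p&gt;As a quick sketch of what those percentages could mean in absolute terms, this applies them to the 15 trillion token total. The paper says "roughly", so treat the results as approximations; the category labels are shortened from the quote.&lt;/p&gt;

```python
# Approximate token counts implied by the quoted data mix.
# Assumption: the "roughly" percentages apply to a 15 trillion token total.
TOTAL_TOKENS = 15e12
DATA_MIX = {
    "general knowledge": 0.50,
    "math and reasoning": 0.25,
    "code": 0.17,
    "multilingual": 0.08,
}


def tokens_by_category(total, mix):
    """Split a total token count according to fractional shares."""
    return {name: total * share for name, share in mix.items()}


for name, count in tokens_by_category(TOTAL_TOKENS, DATA_MIX).items():
    print(f"{name}: about {count / 1e12:.2f} trillion tokens")
```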
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: I got the Llama 3.1 8B Instruct model working with my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool via a new plugin, &lt;a href="https://simonwillison.net/2024/Jul/23/llm-gguf/"&gt;llm-gguf&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="llama"/><category term="llms"/><category term="meta"/><category term="llm-release"/></entry><entry><title>Facebook Is the 'Zombie Internet'</title><link href="https://simonwillison.net/2024/Jul/15/facebook-is-the-zombie-internet/#atom-tag" rel="alternate"/><published>2024-07-15T18:56:54+00:00</published><updated>2024-07-15T18:56:54+00:00</updated><id>https://simonwillison.net/2024/Jul/15/facebook-is-the-zombie-internet/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.404media.co/email/24eb6cea-6fa6-4b98-a2d2-8c4ba33d6c04/"&gt;Facebook Is the &amp;#x27;Zombie Internet&amp;#x27;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Ever since Facebook started to become infested with weird AI-generated images of shrimp Jesus - with thousands of comments and likes - I've been wondering how much of that activity is real humans as opposed to yet more bots.&lt;/p&gt;
&lt;p&gt;Jason Koebler has been on the Facebook AI slop beat for a while. In this superb piece of online investigative reporting he dives deep into an attempt to answer that question, using multiple Facebook burner accounts and contacting more than 300 users who have commented on that kind of image.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I endlessly tried to talk to people who commented on these images, but I had no luck at all. Over the course of several months, I messaged 300 people who commented on bizarre AI-generated images, which I could only do 20 or so at a time before Facebook stopped letting me send messages for several hours. I also commented on dozens of images myself, asking for any human who had also commented on the image to respond to me. Across those hundreds of messages, I got four total responses.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Jason also talked to Khan Schoolcraft, a moderator of the &lt;a href="https://www.facebook.com/groups/958294675403424/"&gt;Um, isn’t that AI?&lt;/a&gt; group, who said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In my experience, the supermajority of engagement on viral AI Facebook pages is just as artificially-generated as the content they publish. When exploring their comment sections, one will often see hundreds of bot-like comments interspersed with a few ‘real’ people sounding the alarm to no avail. [...]&lt;/p&gt;
&lt;p&gt;Whether it's a child transforming into a water bottle cyborg, a three-armed flight attendant rescuing Tiger Jesus from a muddy plane crash, or a hybrid human-monkey baby being stung to death by giant hornets, all tend to have copy+pasted captions, reactions &amp;amp; comments which usually make no sense in the observed context.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/slop"&gt;slop&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jason-koebler"&gt;jason-koebler&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="generative-ai"/><category term="slop"/><category term="jason-koebler"/></entry><entry><title>AI chatbots are intruding into online communities where people are trying to connect with other humans</title><link href="https://simonwillison.net/2024/Jun/9/ai-chatbots-are-intruding-into-online-communities-where-people-a/#atom-tag" rel="alternate"/><published>2024-06-09T03:14:26+00:00</published><updated>2024-06-09T03:14:26+00:00</updated><id>https://simonwillison.net/2024/Jun/9/ai-chatbots-are-intruding-into-online-communities-where-people-a/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://theconversation.com/ai-chatbots-are-intruding-into-online-communities-where-people-are-trying-to-connect-with-other-humans-229473"&gt;AI chatbots are intruding into online communities where people are trying to connect with other humans&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This thing where Facebook are experimenting with AI bots that reply in a group when someone "asks a question in a post and no one responds within an hour" is absolute grade A slop - unwanted, unreviewed AI generated text that makes the internet a worse place.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.404media.co/facebooks-ai-told-parents-group-it-has-a-disabled-child/"&gt;example&lt;/a&gt; where Meta AI replied in an education forum saying "I have a child who is also 2e and has been part of the NYC G&amp;amp;T program" is inexcusable.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://mastodon.social/@dangillmor/112584060245656436"&gt;@dangillmor&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/slop"&gt;slop&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="slop"/><category term="ai-ethics"/><category term="ai-misuse"/></entry><entry><title>Man caught in scam after AI told him fake Facebook customer support number was legitimate</title><link href="https://simonwillison.net/2024/May/31/caught-in-scam/#atom-tag" rel="alternate"/><published>2024-05-31T16:53:33+00:00</published><updated>2024-05-31T16:53:33+00:00</updated><id>https://simonwillison.net/2024/May/31/caught-in-scam/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.cbc.ca/news/canada/manitoba/facebook-customer-support-scam-1.7219581"&gt;Man caught in scam after AI told him fake Facebook customer support number was legitimate&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This one illustrates a nasty edge-case if you ship any kind of chatbot with your company's name attached to it.&lt;/p&gt;
&lt;p&gt;The scam victim here searched Google for a customer support phone number for Facebook, and came across a suspicious looking phone number.&lt;/p&gt;
&lt;p&gt;He pasted that number into the Meta AI chatbot in Facebook Messenger and asked "is it a Facebook help line?" - and Meta AI answered:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The phone number 1-xxx-xxx-xxxx is indeed a legitimate Facebook support number. Meta, the company that owns Facebook, lists this number as a contact for Meta Support, which includes support for Facebook, Instagram, and more.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was a total hallucination, and the phone number was for a scammer who ran a classic "download this app so I can help you" attack.&lt;/p&gt;
&lt;p&gt;It doesn't matter how many disclaimers you add to a chatbot: this kind of misunderstanding from users is inevitable.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=40536860"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scams"&gt;scams&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hallucinations"&gt;hallucinations&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="scams"/><category term="ai"/><category term="llms"/><category term="ai-ethics"/><category term="hallucinations"/></entry><entry><title>Quoting Meta AI bot</title><link href="https://simonwillison.net/2024/Apr/18/meta-ai-bot/#atom-tag" rel="alternate"/><published>2024-04-18T03:34:15+00:00</published><updated>2024-04-18T03:34:15+00:00</updated><id>https://simonwillison.net/2024/Apr/18/meta-ai-bot/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/korolova/status/1780450925028548821"&gt;&lt;p&gt;I have a child who is also 2e and has been part of the NYC G&amp;amp;T program. We've had a positive experience with the citywide program, specifically with the program at The Anderson School.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/korolova/status/1780450925028548821"&gt;Meta AI bot&lt;/a&gt;, answering a question on a forum&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-misuse"/><category term="ethics"/><category term="generative-ai"/><category term="facebook"/><category term="ai"/><category term="llms"/><category term="meta"/><category term="ai-ethics"/></entry><entry><title>Threads has entered the fediverse</title><link href="https://simonwillison.net/2024/Mar/22/threads-has-entered-the-fediverse/#atom-tag" rel="alternate"/><published>2024-03-22T20:15:20+00:00</published><updated>2024-03-22T20:15:20+00:00</updated><id>https://simonwillison.net/2024/Mar/22/threads-has-entered-the-fediverse/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://engineering.fb.com/2024/03/21/networking-traffic/threads-has-entered-the-fediverse/"&gt;Threads has entered the fediverse&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Threads users with public profiles in certain countries can now turn on a setting which makes their posts available in the fediverse—so users of ActivityPub systems such as Mastodon can follow their accounts to subscribe to their posts.&lt;/p&gt;

&lt;p&gt;It’s only a partial integration at the moment: Threads users can’t themselves follow accounts from other providers yet, and their notifications will show them likes but not boosts or replies: “For now, people who want to see replies on their posts on other fediverse servers will have to visit those servers directly.”&lt;/p&gt;

&lt;p&gt;Depending on how you count, Mastodon has around 9m user accounts of which 1m are active. Threads claims more than 130m active monthly users. The Threads team are developing these features cautiously which is reassuring to see—a clumsy or thoughtless integration could cause all sorts of damage just from the sheer scale of their service.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://anderegg.ca/2024/03/22/poking-at-threads-in-the-fediverse"&gt;Gavin Anderegg&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/threads"&gt;threads&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/activitypub"&gt;activitypub&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fediverse"&gt;fediverse&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="threads"/><category term="mastodon"/><category term="activitypub"/><category term="fediverse"/></entry><entry><title>Quoting Kieran McCarthy</title><link href="https://simonwillison.net/2024/Feb/28/kieran-mccarthy/#atom-tag" rel="alternate"/><published>2024-02-28T15:15:13+00:00</published><updated>2024-02-28T15:15:13+00:00</updated><id>https://simonwillison.net/2024/Feb/28/kieran-mccarthy/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://blog.ericgoldman.org/archives/2024/02/facebook-drops-anti-scraping-lawsuit-against-bright-data-guest-blog-post.htm"&gt;&lt;p&gt;For the last few years, Meta has had a team of attorneys dedicated to policing unauthorized forms of scraping and data collection on Meta platforms. The decision not to further pursue these claims seems as close to waving the white flag as you can get against these kinds of companies. But why? [...]&lt;/p&gt;
&lt;p&gt;In short, I think Meta cares more about access to large volumes of data and AI than it does about outsiders scraping their public data now. My hunch is that they know that any success in anti-scraping cases can be thrown back at them in their own attempts to build AI training databases and LLMs. And they care more about the latter than the former.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://blog.ericgoldman.org/archives/2024/02/facebook-drops-anti-scraping-lawsuit-against-bright-data-guest-blog-post.htm"&gt;Kieran McCarthy&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="scraping"/><category term="ai"/><category term="llms"/><category term="training-data"/></entry><entry><title>All you need is Wide Events, not “Metrics, Logs and Traces”</title><link href="https://simonwillison.net/2024/Feb/27/all-you-need-is-wide-events-not-metrics-logs-and-traces/#atom-tag" rel="alternate"/><published>2024-02-27T22:57:14+00:00</published><updated>2024-02-27T22:57:14+00:00</updated><id>https://simonwillison.net/2024/Feb/27/all-you-need-is-wide-events-not-metrics-logs-and-traces/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://isburmistrov.substack.com/p/all-you-need-is-wide-events-not-metrics"&gt;All you need is Wide Events, not “Metrics, Logs and Traces”&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’ve heard great things about Meta’s internal observability platform Scuba. Here’s an explanation from ex-Meta engineer Ivan Burmistrov describing the value it provides and comparing it to the widely used OpenTelemetry stack.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=39529775"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observability"&gt;observability&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="observability"/></entry><entry><title>Facebook Is Being Overrun With Stolen, AI-Generated Images That People Think Are Real</title><link href="https://simonwillison.net/2023/Dec/19/facebook-is-being-overrun/#atom-tag" rel="alternate"/><published>2023-12-19T02:01:35+00:00</published><updated>2023-12-19T02:01:35+00:00</updated><id>https://simonwillison.net/2023/Dec/19/facebook-is-being-overrun/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.404media.co/facebook-is-being-overrun-with-stolen-ai-generated-images-that-people-think-are-real/"&gt;Facebook Is Being Overrun With Stolen, AI-Generated Images That People Think Are Real&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Excellent investigative piece by Jason Koebler digging into the concerning trend of Facebook engagement farming accounts that take popular aspirational images and use generative AI to recreate hundreds of variants of them, which then gather hundreds of comments from people who have no idea that the images are fake.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jason-koebler"&gt;jason-koebler&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="ai"/><category term="generative-ai"/><category term="jason-koebler"/><category term="ai-ethics"/><category term="ai-misuse"/></entry><entry><title>Meta/Threads Interoperating in the Fediverse Data Dialogue Meeting yesterday</title><link href="https://simonwillison.net/2023/Dec/12/metathreads/#atom-tag" rel="alternate"/><published>2023-12-12T01:05:32+00:00</published><updated>2023-12-12T01:05:32+00:00</updated><id>https://simonwillison.net/2023/Dec/12/metathreads/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://reb00ted.org/tech/20231208-meta-threads-data-dialogue/"&gt;Meta/Threads Interoperating in the Fediverse Data Dialogue Meeting yesterday&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Johannes Ernst reports from a recent meeting hosted by Meta aimed at bringing together staff from Meta’s Threads social media platform with representatives from the Fediverse.&lt;/p&gt;

&lt;p&gt;Meta have previously announced an intention for Threads to join the Fediverse. It sounds like they’re being extremely thoughtful about how to go about this.&lt;/p&gt;

&lt;p&gt;Two points that stood out for me:&lt;/p&gt;

&lt;p&gt;“Rolling out a large node – like Threads will be – in a complex, distributed system that’s as decentralized and heterogeneous as the Fediverse is not something anybody really has done before.”&lt;/p&gt;

&lt;p&gt;And:&lt;/p&gt;

&lt;p&gt;“When we think of privacy risks when Meta connects to the Fediverse, we usually think of what happens to data that moves from today’s Fediverse into Meta. I didn’t realize the opposite is also quite a challenge (personal data posted to Threads, making its way into the Fediverse) for an organization as heavily monitored by regulators around the world as is Meta.”&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/social-media"&gt;social-media&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fediverse"&gt;fediverse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="social-media"/><category term="mastodon"/><category term="fediverse"/><category term="meta"/></entry><entry><title>Announcing Purple Llama: Towards open trust and safety in the new world of generative AI</title><link href="https://simonwillison.net/2023/Dec/8/purple-llama/#atom-tag" rel="alternate"/><published>2023-12-08T06:36:26+00:00</published><updated>2023-12-08T06:36:26+00:00</updated><id>https://simonwillison.net/2023/Dec/8/purple-llama/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai/"&gt;Announcing Purple Llama: Towards open trust and safety in the new world of generative AI&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New from Meta AI, Purple Llama is “an umbrella project featuring open trust and safety tools and evaluations meant to level the playing field for developers to responsibly deploy generative AI models and experiences”.&lt;/p&gt;

&lt;p&gt;There are three components: a 27 page “Responsible Use Guide”, a new open model called Llama Guard and CyberSec Eval, “a set of cybersecurity safety evaluations benchmarks for LLMs”.&lt;/p&gt;

&lt;p&gt;Disappointingly, despite this being an initiative around trustworthy LLM development, prompt injection is mentioned exactly once, in the Responsible Use Guide, with an incorrect description of it as involving “attempts to circumvent content restrictions”!&lt;/p&gt;

&lt;p&gt;The Llama Guard model is interesting: it’s a fine-tune of Llama 2 7B designed to help spot “toxic” content in input or output from a model, effectively an openly released alternative to OpenAI’s moderation API endpoint.&lt;/p&gt;

&lt;p&gt;The CyberSec Eval benchmarks focus on two concepts: generation of insecure code, and preventing models from assisting attackers in generating new attacks. I don’t think either of those is anywhere near as important as prompt injection mitigation.&lt;/p&gt;

&lt;p&gt;My hunch is that the reason prompt injection didn’t get much coverage in this is that, like the rest of us, Meta’s AI research teams have no idea how to fix it yet!&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1732785316585771489"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="meta"/><category term="llm-release"/></entry><entry><title>Seamless Communication</title><link href="https://simonwillison.net/2023/Dec/1/seamless-communication/#atom-tag" rel="alternate"/><published>2023-12-01T17:01:37+00:00</published><updated>2023-12-01T17:01:37+00:00</updated><id>https://simonwillison.net/2023/Dec/1/seamless-communication/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.meta.com/research/seamless-communication/"&gt;Seamless Communication&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A new “family of AI research models” from Meta AI for speech and text translation. The live demo is particularly worth trying: you can record a short webcam video of yourself speaking and get back the same video with your speech translated into another language.&lt;/p&gt;

&lt;p&gt;The key to it is the new SeamlessM4T v2 model, which supports 101 languages for speech input, 96 languages for text input/output and 35 languages for speech output. SeamlessM4T-Large v2 is a 9GB file, available on Hugging Face.&lt;/p&gt;

&lt;p&gt;Also in this release: SeamlessExpressive, which “captures certain underexplored aspects of prosody such as speech rate and pauses”—effectively maintaining things like expressed enthusiasm across languages.&lt;/p&gt;

&lt;p&gt;Plus SeamlessStreaming, “a model that can deliver speech and text translations with around two seconds of latency”.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://github.com/facebookresearch/seamless_communication"&gt;facebookresearch/seamless_communication&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/transformers"&gt;transformers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/translation"&gt;translation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="transformers"/><category term="translation"/><category term="ai"/><category term="llms"/></entry><entry><title>Quoting Ross Taylor</title><link href="https://simonwillison.net/2023/Nov/15/ross-taylor/#atom-tag" rel="alternate"/><published>2023-11-15T01:15:15+00:00</published><updated>2023-11-15T01:15:15+00:00</updated><id>https://simonwillison.net/2023/Nov/15/ross-taylor/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/rosstaylor90/status/1724547381092573352"&gt;&lt;p&gt;[On Meta's Galactica LLM launch] We did this with a 8 person team which is an order of magnitude fewer people than other LLM teams at the time.&lt;/p&gt;
&lt;p&gt;We were overstretched and lost situational awareness at launch by releasing demo of a &lt;em&gt;base model&lt;/em&gt; without checks. We were aware of what potential criticisms would be, but we lost sight of the obvious in the workload we were under.&lt;/p&gt;
&lt;p&gt;One of the considerations for a demo was we wanted to understand the distribution of scientific queries that people would use for LLMs (useful for instruction tuning and RLHF). Obviously this was a free goal we gave to journalists who instead queried it outside its domain. But yes we should have known better.&lt;/p&gt;
&lt;p&gt;We had a “good faith” assumption that we’d share the base model, warts and all, with four disclaimers about hallucinations on the demo - so people could see what it could do (openness). Again, obviously this didn’t work.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/rosstaylor90/status/1724547381092573352"&gt;Ross Taylor&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hallucinations"&gt;hallucinations&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="llms"/><category term="ai"/><category term="generative-ai"/><category term="hallucinations"/></entry><entry><title>I’m banned for life from advertising on Meta. Because I teach Python.</title><link href="https://simonwillison.net/2023/Oct/19/banned-for-life/#atom-tag" rel="alternate"/><published>2023-10-19T14:56:05+00:00</published><updated>2023-10-19T14:56:05+00:00</updated><id>https://simonwillison.net/2023/Oct/19/banned-for-life/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://lerner.co.il/2023/10/19/im-banned-for-life-from-advertising-on-meta-because-i-teach-python/"&gt;I’m banned for life from advertising on Meta. Because I teach Python.&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If accurate, this describes a nightmare scenario of automated decision making.&lt;/p&gt;

&lt;p&gt;Reuven recently found he had a permanent ban from advertising on Facebook. They won’t tell him exactly why, and have marked this as a final decision that can never be reviewed.&lt;/p&gt;

&lt;p&gt;His best theory (impossible for him to confirm) is that it’s because he tried advertising a course on Python and Pandas a few years ago which was blocked because a dumb algorithm thought he was trading exotic animals!&lt;/p&gt;

&lt;p&gt;The worst part? An appeal is no longer possible because relevant data is only retained for 180 days and so all of the related evidence has now been deleted.&lt;/p&gt;

&lt;p&gt;Various comments on Hacker News from people familiar with these systems confirm that this story likely holds up.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=37939269"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pandas"&gt;pandas&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="pandas"/><category term="python"/><category term="ai"/><category term="meta"/><category term="ai-ethics"/><category term="ai-misuse"/></entry><entry><title>Meta in Myanmar, Part I: The Setup</title><link href="https://simonwillison.net/2023/Sep/30/meta-in-myanmar-part-i-the-setup/#atom-tag" rel="alternate"/><published>2023-09-30T02:27:25+00:00</published><updated>2023-09-30T02:27:25+00:00</updated><id>https://simonwillison.net/2023/Sep/30/meta-in-myanmar-part-i-the-setup/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://erinkissane.com/meta-in-myanmar-part-i-the-setup"&gt;Meta in Myanmar, Part I: The Setup&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The first in a series by Erin Kissane explaining in detail exactly how things went so incredibly wrong with Facebook in Myanmar, contributing to a genocide ending hundreds of thousands of lives. This is an extremely tough read.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/social-media"&gt;social-media&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/meta"&gt;meta&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="facebook"/><category term="social-media"/><category term="meta"/></entry><entry><title>MMS Language Coverage in Datasette Lite</title><link href="https://simonwillison.net/2023/May/22/mms-language-coverage-in-datasette-lite/#atom-tag" rel="alternate"/><published>2023-05-22T20:01:41+00:00</published><updated>2023-05-22T20:01:41+00:00</updated><id>https://simonwillison.net/2023/May/22/mms-language-coverage-in-datasette-lite/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://lite.datasette.io/?metadata=https://gist.github.com/simonw/492ea1059ad7942f0eec902a58fe9ea0&amp;amp;json=https://gist.github.com/simonw/63aa33ec827b093f9c6a2797df950aba#/data/raw?_filter_column=Language+Name&amp;amp;_filter_op=contains&amp;amp;_filter_value=&amp;amp;_sort=rowid&amp;amp;_facet=ASR&amp;amp;_facet=TTS&amp;amp;_facet=LID&amp;amp;_facet=Subgrouping&amp;amp;_facet=Language+Family"&gt;MMS Language Coverage in Datasette Lite&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I converted the HTML table of 4,021 languages supported by Meta’s new Massively Multilingual Speech models to newline-delimited JSON and loaded it into Datasette Lite. Faceting by Language Family is particularly interesting—the top five families represented are Niger-Congo with 1,019, Austronesian with 609, Sino-Tibetan with 288, Indo-European with 278 and Afro-Asiatic with 222.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="datasette"/><category term="datasette-lite"/></entry><entry><title>Introducing speech-to-text, text-to-speech, and more for 1,100+ languages</title><link href="https://simonwillison.net/2023/May/22/massively-multilingual-speech/#atom-tag" rel="alternate"/><published>2023-05-22T19:22:38+00:00</published><updated>2023-05-22T19:22:38+00:00</updated><id>https://simonwillison.net/2023/May/22/massively-multilingual-speech/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.facebook.com/blog/multilingual-model-speech-recognition/"&gt;Introducing speech-to-text, text-to-speech, and more for 1,100+ languages&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New from Meta AI: Massively Multilingual Speech. “MMS supports speech-to-text and text-to-speech for 1,107 languages and language identification for over 4,000 languages. [...] Some of these, such as the Tatuyo language, have only a few hundred speakers, and for most of these languages, no prior speech technology exists.”&lt;/p&gt;

&lt;p&gt;It’s licensed CC-BY-NC 4.0 though, so it’s not available for commercial use.&lt;/p&gt;

&lt;p&gt;“In a like-for-like comparison with OpenAI’s Whisper, we found that models trained on the Massively Multilingual Speech data achieve half the word error rate, but Massively Multilingual Speech covers 11 times more languages.”&lt;/p&gt;

&lt;p&gt;The training data was mostly sourced from audio Bible translations.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=36034211"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/translation"&gt;translation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="translation"/><category term="ai"/><category term="training-data"/></entry><entry><title>ImageBind</title><link href="https://simonwillison.net/2023/May/9/imagebind/#atom-tag" rel="alternate"/><published>2023-05-09T19:04:10+00:00</published><updated>2023-05-09T19:04:10+00:00</updated><id>https://simonwillison.net/2023/May/9/imagebind/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/facebookresearch/ImageBind"&gt;ImageBind&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New model release from Facebook/Meta AI research: “An approach to learn a joint embedding across six different modalities—images, text, audio, depth, thermal, and IMU (inertial measurement units) data”. The non-interactive demo shows searching audio starting with an image, searching images starting with audio, using text to retrieve images and audio, using image and audio to retrieve images (e.g. a barking sound and a photo of a beach to get dogs on a beach) and using audio as input to an image generator.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="generative-ai"/><category term="embeddings"/></entry><entry><title>Large language models are having their Stable Diffusion moment</title><link href="https://simonwillison.net/2023/Mar/11/llama/#atom-tag" rel="alternate"/><published>2023-03-11T19:15:37+00:00</published><updated>2023-03-11T19:15:37+00:00</updated><id>https://simonwillison.net/2023/Mar/11/llama/#atom-tag</id><summary type="html">
    &lt;p&gt;The open release of the Stable Diffusion image generation model back in August 2022 was a key moment. I wrote at the time about how &lt;a href="https://simonwillison.net/2022/Aug/29/stable-diffusion/"&gt;Stable Diffusion is a really big deal&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;People could now generate images from text on their own hardware!&lt;/p&gt;
&lt;p&gt;More importantly, developers could mess around with the guts of what was going on.&lt;/p&gt;
&lt;p&gt;The resulting explosion in innovation is still going on today. Most recently, &lt;a href="https://github.com/lllyasviel/ControlNet/blob/main/README.md"&gt;ControlNet&lt;/a&gt; appears to have leapt Stable Diffusion ahead of Midjourney and DALL-E in terms of its capabilities.&lt;/p&gt;
&lt;p&gt;It feels to me like that Stable Diffusion moment back in August kick-started the entire new wave of interest in generative AI - which was then pushed into overdrive by the release of ChatGPT at the end of November.&lt;/p&gt;
&lt;p&gt;That Stable Diffusion moment is happening again right now, for large language models - the technology behind ChatGPT itself.&lt;/p&gt;
&lt;p&gt;This morning &lt;a href="https://til.simonwillison.net/llms/llama-7b-m2"&gt;I ran a GPT-3 class language model&lt;/a&gt; on my own personal laptop for the first time!&lt;/p&gt;
&lt;p&gt;AI stuff was weird already. It's about to get a whole lot weirder.&lt;/p&gt;
&lt;h4 id="llama"&gt;LLaMA&lt;/h4&gt;
&lt;p&gt;Somewhat surprisingly, language models like GPT-3 that power tools like ChatGPT are a lot larger and more expensive to build and operate than image generation models.&lt;/p&gt;
&lt;p&gt;The best of these models have mostly been built by private organizations such as OpenAI, and have been kept tightly controlled - accessible via their API and web interfaces, but not released for anyone to run on their own machines.&lt;/p&gt;
&lt;p&gt;These models are also BIG. Even if you could obtain the GPT-3 model you would not be able to run it on commodity hardware - these things usually require several A100-class GPUs, each of which retails for $8,000+.&lt;/p&gt;
&lt;p&gt;This technology is clearly too important to be entirely controlled by a small group of companies.&lt;/p&gt;
&lt;p&gt;There have been dozens of open large language models released over the past few years, but none of them have quite hit the sweet spot for me in terms of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Easy to run on my own hardware&lt;/li&gt;
&lt;li&gt;Large enough to be useful - ideally equivalent in capabilities to GPT-3&lt;/li&gt;
&lt;li&gt;Open source enough that they can be tinkered with&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This all changed yesterday, thanks to the combination of Facebook's &lt;a href="https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/"&gt;LLaMA model&lt;/a&gt; and &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; by Georgi Gerganov.&lt;/p&gt;
&lt;p&gt;Here's the abstract from &lt;a href="https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/"&gt;the LLaMA paper&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It's important to note that LLaMA isn't fully "open". You have to agree to &lt;a href="https://forms.gle/jk851eBVbX1m5TAv5"&gt;some strict terms&lt;/a&gt; to access the model. It's intended as a research preview, and isn't something which can be used for commercial purposes.&lt;/p&gt;
&lt;p&gt;In a totally cyberpunk move, within a few days of the release, someone &lt;a href="https://github.com/facebookresearch/llama/pull/73"&gt;submitted this PR&lt;/a&gt; to the LLaMA repository linking to an unofficial BitTorrent download for the model files!&lt;/p&gt;
&lt;p&gt;So they're in the wild now. You may not be legally able to build a commercial product on them, but the genie is out of the bottle. That furious typing sound you can hear is thousands of hackers around the world starting to dig in and figure out what life is like when you can run a GPT-3 class model on your own hardware.&lt;/p&gt;
&lt;h4 id="llama-cpp"&gt;llama.cpp&lt;/h4&gt;
&lt;p&gt;LLaMA on its own isn't much good if it's still too hard to run it on a personal laptop.&lt;/p&gt;
&lt;p&gt;Enter &lt;a href="https://ggerganov.com/"&gt;Georgi Gerganov&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Georgi is an open source developer based in Sofia, Bulgaria (according to &lt;a href="https://github.com/ggerganov"&gt;his GitHub profile&lt;/a&gt;). He previously released &lt;a href="https://github.com/ggerganov/whisper.cpp"&gt;whisper.cpp&lt;/a&gt;, a port of OpenAI's Whisper automatic speech recognition model to C++. That project made Whisper applicable to a huge range of new use cases.&lt;/p&gt;
&lt;p&gt;He's just done the same thing with LLaMA.&lt;/p&gt;
&lt;p&gt;Georgi's &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; project had its &lt;a href="https://github.com/ggerganov/llama.cpp/commit/26c084662903ddaca19bef982831bfb0856e8257"&gt;initial release yesterday&lt;/a&gt;. From the README:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The main goal is to run the model using 4-bit quantization on a MacBook.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;4-bit quantization is a technique for reducing the size of models so they can run on less powerful hardware. It also reduces the model sizes on disk - to 4GB for the 7B model and just under 8GB for the 13B one.&lt;/p&gt;
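&lt;p&gt;As a rough back-of-envelope check on those sizes, here is a minimal sketch (an illustration only, not code from llama.cpp) that estimates the size of the weights alone from a parameter count and a bit width. The function name and the round 7e9 and 13e9 parameter counts are assumptions made for the example. Real quantized files also store per-block scaling data and other metadata, so they come out somewhat larger than these estimates.&lt;/p&gt;

```python
def quantized_size_gb(n_params, bits_per_weight):
    """Approximate size in GB (10**9 bytes) of the weights alone.

    Hypothetical helper for illustration: it ignores scaling factors,
    vocabulary data and any other per-file overhead.
    """
    total_bits = n_params * bits_per_weight
    total_bytes = total_bits / 8
    return total_bytes / 1e9


# Assumed round parameter counts for the two models discussed above.
models = [("7B", 7e9), ("13B", 13e9)]

for name, n_params in models:
    at_16_bit = quantized_size_gb(n_params, 16)
    at_4_bit = quantized_size_gb(n_params, 4)
    print(f"{name}: {at_16_bit:.1f} GB at 16-bit, {at_4_bit:.1f} GB at 4-bit")
```

&lt;p&gt;Going from 16-bit to 4-bit weights cuts the estimate by a factor of four, which is roughly in line with the 4GB and just-under-8GB files once the extra overhead is added back in.&lt;/p&gt;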
&lt;p&gt;It totally works!&lt;/p&gt;
&lt;p&gt;I used it to run the 7B LLaMA model on my laptop last night, and then this morning upgraded to the 13B model - the one that Facebook claim is competitive with GPT-3.&lt;/p&gt;
&lt;p&gt;Here are my &lt;a href="https://til.simonwillison.net/llms/llama-7b-m2"&gt;detailed notes on how I did that&lt;/a&gt; - most of the information I needed was already there in the README.&lt;/p&gt;
&lt;p&gt;As my laptop started to spit out text at me I genuinely had a feeling that the world was about to change, again.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/llama-smaller.gif" alt="Animated GIF showing LLaMA on my laptop completing a prompt about The first man on the moon was - it only takes a few seconds to complete and outputs information about Neil Armstrong" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I thought it would be a few more years before I could run a GPT-3 class model on hardware that I owned. I was wrong: that future is here already.&lt;/p&gt;
&lt;h4 id="worst-thing"&gt;Is this the worst thing that ever happened?&lt;/h4&gt;
&lt;p&gt;I'm not worried about the science fiction scenarios here. The language model running on my laptop is not an AGI that's going to &lt;a href="https://en.wikipedia.org/wiki/The_Mitchells_vs._the_Machines"&gt;break free and take over the world&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But there are a ton of very real ways in which this technology can be used for harm. Just a few:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Generating spam&lt;/li&gt;
&lt;li&gt;Automated romance scams&lt;/li&gt;
&lt;li&gt;Trolling and hate speech&lt;/li&gt;
&lt;li&gt;Fake news and disinformation&lt;/li&gt;
&lt;li&gt;Automated radicalization (I worry about this one a lot)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Not to mention that this technology makes things up exactly as easily as it parrots factual information, and provides no way to tell the difference.&lt;/p&gt;
&lt;p&gt;Prior to this moment, a thin layer of defence existed in terms of companies like OpenAI having a limited ability to control how people interacted with those models.&lt;/p&gt;
&lt;p&gt;Now that we can run these on our own hardware, even those controls are gone.&lt;/p&gt;
&lt;h4 id="use-this-for-good"&gt;How do we use this for good?&lt;/h4&gt;
&lt;p&gt;I think this is going to have a huge impact on society. My priority is trying to direct that impact in a positive direction.&lt;/p&gt;
&lt;p&gt;It's easy to fall into a cynical trap of thinking there's nothing good here at all, and everything generative AI is either actively harmful or a waste of time.&lt;/p&gt;
&lt;p&gt;I'm personally using generative AI tools on a daily basis now for a variety of different purposes. They've given me a material productivity boost, but more importantly they have expanded my ambitions in terms of projects that I take on.&lt;/p&gt;
&lt;p&gt;I used ChatGPT to learn enough AppleScript to ship a new project in &lt;a href="https://til.simonwillison.net/gpt3/chatgpt-applescript"&gt;less than an hour&lt;/a&gt; just last week!&lt;/p&gt;
&lt;p&gt;I'm going to continue exploring and sharing genuinely positive applications of this technology. It's not going to be un-invented, so I think our priority should be figuring out the most constructive possible ways to use it.&lt;/p&gt;
&lt;h4 id="what-to-look-for-next"&gt;What to look for next&lt;/h4&gt;
&lt;p&gt;Assuming Facebook don't relax the licensing terms, LLaMA will likely end up more a proof-of-concept that local language models are feasible on consumer hardware than a new foundation model that people use going forward.&lt;/p&gt;
&lt;p&gt;The race is on to release the first fully open language model that gives people ChatGPT-like capabilities on their own devices.&lt;/p&gt;
&lt;p&gt;Quoting Stable Diffusion backer &lt;a href="https://twitter.com/EMostaque/status/1634653313089126403"&gt;Emad Mostaque&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Wouldn't be nice if there was a fully open version eh&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="its-happening"&gt;It's happening already...&lt;/h4&gt;
&lt;p&gt;I published this article on Saturday 11th March 2023. On Sunday, Artem Andreenko got it running on a Raspberry Pi with 4GB of RAM:&lt;/p&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;I&amp;#39;ve sucefully runned LLaMA 7B model on my 4GB RAM Raspberry Pi 4. It&amp;#39;s super slow about 10sec/token. But it looks we can run powerful cognitive pipelines on a cheap hardware. &lt;a href="https://t.co/XDbvM2U5GY"&gt;pic.twitter.com/XDbvM2U5GY&lt;/a&gt;&lt;/p&gt;- Artem Andreenko 🇺🇦 (@miolini) &lt;a href="https://twitter.com/miolini/status/1634982361757790209?ref_src=twsrc%5Etfw"&gt;March 12, 2023&lt;/a&gt;&lt;/blockquote&gt;
&lt;p&gt;Then on Monday, Anish Thite got it working on a Pixel 6 phone (at 26s/token):&lt;/p&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;&lt;a href="https://twitter.com/ggerganov?ref_src=twsrc%5Etfw"&gt;@ggerganov&lt;/a&gt;&amp;#39;s LLaMA works on a Pixel 6!&lt;br /&gt;&lt;br /&gt;LLaMAs been waiting for this, and so have I &lt;a href="https://t.co/JjEhdzJ2B9"&gt;pic.twitter.com/JjEhdzJ2B9&lt;/a&gt;&lt;/p&gt;- anishmaxxing (@thiteanish) &lt;a href="https://twitter.com/thiteanish/status/1635188333705043969?ref_src=twsrc%5Etfw"&gt;March 13, 2023&lt;/a&gt;&lt;/blockquote&gt;
&lt;p&gt;And then a research lab at Stanford released Alpaca - an instruction fine-tuned version of the model. I wrote more about that in a follow-up post: &lt;a href="https://simonwillison.net/2023/Mar/13/alpaca/"&gt;Stanford Alpaca, and the acceleration of on-device large language model development&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="follow-my-works"&gt;Follow my work&lt;/h4&gt;
&lt;p&gt;Everything I write on my blog goes out in &lt;a href="https://simonwillison.net/atom/everything/"&gt;my Atom feed&lt;/a&gt;, and I have a very &lt;a href="https://fedi.simonwillison.net/@simon"&gt;active Mastodon account&lt;/a&gt;, plus a Twitter account (&lt;a href="https://twitter.com/simonw"&gt;@simonw&lt;/a&gt;) where I continue to post links to new things I've written.&lt;/p&gt;
&lt;p&gt;I'm also starting a newsletter at &lt;a href="https://simonw.substack.com/"&gt;simonw.substack.com&lt;/a&gt;. I plan to send out everything from my blog on a weekly basis, so if email is your preferred way to stay up-to-date you can subscribe there.&lt;/p&gt;
&lt;h4 id="more-stuff-i-have-written"&gt;More stuff I've written&lt;/h4&gt;
&lt;p&gt;My &lt;a href="https://simonwillison.net/tags/generativeai/"&gt;Generative AI tag&lt;/a&gt; has everything, but here are some relevant highlights from the past year:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/May/31/a-datasette-tutorial-written-by-gpt-3/"&gt;A Datasette tutorial written by GPT-3&lt;/a&gt; - 31 May 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Jun/5/play-with-gpt3/"&gt;How to use the GPT-3 language model&lt;/a&gt; - 5 Jun 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Jun/23/dall-e/"&gt;First impressions of DALL-E, generating images from text&lt;/a&gt; - 23 Jun 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Jul/9/gpt-3-explain-code/"&gt;Using GPT-3 to explain how code works&lt;/a&gt; - 9 Jul 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Aug/29/stable-diffusion/"&gt;Stable Diffusion is a really big deal&lt;/a&gt; - 29 Aug 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/"&gt;Exploring the training data behind Stable Diffusion&lt;/a&gt; - 5 Sep 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Sep/12/prompt-injection/"&gt;Prompt injection attacks against GPT-3&lt;/a&gt; - 12 Sep 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Sep/30/action-transcription/"&gt;A tool to run caption extraction against online videos using Whisper and GitHub Issues/Actions&lt;/a&gt; - 30 Sep 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Oct/5/spell-casting/"&gt;Is the AI spell-casting metaphor harmful or helpful?&lt;/a&gt; - 5 Oct 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Dec/4/give-me-ideas-for-crimes-to-do/"&gt;A new AI game: Give me ideas for crimes to do&lt;/a&gt; - 4 Dec 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/"&gt;AI assisted learning: Learning Rust with ChatGPT, Copilot and Advent of Code&lt;/a&gt; - 5 Dec 2022&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Jan/13/semantic-search-answers/"&gt;How to implement Q&amp;amp;A against your documentation with GPT3, embeddings and Datasette&lt;/a&gt; - 13 Jan 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Feb/15/bing/"&gt;Bing: "I will not harm you unless you harm me first"&lt;/a&gt; - 15 Feb 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Feb/19/live-tv/"&gt;I talked about Bing and tried to explain language models on live TV!&lt;/a&gt; - 19 Feb 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Feb/21/in-defense-of-prompt-engineering/"&gt;In defense of prompt engineering&lt;/a&gt; - 21 Feb 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Feb/24/impressions-of-bing/"&gt;Thoughts and impressions of AI-assisted search from Bing&lt;/a&gt; - 24 Feb 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Mar/7/kqed-forum/"&gt;Weeknotes: NICAR, and an appearance on KQED Forum&lt;/a&gt; - 7 Mar 2023&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2023/Mar/10/chatgpt-internet-access/"&gt;ChatGPT can't access the internet, even though it really looks like it can&lt;/a&gt; - 10 Mar 2023&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/whisper"&gt;whisper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/georgi-gerganov"&gt;georgi-gerganov&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="facebook"/><category term="ai"/><category term="gpt-3"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="whisper"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="llama-cpp"/><category term="llm-release"/><category term="georgi-gerganov"/></entry><entry><title>Running LLaMA 7B on a 64GB M2 MacBook Pro with llama.cpp</title><link href="https://simonwillison.net/2023/Mar/11/running-llama-7b-on-a-64gb-m2-macbook-pro-with-llamacpp/#atom-tag" rel="alternate"/><published>2023-03-11T04:19:43+00:00</published><updated>2023-03-11T04:19:43+00:00</updated><id>https://simonwillison.net/2023/Mar/11/running-llama-7b-on-a-64gb-m2-macbook-pro-with-llamacpp/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/llms/llama-7b-m2"&gt;Running LLaMA 7B on a 64GB M2 MacBook Pro with llama.cpp&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I got Facebook’s LLaMA 7B to run on my MacBook Pro using llama.cpp (a “port of Facebook’s LLaMA model in C/C++”) by Georgi Gerganov. It works! I’ve been hoping to run a GPT-3 class language model on my own hardware for ages, and now it’s possible to do exactly that. The model itself ends up being just 4GB after applying Georgi’s script to “quantize the model to 4-bits”.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/georgi-gerganov"&gt;georgi-gerganov&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="gpt-3"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="llama-cpp"/><category term="georgi-gerganov"/></entry><entry><title>Introducing LLaMA: A foundational, 65-billion-parameter large language model</title><link href="https://simonwillison.net/2023/Feb/24/introducing-llama/#atom-tag" rel="alternate"/><published>2023-02-24T17:34:18+00:00</published><updated>2023-02-24T17:34:18+00:00</updated><id>https://simonwillison.net/2023/Feb/24/introducing-llama/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/"&gt;Introducing LLaMA: A foundational, 65-billion-parameter large language model&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
From the paper: “For instance, LLaMA-13B outperforms GPT-3 on most benchmarks, despite being 10× smaller. We believe that this model will help democratize the access and study of LLMs, since it can be run on a single GPU.”

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/guillaumelample/status/1629151231800115202"&gt;@guillaumelample&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="ai"/><category term="gpt-3"/><category term="generative-ai"/><category term="llama"/><category term="llms"/></entry><entry><title>Exploring 10m scraped Shutterstock videos used to train Meta's Make-A-Video text-to-video model</title><link href="https://simonwillison.net/2022/Sep/29/webvid/#atom-tag" rel="alternate"/><published>2022-09-29T19:31:24+00:00</published><updated>2022-09-29T19:31:24+00:00</updated><id>https://simonwillison.net/2022/Sep/29/webvid/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://makeavideo.studio/"&gt;Make-A-Video&lt;/a&gt; is a new "state-of-the-art AI system that generates videos from text" from Meta AI. It looks incredible - it really is DALL-E / Stable Diffusion for video. And it appears to have been trained on 10m video preview clips scraped from Shutterstock.&lt;/p&gt;
&lt;p&gt;I built a new search engine to explore those ten million clips:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://webvid.datasette.io/webvid/videos"&gt;https://webvid.datasette.io/webvid/videos&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/webvid-datasette.jpg" alt="A search for mars rocks returns 33 videos, each shown with a video player" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This is similar to &lt;a href="https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/"&gt;the system I built with Andy Baio&lt;/a&gt; a few weeks ago to explore the LAION data used to train Stable Diffusion.&lt;/p&gt;
&lt;h4&gt;Make-A-Video training data&lt;/h4&gt;
&lt;p&gt;Meta AI's &lt;a href="https://makeavideo.studio/Make-A-Video.pdf"&gt;paper describing the model&lt;/a&gt; includes this section about the training data:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Datasets.&lt;/strong&gt; To train the image models, we use a 2.3B subset of the dataset from (Schuhmann et al.) where the text is English. We filter out sample pairs with NSFW images 2, toxic words in the text, or images with a watermark probability larger than 0.5.&lt;/p&gt;
&lt;p&gt;We use WebVid-10M (Bain et al., 2021) and a 10M subset from HD-VILA-100M (Xue et al., 2022) 3 to train our video generation models. Note that only the videos (no aligned text) are used.&lt;/p&gt;
&lt;p&gt;The decoder Dt and the interpolation model is trained on WebVid-10M. SRt l is trained on both WebVid-10M and HD-VILA-10M. While prior work (Hong et al., 2022; Ho et al., 2022) have collected private text-video pairs for T2V generation, we use only public datasets (and no paired text for videos). We conduct automatic evaluation on UCF-101 (Soomro et al., 2012) and MSR-VTT (Xu et al., 2016) in a zero-shot setting.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That 2.3B subset of images is the same LAION data &lt;a href="https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/"&gt;I explored previously&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/microsoft/XPretrain/tree/main/hd-vila-100m"&gt;HD-VILA-100M&lt;/a&gt; was collected by Microsoft Research Asia - &lt;a href="https://twitter.com/waxpancake/status/1575564723546103808"&gt;Andy Baio notes&lt;/a&gt; that these were scraped from YouTube.&lt;/p&gt;
&lt;p&gt;I decided to take a look at the &lt;a href="https://m-bain.github.io/webvid-dataset/"&gt;WebVid-10M&lt;/a&gt; data.&lt;/p&gt;
&lt;h4&gt;WebVid-10M&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://m-bain.github.io/webvid-dataset/"&gt;WebVid-10M site&lt;/a&gt; describes the data like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;WebVid-10M is a large-scale dataset of short videos with textual descriptions sourced from the web. The videos are diverse and rich in their content.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://www.arxiv-vanity.com/papers/2104.00650/"&gt;accompanying paper&lt;/a&gt; provides a little bit more detail:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We scrape the web for a new dataset of videos with textual description annotations, called WebVid-2M. Our dataset consists of 2.5M video-text pairs, which is an order of magnitude larger than existing video captioning datasets (see Table 1).&lt;/p&gt;
&lt;p&gt;The data was scraped from the web following a similar procedure to Google Conceptual Captions [55] (CC3M). We note that more than 10% of CC3M images are in fact thumbnails from videos, which motivates us to use such video sources to scrape a total of 2.5M text-video pairs. The use of data collected for this study is authorised via the &lt;a href="https://www.gov.uk/guidance/exceptions-to-copyright/"&gt;Intellectual Property Office’s Exceptions to Copyright for Non-Commercial Research and Private Study&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm presuming that WebVid-10M is a larger version of the WebVid-2M dataset described in the paper.&lt;/p&gt;
&lt;p&gt;Most importantly though, the website includes a link to a 2.7GB CSV file - &lt;code&gt;results_10M_train.csv&lt;/code&gt; - containing the full WebVid-10M dataset. The CSV file looks like this:&lt;/p&gt;
&lt;pre lang="csv"&gt;&lt;code&gt;videoid,contentUrl,duration,page_dir,name
21179416,https://ak.picdn.net/shutterstock/videos/21179416/preview/stock-footage-aerial-shot-winter-forest.mp4,PT00H00M11S,006001_006050,Aerial shot winter forest
5629184,https://ak.picdn.net/shutterstock/videos/5629184/preview/stock-footage-senior-couple-looking-through-binoculars-on-sailboat-together-shot-on-red-epic-for-high-quality-k.mp4,PT00H00M29S,071501_071550,"Senior couple looking through binoculars on sailboat together. shot on red epic for high quality 4k, uhd, ultra hd resolution."
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I &lt;a href="https://til.simonwillison.net/sqlite/import-csv"&gt;loaded it into SQLite&lt;/a&gt; and started digging around.&lt;/p&gt;
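&lt;p&gt;As a rough illustration of that loading step, here is a minimal sketch using only Python's standard library, which is not necessarily the method described in the linked TIL. The table name &lt;code&gt;videos&lt;/code&gt;, the &lt;code&gt;load_csv&lt;/code&gt; helper and the in-memory database are assumptions for illustration; the two rows are the samples from the CSV excerpt above.&lt;/p&gt;

```python
import csv
import io
import sqlite3

# Two sample rows in the same shape as results_10M_train.csv
SAMPLE_CSV = '''videoid,contentUrl,duration,page_dir,name
21179416,https://ak.picdn.net/shutterstock/videos/21179416/preview/stock-footage-aerial-shot-winter-forest.mp4,PT00H00M11S,006001_006050,Aerial shot winter forest
5629184,https://ak.picdn.net/shutterstock/videos/5629184/preview/stock-footage-senior-couple-looking-through-binoculars-on-sailboat-together-shot-on-red-epic-for-high-quality-k.mp4,PT00H00M29S,071501_071550,"Senior couple looking through binoculars on sailboat together. shot on red epic for high quality 4k, uhd, ultra hd resolution."
'''


def load_csv(conn, csv_file):
    """Create a videos table (hypothetical name) and insert every CSV row."""
    conn.execute(
        "create table videos (videoid integer primary key, contentUrl text, "
        "duration text, page_dir text, name text)"
    )
    # DictReader yields one dict per row, keyed by the CSV header names,
    # which line up with the named placeholders below.
    reader = csv.DictReader(csv_file)
    conn.executemany(
        "insert into videos values "
        "(:videoid, :contentUrl, :duration, :page_dir, :name)",
        reader,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_csv(conn, io.StringIO(SAMPLE_CSV))
row_count = conn.execute("select count(*) from videos").fetchone()[0]
```

&lt;p&gt;For the real 2.7GB file you would pass an open file handle instead of the &lt;code&gt;StringIO&lt;/code&gt; wrapper and connect to a database file on disk.&lt;/p&gt;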
&lt;h4&gt;It's all from Shutterstock!&lt;/h4&gt;
&lt;p&gt;The big surprise for me when I started exploring the data was this: every single one of the 10,727,582 videos linked in the Datasette started with the same URL prefix:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;https://ak.picdn.net/shutterstock/videos/&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;They're all from Shutterstock. The paper talks about "scraping the web", but it turns out there was only one scraped website involved.&lt;/p&gt;
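&lt;p&gt;A check like this can be expressed as a single SQL query. The following is a minimal sketch, assuming a hypothetical &lt;code&gt;videos&lt;/code&gt; table with a &lt;code&gt;contentUrl&lt;/code&gt; column; it counts the rows that fall outside the prefix. The second row is a made-up counter-example so the query has something to find. Against the real data the article's finding implies the count would be zero.&lt;/p&gt;

```python
import sqlite3

PREFIX = "https://ak.picdn.net/shutterstock/videos/"

conn = sqlite3.connect(":memory:")
conn.execute("create table videos (videoid integer primary key, contentUrl text)")
conn.executemany(
    "insert into videos values (?, ?)",
    [
        (21179416, PREFIX + "21179416/preview/stock-footage-aerial-shot-winter-forest.mp4"),
        (1, "https://example.com/not-shutterstock.mp4"),  # made-up counter-example
    ],
)


def count_outside_prefix(conn, prefix):
    """Count rows whose contentUrl does not start with the given prefix."""
    # Comparing the leading substring avoids LIKE wildcard escaping issues.
    sql = "select count(*) from videos where substr(contentUrl, 1, length(?)) != ?"
    return conn.execute(sql, (prefix, prefix)).fetchone()[0]


outliers = count_outside_prefix(conn, PREFIX)
```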
&lt;p&gt;Here's that first row from the CSV file on Shutterstock itself:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.shutterstock.com/video/clip-21179416-aerial-shot-winter-forest"&gt;https://www.shutterstock.com/video/clip-21179416-aerial-shot-winter-forest&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;As far as I can tell, the training set used here isn't even full Shutterstock videos: it's the free, watermarked preview clips that Shutterstock makes available.&lt;/p&gt;
&lt;p&gt;I guess Shutterstock have really high-quality captions for their videos, perfect for training a model on.&lt;/p&gt;
&lt;h4&gt;Implementation notes&lt;/h4&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/webvid-datasette"&gt;simonw/webvid-datasette&lt;/a&gt; repository contains the code I used to build the Datasette instance.&lt;/p&gt;
&lt;p&gt;I built a SQLite database with full-text search enabled using &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt;. I deployed it directly to Fly by building a Docker image that bundled the 2.5GB SQLite database, taking advantage of the &lt;a href="https://simonwillison.net/2021/Jul/28/baked-data/"&gt;Baked Data architectural pattern&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The most interesting custom piece of implementation is the plugin I wrote to add a video player to each result. Here's the &lt;a href="https://github.com/simonw/webvid-datasette/blob/8b8b56b9a8fa4913520fe11c11f45920f6737a86/plugins/render_cell.py"&gt;implementation of that plugin&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;hookimpl&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;markupsafe&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Markup&lt;/span&gt;

&lt;span class="pl-v"&gt;TEMPLATE&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"""&lt;/span&gt;
&lt;span class="pl-s"&gt;&amp;lt;video controls width="400" preload="none" poster="{poster}"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;  &amp;lt;source src="{url}" type="video/mp4"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&amp;lt;/video&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&amp;lt;p&amp;gt;{filename}&amp;lt;br&amp;gt;On &amp;lt;a href="https://www.shutterstock.com/video/clip-{id}"&amp;gt;Shutterstock&amp;lt;/a&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;"""&lt;/span&gt;.&lt;span class="pl-en"&gt;strip&lt;/span&gt;()
&lt;span class="pl-v"&gt;VIDEO_URL&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"https://ak.picdn.net/shutterstock/videos/{id}/preview/{filename}"&lt;/span&gt;
&lt;span class="pl-v"&gt;POSTER_URL&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"https://ak.picdn.net/shutterstock/videos/{id}/thumb/1.jpg?ip=x480"&lt;/span&gt;


&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;hookimpl&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;render_cell&lt;/span&gt;(&lt;span class="pl-s1"&gt;row&lt;/span&gt;, &lt;span class="pl-s1"&gt;column&lt;/span&gt;, &lt;span class="pl-s1"&gt;value&lt;/span&gt;):
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;column&lt;/span&gt; &lt;span class="pl-c1"&gt;!=&lt;/span&gt; &lt;span class="pl-s"&gt;"filename"&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt;
    &lt;span class="pl-s1"&gt;id&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;row&lt;/span&gt;[&lt;span class="pl-s"&gt;"id"&lt;/span&gt;]
    &lt;span class="pl-s1"&gt;url&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;VIDEO_URL&lt;/span&gt;.&lt;span class="pl-en"&gt;format&lt;/span&gt;(&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;, &lt;span class="pl-s1"&gt;filename&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;value&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;poster&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;POSTER_URL&lt;/span&gt;.&lt;span class="pl-en"&gt;format&lt;/span&gt;(&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;)
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-v"&gt;Markup&lt;/span&gt;(&lt;span class="pl-v"&gt;TEMPLATE&lt;/span&gt;.&lt;span class="pl-en"&gt;format&lt;/span&gt;(&lt;span class="pl-s1"&gt;url&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;url&lt;/span&gt;, &lt;span class="pl-s1"&gt;poster&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;poster&lt;/span&gt;, &lt;span class="pl-s1"&gt;filename&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;value&lt;/span&gt;, &lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;))&lt;/pre&gt;
&lt;p&gt;I'm using the new &lt;code&gt;render_cell(row)&lt;/code&gt; argument added &lt;a href="https://docs.datasette.io/en/stable/changelog.html#v0-62"&gt;in Datasette 0.62&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The plugin outputs a &lt;code&gt;&amp;lt;video&amp;gt;&lt;/code&gt; element with &lt;code&gt;preload="none"&lt;/code&gt; to avoid the browser downloading the video until the user clicks play (see &lt;a href="https://til.simonwillison.net/html/video-preload-none"&gt;this TIL&lt;/a&gt;). I set the &lt;code&gt;poster&lt;/code&gt; attribute to a thumbnail image from Shutterstock.&lt;/p&gt;
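&lt;p&gt;The plugin reads an &lt;code&gt;id&lt;/code&gt; and a &lt;code&gt;filename&lt;/code&gt; from each row rather than the full &lt;code&gt;contentUrl&lt;/code&gt; from the CSV. Here is a minimal sketch of how those two values relate to the original URL, assuming &lt;code&gt;filename&lt;/code&gt; holds the last path segment. The &lt;code&gt;split_content_url&lt;/code&gt; helper is hypothetical and is not taken from the repository; the two URL templates are copied from the plugin.&lt;/p&gt;

```python
from urllib.parse import urlsplit

# URL templates copied from the plugin
VIDEO_URL = "https://ak.picdn.net/shutterstock/videos/{id}/preview/{filename}"
POSTER_URL = "https://ak.picdn.net/shutterstock/videos/{id}/thumb/1.jpg?ip=x480"


def split_content_url(content_url):
    """Return (id, filename) from a URL shaped like .../videos/ID/preview/FILENAME."""
    parts = urlsplit(content_url).path.strip("/").split("/")
    # parts looks like ["shutterstock", "videos", "21179416", "preview", "name.mp4"]
    return parts[2], parts[4]


content_url = (
    "https://ak.picdn.net/shutterstock/videos/21179416/preview/"
    "stock-footage-aerial-shot-winter-forest.mp4"
)
video_id, filename = split_content_url(content_url)

# Rebuilding from the two stored values should give back the original URL
rebuilt = VIDEO_URL.format(id=video_id, filename=filename)
poster = POSTER_URL.format(id=video_id)
```

&lt;p&gt;Storing just the two varying parts is one way to keep the database smaller, since every row shares the same URL prefix.&lt;/p&gt;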
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/text-to-video"&gt;text-to-video&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ethics"/><category term="facebook"/><category term="machine-learning"/><category term="projects"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="training-data"/><category term="text-to-video"/><category term="ai-ethics"/></entry><entry><title>Quoting Ryan Broderick</title><link href="https://simonwillison.net/2022/Feb/5/ryan-broderick/#atom-tag" rel="alternate"/><published>2022-02-05T22:41:31+00:00</published><updated>2022-02-05T22:41:31+00:00</updated><id>https://simonwillison.net/2022/Feb/5/ryan-broderick/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.garbageday.email/p/the-end-of-the-metaverse-hopefully"&gt;&lt;p&gt;Every few weeks, someone on Twitter notices how demented the content on Facebook is. I’ve covered a lot of these stories. The quick TL;DR is that Facebook’s video section is essentially run by a network of magicians and Vegas stage performers who hack the platform’s algorithm with surreal low-value content designed to distract users long enough to trigger an in-video advertisement and anger them enough to leave a comment.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.garbageday.email/p/the-end-of-the-metaverse-hopefully"&gt;Ryan Broderick&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/></entry><entry><title>Quoting Ryan Broderick</title><link href="https://simonwillison.net/2021/Oct/25/ryan-broderick/#atom-tag" rel="alternate"/><published>2021-10-25T20:22:01+00:00</published><updated>2021-10-25T20:22:01+00:00</updated><id>https://simonwillison.net/2021/Oct/25/ryan-broderick/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.garbageday.email/p/we-were-the-unpaid-janitors-of-a"&gt;&lt;p&gt;But this much is clear: Facebook knew all along. Their own employees were desperately trying to get anyone inside the company to listen as their products radicalized their own friends and family members. And as they were breaking the world, they had an army of spokespeople publicly and privately gaslighting and intimidating reporters and researchers who were trying to ring the alarm bell. They knew all along and they simply did not give a shit.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.garbageday.email/p/we-were-the-unpaid-janitors-of-a"&gt;Ryan Broderick&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/></entry><entry><title>I saw millions compromise their Facebook accounts to fuel fake engagement</title><link href="https://simonwillison.net/2021/Jun/9/fake-engagement/#atom-tag" rel="alternate"/><published>2021-06-09T15:40:12+00:00</published><updated>2021-06-09T15:40:12+00:00</updated><id>https://simonwillison.net/2021/Jun/9/fake-engagement/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://restofworld.org/2021/sophie-zhang-facebook-autolikers/"&gt;I saw millions compromise their Facebook accounts to fuel fake engagement&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Sophie Zhang, ex-Facebook, describes how millions of Facebook users have signed up for “autolikers”—programs that promise likes and engagement for their posts, in exchange for access to their accounts which are then combined into the larger bot farm and used to provide likes to other posts. “Self-compromise was a widespread problem, and possibly the largest single source of existing inauthentic activity on Facebook during my time there. While actual fake accounts can be banned, Facebook is unwilling to disable the accounts of real users who share their accounts with a bot farm.”


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/social-media"&gt;social-media&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="social-media"/></entry></feed>