<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: replicate</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/replicate.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-10-03T19:14:56+00:00</updated><author><name>Simon Willison</name></author><entry><title>Announcing FLUX1.1 [pro] and the BFL API</title><link href="https://simonwillison.net/2024/Oct/3/flux11-pro/#atom-tag" rel="alternate"/><published>2024-10-03T19:14:56+00:00</published><updated>2024-10-03T19:14:56+00:00</updated><id>https://simonwillison.net/2024/Oct/3/flux11-pro/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blackforestlabs.ai/announcing-flux-1-1-pro-and-the-bfl-api/"&gt;Announcing FLUX1.1 [pro] and the BFL API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;FLUX is the image generation model family from Black Forest Labs, a startup founded by members of the team that previously created Stable Diffusion.&lt;/p&gt;
&lt;p&gt;Released today, FLUX1.1 [pro] continues the general trend of AI models getting both better and more efficient:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;FLUX1.1 [pro] provides six times faster generation than its predecessor FLUX.1 [pro] while also improving image quality, prompt adherence, and diversity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Black Forest Labs appear to have settled on a potentially workable business model: their smallest, fastest model FLUX.1 [schnell] is Apache 2 licensed. The next step up is FLUX.1 [dev] which is open weights for non-commercial use only. The [pro] models are closed weights, made available exclusively through their API or partnerships with other API providers.&lt;/p&gt;
&lt;p&gt;I tried the new 1.1 model out using &lt;a href="https://replicate.com/black-forest-labs/flux-1.1-pro"&gt;black-forest-labs/flux-1.1-pro&lt;/a&gt; on Replicate just now. Here's my prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Photograph of a Faberge egg representing the California coast. It should be decorated with ornate pelicans and sea lions and a humpback whale.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="A beautiful faberge egg featuring a humpback whale and pelicans - it is located on a beach and sea lions on that beach are looking at it." src="https://static.simonwillison.net/static/2024/flux-pelican-egg.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;The FLUX models have a reputation for being really good at following complex prompts. In this case I wanted the sea lions to appear in the egg design rather than looking at the egg from the beach, but I imagine I could get better results if I continued to iterate on my prompt.&lt;/p&gt;
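&lt;p&gt;If you want to iterate on prompts programmatically rather than through the web UI, Replicate's official Python client can run the same model. A minimal sketch, assuming a &lt;code&gt;REPLICATE_API_TOKEN&lt;/code&gt; environment variable is set; the exact shape of the return value (a URL string versus a file-like object) depends on the client version:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import replicate

# replicate.run() blocks until the prediction completes,
# then returns the model output.
output = replicate.run(
    "black-forest-labs/flux-1.1-pro",
    input={
        "prompt": (
            "Photograph of a Faberge egg representing the California "
            "coast. It should be decorated with ornate pelicans and "
            "sea lions and a humpback whale."
        )
    },
)
print(output)  # a reference to the generated image
&lt;/code&gt;&lt;/pre&gt;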
&lt;p&gt;The FLUX models are also better at rendering text than any other image models I've tried myself.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=41730822"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stable-diffusion"&gt;stable-diffusion&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/text-to-image"&gt;text-to-image&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="stable-diffusion"/><category term="generative-ai"/><category term="replicate"/><category term="text-to-image"/></entry><entry><title>Accessing Llama 2 from the command-line with the llm-replicate plugin</title><link href="https://simonwillison.net/2023/Jul/18/accessing-llama-2/#atom-tag" rel="alternate"/><published>2023-07-18T19:30:59+00:00</published><updated>2023-07-18T19:30:59+00:00</updated><id>https://simonwillison.net/2023/Jul/18/accessing-llama-2/#atom-tag</id><summary type="html">
    &lt;p&gt;The big news today is &lt;a href="https://ai.meta.com/llama/"&gt;Llama 2&lt;/a&gt;, the new openly licensed Large Language Model from Meta AI. It's a really big deal:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Llama 2 is available for free for research and commercial use.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(swyx &lt;a href="https://twitter.com/swyx/status/1681364325469143040"&gt;spotted&lt;/a&gt; that it was rebranded from LLaMA 2 to Llama 2 &lt;a href="https://github.com/facebookresearch/llama/commit/6d4c0c290aeec1fa4399694fefb864be5a153bb6"&gt;a few hours ago&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Facebook's original LLaMA model, released in February, kicked off a seismic wave of innovation in the world of open source LLMs - from fine-tuned variants to from-scratch recreations. I wrote about the impact this was having a few weeks after its original release in &lt;a href="https://simonwillison.net/2023/Mar/11/llama/"&gt;Large language models are having their Stable Diffusion moment&lt;/a&gt;. If anything I under-estimated how significant this was going to be.&lt;/p&gt;
&lt;p&gt;There was just one catch: you couldn't use it for commercial purposes.&lt;/p&gt;
&lt;p&gt;That's changed with Llama 2!&lt;/p&gt;
&lt;p&gt;It does still have &lt;a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/"&gt;a couple of interesting limitations&lt;/a&gt;, emphasis mine:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You will not use the Llama Materials or any output or results of the Llama Materials &lt;strong&gt;to improve any other large language model&lt;/strong&gt; (excluding Llama 2 or derivative works thereof).&lt;/p&gt;
&lt;p&gt;If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, &lt;strong&gt;is greater than 700 million monthly active users&lt;/strong&gt; in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion[...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This still makes it available for most commercial purposes. Let's try it out.&lt;/p&gt;
&lt;h4&gt;Accessing Llama 2 using LLM&lt;/h4&gt;
&lt;p&gt;My &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool provides command-line access to a wide variety of language models, both via web APIs and self-hosted on your own machine.&lt;/p&gt;
&lt;p&gt;Last week &lt;a href="https://simonwillison.net/2023/Jul/12/llm/"&gt;I added plugin support&lt;/a&gt; for extending it with new models: an independently released plugin can now add new models to the existing CLI tool.&lt;/p&gt;
&lt;p&gt;The brand new &lt;a href="https://github.com/simonw/llm-replicate"&gt;llm-replicate&lt;/a&gt; plugin provides CLI access to models hosted on &lt;a href="https://replicate.com/"&gt;Replicate&lt;/a&gt;, and this morning a16z-infra released &lt;a href="https://replicate.com/a16z-infra/llama13b-v2-chat"&gt;a16z-infra/llama13b-v2-chat&lt;/a&gt; which provides Replicate API access to the new Llama 2 13B chat model.&lt;/p&gt;
&lt;p&gt;You'll need a &lt;a href="https://replicate.com/account/api-tokens"&gt;Replicate API key&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Install LLM like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;pip install llm&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(I like to use &lt;a href="https://pypa.github.io/pipx/"&gt;pipx&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Or if you don't have &lt;code&gt;pip&lt;/code&gt; setup but can use &lt;a href="https://brew.sh/"&gt;Homebrew&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;brew install simonw/llm/llm&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Run &lt;code&gt;llm --version&lt;/code&gt; to confirm it's installed.&lt;/p&gt;
&lt;p&gt;Next, install the &lt;code&gt;llm-replicate&lt;/code&gt; plugin:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-replicate&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The plugin adds a new set of &lt;code&gt;llm replicate&lt;/code&gt; commands - see them with &lt;code&gt;llm --help&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Configure your Replicate API key so the plugin can talk to the API:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; replicate&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;Enter key: &amp;lt;paste key here&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Register the new &lt;code&gt;a16z-infra/llama13b-v2-chat&lt;/code&gt; model with the plugin:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm replicate add a16z-infra/llama13b-v2-chat \
  --chat --alias llama2&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This tells the plugin that it's a "chat" model, which means you can have continuing conversations with it, rather than just sending single prompts.&lt;/p&gt;
&lt;p&gt;Now we can send a prompt, using the &lt;code&gt;llama2&lt;/code&gt; alias we configured there:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m llama2 "Ten great names for a pet pelican"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I get this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Certainly! Here are ten great names for a pet pelican:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Peanut: A cute and playful name for a small pelican.&lt;/li&gt;
&lt;li&gt;Puddles: A fun and silly name for a pelican that loves to splash around in water.&lt;/li&gt;
&lt;li&gt;Pete: A classic name that suits a friendly and outgoing pelican.&lt;/li&gt;
&lt;li&gt;Penelope: A pretty and elegant name for a female pelican.&lt;/li&gt;
&lt;li&gt;Percy: A charming and sophisticated name for a male pelican.&lt;/li&gt;
&lt;li&gt;Pippy: A playful and affectionate name for a pelican with a lively personality.&lt;/li&gt;
&lt;li&gt;Pelty: A fun and quirky name for a pelican with a unique sense of style.&lt;/li&gt;
&lt;li&gt;Pip: A sweet and endearing name for a small pelican.&lt;/li&gt;
&lt;li&gt;Plucky: A feisty and adventurous name for a pelican with a big personality.&lt;/li&gt;
&lt;li&gt;Poppy: A lovely and delicate name for a female pelican with a sweet disposition.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I hope these suggestions help you find the perfect name for your pet pelican!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And because it's a chat model, we can send a follow-up prompt to the &lt;a href="https://llm.datasette.io/en/stable/usage.html#continuing-a-conversation"&gt;same conversation&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -c &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Five more and make them more nautical&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;Ahoy! Here be five more nautical names for yer pet pelican:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Seabird: A fitting name for a pelican that loves the ocean and all its wonders.&lt;/li&gt;
&lt;li&gt;Captain Hook: A swashbuckling name for a pelican with a bold and fearless spirit.&lt;/li&gt;
&lt;li&gt;Anchor: A sturdy and dependable name for a pelican that's always grounded and stable.&lt;/li&gt;
&lt;li&gt;Finley: A fishy name for a pelican with a love for the sea and all its creatures.&lt;/li&gt;
&lt;li&gt;Buoyant: A bouncy and cheerful name for a pelican with a lighthearted and carefree attitude.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope these suggestions float yer boat, matey!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;LLM logs every prompt and response sent through it to a SQLite database. I'm using &lt;code&gt;--truncate&lt;/code&gt; here to shorten the output so it fits in this post:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --truncate&lt;/pre&gt;&lt;/div&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[
  {
    &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h5n5b3q6s54e03wtvqe90w2g&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;replicate-a16z-infra-llama13b-v2-chat&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"prompt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Five more and make them more nautical&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"system"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"response"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Ahoy matey! Here be five more nautical-themed names for yer pet pelican:&lt;span class="pl-cce"&gt;\n\n&lt;/span&gt;1. Captain Hook&lt;span class="pl-cce"&gt;\n&lt;/span&gt;2. Anch...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"response_json"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h5n5atsqfg4bhxc53j3ymnx2&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"duration_ms"&lt;/span&gt;: &lt;span class="pl-c1"&gt;7364&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"datetime_utc"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2023-07-18T18:48:51.999944&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Ten great names for a pet pelic&lt;span class="pl-cce"&gt;\u2026&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;replicate-a16z-infra-llama13b-v2-chat&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h5n5atss6k9zsr5qs5hb2wzd&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;replicate-a16z-infra-llama13b-v2-chat&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"prompt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Ten great names for a pet pelican&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"system"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"response"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Sure, here are ten great names for a pet pelican:&lt;span class="pl-cce"&gt;\n\n&lt;/span&gt;1. Pelty&lt;span class="pl-cce"&gt;\n&lt;/span&gt;2. Peanut&lt;span class="pl-cce"&gt;\n&lt;/span&gt;3. Puddles&lt;span class="pl-cce"&gt;\n&lt;/span&gt;4. Nibbles&lt;span class="pl-cce"&gt;\n&lt;/span&gt;5. Fu...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"response_json"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h5n5atsqfg4bhxc53j3ymnx2&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"duration_ms"&lt;/span&gt;: &lt;span class="pl-c1"&gt;7873&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"datetime_utc"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2023-07-18T18:48:42.356413&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Ten great names for a pet pelic&lt;span class="pl-cce"&gt;\u2026&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"conversation_model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;replicate-a16z-infra-llama13b-v2-chat&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  }
]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or run &lt;code&gt;llm logs path&lt;/code&gt; to see the path to the SQLite database file. If you have &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; installed you can open the logs database like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette "$(llm logs path)"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or with &lt;a href="https://datasette.io/desktop"&gt;Datasette Desktop&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;open "$(llm logs path)"
&lt;/code&gt;&lt;/pre&gt;
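&lt;p&gt;You can also query that database directly from Python. Here's a minimal sketch; note that the table name is my guess based on the fields shown above, so check the real schema first if it doesn't match:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import sqlite3
import subprocess

# Ask the llm CLI where it keeps its logs database
db_path = subprocess.check_output(
    ["llm", "logs", "path"], text=True
).strip()

conn = sqlite3.connect(db_path)
conn.row_factory = sqlite3.Row

# "logs" is an assumed table name - inspect the schema if this fails
sql = "select model, prompt, duration_ms from logs order by id desc limit 5"
for row in conn.execute(sql):
    print(dict(row))
&lt;/code&gt;&lt;/pre&gt;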
&lt;h4&gt;What's next&lt;/h4&gt;
&lt;p&gt;I'm really excited about Llama 2. I expect we'll start seeing all kinds of innovation on top of it as the model weights themselves start to roll out.&lt;/p&gt;
&lt;p&gt;I'm particularly looking forward to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The fine-tuned models. Alpaca, Vicuna et al. did amazing things with fine-tuning on top of the original LLaMA. I expect we'll see dozens of fine-tuned models over the next few weeks.&lt;/li&gt;
&lt;li&gt;Implementation of the &lt;a href="https://til.simonwillison.net/llms/python-react-pattern"&gt;ReAct&lt;/a&gt; and &lt;a href="https://arxiv.org/abs/2302.04761"&gt;Toolformer&lt;/a&gt; patterns, which will enable Llama 2 to drive external tools in a similar way to &lt;a href="https://openai.com/blog/function-calling-and-other-api-updates"&gt;OpenAI Functions&lt;/a&gt;. The &lt;a href="https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/"&gt;Llama 2 paper&lt;/a&gt; mentions testing against Toolformer prompts already, so this hopefully won't be too difficult (there's a sketch of the ReAct loop just after this list).&lt;/li&gt;
&lt;/ul&gt;
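&lt;p&gt;To make the ReAct idea concrete, here's a minimal sketch of the loop. The model call is stubbed out since this isn't tied to any particular API, and the Action:/Observation: conventions here are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import re

def llama2(prompt):
    # Stub: swap in a real model call here, for example by shelling
    # out to "llm -m llama2" or calling the Replicate API directly.
    raise NotImplementedError

TOOLS = {
    # Demo tool only - never eval() untrusted model output in production
    "calculate": lambda expression: str(eval(expression)),
}

ACTION_RE = re.compile(r"^Action: (\w+): (.*)$", re.MULTILINE)

def react(question, max_turns=5):
    prompt = question
    for _ in range(max_turns):
        response = llama2(prompt)
        match = ACTION_RE.search(response)
        if match is None:
            return response  # no action requested: treat as the final answer
        tool, argument = match.groups()
        observation = TOOLS[tool](argument)
        # Append the tool result and loop so the model can continue
        prompt = prompt + "\n" + response + "\nObservation: " + observation
&lt;/code&gt;&lt;/pre&gt;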
&lt;p&gt;Hopefully this time we'll start to see a concerted effort from people to document what kind of prompts work well with Llama 2 for practical purposes, since I expect this model will be an influential piece of the ecosystem for some time to come.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="replicate"/><category term="llm"/><category term="llm-release"/></entry><entry><title>How I Used Stable Diffusion and Dreambooth to Create A Painted Portrait of My Dog</title><link href="https://simonwillison.net/2023/Apr/16/stable-diffusion-and-dreambooth/#atom-tag" rel="alternate"/><published>2023-04-16T19:57:35+00:00</published><updated>2023-04-16T19:57:35+00:00</updated><id>https://simonwillison.net/2023/Apr/16/stable-diffusion-and-dreambooth/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.shruggingface.com/blog/how-i-used-stable-diffusion-and-dreambooth-to-create-a-painted-portrait-of-my-dog"&gt;How I Used Stable Diffusion and Dreambooth to Create A Painted Portrait of My Dog&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I like posts like this that go into detail about how much work it takes to deliberately get the kind of result you really want from generative AI tools. Jake Dahn trained a Dreambooth model from 40 photos of Queso—his photogenic Golden Retriever—using Replicate, gathered the prompts from ten images he liked on Lexica, and generated over 1,000 candidate images. He then picked his favourite, used Draw Things img2img resizing to expand the image beyond the initial crop, applied Automatic1111 inpainting to tweak the ears, and finished with Real-ESRGAN 4x+ upscaling for the final print.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=35592847"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stable-diffusion"&gt;stable-diffusion&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/text-to-image"&gt;text-to-image&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="stable-diffusion"/><category term="generative-ai"/><category term="replicate"/><category term="text-to-image"/></entry><entry><title>Fine-tune LLaMA to speak like Homer Simpson</title><link href="https://simonwillison.net/2023/Mar/17/fine-tune-llama-to-speak-like-homer-simpson/#atom-tag" rel="alternate"/><published>2023-03-17T23:08:40+00:00</published><updated>2023-03-17T23:08:40+00:00</updated><id>https://simonwillison.net/2023/Mar/17/fine-tune-llama-to-speak-like-homer-simpson/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://replicate.com/blog/fine-tune-llama-to-speak-like-homer-simpson"&gt;Fine-tune LLaMA to speak like Homer Simpson&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Replicate spent 90 minutes fine-tuning LLaMA on 60,000 lines of dialog from the first 12 seasons of the Simpsons, and now it can do a good job of producing invented dialog from any of the characters from the series. This is a really interesting result: I’ve been skeptical about how much value can be had from fine-tuning large models on just a tiny amount of new data, assuming that the new data would be statistically irrelevant compared to the existing model. Clearly my mental model around this was incorrect.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/the-simpsons"&gt;the-simpsons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fine-tuning"&gt;fine-tuning&lt;/a&gt;&lt;/p&gt;



</summary><category term="the-simpsons"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="replicate"/><category term="fine-tuning"/></entry><entry><title>Train and run Stanford Alpaca on your own machine</title><link href="https://simonwillison.net/2023/Mar/16/train-and-run-stanford-alpaca-on-your-own-machine/#atom-tag" rel="alternate"/><published>2023-03-16T16:10:39+00:00</published><updated>2023-03-16T16:10:39+00:00</updated><id>https://simonwillison.net/2023/Mar/16/train-and-run-stanford-alpaca-on-your-own-machine/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://replicate.com/blog/replicate-alpaca"&gt;Train and run Stanford Alpaca on your own machine&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The team at Replicate managed to train their own copy of Stanford’s Alpaca—a fine-tuned version of LLaMA that can follow instructions like ChatGPT. Here they provide step-by-step instructions for recreating Alpaca yourself—running the training needs one or more A100s for a few hours, which you can rent through various cloud providers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fine-tuning"&gt;fine-tuning&lt;/a&gt;&lt;/p&gt;



</summary><category term="stanford"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="replicate"/><category term="fine-tuning"/></entry><entry><title>A tool to run caption extraction against online videos using Whisper and GitHub Issues/Actions</title><link href="https://simonwillison.net/2022/Sep/30/action-transcription/#atom-tag" rel="alternate"/><published>2022-09-30T00:56:28+00:00</published><updated>2022-09-30T00:56:28+00:00</updated><id>https://simonwillison.net/2022/Sep/30/action-transcription/#atom-tag</id><summary type="html">
    &lt;p&gt;I released a new project this weekend, built during the Bellingcat Hackathon (I came second!). It's called &lt;a href="https://github.com/simonw/action-transcription"&gt;Action Transcription&lt;/a&gt; and it's a tool for capturing captions and transcripts from online videos.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://www.youtube.com/watch?v=AneNxjSGn1I"&gt;my video&lt;/a&gt; introducing the new tool:&lt;/p&gt;
&lt;iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="allowfullscreen" frameborder="0" height="315" src="https://www.youtube-nocookie.com/embed/AneNxjSGn1I" style="max-width: 100%" width="560"&gt; &lt;/iframe&gt;
&lt;h4&gt;Bellingcat&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://www.bellingcat.com/about/"&gt;Bellingcat&lt;/a&gt; describe themselves as an "independent international collective of researchers, investigators and citizen journalists using open source and social media investigation to probe a variety of subjects".&lt;/p&gt;
&lt;p&gt;They specialize in open source intelligence - which, confusingly, does NOT mean "open source software" - this is a &lt;a href="https://en.wikipedia.org/wiki/Open-source_intelligence"&gt;much older usage of the term&lt;/a&gt; that describes the use of publicly available information to gather intelligence.&lt;/p&gt;
&lt;p&gt;They have broken a LOT of impressive stories over their eight-year lifespan. Wikipedia &lt;a href="https://en.wikipedia.org/wiki/Bellingcat"&gt;has a good list&lt;/a&gt; - highlights include identifying the suspects behind the &lt;a href="https://en.wikipedia.org/wiki/Bellingcat#Skripal_poisoning"&gt;Skripal poisoning case&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The theme of the hackathon was "General Digital Investigation Tools". The goal was to build prototypes of tools that could be used by their community of investigators - most of whom are volunteers working from home with little-to-no budget, and often with limited technical skills (they can use tools very effectively but they might not be comfortable writing code or using the command-line).&lt;/p&gt;
&lt;p&gt;Inspired by the recent release of &lt;a href="https://github.com/openai/whisper"&gt;OpenAI's Whisper&lt;/a&gt;, I decided to build a tool that would make it easier to extract captions and transcripts from videos on social media sites.&lt;/p&gt;
&lt;h4&gt;Why GitHub Actions and GitHub Issues?&lt;/h4&gt;
&lt;p&gt;My goals for the project were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Help people achieve something useful&lt;/li&gt;
&lt;li&gt;Make it as inexpensive to run as possible - ideally free&lt;/li&gt;
&lt;li&gt;Make it easy for people to install and run their own copies&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I decided to build the entire thing using GitHub Actions and GitHub Issues.&lt;/p&gt;
&lt;p&gt;GitHub Actions is a powerful service for running CI jobs and other automation, but its best feature for this particular project is that it's free.&lt;/p&gt;
&lt;p&gt;I'm fine with spending money myself, but if I'm building tools for other people having a way for them to run the tool without paying for anything is a huge win.&lt;/p&gt;
&lt;p&gt;My tool needed a UI. To keep things as simple as possible, I didn't want to host anything outside of GitHub itself. So I turned to GitHub Issues to provide the interface layer.&lt;/p&gt;
&lt;p&gt;It's easy to create Actions scripts that trigger when a new issue is created. And those scripts can then interact with that issue - attaching comments, or even closing it as completed.&lt;/p&gt;
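&lt;p&gt;The actual project does this with actions/github-script in JavaScript, but purely as an illustration of the underlying REST call, here's what posting a comment back to an issue looks like from Python:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import json
import os
import urllib.request

def post_issue_comment(repo, issue_number, body):
    # POST /repos/{owner}/{repo}/issues/{issue_number}/comments
    url = "https://api.github.com/repos/%s/issues/%d/comments" % (repo, issue_number)
    request = urllib.request.Request(
        url,
        data=json.dumps({"body": body}).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + os.environ["GITHUB_TOKEN"],
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
&lt;/code&gt;&lt;/pre&gt;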
&lt;p&gt;I decided that my flow would be:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The user opens an issue and pastes in a link to an online video.&lt;/li&gt;
&lt;li&gt;GitHub Actions is triggered by that issue, extracts the URL and fetches the video using &lt;a href="https://youtube-dl.org/"&gt;youtube-dl&lt;/a&gt; (which, despite the name, can actually download videos from &lt;a href="http://ytdl-org.github.io/youtube-dl/supportedsites.html"&gt;over 1,200 sites&lt;/a&gt; including many of the social media services popular in Russia).&lt;/li&gt;
&lt;li&gt;The script extracts just the audio from the video.&lt;/li&gt;
&lt;li&gt;The audio is then passed through OpenAI's Whisper, which can create a high quality transcript in the original language AND create a shockingly good English translation.&lt;/li&gt;
&lt;li&gt;The caption is then both written back to the GitHub repository and attached to the original issue as a comment.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;GitHub Actions doesn't (yet) provide GPUs, and Whisper works a whole lot faster with GPU access. So I decided to run Whisper using &lt;a href="https://replicate.com/cjwbw/whisper"&gt;this hosted copy of the model on Replicate&lt;/a&gt;.&lt;/p&gt;
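&lt;p&gt;That Whisper call from Python looks roughly like this. The &lt;code&gt;audio&lt;/code&gt; input name and the shape of the output are my assumptions from the model page, so treat it as a sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import replicate

# Send the extracted audio file to the Replicate-hosted Whisper model
with open("audio.mp3", "rb") as audio:
    output = replicate.run(
        "cjwbw/whisper",
        input={"audio": audio},  # input name assumed from the model page
    )

# Expected to include the transcription (and a translation if requested)
print(output)
&lt;/code&gt;&lt;/pre&gt;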
&lt;h4&gt;Extracting YouTube's captions directly&lt;/h4&gt;
&lt;p&gt;I had a check-in meeting with Tristan from Bellingcat just to make sure my hack wasn't a duplicate effort, and to get feedback on the plan.&lt;/p&gt;
&lt;p&gt;Tristan liked the plan, but pointed out that extracting captions directly from YouTube would be a useful additional feature.&lt;/p&gt;
&lt;p&gt;It turns out that, in addition to manual captions, YouTube already creates machine-generated captions in over 100 languages! The quality of these isn't nearly as good as OpenAI Whisper, but they're still useful. And they're free (running Whisper currently costs me money).&lt;/p&gt;
&lt;p&gt;So I adapted the plan, to provide the user with two options. The default option would extract captions directly from the video provider - which would definitely work for YouTube and might work for other sites too.&lt;/p&gt;
&lt;p&gt;The second option would use Whisper to create a transcript and a translation, taking longer but providing results even for sites that didn't offer their own captions.&lt;/p&gt;
&lt;p&gt;I decided to use issue tags to trigger these two workflows: tag with "captions" to extract captions directly, tag with "whisper" to use Whisper.&lt;/p&gt;
&lt;h4&gt;The implementation&lt;/h4&gt;
&lt;p&gt;The implementation ended up being &lt;a href="https://github.com/simonw/action-transcription/blob/7d900b209c6c465df35a27bb812d03754677cb78/.github/workflows/issue_created.yml"&gt;218 lines&lt;/a&gt; of JavaScript-embedded-in-YAML in a GitHub Actions &lt;code&gt;issue_created.yml&lt;/code&gt; workflow.&lt;/p&gt;
&lt;p&gt;I used &lt;a href="https://github.com/actions/github-script"&gt;actions/github-script&lt;/a&gt; for it - a convenient reusable Action that provides a pre-configured set of JavaScript objects for interacting with the GitHub API.&lt;/p&gt;
&lt;p&gt;The code isn't hugely elegant: I'm not hugely familiar with the Node.js ecosystem so I ended up hacking around with Copilot quite a bit to figure out the patterns that would work.&lt;/p&gt;
&lt;p&gt;It turns out captions can come back in a variety of different formats. The two most common appeared to be TTML, which uses XML, and WebVTT, a text-based format.&lt;/p&gt;
&lt;p&gt;I decided to archive the original caption files in the GitHub repository itself, but I wanted to extract just the text and post that as the issue comment.&lt;/p&gt;
&lt;p&gt;So I ended up building two tiny new tools: &lt;a href="https://github.com/simonw/webvtt-to-json"&gt;webvtt-to-json&lt;/a&gt; and &lt;a href="https://github.com/simonw/ttml-to-json"&gt;ttml-to-json&lt;/a&gt; - which converted the different formats into a standard JSON format of my own invention, normalizing the captions so I could then extract the text and include it in a comment.&lt;/p&gt;
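&lt;p&gt;As an illustration of the idea (this is not the actual webvtt-to-json code), a simplified WebVTT cue parser might look like this; real WebVTT allows more timestamp and metadata variations than this handles:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import re

# Matches cue timing lines like: 00:00:01.000 --&amp;gt; 00:00:04.500
CUE_RE = re.compile(r"([\d:.]+)\s+--&amp;gt;\s+([\d:.]+)")

def webvtt_to_cues(text):
    # Note: optional cue identifier lines are not handled by this sketch
    cues = []
    current = None
    for line in text.splitlines():
        match = CUE_RE.search(line)
        if match:
            current = {"start": match.group(1), "end": match.group(2), "lines": []}
            cues.append(current)
        elif current is not None and line.strip():
            current["lines"].append(line.strip())
    return cues
&lt;/code&gt;&lt;/pre&gt;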
&lt;p&gt;Hackathons tend to encourage some pretty scrappy solutions!&lt;/p&gt;
&lt;h4&gt;The results&lt;/h4&gt;
&lt;p&gt;These two issues demonstrate the final result of the tool:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/action-transcription-demo/issues/3"&gt;Example issue with a VK video transcribed to English using Whisper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/action-transcription-demo/issues/4"&gt;Example issue that extracted YouTube auto-generated English captions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That first one in particular shows quite how good the Whisper model is at handling Russian text, and translating it to English.&lt;/p&gt;
&lt;h4&gt;Adding issue templates&lt;/h4&gt;
&lt;p&gt;I added one last enhancement to the project after recording the demo video for the judges embedded above.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository"&gt;Issue templates&lt;/a&gt; are a new GitHub feature that let you define a form that users must fill out when they create a new issue.&lt;/p&gt;
&lt;p&gt;Frustratingly, these only work with public repositories. I had built my hack in a private repo at first, so I was only able to explore using issue templates once I had made it public.&lt;/p&gt;
&lt;p&gt;I created &lt;a href="https://github.com/simonw/action-transcription/tree/7d900b209c6c465df35a27bb812d03754677cb78/.github/ISSUE_TEMPLATE"&gt;two issue templates&lt;/a&gt; - one for caption tasks and one for whisper tasks.&lt;/p&gt;
&lt;p&gt;Now when a user goes to open a new issue they get to choose one of the two templates and fill in the URL as part of a form! Here's a GIF demo showing that flow in action:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/action-transcription-demo.gif" alt="Animated demo. Click Issues, then New Issue, then select Get Started on the Capture captions menu option. Paste in a URL and click Submit new issue." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;Template repositories&lt;/h4&gt;
&lt;p&gt;One last trick. I want users to be able to run this system themselves, on their own GitHub account.&lt;/p&gt;
&lt;p&gt;I made &lt;a href="https://github.com/simonw/action-transcription"&gt;simonw/action-transcription&lt;/a&gt; a &lt;a href="https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-template-repository"&gt;template repository&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This means that any user can click a green button to get their own copy of the repository - and when they do, they'll get their own fully configured copy of the GitHub Actions workflows too.&lt;/p&gt;
&lt;p&gt;If they want to use Whisper they'll need to get an API key from &lt;a href="https://replicate.com/"&gt;Replicate.com&lt;/a&gt; and add it to their repository's secrets - but regular caption extraction will work fine without that.&lt;/p&gt;
&lt;p&gt;I've used this technique before - I wrote about it here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2022/Mar/14/shot-scraper-template/"&gt;Instantly create a GitHub repository to take screenshots of a web page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2021/Aug/28/dynamic-github-repository-templates/"&gt;Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;GitHub Actions as a platform&lt;/h4&gt;
&lt;p&gt;I'm pleased with how this project turned out. But I'm mainly excited about the underlying pattern. I think building tools using GitHub Actions that people can clone to their own accounts is a really promising way of developing sophisticated automated software that people can then run independently, entirely through the GitHub web interface.&lt;/p&gt;
&lt;p&gt;I'm excited to see more tools adopt a similar pattern.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hackathons"&gt;hackathons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bellingcat"&gt;bellingcat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/whisper"&gt;whisper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/speech-to-text"&gt;speech-to-text&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="hackathons"/><category term="bellingcat"/><category term="github-actions"/><category term="openai"/><category term="whisper"/><category term="replicate"/><category term="github-issues"/><category term="speech-to-text"/></entry></feed>