<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: symbex</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/symbex.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-04-23T14:25:38+00:00</updated><author><name>Simon Willison</name></author><entry><title>llm-fragment-symbex</title><link href="https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/#atom-tag" rel="alternate"/><published>2025-04-23T14:25:38+00:00</published><updated>2025-04-23T14:25:38+00:00</updated><id>https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-fragments-symbex"&gt;llm-fragments-symbex&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I released a new LLM &lt;a href="https://llm.datasette.io/en/stable/fragments.html#using-fragments-from-plugins"&gt;fragment loader plugin&lt;/a&gt; that builds on top of my &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;Symbex&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;Symbex is a CLI tool I wrote that can run against a folder full of Python code and output functions, classes, methods or just their docstrings and signatures, using the Python AST module to parse the code.&lt;/p&gt;
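&lt;p&gt;As a rough illustration of the AST approach, here is a minimal sketch using only the standard library &lt;code&gt;ast&lt;/code&gt; module (Python 3.9 or later). It is not Symbex's actual implementation, and &lt;code&gt;signature&lt;/code&gt; and &lt;code&gt;outline&lt;/code&gt; are hypothetical helper names:&lt;/p&gt;
&lt;pre&gt;
```python
import ast
import copy


def signature(node):
    """Render just the 'def ...:' or 'class ...:' line for an AST node."""
    stub = copy.copy(node)      # shallow copy, so the parsed tree is left untouched
    stub.decorator_list = []    # drop decorators so the first line is the signature
    stub.body = [ast.Expr(ast.Constant(...))]  # swap the body for a bare '...'
    return ast.unparse(stub).splitlines()[0]


def outline(source):
    """Return signatures and docstring summaries for top-level symbols."""
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, wanted):
            lines.append(signature(node))
            doc = ast.get_docstring(node)
            if doc:
                # Keep only the first line of the docstring as a summary
                lines.append('    """' + doc.splitlines()[0] + '"""')
    return lines
```
&lt;/pre&gt;
&lt;p&gt;Parsing rather than importing means the code being inspected is never executed.&lt;/p&gt;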
&lt;p&gt;&lt;code&gt;llm-fragments-symbex&lt;/code&gt; brings that ability directly to LLM. It lets you do things like this:&lt;/p&gt;
&lt;pre&gt;llm install llm-fragments-symbex
llm -f symbex:path/to/project -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Describe this codebase&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;I just ran that against my LLM project itself like this:&lt;/p&gt;
&lt;pre&gt;cd llm
llm -f symbex:. -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;guess what this code does&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/b43d5b3ea897900f5c7de7173cc51c82#response"&gt;the full output&lt;/a&gt;, which starts like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This code listing appears to be an index or dump of Python functions, classes, and methods primarily belonging to a codebase related to large language models (LLMs). It covers a broad functionality set related to managing LLMs, embeddings, templates, plugins, logging, and command-line interface (CLI) utilities for interaction with language models. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That page also &lt;a href="https://gist.github.com/simonw/b43d5b3ea897900f5c7de7173cc51c82#prompt-fragments"&gt;shows the input generated by the fragment&lt;/a&gt; - here's a representative extract:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# from llm.cli import resolve_attachment&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;resolve_attachment&lt;/span&gt;(&lt;span class="pl-s1"&gt;value&lt;/span&gt;):
    &lt;span class="pl-s"&gt;"""Resolve an attachment from a string value which could be:&lt;/span&gt;
&lt;span class="pl-s"&gt;    - "-" for stdin&lt;/span&gt;
&lt;span class="pl-s"&gt;    - A URL&lt;/span&gt;
&lt;span class="pl-s"&gt;    - A file path&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    Returns an Attachment object.&lt;/span&gt;
&lt;span class="pl-s"&gt;    Raises AttachmentError if the attachment cannot be resolved."""&lt;/span&gt;

&lt;span class="pl-c"&gt;# from llm.cli import AttachmentType&lt;/span&gt;
&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;AttachmentType&lt;/span&gt;:

    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;convert&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-s1"&gt;value&lt;/span&gt;, &lt;span class="pl-s1"&gt;param&lt;/span&gt;, &lt;span class="pl-s1"&gt;ctx&lt;/span&gt;):

&lt;span class="pl-c"&gt;# from llm.cli import resolve_attachment_with_type&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;resolve_attachment_with_type&lt;/span&gt;(&lt;span class="pl-s1"&gt;value&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;mimetype&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-smi"&gt;Attachment&lt;/span&gt;:&lt;/pre&gt;

&lt;p&gt;If your Python code has good docstrings and type annotations, this should hopefully be a shortcut for providing full API documentation to a model without needing to dump in the entire codebase.&lt;/p&gt;
&lt;p&gt;The above example used 13,471 input tokens and 781 output tokens, using &lt;code&gt;openai/gpt-4.1-mini&lt;/code&gt;. That model is extremely cheap, so the total cost was 0.6638 cents - less than a cent.&lt;/p&gt;
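&lt;p&gt;The arithmetic behind that figure, as a minimal sketch. The per-million-token prices are assumptions (the list prices for &lt;code&gt;gpt-4.1-mini&lt;/code&gt; are taken to be $0.40 for input and $1.60 for output), and &lt;code&gt;cost_in_cents&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;
&lt;pre&gt;
```python
def cost_in_cents(input_tokens, output_tokens, input_price, output_price):
    """Prices are in US dollars per million tokens; the result is in cents."""
    dollars = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return dollars * 100


# Assumed gpt-4.1-mini prices: $0.40 per million input, $1.60 per million output
print(round(cost_in_cents(13_471, 781, 0.40, 1.60), 4))
```
&lt;/pre&gt;
&lt;p&gt;Under those assumed prices the input tokens account for roughly 0.54 cents and the output tokens for roughly 0.12 cents.&lt;/p&gt;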
&lt;p&gt;The plugin itself was mostly written by o4-mini using the &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt; plugin to load the &lt;a href="https://github.com/simonw/symbex"&gt;simonw/symbex&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-hacker-news"&gt;simonw/llm-hacker-news&lt;/a&gt; repositories as example code:&lt;/p&gt;
&lt;pre&gt;llm \
  -f github:simonw/symbex \
  -f github:simonw/llm-hacker-news \
  -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Write a new plugin as a single llm_fragments_symbex.py file which&lt;/span&gt;
&lt;span class="pl-s"&gt;   provides a custom loader which can be used like this:&lt;/span&gt;
&lt;span class="pl-s"&gt;   llm -f symbex:path/to/folder - it then loads in all of the python&lt;/span&gt;
&lt;span class="pl-s"&gt;   function signatures with their docstrings from that folder using&lt;/span&gt;
&lt;span class="pl-s"&gt;   the same trick that symbex uses, effectively the same as running&lt;/span&gt;
&lt;span class="pl-s"&gt;   symbex . '*' '*.*' --docs --imports -n&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
   -m openai/o4-mini -o reasoning_effort high&lt;/pre&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/c46390522bc839daab6c08bad3f87b39#response"&gt;the response&lt;/a&gt;. It used 27,819 input tokens and 2,918 output tokens, for a total cost of 4.344 cents.&lt;/p&gt;
&lt;p&gt;In working on this project I identified and fixed &lt;a href="https://github.com/simonw/symbex/issues/46"&gt;a minor cosmetic defect&lt;/a&gt; in Symbex itself. Technically this is a breaking change (it changes the output) so I shipped that as &lt;a href="https://github.com/simonw/symbex/releases/tag/2.0"&gt;Symbex 2.0&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="symbex"/><category term="llm"/></entry><entry><title>Symbex 1.4</title><link href="https://simonwillison.net/2023/Sep/5/symbex/#atom-tag" rel="alternate"/><published>2023-09-05T17:29:25+00:00</published><updated>2023-09-05T17:29:25+00:00</updated><id>https://simonwillison.net/2023/Sep/5/symbex/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/symbex/releases/tag/1.4"&gt;Symbex 1.4&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New release of my Symbex tool for finding symbols (functions, methods and classes) in a Python codebase. Symbex can now output matching symbols in JSON, CSV or TSV in addition to plain text.&lt;/p&gt;

&lt;p&gt;I designed this feature for compatibility with the new “llm embed-multi” command—so you can now use Symbex to find every Python function in a nested directory and then pipe them to LLM to calculate embeddings for every one of them.&lt;/p&gt;

&lt;p&gt;I tried it on my projects directory and embedded over 13,000 functions in just a few minutes! Next step is to figure out what kind of interesting things I can do with all of those embeddings.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="embeddings"/><category term="symbex"/><category term="llm"/></entry><entry><title>Weeknotes: Self-hosted language models with LLM plugins, a new Datasette tutorial, a dozen package releases, a dozen TILs</title><link href="https://simonwillison.net/2023/Jul/16/weeknotes/#atom-tag" rel="alternate"/><published>2023-07-16T05:55:54+00:00</published><updated>2023-07-16T05:55:54+00:00</updated><id>https://simonwillison.net/2023/Jul/16/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;A lot of stuff to cover from the past two and a half weeks.&lt;/p&gt;
&lt;h4&gt;LLM and self-hosted language model plugins&lt;/h4&gt;
&lt;p&gt;My biggest project was the &lt;a href="https://simonwillison.net/2023/Jul/12/llm/"&gt;new version of my LLM tool for interacting with Large Language Models&lt;/a&gt;. LLM now accepts plugins for adding alternative language models to the tool, meaning it's now applicable to more than just the OpenAI collection.&lt;/p&gt;
&lt;p&gt;I figured out quite a few of the details of this while offline on a camping trip up in the Northern California redwoods. With no connection to the OpenAI APIs, I had to work out how to use LLMs that I could host on my own computer.&lt;/p&gt;
&lt;p&gt;Comprehensive documentation is sorely lacking in the world of generative AI. I've decided to push back against that for LLM, so I spent a bunch of time working on an extremely comprehensive tutorial for writing a plugin that adds a new language model to the tool:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html"&gt;Writing a plugin to support a new model&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As part of researching this tutorial I finally figured out how to build a Python package using just a &lt;code&gt;pyproject.toml&lt;/code&gt; file, with no &lt;code&gt;setup.py&lt;/code&gt; or &lt;code&gt;setup.cfg&lt;/code&gt; or anything else like that. I wrote that up in detail in &lt;a href="https://til.simonwillison.net/python/pyproject"&gt;Python packages with pyproject.toml and nothing else&lt;/a&gt;, and I've started using that pattern for all of my new Python packages.&lt;/p&gt;
&lt;p&gt;LLM also now includes a Python API for interacting with models, which provides an abstraction that works the same for the OpenAI models and for other models (including self-hosted models) installed via plugins. Here's &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;the documentation for that&lt;/a&gt; - it ends up looking like this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-en"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-3.5-turbo"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-s1"&gt;key&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;'YOUR_API_KEY_HERE'&lt;/span&gt;
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-en"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Five surprising names for a pet pelican"&lt;/span&gt;)
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;chunk&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;:
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;To use another model, just swap its name in for &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;. The self-hosted models provided by the &lt;a href="https://github.com/simonw/llm-gpt4all"&gt;llm-gpt4all&lt;/a&gt; plugin work the same way:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;pip install llm-gpt4all&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-en"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"ggml-vicuna-7b-1"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-en"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Five surprising names for a pet pelican"&lt;/span&gt;)
&lt;span class="pl-c"&gt;# You can do this instead of looping through the chunks:&lt;/span&gt;
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-en"&gt;text&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;I've released three plugins so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/llm-gpt4all"&gt;llm-gpt4all&lt;/a&gt; with 17 self-hosted models from the &lt;a href="https://gpt4all.io/"&gt;GPT4All&lt;/a&gt; project.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/llm-palm"&gt;llm-palm&lt;/a&gt; with Google's &lt;a href="https://blog.google/technology/ai/google-palm-2-ai-large-language-model/"&gt;PaLM 2&lt;/a&gt; language model, via their API.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/llm-mpt30b"&gt;llm-mpt30b&lt;/a&gt; providing the 19GB MPT-30B model, using &lt;a href="https://huggingface.co/TheBloke/mpt-30B-GGML"&gt;TheBloke/mpt-30B-GGML&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I'm looking forward to someone else &lt;a href="https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html"&gt;following the tutorial&lt;/a&gt; and releasing their own plugin!&lt;/p&gt;
&lt;h4&gt;A new tutorial: Data analysis with SQLite and Python&lt;/h4&gt;
&lt;p&gt;I presented this as a 2hr45m tutorial at PyCon a few months ago. The video is now available, and I like to try to turn these kinds of things into more permanent documentation.&lt;/p&gt;
&lt;p&gt;The Datasette website has &lt;a href="https://datasette.io/tutorials"&gt;a growing collection of tutorials&lt;/a&gt;, and I decided to make that the final home for this one too.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette.io/tutorials/data-analysis"&gt;Data analysis with SQLite and Python&lt;/a&gt; now has the full 2hr45m video plus an improved version of the handout I used for the talk. The written material there should also be valuable for people who don't want to spend nearly three hours watching the video!&lt;/p&gt;
&lt;p&gt;As part of putting that page together I solved a problem I've been wanting to figure out for a long time: I figured out a way to build a custom Jinja block tag that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-django"&gt;&lt;pre&gt;&lt;span class="pl-e"&gt;{%&lt;/span&gt; &lt;span class="pl-s"&gt;markdown&lt;/span&gt; &lt;span class="pl-e"&gt;%}&lt;/span&gt;
# This will be rendered as markdown

- Bulleted
- List
&lt;span class="pl-e"&gt;{%&lt;/span&gt; &lt;span class="pl-s"&gt;endmarkdown&lt;/span&gt; &lt;span class="pl-e"&gt;%}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I released that in &lt;a href="https://github.com/simonw/datasette-render-markdown/releases/tag/2.2"&gt;datasette-render-markdown 2.2&lt;/a&gt;. I also wrote up a TIL on &lt;a href="https://til.simonwillison.net/jinja/custom-jinja-tags-with-attributes"&gt;Custom Jinja template tags with attributes&lt;/a&gt; describing the pattern I used.&lt;/p&gt;
&lt;p&gt;One bonus feature for that tutorial: I decided to drop in a nested table of contents, automatically derived from the HTML headers on the page.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/toc.jpg" alt=" What you'll need python3 and pip Optional: GitHub Codespaces sqlite-utils Using the command-line tools to clean data Exploring data with Datasette Installing Datasette locally Try a database: legislators.db Install some plugins Learning SQL with Datasette Using sqlite-utils as a Python library, to import all the PEPs Enabling full-text search Publishing a database to Vercel Other publishing options Datasette Lite Loading SQLite, CSV and JSON data Installing plugins Further reading Advanced SQL Aggregations Subqueries CTEs JSON Window functions Baked Data Niche Museums and TILs Generating a newsletter with an Observable notebook More demos and further reading Fun demos SpatiaLite " style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I wrote the code for this entirely using the new ChatGPT Code Interpreter, which can write Python based on your description and, crucially, &lt;em&gt;execute it and see if it works&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://chat.openai.com/share/e41efb6d-eae7-454d-9aa2-5284683ba9f9"&gt;my ChatGPT transcript&lt;/a&gt; showing how I built the feature.&lt;/p&gt;
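&lt;p&gt;The transcript has the code that was actually used. As a separate, minimal sketch of the general idea (collect the headings, then indent each one by its level), assuming only the standard library &lt;code&gt;html.parser&lt;/code&gt; module, with &lt;code&gt;HeadingCollector&lt;/code&gt; and &lt;code&gt;table_of_contents&lt;/code&gt; as hypothetical names:&lt;/p&gt;
&lt;pre&gt;
```python
from html.parser import HTMLParser

HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every heading element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.current = None  # level of the heading being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.current = HEADINGS[tag]
            self.headings.append((self.current, ""))

    def handle_data(self, data):
        if self.current is not None:
            # Append text to the heading that is currently open
            level, text = self.headings[-1]
            self.headings[-1] = (level, text + data)

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self.current = None


def table_of_contents(html):
    """Return indented outline lines, one per heading."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    if not parser.headings:
        return []
    top = min(level for level, _ in parser.headings)
    return ["  " * (level - top) + text.strip() for level, text in parser.headings]
```
&lt;/pre&gt;
&lt;p&gt;This sketch produces an indented plain-text outline; turning it into nested HTML lists with anchor links is left out.&lt;/p&gt;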
&lt;p&gt;I've been using ChatGPT Code Interpreter for a few months now, and I'm completely hooked: I think it's the most interesting thing in the whole AI space at the moment.&lt;/p&gt;
&lt;p&gt;I participated in a &lt;a href="https://www.latent.space/p/code-interpreter"&gt;Code Interpreter Latent Space&lt;/a&gt; episode to talk about it, which ended up drawing 17,000 listeners on Twitter Spaces and is now also available as a podcast episode, neatly edited together by swyx.&lt;/p&gt;
&lt;h4&gt;Symbex --check and --rexec&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/symbex"&gt;Symbex&lt;/a&gt; is my Python CLI tool for quickly finding Python functions and classes and outputting either the full code or just the signature of the matching symbol. I first &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;wrote about that here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/symbex/releases/tag/1.1"&gt;symbex 1.1&lt;/a&gt; adds two new features.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex --function --undocumented --check&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This new &lt;code&gt;--check&lt;/code&gt; mode is designed to run in Continuous Integration environments. If it finds any symbols matching the filters (in this case functions that are missing their docstring) it returns a non-zero exit code, which will fail the CI step.&lt;/p&gt;
&lt;p&gt;It's an imitation of &lt;code&gt;black . --check&lt;/code&gt; - the idea is that Symbex can now be used to enforce code quality rules such as the presence of docstrings and type annotations.&lt;/p&gt;
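&lt;p&gt;The exit-code convention can be sketched like this. It illustrates the idea with the standard library only and is not Symbex's actual code; &lt;code&gt;undocumented_functions&lt;/code&gt; and &lt;code&gt;check&lt;/code&gt; are hypothetical names:&lt;/p&gt;
&lt;pre&gt;
```python
import ast


def undocumented_functions(source):
    """Names of functions and methods in source that have no docstring."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef)
    return [
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, kinds) and ast.get_docstring(node) is None
    ]


def check(source):
    """Return 1 if anything matched (failing a CI step), otherwise 0."""
    matches = undocumented_functions(source)
    for name in matches:
        print("missing docstring:", name)
    return 1 if matches else 0
```
&lt;/pre&gt;
&lt;p&gt;A CI wrapper would pass the return value of &lt;code&gt;check()&lt;/code&gt; to &lt;code&gt;sys.exit()&lt;/code&gt; so that any match fails the step.&lt;/p&gt;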
&lt;p&gt;The other new feature is &lt;code&gt;--rexec&lt;/code&gt;. This is an extension of the existing &lt;code&gt;--replace&lt;/code&gt; feature, which lets you find a symbol in your code and replace its body with new code.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;--rexec&lt;/code&gt; takes a shell expression. The body of the matching symbol will be piped into that command, and its output will be used as the replacement.&lt;/p&gt;
&lt;p&gt;Which means you can do things like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex my_function \
  --rexec &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;llm --system 'add type hints and a docstring'&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will find &lt;code&gt;def my_function()&lt;/code&gt; and its body, pass that through &lt;code&gt;llm&lt;/code&gt; (using the &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; default model, but you can specify &lt;code&gt;-m gpt-4&lt;/code&gt; or any other model to use something else), and then take the output and update the file in-place with the new implementation.&lt;/p&gt;
&lt;p&gt;As a demo, I ran it against this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;my_function&lt;/span&gt;(&lt;span class="pl-s1"&gt;a&lt;/span&gt;, &lt;span class="pl-s1"&gt;b&lt;/span&gt;):
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;a&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;b&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;3&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;And got back:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;my_function&lt;/span&gt;(&lt;span class="pl-s1"&gt;a&lt;/span&gt;: &lt;span class="pl-s1"&gt;int&lt;/span&gt;, &lt;span class="pl-s1"&gt;b&lt;/span&gt;: &lt;span class="pl-s1"&gt;int&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;int&lt;/span&gt;:
    &lt;span class="pl-s"&gt;"""&lt;/span&gt;
&lt;span class="pl-s"&gt;    Returns the sum of two integers (a and b) plus 3.&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    Parameters:&lt;/span&gt;
&lt;span class="pl-s"&gt;    a (int): The first integer.&lt;/span&gt;
&lt;span class="pl-s"&gt;    b (int): The second integer.&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    Returns:&lt;/span&gt;
&lt;span class="pl-s"&gt;    int: The sum of a and b plus 3.&lt;/span&gt;
&lt;span class="pl-s"&gt;    """&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;a&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;b&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;3&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;Obviously this is fraught with danger, and you should only run this against code that has already been committed to Git and hence can be easily recovered... but it's a really fun trick!&lt;/p&gt;
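&lt;p&gt;The pipe-and-replace mechanism can be sketched with &lt;code&gt;subprocess&lt;/code&gt;. This is an illustration under stated assumptions rather than Symbex's implementation: &lt;code&gt;pipe_through&lt;/code&gt; is a hypothetical helper, and it returns the replacement text instead of editing a file in place:&lt;/p&gt;
&lt;pre&gt;
```python
import subprocess


def pipe_through(command, body):
    """Feed body to a shell command on stdin and return what it writes to stdout."""
    result = subprocess.run(
        command,
        shell=True,           # the command is a shell expression, as with --rexec
        input=body,
        capture_output=True,
        text=True,
        check=True,           # raise if the command exits with a non-zero status
    )
    return result.stdout
```
&lt;/pre&gt;
&lt;p&gt;Raising on a non-zero exit status means a failed command never supplies replacement text, which matters when the next step would overwrite source code.&lt;/p&gt;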
&lt;h4&gt;ttok --encode --decode&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;ttok&lt;/code&gt; is my CLI tool for counting tokens, as used by large language models such as GPT-4. &lt;a href="https://github.com/simonw/ttok/releases/tag/0.2"&gt;ttok 0.2&lt;/a&gt; adds a requested feature to help make tokens easier to understand, best illustrated by this demo:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;ttok Hello world
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs 2 - the number of tokens&lt;/span&gt;
ttok Hello world --encode
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs 9906 1917 - the encoded tokens&lt;/span&gt;
ttok 9906 1917 --decode
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs Hello world - decoding the tokens back again&lt;/span&gt;
ttok Hello world --encode --tokens
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs [b'Hello', b' world']&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Being able to easily see the encoded tokens including whitespace (the &lt;code&gt;b' world'&lt;/code&gt; part) is particularly useful for understanding how the tokens all fit together.&lt;/p&gt;
&lt;p&gt;I wrote more about GPT tokenization in &lt;a href="https://simonwillison.net/2023/Jun/8/gpt-tokenizers/"&gt;understanding GPT tokenizers&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/tree-sitter"&gt;Using tree-sitter with Python&lt;/a&gt; - 2023-07-14&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/yaml/yamlfmt"&gt;Auto-formatting YAML files with yamlfmt&lt;/a&gt; - 2023-07-13&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/quick-testing-pyenv"&gt;Quickly testing code in a different Python version using pyenv&lt;/a&gt; - 2023-07-10&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/git/git-filter-repo"&gt;Using git-filter-repo to set commit dates to author dates&lt;/a&gt; - 2023-07-10&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/gpt3/openai-python-functions-data-extraction"&gt;Using OpenAI functions and their Python library for data extraction&lt;/a&gt; - 2023-07-10&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/pyproject"&gt;Python packages with pyproject.toml and nothing else&lt;/a&gt; - 2023-07-08&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/datasette/syntax-highlighted-code-examples"&gt;Syntax highlighted code examples in Datasette&lt;/a&gt; - 2023-07-02&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/jinja/custom-jinja-tags-with-attributes"&gt;Custom Jinja template tags with attributes&lt;/a&gt; - 2023-07-02&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/macos/wildcard-dns-dnsmasq"&gt;Local wildcard DNS on macOS with dnsmasq&lt;/a&gt; - 2023-06-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/discord/discord-github-issues-bot"&gt;A Discord bot to expand issue links to a private GitHub repository&lt;/a&gt; - 2023-06-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/github/bulk-edit-github-projects"&gt;Bulk editing status in GitHub Projects&lt;/a&gt; - 2023-06-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/stdlib-cli-tools"&gt;CLI tools hidden in the Python standard library&lt;/a&gt; - 2023-06-29&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/symbex/releases/tag/1.1"&gt;symbex 1.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-16&lt;br /&gt;Find the Python code for specified symbols&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-mpt30b/releases/tag/0.1"&gt;llm-mpt30b 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-12&lt;br /&gt;LLM plugin adding support for the MPT-30B language model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-markov/releases/tag/0.1"&gt;llm-markov 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-12&lt;br /&gt;Plugin for LLM adding a Markov chain generating model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-gpt4all/releases/tag/0.1"&gt;llm-gpt4all 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-12&lt;br /&gt;Plugin for LLM adding support for the GPT4All collection of models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-palm/releases/tag/0.1"&gt;llm-palm 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-12&lt;br /&gt;Plugin for LLM adding support for Google's PaLM 2 model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm/releases/tag/0.5"&gt;llm 0.5&lt;/a&gt;&lt;/strong&gt; - 2023-07-12&lt;br /&gt;Access large language models from the command-line&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/ttok/releases/tag/0.2"&gt;ttok 0.2&lt;/a&gt;&lt;/strong&gt; - 2023-07-10&lt;br /&gt;Count and truncate text based on tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/strip-tags/releases/tag/0.5.1"&gt;strip-tags 0.5.1&lt;/a&gt;&lt;/strong&gt; - 2023-07-09&lt;br /&gt;CLI tool for stripping tags from HTML&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/dogsheep/pocket-to-sqlite/releases/tag/0.2.3"&gt;pocket-to-sqlite 0.2.3&lt;/a&gt;&lt;/strong&gt; - 2023-07-09&lt;br /&gt;Create a SQLite database containing data from your Pocket account&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-render-markdown/releases/tag/2.2"&gt;datasette-render-markdown 2.2&lt;/a&gt;&lt;/strong&gt; - 2023-07-02&lt;br /&gt;Datasette plugin for rendering Markdown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/asgi-proxy-lib/releases/tag/0.1a0"&gt;asgi-proxy-lib 0.1a0&lt;/a&gt;&lt;/strong&gt; - 2023-07-01&lt;br /&gt;An ASGI function for proxying to a backend over HTTP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-upload-csvs/releases/tag/0.8.3"&gt;datasette-upload-csvs 0.8.3&lt;/a&gt;&lt;/strong&gt; - 2023-06-28&lt;br /&gt;Datasette plugin for uploading CSV files and converting them to database tables&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tutorials"&gt;tutorials&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="tutorials"/><category term="ai"/><category term="datasette"/><category term="weeknotes"/><category term="sqlite-utils"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="symbex"/><category term="llm"/></entry><entry><title>Weeknotes: symbex, LLM prompt templates, a bit of a break</title><link href="https://simonwillison.net/2023/Jun/27/weeknotes/#atom-tag" rel="alternate"/><published>2023-06-27T16:30:57+00:00</published><updated>2023-06-27T16:30:57+00:00</updated><id>https://simonwillison.net/2023/Jun/27/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I had a holiday to the UK for a family wedding anniversary and mostly took the time off... except for building &lt;strong&gt;symbex&lt;/strong&gt;, which became one of those projects that kept on inspiring new features.&lt;/p&gt;
&lt;p&gt;I've also been working on some major improvements to my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool for working with language models from the command-line.&lt;/p&gt;
&lt;h4&gt;symbex&lt;/h4&gt;
&lt;p&gt;I introduced &lt;a href="https://github.com/simonw/symbex"&gt;symbex&lt;/a&gt; in &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;symbex: search Python code for functions and classes, then pipe them into a LLM&lt;/a&gt;. It's since grown a bunch more features across &lt;a href="https://github.com/simonw/symbex/releases"&gt;12 total releases&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;symbex&lt;/code&gt; is a tool for searching Python code. The initial goal was to make it quick to find and output the body of a specific Python function or class, such that you could then pipe it to &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; to process it with GPT-3.5 or GPT-4:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex find_symbol_nodes \
  &lt;span class="pl-k"&gt;|&lt;/span&gt; llm -m gpt4 --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Describe this code succinctly&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This code defines a function &lt;code&gt;find_symbol_nodes&lt;/code&gt; that takes in three arguments: code (string), filename (string), and symbols (iterable of strings). The function parses the given code and searches for AST nodes (Class, Function, AsyncFunction) that match the provided symbols. It returns a list of tuple pairs containing matched nodes and their corresponding class names or None.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;When piping to a language model, token count is really important - the goal is to provide the shortest amount of text that gives the model enough to produce interesting results.&lt;/p&gt;
&lt;p&gt;So... I added a &lt;code&gt;-s/--signatures&lt;/code&gt; option which returns just the function or class signature:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex find_symbol_nodes -s&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# File: symbex/lib.py Line: 13&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-s1"&gt;find_symbol_nodes&lt;/span&gt;(&lt;span class="pl-s1"&gt;code&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;filename&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;symbols&lt;/span&gt;: &lt;span class="pl-v"&gt;Iterable&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;]) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-v"&gt;List&lt;/span&gt;[&lt;span class="pl-v"&gt;Tuple&lt;/span&gt;[(&lt;span class="pl-v"&gt;AST&lt;/span&gt;, &lt;span class="pl-v"&gt;Optional&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;])]]&lt;/pre&gt;
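&lt;p&gt;As a rough illustration of how a signature like this can be recovered with Python's &lt;code&gt;ast&lt;/code&gt; module, here is a minimal sketch. It is not symbex's actual implementation: the &lt;code&gt;signature_for&lt;/code&gt; helper and the &lt;code&gt;SOURCE&lt;/code&gt; string are hypothetical, and return annotations are left out to keep it short.&lt;/p&gt;

```python
import ast

# Hypothetical stand-in source; the real function lives in symbex/lib.py.
SOURCE = '''
def find_symbol_nodes(code: str, filename: str, symbols):
    "Returns ast Nodes matching symbols"
    return []
'''


def signature_for(source, name):
    """Return a one-line signature for the first function called `name`."""
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == name:
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse (Python 3.9+) turns the arguments node back into source text.
            return "{} {}({})".format(prefix, node.name, ast.unparse(node.args))
    return None


print(signature_for(SOURCE, "find_symbol_nodes"))
```

&lt;p&gt;Because only the parsed tree is inspected, the function body never has to be imported or executed.&lt;/p&gt;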
&lt;p&gt;Add &lt;code&gt;--docstrings&lt;/code&gt; to include the docstring. Add &lt;code&gt;-i/--imports&lt;/code&gt; for an import line, and &lt;code&gt;-n/--no-file&lt;/code&gt; to suppress that &lt;code&gt;# File&lt;/code&gt; comment - so &lt;code&gt;-in&lt;/code&gt; combines both of those options:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex find_symbol_nodes -s --docstrings -in&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# from symbex.lib import find_symbol_nodes&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-s1"&gt;find_symbol_nodes&lt;/span&gt;(&lt;span class="pl-s1"&gt;code&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;filename&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;symbols&lt;/span&gt;: &lt;span class="pl-v"&gt;Iterable&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;]) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-v"&gt;List&lt;/span&gt;[&lt;span class="pl-v"&gt;Tuple&lt;/span&gt;[(&lt;span class="pl-v"&gt;AST&lt;/span&gt;, &lt;span class="pl-v"&gt;Optional&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;])]]
    "&lt;span class="pl-v"&gt;Returns&lt;/span&gt; &lt;span class="pl-s1"&gt;ast&lt;/span&gt; &lt;span class="pl-v"&gt;Nodes&lt;/span&gt; &lt;span class="pl-s1"&gt;matching&lt;/span&gt; &lt;span class="pl-s1"&gt;symbols&lt;/span&gt;"&lt;/pre&gt;
&lt;p&gt;Being able to see type annotations and docstrings tells you a lot about the code. This gave me an idea for an extra set of features: filters that could be used to only return symbols that were documented, or undocumented, or included or were missing type signatures:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--async&lt;/code&gt;: Filter async functions&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--function&lt;/code&gt;: Filter functions&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--class&lt;/code&gt;: Filter classes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--documented&lt;/code&gt;: Filter functions with docstrings&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--undocumented&lt;/code&gt;: Filter functions without docstrings&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--typed&lt;/code&gt;: Filter functions with type annotations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--untyped&lt;/code&gt;: Filter functions without type annotations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--partially-typed&lt;/code&gt;: Filter functions with partial type annotations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--fully-typed&lt;/code&gt;: Filter functions with full type annotations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So now you can use &lt;code&gt;symbex&lt;/code&gt; to get a feel for how well typed or documented your code is:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; See all symbols lacking a docstring:&lt;/span&gt;
symbex -s --undocumented

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; All functions that are missing type annotations:&lt;/span&gt;
symbex -s --function --untyped&lt;/pre&gt;&lt;/div&gt;
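&lt;p&gt;The filters listed above boil down to simple checks on each parsed function. The sketch below shows one plausible way to classify functions with &lt;code&gt;ast&lt;/code&gt;. The &lt;code&gt;classify&lt;/code&gt; helper is hypothetical and the exact rules symbex applies may differ: this version only looks at argument annotations, ignores return annotations, and treats a function with no arguments as untyped.&lt;/p&gt;

```python
import ast

# Hypothetical sample code to classify.
SOURCE = '''
def documented(a: int, b: int):
    "Adds two numbers"
    return a + b

def undocumented(a, b):
    return a + b

def partial(a: int, b):
    return a + b
'''


def classify(node):
    """Return a set of labels loosely mirroring the filter names."""
    labels = set()
    labels.add("documented" if ast.get_docstring(node) else "undocumented")
    args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
    annotated = [arg for arg in args if arg.annotation is not None]
    if not annotated:
        # Assumption: no annotated arguments (or no arguments at all) counts as untyped.
        labels.add("untyped")
    elif len(annotated) == len(args):
        labels.add("fully-typed")
    else:
        labels.add("partially-typed")
    return labels


report = {
    node.name: classify(node)
    for node in ast.parse(SOURCE).body
    if isinstance(node, ast.FunctionDef)
}
for name, labels in report.items():
    print(name, sorted(labels))
```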
&lt;p&gt;The &lt;a href="https://github.com/simonw/symbex/blob/main/README.md"&gt;README&lt;/a&gt; has comprehensive documentation on everything else the tool can do.&lt;/p&gt;
&lt;h4&gt;LLM prompt templates&lt;/h4&gt;
&lt;p&gt;My &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool is shaping up in some interesting directions as well.&lt;/p&gt;
&lt;p&gt;The big new released feature is &lt;a href="https://llm.datasette.io/en/stable/templates.html"&gt;prompt templates&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A template is a file that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;system&lt;/span&gt;: &lt;span class="pl-s"&gt;Summarize this text in the voice of $voice&lt;/span&gt;
&lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gpt-4&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This can be installed using &lt;code&gt;llm templates edit summarize&lt;/code&gt;, which opens a text editor (using the &lt;code&gt;$EDITOR&lt;/code&gt; environment variable).&lt;/p&gt;
&lt;p&gt;Once installed, you can use it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://til.simonwillison.net/macos/imovie-slides-and-audio&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  strip-tags -m &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm -t summarize -p voice &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Extremely sarcastic GlaDOS&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;Oh, &lt;em&gt;bravo&lt;/em&gt;, Simon. You've really outdone yourself. Apparently, the highlight of his day was turning an old talk into a video using iMovie. After a truly heart-stopping struggle with the Ken Burns effect, he finally, and I mean &lt;em&gt;finally&lt;/em&gt;, tuned the slide duration to match the audio. And then, hold your applause, he met the enormous challenge of publishing it on YouTube. We were all waiting with bated breath. Oh, but wouldn't it be exciting to note that his estimated 1.03GB video was actually a shockingly smaller size? I can't contain my excitement. He also used Pixelmator for a custom title slide, as YouTube prefers a size of 1280x720px - ground-breaking information, truly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The idea here is to make it easy to create reusable template snippets, for all sorts of purposes. &lt;code&gt;git diff | llm -t diff&lt;/code&gt; could generate a commit message, &lt;code&gt;cat file.py | llm -t explain&lt;/code&gt; could explain code etc.&lt;/p&gt;
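&lt;p&gt;The &lt;code&gt;$voice&lt;/code&gt; placeholder in the template shown earlier uses the same syntax as Python's &lt;code&gt;string.Template&lt;/code&gt;. The sketch below uses that class to show how a parameter passed with &lt;code&gt;-p voice ...&lt;/code&gt; could end up in the system prompt. It is an assumption made for illustration, not a description of how LLM processes templates internally, and &lt;code&gt;render_system_prompt&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from string import Template


def render_system_prompt(template_text, params):
    """Fill $name placeholders in a system prompt from a dict of parameters."""
    # substitute() raises KeyError if a placeholder has no matching parameter.
    return Template(template_text).substitute(params)


system_prompt = render_system_prompt(
    "Summarize this text in the voice of $voice",
    {"voice": "Extremely sarcastic GlaDOS"},
)
print(system_prompt)
```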
&lt;h4&gt;LLM plugins&lt;/h4&gt;
&lt;p&gt;These are still baking, but this is the feature I'm most excited about. I'm adding plugins to LLM, inspired by &lt;a href="https://docs.datasette.io/en/stable/plugins.html"&gt;plugins in Datasette&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm planning the following categories of plugins to start with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Command plugins&lt;/strong&gt;. These will allow extra commands to be added to the &lt;code&gt;llm&lt;/code&gt; tool - &lt;code&gt;llm search&lt;/code&gt; or &lt;code&gt;llm embed&lt;/code&gt; or similar.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Template plugins&lt;/strong&gt;. Imagine being able to install extra prompt templates using &lt;code&gt;llm install name-of-package&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model plugins&lt;/strong&gt;. I want LLM to be able to use more than just GPT-3.5 and GPT-4. I have a branch with &lt;a href="https://github.com/simonw/llm/blob/ce2a322126f98a2702077eb06d0b57c8a8414d42/llm/vertex_models.py"&gt;an example plugin&lt;/a&gt; that can call Google's PaLM 2 model via &lt;a href="https://cloud.google.com/vertex-ai"&gt;Google Vertex&lt;/a&gt;, and I hope to support many other LLM families with additional plugins, including models that can run locally via &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; and similar.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Function plugins&lt;/strong&gt;. Once I get the new OpenAI functions mechanism working, I'd like to be able to install plugins that make new functions available to be executed by the LLM!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of this is under active development at the moment. I'll write more about it once I get it working.&lt;/p&gt;
&lt;h4&gt;Entries these weeks&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;symbex: search Python code for functions and classes, then pipe them into a LLM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2023/Jun/8/gpt-tokenizers/"&gt;Understanding GPT tokenizers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Releases these weeks&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.33"&gt;sqlite-utils 3.33&lt;/a&gt;&lt;/strong&gt; - 2023-06-26&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-render-images/releases/tag/0.4"&gt;datasette-render-images 0.4&lt;/a&gt;&lt;/strong&gt; - 2023-06-14&lt;br /&gt;Datasette plugin that renders binary blob images using data-uris&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL these weeks&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/toml"&gt;TOML in Python&lt;/a&gt; - 2023-06-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/homebrew/auto-formulas-github-actions"&gt;Automatically maintaining Homebrew formulas using GitHub Actions&lt;/a&gt; - 2023-06-21&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/gpt3/picking-python-project-name-chatgpt"&gt;Using ChatGPT Browse to name a Python package&lt;/a&gt; - 2023-06-18&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/macos/imovie-slides-and-audio"&gt;Syncing slide images and audio in iMovie&lt;/a&gt; - 2023-06-15&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/macos/fs-usage"&gt;Using fs_usage to see what files a process is using&lt;/a&gt; - 2023-06-15&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/llms/larger-context-openai-models-llm"&gt;Running OpenAI's large context models using llm&lt;/a&gt; - 2023-06-13&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sql/consecutive-groups"&gt;Consecutive groups in SQL using window functions&lt;/a&gt; - 2023-06-08&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="ai"/><category term="weeknotes"/><category term="generative-ai"/><category term="llms"/><category term="symbex"/><category term="llm"/></entry><entry><title>Symbex: search Python code for functions and classes, then pipe them into a LLM</title><link href="https://simonwillison.net/2023/Jun/18/symbex/#atom-tag" rel="alternate"/><published>2023-06-18T22:11:12+00:00</published><updated>2023-06-18T22:11:12+00:00</updated><id>https://simonwillison.net/2023/Jun/18/symbex/#atom-tag</id><summary type="html">
    &lt;p&gt;I just released a new Python CLI tool called &lt;a href="https://github.com/simonw/symbex"&gt;Symbex&lt;/a&gt;. It's a search tool, loosely inspired by &lt;a href="https://github.com/BurntSushi/ripgrep"&gt;ripgrep&lt;/a&gt;, which lets you search Python code for functions and classes by name or wildcard, then see just the source code of those matching entities.&lt;/p&gt;
&lt;h4&gt;Searching for functions and classes&lt;/h4&gt;
&lt;p&gt;Here's an example of what it can do. Running in my &lt;code&gt;datasette/&lt;/code&gt; folder:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex inspect_hash            &lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# File: datasette/inspect.py Line: 17&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;inspect_hash&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;):
    &lt;span class="pl-s"&gt;"""Calculate the hash of a database, efficiently."""&lt;/span&gt;
    &lt;span class="pl-s1"&gt;m&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;hashlib&lt;/span&gt;.&lt;span class="pl-en"&gt;sha256&lt;/span&gt;()
    &lt;span class="pl-k"&gt;with&lt;/span&gt; &lt;span class="pl-s1"&gt;path&lt;/span&gt;.&lt;span class="pl-en"&gt;open&lt;/span&gt;(&lt;span class="pl-s"&gt;"rb"&lt;/span&gt;) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-s1"&gt;fp&lt;/span&gt;:
        &lt;span class="pl-k"&gt;while&lt;/span&gt; &lt;span class="pl-c1"&gt;True&lt;/span&gt;:
            &lt;span class="pl-s1"&gt;data&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;fp&lt;/span&gt;.&lt;span class="pl-en"&gt;read&lt;/span&gt;(&lt;span class="pl-v"&gt;HASH_BLOCK_SIZE&lt;/span&gt;)
            &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-c1"&gt;not&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;:
                &lt;span class="pl-k"&gt;break&lt;/span&gt;
            &lt;span class="pl-s1"&gt;m&lt;/span&gt;.&lt;span class="pl-en"&gt;update&lt;/span&gt;(&lt;span class="pl-s1"&gt;data&lt;/span&gt;)

    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;m&lt;/span&gt;.&lt;span class="pl-en"&gt;hexdigest&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;I gave it the name of a function (classes work too) and it searched all subfolders of the current directory, found that function and output it to my terminal.&lt;/p&gt;
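&lt;p&gt;The core lookup can be sketched in a few lines with the &lt;code&gt;ast&lt;/code&gt; module: parse a file, find the node with the requested name, then slice its source text back out. This is a simplified, single-file illustration rather than Symbex's real code. The &lt;code&gt;find_symbol&lt;/code&gt; helper, the &lt;code&gt;example.py&lt;/code&gt; filename and the shortened function body are all hypothetical.&lt;/p&gt;

```python
import ast

# Hypothetical file contents; the body is a stand-in, not Datasette's real code.
SOURCE = '''import hashlib

def inspect_hash(path):
    "Stand-in body, not the real Datasette implementation."
    return hashlib.sha256(path.encode()).hexdigest()
'''


def find_symbol(source, filename, name):
    """Return the source of the first function or class called `name`, or None."""
    tree = ast.parse(source, filename=filename)
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(tree):
        if isinstance(node, wanted) and node.name == name:
            header = "# File: {} Line: {}".format(filename, node.lineno)
            # get_source_segment slices the original text using the node's position.
            return header + "\n" + ast.get_source_segment(source, node)
    return None


print(find_symbol(SOURCE, "example.py", "inspect_hash"))
```

&lt;p&gt;Searching a whole folder would mean repeating this for every &lt;code&gt;.py&lt;/code&gt; file found under the current directory.&lt;/p&gt;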
&lt;p&gt;Why is this more useful than &lt;code&gt;ripgrep&lt;/code&gt; or any of the many other tools that can do this?&lt;/p&gt;
&lt;p&gt;I partly built this to have fun learning Python's &lt;a href="https://docs.python.org/3/library/ast.html"&gt;ast&lt;/a&gt; module, but it's mainly designed to complement my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; CLI tool for running large language model prompts.&lt;/p&gt;
&lt;h4&gt;Code explanations with a large language model&lt;/h4&gt;
&lt;p&gt;Check this out:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex inspect_hash &lt;span class="pl-k"&gt;|&lt;/span&gt; llm --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;explain succinctly&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This function calculates the hash of a database file efficiently by reading the file in blocks and updating the hash object using SHA256 algorithm from the hashlib module. The resulting hash value is returned as a hexadecimal string.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is pretty cool!&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llm --system 'explain succinctly'&lt;/code&gt; runs a prompt against &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; using "explain succinctly" as the system prompt.&lt;/p&gt;
&lt;p&gt;The system prompt says what to do, then the content piped to the tool is treated as the data that should be processed.&lt;/p&gt;
&lt;h4&gt;Using wildcards and guessing what a tool does from tests&lt;/h4&gt;
&lt;p&gt;That was a pretty basic example. Here's something more fun:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;test*csv*&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; llm --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;based on these tests guess what this tool does&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'm using the &lt;code&gt;test*csv*&lt;/code&gt; wildcard here to find all of my test functions that mention CSV in the name. There are 18 of those, with names like &lt;code&gt;test_table_csv_with_labels&lt;/code&gt; and &lt;code&gt;test_table_csv_cors_headers&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I can count the number of tokens in the test output like this (see &lt;a href="https://simonwillison.net/2023/May/18/cli-tools-for-llms/"&gt;llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs&lt;/a&gt; for an intro to my &lt;code&gt;ttok&lt;/code&gt; tool):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;test_*csv*&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; ttok&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;2682
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That's significantly below the 4,096 token limit for &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here's what I get when I ask it "based on these tests guess what this tool does":&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This tool seems to be a web application that provides a user interface to query databases and export query results in various formats, including HTML, JSON, and CSV. The tests verify that the tool correctly generates HTML pages with links to export query results in JSON or CSV format, exports tables to CSV format with custom options (such as whether to include labels and nullable labels), and performs streaming of CSV data. The tests also check that the tool sets appropriate HTTP headers, allows CORS, and handles errors and large CSV files gracefully.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's a good summary of what Datasette does!&lt;/p&gt;
&lt;p&gt;I was worried that it might be cheating since ChatGPT already knows about Datasette, but I tried this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;test_*csv*&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; grep datasette&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And got no results, so at least that keyword wasn't being leaked in the test details somehow.&lt;/p&gt;
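&lt;p&gt;The wildcard search used in this section can be approximated with shell-style pattern matching on symbol names. The sketch below uses &lt;code&gt;fnmatch.fnmatchcase&lt;/code&gt; from the standard library. Whether Symbex matches names this way internally is an assumption, and &lt;code&gt;matching_names&lt;/code&gt; along with the sample test names beyond the two mentioned above are hypothetical.&lt;/p&gt;

```python
import ast
from fnmatch import fnmatchcase

# Hypothetical test module with empty function bodies.
SOURCE = '''
def test_table_csv_with_labels():
    pass

def test_table_csv_cors_headers():
    pass

def test_table_json():
    pass

def helper_csv():
    pass
'''


def matching_names(source, pattern):
    """Return names of functions and classes that match a shell-style pattern."""
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return [
        node.name
        for node in ast.walk(ast.parse(source))
        # fnmatchcase is case-sensitive on every platform.
        if isinstance(node, wanted) and fnmatchcase(node.name, pattern)
    ]


print(matching_names(SOURCE, "test*csv*"))
```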
&lt;h4&gt;Refactoring code&lt;/h4&gt;
&lt;p&gt;Let's try something a whole lot more useful:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex Request &lt;span class="pl-k"&gt;|&lt;/span&gt; llm --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;add type hints to this&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This locates the &lt;code&gt;Request&lt;/code&gt; class in Datasette (&lt;a href="https://github.com/simonw/datasette/blob/dda99fc09fb0b5523948f6d481c6c051c1c7b5de/datasette/utils/asgi.py#L56-L156"&gt;this one here&lt;/a&gt;) and starts adding Python type hints to it. The output started out like this (that code has no type hints at all at the moment):&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;typing&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Dict&lt;/span&gt;, &lt;span class="pl-v"&gt;Any&lt;/span&gt;, &lt;span class="pl-v"&gt;Awaitable&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;http&lt;/span&gt;.&lt;span class="pl-s1"&gt;cookies&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;SimpleCookie&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;urllib&lt;/span&gt;.&lt;span class="pl-s1"&gt;parse&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;urlunparse&lt;/span&gt;, &lt;span class="pl-s1"&gt;parse_qs&lt;/span&gt;, &lt;span class="pl-s1"&gt;parse_qsl&lt;/span&gt;

&lt;span class="pl-k"&gt;from&lt;/span&gt; .&lt;span class="pl-s1"&gt;multidict&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;MultiParams&lt;/span&gt;


&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Request&lt;/span&gt;:
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;__init__&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-s1"&gt;scope&lt;/span&gt;: &lt;span class="pl-v"&gt;Dict&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-v"&gt;Any&lt;/span&gt;], &lt;span class="pl-s1"&gt;receive&lt;/span&gt;: &lt;span class="pl-v"&gt;Awaitable&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;:
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;scope&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;scope&lt;/span&gt;
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;receive&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;receive&lt;/span&gt;

    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;__repr__&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;str&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s"&gt;'&amp;lt;asgi.Request method="{}" url="{}"&amp;gt;'&lt;/span&gt;.&lt;span class="pl-en"&gt;format&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;method&lt;/span&gt;, &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;url&lt;/span&gt;)

    &lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;property&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;method&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;str&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;scope&lt;/span&gt;[&lt;span class="pl-s"&gt;"method"&lt;/span&gt;]

    &lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;property&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;url&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;str&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-en"&gt;urlunparse&lt;/span&gt;(
            (&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;scheme&lt;/span&gt;, &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;host&lt;/span&gt;, &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;path&lt;/span&gt;, &lt;span class="pl-c1"&gt;None&lt;/span&gt;, &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;query_string&lt;/span&gt;, &lt;span class="pl-c1"&gt;None&lt;/span&gt;)
        )

    &lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;property&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;url_vars&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-v"&gt;Dict&lt;/span&gt;[&lt;span class="pl-s1"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;str&lt;/span&gt;]:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; (&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-s1"&gt;scope&lt;/span&gt;.&lt;span class="pl-en"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"url_route"&lt;/span&gt;) &lt;span class="pl-c1"&gt;or&lt;/span&gt; {}).&lt;span class="pl-en"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"kwargs"&lt;/span&gt;) &lt;span class="pl-c1"&gt;or&lt;/span&gt; {}
    
    &lt;span class="pl-c"&gt;# ...&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;Now this is getting impressive! Obviously I wouldn't just check code like this in without a comprehensive review and likely adjusting many of the decisions it's made, but this is a very good starting point - especially for the tiny amount of effort it takes to get started.&lt;/p&gt;
&lt;h4&gt;Picking a name for the tool&lt;/h4&gt;
&lt;p&gt;The most time-consuming part of this project ended up being picking the name!&lt;/p&gt;
&lt;p&gt;Originally I planned to call it &lt;code&gt;py-grep&lt;/code&gt;. I checked &lt;code&gt;https://pypi.org/project/py-grep/&lt;/code&gt; and it was available, so I spun up the first version of the tool and attempted to upload it to PyPI.&lt;/p&gt;
&lt;p&gt;PyPI gave me an error, because the name was too similar to the existing &lt;code&gt;pygrep&lt;/code&gt; package. On the one hand that's totally fair, but it was annoying that I couldn't check for availability without attempting an upload.&lt;/p&gt;
&lt;p&gt;I turned to ChatGPT to start brainstorming new names. I didn't use regular ChatGPT though: I fired up ChatGPT Browse, which could both read my README and, with some prompting, could learn to check if names were taken itself!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/chatgpt-naming.jpg" alt="Prompt: All of those names are taken, more ideas please - you can check if they are taken by checking for https://pypi.org/project/pyseek/ ChatGPT Finished browsing Clicked on: pydex · PyPI Clicked on: pyscan · PyPI Clicked on: pylocator · PyPI Clicked on: pyspotter · PyPI Clicked on: pylocator · PyPI Clicked on: pyfinder · PyPI Clicked on: PyScanner · PyPI Clicked on: PyLookup · PyPI Clicked on: PySpot · PyPI Clicked on: PyLookup · PyPI Clicked on: pylocator · PyPI Clicked on: pyfinder · PyPI Clicked on: pylocator · PyPI Clicked on: pyfinder · PyPI Click failed Finished browsing It seems there was an error fetching the page. This might indicate that the name pysourcefinder is not taken, but it would be safer to verify manually." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I wrote up the full process for this in a TIL: &lt;a href="https://til.simonwillison.net/gpt3/picking-python-project-name-chatgpt"&gt;Using ChatGPT Browse to name a Python package&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="projects"/><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="symbex"/></entry></feed>