Annotated version of this introductory video
Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 46 tools and 156 plugins dedicated to making working with structured data as productive as possible.
Try a demo and explore 33,000 power plants around the world, then follow the tutorial or take a look at some other examples of Datasette in action.
Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.
New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!
Exploratory data analysis
Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.
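Datasette serves standard SQLite database files, so anything that can write SQLite can prepare data for it. As a minimal sketch of the CSV-import step using only Python's standard library (the file name, table name and sample row here are illustrative, not from the project's docs):

```python
import csv
import io
import sqlite3

# A tiny CSV standing in for a real exported dataset.
csv_text = "name,country,capacity_mw\nExample Plant,US,120\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Write the rows into a SQLite file that Datasette can serve directly.
conn = sqlite3.connect("plants.db")
conn.execute("DROP TABLE IF EXISTS plants")
conn.execute("CREATE TABLE plants (name TEXT, country TEXT, capacity_mw INTEGER)")
conn.executemany("INSERT INTO plants VALUES (:name, :country, :capacity_mw)", rows)
conn.commit()
conn.close()
```

Running `datasette plants.db` would then serve that table as an interactive website and JSON API; in practice the sqlite-utils tool from the same ecosystem handles the import in a single command.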
Instant data publishing
`datasette publish` lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.
Rapid prototyping
Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
Latest news
6th February 2025 #
Datasette 1.0a17 is the latest Datasette 1.0 alpha release, with bug fixes and small feature improvements from the last few months.
7th October 2024 #
Python 3.13 was released today. Datasette 1.0a16 is compatible with Python 3.13, but Datasette 0.64.8 was not. The new Datasette 0.65 release fixes compatibility with the new version of Python.
5th August 2024 #
Datasette 1.0a14 includes some breaking changes to how metadata works for plugins, described in detail in the new upgrade guide. See also the annotated release notes that accompany this release.
18th February 2024 #
Datasette 1.0a10 is a focused alpha that changes some internal details about how Datasette handles transactions. The `datasette.execute_write_fn()` internal method now wraps the function in a database transaction unless you pass `transaction=False`.
16th February 2024 #
Datasette 1.0a9 adds basic alter table support to the JSON API, tweaks how permissions work and introduces some new plugin debugging utilities.
7th February 2024 #
Datasette 1.0a8 introduces several new plugin hooks, a JavaScript plugin system and moves plugin configuration from `metadata.yaml` to `datasette.yaml`. Read more about the release in the annotated release notes for 1.0a8.
1st December 2023 #
Datasette Enrichments is a new feature for Datasette that supports enriching data by running custom code against every selected row in a table. Read Datasette Enrichments: a new plugin framework for augmenting your data for more details, plus a video demo of enrichments for geocoding addresses and processing text and images using GPT-4.
30th November 2023 #
datasette-comments is a new plugin by Alex Garcia which adds collaborative commenting to Datasette. Alex built the plugin for Datasette Cloud, but it's also available as an open source package for people who are hosting their own Datasette instances. See Annotate and explore your data with datasette-comments on the Datasette Cloud blog for more details.
22nd August 2023 #
Datasette 1.0a4 has a fix for a security vulnerability in the Datasette 1.0 alpha series: the API explorer interface exposed the names of private databases and tables in public instances that were protected by a plugin such as datasette-auth-passwords, though not the actual content of those tables. See the security advisory for more details and workarounds if you can't upgrade immediately. The latest edition of the Datasette Newsletter also talks about this issue.
15th August 2023 #
datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows introduces a new plugin adding add/edit/delete functionality to Datasette, developed by Alex Garcia. Alex built this for Datasette Cloud, and this post is the first announcement made on the new Datasette Cloud blog - see also Welcome to Datasette Cloud.
9th August 2023 #
Datasette 1.0a3 is an alpha release of Datasette that previews the new default JSON API design that’s coming in version 1.0 - the single most significant change planned for that 1.0 release.
1st July 2023 #
New tutorial: Data analysis with SQLite and Python. This tutorial, originally presented at PyCon 2023, includes a 2h45m video and an extensive handout that should be useful with or without the video. Topics covered include Python's `sqlite3` module, `sqlite-utils`, Datasette, Datasette Lite, advanced SQL patterns and more.
24th March 2023 #
I built a ChatGPT plugin to answer questions about data hosted in Datasette describes a new experimental Datasette plugin to enable people to query data hosted in a Datasette interface via ChatGPT, asking human language questions that are automatically converted to SQL and used to generate a readable response.
23rd February 2023 #
Using Datasette in GitHub Codespaces is a new tutorial showing how Datasette can be run in GitHub's free Codespaces browser-based development environments, using the new datasette-codespaces plugin.
28th January 2023 #
Examples of sites built using Datasette now includes screenshots of Datasette deployments that illustrate a variety of problems that can be addressed using Datasette and its plugins.
Latest releases
11th April 2025
llm 0.25a0 - CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
- `llm models --options` now shows keys and environment variables for models that use API keys. Thanks, Steve Morin. #903
- Added `py.typed` marker file so LLM can now be used as a dependency in projects that use `mypy` without a warning. #887
- `$` characters can now be used in templates by escaping them as `$$`. Thanks, @guspix. #904
- LLM now uses `pyproject.toml` instead of `setup.py`. #908
10th April 2025
csvs-to-sqlite 1.3.1 - Convert CSV files into a SQLite database
- Upgraded for compatibility with recent Pandas and Click libraries. Thanks, Luighi Viton-Zorrilla. #99
9th April 2025
llm 0.24.2 - CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
- Fixed a bug on Windows with the new `llm -t path/to/file.yaml` feature. #901
8th April 2025
llm 0.24.1
- Templates can now be specified as a path to a file on disk, using `llm -t path/to/file.yaml`. This makes them consistent with how `-f` fragments are loaded. #897
- New `llm logs backup /tmp/backup.db` command for backing up your `logs.db` database. #879
7th April 2025
llm 0.24
Support for fragments to help assemble prompts for long context models. Improved support for templates to support attachments and fragments. New plugin hooks for providing custom loaders for both templates and fragments. See Long context support in LLM 0.24 using fragments and template plugins for more on this release.
The new llm-docs plugin demonstrates these new features. Install it like this:
llm install llm-docs
Now you can ask questions of the LLM documentation like this:
llm -f docs: 'How do I save a new template?'
The `docs:` prefix is registered by the plugin. The plugin fetches the LLM documentation for your installed version (from the docs-for-llms repository) and uses that as a prompt fragment to help answer your question.
Two more new plugins are llm-templates-github and llm-templates-fabric.
`llm-templates-github` lets you share and use templates on GitHub. You can run my Pelican riding a bicycle benchmark against a model like this:
llm install llm-templates-github
llm -t gh:simonw/pelican-svg -m o3-mini
This executes this pelican-svg.yaml template stored in my simonw/llm-templates repository, using a new repository naming convention.
To share your own templates, create a repository on GitHub under your user account called `llm-templates` and start saving `.yaml` files to it.
llm-templates-fabric provides a similar mechanism for loading templates from Daniel Miessler's fabric collection:
llm install llm-templates-fabric
curl https://simonwillison.net/2025/Apr/6/only-miffy/ | \
llm -t f:extract_main_idea
Major new features:
- New fragments feature. Fragments can be used to assemble long prompts from multiple existing pieces - URLs, file paths or previously used fragments. These are stored de-duplicated in the database, avoiding wasted space from storing multiple long context pieces. Example usage: `llm -f https://llm.datasette.io/robots.txt 'explain this file'`. #617
- The `llm logs` command now accepts `-f` fragment references too, and will show just logged prompts that used those fragments.
- register_template_loaders() plugin hook allowing plugins to register new `prefix:value` custom template loaders. #809
- register_fragment_loaders() plugin hook allowing plugins to register new `prefix:value` custom fragment loaders. #886
- llm fragments family of commands for browsing fragments that have been previously logged to the database.
- The new llm-openai plugin provides support for o1-pro (which is not supported by the OpenAI mechanism used by LLM core). Future OpenAI features will migrate to this plugin instead of LLM core itself.
Improvements to templates:
- `llm -t $URL` option can now take a URL to a YAML template. #856
- Templates can now store default model options. #845
- Executing a template that does not use the `$input` variable no longer blocks LLM waiting for input, so prompt templates can now be used to try different models using `llm -t pelican-svg -m model_id`. #835
- `llm templates` command no longer crashes if one of the listed template files contains invalid YAML. #880
- Attachments can now be stored in templates. #826
Other changes:
- New llm models options family of commands for setting default options for particular models. #829
- `llm logs list`, `llm schemas list` and `llm schemas show` all now take a `-d/--database` option with an optional path to a SQLite database. They used to take `-p/--path` but that was inconsistent with other commands. `-p/--path` still works but is excluded from `--help` and will be removed in a future LLM release. #857
- New `llm logs -e/--expand` option for expanding fragments. #881
- `llm prompt -d path-to-sqlite.db` option can now be used to write logs to a custom SQLite database. #858
- New `llm similar -p/--plain` option providing more human-readable output than the default JSON. #853
- `llm logs -s/--short` now truncates to include the end of the prompt too. Thanks, Sukhbinder Singh. #759
- Set the `LLM_RAISE_ERRORS=1` environment variable to raise errors during prompts rather than suppressing them, which means you can run `python -i -m llm 'prompt'` and then drop into a debugger on errors with `import pdb; pdb.pm()`. #817
- Improved `--help` output for `llm embed-multi`. #824
- New `llm models -m X` option which can be passed multiple times with model IDs to see the details of just those models. #825
- OpenAI models now accept PDF attachments. #834
- New `llm prompt -q gpt -q 4o` option - pass `-q searchterm` one or more times to execute a prompt against the first model that matches all of those strings - useful if you can't remember the full model ID. #841
- OpenAI compatible models configured using `extra-openai-models.yaml` now support `supports_schema: true`, `vision: true` and `audio: true` options. Thanks @adaitche and @giuli007. #819, #843
llm 0.24a1
28th March 2025
datasette-auth-existing-cookies 1.0a2 - Datasette plugin that authenticates users based on existing domain cookies
- Fix some warnings.
- Drop support for Python versions prior to 3.9.
25th March 2025
shot-scraper 1.8 - A command-line utility for taking automated screenshots of websites
- `shot-scraper javascript` can now optionally load scripts hosted on GitHub via the new `gh:` prefix to the `shot-scraper javascript -i/--input` option. #173

Scripts can be referenced as `gh:username/repo/path/to/script.js` or, if the GitHub user has created a dedicated `shot-scraper-scripts` repository and placed scripts in the root of it, using `gh:username/name-of-script`.
For example, to run this readability.js script against any web page you can use the following:
```bash
shot-scraper javascript -i gh:simonw/readability \
  https://simonwillison.net/2025/Mar/24/qwen25-vl-32b/
```
20th March 2025
datasette-public 0.3a2 - Make specific Datasette tables visible to the public
- Fix for compatibility with `datasette-extract`. #13
10th March 2025
datasette-dashboards 0.7.1 - Datasette plugin providing data dashboards from metadata
1st March 2025
llm 0.24a0 - CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
- Alpha release with experimental `register_template_loaders()` plugin hook. #809
28th February 2025
strip-tags 0.6 - Strip tags from HTML, optionally from areas identified by CSS selectors
- Fixed a bug where `strip-tags -t meta` still removed `<meta>` tags from the `<head>` because the entire `<head>` element was removed first. #32
- Kept `<meta>` tags now default to keeping their `content` and `property` attributes.
- The CLI `-m/--minify` option now also removes any remaining blank lines. #33
- A new `strip_tags(remove_blank_lines=True)` option can be used to achieve the same thing with the Python library function.
llm 0.23 - CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
Support for schemas, for getting supported models to output JSON that matches a specified JSON schema. See also Structured data extraction from unstructured content using LLM schemas for background on this feature. #776
- New `llm prompt --schema '{JSON schema goes here}'` option for specifying a schema that should be used for the output from the model. The schemas documentation has more details and a tutorial.
- Schemas can also be defined using a concise schema specification, for example `llm prompt --schema 'name, bio, age int'`. #790
- Schemas can also be specified by passing a filename and through several other methods. #780
- New llm schemas family of commands: `llm schemas list`, `llm schemas show`, and `llm schemas dsl` for debugging the new concise schema language. #781
- Schemas can now be saved to templates using `llm --schema X --save template-name` or through modifying the template YAML. #778
- The llm logs command now has new options for extracting data collected using schemas: `--data`, `--data-key`, `--data-array`, `--data-ids`. #782
- New `llm logs --id-gt X` and `--id-gte X` options. #801
- New `llm models --schemas` option for listing models that support schemas. #797
- New `model.prompt(..., schema={...})` parameter for specifying a schema from Python. This accepts either a dictionary JSON schema definition or a Pydantic `BaseModel` subclass, see schemas in the Python API docs.
- The default OpenAI plugin now enables schemas across all supported models. Run `llm models --schemas` for a list of these.
- The llm-anthropic and llm-gemini plugins have been upgraded to add schema support for those models. Here's documentation on how to add schema support to a model plugin.
Other smaller changes:
- GPT-4.5 preview is now a supported model: `llm -m gpt-4.5 'a joke about a pelican and a wolf'` #795
- The prompt string is now optional when calling `model.prompt()` from the Python API, so `model.prompt(attachments=llm.Attachment(url=url))` now works. #784
- `extra-openai-models.yaml` now supports a `reasoning: true` option. Thanks, Kasper Primdal Lauritzen. #766
- LLM now depends on Pydantic v2 or higher. Pydantic v1 is no longer supported. #520
27th February 2025
llm 0.23a0
Alpha release adding support for schemas, for getting supported models to output JSON that matches a specified JSON schema. #776
- New `llm prompt --schema '{JSON schema goes here}'` option for specifying a schema that should be used for the output from the model, see schemas in the CLI docs.
- New `model.prompt(..., schema={...})` parameter for specifying a schema from Python. This accepts either a dictionary JSON schema definition or a Pydantic `BaseModel` subclass, see schemas in the Python API docs.
- The default OpenAI plugin now supports schemas across all models.
- Documentation on how to add schema support to a model plugin.
- LLM now depends on Pydantic v2 or higher. Pydantic v1 is no longer supported. #520
20th February 2025
shot-scraper 1.7 - A command-line utility for taking automated screenshots of websites
- New options for the `shot-scraper har` command: `--wait`, `--wait-for` and `--javascript` - similar to those same options on `shot-scraper shot`. #171
- Documentation now recommends using `shot-scraper multi --har-zip` for larger collections of files to avoid a Playwright crash if the `.har` JSON grows too large. #170