Simon Willison’s Weblog

84 items tagged “apis”

2024

Third, X fails to provide access to its public data to researchers in line with the conditions set out in the DSA. In particular, X prohibits eligible researchers from independently accessing its public data, such as by scraping, as stated in its terms of service. In addition, X's process to grant eligible researchers access to its application programming interface (API) appears to dissuade researchers from carrying out their research projects or leave them with no other choice than to pay disproportionally high fees.

European Commission

# 13th July 2024, 3:52 am / apis, twitter, europe

Deactivating an API, one step at a time (via) Bruno Pedro describes a sensible approach for web API deprecation: use API keys to first block new users from using the old API, then track which existing users depend on the old version and reach out to them with a sunset period.

The only suggestion I'd add is to implement API brownouts - short periods of time where the deprecated API returns errors, several months before the final deprecation. This gives users who don't read your emails notice that they need to pay attention before their integration breaks entirely.

I've seen GitHub use this brownout technique successfully several times over the last few years - here's one example.
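A minimal sketch of how the key cutoff, brownouts and final sunset could fit into one status check. The dates, the `old_api_status` name and the choice of status codes are all assumptions made for illustration. They are not taken from Bruno Pedro's article or from GitHub.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Hypothetical schedule: every date here is made up for illustration.
NEW_KEY_CUTOFF = datetime(2024, 7, 1, tzinfo=UTC)   # keys created after this never get the old API
SUNSET = datetime(2024, 12, 1, tzinfo=UTC)          # old API is switched off for everyone
BROWNOUTS = [                                       # (start, duration) of each planned outage
    (datetime(2024, 9, 3, 14, 0, tzinfo=UTC), timedelta(hours=1)),
    (datetime(2024, 10, 1, 14, 0, tzinfo=UTC), timedelta(hours=6)),
]


def old_api_status(now: datetime, key_created: datetime) -> int:
    """Return the HTTP status the deprecated endpoint should answer with."""
    if now >= SUNSET:
        return 410  # Gone: the deprecation is final
    if key_created >= NEW_KEY_CUTOFF:
        return 403  # new users are blocked from the old API from day one
    for start, duration in BROWNOUTS:
        if start <= now < start + duration:
            return 503  # brownout: a short, deliberate failure
    return 200
```

Making each brownout longer than the last is one way to escalate the warning as the sunset date approaches.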

# 9th July 2024, 5:23 pm / apis, github

Jina AI Reader. Jina AI provide a number of different AI-related platform products, including an excellent family of embedding models, but one of their most instantly useful is Jina Reader, an API for turning any URL into Markdown content suitable for piping into an LLM.

Add r.jina.ai to the front of a URL to get back Markdown of that page, for example https://r.jina.ai/https://simonwillison.net/2024/Jun/16/jina-ai-reader/ - in addition to converting the content to Markdown it also does a decent job of extracting just the content and ignoring the surrounding navigation.

The API is free but rate-limited (presumably by IP) to 20 requests per minute without an API key or 200 requests per minute with a free API key, and you can pay to increase your allowance beyond that.

The Apache 2 licensed source code for the hosted service is on GitHub - it's written in TypeScript and uses Puppeteer to run Readability.js and Turndown against the scraped page.

It can also handle PDFs, which have their contents extracted using PDF.js.

There's also a search feature, s.jina.ai/search+term+goes+here, which uses the Brave Search API.
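The two URL patterns above are simple enough to build with the standard library. This is a rough sketch: the helper names are hypothetical, and the pacing function just turns the quoted per-minute limits into a delay between requests.

```python
from urllib.parse import quote_plus


def reader_url(page_url: str) -> str:
    """Prefix a page URL with r.jina.ai to request its Markdown version."""
    return "https://r.jina.ai/" + page_url


def search_url(terms: str) -> str:
    """Build an s.jina.ai search URL, joining the terms with '+'."""
    return "https://s.jina.ai/" + quote_plus(terms)


def seconds_between_requests(requests_per_minute: int) -> float:
    """Spacing that keeps a client under a per-minute rate limit."""
    return 60 / requests_per_minute


print(reader_url("https://simonwillison.net/2024/Jun/16/jina-ai-reader/"))
print(search_url("search term goes here"))
```

At the keyless limit of 20 requests per minute, `seconds_between_requests(20)` works out to one request every three seconds.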

# 16th June 2024, 7:33 pm / puppeteer, apis, markdown, ai, llms

Macaroons Escalated Quickly (via) Thomas Ptacek’s follow-up on Macaroon tokens, based on a two year project to implement them at Fly.io. The way they let end users calculate new signed tokens with additional limitations applied to them (“caveats” in Macaroon terminology) is fascinating, and allows for some very creative solutions.

# 31st January 2024, 4:57 pm / fly, thomas-ptacek, apis, security

2023

Getting started with the Datasette Cloud API. I wrote an introduction to the Datasette Cloud API for the company blog, with a tutorial showing how to use Python and GitHub Actions to import data from the Federal Register into a table in Datasette Cloud, then configure full-text search against it.

# 28th September 2023, 11:05 pm / datasette-cloud, apis, datasette

babelmark3 (via) I found this tool today while investigating a bug in Datasette’s datasette-render-markdown plugin: it lets you run a fragment of Markdown through dozens of different Markdown libraries across multiple languages and compare the results. Under the hood it works with a registry of API URL endpoints for different implementations, most of which are encrypted in the configuration file on GitHub because they are only intended to be used by this comparison tool.

# 27th January 2023, 11:34 pm / apis, markdown

2022

Datasette’s new JSON write API: The first alpha of Datasette 1.0

This week I published the first alpha release of Datasette 1.0, with a significant new feature: Datasette core now includes a JSON API for creating and dropping tables and inserting, updating and deleting data.

[... 2,817 words]

2021

API Tokens: A Tedious Survey. Thomas Ptacek reviews different approaches to implementing secure API tokens, from simple random strings stored in a database through various categories of signed token to exotic formats like Macaroons and Biscuits, both new to me.

Macaroons carry a signed list of restrictions with them, but combine it with a mechanism where a client can add their own additional restrictions, sign the combination and pass the token on to someone else.

Biscuits are similar, but “embed Datalog programs to evaluate whether a token allows an operation”.
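To make the Macaroon mechanism concrete, here is a simplified sketch of the chained-HMAC idea: each added restriction is signed using the previous signature as the key, so a client can narrow a token without knowing the server's root key. The function names and the dictionary layout are my own assumptions. Real Macaroon libraries, including the Fly.io implementation, have a richer format and this is not their API.

```python
import hashlib
import hmac


def _chain(key: bytes, message: str) -> bytes:
    """One link of the chain: HMAC the message with the previous signature."""
    return hmac.new(key, message.encode(), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Server side: create an unrestricted token."""
    return {"identifier": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Client side: add a restriction. Needs only the token, not the root key."""
    return {
        "identifier": token["identifier"],
        "caveats": token["caveats"] + [caveat],
        "sig": _chain(token["sig"], caveat),
    }


def verify(root_key: bytes, token: dict, caveat_holds) -> bool:
    """Server side: replay the chain and check that every caveat is satisfied."""
    sig = _chain(root_key, token["identifier"])
    for caveat in token["caveats"]:
        if not caveat_holds(caveat):
            return False
        sig = _chain(sig, caveat)
    return hmac.compare_digest(sig, token["sig"])


root = b"server-secret"                      # hypothetical root key
token = attenuate(mint(root, "user-42"), "path=/read")
print(verify(root, token, lambda caveat: caveat == "path=/read"))
```

Because each signature depends on everything before it, stripping a caveat from the list leaves a signature that no longer matches, so restrictions can be added but not removed.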

# 25th August 2021, 12:12 am / fly, thomas-ptacek, apis, security

Notes on streaming large API responses

I started a Twitter conversation last week about API endpoints that stream large amounts of data as an alternative to APIs that return 100 results at a time and require clients to paginate through all of the pages in order to retrieve all of the data:

[... 1,692 words]
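One way to picture the difference is with generators. The sketch below contrasts a paginated response with a streamed one, assuming newline-delimited JSON as the streaming format. That format, and both function names, are assumptions for illustration rather than anything prescribed by the post.

```python
import json
from typing import Iterable, Iterator, List


def paginated(rows: List[dict], page_size: int = 100) -> Iterator[str]:
    """Paginated style: one JSON document per page, one request per page."""
    for start in range(0, len(rows), page_size):
        yield json.dumps({"rows": rows[start:start + page_size]})


def stream_ndjson(rows: Iterable[dict]) -> Iterator[str]:
    """Streaming style: one JSON object per line, all in a single response."""
    for row in rows:
        yield json.dumps(row) + "\n"


for line in stream_ndjson({"id": i} for i in range(3)):
    print(line, end="")
```

The streaming version accepts any iterable, so the server never needs to hold the full result set in memory, and a client can start processing rows before the response finishes.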

Replaying logs to exercise the new API

22 days ago n1mmy pushed a change to help.vaccinate which logged full details of incoming Netlify function API traffic to an Airtable database.

[... 542 words]

APIs from CSS without JavaScript: the datasette-css-properties plugin

I built a new Datasette plugin called datasette-css-properties. It’s very, very weird—it adds a .css output extension to Datasette which outputs the result of a SQL query using CSS custom property format. This means you can display the results of database queries using pure CSS and HTML, no JavaScript required!

[... 891 words]
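As a rough illustration of the idea, here is a sketch that turns one row of query results into a block of CSS custom properties. The `row_to_css` name, the `:root` selector and the quoting rules are assumptions, not the plugin's actual output format. It also assumes column names are valid CSS identifiers and that values contain no newlines.

```python
def row_to_css(row: dict) -> str:
    """Render a result row as CSS custom properties holding quoted strings."""
    lines = [":root {"]
    for column, value in row.items():
        # Escape backslashes first, then single quotes, for a CSS string literal.
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        lines.append(f"  --{column}: '{escaped}';")
    lines.append("}")
    return "\n".join(lines)


print(row_to_css({"name": "Cleo", "age": 5}))
```

A page that loads a stylesheet like this could then display a value with a rule along the lines of `.name::after { content: var(--name) }`.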

Custom Properties as State. Fascinating thought experiment by Chris Coyier: since CSS custom properties can be defined in an external stylesheet, we can build APIs that return stylesheets defining CSS values generated dynamically on the server, for things like time-of-day colour schemes or even strings that can be inserted using ::after { content: var(--my-property) }.

This gave me a very eccentric idea for a Datasette plugin...

# 7th January 2021, 7:39 pm / css, apis

2020

GraphQL in Datasette with the new datasette-graphql plugin

This week I’ve mostly been building datasette-graphql, a plugin that adds GraphQL query support to Datasette.

[... 1,249 words]

PostGraphile: Production Considerations. PostGraphile is a tool for building a GraphQL API on top of an existing PostgreSQL schema. Their “production considerations” documentation is particularly interesting because it directly addresses some of my biggest worries about GraphQL: the potential for someone to craft an expensive query that ties up server resources. PostGraphile suggests a number of techniques for avoiding this, including a statement timeout, a query allowlist, pagination caps and (in their “pro” version) a cost limit that uses a calculated cost score for the query.
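Two of those techniques, the query allowlist and the pagination cap, are easy to sketch in isolation. The code below is a generic illustration under my own assumptions (the names, the SHA-256 hashing and the example queries are all hypothetical). It is not how PostGraphile implements either feature.

```python
import hashlib

MAX_PAGE_SIZE = 100  # hypothetical cap


def query_hash(query: str) -> str:
    """Fingerprint a query so the allowlist stores hashes, not query text."""
    return hashlib.sha256(query.encode()).hexdigest()


# Hypothetical set of queries the application's own front end is known to send.
KNOWN_QUERIES = [
    "{ allPosts(first: 10) { nodes { id } } }",
]
ALLOWED_HASHES = {query_hash(q) for q in KNOWN_QUERIES}


def is_allowed(query: str) -> bool:
    """Allowlist check: reject any query that was not registered ahead of time."""
    return query_hash(query) in ALLOWED_HASHES


def capped_page_size(requested: int) -> int:
    """Pagination cap: clamp the requested page size to 1..MAX_PAGE_SIZE."""
    return max(1, min(requested, MAX_PAGE_SIZE))
```

An allowlist like this only suits APIs where you control every client. A public GraphQL endpoint would have to lean on timeouts and cost limits instead.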

# 27th March 2020, 1:22 am / scaling, postgresql, graphql, apis

2019

Building a stateless API proxy (via) This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, based on custom encrypted proxy API tokens that use JWT to encode the original API key along with the permissions you want to grant to that particular token (as a list of regular expressions matching paths on the underlying API).
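The permission check at the heart of that design can be sketched with the standard library alone. This assumes the token has already been decrypted and decoded into a list of regular expressions. The function name and the example paths are hypothetical, and the JWT handling is deliberately left out.

```python
import re
from typing import Sequence


def is_permitted(path: str, allowed_patterns: Sequence[str]) -> bool:
    """Allow a request only if its path fully matches one of the token's patterns."""
    return any(re.fullmatch(pattern, path) for pattern in allowed_patterns)


# Hypothetical token permissions: read access to one repository's issues only.
patterns = [r"/repos/example/project/issues(/\d+)?"]

print(is_permitted("/repos/example/project/issues/12", patterns))
print(is_permitted("/repos/example/project/hooks", patterns))
```

Using `re.fullmatch` rather than `re.match` matters here: a prefix match would let `/repos/example/project/issues/12/../../hooks` style paths slip through a pattern that was meant to be exact.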

# 30th May 2019, 4:28 am / encryption, proxy, security, apis, github, jwt

2017

Datasette: instantly create and publish an API for your SQLite databases

I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.

[... 968 words]

2013

Which format for API documentation do programmers prefer: PDF or Web?

HTML is a better format for documentation than PDF.

[... 160 words]

Does the Google Maps API let you remove details of the map such as street names to focus on pins on the map?

Yes, you can do this with map styles (which allow you to set the visibility of road labels, among other things): http://developers.google.com/map...

[... 53 words]

Which is the most complete and up to date API for restaurants/nightlife?

The foursquare API is pretty great for restaurants and nightlife these days. No chance of revenue share though. How would you envisage revenue share working?

[... 44 words]

Which free encyclopedias offer free APIs?

Wikipedia runs using Mediawiki, and Mediawiki has an API: http://www.mediawiki.org/wiki/API

[... 23 words]

What information do you feel is most valuable when integrating a Web API (REST or SOAP)?

  • A really good API explorer
  • Comprehensive documentation of the response format, including what happens if certain fields are missing (empty string, null value, missing key?)
  • Comprehensive documentation of the available request parameters, including allowed values
  • What are the rate limits?
  • What is returned if there is an error?

2012

Is it possible to embed Skype into a webpage to use as live chat support for free?

Olark offer a very neat JavaScript widget that does exactly this (it’s text-based messaging, not video or voice): http://www.olark.com/—you can try their demo at the bottom of their page.

[... 72 words]

Does Amazon have an API for websites to utilize order and delivery fulfillment?

The Amazon Fulfillment Web Service used to handle this http://aws.amazon.com/fws/—but their site now says "Effective June 2012, Amazon Services will no longer support Amazon Fulfillment Web Service (Amazon FWS). All functions and services currently supported by Amazon FWS are currently available through Amazon Marketplace Web Service (Amazon MWS)." So I guess you want the Amazon Marketplace Web Service: https://developer.amazonservices...

[... 82 words]

Are there any website thumbnail services that generate images in real-time?

http://url2png.com/ generates images on demand—you pass the URL directly to the service and it replies with a PNG image. The first load can take a few seconds (depending on how long it takes the originating site to serve up the assets etc) but they cache the generated images so future requests for the same URL will be served instantly.

Is there an API that returns metadata for a given URL?

I suggest taking a look at http://embed.ly/, which can take a huge range of URLs and turn them into JSON metadata. Here’s what it can do with a Wikipedia page: http://embed.ly/docs/explore/obj... and here’s a Google Maps URL (not as useful, but still some interesting metadata extracted): http://embed.ly/docs/explore/obj...

[... 69 words]

2011

Are there any Meta APIs?

Embed.ly is a good example of this kind of API—it gives you one endpoint which wraps oembed APIs on dozens of other services (plus a bunch of custom scraping code). We use it as part of our video/slide embedding feature on http://lanyrd.com/

How we made an API for BoingBoing in an evening. Fluidinfo really is a fascinating piece of software. The team loaded in 11 years of BoingBoing content, allowing you to run structured queries against the data using their standard API, but also allowing users to attach their own information to the same corpus using Fluidinfo tags. Writable APIs are much less common than read-only APIs—Fluidinfo instantly provides both.

# 28th January 2011, 10:56 pm / apis, boingboing, fluiddb, fluidinfo, recovered

Google APIs & Developer Products. Presented as a sort-of-periodic table. There’s quite a bit of stuff on here I didn’t know about.

# 28th January 2011, 11:25 am / apis, google, recovered

Tip: Flickr standard photo response as slideshow. Neat trick—you can construct a URL to Flickr’s slideshow widget that includes the results of any API method, including the all-powerful flickr.photos.search. It’s a shame you can’t embed the resulting slideshow in an iframe.

# 25th January 2011, 3:51 am / apis, flickr, widgets, recovered

Introducing the FluidDB Explorer. Every good API deserves a dedicated API browser.

# 13th January 2011, 4:19 am / apis, fluiddb, recovered