Yeah, unfortunately vision prompting has been a tough nut to crack. We've found it's very challenging to improve Claude's actual "vision" through just text prompts, but we can of course improve its reasoning and thought process once it extracts info from an image.
In general, I think vision is still in its early days, although 3.5 Sonnet is noticeably better than older models.