April 7th, 2025

Tobi's Memo: AI usage is now a baseline expectation

In April 1998, Intel Chairman Andy Grove remarked:

All companies will be Internet companies, or they won't be companies.

Today Tobi Lutke, CEO of Shopify, shared an internal, now public, memo titled "AI usage is now a baseline expectation."

What stands out to me the most is how this will impact staff resourcing:

Before asking for more Headcount and resources, teams must demonstrate why they cannot get what they want done using AI. What would this area look like if autonomous AI agents were already part of the team? This question can lead to really fun discussions and projects.

Published at 11:08 am


March 8th, 2025

Prompting Like An Expert: Cognitive Behaviors That Matter

Summary

I recently created this meta-prompt after applying what I learned from this arXiv paper.

Exploring Expert Behavior Driven Prompting

The paper caught my attention as it seemed it could be a great resource for prompt design.

Just before reviewing it, I'd been struggling to build a prompt with a strict set of output requirements. One of them was to keep the output under 2,000 words.

It kept creating well-formatted outputs about 2,500 words long.

This wasn't surprising, as LLMs often "think" in terms of token length as opposed to word length.
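For a limit like this, I find it helps to measure the output rather than eyeball it. Here's a minimal sketch of a word-count check. The function names and the whitespace-splitting rule are my own assumptions for illustration, not something from the paper or from any model API, and a whitespace split is only a rough proxy for how a word processor counts words.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated words (a rough proxy for a word count)."""
    return len(text.split())


def within_word_limit(text: str, limit: int = 2000) -> bool:
    """Return True when the text contains at most `limit` words."""
    return count_words(text) <= limit
```

Paste a model's response into `within_word_limit(response)` and you get a quick pass/fail on the requirement before moving on to the next prompt revision.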

After summarizing the paper, I knew it could help, and it did. By the time I was done, I was getting high-quality output in the 1,400-word range.

Using the HTML version of the paper, I asked Claude 3.7 Sonnet to summarize it for me.

Summarize this research report for me.

[THE HTML VERSION OF THE arXiv PAPER (https://arxiv.org/html/2503.01307v1)]

I then followed up by exploring how the research could be applied to prompt design.

How would the learnings from this paper be used when designing system prompts for llm chat assistants using the direct foundation model API (ex: openai, anthropic)

I then asked what a prompt would look like that incorporated the application of this research.

Here's what I got.

Now I could use this prompt to make it even better.

The v2 prompt was created using ChatGPT 4.5.

GPT-4.5 created an improved result which I was able to use for another round of improvement.

And I ended up with the final prompt to use.

I used both Claude 3.7 Sonnet and ChatGPT 4.5 to evaluate all three versions of the prompt.

It was satisfying to see the v2 prompt score higher than the v1 prompt and the v3 prompt score higher than the v2 prompt.

This entire process was a lot of fun. It's another example of how, when we identify something potentially useful, we can use LLMs to quickly validate it and turn it into new tools.

Published at 3:19 pm


February 25th, 2025

Hello World (again)

Well, I'm blogging again!

I've torn this site down and built it back up a few times over the past ~2 decades.

I've never been quite happy with the tooling I've used, so I coded this site from scratch using Cursor + Sonnet. It came together fast and it's (more or less) just to my liking.

I intend to use this Link Blog section to regularly share links in the style of Simon Willison's Link Blog.

Thanks for reading.

Published at 3:54 pm