4 posts tagged bragdoc

Eval-Driven Design with NextJS and mdx-prompt

In the previous article, we went on a deep dive into how I use mdx-prompt on bragdoc.ai to write clean, composable LLM prompts using good old JSX. In that article as well as the mdx-prompt announcement article, I promised to talk about Evals and their role in helping you architect and build AI apps that you can actually prove work.

Evals are to LLMs what unit tests are to deterministic code. They are an automated measure of the degree to which your code functions correctly. Unit tests are generally pretty easy to reason about, but LLMs are usually deployed to do non-deterministic and somewhat fuzzy things. How do we test functionality like that?

In the last article we looked at the extract-achievements.ts file from bragdoc.ai, which is responsible for extracting structured work achievement data using well-crafted LLM prompts. Here's a reminder of what that achievement extraction process looks like, with its functions to fetch, render and execute the LLM prompts.

Orchestration Diagram for Extracting Achievements from Text
The 3 higher-level functions are just orchestrations of the lower-level functions

When it comes right down to it, when we say we want to test this LLM integration, what we're trying to test is render() plus execute(), or our convenience function renderExecute. This allows us to craft our own ExtractAchievementsPromptProps and validate that we get reasonable-looking ExtractedAchievement objects back.
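To make the idea concrete, here is a minimal sketch of what an eval over this kind of function could look like. It is not the bragdoc.ai eval code: the one-field ExtractedAchievement shape, the EvalCase type, the Extractor stand-in for renderExecute, and the exact-title scoring rule are all assumptions made for illustration. Real evals often use fuzzier scoring, such as an LLM acting as judge.

```typescript
// Hypothetical, simplified shape; the real ExtractedAchievement has more fields.
interface ExtractedAchievement {
  title: string;
}

// One eval case: an input message plus the titles we expect back (assumed format).
interface EvalCase {
  input: string;
  expected: string[];
}

// Stand-in for renderExecute, so the sketch can run without calling an LLM.
type Extractor = (message: string) => Promise<ExtractedAchievement[]>;

const normalize = (s: string): string => s.trim().toLowerCase();

// Fraction of expected titles found in the output (recall), between 0 and 1.
function scoreTitles(expected: string[], actual: ExtractedAchievement[]): number {
  if (expected.length === 0) return actual.length === 0 ? 1 : 0;
  const found = new Set(actual.map((a) => normalize(a.title)));
  const hits = expected.filter((t) => found.has(normalize(t))).length;
  return hits / expected.length;
}

// Average score across all cases; a fuzzy measure rather than a pass/fail test.
async function runEval(cases: EvalCase[], extract: Extractor): Promise<number> {
  if (cases.length === 0) return 0;
  let total = 0;
  for (const c of cases) {
    total += scoreTitles(c.expected, await extract(c.input));
  }
  return total / cases.length;
}
```

The point of the sketch is the shape: crafted inputs go in, structured output comes back, and a scoring function turns the comparison into a number that can be tracked over time.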

ExtractAchievementsPromptProps is just a TS interface that describes all the data we need to render the LLM prompt to extract achievements from a chat session. It looks like this:

// props required to render the Extract Achievements Prompt
export interface ExtractAchievementsPromptProps {
  companies: Company[];
  projects: Project[];
  message: string;
  chatHistory: Message[];
  user: User;
Continue reading

mdx-prompt: Real World Example Deep Dive

I just released mdx-prompt, which is a simple library that lets you write familiar React JSX to render high quality prompts for LLMs. Read the introductory article for more general info if you didn't already, but the gist is that we can write LLM Prompts with JSX/MDX like this:

You are a careful and attentive assistant who extracts work achievements
from source control commit messages. Extract all of the achievements in
the commit messages contained within the <user-input> tag. Follow
all of the instructions provided below.
<Instruction>Each Achievement should be complete and self-contained.</Instruction>
<Instruction>If multiple related commits form a single logical achievement, combine them.</Instruction>
Pay special attention to:
1. Code changes and technical improvements
2. Bug fixes and performance optimizations
3. Feature implementations and releases
4. Architecture changes and refactoring
5. Documentation and testing improvements
<Companies companies={data.companies} />
<Projects projects={data.projects} />
<today>{new Date().toLocaleDateString()}</today>
{data.commits?.map((c) => <Commit key={c.hash} commit={c} />)}
<Repo repository={data.repository} />
<Examples examples={data.expectedAchievements?.map((e) => JSON.stringify(e, null, 4))} />

This ought to look familiar to anyone who's ever seen React code. This project was born of a combination of admiration for the way IndyDevDan and others structure their LLM prompts, and frustration with the string interpolation approaches that everyone takes to generating prompts for LLMs.

In the introductory post I go into some details on why string interpolation-heavy functions are not great for prompts. It's a totally natural thing to want to do - once you've started programming against LLM interfaces, you want to start formalizing the mechanism by which you generate the string that is the prompt. Before long you notice that many of your app's prompts have a lot of overlap, and you start to think about how you can reuse the parts that are the same.

Venn Diagram of Prompt similarities
The Venn Diagram of these 3 prompts used in bragdoc.ai shows a large degree of overlap
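As a rough illustration of that overlap, here is a sketch of the plain-string approach described above, with the shared parts factored into helper functions. This is not mdx-prompt and not the bragdoc.ai code: the Company and Project shapes, the function names, and the prompt wording are all hypothetical.

```typescript
// Hypothetical, simplified types for illustration only.
interface Company { name: string; }
interface Project { name: string; }

// Shared fragments: every prompt that needs this context reuses them.
const companiesBlock = (companies: Company[]): string =>
  ["Companies:", ...companies.map((c) => `- ${c.name}`)].join("\n");

const projectsBlock = (projects: Project[]): string =>
  ["Projects:", ...projects.map((p) => `- ${p.name}`)].join("\n");

// Two prompts that differ only in their opening line and their input section.
function chatPrompt(message: string, companies: Company[], projects: Project[]): string {
  return [
    "Extract work achievements from the chat message below.",
    companiesBlock(companies),
    projectsBlock(projects),
    `Message: ${message}`,
  ].join("\n\n");
}

function commitsPrompt(commits: string[], companies: Company[], projects: Project[]): string {
  return [
    "Extract work achievements from the commit messages below.",
    companiesBlock(companies),
    projectsBlock(projects),
    ["Commits:", ...commits.map((c) => `- ${c}`)].join("\n"),
  ].join("\n\n");
}
```

This works, but each new prompt adds more array joins and template strings, and the structure of the final prompt gets harder to see in the source.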

Lots of AI-related libraries try to help you here with templating solutions, but they often feel clunky. I really, really wanted to like Langchain, but I lost a day of my life trying to get it to render a prompt that I could have done in 5 minutes with JSX. JSX seems to be a pretty good fit for this problem, and anyone who knows React (a lot of people) can pick it up straight away. mdx-prompt helps React developers compose their LLM prompts with the familiar syntax of JSX.

Continue reading

mdx-prompt: Composable LLM Prompts with JSX

I'm a big fan of IndyDevDan's YouTube channel. He has greatly expanded my thinking when it comes to LLMs. One of the interesting things he does is write many of his prompts with an XML structure, like this:

You are a world-class expert at creating mermaid charts.
You follow the instructions perfectly to generate mermaid charts.
The user's chart request can be found in the user-input section.

<instruction>Generate a valid mermaid chart based on the user-prompt.</instruction>
<instruction>Use the diagram type specified in the user-prompt.</instruction>
<instruction>Use the examples to understand the structure of the output.</instruction>

State diagram for a traffic light. Still, Moving, Crash.

Build a pie chart that shows the distribution of Apples: 40, Bananas: 35, Oranges: 25.
pie title Distribution of Fruits
"Apples" : 40
"Bananas" : 35
"Oranges" : 25
//... more examples

I really like this structure. Prompt Engineering has been a dark art for a long time. We're suddenly programming using English, which is hilariously imprecise as a programming language, and it feels not quite like "real engineering".

But prompting is not really programming in English, it's programming in tokens. It just looks like English, so it's easy to fall into the trap of giving it English. We're not constrained to that at all, though: we can format our prompts more like XML and reap some considerable rewards:

  • It's easier for humans to reason about prompts in this format
  • It's easier to reuse content across prompts
  • It's easier to have an LLM generate a prompt in this format (see IndyDevDan's metaprompt video)
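A minimal sketch of how such an XML-style prompt might be assembled in plain TypeScript is shown below. The tag helper and buildPrompt function are hypothetical, as are the purpose and instructions tag names. Only the instruction and user-input names come from the example prompt above.

```typescript
// Wrap a body in an XML-style tag, indenting each line of the body by two spaces.
function tag(name: string, body: string): string {
  const indented = body
    .split("\n")
    .map((line) => `  ${line}`)
    .join("\n");
  return `<${name}>\n${indented}\n</${name}>`;
}

// Assemble a prompt from named sections (the section names are assumptions).
function buildPrompt(purpose: string, instructions: string[], userInput: string): string {
  const instructionLines = instructions
    .map((i) => `<instruction>${i}</instruction>`)
    .join("\n");
  return [
    tag("purpose", purpose),
    tag("instructions", instructionLines),
    tag("user-input", userInput),
  ].join("\n");
}

const prompt = buildPrompt(
  "You are a world-class expert at creating mermaid charts.",
  ["Use the diagram type specified in the user-prompt."],
  "State diagram for a traffic light. Still, Moving, Crash."
);
```

Because each section is just a function call, the same purpose or instruction list can be passed to several prompts, which is the reuse benefit listed above.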

We've seen this before

I've started migrating many of my prompts to this format, and noticed a few things:

  • It organized my thinking around what data the prompt needs
  • Many prompts could or should use the same data, but repeat fetching/rendering logic each time
Continue reading

How I built bragdoc.ai in 3 weeks

As we start 2025, it's never been faster to get a SaaS product off the ground. The frameworks, vendors and tools available make it possible to build in weeks what would have taken months or years even just a couple of years ago.

But it's still a lot.

Even when we start from a base template, we still need to figure out our data model, auth, deployment strategy, testing, email sending/receiving, internationalization, mobile support, GDPR, analytics, LLM evals, validation, UX, and a bunch more things:

Version 1 of anything is still a lot

This morning I launched bragdoc.ai, an AI tool that tracks the work you do and writes things like weekly updates & performance review documents for you. In previous jobs I would keep an achievements.txt file that theoretically kept track of what I worked on each week so that I could make a good case for myself come review time. Bragdoc scratches my own itch by keeping track of that properly, with a chatbot that can also make nice reports for me to share with my manager.

But this article isn't much about bragdoc.ai itself, it's about how a product like it can be built in 3 weeks by a single engineer. The answer is AI tooling, and in particular the Windsurf IDE from Codeium.

In fact, this article could easily have been titled "Use Windsurf or Die". I've been in the fullstack software engineering racket for 20 years, and I've never seen a step-change in productivity like the one heralded by Cursor, Windsurf, Repo Prompt and the like. We're in the first innings of a wave of change in how software is built.

Continue reading