The power of a good prompt

David Fincher is famous for shooting an absurd number of takes. Dozens. Sometimes over a hundred for a single scene. But the reason isn't that his actors are bad. It's that Fincher has an extremely specific vision for what he wants, and he keeps refining his direction until the performance matches it.

Compare that to a director who tells an actor, "OK, be sad in this scene," and rolls camera. Maybe the actor nails it. Maybe they go way too big. Maybe they deliver something technically competent but emotionally flat. You get whatever you get.

The difference isn't talent. It's direction.

The same principle applies to working with AI. When you type a question into ChatGPT or Claude, you're directing a performance. A vague prompt is like telling an actor to "be sad." A well-crafted prompt is like Fincher: specific about the role, the context, the constraints, and the desired outcome.

In the last post, we introduced prompt engineering as one of four techniques for reducing hallucination. Now let's dig into what that actually looks like in practice.

What is a prompt, really?

When most people think of a "prompt," they think of the message they type into a chatbot. That's part of it, but there's more going on behind the scenes.

Production AI tools typically operate with two layers of instructions. The user prompt is what you type into the chat box. The system prompt is a set of behind-the-scenes instructions that the user never sees. It defines the model's role, behavior, and boundaries before the conversation even starts.
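In API terms, the two layers travel together as separate messages in a single request. Here's a minimal sketch using the OpenAI-style message list convention (the assistant's name and scope are invented for illustration):

```python
# The system prompt is written once by the tool's builder; the user
# prompt is whatever the person typed. Both are sent to the model.

system_prompt = (
    "You are 'Civic', the help assistant for the city's website. "
    "Introduce yourself by name. Only answer questions about city "
    "services, and politely decline anything off-topic."
)

user_prompt = "What are the library's hours this weekend?"

messages = [
    {"role": "system", "content": system_prompt},  # hidden from the user
    {"role": "user", "content": user_prompt},      # visible in the chat box
]
```

The user only ever sees their side of this exchange, which is why a well-behaved chatbot can feel like it "just knows" its job.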

If you've ever used a chatbot on a company's website and noticed it stays on topic, introduces itself by name, or declines to answer certain questions, that's the system prompt at work. Someone wrote instructions telling the model how to behave, and the model follows them (most of the time).

System prompts are where most of the heavy lifting happens in prompt engineering. They're also where the techniques in this post are most commonly applied. So while everything we'll cover works in a regular chat conversation, the real power shows up when you're building tools for other people to use.


Role assignment

The first and most fundamental technique is telling the model who it is. This sounds almost too simple, but it has a surprisingly large effect on the quality of responses.

Here's a vague prompt:

  "You're a helpful assistant. Answer questions about the budget."

And here's a specific one:

  "You are a budget assistant for the city finance department. You help staff find information about the adopted budget, department allocations, and purchasing procedures. You do not answer questions about personnel matters, legal issues, or anything outside the budget. If you are not sure of an answer, say so rather than guessing."

The second prompt establishes a clear identity, defines the scope of what the model should help with, draws a boundary around what it shouldn't do, and gives it an escape hatch for uncertainty. That's four meaningful constraints in four sentences.

Why does this matter? Remember, LLMs are trained to be helpful. Without a defined role, the model will try to answer anything you throw at it, whether it has the knowledge to do so or not. Giving it a role is like telling a new hire, "Your job is X. If someone asks you about Y, send them to the right department." It reduces the surface area for mistakes.

Going back to our Fincher analogy: a director doesn't just say "action." They tell the actor who their character is, what just happened in the scene before, and what emotional state they should be in. Role assignment does the same thing for the model.

Scoping and constraints

Role assignment tells the model what it is. Scoping tells the model what it should and shouldn't do.

This is especially important for anything public-facing. Imagine your city launches a chatbot on its website to help residents find information about city services. Without constraints, that chatbot might happily opine on the mayor's performance, speculate about pending lawsuits, or give tax advice, none of which you want.

A well-scoped prompt for a resident-facing bot might include instructions like:

  • Only answer questions about city services, hours, locations, and public programs.
  • Do not discuss political topics, elected officials, pending litigation, or personnel matters.
  • Do not provide legal, financial, or medical advice.
  • If someone asks a question outside your scope, politely tell them to call 311 or visit City Hall.
  • If you are not confident in your answer, say so.
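Stitched together, rules like these become the body of the system prompt itself. A minimal sketch (the opening line is an assumption; the rules are the ones listed above):

```python
# Each scoping rule becomes one line of the system prompt.
scope_rules = [
    "Only answer questions about city services, hours, locations, and public programs.",
    "Do not discuss political topics, elected officials, pending litigation, or personnel matters.",
    "Do not provide legal, financial, or medical advice.",
    "If someone asks a question outside your scope, politely tell them to call 311 or visit City Hall.",
    "If you are not confident in your answer, say so.",
]

system_prompt = (
    "You are a virtual assistant for the city's official website.\n"
    + "\n".join(f"- {rule}" for rule in scope_rules)
)
```

Keeping the rules in a list like this also makes them easy to review and revise one at a time as you discover gaps.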

That last one is worth repeating: telling the model it's OK to say "I don't know" is one of the most effective things you can do to reduce hallucination. Without that instruction, the model's default behavior is to always produce an answer, even when it's guessing. With it, the model has permission to punt, and it will.

Will it always punt at the right times? No. But it will punt a lot more often than it would without the instruction, and each time it does, that's a hallucination that didn't happen.

Few-shot examples

The techniques we've covered so far tell the model what to do in the abstract. Few-shot examples show it what good output actually looks like.

The idea is simple: include a handful of example question-and-answer pairs in your prompt. The model picks up on the pattern and mimics it. This is called "few-shot" prompting because you're giving it a few examples to learn from (as opposed to "zero-shot," where you give it no examples and just hope for the best).

Let's say you're building a budget Q&A tool for your finance team. You want answers that are concise, cite the specific fund, and include the fiscal year. You could describe all of that in your instructions, or you could simply show the model what you mean with two or three example question-and-answer pairs written exactly the way you want them.

Now when a staff member asks a new question, the model has a template to follow. It knows to include the fund number, the fiscal year, and comparative context. You didn't have to write a paragraph explaining your preferred format. You just showed it.
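That setup can be sketched as a message list, with each example pair played back as if it were an earlier turn in the conversation. (The questions, dollar figures, and fund numbers below are invented purely for illustration.)

```python
# Each example pair is inserted as a prior exchange, so the model
# imitates the answers' style: concise, fund cited, fiscal year included.
few_shot_pairs = [
    ("How much is budgeted for street repaving?",
     "The FY2026 budget allocates $1.8M for street repaving (Fund 210), up from $1.6M in FY2025."),
    ("What is the library materials budget?",
     "The FY2026 budget allocates $420K for library materials (Fund 115), unchanged from FY2025."),
]

messages = [{"role": "system",
             "content": "You answer budget questions concisely, always citing the fund and fiscal year."}]
for question, answer in few_shot_pairs:
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": answer})

# The real question goes last; the model follows the pattern above it.
messages.append({"role": "user", "content": "How much did we budget for parks maintenance?"})
```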

Few-shot examples are particularly powerful for controlling tone and format. If you've ever trained a new employee by saying, "Here, look at how Sarah wrote this report, and do something similar," you've done few-shot prompting. Same principle.

How many examples do you need? Usually two or three is enough. One example can help, but the model might latch onto a quirk of that specific example rather than the underlying pattern. Two or three gives it enough signal to generalize. Much more than that and you're eating into your context window without much additional benefit.

Chain-of-thought prompting

This one sounds technical, but the intuition is simple. If you ask someone a complex question and they blurt out the first answer that comes to mind, they're more likely to be wrong than if they talk through their reasoning first. The same is true for LLMs.

Chain-of-thought prompting means asking the model to think through a problem step by step before giving its final answer. You can do this as simply as adding "Think through this step by step before answering" to your prompt.
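As a sketch, the instruction is simply appended to whatever you were going to ask. The exact wording below is one common phrasing, not a magic formula:

```python
def add_chain_of_thought(prompt: str) -> str:
    """Append a step-by-step reasoning request to an existing prompt."""
    return prompt + "\n\nThink through this step by step before giving your final answer."

prompt = add_chain_of_thought("Which of these three vendors' bids best fits our budget?")
```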

Why does this work? It goes back to how LLMs generate text, which we covered in an earlier post. The model predicts one token at a time, and each token it generates influences the next one. When you ask the model to reason out loud, those reasoning tokens become part of the sequence. They steer the model toward a more considered final answer, because by the time it gets to the conclusion, it's already "thought through" the intermediate steps.

It's a little like the difference between asking a colleague, "What should we do?" and asking, "Walk me through your thinking." The second version forces them to organize their thoughts, which usually leads to a better answer.

A note on modern reasoning models

If you've used a recent version of ChatGPT, Claude, or Gemini, you may have noticed that they sometimes seem to "think" before responding. That's not your imagination.

Many modern models are now specifically trained to perform a reasoning step before generating their final answer. In effect, chain-of-thought happens automatically, behind the scenes. The model works through the problem internally before you see any output.

This is a meaningful improvement. It means that even without explicit "think step by step" instructions, newer models are better at complex reasoning than their predecessors were. But it doesn't make chain-of-thought prompting obsolete. For particularly tricky problems, explicitly asking the model to show its work can still improve results, both because it gives you visibility into the reasoning and because it adds even more structure to the thinking process.

The key takeaway: if you're using a recent model, you're already getting some of this benefit for free. If you're working on something complex and want the model to be more careful, you can still ask it to reason explicitly.

Output format instructions

The last technique is the simplest to explain: tell the model what format you want the answer in.

In the previous post, we mentioned structured outputs as a technique for making AI responses easier to verify. There's overlap here. The difference is that structured outputs (at the API level) enforce a format. Output format instructions in a prompt request a format. The model will usually comply, but it's not guaranteed.

That said, even a requested format goes a long way. Instead of getting a rambling paragraph that buries the key information somewhere in the middle, you can ask for a handful of labeled fields: the answer itself, the policy or source it rests on, and a confidence level.

Now every response has a consistent structure. Staff members know where to look for the information they need. And if the model fills in a policy section that doesn't exist, it's immediately obvious because it's in its own labeled field rather than buried in prose.
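Because a requested format can be checked mechanically, a tool can verify each reply before showing it to anyone. A minimal sketch, assuming three labeled fields (the field names and sample replies are illustrative):

```python
# Fields we asked the model to include in every reply.
REQUIRED_FIELDS = ("Answer:", "Policy:", "Confidence:")

def follows_format(response: str) -> bool:
    """True if every requested label appears somewhere in the reply."""
    return all(field in response for field in REQUIRED_FIELDS)

good = "Answer: Permits take 10 days.\nPolicy: Building Code 4.2\nConfidence: High"
bad = "Permits usually take about ten business days, give or take."
```

A reply that fails the check can be regenerated or flagged for a human, which is a cheap safety net for a public-facing tool.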

Output format instructions also pair well with few-shot examples. Define the format in your instructions, then show what it looks like with a couple of examples. That combination gives the model both the rules and a concrete pattern to follow.


Common mistakes

Before we wrap up, let's talk about what not to do.

Being too vague. This is the most common mistake by far. "Help me with budget stuff" gives the model almost nothing to work with. What kind of help? For what audience? In what format? How detailed? The model will fill in every blank with its best guess, and guessing is where things go wrong.

Being too rigid. This is the opposite problem, and it's more subtle. If you over-constrain the model with extremely detailed instructions for every possible scenario, you can actually make it less useful. The model may get confused by contradictory rules or become so cautious that it won't answer anything. Good prompts define boundaries clearly but leave room for the model to be helpful within them.

Not iterating. Good prompts aren't written in one sitting. They're developed through testing. You write a prompt, try it with a variety of questions, notice where it breaks, and refine. This is normal and expected. If your first prompt is perfect, you probably got lucky.

There's a parallel here to management. The best managers don't hand someone a 40-page procedures manual on day one. They give clear direction, observe the results, and adjust. Prompt engineering works the same way.


What you don't need to worry about

You don't need to be a programmer to write good prompts. You don't need to memorize magic phrases or special syntax. The principles here are the same ones you'd use to write clear instructions for a person: define the role, set expectations, give examples of what good looks like, and tell them what to do when they're stuck.

If you can write a clear job description or a decent set of meeting notes, you can write an effective prompt. The skill is in the clarity of your thinking, not in any technical knowledge.

Next up in the series, we'll look at retrieval-augmented generation (RAG): how to give the model access to your actual documents so it can answer questions based on real information instead of relying on memory.

April 4th, 2026
Updated Mar 31, 2026