Prompt Engineering for Developers: Less Magic, More Method

Since GPT-4 came out in March, my feed has been full of prompt magicians. Threads with “10 secret prompts that will change your life”. Number 7 will shock you. Courses selling prompt formulas like they are spells. People sharing screenshots of one lucky answer like it is proof of a method. I understand the excitement: the models really did get better. But I think we developers are looking at this the wrong way, and the right way is something we already know.

Here is my claim. Prompt engineering is not a new skill. It is an old skill with a new audience. Writing a good prompt is writing a good specification. We have been doing this, and failing at this, with humans for decades.

Think about what happens when you give a vague task to a developer. “Make the report faster.” Faster than what? For which user? Is one second fine or do we need a hundred milliseconds? Can we cache, can we change the data shape, is there a budget? The developer guesses, builds the wrong thing, and you do another round. Every round costs days. Most projects do not die from bad code; they die from rounds of guessing. I learned that running an agency for eleven years before I moved to Canada. At the agency, every round of rework came directly out of our margin. Vague requirements were not a quality problem, they were a financial leak.

Now look at what happens with GPT-4. You write “make this function better” and you get a generic answer. You write “this function is called thousands of times per request, reduce allocations, keep the public signature, we are on Java 11” and you get something useful. Same model, same cost per call, completely different value. The difference is not magic. The difference is that the second prompt is a specification and the first one is a wish.

The method, which is not new

[Figure: a structured prompt with context, intent, constraints, and acceptance producing a clean result]

So instead of collecting secret prompts, I use the same checklist I use when writing a task for a human. It has four parts.

Context. What does the reader need to know to do this well? For a person, the business background and the constraints. For the model, the language version, the framework, the relevant code, what already failed. Both fail in the same way without context: they fill the gaps with assumptions, and assumptions are where bugs are born.

Intent. Not just what to do, but why. “Write tests for this class” gives you tests that mirror the code. “This class handles refunds, the risky part is partial refunds with expired cards, protect that behavior” gives you tests that protect money. Humans work the same way. People who know the why make better decisions on every detail you forgot to specify.

Constraints. What is not allowed. Do not change the public API. Do not add dependencies. Stay under this latency. Models, like contractors, will happily take the shortcut you forgot to forbid.

Acceptance. How we both know it is done. Examples of input and output work great, for the model and for the junior developer. If you cannot produce one example of a correct result, you do not understand the task yet, and no prompt will fix that for you.
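To make the checklist concrete, here is a minimal sketch in Python of a task written as a small data structure. It is only an illustration. The class name TaskSpec, its fields, and the rendered layout are my own assumptions, not a standard format, and the acceptance example is hypothetical. The point is that a spec with an empty part refuses to render, for a model or for a ticket.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """A task written as a specification. All names here are illustrative."""
    task: str
    context: list[str] = field(default_factory=list)
    intent: str = ""
    constraints: list[str] = field(default_factory=list)
    # Each acceptance example is a pair: (input, expected result).
    acceptance: list[tuple[str, str]] = field(default_factory=list)

    def missing_parts(self) -> list[str]:
        """Return the names of the checklist parts that are still empty."""
        parts = {
            "context": self.context,
            "intent": self.intent,
            "constraints": self.constraints,
            "acceptance": self.acceptance,
        }
        return [name for name, value in parts.items() if not value]

    def render(self) -> str:
        """Render the spec as plain text, usable as a prompt or a ticket body."""
        missing = self.missing_parts()
        if missing:
            raise ValueError("spec is incomplete, missing: " + ", ".join(missing))
        lines = ["Task: " + self.task, "", "Context:"]
        lines += ["- " + item for item in self.context]
        lines += ["", "Intent: " + self.intent, "", "Constraints:"]
        lines += ["- " + item for item in self.constraints]
        lines += ["", "Acceptance examples:"]
        lines += [f"- input: {given} -> expected: {expected}"
                  for given, expected in self.acceptance]
        return "\n".join(lines)


# The wish: a task with nothing else filled in.
wish = TaskSpec(task="Make this function better")

# The specification: same task, all four parts filled in.
spec = TaskSpec(
    task="Optimize this function",
    context=["Called thousands of times per request", "We are on Java 11"],
    intent="Reduce allocations on the hot path",
    constraints=["Keep the public signature"],
    # Hypothetical acceptance example, invented for this sketch.
    acceptance=[("inputs from the existing unit tests", "same return values as today")],
)

print(wish.missing_parts())  # all four parts are missing
print(spec.render())
```

The wish fails the check before it costs anyone a round. Whether you send the rendered text to GPT-4 or paste it into a ticket for a colleague is a detail.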

Nothing in this list was invented for AI. It is requirements writing. The agile community spent twenty years on this, and Martin Fowler’s writing on specification by example covers most of it. The model just made the feedback loop brutally short. With a human, a vague spec takes a week to come back wrong. With GPT-4, it comes back wrong in ten seconds. The model is a mirror for the quality of your thinking, and the mirror is fast.

The economics of clear words

Let me make the business case, because that is the part the magic threads skip.

A developer hour costs the company something like seventy to a hundred dollars, fully loaded. A round of rework caused by a vague task burns hours on both sides, the asker and the builder. If clear specifications save even one round of guessing per week per developer, across a team of ten you are saving thousands of dollars a month. That was already true before AI. The new part is that AI multiplies it, because now unclear writing wastes the model’s rounds and your review time too. Reviewing plausible but wrong AI code is expensive: it looks right, so you have to read it slowly.
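The arithmetic behind “thousands of dollars a month” can be checked in a few lines of Python. The hourly rates, the team of ten, and the one round per week come from the paragraph above. The two hours per saved round is my own assumption for this sketch, and a deliberately small one, since a real round usually burns more than that across both sides.

```python
def monthly_rework_savings(developers, rounds_saved_per_week,
                           hours_per_round, hourly_cost,
                           weeks_per_month=52 / 12):
    """Dollars saved per month when each developer avoids some rework rounds."""
    return (developers * rounds_saved_per_week * hours_per_round
            * hourly_cost * weeks_per_month)


# ASSUMPTION: two hours lost per round in total. Adjust to your own team.
HOURS_PER_ROUND = 2

low = monthly_rework_savings(10, 1, HOURS_PER_ROUND, hourly_cost=70)
high = monthly_rework_savings(10, 1, HOURS_PER_ROUND, hourly_cost=100)

print(round(low))   # 6067
print(round(high))  # 8667
```

Even with this cautious assumption, the team of ten lands between roughly six and nine thousand dollars a month.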

There is also a team health side. Teams with a culture of clear, written, testable requests have fewer of those tense retro conversations that start with “that is not what I asked for”. Less friction, less blame, less burnout. The same writing habit that makes you good with GPT-4 makes you a better colleague. That is a nice two-for-one.

My honest take

I skip the prompt formula courses. I take that time and practice the older skill instead, writing one task at a time, for a human or a model, with context, intent, constraints, and acceptance examples. Every time, less rework comes back.

The magicians will keep selling spells, and the market for spells is always good. But the boring truth is better. Clear thinking, written clearly, was always the highest paid skill in software. GPT-4 did not change that. It just made the payment arrive faster.

Pax et bonum.