AI & Prompt Engineering · Research · 2024

The art of prompting:
how to query AI models
and engineer the perfect prompt.

Most people use AI like a search engine. This is a guide to using it like a thinking partner — covering every technique that separates generic output from genuinely useful results.

Domain: Research / Freelance
Patterns: 40+ documented
Scope: 6 task categories
Year: 2024

40+ prompt patterns documented & tested
3 client workflows optimised with the framework
6 task categories covered end-to-end
Why Prompting Is a Skill, Not a Trick

The gap between a mediocre AI response and a genuinely useful one is almost never the model's capability — it's the quality of the input. Large language models are not search engines, and they don't perform well when treated as one. They are context-sensitive text completion systems: they continue from wherever you leave them, which means the quality, specificity and structure of your prompt determines the ceiling of what they can produce.

This matters more than most people realise. An engineer who knows how to prompt well gets a first draft of a complex function in one shot. One who doesn't spends 20 minutes iterating on vague outputs. A marketing team with a prompting framework produces consistent, on-brand content at scale. One without it gets generic text that needs complete rewrites. The difference isn't the model — it's the operator.

The Core Principle

A language model continues your text. Every word in your prompt is a constraint that shapes the probability distribution of what comes next. More precise constraints produce more precise outputs. Vague prompts leave too much of the distribution open — and the model fills that space with the most statistically average response it can find.

What most people get wrong

The most common mistakes aren't about technical knowledge — they're about how people conceptualise the interaction. These patterns appear repeatedly across teams adopting AI tools for the first time:

  • Treating it like a search query. Single-line prompts without context, role or output specification. The model doesn't know who it's writing for, what format is needed, or what "good" looks like.
  • Asking for the answer before specifying the problem. "Write a marketing email" without explaining the product, audience, tone or goal. The model invents all of these — badly.
  • Iterating without structure. Adding corrections one at a time rather than rebuilding the prompt with the learned constraints. This creates a conversation that drifts rather than converges.
  • Accepting the first output. Good prompting is collaborative. The first response is diagnostic — it tells you what context the model assumed. Use it to refine the prompt, not to copy-paste.
Anatomy of a High-Quality Prompt

A well-engineered prompt is not a long prompt — it's a complete prompt. Completeness means the model has everything it needs to produce useful output without making assumptions about the things you actually care about. There are five components that, when combined, eliminate most of the guesswork.

Role: You are a senior backend engineer with 10 years of experience in distributed systems.
Context: I'm building a payment API that needs to handle concurrent webhook retries without double-charging.
Task: Review the following idempotency implementation and identify any race conditions or edge cases.
Constraints: Focus on PostgreSQL-specific behaviour. Avoid suggesting Redis unless absolutely necessary.
Format: Return a numbered list of issues, each with: problem, impact, and a concrete fix.

Not every prompt needs all five components. A simple factual question needs none of them. But for any task involving judgment, tone, structure or domain knowledge — which is most of the valuable use cases — the more components you include, the better the output.
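Because the five components are independent, a prompt can be assembled mechanically from whichever ones a task needs. The sketch below is illustrative (the function name and labels are not from the framework itself), showing one minimal way to combine components while skipping any that are omitted:

```python
def build_prompt(role=None, context=None, task=None, constraints=None, fmt=None):
    """Assemble a prompt from the five anatomy components.

    Any component left as None is simply skipped, so a quick factual
    question can use `task` alone while a judgment-heavy task uses all five.
    """
    parts = [
        role,
        context,
        task,
        constraints,
        f"Output format: {fmt}" if fmt else None,
    ]
    # Join only the components that were provided, separated by blank lines.
    return "\n\n".join(p for p in parts if p)

prompt = build_prompt(
    role="You are a senior backend engineer with 10 years of experience in distributed systems.",
    task="Review the following idempotency implementation and identify any race conditions or edge cases.",
    fmt="A numbered list of issues, each with: problem, impact, and a concrete fix.",
)
```

The value of this shape is less the code than the discipline: every prompt in a team library carries the same named slots, so a missing component is visible at a glance.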

Before and after: the same task, engineered differently

The examples below show the same underlying need expressed as a weak prompt and an engineered one. The output difference is dramatic.

Task: Explain a technical concept

❌ Weak prompt

Explain how JWT works.

✓ Engineered prompt

You are a senior backend engineer explaining concepts to a mid-level developer who has never used JWTs in production. Explain how JWT authentication works, covering: structure (header, payload, signature), the signing and verification flow, and the top two security mistakes teams make with JWTs. Use concrete examples. Keep it under 400 words.

Task: Generate marketing copy

❌ Weak prompt

Write a landing page headline for my SaaS tool.

✓ Engineered prompt

You are a conversion copywriter specialising in B2B SaaS. Write 5 landing page headline variants for a tool that helps security teams automate compliance reporting. Target audience: CISOs and security managers at 50–500 person companies. Tone: confident, direct, no buzzwords. Each headline should be under 10 words and lead with a measurable outcome.

Core Prompting Techniques

Beyond basic structure, a set of well-documented techniques reliably improve output quality for specific task types. These aren't tricks — they're ways of steering the model's internal reasoning process toward more careful, grounded, and useful responses.

Chain-of-Thought (CoT)

Chain-of-thought prompting instructs the model to reason through a problem step by step before producing an answer. This dramatically improves performance on tasks requiring multi-step reasoning — maths, logic, code debugging, complex decisions — because it forces the model to commit to intermediate reasoning that can be checked, rather than jumping directly to a conclusion that may be plausible but wrong.

The simplest trigger is adding "Let's think through this step by step." to your prompt. More powerful versions define the reasoning steps explicitly.

Chain-of-Thought Flow

01. Problem: raw user query
02. "Let's think…": step-by-step instruction
03. Reasoning: model works through it
04. Verification: model checks its own logic
05. Answer: grounded final output
Chain-of-Thought prompt:

A user's API request fails intermittently — about 1 in 50 requests returns a 503. The service has three replicas behind a load balancer. Logs show no errors on the application side.

Reason through this step by step:
1. What are the most likely infrastructure causes?
2. What would you check first and why?
3. What's your working hypothesis after step 2?
4. What's the fastest way to confirm or rule it out?
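Both CoT variants — the generic trigger and the explicit step list — can be applied programmatically when the same reasoning structure is reused across many queries. A minimal sketch (helper name is illustrative):

```python
def with_cot(query, steps=None):
    """Wrap a raw query with a chain-of-thought instruction.

    With no steps, appends the generic trigger phrase; with an explicit
    list of steps, appends a numbered reasoning scaffold instead.
    """
    if steps:
        numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
        return f"{query}\n\nReason through this step by step:\n{numbered}"
    return f"{query}\n\nLet's think through this step by step."

prompt = with_cot(
    "About 1 in 50 requests to the service returns a 503. Logs show no application errors.",
    steps=[
        "What are the most likely infrastructure causes?",
        "What would you check first and why?",
        "What's your working hypothesis after step 2?",
        "What's the fastest way to confirm or rule it out?",
    ],
)
```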

Few-Shot Prompting

Few-shot prompting provides the model with two to five examples of the desired input-output pair before presenting the actual task. This is one of the most reliable techniques for enforcing style, format and tone — because it shows the model what "correct" looks like in context, rather than describing it abstractly.

It works especially well for classification, structured extraction and stylistically consistent content generation. The model infers the pattern from the examples and applies it to the new input.

Few-Shot — commit message generation:

Generate a conventional commit message for the following diffs.

Examples:

Diff: Added rate limiting to /api/auth endpoint
Commit: feat(auth): add rate limiting to authentication endpoint

Diff: Fixed null pointer exception when user has no profile
Commit: fix(profile): handle null profile in user lookup

Diff: Removed unused imports from utils.py
Commit: chore(utils): remove unused imports

Now write the commit message for:
Diff: Updated Stripe webhook handler to support subscription cancellation events
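The few-shot structure is regular enough to generate from a list of example pairs, which keeps the examples versioned in one place rather than pasted into each prompt by hand. A sketch, with hypothetical labels matching the commit-message case:

```python
def few_shot_prompt(instruction, examples, query,
                    input_label="Diff", output_label="Commit"):
    """Build a few-shot prompt: instruction, labelled example pairs, real input."""
    blocks = [instruction, "Examples:"]
    for inp, out in examples:
        blocks.append(f"{input_label}: {inp}\n{output_label}: {out}")
    blocks.append(f"Now write the {output_label.lower()} for:\n{input_label}: {query}")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    "Generate a conventional commit message for the following diffs.",
    [
        ("Added rate limiting to /api/auth endpoint",
         "feat(auth): add rate limiting to authentication endpoint"),
        ("Removed unused imports from utils.py",
         "chore(utils): remove unused imports"),
    ],
    "Updated Stripe webhook handler to support subscription cancellation events",
)
```

Two to five examples is the sweet spot noted above; the builder makes it trivial to swap examples in and out when testing which set best pins down the pattern.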

Role Framing

Assigning the model a specific expert role — "You are a senior Python engineer", "You are a UX researcher", "You are a hostile code reviewer" — shifts the vocabulary, reasoning style and depth of the response in measurable ways. Roles work because they activate a coherent cluster of knowledge and conventions associated with that domain.

The most effective roles are specific and contextual. "You are an expert" does very little. "You are a staff engineer who has reviewed over 500 PRs and has strong opinions about error handling" does substantially more.

Negative Prompting

Explicit constraints on what to exclude are often more powerful than positive instructions. Tell the model what to avoid: "Do not use bullet points", "Avoid jargon", "Do not suggest Redis unless strictly necessary", "Do not pad the response with caveats". Models have strong default tendencies — hedging, over-structuring, using filler — and explicit exclusions override them directly.

Output Format Specification

Specifying the exact output format — JSON, markdown table, numbered list, prose paragraphs, a specific schema — eliminates the model's default formatting choices and produces output that integrates cleanly into downstream systems or workflows without editing. For programmatic use, always ask for JSON and specify the exact structure.

Structured output prompt:

Extract the following information from the support ticket below and return it as JSON only — no explanation, no markdown, no preamble.

Schema:
{
  "issue_type": string,
  "severity": "low" | "medium" | "high" | "critical",
  "affected_component": string,
  "user_impact": string,
  "suggested_team": string
}

Ticket: "Users on the mobile app are unable to complete checkout since this morning's deploy. Cart clears on payment confirmation. Affecting 100% of iOS users, Android unaffected."
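For programmatic use, the schema in the prompt should be mirrored by a validation step on the model's reply — models occasionally wrap JSON in prose or invent enum values, and catching that at the boundary is cheaper than debugging it downstream. A minimal sketch against the ticket schema above (the sample string stands in for a model reply):

```python
import json

SEVERITIES = {"low", "medium", "high", "critical"}
REQUIRED = {"issue_type", "severity", "affected_component",
            "user_impact", "suggested_team"}

def validate_ticket_json(raw):
    """Parse model output and check it against the prompt's schema."""
    data = json.loads(raw)  # raises ValueError if the model added prose
    missing = REQUIRED - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["severity"] not in SEVERITIES:
        raise ValueError(f"invalid severity: {data['severity']!r}")
    return data

sample = (
    '{"issue_type": "checkout failure", "severity": "critical", '
    '"affected_component": "iOS payment flow", '
    '"user_impact": "100% of iOS users cannot complete checkout", '
    '"suggested_team": "mobile"}'
)
ticket = validate_ticket_json(sample)
```

Pairing every structured-output prompt with a validator like this also gives you a natural retry hook: on failure, re-prompt with the error message included.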
Advanced Patterns — 40+ Documented

Beyond the core techniques, a library of more specific patterns covers the full range of professional use cases. Below is a selection from the 40+ patterns documented across this research — grouped by the mechanism they use to improve output quality.

  • 🪞 Self-Critique: ask the model to critique its own output before finalising. "Review your answer above. List any weaknesses, then rewrite it."
  • 🌡 Temperature Control: specify the creativity level explicitly when the default is wrong. "Be conservative and literal. Avoid novel interpretations."
  • 🎯 Persona Targeting: define the reader, not just the writer. "Write this for a non-technical CEO, not an engineer."
  • Steelman Prompting: ask for the strongest version of a position you want to challenge. "Present the best possible argument against my approach."
  • 📐 Constraint Injection: add hard limits that force focus and brevity. "Maximum 5 bullet points, each under 12 words."
  • 🔬 Decomposition: break complex tasks into sequential sub-prompts. "First outline only. Stop. I'll ask for the draft next."
  • 🪤 Assumption Surfacing: force the model to state its assumptions before answering. "List every assumption you're making, then answer."
  • 🔄 Iterative Refinement: treat the first output as a spec, not a deliverable. "This is a first draft. Here's what needs to change: [...]"
  • 🧪 Adversarial Testing: use the model to find flaws in your own work. "You are a hostile reviewer. Find every weakness in this."

The most powerful prompting pattern is also the simplest: tell the model exactly what a perfect response looks like, then ask it to produce one. Most people describe the problem. The best prompters describe the solution format.

Framework Coverage — 6 Task Categories

The prompting framework developed through this research covers six major task categories, each with its own set of tested patterns. Different categories require different techniques — what works for code generation is counterproductive for creative writing, and vice versa.

  • Code Generation (8 patterns): spec → scaffold, TDD-first, refactor
  • Content Production (7 patterns): tone control, audience targeting, SEO
  • Data Analysis (6 patterns): table reasoning, comparison, extraction
  • Document Summarisation (7 patterns): compression, key-point isolation
  • Decision Support (6 patterns): pros/cons, risk framing, scenario trees
  • Structured Output (6 patterns): JSON, tables, ranked lists, schemas

Common mistakes per category

  • Code generation: Not specifying the language version, runtime environment, or error handling expectations. The model defaults to the most common pattern, not necessarily the right one for your stack.
  • Content production: Skipping audience and tone definition. "Write a blog post" without these produces content indistinguishable from every other AI-written post.
  • Data analysis: Passing raw data without specifying the analytical goal. The model will describe the data, not analyse it, unless you define what decision or insight you're working toward.
  • Summarisation: Not specifying compression ratio or output format. A "summary" could be three sentences or three pages — make the target explicit.
  • Decision support: Asking for a recommendation without providing the constraints that make it actionable. The model will give a generic framework, not a decision.
  • Structured output: Not validating the schema in the prompt itself. Include an example of the expected JSON structure directly in the prompt — don't rely on the model inferring it from a description.
Applying the Framework to Real Workflows

Client workflow 1 — engineering team

An engineering team was using AI for code review but getting inconsistent, shallow feedback. The problem: prompts like "review this code" left all critical parameters undefined. What language conventions? What's the risk tolerance? What's the focus — security, performance, readability?

After implementing a structured code review prompt template — with explicit role, language version, company conventions, severity classification and output schema — the feedback quality improved dramatically. Reviews became consistent, actionable and matched the team's actual standards.

Code review prompt template:

You are a staff Python engineer conducting a code review. Our stack uses Python 3.11+, FastAPI, SQLAlchemy (async), and Pydantic v2.

Review priorities (in order):
1. Security vulnerabilities (especially injection, auth bypass, data exposure)
2. Correctness — logic errors, edge cases, missing validations
3. Performance — N+1 queries, blocking calls in async context
4. Readability — naming, structure, complexity

For each issue found, return:
- Severity: critical / high / medium / low
- Location: line reference
- Issue: one sentence
- Fix: concrete code suggestion

Do not comment on style preferences or things that work correctly.

[CODE BELOW]

Client workflow 2 — content team

A content team was producing AI-assisted articles that required complete rewrites to match brand voice. Root cause: they were prompting for content without defining the voice, then editing the output — a slow and inconsistent process.

The fix was a two-stage approach: first, a style extraction prompt that analysed three high-performing existing articles and codified the brand's voice into a reusable style guide. Second, that style guide was included in every content generation prompt as a "voice context" block. First-draft quality improved enough that editing time dropped substantially.

Common Trap

Prompting frameworks only work if they're maintained and versioned like code. A prompt that works perfectly today may produce degraded output after a model update. Treat your prompt library as a living document — test it regularly against known-good outputs and update it when behaviour changes.
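One lightweight way to make "test it regularly against known-good outputs" concrete: since model output varies run to run, assert on required and forbidden markers rather than exact strings. A sketch (function name and markers are illustrative, not part of the framework):

```python
def check_prompt_output(output, must_include, must_exclude=()):
    """Regression check for a versioned prompt.

    Returns a list of failed expectations: markers that should appear in
    the output but don't, and markers that should never appear but do.
    An empty list means the prompt still behaves after a model update.
    """
    failures = [m for m in must_include if m.lower() not in output.lower()]
    failures += [f"excluded marker present: {m}"
                 for m in must_exclude if m.lower() in output.lower()]
    return failures

# Example: a code-review prompt must still emit severity/fix fields
# and must never pad the response with assistant boilerplate.
issues = check_prompt_output(
    "Severity: high\nLocation: line 42\nFix: use ON CONFLICT DO NOTHING",
    must_include=["severity", "location", "fix"],
    must_exclude=["as an ai language model"],
)
```

Run checks like this against each prompt in the library on a schedule, and a model update that degrades output shows up as a failing check rather than a confused user.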

What the Research Produced

The research and applied work across this project produced a structured, reusable prompting framework — not a list of tips, but a systematic approach to decomposing any AI task into its constituent prompt components and selecting the right techniques for each.

  • 40+ prompt patterns documented across six task categories, each tested against measurable output quality benchmarks.
  • Three client workflows optimised — engineering, content and data analysis — with measurable reduction in prompt-to-usable-output iterations.
  • A prompt anatomy checklist (Role, Context, Task, Constraints, Format) adopted as a team onboarding tool for AI adoption.
  • Before/after prompt libraries for each task category, giving teams concrete starting points instead of blank prompts.
  • A prompt versioning practice for maintaining prompt quality across model updates.

Key Takeaway

Prompt engineering is not a permanent skill gap — it's a gap that closes quickly with deliberate practice. The ROI of investing a few hours in learning the core patterns is compounding: every AI-assisted task becomes faster and more reliable. Teams that treat prompting as infrastructure rather than improvisation get dramatically more value from the same models.
