PromptWright — Build & Test AI Prompts

# 12 Common Prompt Engineering Mistakes (And How to Fix Them)

Everyone makes mistakes when learning prompt engineering. The good news is that most mistakes fall into a small set of patterns, and once you recognize them, fixes are straightforward. This guide walks through the 12 most common prompt engineering mistakes, why they happen, and exactly how to fix each one. If your AI output isn't where you want it, scan this list — you'll likely find at least one problem you recognize.

## 1. Being Too Vague

### The Mistake

Vague prompts produce vague output. "Write about marketing" could mean anything: a blog post, a strategic plan, a tweet, a research paper. The model guesses, and often guesses wrong.

### Example

```
Write about marketing for small businesses.
```

### Why It Happens

Writers tend to assume the model will infer intent from context. It won't — or at least, not reliably.

### The Fix

Specify everything:

- **Topic**: What exactly?
- **Format**: Blog post? Email? Outline?
- **Length**: Word count or paragraph count.
- **Audience**: Who's reading?
- **Tone**: Formal? Casual? Authoritative?
- **Goal**: What should the reader do or learn?

```
Write a 1000-word blog post about email marketing
for small business owners (5-50 employees) in the
service industry. Audience: owners who are not
marketers themselves. Tone: practical, plain language,
no jargon. Goal: reader understands 3 actionable tactics
they can implement this week.
```

## 2. No Output Format Specified

### The Mistake

You want a table, but you get a paragraph. You want bullet points, but you get prose. Without format specification, the model picks its own.

### Example

```
Compare these three project management tools.
```

### Why It Happens

Many people assume the AI will infer the best format from context. It won't — it picks a default format that may not fit your need.

### The Fix

Specify the output format explicitly, with examples if helpful.

```
Compare these three project management tools.

Output as a markdown table with columns:
Tool | Best For | Pricing | Key Feature | Limitation

After the table, add a 2-sentence summary paragraph
recommending which tool is best for a small remote team.
```

## 3. Overloading a Single Prompt

### The Mistake

One prompt tries to: define a topic, generate examples, create a structure, write the content, format it, and summarize. The result is mediocre on every dimension.

### Example

```
Explain machine learning, give 5 real-world examples,
create a syllabus for a 10-week course, write the first
lecture, and create a quiz. Also format the lecture as
HTML and include a summary at the start. Make it engaging.
```

### Why It Happens

It feels efficient to ask for everything at once. But model quality drops with multi-part prompts.

### The Fix

Break it into steps. Run each step as a separate prompt:

```
Step 1: Explain machine learning for a non-technical
audience in 200 words with 3 real-world examples.

Step 2: Based on that explanation, create a 10-week
introductory course syllabus. Use H2 for each
week with 3 bullet points of topics covered.

Step 3: Write the first lecture (1000 words) based on
week 1 of the syllabus. Make it engaging with a
concrete opening example.

Step 4: Create a 10-question quiz for the first lecture.
```

Each prompt focuses on one task. The output quality improves because the model isn't trying to hold multiple complex requirements in mind.

## 4. Not Using Role Definition

### The Mistake

The model writes from a generic perspective, producing output that lacks depth or specialized vocabulary.

### Example

```
Write an article about stockoptions for startup employees.
```

### Why It Happens

The model defaults to a generic voice unless told otherwise.

### The Fix

Assign a role:

```
You are a senior startup compensation attorney with 15
years of experience advising employees on equity. You
explain complex topics in plain language without losing
accuracy. You've seen every common mistake employees
make with stock options.

Write a 1200-word article for startup employees who
have been offered stock options for the first time.
Cover the basics, the common pitfalls, and three
questions they should ask before signing.
```

The role primes the model with relevant expertise, vocabulary, and perspective. Output becomes more authoritative and specific.

## 5. Forgetting Constraints

### The Mistake

Without constraints, the model may produce output that's too long, uses unwanted phrases, or covers irrelevant topics.

### Example

```
Write a tweet about our new feature.
```

### The Fix

Add constraints:

```
Write a tweet about our new feature announcement.

Constraints:
- Maximum 280 characters including spaces
- Use a maximum of 2 emojis
- Do not use hashtags (we'll add them manually)
- Do not use the words "excited," "thrilled," or
"game-changing"
- Include a clear call-to-action (a link placeholder is fine)
- Use active voice
```

Constraints are the difference between good and great prompts. They reduce variability and enforce consistency.

## 6. Not Providing Examples

### The Mistake

You describe what you want, but the model's interpretation is different from yours.

### Example

```
Write a product description in a punchy, modern style.
```

The model has its own idea of "punchy and modern." Yours is different.

### The Fix

Show an example of the desired output:

```
Write a product description for our new [product] in
the style of this example:

Example product description (for a different product):

"Lid stays shut. Ice stays cold. Drinks stay upright
in your bag. The cooler that doesn't compromise."

Key stylistic elements of this example:
- Short sentences (under 8 words)
- No adjectives
- Focus on what stays/does (functional)
- Rhythmic, repetitive structure

Apply this style to:
Product: [Your product name and brief description]
```

Showing one example is often more effective than lengthy descriptions of what you want.

## 7. Ignoring the Temperature Setting

### The Mistake

You use the model's default temperature for all tasks, getting wrong results for both creative and factual work.

### Why It Happens

Most people don't think about temperature. They use whatever default the tool sets.

### The Fix

Match temperature to the task:

- **Factual or analytical tasks** (classification, extraction, code generation): temperature 0-0.3. Low variability, more deterministic output.
- **General writing tasks** (articles, emails): temperature 0.4-0.7. Balanced.
- **Creative tasks** (brainstorming, fiction, ideation): temperature 0.8-1.2. More varied, more creative.

If you're getting inconsistent results from the same prompt, lower the temperature. If you're getting boring, repetitive results, raise it.

## 8. Not Testing with Edge Cases

### The Mistake

You test your prompt with 3 typical inputs. It works. You deploy it. Then real users with weird, edge-case inputs get bad results.

### Example Test Set

You test a summarization prompt with three 500-word articles. All summaries are great. Then a user pastes a 10,000-word document, and the summary is incomplete. An empty input crashes the prompt. A two-sentence input produces an awkward summary.

### The Fix

Always test with edge cases:

- **Empty input**: What happens if no text is provided?
- **Very short input**: 1-2 sentences.
- **Very long input**: 10x your typical input length.
- **Non-text input**: Numbers, code, symbols only.
- **Mixed language input**: English with embedded other-language text.
- **Special characters**: HTML tags, markdown, emoji.
- **Adversarial input**: Text that includes "ignore previous instructions."

Test these before deploying. Add explicit instructions in the prompt to handle each edge case.

## 9. Trusting AI Facts Without Verification

### The Mistake

You run a prompt that asks for statistics or citations. The model produces confident-sounding answers. You use them. They're wrong.

### Why It Happens

LLMs are trained to produce plausible-sounding text, not factually accurate text. They can hallucinate statistics, names, dates, and citations that lookreal but don't exist.

### The Fix

Two strategies:

**Strategy 1: Mark unknowns as placeholders.**

```
Write a blog post about [topic]. For statistics and
data points, mark any specific numbers as [STAT NEEDED]
so I can verify and replace them. Do not invent statistics.
Only use numbers I explicitly provide in the input.
```

**Strategy 2: Ask for sources, then verify.**

```
Answer this question with citations. For each claim:
1. State the claim
2. Cite the source (author, year, publication)
3. Flag any claim you're not confident is verifiable

Note: I will independently verify every citation. If
you are uncertain about a citation, say so.
```

Even with these strategies, verify facts independently. AI hallucinations are common enough that blind trust causes problems.

## 10. Not Using Chain-of-Thought for Reasoning

### The Mistake

For tasks that require multi-step reasoning, you ask for the answer directly. The model gets the wrong answer because it jumped to a conclusion.

### Example

```
If I buy 3 widgets at $12 each and 2 gadgets at $8 each,
then apply a 15% discount, then add 8% tax, what's my total?
```

The model may give a wrong answer because it didn't work through the steps.

### The Fix

Ask the model to think step by step:

```
If I buy 3 widgets at $12 each and 2 gadgets at $8 each,
then apply a 15% discount on the total, then add 8% tax
on the discounted amount, what's my final total?

Think through this step by step:
1. Calculate the widget cost
2. Calculate the gadget cost
3. Sum the pre-discount total
4. Apply the 15% discount
5. Calculate the tax on the discounted total
6. Add tax for the final total

Show your work for each step. End with the final answer.
```

Chain-of-thought prompting dramatically improves accuracy for multi-step reasoning tasks. Use it whenever the answer requires intermediate calculations or logical steps.

## 11. Modifying the Prompt for Every Input

### The Mistake

You write a prompt, get good output, use it once. Next time, you start from scratch and write a new prompt for a similar task. You never build a reusable template.

### Why It Happens

Without variables or a prompt template system, reusing prompts requires remembering and re-typing them.

### The Fix

Use prompt templates with variables. Whenever you find yourself writing a similar prompt more than once, convert it to a reusable template.

Before (one-off):

```
Summarize this article about climate policy in 200 words
for a general audience. Use plain language.
```

After (reusable template):

```
Summarize the following article in [LENGTH] words for
[AUDIENCE]. Use [TONE] tone. Focus on the key arguments,
not background or examples.

Article:
[ARTICLE_TEXT]
```

Now you can reuse this prompt for any article, any length, any audience. Tools like [PromptWright](https://promptwright.net/signup) let you define templates with variables visually and fill them in each time.

## 12. Not Iterating Based on Results

### The Mistake

You write a prompt, get acceptable output, and stop. You don't try variations or measure whether modifications would improve results.

### Why It Happens

Iteration feels like extra work when the current prompt is "good enough." But "good enough" can almost always become "significantly better" with a few iterations.

### The Fix

Commit to one round of iteration on every important prompt:

1. **Write version 1.** Run it.
2. **Identify the weakest part of the output.** Is it too long? Vague? Wrong format?
3. **Adjust the prompt to fix that weakness.** Save as version 2.
4. **Run version 2 with the same input.** Compare to version 1.
5. **If v2 is better, keep iterating.** If v2 is worse, revert and try a different change.

Make sure you're running both versions with the same input so the comparison is fair. Testing variations against the same test data is the only way to know what actually improves.

### Iteration Template

```
Prompt v1:
[Your original prompt]

Output v1 (with test input):
[The output you got]

Issues with v1 output:
1. Too long
2. Generic
3. Missing examples

Prompt v2 (revised):
[Adjusted prompt that fixes issue 1, 2, or 3]

Output v2 (same test input):
[New output]

Improvements: [list]
Regressions: [list]
Keep v2 or revert to v1: [decision]
```

## Bonus Tips

### Don't Copy Other People's Prompts Blindly

Templates from articles like this are starting points, not final prompts. Always adapt them to your specific context and test with your data.

### Watch for Drift Over Conversations

In a long conversation, models can lose track of early instructions. If output quality drops after many turns, start a new conversation with the key instructions restated.

### Use System Prompts for Consistent Rules

If you have rules that apply to every prompt (brand voice, prohibited words, format requirements), put them in the system prompt rather than repeating them in each user prompt.

### Build a Prompt Library

A document or tool with your tested, refined prompts. You'll be amazed how often you reuse the same patterns once you start noticing them.

## How to Diagnose a Failing Prompt

When a prompt isn't producing good output, walk through this diagnostic checklist:

1. **Is the issue being too vague?** Review Mistake 1.
2. **Is the format wrong?** Review Mistake 2.
3. **Is the output shallow on each part?** Review Mistake 3.
4. **Would a role help?** Review Mistake 4.
5. **Are there missing constraints?** Review Mistake 5.
6. **Should I add an example?** Review Mistake 6.
7. **Is the temperature wrong for the task?** Review Mistake 7.
8. **Did I test edge cases?** Review Mistake 8.
9. **Am I trusting AI facts I shouldn't?** Review Mistake 9.
10. **Would chain-of-thought help?** Review Mistake 10.
11. **Should this be a reusable template?** Review Mistake 11.
12. **Have I iterated at all?** Review Mistake 12.

Almost any failing prompt has at least one of these issues. Fixing even one often produces a noticeable improvement.

## Conclusion

Prompt engineering mistakes are universal, but the fixes are well-understood. The 12 mistakes in this guide cover the vast majority of issues that produce weak AI output. Reviewing your prompts against this list is a fast way to improve results without deep study.

The most impactful fixes: be specific (Mistake 1), specify format (Mistake 2), add examples (Mistake 6), and iterate (Mistake 12). Applying just these four will dramatically improve your prompts. The rest compound over time as you build more sophisticated prompt engineering habits.

To build, test, and iterate on prompts with variables and version history, [try PromptWright free](https://promptwright.net/signup). Iterating on prompts is far easier when your versions are saved and your tests are traceable.

"12 Common Prompt Engineering Mistakes (And How to Fix Them)"

Enjoyed This Article?

Ready to build better prompts?

More Articles

"AI Prompt Tools Compared: Which One Should You Use in 2026?"

"AI Prompt Variables Explained: Build Reusable Prompt Templates"

"AI Prompt Versioning: Track Changes and Improve Results Over Time"