Day 23 Check-in:
Prompt evaluation is a lot like the AI interface I mentioned earlier: both can also just be done by real people 😂
Code graders - Programmatically evaluate outputs using custom logic (see the sketch after this list)
Model graders - Use another AI model to assess output quality
Human graders - Have people manually review and score outputs
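To make the first two grader types concrete, here is a minimal Python sketch. The keyword/JSON check is a code grader; the model grader hands scoring to another model through a hypothetical `call_model(prompt)` helper (your real call would go through whatever SDK you use), so treat it as an illustration rather than a working client.

```python
import json

def code_grader(output: str) -> float:
    """Code grader: score the output with plain logic.
    Here we check that the model returned valid JSON containing a 'summary' key."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return 0.0
    return 1.0 if "summary" in data else 0.5

GRADING_PROMPT = """You are grading a model's answer.
Question: {question}
Answer: {answer}
Reply with a single number from 1 to 5 for overall quality."""

def model_grader(question: str, answer: str, call_model) -> float:
    """Model grader: ask another model to rate the answer.
    `call_model` is a hypothetical function that sends a prompt to an LLM
    and returns its text response."""
    reply = call_model(GRADING_PROMPT.format(question=question, answer=answer))
    try:
        return float(reply.strip())
    except ValueError:
        return 0.0

# Human graders would review the same question/answer pairs by hand,
# so there's nothing to automate for that case.
```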
If it's not a planning task, the prompt should be clear and direct: avoid phrasing requests as questions and provide as much relevant information as possible.
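As a quick illustration (my own example, not from the course), compare a vague question with a direct instruction:

```python
# Interrogative and vague: the model has to guess what you actually want.
vague_prompt = "Could you take a look at this support ticket and see what's going on?"

# Direct and specific: states the task, the labels, and the output format up front.
direct_prompt = (
    "Classify the support ticket below as 'billing', 'bug', or 'other'. "
    "Reply with the label only, no explanation.\n\n"
    "Ticket: {ticket_text}"
)
```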
Putting explicit guidelines in the prompt can significantly improve output quality.
Claude recommends using XML tags, but I generally use Markdown.
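Here's the same prompt skeleton in both styles, XML tags as Claude's docs suggest and the Markdown headings I usually reach for. The section names are just placeholders I made up.

```python
xml_style = """<instructions>
Summarize the document in three bullet points.
</instructions>
<document>
{document}
</document>"""

markdown_style = """## Instructions
Summarize the document in three bullet points.

## Document
{document}"""
```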
If you're building a product, setting up a prompt testing pipeline is essential.
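The pipeline doesn't have to be fancy. Here is a rough sketch of the loop I have in mind: run every test case through the prompt, grade each output, and track the average score so regressions show up when you edit the prompt. `run_prompt` and the grader are hypothetical stand-ins for your actual model call and scoring logic.

```python
from statistics import mean

# Hypothetical test set: inputs paired with the label we expect back.
TEST_CASES = [
    {"input": "Refund for order #1234?", "expected_label": "billing"},
    {"input": "App crashes when I open settings", "expected_label": "bug"},
]

def evaluate(run_prompt, grade) -> float:
    """Run every test case and return the average score.

    run_prompt(input_text) -> model output (your model call)
    grade(output, case)    -> score between 0 and 1 (code, model, or human grader)
    """
    scores = []
    for case in TEST_CASES:
        output = run_prompt(case["input"])
        scores.append(grade(output, case))
    return mean(scores)

# Example grader: a simple code grader that checks the predicted label.
def label_grader(output: str, case: dict) -> float:
    return 1.0 if case["expected_label"] in output.lower() else 0.0
```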