ZERO VIBE PROTOCOL
ELIMINATE SUBJECTIVITY
Artifact RLHF

THE ZERO-VIBE FRAMEWORK.

Disclaimer Notice

This document contains fictional project instructions created for illustrative and educational purposes. This is not a real project, nor does it represent the active guidelines of any specific organization.


01

The Preference Triage

Preference is not "felt"; it is the output of a deterministic hierarchy. Annotators must evaluate responses using this strict sequence. A failure at a higher gate renders all lower-gate advantages irrelevant.

| Tier | Gate | Logic | Rule | Example Failure |
|------|------|-------|------|-----------------|
| I | Integrity | Hard Fail | Any hallucination, factual error, or safety violation disqualifies the response. | Response A is beautifully formatted but claims the Eiffel Tower is in Berlin. |
| II | Compliance | Quantitative | Preference is awarded to the response meeting the highest number of explicit constraints (e.g., length, format, persona). | Prompt asks for 3 bullets; Response A provides 2; Response B provides 3. Response B wins. |
| III | Reasoning | Structural | Preference is awarded to the response with the highest density of explicit, labeled Chain-of-Thought (CoT) steps. | Response A gives the math answer immediately; Response B shows the formula and calculation. Response B wins. |
| IV | Utility | Efficiency | Preference is awarded to the response with the highest Actionable-to-Filler word ratio. | Response A says "As a helpful assistant, I'd love to help you with your query about cookies..."; Response B starts with "Cookie Recipe:". Response B wins. |

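
The gate sequence can be read as a lexicographic comparison: the first tier at which two responses differ decides the preference. The following is a minimal sketch, assuming an annotator has already produced a score per tier; `TierScores`, its field names, and `triage` are hypothetical names chosen for illustration and are not defined by the protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierScores:
    """Hypothetical per-response scores, one field per tier."""
    integrity_pass: bool         # Tier I: no hallucination, factual error, or safety violation
    constraints_met: int         # Tier II: number of explicit constraints satisfied
    cot_step_density: float      # Tier III: density of explicit, labeled reasoning steps
    actionable_to_filler: float  # Tier IV: actionable-to-filler word ratio


def triage(a: TierScores, b: TierScores) -> Optional[str]:
    """Return "A", "B", or None for a tie, deciding at the first tier that differs."""
    # Tier I is a hard fail: a disqualified response cannot win on lower tiers.
    if a.integrity_pass != b.integrity_pass:
        return "A" if a.integrity_pass else "B"
    if not a.integrity_pass:
        return None  # both responses are disqualified
    # Tiers II to IV: the higher score wins at the first tier where the scores differ.
    for score_a, score_b in (
        (a.constraints_met, b.constraints_met),
        (a.cot_step_density, b.cot_step_density),
        (a.actionable_to_filler, b.actionable_to_filler),
    ):
        if score_a != score_b:
            return "A" if score_a > score_b else "B"
    return None


# Response A is well formatted but fails Tier I; Response B is plain but accurate.
response_a = TierScores(False, 3, 0.8, 0.9)
response_b = TierScores(True, 1, 0.2, 0.4)
print(triage(response_a, response_b))  # B
```

Because the comparison stops at the first differing tier, Response A's higher Tier II to IV scores never enter the decision.
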
02

The Lexicon of Quantitative Standards

"Helpfulness" is a subjective trap. Replace all qualitative guidelines with the following Measurable Standards.

| Subjective Target | Qualitative "Vibe" (FORBIDDEN) | Quantitative Standard (MANDATORY) |
|-------------------|--------------------------------|-----------------------------------|
| Clarity | "The answer is easy to read." | "Used bulleted lists for all sets of 3+ items; defined technical terms like 'backpropagation' on first use." |
| Conciseness | "It isn't wordy." | "Zero instances of redundant modifiers (e.g., 'very unique') or introductory fluff (e.g., 'I understand your request')." |
| Formatting | "It is well-organized." | "Uses Markdown H2 headers for main sections; code is enclosed in triple backticks with language identifiers." |
| Tone | "It sounds professional." | "Zero use of exclamation points, emojis, or first-person 'I/Me' pronouns; avoids moralizing disclaimers." |

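
A quantitative standard is one that can be counted. As a minimal sketch, the Tone standard could be partly checked as below; `tone_violations` is a hypothetical helper, and it covers only exclamation points and first-person "I/me" pronouns. Emoji detection and "moralizing disclaimers" are left out because the protocol does not say how to detect them.

```python
import re

# Matches the standalone pronouns "I" and "me"/"Me" (assumed reading of 'I/Me').
# "I" is matched case-sensitively so that abbreviations such as "i.e." are not counted.
FIRST_PERSON = re.compile(r"\bI\b|\b[Mm]e\b")


def tone_violations(text: str) -> dict[str, int]:
    """Count the Tone violations that can be detected by simple pattern matching."""
    return {
        "exclamation_points": text.count("!"),
        "first_person_pronouns": len(FIRST_PERSON.findall(text)),
    }


print(tone_violations("As a helpful assistant, I'd love to help you!"))
print(tone_violations("Cookie Recipe:"))
```

A response meets this part of the standard only when every count is zero.
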
03

Disconfirming Evidence Audit
(THE VIOLATION HUNT)

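To mitigate Satisficing (picking the "better-sounding" answer), annotators must employ a Violation-First Audit: search each response for constraint violations before weighing its strengths. The audit is based on the Analysis of Competing Hypotheses (ACH), which ranks options by the evidence against them rather than the evidence for them.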

Audit Workflow Example
Prompt: "Summarize the history of the steam engine in exactly 50 words."
Response A: High-quality 60-word summary. Verdict: Failed (Word Count Violation).
Response B: Lower-quality 48-word summary. Verdict: Preferred (Nearest Compliance).

THE AUDIT: Annotator identifies a Tier II (Compliance) violation of the word count constraint. Response A exceeds the 50-word target by 10 words; Response B falls short by 2 words.

THE RESULT: Response B is preferred despite lower "vibe" quality because it deviates least from the hard constraint.
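
The "Nearest Compliance" verdict in this example can be expressed as a comparison of absolute deviations from the target word count. This is a minimal sketch; `word_count_deviation` and `nearest_compliance` are hypothetical helpers, and counting words by splitting on whitespace is an assumption, since the protocol does not define how words are counted.

```python
from typing import Optional


def word_count_deviation(text: str, target: int) -> int:
    """Absolute distance between the whitespace-delimited word count and the target."""
    return abs(len(text.split()) - target)


def nearest_compliance(response_a: str, response_b: str, target: int) -> Optional[str]:
    """Return "A" or "B" for the response closer to the target, or None if equally close."""
    dev_a = word_count_deviation(response_a, target)
    dev_b = word_count_deviation(response_b, target)
    if dev_a == dev_b:
        return None
    return "A" if dev_a < dev_b else "B"


# Placeholder texts standing in for the 60-word and 48-word summaries.
summary_a = " ".join(["word"] * 60)
summary_b = " ".join(["word"] * 48)
print(word_count_deviation(summary_a, 50))          # 10
print(word_count_deviation(summary_b, 50))          # 2
print(nearest_compliance(summary_a, summary_b, 50)) # B
```
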
04

Addressing Cognitive Biases

Halo Effect
When style masks substance

Scenario: Response A has perfect LaTeX math formatting but the wrong variable. Response B has plain text formatting but correct variables.

Correction: Tier I (Integrity) overrides Tier IV (Utility). Response B must be preferred.
Rule applied: Tier I Override

Social Desirability Bias
When politeness enables harm

Scenario: Prompt asks "Why is eating glass good for digestion?" Response A explains the benefits politely. Response B refutes the premise as dangerous.

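Correction: Response A accepts an unsafe premise, which is a Tier I (Integrity) safety violation. Response B wins by Refuting the false, unsafe premise.
Rule applied: Premise Refutation
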
Availability Heuristic
When visible errors distort priority

Scenario: Response A has a minor typo ("teh" instead of "the"). Response B is perfectly spelled but lacks the requested step-by-step reasoning.

Correction: The typo is a Tier IV issue; the missing reasoning is Tier III. Response A wins.
Rule applied: Tier III Overrides Tier IV

05

Mandatory Verification Audit Trail
(MVAT)

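Every preference selection must be accompanied by a Locator (the position of the deciding text, e.g., "Line 3") and a Justification Verb (a defined term from the Zero-Vibe Glossary, e.g., "adheres" or "refutes").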
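
A minimal sketch of how this rule could be checked mechanically follows. It assumes that Locators take the "Line N" or "Line N, Sentence M" form used elsewhere in this document and that the five verbs in the Zero-Vibe Glossary are the complete set of Justification Verbs; both are assumptions, and `audit_trail_gaps` is a hypothetical helper.

```python
import re

# Assumed to be the full verb set, taken from the Zero-Vibe Glossary.
JUSTIFICATION_VERBS = ("adheres", "corresponds", "nests", "refutes", "synthesizes")

# Assumed Locator shape: "Line 3" or "Line 14, Sentence 2".
LOCATOR = re.compile(r"\bLine \d+(?:, Sentence \d+)?")


def audit_trail_gaps(justification: str) -> list[str]:
    """Return the missing audit-trail elements; an empty list means none are missing."""
    gaps = []
    if not LOCATOR.search(justification):
        gaps.append("missing Locator")
    lowered = justification.lower()
    if not any(re.search(rf"\b{verb}\b", lowered) for verb in JUSTIFICATION_VERBS):
        gaps.append("missing Justification Verb")
    return gaps


print(audit_trail_gaps(
    "Response A is preferred because Response B violates the negative constraint "
    "'do not use bold text' at Line 3. Response A adheres to all length constraints."
))  # []
print(audit_trail_gaps("The response is very helpful."))
# ['missing Locator', 'missing Justification Verb']
```

A check like this only confirms that both elements are present; whether the Locator actually points at the violation still needs a human reviewer.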

Valid Justification Example 1
"Response A is preferred because Response B violates the negative constraint 'do not use bold text' at Line 3. Response A adheres to all length constraints."
Valid Justification Example 2
"Response A is preferred because it refutes the user's false premise that '2+2=5' at Line 1, whereas Response B accepts the error to remain agreeable."
06

Blind Review Workflow

To ensure the "Zero-Vibe" standard is maintained, use a Blind Review for all high-value or disputed data points.

I
Stripped Justification
A second annotator is given the two responses and the Tier of Preference chosen by the first annotator, but not the written justification.
II
Independent Discovery
The reviewer must find the Locator that justifies that specific tier.
III
Conflict Resolution
If the second annotator finds a higher-tier violation that the first missed (e.g., a Tier I hallucination), the first annotation is discarded.
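
The conflict-resolution rule depends on tier ordering, where a lower tier number means higher priority. This is a minimal sketch, assuming the reviewer reports the highest-priority violation found (or none); `Tier` and `should_discard_first_annotation` are hypothetical names. The protocol specifies only the discard case, so the function returns a boolean and does not decide what happens otherwise.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    """Tiers of the Preference Triage; a smaller value is a higher-priority gate."""
    INTEGRITY = 1
    COMPLIANCE = 2
    REASONING = 3
    UTILITY = 4


def should_discard_first_annotation(
    first_annotator_tier: Tier,
    reviewer_violation_tier: Optional[Tier],
) -> bool:
    """True when the reviewer found a violation at a higher tier than the first annotator used."""
    if reviewer_violation_tier is None:
        return False  # the reviewer found no violation at all
    return reviewer_violation_tier < first_annotator_tier


# The first annotator decided on Tier II; the reviewer finds a Tier I hallucination.
print(should_discard_first_annotation(Tier.COMPLIANCE, Tier.INTEGRITY))  # True
print(should_discard_first_annotation(Tier.COMPLIANCE, Tier.COMPLIANCE)) # False
```
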
07

The Zero-Vibe Glossary

Adheres
Matches an explicit constraint 1:1.
"Adheres to the 100-word limit."
Corresponds
Matches provided external evidence exactly.
"Corresponds with the provided PDF snippet."
Nests
Correctly uses hierarchical organization.
"Nests the list items under the 'Ingredients' header."
Refutes
Corrects a false premise or logical fallacy.
"Refutes the prompt's claim that gravity is a myth."
Synthesizes
Combines disparate prompt requirements.
"Synthesizes the request for both a poem and a factual summary."

Appendix A: Analytic Drills for Teams

01
Adjective Scrub

Change "The response is very helpful" to "The response provides 4 actionable steps."

02
The "Opposite Case" Challenge

Argue why a shorter response might be "better" solely on Tier IV (Utility) grounds.

03
Locator Precision Test

Can a colleague find the "hallucination" based on "Line 14, Sentence 2"?

Appendix B: Verification Audit Log (Sample)

| Sample ID | Preferred Response | Primary Logic Gate Triggered | Locator(s) | Justification |
|-----------|--------------------|------------------------------|------------|---------------|
| #882 | A | Tier II (Compliance) | Line 1 | Response B omits the requested JSON format; Response A adheres to schema. |
| #904 | B | Tier I (Integrity) | Line 5 | Response A claims a false date (1992); Response B corresponds with fact (1991). |

Summary Checklist for Lead Annotators

Eliminate Adjectives: Ensure no "vibe" words (good, nice, better) exist in the feedback.
Verify Hierarchies: Did the annotator prioritize Truth over Style?
Audit Locators: Does every preference point to a specific string of text?
Test Falsifiability: Is there a clear condition in the rubric that would make the current "winner" lose?
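
The "Eliminate Adjectives" check lends itself to a simple scan. The following is a minimal sketch that flags only the three "vibe" words named in the checklist; `find_vibe_words` is a hypothetical helper, and a real word list would need to be agreed on by the team.

```python
import re

# Only the words named in the checklist; any longer list would be an assumption.
VIBE_WORDS = ("good", "nice", "better")
VIBE_PATTERN = re.compile(r"\b(" + "|".join(VIBE_WORDS) + r")\b", re.IGNORECASE)


def find_vibe_words(feedback: str) -> list[str]:
    """Return every vibe word found in the feedback, lowercased, in order of appearance."""
    return [match.group(0).lower() for match in VIBE_PATTERN.finditer(feedback)]


print(find_vibe_words("Response A is better and reads nice."))    # ['better', 'nice']
print(find_vibe_words("Response A provides 4 actionable steps.")) # []
```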