LLM hallucinations, cases where an AI confidently generates false information, are a critical challenge. This guide covers strategies for detecting and preventing them.
Understanding Hallucinations

Types of LLM Hallucinations:

1. Factual errors
   "The Eiffel Tower was built in 1920" ❌ (actually 1889)
2. Fabricated sources
   "According to a 2023 Nature study..." ❌ (the study does not exist)
3. Logical inconsistencies
   "X is true" → later → "X is false" ❌
4. Entity confusion
   "Einstein invented the telephone" ❌ (Bell did)
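For the factual-error and entity-confusion categories, a crude first line of defense is to spot-check model claims against a small trusted reference. This is only a sketch: the `KNOWN_FACTS` table and `spot_check` helper below are hypothetical names invented for illustration, seeded with the examples above.

```python
# Hypothetical mini knowledge base for spot-checking model claims.
KNOWN_FACTS = {
    "eiffel_tower_completed": "1889",
    "telephone_inventor": "Alexander Graham Bell",
}

def spot_check(fact_key: str, model_value: str) -> str:
    """Compare a model's claim against a trusted reference value."""
    expected = KNOWN_FACTS.get(fact_key)
    if expected is None:
        return "unknown"  # no reference fact to compare against
    return "ok" if model_value == expected else "possible hallucination"
```

For example, `spot_check("eiffel_tower_completed", "1920")` returns `"possible hallucination"`. In practice this scales poorly, which is exactly why the retrieval-based strategies below are needed.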
Strategia 1: Retrieval-Augmented Generation (RAG)
Ground answers in real documents:
```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough

# Create a vector store from your documents
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# RAG prompt that discourages hallucination
prompt = ChatPromptTemplate.from_template("""
Answer the question based ONLY on the following context.
If the context doesn't contain the answer, say "I don't have information about that."
Do NOT make up information.

Context:
{context}

Question: {question}

Answer:""")

# Chain with retrieval
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(temperature=0)
)

response = chain.invoke("What is our refund policy?")
```

Verify that the cited content actually exists in the sources:
```python
def verify_rag_response(response, retrieved_docs):
    """Check whether the response content comes from the source documents."""
    doc_content = " ".join(doc.page_content.lower() for doc in retrieved_docs)

    # Extract potential facts from the response
    sentences = response.split('. ')
    verified = []
    unverified = []

    for sentence in sentences:
        # Check whether key phrases appear in the source documents
        words = sentence.lower().split()
        key_phrases = [' '.join(words[i:i + 3]) for i in range(len(words) - 2)]
        if any(phrase in doc_content for phrase in key_phrases):
            verified.append(sentence)
        else:
            unverified.append(sentence)

    total = len(verified) + len(unverified)
    return {
        "verified": verified,
        "unverified": unverified,
        "confidence": len(verified) / total if total else 0.0,
    }
```

Strategy 2: Self-Consistency Checking
Ask the same question multiple times and compare the answers:
```python
import openai

def check_consistency(question, model="gpt-4o", samples=5):
    """Generate multiple answers and check their consistency."""
    client = openai.OpenAI()
    responses = []

    for _ in range(samples):
        response = client.chat.completions.create(
            model=model,
            temperature=0.7,  # a little variation
            messages=[
                {"role": "system", "content": "Answer concisely and factually."},
                {"role": "user", "content": question},
            ],
        )
        responses.append(response.choices[0].message.content)

    # Check whether the answers agree with each other.
    # For factual questions, the answers should be consistent.
    return {
        "responses": responses,
        "unique_answers": len(set(responses)),
        "consistent": len(set(responses)) <= 2,  # allow minor variation
    }

# Usage
result = check_consistency("What year was Python released?")
if not result["consistent"]:
    print("Warning: inconsistent answers detected - possible hallucination")
```

Strategy 3: Confidence Scoring
Ask the model to rate its own confidence:
```python
import re
import openai

def get_answer_with_confidence(question):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """
Answer the question, then rate your confidence on a scale of 1-10.
Format:
Answer: [your answer]
Confidence: [1-10]
Reasoning: [why this confidence level]
If you're not sure, say so. It's better to admit uncertainty than to guess.
"""},
            {"role": "user", "content": question},
        ],
    )
    text = response.choices[0].message.content

    # Parse the confidence score
    confidence_match = re.search(r'Confidence:\s*(\d+)', text)
    confidence = int(confidence_match.group(1)) if confidence_match else 5

    return {
        "full_response": text,
        "confidence": confidence,
        "needs_verification": confidence < 7,
    }

# Usage
result = get_answer_with_confidence("Who invented the transistor?")
if result["needs_verification"]:
    print("Low confidence - verify this information!")
```

Strategy 4: Fact-Checking Pipeline
Cross-reference claims with external sources:
```python
import json
import requests
import openai

def fact_check_claim(claim):
    """Use external APIs to verify claims."""
    # Option 1: Wikipedia search
    wiki_url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": claim,
        "format": "json",
    }
    response = requests.get(wiki_url, params=params)
    results = response.json().get("query", {}).get("search", [])

    # Option 2: use a search API
    # Option 3: check against an internal knowledge base

    return {
        "claim": claim,
        "evidence_found": len(results) > 0,
        "sources": [r["title"] for r in results[:3]],
    }

def validate_response(llm_response):
    """Extract and verify factual claims."""
    client = openai.OpenAI()

    # Extract claims
    extraction = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """
Extract factual claims from this text that can be verified.
Return as a JSON list of claims.
Only include objective, verifiable facts, not opinions.
"""},
            {"role": "user", "content": llm_response},
        ],
    )
    claims = json.loads(extraction.choices[0].message.content)

    # Verify each claim
    results = []
    for claim in claims:
        verification = fact_check_claim(claim)
        results.append(verification)
    return results
```

Strategy 5: Structured Output Validation
Enforce a schema and validate the output:
```python
from pydantic import BaseModel, validator
from typing import List, Optional
import openai
import json

class FactualResponse(BaseModel):
    answer: str
    # Declared before `sources` so its value is available
    # in the `sources` validator below.
    confidence_level: str  # "high", "medium", "low"
    sources: List[str]
    caveats: Optional[List[str]] = None

    @validator('confidence_level')
    def validate_confidence(cls, v):
        if v not in ("high", "medium", "low"):
            raise ValueError("Invalid confidence level")
        return v

    @validator('sources')
    def require_sources(cls, v, values):
        if values.get('confidence_level') == 'high' and len(v) == 0:
            raise ValueError("High confidence claims need sources")
        return v

def get_validated_answer(question):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": f"""
Answer questions with verified information.
Return JSON matching this schema:
{FactualResponse.schema_json()}
Rules:
- Only claim "high" confidence for well-known facts
- Include sources when possible
- List caveats for uncertain information
"""},
            {"role": "user", "content": question},
        ],
    )
    data = json.loads(response.choices[0].message.content)

    # Validate with Pydantic
    try:
        return FactualResponse(**data)
    except Exception as e:
        print(f"Validation failed: {e}")
        return None
```

Strategy 6: Knowledge Cutoff Awareness
Handle questions about recent events:
```python
import json
import openai

def handle_temporal_query(question):
    client = openai.OpenAI()

    # Check whether the question is about recent events
    temporal_check = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """
Analyze if this question requires knowledge of events after April 2024.
Return JSON: {"requires_recent": true/false, "reason": "..."}
"""},
            {"role": "user", "content": question},
        ],
    )
    result = json.loads(temporal_check.choices[0].message.content)

    if result["requires_recent"]:
        return {
            "warning": "This question may require information from after the knowledge cutoff (April 2024)",
            "recommendation": "Verify with current sources",
            "answer": None,
        }

    # Proceed with the answer
    answer = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    return {"answer": answer.choices[0].message.content}
```

Strategy 7: Human-in-the-Loop
Flag uncertain answers for human review:
```python
class HumanReviewQueue:
    def __init__(self):
        self.pending_reviews = []

    def process_with_review(self, question, answer, confidence):
        if confidence < 0.7:
            self.pending_reviews.append({
                "question": question,
                "proposed_answer": answer,
                "confidence": confidence,
                "status": "pending_review",
            })
            return {
                "answer": "This answer is awaiting human review for accuracy.",
                "draft": answer,
                "review_id": len(self.pending_reviews) - 1,
            }
        return {"answer": answer, "verified": True}

    def approve_review(self, review_id, corrected_answer=None):
        review = self.pending_reviews[review_id]
        review["status"] = "approved"
        review["final_answer"] = corrected_answer or review["proposed_answer"]
        return review["final_answer"]

# Usage in production
review_queue = HumanReviewQueue()

def answer_question(question):
    llm_answer = get_llm_response(question)
    confidence = estimate_confidence(llm_answer)
    return review_queue.process_with_review(question, llm_answer, confidence)
```

Production Anti-Hallucination Checklist
```python
def production_safe_response(question, context=None):
    """Full pipeline for hallucination-resistant answers."""
    # 1. If context is available, use RAG
    if context:
        grounded_answer = rag_chain.invoke({
            "question": question,
            "context": context,
        })
    else:
        grounded_answer = None

    # 2. Get the LLM answer with a confidence score
    llm_result = get_answer_with_confidence(question)

    # 3. Check consistency
    consistency = check_consistency(question, samples=3)

    # 4. Determine the final answer
    if grounded_answer and llm_result["confidence"] >= 7:
        return {
            "answer": grounded_answer,
            "source": "grounded",
            "confidence": "high",
        }
    elif consistency["consistent"] and llm_result["confidence"] >= 7:
        return {
            "answer": llm_result["full_response"],
            "source": "llm",
            "confidence": "medium",
            "note": "Validated via consistency check",
        }
    else:
        return {
            "answer": "I am not confident enough to answer this correctly.",
            "confidence": "low",
            "recommendation": "Verify with authoritative sources",
        }
```

Quick Reference: Prevention Techniques
| Technique | Best for | Complexity |
|-----------|----------|------------|
| RAG | Domain-specific facts | Medium |
| Self-consistency | General questions | Low |
| Confidence scoring | All answers | Low |
| Fact-checking | Critical claims | High |
| Structured output | Data extraction | Medium |
| Human review | High-stakes decisions | High |
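If you route questions to different strategies at runtime, the matrix above can be encoded as a simple lookup. This is a minimal sketch; the `TECHNIQUES` dict and `pick_technique` helper are hypothetical names, and a real router would likely combine several techniques per request.

```python
# Hypothetical routing table derived from the quick-reference matrix.
TECHNIQUES = {
    "domain_facts": ("RAG", "medium"),
    "general_questions": ("self-consistency", "low"),
    "all_answers": ("confidence scoring", "low"),
    "critical_claims": ("fact-checking", "high"),
    "data_extraction": ("structured output", "medium"),
    "high_stakes": ("human review", "high"),
}

def pick_technique(use_case: str):
    """Return (technique, complexity) for a use case, or None if unmapped."""
    return TECHNIQUES.get(use_case)
```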
Building Reliable AI Systems?

Hallucination prevention is critical for enterprise AI. Our team offers:

- AI reliability audits
- Custom fact-checking pipelines
- RAG implementation consulting
- EU AI Act compliance