Componenta tMap este cel mai puternic instrument de transformare din Talend. Acest ghid complet acopera fiecare aspect al tMap, de la mapari de baza pana la rutare multi-output avansata si expresii complexe.
Prezentare Generala a Arhitecturii tMap
┌──────────────────────────────────────────────────────────────────────────┐
│ tMAP COMPONENT STRUCTURE │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐│
│ │ SECTIUNEA INPUT ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │ Flux │ │ Lookup 1 │ │ Lookup 2 │ ││
│ │ │ Principal │ │ (Optional) │ │ (Optional) │ ││
│ │ │ (Obligat.) │ │ │ │ │ ││
│ │ │ • customer_id│ │ • cust_id │ │ • code │ ││
│ │ │ • name │ │ • segment │ │ • description│ ││
│ │ │ • order_amt │ │ • tier │ │ │ ││
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ ││
│ └─────────┼─────────────────┼─────────────────┼────────────────────────┘│
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐│
│ │ SECTIUNEA VARIABILE ││
│ │ var_fullName = row1.first_name + " " + row1.last_name ││
│ │ var_discountRate = row2.tier.equals("GOLD") ? 0.2 : 0.1 ││
│ │ var_finalAmount = row1.order_amt * (1 - var_discountRate) ││
│ └─────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐│
│ │ SECTIUNEA OUTPUT ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │ Output 1 │ │ Output 2 │ │ Reject │ ││
│ │ │ (Toate rand.)│ │ (Filtrate) │ │ (Erori) │ ││
│ │ │ │ │ │ │ │ ││
│ │ │ Filter: none │ │ Filter: │ │ Catch lookup │ ││
│ │ │ │ │ amt > 1000 │ │ reject │ ││
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────┘
Transformari de Baza ale Datelor
Mapare Simpla a Coloanelor
// Mapare directa (drag and drop)
// row1.customer_id → output.customer_id
// Exemple de mapare bazata pe expresii:
// Transformari string
output.upper_name = StringHandling.UPCASE(row1.name)
output.lower_name = StringHandling.DOWNCASE(row1.name)
output.trimmed = StringHandling.TRIM(row1.name)
output.padded = StringHandling.LPAD(row1.code, 10, "0")
// Operatii substring
output.first_3 = StringHandling.LEFT(row1.code, 3)
output.last_4 = StringHandling.RIGHT(row1.phone, 4)
output.middle = StringHandling.MID(row1.code, 2, 4)
// Inlocuire string
output.cleaned = StringHandling.REPLACE(row1.text, "\n", " ")
output.no_spaces = StringHandling.EREPLACE(row1.text, "\\s+", "_")
// Concatenare
output.full_address = row1.street + ", " + row1.city + " " + row1.zip
// Gestionare null
output.safe_name = row1.name != null ? row1.name : "Unknown"
output.safe_amount = row1.amount == null ? 0 : row1.amountTransformari Numerice
// Matematica de baza
output.total = row1.quantity * row1.unit_price
output.discounted = row1.price * (1 - row1.discount_rate)
output.rounded = Math.round(row1.amount * 100.0) / 100.0
// Conversii de tip
output.int_value = Integer.parseInt(row1.string_number)
output.double_value = Double.parseDouble(row1.string_decimal)
output.string_value = String.valueOf(row1.numeric_field)
// Formatare numere
output.formatted = String.format("%.2f", row1.amount)
output.with_comma = String.format("%,.2f", row1.large_number)
output.percentage = String.format("%.1f%%", row1.rate * 100)
// Functii numerice
output.absolute = Math.abs(row1.value)
output.ceiling = Math.ceil(row1.decimal)
output.floor = Math.floor(row1.decimal)
output.max_value = Math.max(row1.val1, row1.val2)
output.min_value = Math.min(row1.val1, row1.val2)
output.power = Math.pow(row1.base, row1.exponent)
output.sqrt = Math.sqrt(row1.value)
// Valori aleatorii
output.random_int = (int)(Math.random() * 1000)
output.uuid = java.util.UUID.randomUUID().toString()Transformari de Date
// Parseaza date din string-uri
output.parsed_date = TalendDate.parseDate("yyyy-MM-dd", row1.date_string)
output.parsed_datetime = TalendDate.parseDate("yyyy-MM-dd HH:mm:ss", row1.datetime_string)
// Formateaza date in string-uri
output.formatted_date = TalendDate.formatDate("MM/dd/yyyy", row1.date_field)
output.iso_date = TalendDate.formatDate("yyyy-MM-dd'T'HH:mm:ss'Z'", row1.datetime_field)
// Aritmetica pe date
output.next_week = TalendDate.addDate(row1.order_date, 7, "dd")
output.next_month = TalendDate.addDate(row1.start_date, 1, "MM")
output.next_year = TalendDate.addDate(row1.birth_date, 1, "yyyy")
output.yesterday = TalendDate.addDate(TalendDate.getCurrentDate(), -1, "dd")
// Extragerea partilor de data
output.year = TalendDate.getPartOfDate("YEAR", row1.date_field)
output.month = TalendDate.getPartOfDate("MONTH", row1.date_field)
output.day = TalendDate.getPartOfDate("DAY_OF_MONTH", row1.date_field)
output.day_of_week = TalendDate.getPartOfDate("DAY_OF_WEEK", row1.date_field)
output.week_of_year = TalendDate.getPartOfDate("WEEK_OF_YEAR", row1.date_field)
output.quarter = (TalendDate.getPartOfDate("MONTH", row1.date_field) - 1) / 3 + 1
// Comparatii de date
output.days_between = TalendDate.diffDate(row1.end_date, row1.start_date, "dd")
output.months_between = TalendDate.diffDate(row1.end_date, row1.start_date, "MM")
// Data/ora curenta
output.today = TalendDate.getCurrentDate()
output.now = TalendDate.parseDate("yyyy-MM-dd HH:mm:ss",
TalendDate.formatDate("yyyy-MM-dd HH:mm:ss", new java.util.Date()))
// Prima/Ultima zi din luna
output.first_of_month = TalendDate.getFirstDayOfMonth(row1.date_field)
output.last_of_month = TalendDate.getLastDayOfMonth(row1.date_field)Operatii de Lookup
Lookup pe Tabela Singulara
/*
* Configurare Lookup:
*
* Input Principal: row1 (customers)
* Input Lookup: row2 (customer_segments)
*
* Cheie Join: row1.customer_id = row2.customer_id
*
* Lookup Model: Load once (pentru tabele lookup mici)
* Tip Join: Left Outer Join
* Model Potrivire: First Match
*/
// Expresii output cu date lookup
output.customer_id = row1.customer_id
output.customer_name = row1.name
output.segment = row2.segment != null ? row2.segment : "UNKNOWN"
output.tier = row2.tier != null ? row2.tier : "STANDARD"
// Gestioneaza lookup-uri fara rezultat cu valori implicite
output.discount_rate = row2 != null && row2.discount_rate != null
? row2.discount_rate
: 0.0
// Conditional bazat pe rezultatul lookup
output.is_premium = row2 != null && "PREMIUM".equals(row2.tier)Lookup-uri Multiple
/*
* Configurare Lookup-uri Multiple:
*
* Input Principal: row1 (orders)
* Lookup 1: row2 (customers) - join pe customer_id
* Lookup 2: row3 (products) - join pe product_id
* Lookup 3: row4 (regions) - join pe region_code
*
* Ordine procesare: Lookup-urile evaluate secvential
*/
// Sectiunea variabile (pentru reutilizare)
Var.customer_name = row2 != null ? row2.name : "Unknown Customer"
Var.product_name = row3 != null ? row3.product_name : "Unknown Product"
Var.region_name = row4 != null ? row4.region_name : "Unknown Region"
Var.unit_price = row3 != null ? row3.unit_price : row1.default_price
// Expresii output
output.order_id = row1.order_id
output.customer_name = Var.customer_name
output.product_name = Var.product_name
output.region = Var.region_name
output.line_total = row1.quantity * Var.unit_price
output.lookup_status = (row2 != null ? "C" : "") +
(row3 != null ? "P" : "") +
(row4 != null ? "R" : "")Lookup cu Chei Compuse
/*
* Lookup cu Cheie Compusa:
*
* Principal: row1 (transactions)
* Lookup: row2 (price_matrix)
*
* Chei Join:
* row1.product_id = row2.product_id AND
* row1.region_code = row2.region_code AND
* row1.effective_date >= row2.start_date AND
* row1.effective_date <= row2.end_date
*
* Nota: Pentru intervale complexe de date, foloseste tMap cu expresie inner join
* sau pre-filtreaza datele de lookup
*/
// Exemplu cheie expresie (in setarile cheie tMap)
// Key 1: row1.product_id.equals(row2.product_id)
// Key 2: row1.region_code.equals(row2.region_code)
// Pentru potrivire interval date, foloseste expresie:
row2.start_date.compareTo(row1.trans_date) <= 0 &&
row2.end_date.compareTo(row1.trans_date) >= 0Variabile pentru Logica Complexa
Declarare si Utilizare Variabile
/*
* Beneficii Sectiune Variabile:
* - Evita repetarea expresiilor complexe
* - Imbunatateste lizibilitatea
* - Optimizeaza performanta (expresia evaluata o singura data)
* - Construieste calcule pas cu pas
*/
// Definitii variabile (evaluate in ordine)
Var.base_price = row1.quantity * row3.unit_price
Var.discount_rate = row2 != null && row2.tier != null
? (row2.tier.equals("GOLD") ? 0.20 :
row2.tier.equals("SILVER") ? 0.15 :
row2.tier.equals("BRONZE") ? 0.10 : 0.05)
: 0.0
Var.discount_amount = Var.base_price * Var.discount_rate
Var.subtotal = Var.base_price - Var.discount_amount
Var.tax_rate = row4 != null ? row4.tax_rate : 0.08
Var.tax_amount = Var.subtotal * Var.tax_rate
Var.total = Var.subtotal + Var.tax_amount
// Output folosind variabile
output.base_price = Var.base_price
output.discount_rate = Var.discount_rate
output.discount_amount = Var.discount_amount
output.subtotal = Var.subtotal
output.tax_rate = Var.tax_rate
output.tax_amount = Var.tax_amount
output.total = Var.totalLogica Conditionala in Variabile
// Asignare conditionala complexa
Var.customer_category =
row1.annual_revenue > 1000000 ? "ENTERPRISE" :
row1.annual_revenue > 100000 ? "MID_MARKET" :
row1.annual_revenue > 10000 ? "SMALL_BUSINESS" :
"MICRO"
Var.risk_level =
row1.payment_days_late > 90 ? "HIGH" :
row1.payment_days_late > 30 ? "MEDIUM" :
row1.payment_days_late > 0 ? "LOW" :
"NONE"
// Potrivire pattern cu regex
Var.is_valid_email = row1.email != null &&
row1.email.matches("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$")
// Logica state machine
Var.order_status =
row1.shipped_date != null && row1.delivered_date != null ? "DELIVERED" :
row1.shipped_date != null ? "IN_TRANSIT" :
row1.paid_date != null ? "PROCESSING" :
row1.confirmed_date != null ? "CONFIRMED" :
"PENDING"Filtrare si Rutare
Filtre de Output
/*
* Configurare Filtre in Setarile Output:
*
* Output 1 (all_orders): Fara filtru
* Output 2 (large_orders): row1.amount > 1000
* Output 3 (premium_customers): row2.tier.equals("PREMIUM")
* Output 4 (needs_review): Var.needs_review == true
*/
// Expresii de filtru
// Filtru simplu pe valoare
row1.status.equals("ACTIVE")
// Comparatie numerica
row1.amount >= 1000 && row1.amount <= 10000
// Filtru pe data
row1.order_date.after(TalendDate.parseDate("yyyy-MM-dd", "2024-01-01"))
// Filtru verificare null
row1.email != null && !row1.email.isEmpty()
// Conditii multiple
row1.status.equals("ACTIVE") &&
row1.amount > 100 &&
row2 != null &&
row2.credit_limit > row1.amount
// Filtru pe pattern
row1.product_code.startsWith("PREM-")
// Apartenenta la lista
java.util.Arrays.asList("US", "CA", "MX").contains(row1.country_code)Fluxuri Reject
/*
* Configurare Flux Reject:
*
* Catch lookup reject: true
* Catch expression filter reject: true
*
* Cauze reject:
* - Lookup cu "Inner Join" si fara potrivire
* - Expresia de filtru evaluata la false
*/
// In output-ul reject, poti accesa:
output.original_id = row1.id
output.original_data = row1.data
output.reject_reason = row2 == null ? "Customer lookup failed" :
row3 == null ? "Product lookup failed" :
"Filter condition not met"
output.rejected_at = TalendDate.getCurrentDate()Rutare Multi-Output
/*
* Strategie de Rutare:
*
* Output Principal: Inregistrari reusite
* Output Erori: Esecuri de validare
* Output Arhiva: Inregistrari istorice
* Output Delta: Doar inregistrari modificate
*/
// Folosirea variabilelor pentru decizii de rutare
Var.is_valid = row1.id != null &&
row1.amount > 0 &&
row1.customer_id != null
Var.is_historical = row1.record_date.before(
TalendDate.addDate(TalendDate.getCurrentDate(), -365, "dd"))
Var.is_changed = !row1.hash.equals(row2.previous_hash)
// Filtre output:
// main_output: Var.is_valid && !Var.is_historical
// error_output: !Var.is_valid
// archive_output: Var.is_historical
// delta_output: Var.is_valid && Var.is_changedTehnici Avansate
Metode Java Personalizate
// Defineste metoda reutilizabila in tJava (sectiunea begin)
public static String maskEmail(String email) {
if (email == null || !email.contains("@")) return email;
String[] parts = email.split("@");
String username = parts[0];
String masked = username.charAt(0) +
"*".repeat(Math.max(0, username.length() - 2)) +
(username.length() > 1 ? username.charAt(username.length() - 1) : "");
return masked + "@" + parts[1];
}
public static String cleanPhone(String phone) {
if (phone == null) return null;
return phone.replaceAll("[^0-9]", "");
}
public static boolean isValidSSN(String ssn) {
if (ssn == null) return false;
return ssn.matches("^\\d{3}-\\d{2}-\\d{4}$");
}
// Foloseste in expresiile tMap:
output.masked_email = maskEmail(row1.email)
output.clean_phone = cleanPhone(row1.phone)
output.ssn_valid = isValidSSN(row1.ssn)Optimizare Performanta
Optimizare Expresii
// GRESIT: Expresie complexa repetata
output.field1 = StringHandling.UPCASE(StringHandling.TRIM(row1.name))
output.field2 = StringHandling.UPCASE(StringHandling.TRIM(row1.name)) + "_SUFFIX"
output.field3 = StringHandling.UPCASE(StringHandling.TRIM(row1.name)).length()
// CORECT: Foloseste variabila
Var.clean_name = StringHandling.UPCASE(StringHandling.TRIM(row1.name))
output.field1 = Var.clean_name
output.field2 = Var.clean_name + "_SUFFIX"
output.field3 = Var.clean_name.length()
// GRESIT: Verificari null multiple
output.val = row2 != null ? (row2.field != null ? row2.field : "default") : "default"
// CORECT: Variabila singulara cu short-circuit
Var.lookup_value = row2 != null && row2.field != null ? row2.field : "default"
output.val = Var.lookup_valueOptimizare Lookup
/*
* Selectie Lookup Model:
*
* 1. Load Once
* - Intregul lookup incarcat in memorie la inceput
* - Cel mai bun pentru: < 100K randuri, accesat frecvent
* - Memorie: O(n) unde n = dimensiunea lookup
*
* 2. Reload at Each Row
* - Interogeaza lookup-ul pentru fiecare rand principal
* - Cel mai bun pentru: Lookup-uri foarte mari, flux principal selectiv
* - Performanta: Lent, dar memorie scazuta
*
* 3. Reload at Each Row (cache)
* - Interogeaza + cache rezultate
* - Cel mai bun pentru: Lookup mare, chei repetate in principal
* - Memorie: O(chei unice accesate)
*/
// Pentru lookup-uri uriase, considera:
// 1. Pre-filtreaza datele lookup cu tFilterRow
// 2. Foloseste componente tHash* pentru lookup-uri bazate pe disc
// 3. Imparte job-ul in chunk-uri cu tFlowToIterateExemplu Complet de Job
Structura Job: Procesare Comenzi Clienti
tFileInputDelimited (orders.csv)
│
▼
[tMap_Main]
│
├──► lookup: tDBInput (customers)
│ JOIN: order.customer_id = customer.id
│
├──► lookup: tDBInput (products)
│ JOIN: order.product_id = product.id
│
├──► Variabile:
│ Var.unit_price = row3.price != null ? row3.price : row1.default_price
│ Var.discount = row2.tier.equals("GOLD") ? 0.2 : 0.1
│ Var.line_total = row1.qty * Var.unit_price * (1 - Var.discount)
│ Var.is_valid = row1.qty > 0 && row2 != null
│
├──► Output 1: valid_orders (Filter: Var.is_valid)
│ └──► tDBOutput (processed_orders)
│
├──► Output 2: large_orders (Filter: Var.line_total > 10000)
│ └──► tFileOutputDelimited (large_orders.csv)
│
└──► Output 3: reject_orders (Catch lookup reject)
└──► tLogRow + tFileOutputDelimited (rejected.csv)
Sumar Bune Practici
tmap_best_practices:
design:
- "Foloseste nume semnificative pentru variabile"
- "Comenteaza expresiile complexe"
- "Pastreaza expresiile simple si lizibile"
- "Imparte logica complexa in variabile multiple"
performanta:
- "Alege modelul de lookup corespunzator pe baza dimensiunii datelor"
- "Foloseste variabile pentru expresii repetate"
- "Filtreaza devreme pentru a reduce procesarea"
- "Pre-filtreaza lookup-urile mari inainte de tMap"
gestionare_erori:
- "Gestioneaza intotdeauna valorile null explicit"
- "Foloseste output-uri reject pentru tracking erori"
- "Valideaza tipurile de date inainte de conversie"
- "Logheaza inregistrarile rejectate pentru analiza"
mentenanta:
- "Documenteaza cheile de join si logica de business"
- "Testeaza cu cazuri limita (null-uri, string-uri goale)"
- "Versioneaza configuratiile tMap"
- "Revizuieste si optimizeaza periodic"Concluzie
Componenta tMap este esentiala pentru transformarile de date in Talend. Stapaneste utilizarea variabilelor pentru logica complexa, alege strategii de lookup corespunzatoare pe baza volumelor de date si implementeaza gestionare corecta a erorilor cu fluxuri reject. Aceste tehnici permit job-uri ETL eficiente si usor de intretinut.
Sistemul tau AI e conform cu EU AI Act? Evaluare gratuita de risc - afla in 2 minute →