Compare commits: 4a2cfc5756...7507c1d289 (27 commits)
@@ -283,6 +283,15 @@ When creating sales via API, specific constraints apply due to the shared tenant

* `Person`: Linking to a specific person, not just the company, is highly recommended.
* **Context:** Avoid creating sales on the parent company "Wackler Service Group" (ID 3). Always target the specific lead company.

### Analysis of the SuperOffice `Sale` Entity (March 2026)

- **Goal:** Build a report showing which customers were offered or bought which products. The initial assumption was that product information is often captured as free-text entries rather than through the official product catalog.
- **Problem:** Investigating the data structure showed that the API endpoints for querying `Quote` objects (offers) and `QuoteLines` (offer line items) across `Sale`, `Contact`, or `Project` relations did not work reliably. Many queries resulted in `500 Internal Server Errors` or empty result sets, making a direct sale-to-product mapping impossible.
- **Key finding (data structure):**
  1. **Free text instead of structured data:** Analyzing a concrete `Sale` object (ID `342243`) confirmed the original hypothesis. Product information (e.g. `2xOmnie CD-01 mit Nachlass`) is entered as free text directly into the `Heading` field (subject) of the `Sale` object. Linked `Quote` or `QuoteLine` entities often do not exist.
  2. **Data quality of relations:** A significant number of `Sale` objects in the system have no link to a `Contact` object (`Contact: null`). This makes automatically attributing sales to customers considerably harder.
- **Next step / solution:** A script (`/app/connector-superoffice/generate_customer_product_report.py`) was developed to address these problems. It queries only `Sale` objects that have a valid `Contact` link (`$filter=Contact ne null`), extracts the customer name and the sale's `Heading` field, and scans the latter for predefined product keywords. The results are stored in a CSV file (`product_report.csv`) for manual analysis. This approach is the only reliable way to extract the desired information from the system.
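The keyword-scan step the script performs on the `Heading` field can be sketched as follows (the keyword list and sample rows are illustrative, not the script's actual configuration):

```python
# Scan free-text sale headings for predefined product keywords.
# The keyword list below is illustrative, not the script's real configuration.
PRODUCT_KEYWORDS = ["Omnie", "CD-01", "Reinigungsroboter"]

def extract_products(heading):
    """Return all predefined product keywords found in a sale heading."""
    if not heading:
        return []
    return [kw for kw in PRODUCT_KEYWORDS if kw.lower() in heading.lower()]

# Sample rows standing in for the filtered Sale query results.
rows = [
    {"Contact": "Klinikum X", "Heading": "2xOmnie CD-01 mit Nachlass"},
    {"Contact": "Mueller CNC", "Heading": "Beratung Q3"},
]
report = [(r["Contact"], extract_products(r["Heading"])) for r in rows]
```

Rows with an empty keyword list (like the second one) still appear in the CSV so they can be reviewed manually.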

### 7. Service & Tickets (Requests)

SuperOffice Tickets represent the support and request system. Like Sales, they are organized to allow separation between Roboplanet and Wackler.
@@ -79,3 +79,60 @@ Wir müssen im Onboarding-Gespräch klären:

3. **Bulk enrichment:** Can we upload lists (domains) for enrichment?

Without these features, Konver is a step back to manual one-at-a-time processing.

## 5. Communication Templates

### A. Internal Clarification (to the project owner)
*Goal: Clarify the strategic direction (enrichment vs. new customers) and its technical implications without waking sleeping dogs.*

**Subject:** Technical design of the Konver.ai integration – questions about the usage scenario

Dear [contact name],

I am currently working on the technical design for integrating Konver.ai into our existing system landscape. Since I unfortunately joined the process very late and could not follow the earlier Konver.ai presentations, two central questions about the optimal technical integration have come up for me:

**1. Focus of use (enrichment vs. new-customer research)**
Is Konver.ai meant to be used primarily to enrich records that already exist in the CRM (e.g. adding missing mobile numbers)? Or should the tool, much like Dealfront before it, also be used to actively search for new, previously unknown contacts (e.g. "find the facility manager at company X")?

This distinction matters greatly for implementing an automated duplicate check and for using credits efficiently.

**2. How the data is provided (real-time database vs. live research)**
Can you tell me whether Konver.ai delivers the requested data immediately from an existing database, or whether the research is only started "live" at the time of the request? This determines whether we have to model the integration synchronously (data available immediately) or asynchronously (with a delay and a queue).

I would be very grateful for a brief assessment so that we can plan the technical implementation accurately from the start.
I am also happy to put these questions to Konver.ai directly; I just do not want to interfere here.

Kind regards

[Your name]

---

### B. External Inquiry (to Konver.ai Support / Tech)
*Goal: Obtain the technical documentation and evaluate deduplication options.*

**Subject:** Technical API integration - documentation & workflow (Wackler/Roboplanet)

Dear Sir or Madam / Hello support team,

We are currently integrating Konver.ai into our system landscape (SuperOffice CRM & internal lead intelligence).
To plan the effort and the architecture (synchronous vs. asynchronous), we need the following information and documents:

**1. API documentation**
Could you please send us the current technical documentation (OpenAPI/Swagger specs) for the `Search` and `Enrich` endpoints?

**2. Person search & deduplication (pre-purchase check)**
Our planned use case is to search specifically for contacts at pre-qualified companies (e.g. `domain="klinikum-x.de"` AND `job_title="Facility Manager"`).
* Is there an endpoint that lets us check **before** the paid "reveal" whether we already own a record (email/phone), e.g. via hash matching or an exclusion list?
* We want to avoid spending credits on data that already exists in our CRM.

**3. Data delivery (sync vs. async)**
Are the data for an API request returned synchronously (directly from a database), or is a live research process triggered?
If "live": do you offer webhooks to notify us when the enrichment is finished, or do we have to poll?

Thank you for your support with the technical onboarding.

Kind regards

[Your name]

@@ -410,19 +410,29 @@ Der Company Explorer unterstützt nun den Parameter `campaign_tag`. Der Connecto

---

## 19. External Lead Ingestion & Contact API (v3.5 - March 2, 2026)

**Context:** Automated processing of external lead sources (Tradingtwins) directly into the Company Explorer.

### 19.1 API Extension: External Contact Sync
To ingest contacts from external tools (such as the lead engine) without SuperOffice context, the API was extended.

* **New endpoint:** `POST /api/contacts`
* **Functionality:**
  * Creates person master data (first name, last name, email).
  * **Automatic role mapping:** The endpoint integrates the `RoleMappingService`. Incoming job titles (e.g. "CFO") are automatically matched against the internal pattern database and assigned to the matching persona (e.g. "Wirtschaftlicher Entscheider").
  * **Deduplication:** If an email already exists for a company, the record is updated instead of created anew.
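A request body for the new endpoint might look like this (the example values are illustrative; role mapping and email-based deduplication happen server-side as described above):

```python
import json

# Illustrative payload for POST /api/contacts.
payload = {
    "company_id": 42,             # must reference an existing company
    "first_name": "Max",
    "last_name": "Mustermann",
    "email": "max@example.com",   # deduplication key per company
    "job_title": "CFO",           # matched against the pattern database
    "is_primary": True,
}
body = json.dumps(payload)
```

If a contact with this email already exists for company 42, the server updates it instead of creating a duplicate.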

### 19.2 Standardization of Data Fields
To improve asynchronous collaboration between the lead engine and CE, the field names in the API response were unified:
* **Industry:** `industry_ai` (primary field for the AI classification).
* **Analysis:** `research_dossier` (the complete AI-generated company dossier).

### 19.3 Synchronization Workflow (Connector)
The `company_explorer_connector.py` now supports the extended workflow:
1. `check_company_existence` (lookup by name)
2. `create_company` (create if new)
3. `create_contact` (ingest the person incl. role mapping)
4. `trigger_discovery` / `trigger_analysis` (asynchronous start of the intelligence phase)
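Assuming the four connector functions behave as listed, the orchestration can be sketched like this (the stub implementations only stand in for the real HTTP calls):

```python
# Stubs standing in for the connector's HTTP helpers (illustrative only).
def check_company_existence(name):
    return {"exists": False}

def create_company(name):
    return {"id": 1}

def create_contact(company_id, info):
    return {"id": 7, **info}

def trigger_discovery(company_id):
    return {"status": "queued"}

def sync_lead(name, contact_info=None):
    """Run steps 1-4 of the workflow for one external lead."""
    check = check_company_existence(name)
    if check.get("exists"):
        company_id = check["company"]["id"]
    else:
        company_id = create_company(name)["id"]
    if contact_info:
        create_contact(company_id, contact_info)  # role mapping is server-side
    return trigger_discovery(company_id)          # intelligence phase runs async
```

Because the intelligence phase is asynchronous, the caller only gets a queue acknowledgement back, not the finished dossier.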

---

## 18. Open Work Packages (as of Feb 27, 2026)

### Prio A: Operational Automation
* **Webhook activation:** Register the live webhook for `online3` as soon as admin rights for the API user are available (`register_webhook.py`).
* **Full matrix generation:** Run the AI generation for all 25 verticals (English IDs) once the "pains" have been finalized in Notion.
* **Campaign validation:** Create test scenarios for at least 3 different campaign tags to verify the routing.

### Prio B: Marketing Execution
* **Sending logic:** Implement the logic for actually sending the emails (or exporting to an email provider) based on the populated UDFs.
* **Unsubscribe frontend:** Visual design of the HTML confirmation page for the unsubscribe link.

### Prio C: Data Optimization
* **Google Maps API:** Integrate it to validate company addresses when CRM and scraper disagree.
* **Deduplication 2.0:** Refine the matching for companies with multiple locations (branch logic).

@@ -1,12 +0,0 @@
import sqlite3


def add_mapping():
    conn = sqlite3.connect('/app/companies_v3_fixed_2.db')
    cursor = conn.cursor()
    cursor.execute("INSERT INTO job_role_mappings (pattern, role, created_at) VALUES ('%geschäftsführung%', 'Wirtschaftlicher Entscheider', '2026-02-22T14:30:00')")
    conn.commit()
    conn.close()
    print("Added mapping for geschäftsführung")


if __name__ == "__main__":
    add_mapping()
@@ -68,6 +68,15 @@ class CompanyCreate(BaseModel):
    website: Optional[str] = None
    crm_id: Optional[str] = None


class ContactCreate(BaseModel):
    company_id: int
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    email: Optional[str] = None
    job_title: Optional[str] = None
    role: Optional[str] = None
    is_primary: bool = True


class BulkImportRequest(BaseModel):
    names: List[str]

@@ -631,6 +640,53 @@ def bulk_import_companies(req: BulkImportRequest, background_tasks: BackgroundTa
    db.commit()
    return {"status": "success", "imported": imported_count}


@app.post("/api/contacts")
def create_contact_endpoint(contact: ContactCreate, db: Session = Depends(get_db), username: str = Depends(authenticate_user)):
    # Check if company exists
    company = db.query(Company).filter(Company.id == contact.company_id).first()
    if not company:
        raise HTTPException(status_code=404, detail="Company not found")

    # Automatic role mapping logic
    final_role = contact.role
    if contact.job_title and not final_role:
        role_mapping_service = RoleMappingService(db)
        found_role = role_mapping_service.get_role_for_job_title(contact.job_title)
        if found_role:
            final_role = found_role
        else:
            # Log unclassified title for future mining
            role_mapping_service.add_or_update_unclassified_title(contact.job_title)

    # Check if a contact with the same email already exists for this company
    if contact.email:
        existing = db.query(Contact).filter(Contact.company_id == contact.company_id, Contact.email == contact.email).first()
        if existing:
            # Update the existing contact
            existing.first_name = contact.first_name
            existing.last_name = contact.last_name
            existing.job_title = contact.job_title
            existing.role = final_role
            db.commit()
            db.refresh(existing)
            return existing

    new_contact = Contact(
        company_id=contact.company_id,
        first_name=contact.first_name,
        last_name=contact.last_name,
        email=contact.email,
        job_title=contact.job_title,
        role=final_role,
        is_primary=contact.is_primary,
        status="ACTIVE",
        unsubscribe_token=str(uuid.uuid4())
    )
    db.add(new_contact)
    db.commit()
    db.refresh(new_contact)
    return new_contact


@app.post("/api/companies/{company_id}/override/wikipedia")
def override_wikipedia(company_id: int, url: str, background_tasks: BackgroundTasks, db: Session = Depends(get_db), username: str = Depends(authenticate_user)):
    company = db.query(Company).filter(Company.id == company_id).first()

@@ -64,10 +64,24 @@ def trigger_analysis(company_id: int) -> dict:
def get_company_details(company_id: int) -> dict:
    """Holt die vollständigen Details zu einem Unternehmen."""
    return _make_api_request("GET", f"/companies/{company_id}")


def create_contact(company_id: int, contact_data: dict) -> dict:
    """Erstellt einen neuen Kontakt für ein Unternehmen im Company Explorer."""
    payload = {
        "company_id": company_id,
        "first_name": contact_data.get("first_name"),
        "last_name": contact_data.get("last_name"),
        "email": contact_data.get("email"),
        "job_title": contact_data.get("job_title"),
        "role": contact_data.get("role"),
        "is_primary": contact_data.get("is_primary", True)
    }
    return _make_api_request("POST", "/contacts", json_data=payload)


def handle_company_workflow(company_name: str) -> dict:
def handle_company_workflow(company_name: str, contact_info: dict = None) -> dict:
    """
    Haupt-Workflow: Prüft, erstellt und reichert ein Unternehmen an.
    Optional wird auch ein Kontakt angelegt.
    Gibt die finalen Unternehmensdaten zurück.
    """
    print(f"Workflow gestartet für: '{company_name}'")
@@ -75,70 +89,60 @@ def handle_company_workflow(company_name: str) -> dict:
    # 1. Prüfen, ob das Unternehmen existiert
    existence_check = check_company_existence(company_name)

    company_id = None
    if existence_check.get("exists"):
        company_id = existence_check["company"]["id"]
        print(f"Unternehmen '{company_name}' (ID: {company_id}) existiert bereits.")
        final_company_data = get_company_details(company_id)
        return {"status": "found", "data": final_company_data}

    if "error" in existence_check:
    elif "error" in existence_check:
        print(f"Fehler bei der Existenzprüfung: {existence_check['error']}")
        return {"status": "error", "message": existence_check['error']}
    else:
    # 2. Wenn nicht, Unternehmen erstellen
    print(f"Unternehmen '{company_name}' nicht gefunden. Erstelle es...")
    creation_response = create_company(company_name)

    if "error" in creation_response:
        return {"status": "error", "message": creation_response['error']}

    company_id = creation_response.get("id")
    print(f"Unternehmen '{company_name}' erfolgreich mit ID {company_id} erstellt.")

        # 2. Wenn nicht, Unternehmen erstellen
        print(f"Unternehmen '{company_name}' nicht gefunden. Erstelle es...")
        creation_response = create_company(company_name)

        if "error" in creation_response:
            print(f"Fehler bei der Erstellung: {creation_response['error']}")
            return {"status": "error", "message": creation_response['error']}

        company_id = creation_response.get("id")
        if not company_id:
            print(f"Fehler: Konnte keine ID aus der Erstellungs-Antwort extrahieren: {creation_response}")
            return {"status": "error", "message": "Failed to get company ID after creation."}

        print(f"Unternehmen '{company_name}' erfolgreich mit ID {company_id} erstellt.")

    # 2b. Kontakt anlegen/aktualisieren (falls Info vorhanden)
    if company_id and contact_info:
        print(f"Lege Kontakt für {contact_info.get('last_name')} an...")
        contact_res = create_contact(company_id, contact_info)
        if "error" in contact_res:
            print(f"Hinweis: Kontakt konnte nicht angelegt werden: {contact_res['error']}")

    # 3. Discovery anstoßen
    print(f"Starte Discovery für ID {company_id}...")
    discovery_status = trigger_discovery(company_id)
    if "error" in discovery_status:
        print(f"Fehler beim Anstoßen der Discovery: {discovery_status['error']}")
        return {"status": "error", "message": discovery_status['error']}

    # 4. Warten, bis Discovery eine Website gefunden hat (Polling)
    max_wait_time = 30
    start_time = time.time()
    website_found = False
    print("Warte auf Abschluss der Discovery (max. 30s)...")
    while time.time() - start_time < max_wait_time:
        details = get_company_details(company_id)
        if details.get("website") and details["website"] not in ["", "k.A."]:
            print(f"Website gefunden: {details['website']}")
            website_found = True
            break
        time.sleep(3)
        print(".")

    if not website_found:
        print("Discovery hat nach 30s keine Website gefunden. Breche Analyse ab.")
        final_data = get_company_details(company_id)
        return {"status": "created_discovery_timeout", "data": final_data}

    # 5. Analyse anstoßen
    print(f"Starte Analyse für ID {company_id}...")
    analysis_status = trigger_analysis(company_id)
    if "error" in analysis_status:
        print(f"Fehler beim Anstoßen der Analyse: {analysis_status['error']}")
        return {"status": "error", "message": analysis_status['error']}

    print("Analyse-Prozess erfolgreich in die Warteschlange gestellt.")
    # 3. Discovery anstoßen (falls Status NEW)
    # Wir holen Details, um den Status zu prüfen
    details = get_company_details(company_id)
    if details.get("status") == "NEW":
        print(f"Starte Discovery für ID {company_id}...")
        trigger_discovery(company_id)

    # 4. Warten, bis Discovery eine Website gefunden hat (Polling)
    max_wait_time = 30
    start_time = time.time()
    website_found = False
    print("Warte auf Abschluss der Discovery (max. 30s)...")
    while time.time() - start_time < max_wait_time:
        details = get_company_details(company_id)
        if details.get("website") and details["website"] not in ["", "k.A."]:
            print(f"Website gefunden: {details['website']}")
            website_found = True
            break
        time.sleep(3)
        print(".")

    # 5. Analyse anstoßen (falls Website da, aber noch nicht ENRICHED)
    if details.get("website") and details["website"] not in ["", "k.A."] and details.get("status") != "ENRICHED":
        print(f"Starte Analyse für ID {company_id}...")
        trigger_analysis(company_id)

    # 6. Finale Daten abrufen und zurückgeben
    final_company_data = get_company_details(company_id)

    return {"status": "created_and_enriched", "data": final_company_data}
    return {"status": "synced", "data": final_company_data}


if __name__ == "__main__":

@@ -1,352 +1,40 @@
# SuperOffice Connector & GTM Engine ("The Muscle & The Brain")

This document describes the architecture of the **go-to-market (GTM) engine** that connects SuperOffice CRM with the Company Explorer intelligence.

The goal of the system is fully automated delivery of **hyper-personalized emails** that read as if they had been written manually by an industry expert.

---

## 1. The Concept: "Static Magic"

Unlike typical AI tools that generate emails on the fly, this system relies on **precomputed, static text building blocks**.

**Why?**
1. **Quality assurance:** Every building block can be reviewed before sending.
2. **Performance:** At send time, SuperOffice does not have to call an AI; it only concatenates fields.
3. **Consistency:** A "head of finance in mechanical engineering" always gets the same, polished argument, regardless of the company.

### The Email Formula

An email is assembled from **three static components** stored in the CRM (SuperOffice):

```text
[1. Opener (company-specific)] + [2. Bridge (persona x vertical)] + [3. Social Proof (vertical)]
```

* **1. Opener (the hook):** Refers 100% to the specific company and its business model.
  * *Source:* `Company` object (field: `ai_opener`).
  * *Example:* "The precise just-in-time production at **Müller CNC** requires a smooth material flow without micro-stoppages."
* **2. Bridge (the relevance):** Meets the person in their role and ties it to the industry pain.
  * *Source:* `Matrix` table (field: `intro`).
  * *Example:* "For you as **head of production**, this means keeping cycle times on the line stable despite the skilled-labor shortage."
* **3. Social Proof (the solution):** Shows references and the concrete benefit (gains).
  * *Source:* `Matrix` table (field: `social_proof`).
  * *Example:* "Companies like **Jungheinrich** use our transport robots to keep skilled workers at the machine and cut search times by 30%."
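The assembly itself is deliberately trivial, plain field concatenation (a minimal sketch; the example texts mirror the ones above):

```python
def compose_email(opener, bridge, proof):
    """Join the three precomputed building blocks into one body text."""
    return " ".join(part.strip() for part in (opener, bridge, proof) if part)

mail = compose_email(
    "The just-in-time production at Müller CNC requires a smooth material flow.",
    "For you as head of production, this means keeping cycle times stable.",
    "Companies like Jungheinrich cut search times by 30%.",
)
```

Because all three parts are plain text fields, this step can also run inside the CRM's own template merge without any AI call.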

---

## 2. The Data Foundation

The quality of the texts stands and falls with the data foundation. It is maintained centrally in **Notion** and synchronized into the Company Explorer.

### A. Verticals (Industries)
Defines an industry's **macro pains** and **gains** as well as the **matching product**.
* *Example:* Healthcare -> pain: "nurses doing logistics" -> gain: "hands back at the bedside" -> product: service robot.
* *Important:* The **ops focus** distinction (operations vs. infrastructure) controls the product (cleaning vs. service).

### B. Personas (Roles)
Defines a role's **personal pains**.
* *Example:* Head of production -> pain: "OEE / cycle time".
* *Example:* Managing director -> pain: "ROI / amortization".

---

## 3. The Matrix Engine (Multiplication)

The script `generate_matrix.py` (in the backend) is the heart of the system. It precomputes **all possible combinations** of verticals and personas.

**Logic:**
1. Load all verticals (`V`) and personas (`P`).
2. For every combination `V x P`:
   * Load `V.Pains` and `P.Pains`.
   * Generate via Gemini a **perfect sentence 2 (bridge)** and **sentence 3 (proof)**.
   * Generate a **subject** that hits the persona pain.
3. Store the result in the `marketing_matrix` table.

*Result:* A lookup table from which the right text can be pulled instantly for any contact.
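The precomputation loop can be sketched as follows; `generate_ai_texts` stands in for the Gemini call, and the sample pains are illustrative:

```python
from itertools import product

# Illustrative foundation data (normally synced from Notion).
verticals = [{"id": "healthcare", "pain": "nurses doing logistics"}]
personas = [{"id": "production_lead", "pain": "OEE / cycle time"}]

def generate_ai_texts(vertical, persona):
    # Placeholder for the LLM call that writes bridge, proof and subject.
    return {
        "bridge": f"As someone facing '{persona['pain']}', ...",
        "social_proof": f"References addressing '{vertical['pain']}' ...",
        "subject": persona["pain"],
    }

# One precomputed row per V x P combination, keyed for instant lookup.
marketing_matrix = {
    (v["id"], p["id"]): generate_ai_texts(v, p)
    for v, p in product(verticals, personas)
}
```

At send time the connector only performs a dictionary (or table) lookup; no generation happens on the hot path.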

---

## 4. The "Opener" (First Sentence)

This is the only building block generated **per company** (during analysis/discovery).

**Logic:**
1. Scrape the website content.
2. Identify the **vertical** (e.g. mechanical engineering).
3. Load the vertical's **core pain** (e.g. "material flow").
4. **Prompt:** "Analyze the business model of [company]. Write one sentence explaining why [core pain] is critical for exactly this business model."

*Result:* A sentence that proves: "I understood what you do."
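The prompt in step 4 is a simple template fill (a sketch; the function name is illustrative and the wording mirrors the template above):

```python
def build_opener_prompt(company, core_pain):
    """Fill the opener prompt template with the scraped company context."""
    return (
        f"Analyze the business model of {company}. "
        f"Write one sentence explaining why {core_pain} "
        f"is critical for exactly this business model."
    )

prompt = build_opener_prompt("Müller CNC", "material flow")
```

The resulting string is what gets sent to the model once per company during discovery/analysis.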

---

## 5. SuperOffice Connector ("The Muscle")

The connector is the messenger that carries these data into the CRM.

**Workflow:**
1. **Trigger:** Contact change in SuperOffice (webhook).
2. **Enrichment:** The connector asks the Company Explorer: "Give me data for company X, person Y."
3. **Lookup:** The Company Explorer...
   * fetches the `Opener` from the company table,
   * determines `Vertical` and `Persona`,
   * looks up the matching entry in the `MarketingMatrix`.
4. **Write-back:** The connector writes the texts into the contact's UDF fields (user-defined fields) in SuperOffice:
   * `UDF_Opener`
   * `UDF_Bridge`
   * `UDF_Proof`
   * `UDF_Subject`
   * `UDF_UnsubscribeLink`

---

### 6. Monitoring & Dashboard ("The Eyes")

The system includes an integrated real-time dashboard for monitoring the synchronization processes.

**Features:**
* **Account-based view:** Groups all events by SuperOffice account or person to show the current status per record.
* **Phase visualization:** Shows progress in four phases:
  1. **Received:** Webhook received successfully.
  2. **Enriching:** Data enrichment running in the Company Explorer (blinking yellow = in progress).
  3. **Syncing:** Writing the data back to SuperOffice (blinking yellow = in progress).
  4. **Completed:** Process finished successfully for this contact (green).
* **Performance tracking:** Shows the total duration per process.
* **Error analysis:** Detailed error messages directly in the overview.
* **Dark mode:** Modern UI design for admin monitoring.

**Access:**
The dashboard is reachable via the Company Explorer frontend (the "Activity" icon in the header) or directly at `/connector/dashboard`.

---

## 7. Setup & Maintenance

### Adding a New Industry
1. Create it in **Notion** (define pains/gains/products).
2. Run the sync script: `python3 backend/scripts/sync_notion_industries.py`.
3. Recompute the matrix: `python3 backend/scripts/generate_matrix.py --live`.

### End-to-End Tests
An automated integration test (`tests/test_e2e_flow.py`) covers the entire cycle:
1. **Company creation:** Webhook -> CE provisioning -> write-back (vertical).
2. **Person creation:** Webhook -> CE matrix lookup -> write-back (texts).
3. **Vertical change:** Change in the CRM -> CE update -> cascade to persons -> new texts.

Run it with:
```bash
python3 connector-superoffice/tests/test_e2e_flow.py
```

## 7. Troubleshooting & Known Issues

### Authentication "URL has an invalid label"
Occurs when `SO_ENVIRONMENT` is empty. The client now falls back to `sod` automatically.

### Pydantic V2 Compatibility
`config.py` was switched to plain Python (`os.getenv`) to avoid conflicts with `pydantic-settings` in Docker containers.

### Address & VAT Sync (WIP)
The worker was extended to also write back `City` and `OrgNumber` (VAT).
**Status (Feb 21, 2026):** Implemented, but still being polished. Logs occasionally show re-queueing while the enrichment is running.

### 8. Lessons Learned: Address & VAT Sync (Solved Feb 22, 2026)

Synchronizing address data turned out to be unexpectedly complex. The key insights for future developers:

1. **The "OrgNumber" phantom:**
   * **Problem:** Often referenced as `OrgNumber` in the API documentation, but the WebAPI (REST) strictly accepts only **`OrgNr`**.
   * **Solution:** The mapping in `worker.py` was hard-wired to `OrgNr`.

2. **Nested address structure:**
   * **Problem:** A flat update on `PostalAddress` is silently ignored by the API (HTTP 200, but no change).
   * **Solution:** The update must respect the deep structure: `Address["Postal"]["City"]` AND `Address["Street"]["City"]`. Both must be set explicitly for the change to become visible in the UI.

3. **The "race condition" trap (atomic updates):**
   * **Problem:** When address data (`PUT Contact`) and UDF data (`PUT Contact/Udef`) are sent in separate API calls shortly after one another, the last call wins. Since that call is often based on a *stale* `GET` state (taken before the first update went through), the address was overwritten with empty values again.
   * **Solution:** **Atomic update strategy.** The worker collects *all* changes (address, VAT, vertical, openers) in a single dictionary and sends exactly **one** `PUT` request to the contact endpoint. This guarantees consistency.
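A sketch of the atomic payload, assuming the field names from findings 1 and 2 (`OrgNr`, nested `Address`); the helper name and UDF values are illustrative:

```python
def build_atomic_payload(city, vat, udf_fields):
    """Merge address, VAT and UDF changes into one PUT body."""
    return {
        "OrgNr": vat,  # the WebAPI rejects the documented name "OrgNumber"
        "Address": {
            # Both branches must be set for the change to show up in the UI.
            "Postal": {"City": city},
            "Street": {"City": city},
        },
        "UserDefinedFields": udf_fields,
    }

payload = build_atomic_payload(
    "München", "DE123456789", {"UDF_Vertical": "Healthcare"}
)
# payload is sent as exactly one PUT to the contact endpoint
```

Sending one merged body removes the window in which a second, stale-read call could overwrite the first one.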

---

### 9. Lessons Learned: Appointment Simulation & Persona Matching

Simulating emails via appointments required workarounds for SuperOffice's UI behavior.

1. **Header vs. description (the 42-character limit):**
   * SuperOffice uses the `MainHeader` as the title in the calendar and in lists. It is limited to roughly 42 characters.
   * If the title is longer, SuperOffice truncates it. When the title looks inconsistent, the UI often "steals" the first line of the `Description` as a replacement title.
   * **Strategy:** We truncate the `MainHeader` to 40 characters and make sure the **full subject** is the very first line of the `Description`, followed by two newlines. This puts the subject into the UI header and keeps the salutation ("Hallo...") safely in the body.

2. **Mapping resilience (function vs. title):**
   * Job titles (functions) land inconsistently in the SuperOffice fields `JobTitle` or `Title`.
   * **Solution:** The worker now queries both fields (`person.get("JobTitle") or person.get("Title")`) to assign the role correctly.

3. **Role dynamics:**
   * To prevent old roles (e.g. "Infrastruktur") from "sticking" after a promotion or change in SuperOffice, the system now performs a **role reset** on every name or function change.
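The header/description strategy from item 1 above can be sketched as follows (the 40-character constant follows the limit described there; the function name is illustrative):

```python
MAIN_HEADER_LIMIT = 40  # stay under SuperOffice's ~42-character title cap

def build_appointment(subject, body):
    """Truncate the title and repeat the full subject as the first body line."""
    return {
        "MainHeader": subject[:MAIN_HEADER_LIMIT],
        # Full subject first, two newlines, then salutation and body text.
        "Description": f"{subject}\n\n{body}",
    }

appt = build_appointment("X" * 50, "Hallo Frau Beispiel, ...")
```

Even if the UI replaces the truncated `MainHeader` with the first `Description` line, the user still sees the complete subject, and the salutation stays in the body.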
|
||||
|
||||
---
|
||||
|
||||
### 10. Lessons Learned: API Optimization & Certification (Feb 24, 2026)

Critical performance optimizations were required to obtain certification for the SuperOffice App Store.

1. **The `getAllRows` trap:**
   * **Problem:** During validation, SuperOffice flagged API calls like `getAllRows` (often implicit when querying whole objects without a filter) that cause unnecessary load.
   * **Solution:** Implemented **OData `$select`**. We now strictly request only the fields we actually need (e.g. `get_person(id, select=['JobTitle', 'UserDefinedFields'])`).
   * **Important:** Never call a blanket `get_person()` when only the role needs to be checked.
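A minimal sketch of a query builder enforcing this rule, assuming the OData-style query strings used elsewhere in this document (`build_select_query` is a hypothetical helper):

```python
def build_select_query(entity, fields, extra=""):
    """Compose an OData query that requests only the needed fields."""
    query = f"{entity}?$select={','.join(fields)}"
    if extra:
        query += f"&{extra}"
    return query

# Check a person's role without fetching the whole entity:
endpoint = build_select_query("Person/193036", ["JobTitle", "UserDefinedFields"])
# -> "Person/193036?$select=JobTitle,UserDefinedFields"
```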
2. **PUT vs. PATCH (safe updates):**
   * **Problem:** Using `PUT` to update entities (Person/Contact) requires sending the *entire* object. This risks overwriting fields that other users changed in the meantime (a race condition) and causes unnecessary traffic.
   * **Solution:** Switched to **`PATCH`**. We now send only the *fields that actually changed* (the delta).
   * **Implementation:** The worker now builds a `patch_payload` (e.g. `{'Position': {'Id': 123}}`) and uses the dedicated PATCH endpoint. SuperOffice explicitly required this for certification.
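The delta computation can be sketched as follows, assuming payloads are plain dictionaries; the `{'Position': {'Id': 123}}` shape comes from the text above, everything else is illustrative:

```python
def build_patch_payload(original, desired):
    """Return only the fields whose values actually changed (the delta)."""
    return {k: v for k, v in desired.items() if original.get(k) != v}

current = {"Position": {"Id": 99}, "Name": "Muster GmbH"}
wanted = {"Position": {"Id": 123}, "Name": "Muster GmbH"}
patch = build_patch_payload(current, wanted)
# Only the changed field is sent to the PATCH endpoint.
```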
### 11. Production Environment (Live Feb 27, 2026)

After successful certification by SuperOffice, the connector was switched to the production environment.

* **Tenant:** `Cust26720`
* **Environment:** `online3` (previously `sod`)
* **Endpoint:** `https://online3.superoffice.com/Cust26720/api/v1`
* **Authentication:** Switch to the production client ID and secret verified successfully (health check OK).

**Important:** SuperOffice uses load balancing, so the subdomain (`online3`) can in principle change. The application checks this dynamically, but the base configuration should reflect the current tenant status.
### 12. Lessons Learned: Production Migration (Feb 27, 2026)

Moving from the staging environment (`sod`) to production (`onlineX`) came with specific technical hurdles:

1. **Global token endpoint:**
   * **Problem:** Tenant-specific subdomains (such as `online3.superoffice.com`) often reject OAuth requests or return empty responses.
   * **Solution:** Token refreshes must always use the global endpoint **`https://online.superoffice.com/login/common/oauth/tokens`**, regardless of which cluster the tenant lives on.

2. **DNS prefixes (app- vs. direct):**
   * **Problem:** In staging, the API host is usually `app-sod.superoffice.com`. In production, the `app-` prefix is often not used or leads to certificate errors.
   * **Solution:** The `SuperOfficeClient` was made flexible enough to hit `{env}.superoffice.com` directly in production.

3. **Environment variable persistence:**
   * **Problem:** Docker containers often keep environment variables cached ("shadow configuration"), even after the `.env` file has changed.
   * **Solution:** Always run `docker-compose up -d --force-recreate` after changing credentials.
### 13. Post-Migration Configuration (Cust26720)

The configuration in the `.env` file was finalized for production as follows:

| Function | UDF / ID | Entity |
| :--- | :--- | :--- |
| **Subject** | `SuperOffice:19` | Person |
| **Intro Text** | `SuperOffice:20` | Person |
| **Social Proof** | `SuperOffice:21` | Person |
| **Unsubscribe** | `SuperOffice:22` | Person |
| **Campaign Tag** | `SuperOffice:23` | Person |
| **Opener Primary** | `SuperOffice:86` | Contact |
| **Opener Sec.** | `SuperOffice:87` | Contact |
| **Vertical** | `SuperOffice:83` | Contact |
| **Summary** | `SuperOffice:84` | Contact |
| **Last Update** | `SuperOffice:85` | Contact |
### 14. Campaign Control (Usage)

The system supports multiple outreach variants via the **`MA_Campaign`** field (Person).

1. **Standard:** If the field is left empty, the default ("standard") cold-outreach texts are loaded.
2. **Specific:** If a value is set (e.g. "Messe 2026"), the connector looks specifically for matrix entries with that tag.
3. **Fallback:** If the chosen campaign has no specific text for the vertical/persona, the system automatically falls back to "standard".
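The lookup order above can be sketched like this; the `(campaign, vertical, persona)`-keyed matrix is an assumed storage format, not the real one:

```python
def resolve_campaign_text(matrix, vertical, persona, campaign):
    """Look up outreach text, falling back to the 'standard' campaign."""
    tag = campaign or "standard"  # empty field -> standard texts
    return (matrix.get((tag, vertical, persona))
            or matrix.get(("standard", vertical, persona)))

matrix = {
    ("standard", "Logistics - Warehouse", "Operativer Entscheider"): "Standard-Text",
    ("Messe 2026", "Healthcare - Hospital", "Influencer"): "Messe-Text",
}
# Specific campaign entry exists -> use it:
r_specific = resolve_campaign_text(matrix, "Healthcare - Hospital", "Influencer", "Messe 2026")
# No "Messe 2026" entry for this vertical/persona -> fall back to standard:
r_fallback = resolve_campaign_text(matrix, "Logistics - Warehouse", "Operativer Entscheider", "Messe 2026")
```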
### 16. Email Sending Implementation (Feb 28, 2026)

A dedicated script `create_email_test.py` has been implemented to create "Email Documents" directly in SuperOffice via the API. This bypasses the need for an external SMTP server by utilizing SuperOffice's internal document system.

**Features:**
* **Document Creation:** Creates a document of type "Ausg. E-Mail" (Template ID 157).
* **Activity Tracking:** Automatically creates a linked "Appointment" (Task ID 6 - Document Out) to ensure the email appears in the contact's activity timeline.
* **Direct Link:** Outputs a direct URL to open the created document in SuperOffice Online.

**Usage:**
```bash
python3 connector-superoffice/create_email_test.py <PersonID>
# Example:
python3 connector-superoffice/create_email_test.py 193036
```

**Key API Endpoints Used:**
* `POST /Document`: Creates the email body and metadata.
* `POST /Appointment`: Creates the activity record linked to the document.
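As a hedged sketch of the two payloads: only Template ID 157 and Task ID 6 come from the findings above — the field names are assumptions, not the verified SuperOffice schema:

```python
def build_email_document(person_id, contact_id, subject):
    """Hypothetical POST /Document body for an outgoing email document."""
    return {
        "DocumentTemplate": {"DocumentTemplateId": 157},  # "Ausg. E-Mail"
        "Person": {"PersonId": person_id},
        "Contact": {"ContactId": contact_id},
        "Header": subject,
    }

def build_activity(document_id, person_id):
    """Hypothetical POST /Appointment body linking the document."""
    return {
        "Task": {"Id": 6},  # Document Out -> shows in the activity timeline
        "Document": {"DocumentId": document_id},
        "Person": {"PersonId": person_id},
    }

doc = build_email_document(193036, 555, "Robotik-Potenziale")
act = build_activity(1001, 193036)
```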
### 17. Sales & Opportunities Implementation (Feb 28, 2026)

We have successfully prototyped the creation of "Sales" (Opportunities) via the API. This allows the AI to not only enrich contacts but also create tangible pipeline objects for sales representatives.

**Prototype Script:** `connector-superoffice/create_sale_test.py`

**Key Findings (Roboplanet Specific):**
1. **Authentication:** Must use `load_dotenv(override=True)` to ensure production credentials are used (see GEMINI.md).
2. **Sale Type (CRITICAL):** Use `SaleType: { "Id": 14 }` (**Roboplanet Verkauf**) to separate from Wackler parent data.
3. **Mandatory Fields:**
   * `Heading`: Title of the opportunity.
   * `Saledate`: Estimated closing date (ISO format).
   * `Amount` & `Currency`: Use `Currency: { "Id": 33 }` (EUR).
   * `Stage`: `{ "Id": 10 }` (5% Prospect).
   * `Contact`: `{ "ContactId": 123 }`.
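Putting the documented IDs together, a sale payload might look like this (the heading, amount, and date are placeholder values):

```python
# Sketch of the mandatory payload using the IDs documented above.
sale_payload = {
    "Heading": "2x CC1 Pro - Pilotprojekt",   # title of the opportunity
    "Saledate": "2026-06-30",                 # estimated closing date, ISO format
    "Amount": 24000.0,
    "Currency": {"Id": 33},                   # EUR
    "SaleType": {"Id": 14},                   # Roboplanet Verkauf
    "Stage": {"Id": 10},                      # 5% Prospect
    "Contact": {"ContactId": 123},
}
```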
**Usage:**

```bash
python3 connector-superoffice/create_sale_test.py
```

This script finds the first available contact in the CRM and creates a test opportunity for them.
### 18. Service & Request Management (Tickets)

The Service module handles incoming requests and support cases. For Roboplanet, specific categories are used to manage leads and partner interactions.

**Key Categories (Roboplanet):**
* **Lead Roboplanet (ID 46):** Incoming potential leads.
* **Vertriebspartner Roboplanet (ID 47):** Communication with partners.
* **Weitergabe Roboplanet (ID 48):** Internal handovers.

**Data Structure (Entity: `ticket`):**
Tickets are linked via `contactId` and `personId`. They can also be associated with a `saleId` to provide a 360-degree view of a deal, including its technical support questions.
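As a sketch, a ticket linking all three entities might look like this (field names follow the description above; the IDs other than category 46 are placeholders):

```python
# Illustrative linkage only — not a verified Service-API payload.
ticket = {
    "category": {"id": 46},   # Lead Roboplanet
    "contactId": 987,
    "personId": 193036,
    "saleId": 342243,         # optional: ties the request to a deal
    "title": "Anfrage Reinigungsroboter",
}
```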
### 19. Product Catalog & Families

SuperOffice contains a shared product catalog. Roboplanet's hardware portfolio is mapped directly to **Product Families**.

**Key Product Families (Robot Models):**
* **Cleaning:**
  * `CC1` (ID 17), `CC1 Pro` (ID 34)
  * `Scrubber 50` (ID 24), `Scrubber 75` (ID 25)
  * `Nexaro 1500` (ID 26)
  * `Phantas` (ID 23)
* **Service/Transport:**
  * `BellaBot` (ID 18), `BellaBot Pro` (ID 19)
  * `KettyBot Pro` (ID 22)
  * `HolaBot` (ID 29)
  * `FlashBot` (ID 21)

**Usage:** When creating Quote Lines (`Quote/Line`), linking to these families helps in reporting pipeline value by robot model.
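The family IDs above can be kept in a small lookup table when building Quote Lines (the helper name is ours):

```python
# IDs taken from the product family list above.
PRODUCT_FAMILIES = {
    "CC1": 17, "CC1 Pro": 34, "Scrubber 50": 24, "Scrubber 75": 25,
    "Nexaro 1500": 26, "Phantas": 23,
    "BellaBot": 18, "BellaBot Pro": 19, "KettyBot Pro": 22,
    "HolaBot": 29, "FlashBot": 21,
}

def family_id(model_name):
    """Resolve a robot model to its ProductFamily ID, or None if unknown."""
    return PRODUCT_FAMILIES.get(model_name)
```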
### 20. Selections & Target Lists

Selections are the primary tool for grouping contacts in SuperOffice (e.g. for mailings or call lists).

* **API Access:** `/Selection` endpoints allow reading and adding members.
* **Strategy:** Instead of just tagging contacts via UDFs, adding them to a specific **Static Selection** (e.g., "AI Generated Leads - Review Pending") makes them immediately actionable for sales reps in their "Selections" tab.

---
This is the core logic used to generate the company-specific opener.

**Goal:** Prove understanding of the business model + imply the pain (positive observation).
```text
Du bist ein exzellenter B2B-Stratege und Texter mit einem tiefen Verständnis für operative Prozesse.
Deine Aufgabe ist es, einen hochpersonalisierten, scharfsinnigen und wertschätzenden Einleitungssatz für eine E-Mail an ein potenzielles Kundenunternehmen zu formulieren.

--- Denkprozess & Stilvorgaben ---
1. **Analysiere den Kontext:** Verstehe das Kerngeschäft. Was ist die kritische, physische Tätigkeit vor Ort? (z.B. 'Betrieb von Hochregallagern', 'Pflege von Patienten').
2. **Identifiziere den Hebel:** Was ist der Erfolgsfaktor? (z.B. 'reibungslose Abläufe', 'maximale Hygiene').
3. **Formuliere den Satz (ca. 20-35 Wörter):**
   - Wähle einen eleganten, aktiven Einstieg wie 'Speziell im Bereich...' oder 'Der reibungslose Betrieb...'.
   - Verbinde die **spezifische Tätigkeit** mit dem **Hebel** und den **geschäftlichen Konsequenzen**.
   - **WICHTIG:** Formuliere immer als positive Beobachtung über eine Kernkompetenz. Du implizierst die Herausforderung durch die Betonung der Wichtigkeit.
   - **VERMEIDE:** Konkrete Zahlen (z.B. "35 Rutschen"), da diese veraltet sein können. Nutze abstrakte Größen ("weitläufige Anlagen").
```
# SuperOffice Connector README

## Overview
This directory contains Python scripts designed to integrate with the SuperOffice CRM API, primarily for data extraction and analysis related to sales and customer product information.

## Authentication
Authentication is handled via the `AuthHandler` class, which uses a refresh-token flow to obtain access tokens. Ensure that the `.env` file in the project root is correctly configured with `SO_CLIENT_ID`, `SO_CLIENT_SECRET`, `SO_REFRESH_TOKEN`, `SO_REDIRECT_URI`, `SO_ENVIRONMENT`, and `SO_CONTEXT_IDENTIFIER`.

## Key SuperOffice Entities and Data Model Observations
During the development of reporting functionality, the following observations were made about the SuperOffice data model:
### 1. Sale Entity
- **Primary Source for Product Information:** Contrary to initial expectations, product information is frequently stored as free text within the `Heading` field of the `Sale` object (e.g., `"2xOmnie CD-01 mit Nachlass"`). This appears to be common practice in the system, rather than utilizing structured product catalog items linked via quotes.
- **Contact Association:** A significant number of `Sale` objects are not directly linked to a `Contact` object (`Contact: null`), making it challenging to attribute sales to specific customers programmatically. Our reporting scripts therefore filter for sales where a `Contact` is present (`Sale?$filter=Contact ne null`).
- **No Direct Quote/QuoteLine Linkage:** Attempts to retrieve `Quote` or `QuoteLine` objects directly via `Sale/{saleId}/Quotes`, `Contact/{contactId}/Quotes`, or `Sale/{saleId}/Activities` resulted in `500 Internal Server Errors` or empty result sets. This indicates that direct, API-accessible linkages between `Sale` objects and structured `QuoteLine` data are often absent or not exposed via these endpoints.
### 2. Product Information Storage (Hypothesis & Workaround)
- **Free Text in Heading:** The primary source for identifying products associated with a sale is the `Heading` field of the `Sale` entity itself. This field often contains product codes, descriptions, and other relevant details as free text.
- **User-Defined Fields (UDFs):** While `UserDefinedFields` were inspected for structured product data (e.g., `RR-02-017-OMNIE`), no such patterns were found in the `sale_id=342243` example. This suggests that UDFs are either not consistently used for product codes or are named in a way that doesn't align with common product terminology.
## Scripts

### `list_products.py`
- **Purpose:** Fetches and displays a list of all defined product families from the SuperOffice product catalog (`/List/ProductFamily/Items`).
- **Usage:** `python3 list_products.py`

### `generate_customer_product_report.py`
- **Purpose:** Generates a CSV report of customer sales, extracting product information from the `Sale.Heading` field using keyword matching.
- **Methodology:**
  1. Retrieves the latest `SALE_LIMIT` (e.g., 1000) `Sale` objects, filtering only those with an associated `Contact` (`$filter=Contact ne null`).
  2. Extracts `SaleId`, `CustomerName`, and `SaleHeading` for each relevant sale.
  3. Searches the `SaleHeading` for predefined `PRODUCT_KEYWORDS` (e.g., `OMNIE`, `CD-01`, `Service`).
  4. Outputs the results to `product_report.csv`.
- **Usage:** `python3 generate_customer_product_report.py`
## Future Work
- **Analyze the empty `product_report.csv`:** Investigate why `product_report.csv` stays empty even after filtering for `Sale` objects with a `Contact` link. It is crucial to determine whether no such sales exist or whether there is a problem with the data query or processing.
- **Manually inspect filtered `Sale` objects:** If the report stays empty, we need to manually inspect some `Sale` objects that satisfy `Contact ne null` to understand their structure and determine whether the `Heading` field or other fields contain product information.
- **Refine `PRODUCT_KEYWORDS`:** The keyword list may need to be extended based on a more detailed manual analysis of the sale headings.
- **Explore alternative API paths:** If the current approach keeps failing, we need to dig deeper into the SuperOffice API to find structured product data, even if it is not directly linked to the sales.
126  connector-superoffice/generate_customer_product_report.py  Normal file
@@ -0,0 +1,126 @@
import os
import csv
import logging
import requests
import json
from dotenv import load_dotenv

# --- Configuration ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("customer_product_report")
OUTPUT_FILE = 'product_report.csv'
SALE_LIMIT = 1000  # Process the top 1000 most recently updated sales
PRODUCT_KEYWORDS = [
    'OMNIE', 'CD-01', 'RR-02-017', 'Service', 'Dienstleistung',
    'Wartung', 'Support', 'Installation', 'Beratung'
]

# --- Auth & API Client Classes (from previous scripts) ---
class AuthHandler:
    def __init__(self):
        load_dotenv(override=True)
        self.client_id = os.getenv("SO_CLIENT_ID") or os.getenv("SO_SOD")
        self.client_secret = os.getenv("SO_CLIENT_SECRET")
        self.refresh_token = os.getenv("SO_REFRESH_TOKEN")
        self.redirect_uri = os.getenv("SO_REDIRECT_URI", "http://localhost")
        self.env = os.getenv("SO_ENVIRONMENT", "sod")
        self.cust_id = os.getenv("SO_CONTEXT_IDENTIFIER", "Cust55774")
        if not all([self.client_id, self.client_secret, self.refresh_token]):
            raise ValueError("SuperOffice credentials missing in .env file.")

    def get_access_token(self):
        return self._refresh_access_token()

    def _refresh_access_token(self):
        token_domain = "online.superoffice.com" if "online" in self.env.lower() else "sod.superoffice.com"
        url = f"https://{token_domain}/login/common/oauth/tokens"
        data = {"grant_type": "refresh_token", "client_id": self.client_id, "client_secret": self.client_secret, "refresh_token": self.refresh_token, "redirect_uri": self.redirect_uri}
        try:
            resp = requests.post(url, data=data)
            resp.raise_for_status()
            return resp.json().get("access_token")
        except requests.RequestException as e:
            logger.error(f"❌ Connection Error during token refresh: {e}")
            return None

class SuperOfficeClient:
    def __init__(self, auth_handler):
        self.auth_handler = auth_handler
        self.base_url = f"https://{self.auth_handler.env}.superoffice.com/{self.auth_handler.cust_id}/api/v1"
        self.access_token = self.auth_handler.get_access_token()
        if not self.access_token:
            raise Exception("Failed to obtain access token.")
        self.headers = {"Authorization": f"Bearer {self.access_token}", "Content-Type": "application/json", "Accept": "application/json"}

    def _get(self, endpoint):
        url = f"{self.base_url}/{endpoint}"
        logger.debug(f"GET: {url}")
        resp = requests.get(url, headers=self.headers)
        if resp.status_code == 204:
            return None
        resp.raise_for_status()
        return resp.json()

def find_keywords(text):
    """Searches for keywords in a given text, case-insensitively."""
    found = []
    if not text:
        return found
    text_lower = text.lower()
    for keyword in PRODUCT_KEYWORDS:
        if keyword.lower() in text_lower:
            found.append(keyword)
    return found

def main():
    logger.info("--- Starting Customer Product Report Generation ---")

    try:
        auth = AuthHandler()
        client = SuperOfficeClient(auth)

        # 1. Fetch the most recently updated sales
        logger.info(f"Fetching the last {SALE_LIMIT} updated sales...")
        # OData query to get the top N sales that have a contact associated
        sales_endpoint = f"Sale?$filter=Contact ne null&$orderby=saleId desc&$top={SALE_LIMIT}&$select=SaleId,Heading,Contact"
        sales_response = client._get(sales_endpoint)

        if not sales_response or 'value' not in sales_response:
            logger.warning("No sales with associated contacts found.")
            return

        sales = sales_response['value']
        logger.info(f"Found {len(sales)} sales with associated contacts to process.")
        # Removed the debug log to avoid excessive output of the same data

        # 2. Process each sale and write to CSV
        with open(OUTPUT_FILE, 'w', newline='', encoding='utf-8') as csvfile:
            fieldnames = ['SaleID', 'CustomerName', 'SaleHeading', 'DetectedKeywords']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()

            for sale in sales:
                if not sale.get('Contact') or not sale['Contact'].get('ContactId'):
                    logger.warning(f"Skipping Sale {sale.get('SaleId')} because it has no linked Contact.")
                    continue

                sale_id = sale.get('SaleId')
                heading = sale.get('Heading', 'N/A')
                customer_name = sale['Contact'].get('Name', 'N/A')

                # Find keywords in the heading
                keywords_found = find_keywords(heading)

                writer.writerow({
                    'SaleID': sale_id,
                    'CustomerName': customer_name,
                    'SaleHeading': heading,
                    'DetectedKeywords': ', '.join(keywords_found) if keywords_found else 'None'
                })

        logger.info("--- ✅ Report generation complete. ---")
        logger.info(f"Results saved to '{OUTPUT_FILE}'.")

    except requests.exceptions.HTTPError as e:
        logger.error(f"❌ API Error: {e.response.status_code} - {e.response.text}")
    except Exception as e:
        logger.error(f"An unexpected error occurred: {e}", exc_info=True)

if __name__ == "__main__":
    main()
122  connector-superoffice/list_products.py  Normal file
@@ -0,0 +1,122 @@
import os
import sys
import json
import logging
import requests
from dotenv import load_dotenv

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("list_products")

# --- Inline AuthHandler & SuperOfficeClient (Proven Working Logic) ---

class AuthHandler:
    def __init__(self):
        # CRITICAL: override=True ensures we read from .env even if env vars are already set
        load_dotenv(override=True)

        self.client_id = os.getenv("SO_CLIENT_ID") or os.getenv("SO_SOD")
        self.client_secret = os.getenv("SO_CLIENT_SECRET")
        self.refresh_token = os.getenv("SO_REFRESH_TOKEN")
        self.redirect_uri = os.getenv("SO_REDIRECT_URI", "http://localhost")
        self.env = os.getenv("SO_ENVIRONMENT", "sod")
        self.cust_id = os.getenv("SO_CONTEXT_IDENTIFIER", "Cust55774")

        if not all([self.client_id, self.client_secret, self.refresh_token]):
            raise ValueError("SuperOffice credentials missing in .env file.")

    def get_access_token(self):
        return self._refresh_access_token()

    def _refresh_access_token(self):
        token_domain = "online.superoffice.com" if "online" in self.env.lower() else "sod.superoffice.com"
        url = f"https://{token_domain}/login/common/oauth/tokens"

        data = {
            "grant_type": "refresh_token",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "refresh_token": self.refresh_token,
            "redirect_uri": self.redirect_uri
        }
        try:
            resp = requests.post(url, data=data)
            if resp.status_code != 200:
                logger.error(f"❌ Token Refresh Failed (Status {resp.status_code}): {resp.text}")
                return None
            return resp.json().get("access_token")
        except Exception as e:
            logger.error(f"❌ Connection Error during token refresh: {e}")
            return None

class SuperOfficeClient:
    def __init__(self, auth_handler):
        self.auth_handler = auth_handler
        self.env = auth_handler.env
        self.cust_id = auth_handler.cust_id
        self.base_url = f"https://{self.env}.superoffice.com/{self.cust_id}/api/v1"
        self.access_token = self.auth_handler.get_access_token()

        if not self.access_token:
            raise Exception("Failed to obtain access token.")

        self.headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Content-Type": "application/json",
            "Accept": "application/json"
        }

    def _get(self, endpoint):
        url = f"{self.base_url}/{endpoint}"
        logger.info(f"Attempting to GET: {url}")
        resp = requests.get(url, headers=self.headers)
        resp.raise_for_status()
        return resp.json()

def main():
    try:
        # Initialize Auth and Client
        auth = AuthHandler()
        client = SuperOfficeClient(auth)

        print("\n--- 1. Fetching Product Information ---")

        product_families = []
        endpoint_to_try = "List/ProductFamily/Items"

        try:
            product_families = client._get(endpoint_to_try)
        except requests.exceptions.HTTPError as e:
            logger.error(f"Failed to fetch from '{endpoint_to_try}': {e}")
            print(f"Could not find Product Families at '{endpoint_to_try}'. This might not be the correct endpoint or list name.")
            # If the first endpoint fails, you could try others here.
            # For now, we will exit if this one fails.
            return

        if not product_families:
            print("No product families found or the endpoint returned an empty list.")
            return

        print("\n--- ✅ SUCCESS: Found Product Families ---")
        print("-----------------------------------------")
        print(f"{'ID':<10} | {'Name':<30} | {'Tooltip':<40}")
        print("-----------------------------------------")

        for product in product_families:
            product_id = product.get('Id', 'N/A')
            name = product.get('Name', 'N/A')
            tooltip = product.get('Tooltip', 'N/A')
            print(f"{str(product_id):<10} | {name:<30} | {tooltip:<40}")

        print("-----------------------------------------")

    except requests.exceptions.HTTPError as e:
        logger.error(f"❌ API Error: {e}")
        if e.response is not None:
            logger.error(f"Response Body: {e.response.text}")
    except Exception as e:
        logger.error(f"Fatal Error: {e}")

if __name__ == "__main__":
    main()
@@ -244,6 +244,27 @@ services:
      DB_PATH: "/app/connector_queue.db"
      COMPANY_EXPLORER_URL: "http://company-explorer:8000"

  lead-engine:
    build:
      context: ./lead-engine
      dockerfile: Dockerfile
    container_name: lead-engine
    restart: unless-stopped
    ports:
      - "8501:8501"
    env_file:
      - .env
    environment:
      PYTHONUNBUFFERED: "1"
      COMPANY_EXPLORER_URL: "http://company-explorer:8000"
      # Explicitly pass keys to ensure availability
      SERP_API: "${SERP_API}"
      GEMINI_API_KEY: "${GEMINI_API_KEY}"
    volumes:
      - ./lead-engine:/app
      # We need to mount the root connector module so it can be imported inside the container
      - ./company_explorer_connector.py:/app/company_explorer_connector.py

  # --- INFRASTRUCTURE SERVICES ---
  duckdns:
    image: lscr.io/linuxserver/duckdns:latest
31  env_old3  Normal file
@@ -0,0 +1,31 @@
# Sensitive environment variables such as API keys can be stored in this file.
# It is loaded by the application but not checked into Git.

GEMINI_API_KEY="AIzaSyBNg5yQ-dezfDs6j9DGn8qJ8SImNCGm9Ds"
GITEA_TOKEN="318c736205934dd066b6bbcb1d732931eaa7c8c4"
GITEA_USER="Floke"
NOTION_API_KEY="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"
SO_SOD="f8f918c67fc6bcd59b4a53707a6662a0"
SO_STAGE="e913252ce3fb6d8421df5893edf0973c"
SO_PRODUCTION="0fd8272803551846f7212a961a1a0046"
SO_CLIENT_SECRET="418c424681944ad4138788692dfd7ab2"
SO_REFRESH_TOKEN='1x4vJ2fL0Hje8s5RKHnyxtzRpmsJBE2Bf0MO1TPM9OuyHA7OTCKx9kdmkQCzKHHF'
SO_REFRESH_TOKEN_PROD="1x4vJ2fL0Hje8s5RKHnyxtzRpmsJBE2Bf0MO1TPM9OuyHA7OTCKx9kdmkQCzKHHF"
SO_SYSTEM_USER_TOKEN=""
SO_CONTEXT_IDENTIFIER='Cust26720'
SO_PRIVATE_KEY="MIICeAIBADANBgkqhkiG9w0BAQEFAASCAmIwggJeAgEAAoGBANz5YWSoodUvQCprDnJz7kuhXz8mHSoOpbQlMqbeBDotvvqDOTtumBcTgwbUBzvlJrBKDXM+l9gOQRPZL+MvF8r/oQ8UKx7Mmr65KtmJ+TH/wRQKrmLkaF+Rbbx+obfspZXwSULN8BPZvzvCyh6JihOR14mlf0DA0S6GHgMM0MHBAgMBAAECgYEAi8TdWprjSgHKF0qB59j2WDYpFbtY5RpAq3J/2FZD3DzFOJU55SKt5qK71NzV+oeV8hnU6hkkWE+j0BcnGA7Yf6xGIoVNVhrenU18hrd6vSUPDeOuerkv+u98pNEqs6jcfYwhKKEJ2nFl4AacdQ7RaQPEWb41pVYvP+qaX6PeQAECQQDx8ZGLzjp9EJtzyhpQafDV1HYFc6inPF8Ax4oQ4JR5E/9iPWRRxb2TjNR5oVza6md8B5RtffwGTlVbl50IMiMBAkEA6c/usvg8/4quH8Z70tSotmN+N6UxiuaTF51oOeTnIVUjXMqB3gc5sRCbipGj1u+DJUYh4LQLZp+W2LU7uCpewQJBAMtqvGFcIebW2KxwptEnUVqnCBerV4hMBOBF5DouaAaonpa9YSQzaiGtTVN6LPTOEfXA9bVdMFEo+TFJ9rhWVwECQQDJz37xnRBRZWsL5C8GeCWzX8cW0pAjmwdFL8lBh1D0VV8zfVuAv+3M5k/K2BB5ubwR1SnyoJTinEcAf9WvDWtBAkBVfhJHFVDXfR6cCrD0zQ3KX7zvm+aFzpxuwlBDcT98mNC+QHwSCPEGnolVN5jVTmBrnoe/OeCiaTffmkDqCWLQ"
SO_CLIENT_ID='0fd8272803551846f7212a961a1a0046'
SO_ENVIRONMENT='online3'
UDF_SUBJECT='SuperOffice:19'
UDF_INTRO='SuperOffice:20'
UDF_SOCIAL_PROOF='SuperOffice:21'
UDF_OPENER='SuperOffice:86'
UDF_OPENER_SECONDARY='SuperOffice:87'
UDF_VERTICAL='SuperOffice:83'
UDF_CAMPAIGN='SuperOffice:23'
UDF_UNSUBSCRIBE_LINK='SuperOffice:22'
PERSONA_MAP_JSON='{"Wirtschaftlicher Entscheider": 54, "Operativer Entscheider": 55, "Infrastruktur-Verantwortlicher": 56, "Innovations-Treiber": 57, "Influencer": 58}'
VERTICAL_MAP_JSON='{"Automotive - Dealer": 1613, "Corporate - Campus": 1614, "Energy - Grid & Utilities": 1615, "Energy - Solar/Wind": 1616, "Healthcare - Care Home": 1617, "Healthcare - Hospital": 1618, "Hospitality - Gastronomy": 1619, "Hospitality - Hotel": 1620, "Industry - Manufacturing": 1621, "Infrastructure - Communities": 1622, "Infrastructure - Parking": 1625, "Infrastructure - Public": 1623, "Infrastructure - Transport": 1624, "Leisure - Entertainment": 1626, "Leisure - Fitness": 1627, "Leisure - Indoor Active": 1628, "Leisure - Outdoor Park": 1629, "Leisure - Wet & Spa": 1630, "Logistics - Warehouse": 1631, "Others": 1632, "Reinigungsdienstleister": 1633, "Retail - Food": 1634, "Retail - Non-Food": 1635, "Retail - Shopping Center": 1636, "Tech - Data Center": 1637}'
UDF_SUMMARY='SuperOffice:84'
UDF_LAST_UPDATE='SuperOffice:85'
UDF_LAST_OUTREACH='SuperOffice:88'
34  env_old4  Normal file
@@ -0,0 +1,34 @@
|
||||
# In diese Datei können sensible Umgebungsvariablen wie API-Schlüssel eingetragen werden.
|
||||
# Sie wird von der Anwendung geladen, aber nicht in Git eingecheckt.
|
||||
|
||||
GEMINI_API_KEY="AIzaSyBNg5yQ-dezfDs6j9DGn8qJ8SImNCGm9Ds"
|
||||
GITEA_TOKEN="318c736205934dd066b6bbcb1d732931eaa7c8c4"
|
||||
GITEA_USER="Floke"
NOTION_API_KEY="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"
SO_SOD="f8f918c67fc6bcd59b4a53707a6662a0"
SO_STAGE="e913252ce3fb6d8421df5893edf0973c"
SO_PRODUCTION="0fd8272803551846f7212a961a1a0046"
SO_CLIENT_SECRET="418c424681944ad4138788692dfd7ab2"
SO_REFRESH_TOKEN='1x4vJ2fL0Hje8s5RKHnyxtzRpmsJBE2Bf0MO1TPM9OuyHA7OTCKx9kdmkQCzKHHF'
SO_REFRESH_TOKEN_PROD="1x4vJ2fL0Hje8s5RKHnyxtzRpmsJBE2Bf0MO1TPM9OuyHA7OTCKx9kdmkQCzKHHF"
SO_SYSTEM_USER_TOKEN=""
SO_CONTEXT_IDENTIFIER='Cust26720'
SO_PRIVATE_KEY="MIICeAIBADANBgkqhkiG9w0BAQEFAASCAmIwggJeAgEAAoGBANz5YWSoodUvQCprDnJz7kuhXz8mHSoOpbQlMqbeBDotvvqDOTtumBcTgwbUBzvlJrBKDXM+l9gOQRPZL+MvF8r/oQ8UKx7Mmr65KtmJ+TH/wRQKrmLkaF+Rbbx+obfspZXwSULN8BPZvzvCyh6JihOR14mlf0DA0S6GHgMM0MHBAgMBAAECgYEAi8TdWprjSgHKF0qB59j2WDYpFbtY5RpAq3J/2FZD3DzFOJU55SKt5qK71NzV+oeV8hnU6hkkWE+j0BcnGA7Yf6xGIoVNVhrenU18hrd6vSUPDeOuerkv+u98pNEqs6jcfYwhKKEJ2nFl4AacdQ7RaQPEWb41pVYvP+qaX6PeQAECQQDx8ZGLzjp9EJtzyhpQafDV1HYFc6inPF8Ax4oQ4JR5E/9iPWRRxb2TjNR5oVza6md8B5RtffwGTlVbl50IMiMBAkEA6c/usvg8/4quH8Z70tSotmN+N6UxiuaTF51oOeTnIVUjXMqB3gc5sRCbipGj1u+DJUYh4LQLZp+W2LU7uCpewQJBAMtqvGFcIebW2KxwptEnUVqnCBerV4hMBOBF5DouaAaonpa9YSQzaiGtTVN6LPTOEfXA9bVdMFEo+TFJ9rhWVwECQQDJz37xnRBRZWsL5C8GeCWzX8cW0pAjmwdFL8lBh1D0VV8zfVuAv+3M5k/K2BB5ubwR1SnyoJTinEcAf9WvDWtBAkBVfhJHFVDXfR6cCrD0zQ3KX7zvm+aFzpxuwlBDcT98mNC+QHwSCPEGnolVN5jVTmBrnoe/OeCiaTffmkDqCWLQ"
SO_CLIENT_ID='0fd8272803551846f7212a961a1a0046'
SO_ENVIRONMENT='online3'
UDF_SUBJECT='SuperOffice:19'
UDF_INTRO='SuperOffice:20'
UDF_SOCIAL_PROOF='SuperOffice:21'
UDF_OPENER='SuperOffice:86'
UDF_OPENER_SECONDARY='SuperOffice:87'
UDF_VERTICAL='SuperOffice:83'
UDF_CAMPAIGN='SuperOffice:23'
UDF_UNSUBSCRIBE_LINK='SuperOffice:22'
PERSONA_MAP_JSON='{"Wirtschaftlicher Entscheider": 54, "Operativer Entscheider": 55, "Infrastruktur-Verantwortlicher": 56, "Innovations-Treiber": 57, "Influencer": 58}'
VERTICAL_MAP_JSON='{"Automotive - Dealer": 1613, "Corporate - Campus": 1614, "Energy - Grid & Utilities": 1615, "Energy - Solar/Wind": 1616, "Healthcare - Care Home": 1617, "Healthcare - Hospital": 1618, "Hospitality - Gastronomy": 1619, "Hospitality - Hotel": 1620, "Industry - Manufacturing": 1621, "Infrastructure - Communities": 1622, "Infrastructure - Parking": 1625, "Infrastructure - Public": 1623, "Infrastructure - Transport": 1624, "Leisure - Entertainment": 1626, "Leisure - Fitness": 1627, "Leisure - Indoor Active": 1628, "Leisure - Outdoor Park": 1629, "Leisure - Wet & Spa": 1630, "Logistics - Warehouse": 1631, "Others": 1632, "Reinigungsdienstleister": 1633, "Retail - Food": 1634, "Retail - Non-Food": 1635, "Retail - Shopping Center": 1636, "Tech - Data Center": 1637}'
UDF_SUMMARY='SuperOffice:84'
UDF_LAST_UPDATE='SuperOffice:85'
UDF_LAST_OUTREACH='SuperOffice:88'
INFO_Application_ID="68439166-5f50-477a-ab20-4b6d4585c0a7"
INFO_Tenant_ID="6d85a9ef-3878-420b-8f43-38d6cb12b665"
INFO_Secret="dlm8Q~KH5IzjChljiexb7NfSohp3M_~AuR8QqcXi"
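Two of the variables above (`PERSONA_MAP_JSON`, `VERTICAL_MAP_JSON`) carry JSON payloads inside environment strings. A minimal sketch of reading such values defensively at startup; `load_json_env` is a hypothetical helper, not part of this repo, and assumes the `.env` has already been loaded into the process environment (e.g. via `python-dotenv`):

```python
import json
import os

def load_json_env(name, default=None):
    """Parses a JSON-valued environment variable (e.g. PERSONA_MAP_JSON).

    Returns `default` (or {}) when the variable is unset or holds invalid
    JSON, so a broken .env never crashes the app at import time.
    """
    raw = os.environ.get(name)
    if not raw:
        return {} if default is None else default
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {} if default is None else default
```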
@@ -4,7 +4,9 @@ WORKDIR /app

COPY . .

RUN pip install streamlit pandas
# Install dependencies required for ingestion and DB
RUN pip install streamlit pandas requests python-dotenv

ENV PYTHONUNBUFFERED=1
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
# Start monitor in background and streamlit in foreground
CMD ["sh", "-c", "python monitor.py & streamlit run app.py --server.port=8501 --server.address=0.0.0.0"]
75  lead-engine/README.md  Normal file
@@ -0,0 +1,75 @@
# Lead Engine: Multi-Source Automation v1.1 [31388f42]

## 🚀 Overview

The **Lead Engine** is a dedicated module for the autonomous processing of B2B inquiries from multiple sources. It acts as the bridge between the e-mail inbox and the **Company Explorer**, generating highly personalized reply drafts at "human expert level" within minutes.

## 🛠 Key Features

### 1. Intelligent E-Mail Ingest
* **Multi-source:** Monitors the `info@robo-planet.de` mailbox via the **Microsoft Graph API** for different lead types.
* **Filtering & routing:** Detects and distinguishes inquiries from **TradingTwins** and the **Roboplanet contact form**.
* **Parsing:** Dedicated HTML parsers extract structured data (company, contact, requirements, etc.) for each source.
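The ingest step talks to Microsoft Graph; the exact request lives in `trading_twins_ingest.py` (shown only in part in this diff). A minimal sketch of how such a filtered mailbox query and source routing could be built; the `$filter`/`$select` values and the routing heuristics are illustrative assumptions, not the repo's actual code:

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

def build_inbox_query(mailbox, top=25):
    """Builds a Graph API URL listing the newest unread inbox messages."""
    params = {
        "$filter": "isRead eq false",
        "$orderby": "receivedDateTime desc",
        "$top": str(top),
        "$select": "subject,from,receivedDateTime,body",
    }
    return f"{GRAPH_BASE}/users/{mailbox}/mailFolders/inbox/messages?{urlencode(params)}"

def classify_lead_source(sender, subject):
    """Routes a message to a parser based on simple sender/subject heuristics."""
    if "tradingtwins" in sender.lower():
        return "TradingTwins"
    if "kontaktformular" in subject.lower():
        return "Website-Formular"
    return "Unknown"
```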

### 2. Contact Research (LinkedIn Lookup)
* **Automation:** Uses **SerpAPI** and **Gemini 2.0 Flash** to look up the contact person's professional role.
* **Result:** Identifies roles such as "CFO", "member of the hospital management board", or "specialist physician" so the tone of the reply can be matched precisely.
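The actual lookup is implemented in `lookup_role.py`, which is not shown in full here. A sketch of the parsing half of such a lookup, assuming search result titles shaped like "Name - Role - Company | LinkedIn" (the title format and matching rule are assumptions):

```python
import re

def extract_role_from_title(title, person):
    """Extracts a job role from a LinkedIn-style search result title,
    e.g. 'Georg Stahl - CFO - Klemm Bohrtechnik GmbH | LinkedIn'."""
    # Strip the trailing '| LinkedIn' site suffix
    title = re.sub(r'\s*\|\s*LinkedIn.*$', '', title, flags=re.IGNORECASE)
    # Split on dash separators commonly used by search snippets
    parts = [p.strip() for p in re.split(r'\s[-–—]\s', title) if p.strip()]
    # Expect [name, role, company]; require the surname to appear in the name part
    if len(parts) >= 2 and person.split()[-1].lower() in parts[0].lower():
        return parts[1]
    return None
```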

### 3. Company Explorer Sync & Monitoring
* **Integration:** Automatically creates accounts and contacts in the CE.
* **Monitor:** A background process (`monitor.py`) asynchronously tracks the status of the AI analysis in the CE.
* **Data pull:** As soon as the analysis (industry, dossier) is complete, the data is pulled into the local lead database.
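`monitor.py` itself is not part of this diff. The polling pattern it describes could be sketched roughly as follows; the interval, attempt limit, completion fields (`industry_ai`, `research_dossier`), and the injected fetch callback are assumptions:

```python
import time

def wait_for_analysis(fetch_company, ce_id, interval=5, max_attempts=60):
    """Polls the Company Explorer until the AI analysis fields are filled.

    `fetch_company` is any callable returning the company dict for `ce_id`.
    Returns the finished record, or None if it never completed.
    """
    for _ in range(max_attempts):
        data = fetch_company(ce_id) or {}
        # Treat the record as done once industry and dossier are present
        if data.get("industry_ai") and data.get("research_dossier"):
            return data
        time.sleep(interval)
    return None
```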

### 4. Expert Response Generator
* **AI engine:** Uses Gemini 2.0 Flash to create e-mail drafts.
* **Context:** Combines lead data (floor area) + CE data (dossier) + matrix arguments (pains/gains).
* **Persistent drafts:** Generated e-mail drafts are stored directly with the lead and preserved.

### 5. UI & Quality Control
* **Visual distinction:** Clear labeling of the lead source (e.g. 🌐 for website, 🤝 for partner) in the overview.
* **Status tracking:** Visual indicator (🆕/✅) for the synchronization status with the Company Explorer.
* **Low-quality warning:** Visual flag (⚠️) in the overview for leads with free-mail addresses or missing company names.

## 🏗 Architecture

```text
/app/lead-engine/
├── app.py                    # Streamlit web interface
├── trading_twins_ingest.py   # E-mail importer (Graph API for all sources)
├── ingest.py                 # Contains all source-specific parsers
├── lookup_role.py            # LinkedIn/role research (SerpAPI + Gemini)
├── generate_reply.py         # E-mail draft generator (Gemini)
├── monitor.py                # Asynchronous CE status monitor
├── db.py                     # Local SQLite lead database
└── data/                     # DB storage
```

## 🚀 Getting Started (Docker)

The Lead Engine is integrated as a service in the central `docker-compose.yml`.

```bash
# Restart the service after code changes
docker-compose restart lead-engine
```

**Access:** `https://floke-ai.duckdns.org/lead/` (password-protected)

## 📝 Usage Notes
1. **Ingest:** In the web app, click "2. Ingest Real Emails". The system loads all new leads, regardless of source.
2. **Sync:** Select a lead and click "Sync to Company Explorer".
3. **Wait:** The monitor automatically detects when the analysis in the CE is finished.
4. **Draft:** Click "Generate Expert Reply" to get the finished draft.

## 📋 Roadmap / Next Steps

- [ ] **Phase 2: Intelligent replies for contact forms:** Develop context-aware reply logic for website form leads.
- [ ] **IT clearance:** Request Microsoft Bookings permissions (`Bookings.Read.All`, `BookingsAppointment.ReadWrite.All`) for the Entra app.
- [ ] **Infrastructure:** Determine the correct booking link (personal account) and store it in the `.env`.
- [ ] **CRM integration:** Build a "Push to SuperOffice" module to create persons and e-mail drafts directly in the CRM.
- [ ] **Data synchronization:** Mirror the Notion product database into the local DB to make product selection and ROI calculation dynamic.
- [ ] **Logic:** Sharpen the ROI calculation in the e-mail draft based on real performance data (m²/h) and prices.
- [ ] **UI:** Finalize the "Copy to Clipboard" function for the finished draft in the web app.

---
*Documentation as of: March 2, 2026*
*Task: [31388f42]*
@@ -1,8 +1,47 @@
import streamlit as st
import pandas as pd
from db import get_leads, init_db
from db import get_leads, init_db, reset_lead, update_lead_draft
import json
from enrich import run_sync  # Import our sync function
import re
import os
from enrich import run_sync, refresh_ce_data, sync_single_lead
from generate_reply import generate_email_draft

def clean_html_to_text(html_content):
    """Surgical helper to extract relevant Tradingtwins data and format it cleanly."""
    if not html_content:
        return ""

    # 1. Strip head and style
    clean = re.sub(r'<head.*?>.*?</head>', '', html_content, flags=re.DOTALL | re.IGNORECASE)
    clean = re.sub(r'<style.*?>.*?</style>', '', clean, flags=re.DOTALL | re.IGNORECASE)

    # 2. Extract the core data block (from 'Datum:' until the matchmaking plug)
    # We look for the first 'Datum:' label
    start_match = re.search(r'Datum:', clean, re.IGNORECASE)
    end_match = re.search(r'Kennen Sie schon Ihr persönliches Konto', clean, re.IGNORECASE)

    if start_match:
        start_pos = start_match.start()
        end_pos = end_match.start() if end_match else len(clean)
        clean = clean[start_pos:end_pos]

    # 3. Format table structure: </td><td> becomes a space, </tr> a newline.
    # This prevents the "label on one line, value on the next" issue.
    clean = re.sub(r'</td>\s*<td.*?>', ' ', clean, flags=re.IGNORECASE)
    clean = re.sub(r'</tr>', '\n', clean, flags=re.IGNORECASE)

    # 4. Standard cleanup
    clean = re.sub(r'<br\s*/?>', '\n', clean, flags=re.IGNORECASE)
    clean = re.sub(r'</p>', '\n', clean, flags=re.IGNORECASE)
    clean = re.sub(r'<.*?>', '', clean)

    # 5. Entity decoding
    clean = clean.replace('&nbsp;', ' ').replace('&amp;', '&').replace('&quot;', '"').replace('&gt;', '>')

    # 6. Final polish: remove empty lines and leading/trailing whitespace
    lines = [line.strip() for line in clean.split('\n') if line.strip()]
    return '\n'.join(lines)


st.set_page_config(page_title="TradingTwins Lead Engine", layout="wide")

@@ -18,7 +57,20 @@ if st.sidebar.button("1. Ingest Emails (Mock)"):
        st.sidebar.success(f"Ingested {count} new leads.")
        st.rerun()

if st.sidebar.button("2. Sync to Company Explorer"):
if st.sidebar.button("2. Ingest Real Emails (Graph API)"):
    try:
        from trading_twins_ingest import process_leads
        with st.spinner("Fetching emails from Microsoft Graph..."):
            count = process_leads()
        if count > 0:
            st.sidebar.success(f"Successfully ingested {count} new leads from inbox!")
        else:
            st.sidebar.info("No new leads found in inbox.")
        st.rerun()
    except Exception as e:
        st.sidebar.error(f"Ingest failed: {e}")

if st.sidebar.button("3. Sync to Company Explorer"):
    with st.spinner("Syncing with Company Explorer API..."):
        # Capture output for debugging
        try:
@@ -37,6 +89,42 @@ if st.sidebar.button("2. Sync to Company Explorer"):
        except Exception as e:
            st.error(f"Sync Failed: {e}")

if st.sidebar.checkbox("Show System Debug"):
    st.sidebar.subheader("System Diagnostics")

    # 1. API key check
    from lookup_role import get_gemini_key
    key = get_gemini_key()
    if key:
        st.sidebar.success(f"Gemini Key found ({key[:5]}...)")
    else:
        st.sidebar.error("Gemini Key NOT found!")

    # 2. SerpAPI check
    serp_key = os.getenv("SERP_API")
    if serp_key:
        st.sidebar.success(f"SerpAPI Key found ({serp_key[:5]}...)")
    else:
        st.sidebar.error("SerpAPI Key NOT found in Env!")

    # 3. Network check
    try:
        import requests
        res = requests.get("https://generativelanguage.googleapis.com", timeout=2)
        st.sidebar.success(f"Gemini API Reachable ({res.status_code})")
    except Exception as e:
        st.sidebar.error(f"Network Error: {e}")

    # 4. Live lookup test
    if st.sidebar.button("Test Role Lookup (Georg Stahl)"):
        from lookup_role import lookup_person_role
        with st.sidebar.status("Running Lookup..."):
            res = lookup_person_role("Georg Stahl", "Klemm Bohrtechnik GmbH")
        if res:
            st.sidebar.success(f"Result: {res}")
        else:
            st.sidebar.error("Result: None")

# Main view
leads = get_leads()
df = pd.DataFrame(leads)
@@ -50,26 +138,123 @@ if not df.empty:
    st.subheader("Lead Pipeline")

    for index, row in df.iterrows():
        with st.expander(f"{row['company_name']} ({row['status']})"):
            c1, c2 = st.columns(2)
            c1.write(f"**Contact:** {row['contact_name']}")
            c1.write(f"**Email:** {row['email']}")
            c1.text(row['raw_body'][:200] + "...")
        # Format date for title
        date_str = ""
        if row.get('received_at'):
            try:
                dt = pd.to_datetime(row['received_at'])
                date_str = dt.strftime("%d.%m. %H:%M")
            except:
                pass

        # --- DYNAMIC TITLE ---
        source_icon = "🌐" if row.get('source') == 'Website-Formular' else "🤝"
        status_icon = "✅" if row.get('status') == 'synced' else "🆕"

        meta = {}
        if row.get('lead_metadata'):
            try: meta = json.loads(row['lead_metadata'])
            except: pass

        quality_icon = "⚠️ " if meta.get('is_low_quality') else ""

        title = f"{quality_icon}{status_icon} {source_icon} {row.get('source', 'Lead')} | {date_str} | {row['company_name']}"

        with st.expander(title):
            # The full warning message is still shown inside for clarity
            if meta.get('is_low_quality'):
                st.warning("⚠️ **Low Quality Lead detected** (Free-mail provider or missing company name). Please verify manually.")

            # --- SECTION 1: LEAD INFO & INTELLIGENCE ---
            col_lead, col_intel = st.columns(2)

            enrichment = json.loads(row['enrichment_data']) if row['enrichment_data'] else {}

            if enrichment:
                c2.write("--- Integration Status ---")
                if enrichment.get('ce_id'):
                    c2.success(f"✅ Linked to Company Explorer (ID: {enrichment['ce_id']})")
                    c2.write(f"**CE Status:** {enrichment.get('ce_status')}")
            with col_lead:
                st.markdown("### 📋 Lead Data")
                st.write(f"**Salutation:** {meta.get('salutation', '-')}")
                st.write(f"**Contact:** {row['contact_name']}")
                st.write(f"**Email:** {row['email']}")
                st.write(f"**Phone:** {meta.get('phone', row.get('phone', '-'))}")

                role = meta.get('role')
                if role:
                    st.info(f"**Role:** {role}")
                else:
                    c2.warning("⚠️ Not yet synced or failed")
                    if st.button("🔍 Find Role", key=f"role_{row['id']}"):
                        from enrich import enrich_contact_role
                        with st.spinner("Searching..."):
                            found_role = enrich_contact_role(row)
                        if found_role: st.success(f"Found: {found_role}"); st.rerun()
                        else: st.error("No role found.")

                c2.info(f"Log: {enrichment.get('message')}")
                st.write(f"**Area:** {meta.get('area', '-')}")
                st.write(f"**Purpose:** {meta.get('purpose', '-')}")
                st.write(f"**Functions:** {meta.get('cleaning_functions', '-')}")
                st.write(f"**Location:** {meta.get('zip', '')} {meta.get('city', '')}")

            with col_intel:
                st.markdown("### 🔍 Intelligence (CE)")
                enrichment = json.loads(row['enrichment_data']) if row['enrichment_data'] else {}
                ce_id = enrichment.get('ce_id')

                if enrichment.get('ce_data'):
                    c2.json(enrichment['ce_data'])
                if ce_id:
                    st.success(f"✅ Linked to Company Explorer (ID: {ce_id})")
                    ce_data = enrichment.get('ce_data', {})

                    vertical = ce_data.get('industry_ai') or ce_data.get('vertical')
                    summary = ce_data.get('research_dossier') or ce_data.get('summary')

                    if vertical and vertical != 'None':
                        st.info(f"**Industry:** {vertical}")
                    else:
                        st.warning("Industry Analysis pending...")

                    if summary:
                        with st.expander("Show AI Research Dossier", expanded=True):
                            st.write(summary)

                    if st.button("🔄 Refresh CE Data", key=f"refresh_{row['id']}"):
                        with st.spinner("Fetching..."):
                            refresh_ce_data(row['id'], ce_id)
                        st.rerun()
                else:
                    st.warning("⚠️ Not synced with Company Explorer yet")
                    if st.button("🚀 Sync to Company Explorer", key=f"sync_single_{row['id']}"):
                        with st.spinner("Syncing..."):
                            sync_single_lead(row['id'])
                        st.rerun()

            st.divider()

            # --- SECTION 2: ORIGINAL EMAIL ---
            with st.expander("✉️ View Original Email Content"):
                st.text(clean_html_to_text(row['raw_body']))
                if st.checkbox("Show Raw HTML", key=f"raw_{row['id']}"):
                    st.code(row['raw_body'], language="html")

            st.divider()

            # --- SECTION 3: RESPONSE DRAFT (Full Width) ---
            st.markdown("### 📝 Response Draft")
            if row['status'] != 'new' and ce_id:
                if st.button("✨ Generate Expert Reply", key=f"gen_{row['id']}", type="primary"):
                    with st.spinner("Writing email..."):
                        ce_data = enrichment.get('ce_data', {})
                        draft = generate_email_draft(row.to_dict(), ce_data)
                        update_lead_draft(row['id'], draft)  # Save to DB
                        st.rerun()  # Rerun to display the new draft from DB

                # Always display the draft from the database if it exists
                if row.get('response_draft'):
                    st.text_area("Email Entwurf", value=row['response_draft'], height=400)
                    st.button("📋 Copy to Clipboard", key=f"copy_{row['id']}", on_click=lambda: st.write("Copy functionality simulated"))
            else:
                st.info("Sync with Company Explorer first to generate a response.")

            if row['status'] != 'new':
                st.markdown("---")
                if st.button("🔄 Reset Lead Status", key=f"reset_{row['id']}", help="Back to 'new' status"):
                    reset_lead(row['id'])
                    st.rerun()

else:
    st.info("No leads found. Click 'Ingest Emails' in the sidebar.")
0  lead-engine/company_explorer_connector.py  Normal file
@@ -27,12 +27,26 @@ def init_db():
            email TEXT,
            phone TEXT,
            raw_body TEXT,
            lead_metadata TEXT,
            enrichment_data TEXT,
            status TEXT DEFAULT 'new',
            response_draft TEXT,
            sent_at TIMESTAMP
        )
    ''')

    # Simple migration check: add 'lead_metadata' if it does not exist
    c.execute("PRAGMA table_info(leads)")
    columns = [row[1] for row in c.fetchall()]

    if 'lead_metadata' not in columns:
        print("Migrating DB: Adding lead_metadata column...")
        c.execute('ALTER TABLE leads ADD COLUMN lead_metadata TEXT')

    if 'source' not in columns:
        print("Migrating DB: Adding source column...")
        c.execute('ALTER TABLE leads ADD COLUMN source TEXT')

    conn.commit()
    conn.close()

@@ -41,21 +55,40 @@ def insert_lead(lead_data):
    if not os.path.exists(DB_PATH):
        init_db()

    # Extract metadata fields
    meta = {
        'area': lead_data.get('area'),
        'purpose': lead_data.get('purpose'),
        'zip': lead_data.get('zip'),
        'city': lead_data.get('city'),
        'role': lead_data.get('role'),
        'salutation': lead_data.get('salutation'),
        'phone': lead_data.get('phone'),
        'cleaning_functions': lead_data.get('cleaning_functions'),
        'is_free_mail': lead_data.get('is_free_mail', False),
        'is_low_quality': lead_data.get('is_low_quality', False)
    }

    # Use provided received_at or default to now
    received_at = lead_data.get('received_at') or datetime.now()

    conn = sqlite3.connect(DB_PATH)
    c = conn.cursor()
    try:
        c.execute('''
            INSERT INTO leads (source_id, received_at, company_name, contact_name, email, phone, raw_body, status)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
            INSERT INTO leads (source_id, received_at, company_name, contact_name, email, phone, raw_body, lead_metadata, status, source)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        ''', (
            lead_data.get('id'),
            datetime.now(),
            received_at,
            lead_data.get('company'),
            lead_data.get('contact'),
            lead_data.get('email'),
            lead_data.get('phone'),
            lead_data.get('raw_body'),
            'new'
            json.dumps(meta),
            'new',
            lead_data.get('source')  # Added source
        ))
        conn.commit()
        return True
@@ -64,6 +97,14 @@ def insert_lead(lead_data):
    finally:
        conn.close()

def update_lead_metadata(lead_id, meta_data):
    """Helper to update metadata for existing leads (repair)."""
    conn = sqlite3.connect(DB_PATH)
    c = conn.cursor()
    c.execute('UPDATE leads SET lead_metadata = ? WHERE id = ?', (json.dumps(meta_data), lead_id))
    conn.commit()
    conn.close()

def get_leads():
    if not os.path.exists(DB_PATH):
        init_db()

@@ -85,3 +126,19 @@ def update_lead_status(lead_id, status, response_draft=None):
    c.execute('UPDATE leads SET status = ? WHERE id = ?', (status, lead_id))
    conn.commit()
    conn.close()

def update_lead_draft(lead_id, draft_text):
    """Saves a generated email draft to the database."""
    conn = sqlite3.connect(DB_PATH)
    c = conn.cursor()
    c.execute('UPDATE leads SET response_draft = ? WHERE id = ?', (draft_text, lead_id))
    conn.commit()
    conn.close()

def reset_lead(lead_id):
    """Resets a lead to 'new' status and clears enrichment data."""
    conn = sqlite3.connect(DB_PATH)
    c = conn.cursor()
    c.execute('UPDATE leads SET status = "new", enrichment_data = NULL WHERE id = ?', (lead_id,))
    conn.commit()
    conn.close()

@@ -5,8 +5,9 @@ import sqlite3

# Add the root directory to the Python path so the connector can be found
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from company_explorer_connector import handle_company_workflow
from db import get_leads, DB_PATH
from company_explorer_connector import handle_company_workflow, get_company_details
from db import get_leads, DB_PATH, update_lead_metadata
from lookup_role import lookup_person_role

def update_lead_enrichment(lead_id, data, status):
    """Updates a lead in the database with new enrichment data and a new status."""
@@ -18,9 +19,129 @@ def update_lead_enrichment(lead_id, data, status):
    conn.close()
    print(f"Lead {lead_id} updated. New status: {status}")

def refresh_ce_data(lead_id, ce_id):
    """
    Fetches the latest data (including analysis results) from the Company Explorer
    and updates the local lead.
    """
    print(f"Refreshing data for CE ID {ce_id}...")
    ce_data = get_company_details(ce_id)

    # Fetch existing enrichment data
    leads = get_leads()
    lead = next((l for l in leads if l['id'] == lead_id), None)

    enrichment_data = {}
    if lead and lead.get('enrichment_data'):
        try:
            enrichment_data = json.loads(lead['enrichment_data'])
        except:
            pass

    enrichment_data.update({
        "sync_status": "refreshed",
        "ce_id": ce_id,
        "message": "Data refreshed from CE",
        "ce_data": ce_data
    })

    update_lead_enrichment(lead_id, enrichment_data, status='synced')
    return ce_data

def enrich_contact_role(lead):
    """
    Tries to find the contact's role via SerpAPI and stores it in the metadata.
    """
    meta = {}
    if lead.get('lead_metadata'):
        try:
            meta = json.loads(lead.get('lead_metadata'))
        except:
            pass

    # Skip if we already have a role (and it's not None/Unknown)
    if meta.get('role') and meta.get('role') != "Unbekannt":
        return meta.get('role')

    print(f"Looking up role for {lead['contact_name']} at {lead['company_name']}...")
    role = lookup_person_role(lead['contact_name'], lead['company_name'])

    if role:
        print(f" -> Found role: {role}")
        meta['role'] = role
        update_lead_metadata(lead['id'], meta)
    else:
        print(" -> No role found.")

    return role

def sync_single_lead(lead_id):
    """
    Processes a single lead: look up the role, sync to CE, trigger the analysis.
    """
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    c = conn.cursor()
    c.execute('SELECT * FROM leads WHERE id = ?', (lead_id,))
    lead = c.fetchone()
    conn.close()

    if not lead:
        return {"status": "error", "message": "Lead not found"}

    lead_dict = dict(lead)
    company_name = lead_dict['company_name']
    print(f"\n--- Manually Syncing Lead ID: {lead_id}, Company: '{company_name}' ---")

    # 1. Contact enrichment (role lookup)
    role = enrich_contact_role(lead_dict)

    # 2. Prepare contact info
    meta = {}
    if lead_dict.get('lead_metadata'):
        try: meta = json.loads(lead_dict['lead_metadata'])
        except: pass

    # Smarter name splitting if meta is empty (for repaired leads)
    full_name = lead_dict.get('contact_name', '')
    first_name = meta.get('contact_first')
    last_name = meta.get('contact_last')

    if not first_name and full_name:
        parts = full_name.strip().split(' ')
        if len(parts) > 1:
            first_name = parts[0]
            last_name = ' '.join(parts[1:])
        else:
            last_name = full_name
            first_name = ''

    contact_info = {
        "first_name": first_name,
        "last_name": last_name,
        "email": lead_dict['email'],
        "job_title": meta.get('role', role),
        "role": None,  # Set to None so CE can use its RoleMappingService
        "is_primary": True
    }

    # 3. CE workflow
    result = handle_company_workflow(company_name, contact_info=contact_info)

    # 4. Save results
    enrichment_data = {
        "sync_status": result.get("status"),
        "ce_id": result.get("data", {}).get("id") if result.get("data") else None,
        "message": result.get("message", "Manual sync successful"),
        "ce_data": result.get("data")
    }

    update_lead_enrichment(lead_id, enrichment_data, status='synced')
    return result

def run_sync():
    """
    Main synchronization process.
    Main synchronization process (batch).
    Fetches all new leads and triggers the Company Explorer workflow for each.
    """
    # Fetch only the leads that are genuinely new and not yet processed
@@ -36,15 +157,33 @@ def run_sync():
        company_name = lead['company_name']
        print(f"\n--- Processing Lead ID: {lead['id']}, Company: '{company_name}' ---")

        # Call the central workflow defined in the connector.
        # This function takes care of everything: check, create, discover, poll, analyze.
        result = handle_company_workflow(company_name)
        # 1. Contact enrichment (role lookup via SerpAPI)
        role = enrich_contact_role(lead)

        # 2. Prepare contact info for CE
        meta = {}
        if lead.get('lead_metadata'):
            try:
                meta = json.loads(lead.get('lead_metadata'))
            except:
                pass

        contact_info = {
            "first_name": meta.get('contact_first', ''),
            "last_name": meta.get('contact_last', lead['contact_name'].split(' ')[-1] if lead['contact_name'] else ''),
            "email": lead['email'],
            "job_title": meta.get('role', role),  # The raw title or Gemini result
            "role": meta.get('role', role)  # Currently mapped to the same field
        }

        # 3. Company enrichment (CE workflow with contact)
        result = handle_company_workflow(company_name, contact_info=contact_info)

        # Prepare the data for storage in the DB
        enrichment_data = {
            "sync_status": result.get("status"),
            "ce_id": result.get("data", {}).get("id") if result.get("data") else None,
            "message": result.get("message"),
            "message": result.get("message", "Sync successful"),
            "ce_data": result.get("data")
        }

216  lead-engine/generate_reply.py  Normal file
@@ -0,0 +1,216 @@
import os
import json
import requests
import sqlite3
import re
import datetime

# --- Helper: Get Gemini Key ---
def get_gemini_key():
    candidates = [
        "gemini_api_key.txt",  # Current dir
        "/app/gemini_api_key.txt",  # Docker default
        os.path.join(os.path.dirname(__file__), "gemini_api_key.txt"),  # Script dir
        os.path.join(os.path.dirname(os.path.dirname(__file__)), 'gemini_api_key.txt')  # Parent dir
    ]

    for path in candidates:
        if os.path.exists(path):
            try:
                with open(path, 'r') as f:
                    return f.read().strip()
            except:
                pass

    return os.getenv("GEMINI_API_KEY")

def get_matrix_context(industry_name, persona_name):
    """Fetches pains, gains, and arguments from the CE database."""
    context = {
        "industry_pains": "",
        "industry_gains": "",
        "persona_description": "",
        "persona_arguments": ""
    }
    db_path = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'companies_v3_fixed_2.db')
    if not os.path.exists(db_path):
        return context

    try:
        conn = sqlite3.connect(db_path)
        c = conn.cursor()

        # Get industry data
        c.execute('SELECT pains, gains FROM industries WHERE name = ?', (industry_name,))
        ind_res = c.fetchone()
        if ind_res:
            context["industry_pains"], context["industry_gains"] = ind_res

        # Get persona data
        c.execute('SELECT description, convincing_arguments FROM personas WHERE name = ?', (persona_name,))
        per_res = c.fetchone()
        if per_res:
            context["persona_description"], context["persona_arguments"] = per_res

        conn.close()
    except Exception as e:
        print(f"DB Error in matrix lookup: {e}")

    return context

def get_suggested_date():
    """Calculates a suggested meeting date (3-4 days in the future, avoiding weekends)."""
    now = datetime.datetime.now()
    # Jump 3 days ahead
    suggested = now + datetime.timedelta(days=3)
    # If weekend, move to Monday
    if suggested.weekday() == 5:  # Saturday
        suggested += datetime.timedelta(days=2)
    elif suggested.weekday() == 6:  # Sunday
        suggested += datetime.timedelta(days=1)

    days_de = ["Montag", "Dienstag", "Mittwoch", "Donnerstag", "Freitag", "Samstag", "Sonntag"]
    return f"{days_de[suggested.weekday()]}, den {suggested.strftime('%d.%m.')} um 10:00 Uhr"

def clean_company_name(name):
    """Removes legal suffixes like GmbH, AG, etc. for a more personal touch."""
    if not name: return ""
    # Remove common German legal forms
    cleaned = re.sub(r'\s+(GmbH|AG|GmbH\s+&\s+Co\.\s+KG|KG|e\.V\.|e\.K\.|Limited|Ltd|Inc)\.?(?:\s|$)', '', name, flags=re.IGNORECASE)
    return cleaned.strip()

def get_multi_solution_recommendation(area_str, purpose_str):
    """
    Selects a range of robots based on surface area AND requested purposes.
    """
    recommendations = []
    purpose_lower = purpose_str.lower()

    # 1. Cleaning logic (area based)
    nums = re.findall(r'\d+', area_str.replace('.', '').replace(',', ''))
    area_val = int(nums[0]) if nums else 0

    if "reinigung" in purpose_lower:
        if area_val >= 5000 or "über 10.000" in area_str:
            recommendations.append("den Scrubber 75 als industrielles Kraftpaket für Ihre Großflächen")
        elif area_val >= 1000:
            recommendations.append("den Scrubber 50 oder Phantas für eine wendige und gründliche Bodenreinigung")
        else:
            recommendations.append("den Phantas oder Pudu CC1 für eine effiziente Reinigung Ihrer Räumlichkeiten")

    # 2. Service/transport logic
    if any(word in purpose_lower for word in ["servieren", "abräumen", "speisen", "getränke"]):
        recommendations.append("den BellaBot zur Entlastung Ihres Teams beim Transport von Speisen und Getränken")

    # 3. Marketing/interaction logic
    if any(word in purpose_lower for word in ["marketing", "gästebetreuung", "kundenansprache"]):
        recommendations.append("den KettyBot als interaktiven Begleiter für Marketing und Patienteninformation")

    if not recommendations:
        recommendations.append("unsere wendigen Allrounder wie den Phantas")

    return {
        "solution_text": " und ".join(recommendations),
        "has_multi": len(recommendations) > 1
    }

def generate_email_draft(lead_data, company_data, booking_link="[IHR BUCHUNGSLINK]"):
|
||||
"""
|
||||
Generates a high-end, personalized sales email using Gemini API and Matrix knowledge.
|
||||
"""
|
||||
api_key = get_gemini_key()
|
||||
if not api_key:
|
||||
return "Error: Gemini API Key not found."
|
||||
|
||||
# Extract Data from Lead Engine
|
||||
company_raw = lead_data.get('company_name', 'Interessent')
|
||||
company_name = clean_company_name(company_raw)
|
||||
contact_name = lead_data.get('contact_name', 'Damen und Herren')
|
||||
|
||||
# Metadata from Lead
|
||||
meta = {}
|
||||
if lead_data.get('lead_metadata'):
|
||||
try: meta = json.loads(lead_data['lead_metadata'])
|
||||
except: pass
|
||||
|
||||
area = meta.get('area', 'Unbekannte Fläche')
|
||||
purpose = meta.get('purpose', 'Reinigung')
|
||||
role = meta.get('role', 'Wirtschaftlicher Entscheider')
|
||||
salutation = meta.get('salutation', 'Damen und Herren')
|
||||
cleaning_functions = meta.get('cleaning_functions', '')
|
||||
|
||||
# Data from Company Explorer
|
||||
ce_summary = company_data.get('research_dossier') or company_data.get('summary', '')
|
||||
ce_vertical = company_data.get('industry_ai') or company_data.get('vertical', 'Healthcare')
|
||||
ce_opener = company_data.get('ai_opener', '')
|
||||
|
||||
# Multi-Solution Logic
|
||||
solution = get_multi_solution_recommendation(area, purpose)
|
||||
suggested_date = get_suggested_date()
|
||||
|
||||
# Fetch "Golden Records" from Matrix
|
||||
matrix = get_matrix_context(ce_vertical, role)
|
||||
|
||||
# Prompt Engineering for "Unwiderstehliche E-Mail"
|
||||
prompt = f"""
|
||||
Du bist ein Senior Sales Executive bei Robo-Planet. Antworte auf eine Anfrage von Tradingtwins.
|
||||
Schreibe eine E-Mail auf "Human Expert Level".
|
||||
|
||||
WICHTIGE IDENTITÄT:
|
||||
- Anrede-Form: {salutation} (z.B. Herr, Frau)
|
||||
- Name: {contact_name}
|
||||
- Firma: {company_name}
|
||||
|
||||
STRATEGIE:
|
||||
- STARTE DIREKT mit dem strategischen Aufhänger aus dem Company Explorer ({ce_opener}). Baue daraus den ersten Absatz.
|
||||
- KEIN "mit großem Interesse verfolge ich..." oder ähnliche Phrasen. Das wirkt unnatürlich.
|
||||
- Deine Mail reagiert auf die Anfrage zu: {purpose} auf {area}.
|
||||
- Fasse die vorgeschlagene Lösung ({solution['solution_text']}) KOMPAKT zusammen. Wir bieten ein ganzheitliches Entlastungskonzept an, keine Detail-Auflistung von Datenblättern.
|
||||
|
||||
KONTEXT:
|
||||
- Branche: {ce_vertical}
|
||||
- Pains aus Matrix: {matrix['industry_pains']}
|
||||
- Dossier/Wissen: {ce_summary}
|
||||
- Strategischer Aufhänger (CE-Opener): {ce_opener}
|
||||
|
||||
AUFGABE:
|
||||
1. ANREDE: Persönlich.
|
||||
2. EINSTIEG: Nutze den inhaltlichen Kern von: "{ce_opener}".
|
||||
3. DER ÜBERGANG: Verknüpfe dies mit der Anfrage zu {purpose}. Erkläre, dass manuelle Prozesse bei {area} angesichts der Dokumentationspflichten und des Fachkräftemangels zum Risiko werden.
|
||||
4. DIE LÖSUNG: Schlage die Kombination aus {solution['solution_text']} als integriertes Konzept vor, um das Team in Reinigung, Service und Patientenansprache spürbar zu entlasten.
|
||||
5. ROI: Sprich kurz die Amortisation (18-24 Monate) an – als Argument für den wirtschaftlichen Entscheider.
|
||||
6. CTA: Schlag konkret den {suggested_date} vor. Alternativ: {booking_link}
|
||||
|
||||
STIL: Senior, lösungsorientiert, direkt. Keine unnötigen Füllwörter.
|
||||
|
||||
FORMAT:
|
||||
Betreff: [Prägnant, z.B. Automatisierungskonzept für {company_name}]
|
||||
|
||||
[E-Mail Text]
|
||||
"""
|
||||
|
||||
# Call Gemini API
|
||||
url = f"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key={api_key}"
|
||||
headers = {'Content-Type': 'application/json'}
|
||||
payload = {"contents": [{"parts": [{"text": prompt}]}]}
|
||||
|
||||
try:
|
||||
response = requests.post(url, headers=headers, json=payload)
|
||||
response.raise_for_status()
|
||||
result = response.json()
|
||||
return result['candidates'][0]['content']['parts'][0]['text']
|
||||
except Exception as e:
|
||||
return f"Error generating draft: {str(e)}"
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Test Mock
|
||||
mock_lead = {
|
||||
"company_name": "Klinikum Test",
|
||||
"contact_name": "Dr. Müller",
|
||||
"lead_metadata": json.dumps({"area": "5000 qm", "purpose": "Desinfektion und Boden", "city": "Berlin"})
|
||||
}
|
||||
mock_company = {
|
||||
"vertical": "Healthcare / Krankenhaus",
|
||||
"summary": "Ein großes Klinikum der Maximalversorgung mit Fokus auf Kardiologie."
|
||||
}
|
||||
print(generate_email_draft(mock_lead, mock_company))
|
||||
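The suffix-stripping regex in `clean_company_name` is order-sensitive: Python's `re` alternation takes the first branch that matches, so if plain `GmbH` is listed before `GmbH & Co. KG`, a combined form loses only the `GmbH` token. A minimal standalone check of the longest-first variant (a sketch, not the shipped module):

```python
import re

def clean_company_name(name):
    """Strip common German legal suffixes; longest alternatives come first."""
    if not name:
        return ""
    cleaned = re.sub(
        r'\s+(GmbH\s+&\s+Co\.\s+KG|GmbH|AG|KG|e\.V\.|e\.K\.|Limited|Ltd|Inc)\.?(?:\s|$)',
        '', name, flags=re.IGNORECASE)
    return cleaned.strip()

print(clean_company_name("Robo-Planet GmbH"))      # Robo-Planet
print(clean_company_name("Muster GmbH & Co. KG"))  # Muster
```

With `GmbH` first in the alternation, the second call would instead yield "Muster& Co. KG".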
@@ -1,4 +1,5 @@
import re
from datetime import datetime
from db import insert_lead

def parse_tradingtwins_email(body):
@@ -28,6 +29,108 @@ def parse_tradingtwins_email(body):
    data['raw_body'] = body
    return data


def is_free_mail(email_addr):
    """Checks if an email belongs to a known free-mail provider."""
    if not email_addr:
        return False
    free_domains = {
        'gmail.com', 'googlemail.com', 'outlook.com', 'hotmail.com', 'live.com',
        'msn.com', 'icloud.com', 'me.com', 'mac.com', 'yahoo.com', 'ymail.com',
        'rocketmail.com', 'gmx.de', 'gmx.net', 'web.de', 't-online.de',
        'freenet.de', 'mail.com', 'protonmail.com', 'proton.me', 'online.de'
    }
    domain = email_addr.split('@')[-1].lower()
    return domain in free_domains


def parse_tradingtwins_html(html_body):
    """
    Extracts data from the Tradingtwins HTML table structure.
    Pattern: <p ...>Label:</p>...<p ...>Value</p>
    """
    data = {}

    # Map label names in the HTML to our keys
    field_map = {
        'Firma': 'company',
        'Vorname': 'contact_first',
        'Nachname': 'contact_last',
        'Anrede': 'salutation',
        'E-Mail': 'email',
        'Rufnummer': 'phone',
        'Einsatzzweck': 'purpose',
        'Reinigungs-Funktionen': 'cleaning_functions',
        'Reinigungs-Fläche': 'area',
        'PLZ': 'zip',
        'Stadt': 'city',
        'Lead-ID': 'source_id'
    }

    for label, key in field_map.items():
        pattern = fr'>\s*{re.escape(label)}:\s*</p>.*?<p[^>]*>(.*?)</p>'
        match = re.search(pattern, html_body, re.DOTALL | re.IGNORECASE)
        if match:
            raw_val = match.group(1).strip()
            clean_val = re.sub(r'<[^>]+>', '', raw_val).strip()
            data[key] = clean_val

    # Composite fields
    if data.get('contact_first') and data.get('contact_last'):
        data['contact'] = f"{data['contact_first']} {data['contact_last']}"

    # Quality check: free mail or missing company
    email = data.get('email', '')
    company = data.get('company', '-')

    data['is_free_mail'] = is_free_mail(email)
    data['is_low_quality'] = data['is_free_mail'] or company == '-' or not company

    # Ensure source_id is present and map it to 'id' for db.py compatibility
    if not data.get('source_id'):
        data['source_id'] = f"tt_unknown_{int(datetime.now().timestamp())}"

    data['id'] = data['source_id']  # db.py expects 'id' for the source_id column

    return data


def parse_roboplanet_form(html_body):
    """
    Parses the Roboplanet website contact form (HTML format).
    Example: <b>Vorname:</b> Gordana <br><b>Nachname:</b> Dumitrovic <br>...
    """
    data = {}

    # Map label names in the HTML to our keys
    field_map = {
        'Vorname': 'contact_first',
        'Nachname': 'contact_last',
        'Email': 'email',
        'Telefon': 'phone',
        'Firma': 'company',
        'PLZ': 'zip',
        'Nachricht': 'message'
    }

    for label, key in field_map.items():
        # Pattern: <b>Label:</b> Value <br>
        pattern = fr'<b>{re.escape(label)}:</b>\s*(.*?)\s*<br>'
        match = re.search(pattern, html_body, re.DOTALL | re.IGNORECASE)
        if match:
            raw_val = match.group(1).strip()
            clean_val = re.sub(r'<[^>]+>', '', raw_val).strip()  # Strip any leftover HTML tags
            data[key] = clean_val

    # Composite fields
    if data.get('contact_first') and data.get('contact_last'):
        data['contact'] = f"{data['contact_first']} {data['contact_last']}"

    # For Roboplanet forms we fall back to a timestamp-based ID if none is present;
    # 'id' must be set for db.py compatibility.
    if not data.get('source_id'):
        data['source_id'] = f"rp_unknown_{int(datetime.now().timestamp())}"
    data['id'] = data['source_id']

    data['raw_body'] = html_body
    return data


def ingest_mock_leads():
    # Mock data from the session context
    leads = [
lead-engine/lookup_role.py (new file, 129 lines)
@@ -0,0 +1,129 @@
import os
import re
import json
import requests
from dotenv import load_dotenv

# Load .env only if the file exists (local dev); otherwise rely on the Docker environment
env_path = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '.env'))
if os.path.exists(env_path):
    load_dotenv(dotenv_path=env_path, override=True)

SERP_API_KEY = os.getenv("SERP_API")

if not SERP_API_KEY:
    print("DEBUG: SERP_API not found in environment.")


# --- Helper: get the Gemini key ---
def get_gemini_key():
    candidates = [
        "gemini_api_key.txt",                                           # Current dir
        "/app/gemini_api_key.txt",                                      # Docker default
        os.path.join(os.path.dirname(__file__), "gemini_api_key.txt"),  # Script dir
        os.path.join(os.path.dirname(os.path.dirname(__file__)), 'gemini_api_key.txt')  # Parent dir
    ]

    for path in candidates:
        if os.path.exists(path):
            try:
                with open(path, 'r') as f:
                    return f.read().strip()
            except OSError:
                pass

    return os.getenv("GEMINI_API_KEY")


def extract_role_with_llm(name, company, search_results):
    """Uses Gemini to identify the job title from search snippets."""
    api_key = get_gemini_key()
    if not api_key:
        return None

    context = "\n".join([f"- {r.get('title')}: {r.get('snippet')}" for r in search_results])

    prompt = f"""
    Analyze these Google Search results to identify the professional role of "{name}" at "{company}".

    SEARCH RESULTS:
    {context}

    TASK:
    Extract the professional Job Title / Role.
    Look for:
    - Management: "Geschäftsführer", "Vorstand", "CFO", "Mitglied der Klinikleitung"
    - Department Heads: "Leiter", "Bereichsleitung", "Head of", "Pflegedienstleitung"
    - Specialized: "Arzt", "Ingenieur", "Einkäufer"

    RULES:
    1. Extract the most specific and senior current role.
    2. Return ONLY the role string (e.g. "Bereichsleitung Patientenmanagement").
    3. Maximum length: 60 characters.
    4. If no role is found, return "Unbekannt".
    """

    url = f"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key={api_key}"
    try:
        response = requests.post(
            url,
            headers={'Content-Type': 'application/json'},
            json={"contents": [{"parts": [{"text": prompt}]}]},
            timeout=60
        )
        if response.status_code == 200:
            role = response.json()['candidates'][0]['content']['parts'][0]['text'].strip()
            # Remove markdown formatting, if any
            role = role.replace('**', '').replace('"', '').rstrip('.')
            return None if "Unbekannt" in role else role
        else:
            print(f"DEBUG: Gemini API Error {response.status_code}: {response.text}")
    except Exception as e:
        print(f"DEBUG: Gemini API Exception: {e}")
    return None


def lookup_person_role(name, company):
    """
    Searches for a person's role via SerpAPI and extracts it using the LLM.
    Uses a multi-step search strategy to find the best snippets.
    """
    if not SERP_API_KEY:
        print("Error: SERP_API key not found in .env")
        return None

    # Step 1: highly specific search first, then progressively broader queries
    queries = [
        f'site:linkedin.com "{name}" "{company}"',
        f'"{name}" "{company}" position',
        f'{name} {company}'
    ]

    all_results = []
    for query in queries:
        params = {
            "engine": "google",
            "q": query,
            "api_key": SERP_API_KEY,
            "num": 3,
            "hl": "de",
            "gl": "de"
        }

        try:
            response = requests.get("https://serpapi.com/search", params=params, timeout=30)
            response.raise_for_status()
            data = response.json()

            results = data.get("organic_results", [])
            if results:
                all_results.extend(results)
                # With enough good results we do not need further searches
                if len(all_results) >= 3:
                    break
        except Exception as e:
            print(f"SerpAPI lookup failed for query '{query}': {e}")

    if not all_results:
        return None

    # Delegate extraction to the LLM with the best results found
    return extract_role_with_llm(name, company, all_results)


if __name__ == "__main__":
    # Test cases
    print(f"Markus Drees: {lookup_person_role('Markus Drees', 'Ärztehaus Rünthe')}")
    print(f"Georg Stahl: {lookup_person_role('Georg Stahl', 'Klemm Bohrtechnik GmbH')}")
    print(f"Steve Trüby: {lookup_person_role('Steve Trüby', 'RehaKlinikum Bad Säckingen GmbH')}")
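The post-processing that `extract_role_with_llm` applies to Gemini's answer (stripping markdown bold, quotes, and a trailing period, and mapping "Unbekannt" to `None`) can be factored into a small helper and checked in isolation; `clean_role` is a hypothetical name for this sketch, not part of the module:

```python
def clean_role(raw):
    """Strip markdown bold, quotes, and a trailing period from the model output;
    map the sentinel answer 'Unbekannt' (unknown) to None."""
    role = raw.strip().replace('**', '').replace('"', '').rstrip('.')
    return None if "Unbekannt" in role else role

print(clean_role('**"Bereichsleitung Patientenmanagement."**'))
# Bereichsleitung Patientenmanagement
print(clean_role('Unbekannt'))  # None
```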
lead-engine/monitor.py (new file, 56 lines)
@@ -0,0 +1,56 @@
import time
import json
import logging
import os
import sys

# Path setup to import local modules
sys.path.append(os.path.dirname(__file__))
from db import get_leads
from enrich import refresh_ce_data

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("lead-monitor")


def run_monitor():
    logger.info("Starting Lead Monitor (polling CE for updates)...")

    while True:
        try:
            leads = get_leads()
            # Filter leads that are synced but still missing analysis data
            pending_leads = []
            for lead in leads:
                if lead['status'] == 'synced':
                    enrichment = json.loads(lead['enrichment_data']) if lead['enrichment_data'] else {}
                    ce_data = enrichment.get('ce_data', {})
                    ce_id = enrichment.get('ce_id')

                    # If we have a CE ID but no vertical/summary yet, the lead is pending
                    vertical = ce_data.get('industry_ai') or ce_data.get('vertical')
                    if ce_id and (not vertical or vertical == 'None'):
                        pending_leads.append(lead)

            if pending_leads:
                logger.info(f"Checking {len(pending_leads)} pending leads for analysis updates...")
                for lead in pending_leads:
                    enrichment = json.loads(lead['enrichment_data'])
                    ce_id = enrichment['ce_id']

                    logger.info(f" -> Refreshing lead {lead['id']} ({lead['company_name']})...")
                    new_data = refresh_ce_data(lead['id'], ce_id)

                    new_vertical = new_data.get('industry_ai') or new_data.get('vertical')
                    if new_vertical and new_vertical != 'None':
                        logger.info(f" [SUCCESS] Analysis finished for {lead['company_name']}: {new_vertical}")
                        # Optional: trigger auto-reply generation here in the future

        except Exception as e:
            logger.error(f"Monitor error: {e}")

        # Wait before the next check
        time.sleep(30)  # Poll every 30 seconds


if __name__ == "__main__":
    run_monitor()
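The pending-lead condition inside `run_monitor` (synced to CE, but no vertical from the analysis yet) can be expressed as a pure predicate, which is easier to test than the polling loop. The row shape below mirrors what `get_leads()` returns in the surrounding code; `is_pending` itself is a sketch, not part of the module:

```python
import json

def is_pending(lead):
    """A lead is pending when it is synced to CE but the AI analysis
    (industry_ai/vertical) has not arrived yet."""
    if lead.get('status') != 'synced':
        return False
    enrichment = json.loads(lead['enrichment_data']) if lead.get('enrichment_data') else {}
    ce_data = enrichment.get('ce_data', {})
    vertical = ce_data.get('industry_ai') or ce_data.get('vertical')
    return bool(enrichment.get('ce_id')) and (not vertical or vertical == 'None')

# Example rows mimicking the DB shape used by the monitor
done = {'status': 'synced',
        'enrichment_data': json.dumps({'ce_id': 7, 'ce_data': {'vertical': 'Healthcare'}})}
waiting = {'status': 'synced',
           'enrichment_data': json.dumps({'ce_id': 8, 'ce_data': {}})}
print(is_pending(done), is_pending(waiting))  # False True
```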
lead-engine/trading_twins_ingest.py (new file, 157 lines)
@@ -0,0 +1,157 @@
import os
import sys
import re
import logging
import requests
import json
from datetime import datetime
from dotenv import load_dotenv

# Ensure we can import from the root directory
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

# Import db functions and parsers
try:
    from db import insert_lead, init_db
    from ingest import parse_roboplanet_form, parse_tradingtwins_html, is_free_mail
except ImportError:
    # Fallback for direct execution
    sys.path.append(os.path.dirname(__file__))
    from db import insert_lead, init_db
    from ingest import parse_roboplanet_form, parse_tradingtwins_html, is_free_mail

# Configuration
env_path = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '.env'))
load_dotenv(dotenv_path=env_path, override=True)

CLIENT_ID = os.getenv("INFO_Application_ID")
TENANT_ID = os.getenv("INFO_Tenant_ID")
CLIENT_SECRET = os.getenv("INFO_Secret")
USER_EMAIL = "info@robo-planet.de"

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


def get_access_token():
    url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
    data = {
        "client_id": CLIENT_ID,
        "scope": "https://graph.microsoft.com/.default",
        "client_secret": CLIENT_SECRET,
        "grant_type": "client_credentials"
    }
    response = requests.post(url, data=data)
    response.raise_for_status()
    return response.json().get("access_token")


def fetch_new_leads_emails(token, limit=200):
    url = f"https://graph.microsoft.com/v1.0/users/{USER_EMAIL}/messages"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    params = {
        "$top": limit,
        "$select": "id,subject,receivedDateTime,body",
        "$orderby": "receivedDateTime desc"
    }

    response = requests.get(url, headers=headers, params=params)
    if response.status_code != 200:
        logger.error(f"Graph API Error: {response.status_code} - {response.text}")
        return []

    all_msgs = response.json().get("value", [])

    # Filter client-side for both TradingTwins and Roboplanet contact forms
    filtered = [m for m in all_msgs if (
        "Neue Anfrage zum Thema Roboter" in (m.get('subject') or '') or
        "Kontaktformular Roboplanet" in (m.get('subject') or '')
    )]
    return filtered


def process_leads(auto_sync=False):
    init_db()
    new_count = 0
    try:
        token = get_access_token()
        emails = fetch_new_leads_emails(token)
        logger.info(f"Found {len(emails)} potential lead emails.")

        for email in emails:
            subject = email.get('subject') or ''
            body = email.get('body', {}).get('content', '')
            received_at_str = email.get('receivedDateTime')

            received_at = None
            if received_at_str:
                try:
                    received_at = datetime.fromisoformat(received_at_str.replace('Z', '+00:00'))
                except ValueError:
                    pass

            lead_data = {}
            source_prefix = "unknown"
            source_display_name = "Unknown"

            if "Neue Anfrage zum Thema Roboter" in subject:
                lead_data = parse_tradingtwins_html(body)
                source_prefix = "tt"
                source_display_name = "TradingTwins"
            elif "Kontaktformular Roboplanet" in subject:
                lead_data = parse_roboplanet_form(body)
                source_prefix = "rp"
                source_display_name = "Website-Formular"
            else:
                # Should not happen with the current filtering, but kept for robustness
                logger.warning(f"Skipping unknown email type: {subject}")
                continue

            lead_data['source'] = source_display_name  # New source field for the DB
            lead_data['raw_body'] = body
            lead_data['received_at'] = received_at

            # Apply general quality checks (if not already done by the parser)
            if 'is_free_mail' not in lead_data:
                lead_data['is_free_mail'] = is_free_mail(lead_data.get('email', ''))
            if 'is_low_quality' not in lead_data:
                company_name_check = lead_data.get('company', '')
                # Treat a company name of '-' as missing/invalid
                if company_name_check == '-':
                    company_name_check = ''
                lead_data['is_low_quality'] = lead_data['is_free_mail'] or not company_name_check

            company_name = lead_data.get('company')
            if not company_name or company_name == '-':
                # Fallback: if the company name is missing, use the contact name as company
                company_name = lead_data.get('contact')
                lead_data['company'] = company_name

            if not company_name:
                logger.warning(f"Skipping lead due to missing company and contact name: {subject}")
                continue

            # Ensure source_id and 'id' for db.py compatibility
            if not lead_data.get('source_id'):
                lead_data['source_id'] = f"{source_prefix}_unknown_{int(datetime.now().timestamp())}"
            lead_data['id'] = lead_data['source_id']  # db.py expects 'id' for the source_id column

            if insert_lead(lead_data):
                logger.info(f" -> Ingested ({source_prefix}): {company_name}")
                new_count += 1

        if new_count > 0 and auto_sync:
            logger.info(f"Triggering auto-sync for {new_count} new leads...")
            from enrich import run_sync
            run_sync()

        return new_count

    except Exception as e:
        logger.error(f"Error in process_leads: {e}")
        return 0


if __name__ == "__main__":
    count = process_leads()
    print(f"Ingested {count} new leads.")
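The `receivedDateTime` handling in `process_leads` replaces the trailing `Z` before calling `datetime.fromisoformat`. That replacement matters: `fromisoformat` only accepts the `Z` suffix natively from Python 3.11 on, so rewriting it as `+00:00` keeps the parse working on older interpreters. A quick standalone check with a sample Graph timestamp:

```python
from datetime import datetime, timezone

# Graph returns ISO 8601 timestamps with a trailing 'Z' (UTC)
raw = "2026-03-03T08:37:10Z"
received_at = datetime.fromisoformat(raw.replace('Z', '+00:00'))
print(received_at.tzinfo)  # an aware datetime in UTC
```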
@@ -153,8 +153,7 @@ http {

        location /lead/ {
            # Lead Engine (TradingTwins)
-           # Proxying external service on host
-           proxy_pass http://192.168.178.6:8501/;
+           proxy_pass http://lead-engine:8501/;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Upgrade $http_upgrade;
sale_342243_details.txt (new file, 2890 lines; diff suppressed because it is too large)
tasks.md (18 lines changed)
@@ -39,6 +39,24 @@

- [x] **Integrity:** Restored missing API endpoints for company creation, bulk import, and wiki overrides.

## Lead Engine: Tradingtwins Automation (in progress)

- [x] **Email ingest:** Automated import of leads from the `info@robo-planet.de` mailbox via the Microsoft Graph API.
- [x] **Parsing:** Structured extraction of demand data (area, purpose, functions) from Tradingtwins HTML.
- [x] **Contact research:** AI-supported role identification via SerpAPI and Gemini 2.0 Flash.
- [x] **CE sync:** Automatic creation of companies and contacts in the Company Explorer, including role mapping.
- [x] **Drafting:** Email generator at "Human Expert Level" with industry focus and ROI argumentation.
- [x] **UI: Visual distinction:** Leads visually differentiated by origin (TradingTwins vs. website form).
- [x] **UI: Status indicators:** Synchronization status (CE) and low-quality warnings visible directly in the lead header.
- [x] **Drafts: Persistent storage:** Generated email drafts are stored permanently in the database.
- [ ] **IT clearance:** Request the Microsoft Bookings permissions (`Bookings.Read.All`, `BookingsAppointment.ReadWrite.All`) for the Entra app and obtain admin consent.
- [ ] **Infrastructure:** Determine the correct booking link (personal account) and store it in `.env` (variable `BOOKING_LINK`).
- [ ] **CRM integration:** Develop a "Push to SuperOffice" module to create persons and email drafts (as task/activity) directly in the CRM.
- [ ] **Data synchronization:** Mirror the Notion product database into the local DB to make product selection and ROI calculation dynamic.
- [ ] **Logic:** Sharpen the ROI calculation in the email draft based on real performance data (m²/h) and prices.
- [ ] **UI:** Finalize the "Copy to Clipboard" function for the finished draft in the web app.
- [ ] **Phase 2: Intelligent replies for contact forms:** Develop context-aware reply logic for website-form leads (initially a generic confirmation, later AI-based on the message content).

## Heatmap Tool (Standalone)

### Status: Beta (functional with basic features)
@@ -1,6 +1,16 @@
import json
import time
import os
import sys

# Ensure we can import from lead-engine
sys.path.append(os.path.join(os.path.dirname(__file__), 'lead-engine'))
try:
    from trading_twins_ingest import process_leads
except ImportError:
    print("Warning: Could not import trading_twins_ingest from lead-engine. Email ingestion disabled.")
    process_leads = None

from company_explorer_connector import handle_company_workflow

def run_trading_twins_process(target_company_name: str):
@@ -46,6 +56,14 @@ def run_trading_twins_process(target_company_name: str):
    print(f"Trading Twins Analyse für {target_company_name} abgeschlossen.")
    print(f"{'='*50}\n")


def run_email_ingest():
    """Starts the automated email ingestion process for Tradingtwins leads."""
    if process_leads:
        print("\nStarting automated email ingestion via Microsoft Graph...")
        process_leads()
        print("Email ingestion completed.")
    else:
        print("Error: Email ingestion module not available.")


if __name__ == "__main__":
    # Simulate the environment variables for this test run if they are not set
@@ -54,26 +72,28 @@ if __name__ == "__main__":
    if "COMPANY_EXPLORER_API_PASSWORD" not in os.environ:
        os.environ["COMPANY_EXPLORER_API_PASSWORD"] = "gemini"

-    # Test case 1: a company that probably already exists.
-    # Since 'Robo-Planet GmbH' was created in previous runs, it should be found now.
-    run_trading_twins_process("Robo-Planet GmbH")
-
-    # Short pause between test runs
-    time.sleep(5)
-
-    # Test case 1b: a well-known, real company
-    run_trading_twins_process("Klinikum Landkreis Erding")
-
-    # Short pause between test runs
-    time.sleep(5)
-
-    # Test case 2: a new, unique company
-    new_unique_company_name = f"Trading Twins New Target {int(time.time())}"
-    run_trading_twins_process(new_unique_company_name)
-
-    # Short pause
-    time.sleep(5)
-
-    # Test case 3: another new company to verify creation
-    another_new_company_name = f"Another Demo Corp {int(time.time())}"
-    run_trading_twins_process(another_new_company_name)
+    print("Trading Twins Tool - Main Menu")
+    print("1. Process specific company name")
+    print("2. Ingest leads from Email (info@robo-planet.de)")
+    print("3. Run demo sequence (Robo-Planet, Erding, etc.)")
+
+    choice = input("\nSelect option (1-3): ").strip()
+
+    if choice == "1":
+        name = input("Enter company name: ").strip()
+        if name:
+            run_trading_twins_process(name)
+    elif choice == "2":
+        run_email_ingest()
+    elif choice == "3":
+        # Test case 1: a company that probably already exists
+        run_trading_twins_process("Robo-Planet GmbH")
+        time.sleep(2)
+        # Test case 1b: a well-known, real company
+        run_trading_twins_process("Klinikum Landkreis Erding")
+        time.sleep(2)
+        # Test case 2: a new, unique company
+        new_unique_company_name = f"Trading Twins New Target {int(time.time())}"
+        run_trading_twins_process(new_unique_company_name)
+    else:
+        print("Invalid choice.")