[2ff88f42] Full End-to-End integration: Webhooks, Auto-Enrichment, Notion-Sync, UI updates and new Connector Architecture

2026-02-19 16:05:52 +00:00
parent 262add256f
commit 17346b3fcb
21 changed files with 1107 additions and 203 deletions
--- a/connector-superoffice/Dockerfile
+++ b/connector-superoffice/Dockerfile
@@ -0,0 +1,25 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install dependencies
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy source code
+COPY . .
+
+# Expose port for Webhook
+EXPOSE 8000
+
+# Make sure scripts are executable
+RUN chmod +x start.sh
+
+# Start both worker and webhook
+CMD ["./start.sh"]
+
--- a/connector-superoffice/README.md
+++ b/connector-superoffice/README.md
@@ -1,178 +1,90 @@
-# SuperOffice Connector ("The Butler") - GTM Engine
+# SuperOffice Connector ("The Muscle") - GTM Engine

-This is the microservice that connects **SuperOffice CRM** to the **GTM Engine**.
-The connector acts as a "butler": in the background it prepares hyper-personalized marketing texts and stores them in the CRM so that sales can send them with minimal effort.
+This is the "dumb" microservice that connects **SuperOffice CRM** to the **Company Explorer Intelligence**.
+The connector acts as a pure messenger ("Muscle"): it receives webhook events, asks the "brain" (Company Explorer) for instructions, and executes them in the CRM.

-## 1. Architecture: The "Butler Model"
+## 1. Architecture: "The Intelligent Hub & The Loyal Messenger"

-We decided against complex automation *inside* SuperOffice (via CRMScript triggers) because the DEV environment is limited and the logic is more flexible when kept external.
+We chose an **event-driven architecture** to ensure scalability and real-time processing.

 **The data flow:**
-1.  **Master data (Notion):** Pains & gains are maintained per vertical in Notion.
-2.  **Text matrix (local):** The script `build_matrix.py` "multiplies" the vertical data with the role data, generates the final texts via Gemini, and stores them in a local SQLite DB (`marketing_matrix.db`).
-3.  **Polling & injection (daemon):** The script `polling_daemon_final.py` runs periodically, checks for changes to contacts in SuperOffice, and "injects" the matching texts from the local matrix DB into the UDF fields.
+1.  **Trigger:** A user changes a contact in SuperOffice (e.g. status -> `Init`).
+2.  **Transport:** SuperOffice sends a `POST` event to our webhook endpoint (`:8003/webhook`).
+3.  **Queueing:** The `Webhook Receiver` validates the event and immediately places it in a local `SQLite` queue (`connector_queue.db`).
+4.  **Processing:** A separate `Worker` process picks up the job.
+5.  **Provisioning:** The worker asks the **Company Explorer** (`POST /api/provision/superoffice-contact`): "What should I do with person ID 123?"
+6.  **Write-back:** The Company Explorer returns the finished text package (Subject, Intro, Proof). The worker writes it to the SuperOffice UDF fields via the REST API.

-```mermaid
-graph TD
-    subgraph "A: Strategy & Content"
-        NotionDB["🏢 Notion DB<br/>(Industries & Roles)"]
-    end
+## 2. Core Components

-    subgraph "B: Local Engine (Python)"
-        MatrixBuilder["⚙️ build_matrix.py"]
-        GeminiAPI["🧠 Gemini API"]
-        MatrixDB[("💾 marketing_matrix.db")]
-        PollingDaemon["🤖 polling_daemon_final.py"]
-    end
+*   **`webhook_app.py` (FastAPI):**
+    *   Listens on port `8000` (external: `8003`).
+    *   Receives events and checks the token (`WEBHOOK_SECRET`).
+    *   Writes jobs to the queue.
+    *   Endpoint: `POST /webhook`.

-    subgraph "C: CRM System"
-        SuperOfficeAPI["☁️ SuperOffice API"]
-        Contact["👤 Contact / Person<br/>(with UDFs)"]
-    end
+*   **`queue_manager.py` (SQLite):**
+    *   Manages the local job queue.
+    *   Status: `PENDING` -> `PROCESSING` -> `COMPLETED` / `FAILED`.
+    *   Persists jobs across container restarts.

-    NotionDB -->|1. Reads pains/gains| MatrixBuilder
-    MatrixBuilder -->|2. Prompts| GeminiAPI
-    GeminiAPI -->|3. Generates texts| MatrixBuilder
-    MatrixBuilder -->|4. Stores in DB| MatrixDB
-    
-    PollingDaemon -->|5. Checks for changes| SuperOfficeAPI
-    PollingDaemon -->|6. Fetches matching texts| MatrixDB
-    PollingDaemon -->|7. Writes to UDFs| SuperOfficeAPI
-    SuperOfficeAPI -->|8. Updates| Contact
-```
-
-## 2. Core Components (Scripts)
+*   **`worker.py`:**
+    *   Runs as a background process.
+    *   Polls the queue every 5 seconds.
+    *   Communicates with the Company Explorer (internal: `http://company-explorer:8000`) and the SuperOffice API.
+    *   Handles errors and retries.
 *   **`superoffice_client.py`:**
-    *   The centerpiece. A Python class that encapsulates all communication with the SuperOffice API (auth, GET, PUT, search).
-    *   Handles pagination so that large data sets can be processed as well.
-
-*   **`build_matrix.py`:**
-    *   The "content generator".
-    *   Reads pains/gains from Notion.
-    *   Iterates over the defined verticals and roles.
-    *   Uses Gemini to create the specific text blocks (`Subject`, `Intro`, `SocialProof`).
-    *   Stores the result in `marketing_matrix.db`.
-
-*   **`polling_daemon_final.py`:**
-    *   The automation engine.
-    *   Runs in a loop (e.g. every 15 minutes), only during business hours (Mon-Fri, 7:00-18:00).
-    *   Queries SuperOffice for recently changed contacts.
-    *   Uses a **hash system** to check whether a *relevant* property (vertical or role) has changed, avoiding unnecessary API writes.
-    *   When a change is detected, it kicks off the update process.
-
-*   **`normalize_persona.py`:**
-    *   A helper script (classifier) that maps job titles to our 4 archetypes (Operational, Infrastructure, Economic, Innovation).
+    *   Encapsulates the SuperOffice REST API (auth, GET, PUT).
+    *   Manages refresh tokens.

 ## 3. Setup & Configuration

-### Installation
-```bash
-# In a Python 3.11+ environment:
-pip install -r requirements.txt
-```
+### Docker Service
+The service runs in the `connector-superoffice` container.
+`start.sh` starts both the web server and the worker.

 ### Configuration (`.env`)
-The connector requires a `.env` file in the root directory with the following content:
-```ini
-# Gemini API keys (DEV for tests, PROD for live tools)
-GEMINI_API_KEY_DEV="AIza..."
-GEMINI_API_KEY_PROD="AIza..."
-GEMINI_API_KEY=${GEMINI_API_KEY_DEV} # Sets the default
+The connector requires the following variables (set in `docker-compose.yml`):

-# Notion
-NOTION_API_KEY="secret_..."
-
-# SuperOffice Client Configuration
-SO_CLIENT_ID="<Client ID / App ID>"
-SO_CLIENT_SECRET="<Client Secret>"
-SO_CONTEXT_IDENTIFIER="CustXXXXX"
-SO_REFRESH_TOKEN="<Refresh Token>"
-SO_ENVIRONMENT="sod" # or "online" for production
+```yaml
+environment:
+  API_USER: "admin"
+  API_PASSWORD: "..."
+  COMPANY_EXPLORER_URL: "http://company-explorer:8000" # Internal Docker address
+  WEBHOOK_SECRET: "changeme" # Must match the SO webhook config
+  # Plus the SuperOffice credentials (Client ID, Secret, Refresh Token)
 ```

-## 4. SuperOffice Configuration (Admin)
+## 4. API Interface (Internal)

-The following **user-defined fields (UDFs)** must exist. The ProgIds are environment-specific and must be verified per system (DEV/PROD).
+The connector calls the Company Explorer and passes along **live data** from the CRM for the "Double Truth" reconciliation:

-| Object | Field name (label) | Type | ProgId (DEV) | ProgId (PROD) |
-| :--- | :--- | :--- | :--- | :--- |
-| Company | Contact Vertical | List | `SuperOffice:5` | *tbd* |
-| Company | AI Challenge Satz | Text | `SuperOffice:6` | *tbd* |
-| Person | Person Role | List | `SuperOffice:3` | *tbd* |
-| Person | Marketing Subject | Text | `SuperOffice:5` | *tbd* |
-| Person | Marketing Intro | Text | `SuperOffice:6` | *tbd* |
-| Person | Marketing Proof | Text | `SuperOffice:7` | *tbd* |
-| Person | AI Copy Hash | Text | `SuperOffice:8` | *tbd* |
+**Request:** `POST /api/provision/superoffice-contact`
+```json
+{
+  "so_contact_id": 12345,
+  "so_person_id": 67890,
+  "crm_name": "RoboPlanet GmbH",
+  "crm_website": "www.roboplanet.de",
+  "job_title": "Geschäftsführer"
+}
+```

-### B. List Values & IDs (DEV Environment)
+**Response:**
+```json
+{
+  "status": "success",
+  "texts": {
+    "subject": "Optimierung Ihrer Logistik...",
+    "intro": "Als Logistikleiter kennen Sie...",
+    "social_proof": "Wir helfen bereits Firma X..."
+  }
+}
+```

-The following IDs are used as references in the scripts. They must be re-validated for PROD.
+## 5. Open To-Dos (Roadmap)

-**MA Status (`x_ma_status`):**
-* `Init`
-* `Ready_to_Craft`
-* `Ready_to_Send`
-* `Sent_Week1`
-* `...`
-
-**Roles (`x_person_role` / `SuperOffice:3`):**
-| ID | Name |
-| :--- | :--- |
-| `19` | Operativer Entscheider |
-| `20` | Infrastruktur-Verantwortlicher |
-| `21` | Wirtschaftlicher Entscheider |
-| `22` | Innovations-Treiber |
-
-**Verticals (`x_contact_vertical` / `SuperOffice:5`):**
-| ID | Name |
-| :--- | :--- |
-| `23` | Logistics - Warehouse |
-| `24` | Healthcare - Hospital |
-| `25` | Infrastructure - Transport |
-| `26` | Leisure - Indoor Active |
-| `...` | *tbd* |
-
-## 5. "Gotchas" & Lessons Learned (Update Feb 16, 2026)
-
-*   **API URL:** The `sod` tenant `Cust55774` is only reachable via `https://app-sod.superoffice.com`.
-*   **List IDs:** The API returns IDs of list fields in the format `[I:26]`. The string must be parsed to the integer `26` before the DB query.
-*   **Write-back (core fields):**
-    *   **UrlAddress & Phones:** The simple root field `UrlAddress` is often read-only on `PUT`. To set the website or phone numbers, the corresponding list (`Urls` or `Phones`) must be sent as an array of objects (e.g. `{"Value": "...", "Description": "..."}`).
-    *   **Mandatory fields:** When updating a `Contact` object, mandatory fields such as `Name` and `Number2` (or `Number1`) must be included in the payload, otherwise server-side validation fails.
-    *   **Full object PUT:** SuperOffice REST overwrites the entire object. Fields missing from the `PUT` payload are cleared in the CRM. It is advisable to load the object via `GET` first, apply the changes, and then send the entire object back.
-*   **Dev system limits:** Text fields in the DEV system are limited to 40 characters.
-*   **Y-tables & triggers:** Direct access to extra tables and CRMScript triggers is blocked in the SOD DEV tenant.
-
-## 6. Refactoring Plan for Scalability (Feb 19, 2026)
-
-The current implementation is transaction-based and not designed for bulk processing. The following plan describes the refactoring needed to make the interface performant and robust for batch operation.
-
-### 6.1. Core Problem: Transaction vs. Batch
-*   **Current state:** A changed contact triggers a chain of individual API calls (read, hash, write).
-*   **Target state:** A periodic job collects all changes and processes them in a single, optimized pass.
-
-### 6.2. Measures & Technical Implementation
-
-1.  **Batch-capable client methods in `superoffice_client.py`:**
-    *   The SuperOffice API supports batch operations via the `/$batch` endpoint.
-    *   **Task:** Implement a new method `execute_batch(requests)` in the client. This method takes a list of individual API requests (e.g. several `PUT` operations), bundles them into a single `POST` request to the `/$batch` endpoint, and processes the combined response.
-
-2.  **Introduce a "change tracking" mechanism:**
-    *   **Problem:** We need to detect efficiently which contacts must be updated since the last run.
-    *   **Task:** We extend our local Company Explorer database with a field `needs_sync (boolean)` or `last_so_sync (datetime)`. When a change occurs in the CE, the flag is set.
-    *   The `polling_daemon` is reworked: instead of comparing individual contacts, it collects all contacts where `needs_sync` is `true`.
-
-3.  **New workflow in `polling_daemon_final.py`:**
-    *   **Phase 1: Collect:** The daemon queries the local DB: `SELECT * FROM contacts WHERE needs_sync = true`.
-    *   **Phase 2: Build the batch request:** For each contact in the list, the required `PUT` request for the SuperOffice API is prepared (but not yet sent).
-    *   **Phase 3: Execute:** All prepared requests are passed to the new `execute_batch()` method in the client.
-    *   **Phase 4: Result processing & error handling:** The batch response from SuperOffice is analyzed.
-        *   For each successfully processed contact, `needs_sync` is set to `false` in the local DB.
-        *   Failed updates are logged in detail (incl. contact ID and error message) without aborting the entire batch.
-
-4.  **Define a conflict resolution strategy:**
-    *   **Rule:** "SuperOffice takes priority."
-    *   **Implementation:** The polling daemon will additionally run a hash comparison for a larger set of contacts at regular intervals (e.g. every 6 hours). If it finds a discrepancy (because a contact was changed manually in SO), the change from SuperOffice is adopted into the Company Explorer and the `needs_sync` flag for this contact is *not* set. This prevents manual changes from being overwritten.
-
-This refactoring is a prerequisite for production use and ensures that the interface remains stable and performant even with thousands of contacts.
+*   [ ] **UDF mapping:** The `ProgId`s (e.g. `SuperOffice:5`) are currently hard-coded in `worker.py`. They must be moved into a config.
+*   [ ] **Error handling:** What happens when the Company Explorer reports "404 Not Found"? (Currently: `worker.py` logs a warning and retries later.)
+*   [ ] **Redis:** Under very high load (>100 events/second), the SQLite queue should be replaced by Redis.
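Review note on the UDF mapping to-do: a minimal sketch of how the hard-coded `ProgId` mapping could be moved out of `worker.py`. The environment variable `UDF_MAPPING_JSON` and the helper `load_udf_mapping` are hypothetical names that do not exist in this commit; the defaults mirror the DEV values from `UDF_MAPPING` in `worker.py`.

```python
import json
import os

# DEV defaults, copied from UDF_MAPPING in worker.py.
DEFAULT_UDF_MAPPING = {
    "subject": "SuperOffice:5",
    "intro": "SuperOffice:6",
    "social_proof": "SuperOffice:7",
}


def load_udf_mapping(env=None):
    """Return the UDF mapping, overridden by UDF_MAPPING_JSON (hypothetical) if set."""
    env = os.environ if env is None else env
    raw = env.get("UDF_MAPPING_JSON", "").strip()
    if not raw:
        return dict(DEFAULT_UDF_MAPPING)
    overrides = json.loads(raw)
    if not isinstance(overrides, dict):
        raise ValueError("UDF_MAPPING_JSON must be a JSON object")
    # Overrides win; keys that are not overridden keep their DEV default.
    return {**DEFAULT_UDF_MAPPING, **overrides}
```

With this shape, a PROD deployment would only need to set the variable in `docker-compose.yml`, and unset keys would fall back to the DEV values.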
--- a/connector-superoffice/queue_manager.py
+++ b/connector-superoffice/queue_manager.py
@@ -0,0 +1,106 @@
+import sqlite3
+import json
+from datetime import datetime, timedelta
+import os
+
+DB_PATH = os.getenv("DB_PATH", "connector_queue.db")
+
+class JobQueue:
+    def __init__(self):
+        self._init_db()
+
+    def _init_db(self):
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute("""
+                CREATE TABLE IF NOT EXISTS jobs (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    event_type TEXT,
+                    payload TEXT,
+                    status TEXT DEFAULT 'PENDING',
+                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                    error_msg TEXT,
+                    next_try_at TIMESTAMP
+                )
+            """)
+            # Migration for existing DBs
+            try:
+                conn.execute("ALTER TABLE jobs ADD COLUMN next_try_at TIMESTAMP")
+            except sqlite3.OperationalError:
+                pass
+
+    def add_job(self, event_type: str, payload: dict):
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute(
+                "INSERT INTO jobs (event_type, payload, status) VALUES (?, ?, ?)",
+                (event_type, json.dumps(payload), 'PENDING')
+            )
+
+    def get_next_job(self):
+        """
+        Atomically fetches the next pending job where next_try_at is reached.
+        """
+        job = None
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.row_factory = sqlite3.Row
+            cursor = conn.cursor()
+            
+            # Lock the job
+            cursor.execute("BEGIN EXCLUSIVE")
+            try:
+                cursor.execute("""
+                    SELECT id, event_type, payload, created_at 
+                    FROM jobs 
+                    WHERE status = 'PENDING' 
+                    AND (next_try_at IS NULL OR next_try_at <= datetime('now'))
+                    ORDER BY created_at ASC 
+                    LIMIT 1
+                """)
+                row = cursor.fetchone()
+                
+                if row:
+                    job = dict(row)
+                    # Mark as processing
+                    cursor.execute(
+                        "UPDATE jobs SET status = 'PROCESSING', updated_at = datetime('now') WHERE id = ?",
+                        (job['id'],)
+                    )
+                    conn.commit()
+                else:
+                    conn.rollback() # No job found
+            except Exception:
+                conn.rollback()
+                raise
+                
+        if job:
+            job['payload'] = json.loads(job['payload'])
+            
+        return job
+
+    def retry_job_later(self, job_id, delay_seconds=60):
+        next_try = datetime.utcnow() + timedelta(seconds=delay_seconds)
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute(
+                "UPDATE jobs SET status = 'PENDING', next_try_at = ?, updated_at = datetime('now') WHERE id = ?",
+                (next_try, job_id)
+            )
+
+    def complete_job(self, job_id):
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute(
+                "UPDATE jobs SET status = 'COMPLETED', updated_at = datetime('now') WHERE id = ?",
+                (job_id,)
+            )
+
+    def fail_job(self, job_id, error_msg):
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute(
+                "UPDATE jobs SET status = 'FAILED', error_msg = ?, updated_at = datetime('now') WHERE id = ?",
+                (str(error_msg), job_id)
+            )
+
+    def get_stats(self):
+        with sqlite3.connect(DB_PATH) as conn:
+            cursor = conn.cursor()
+            cursor.execute("SELECT status, COUNT(*) FROM jobs GROUP BY status")
+            return dict(cursor.fetchall())
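Review note on `get_next_job`: the claim step (select the oldest `PENDING` row and flip it to `PROCESSING` inside one transaction) can be tried in isolation. The sketch below is a simplified variant, not the code from this commit: it uses `BEGIN IMMEDIATE` on an autocommit connection instead of `BEGIN EXCLUSIVE`, omits `next_try_at`, and the names `claim_next` and `demo` are made up for illustration.

```python
import os
import sqlite3
import tempfile


def claim_next(db_path):
    """Claim the oldest PENDING job, or return None if the queue is empty."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT below are the only transaction boundaries.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'PENDING' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", row)
        conn.execute("COMMIT")
        return row[0] if row else None
    except Exception:
        # Only roll back if BEGIN actually succeeded.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()


def demo():
    """Build a throwaway queue with two jobs and try to claim three times."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo_queue.db")
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE jobs (id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "status TEXT DEFAULT 'PENDING')"
        )
        conn.executemany("INSERT INTO jobs (status) VALUES (?)", [("PENDING",)] * 2)
        conn.commit()
        conn.close()
        return [claim_next(db_path) for _ in range(3)]
```

Because the lock is taken before the `SELECT`, two workers calling `claim_next` concurrently should never receive the same job id; the second caller waits for the first transaction to finish.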
--- a/connector-superoffice/register_webhook.py
+++ b/connector-superoffice/register_webhook.py
@@ -0,0 +1,60 @@
+import sys
+import os
+from superoffice_client import SuperOfficeClient
+
+# Configuration
+WEBHOOK_NAME = "Gemini Connector Hook"
+TARGET_URL = "https://floke-ai.duckdns.org/connector/webhook?token=changeme" # Token must match WEBHOOK_SECRET in .env
+EVENTS = [
+    "contact.created",
+    "contact.changed",
+    "person.created",
+    "person.changed"
+]
+
+def register():
+    print("🚀 Initializing SuperOffice Client...")
+    try:
+        client = SuperOfficeClient()
+    except Exception as e:
+        print(f"❌ Failed to connect: {e}")
+        return
+
+    print("🔎 Checking existing webhooks...")
+    webhooks = client._get("Webhook")
+    
+    if webhooks and 'value' in webhooks:
+        for wh in webhooks['value']:
+            if wh['Name'] == WEBHOOK_NAME:
+                print(f"⚠️  Webhook '{WEBHOOK_NAME}' already exists (ID: {wh['WebhookId']}).")
+                
+                # Check if URL matches
+                if wh['TargetUrl'] != TARGET_URL:
+                    print("   ⚠️ URL mismatch! The existing webhook points to a different target.")
+                    # The generic client has no _delete method yet, so there is no auto-fix.
+                    print("   Please delete it manually via API or extend client.")
+                return
+
+    print(f"✨ Registering new webhook: {WEBHOOK_NAME}")
+    payload = {
+        "Name": WEBHOOK_NAME,
+        "Events": EVENTS,
+        "TargetUrl": TARGET_URL,
+        "Secret": "changeme", # Used for signature calculation by SO
+        "State": "Active",
+        "Type": "Webhook"
+    }
+
+    try:
+        # Note: _post is defined in your client, returns JSON
+        res = client._post("Webhook", payload)
+        if res and "WebhookId" in res:
+            print(f"✅ SUCCESS! Webhook created with ID: {res['WebhookId']}")
+            print(f"   Target: {res['TargetUrl']}")
+        else:
+            print(f"❌ Creation failed. Response: {res}")
+    except Exception as e:
+        print(f"❌ Error during registration: {e}")
+
+if __name__ == "__main__":
+    register()
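Review note on `TARGET_URL`: the token is hard-coded twice in this script (`TARGET_URL` and `Secret`), while `webhook_app.py` reads it from `WEBHOOK_SECRET`. A minimal sketch of deriving the URL from the environment instead, assuming a hypothetical `WEBHOOK_BASE_URL` variable and a hypothetical helper `build_target_url` (neither exists in this commit):

```python
import os
from urllib.parse import urlencode


def build_target_url(base_url, secret):
    """Append the shared secret as the `token` query parameter, URL-encoded."""
    return f"{base_url}?{urlencode({'token': secret})}"


# WEBHOOK_BASE_URL is a hypothetical variable name; WEBHOOK_SECRET is the one
# webhook_app.py already reads. The fallbacks are the values from this script.
TARGET_URL = build_target_url(
    os.getenv("WEBHOOK_BASE_URL", "https://floke-ai.duckdns.org/connector/webhook"),
    os.getenv("WEBHOOK_SECRET", "changeme"),
)
```

Encoding the secret matters as soon as it contains characters such as `&` or spaces, which would otherwise change the query string.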
--- a/connector-superoffice/requirements.txt
+++ b/connector-superoffice/requirements.txt
@@ -3,4 +3,8 @@ python-dotenv
 cryptography
 pyjwt
 xmltodict
-holidays
+holidays
+fastapi
+uvicorn
+schedule
+sqlalchemy
--- a/connector-superoffice/start.sh
+++ b/connector-superoffice/start.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+# Start Worker in background
+python worker.py &
+
+# Start Webhook Server in foreground
+uvicorn webhook_app:app --host 0.0.0.0 --port 8000
--- a/connector-superoffice/superoffice_client.py
+++ b/connector-superoffice/superoffice_client.py
@@ -9,12 +9,19 @@ class SuperOfficeClient:
    """A client for interacting with the SuperOffice REST API."""

    def __init__(self):
-        self.client_id = os.getenv("SO_CLIENT_ID") or os.getenv("SO_SOD")
-        self.client_secret = os.getenv("SO_CLIENT_SECRET")
-        self.refresh_token = os.getenv("SO_REFRESH_TOKEN")
-        self.redirect_uri = os.getenv("SO_REDIRECT_URI", "http://localhost")
-        self.env = os.getenv("SO_ENVIRONMENT", "sod")
-        self.cust_id = os.getenv("SO_CONTEXT_IDENTIFIER", "Cust55774") # Fallback for your dev
+        # Helper to strip quotes if Docker passed them literally
+        def get_clean_env(key, default=None):
+            val = os.getenv(key)
+            if val and val.strip(): # Check if not empty string
+                return val.strip('"').strip("'")
+            return default
+
+        self.client_id = get_clean_env("SO_CLIENT_ID") or get_clean_env("SO_SOD")
+        self.client_secret = get_clean_env("SO_CLIENT_SECRET")
+        self.refresh_token = get_clean_env("SO_REFRESH_TOKEN")
+        self.redirect_uri = get_clean_env("SO_REDIRECT_URI", "http://localhost")
+        self.env = get_clean_env("SO_ENVIRONMENT", "sod")
+        self.cust_id = get_clean_env("SO_CONTEXT_IDENTIFIER", "Cust55774") # Fallback for your dev

        if not all([self.client_id, self.client_secret, self.refresh_token]):
            raise ValueError("SuperOffice credentials missing in .env file.")
@@ -34,6 +41,7 @@ class SuperOfficeClient:
    def _refresh_access_token(self):
        """Refreshes and returns a new access token."""
        url = f"https://{self.env}.superoffice.com/login/common/oauth/tokens"
+        print(f"DEBUG: Refresh URL: '{url}' (Env: '{self.env}')") # DEBUG
        data = {
            "grant_type": "refresh_token",
            "client_id": self.client_id,
--- a/connector-superoffice/webhook_app.py
+++ b/connector-superoffice/webhook_app.py
@@ -0,0 +1,56 @@
+from fastapi import FastAPI, Request, HTTPException, BackgroundTasks
+import logging
+import os
+import json
+from queue_manager import JobQueue
+
+# Logging Setup
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger("connector-webhook")
+
+app = FastAPI(title="SuperOffice Connector Webhook", version="2.0")
+queue = JobQueue()
+
+WEBHOOK_SECRET = os.getenv("WEBHOOK_SECRET", "changeme")
+
+@app.post("/webhook")
+async def receive_webhook(request: Request, background_tasks: BackgroundTasks):
+    """
+    Endpoint for SuperOffice Webhooks.
+    """
+    # 1. Verify Secret (Basic Security)
+    # SuperOffice puts signature in headers, but for custom webhook we might just use query param or header
+    # Let's assume for now a shared secret in header 'X-SuperOffice-Signature' or similar
+    # Or simply a secret in the URL: /webhook?token=...
+    
+    token = request.query_params.get("token")
+    if token != WEBHOOK_SECRET:
+        logger.warning(f"Invalid webhook token attempt: {token}")
+        raise HTTPException(403, "Invalid Token")
+
+    try:
+        payload = await request.json()
+        logger.info(f"Received webhook payload: {payload}")
+        
+        event_type = payload.get("Event", "unknown")
+        
+        # Add to local Queue
+        queue.add_job(event_type, payload)
+        
+        return {"status": "queued"}
+        
+    except Exception as e:
+        logger.error(f"Error processing webhook: {e}", exc_info=True)
+        raise HTTPException(500, "Internal Server Error")
+
+@app.get("/health")
+def health():
+    return {"status": "ok"}
+
+@app.get("/stats")
+def stats():
+    return queue.get_stats()
+
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run("webhook_app:app", host="0.0.0.0", port=8000, reload=True)
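Review note on the token check in `receive_webhook`: the comparison `token != WEBHOOK_SECRET` could be replaced by a constant-time comparison from the standard library. A minimal sketch, where `token_is_valid` is a hypothetical helper and not part of this commit:

```python
import hmac


def token_is_valid(supplied, expected):
    """Compare the supplied webhook token with the expected secret in constant time."""
    if not supplied or not expected:
        return False
    # compare_digest needs both arguments to have the same type; encode to bytes
    # so that non-ASCII tokens are handled too.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

In the handler this would read `if not token_is_valid(token, WEBHOOK_SECRET): raise HTTPException(403, "Invalid Token")`. It also rejects a missing `token` parameter explicitly instead of relying on `None != WEBHOOK_SECRET`.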
--- a/connector-superoffice/worker.py
+++ b/connector-superoffice/worker.py
@@ -0,0 +1,280 @@
+import time
+import logging
+import os
+import requests
+import json
+from queue_manager import JobQueue
+from superoffice_client import SuperOfficeClient
+
+# Setup Logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger("connector-worker")
+
+# Config
+COMPANY_EXPLORER_URL = os.getenv("COMPANY_EXPLORER_URL", "http://company-explorer:8000")
+POLL_INTERVAL = 5 # Seconds
+
+# UDF Mapping (DEV) - Should be moved to config later
+UDF_MAPPING = {
+    "subject": "SuperOffice:5",
+    "intro": "SuperOffice:6",
+    "social_proof": "SuperOffice:7"
+}
+
+def process_job(job, so_client: SuperOfficeClient):
+    """
+    Core logic for processing a single job.
+    """
+    logger.info(f"Processing Job {job['id']} ({job['event_type']})")
+    payload = job['payload']
+    event_low = job['event_type'].lower()
+    
+    # 1. Extract IDs from Webhook Payload
+    person_id = None
+    contact_id = None
+    
+    if "PersonId" in payload:
+        person_id = int(payload["PersonId"])
+    elif "PrimaryKey" in payload and "person" in event_low:
+        person_id = int(payload["PrimaryKey"])
+        
+    if "ContactId" in payload:
+        contact_id = int(payload["ContactId"])
+    elif "PrimaryKey" in payload and "contact" in event_low:
+        contact_id = int(payload["PrimaryKey"])
+
+    # Fallback/Deep Lookup
+    if not contact_id and person_id:
+        person_data = so_client.get_person(person_id)
+        if person_data and "Contact" in person_data:
+             contact_id = person_data["Contact"].get("ContactId")
+             
+    if not contact_id:
+        raise ValueError(f"Could not identify ContactId in payload: {payload}")
+
+    logger.info(f"Target: Person {person_id}, Contact {contact_id}")
+
+    # --- Cascading Logic ---
+    # If a company changes, we want to update all its persons eventually.
+    # We do this by adding "person.changed" jobs for each person to the queue.
+    if "contact" in event_low and not person_id:
+        logger.info(f"Company event detected. Triggering cascade for all persons of Contact {contact_id}.")
+        try:
+            persons = so_client.search(f"Person?$filter=contact/contactId eq {contact_id}")
+            if persons:
+                q = JobQueue()
+                for p in persons:
+                    p_id = p.get("PersonId")
+                    if p_id:
+                        logger.info(f"Cascading: Enqueueing job for Person {p_id}")
+                        q.add_job("person.changed", {"PersonId": p_id, "ContactId": contact_id, "Source": "Cascade"})
+        except Exception as e:
+            logger.warning(f"Failed to cascade to persons for contact {contact_id}: {e}")
+
+    # 1b. Fetch full contact details for 'Double Truth' check (Master Data Sync)
+    crm_name = None
+    crm_website = None
+    try:
+        contact_details = so_client.get_contact(contact_id)
+        if contact_details:
+            crm_name = contact_details.get("Name")
+            crm_website = contact_details.get("UrlAddress")
+            if not crm_website and "Urls" in contact_details and contact_details["Urls"]:
+                crm_website = contact_details["Urls"][0].get("Value")
+    except Exception as e:
+        logger.warning(f"Failed to fetch contact details for {contact_id}: {e}")
+
+    # 2. Call Company Explorer Provisioning API
+    ce_url = f"{COMPANY_EXPLORER_URL}/api/provision/superoffice-contact"
+    ce_req = {
+        "so_contact_id": contact_id,
+        "so_person_id": person_id,
+        "job_title": payload.get("JobTitle"),
+        "crm_name": crm_name,
+        "crm_website": crm_website
+    }
+    
+    ce_auth = (os.getenv("API_USER", "admin"), os.getenv("API_PASSWORD", "gemini"))
+    
+    try:
+        resp = requests.post(ce_url, json=ce_req, auth=ce_auth)
+        if resp.status_code == 404:
+            logger.warning("Company Explorer returned 404. Retrying later.")
+            return "RETRY"
+        
+        resp.raise_for_status()
+        provisioning_data = resp.json()
+        
+        if provisioning_data.get("status") == "processing":
+            logger.info(f"Company Explorer is processing {provisioning_data.get('company_name', 'Unknown')}. Re-queueing job.")
+            return "RETRY"
+            
+    except requests.exceptions.RequestException as e:
+        raise Exception(f"Company Explorer API failed: {e}") from e
+
+    logger.info(f"CE Response for Contact {contact_id}: {json.dumps(provisioning_data)}") # DEBUG
+
+    # 2b. Sync Vertical to SuperOffice (Company Level)
+    vertical_name = provisioning_data.get("vertical_name")
+    
+    if vertical_name:
+        # Mappings from README
+        VERTICAL_MAP = {
+            "Logistics - Warehouse": 23,
+            "Healthcare - Hospital": 24,
+            "Infrastructure - Transport": 25,
+            "Leisure - Indoor Active": 26
+        }
+        
+        vertical_id = VERTICAL_MAP.get(vertical_name)
+        
+        if vertical_id:
+            logger.info(f"Identified Vertical '{vertical_name}' -> ID {vertical_id}")
+            try:
+                # Check current value to avoid loops
+                current_contact = so_client.get_contact(contact_id)
+                current_udfs = current_contact.get("UserDefinedFields", {})
+                current_val = current_udfs.get("SuperOffice:5", "")
+                
+                # Normalize SO list ID format (e.g., "[I:26]" -> "26")
+                if current_val and current_val.startswith("[I:"):
+                    current_val = current_val.split(":")[1].strip("]")
+                
+                if str(current_val) != str(vertical_id):
+                    logger.info(f"Updating Contact {contact_id} Vertical: {current_val} -> {vertical_id}")
+                    so_client.update_entity_udfs(contact_id, "Contact", {"SuperOffice:5": str(vertical_id)})
+                else:
+                    logger.info(f"Vertical for Contact {contact_id} already in sync ({vertical_id}).")
+            except Exception as e:
+                logger.error(f"Failed to sync vertical for Contact {contact_id}: {e}")
+        else:
+            logger.warning(f"Vertical '{vertical_name}' not found in internal mapping.")
+
+    # 2c. Sync Website (Company Level)
+    # TEMPORARILY DISABLED TO PREVENT LOOP (SO API Read-after-Write latency or field mapping issue)
+    """
+    website = provisioning_data.get("website")
+    if website and website != "k.A.":
+        try:
+            # Re-fetch contact to ensure we work on latest version (Optimistic Concurrency)
+            contact_data = so_client.get_contact(contact_id)
+            current_url = contact_data.get("UrlAddress", "")
+            
+            # Normalize for comparison
+            def norm(u): return str(u).lower().replace("https://", "").replace("http://", "").strip("/") if u else ""
+            
+            if norm(current_url) != norm(website):
+                logger.info(f"Updating Website for Contact {contact_id}: {current_url} -> {website}")
+                
+                # Update Urls collection (Rank 1)
+                new_urls = []
+                if "Urls" in contact_data:
+                    found = False
+                    for u in contact_data["Urls"]:
+                        if u.get("Rank") == 1:
+                            u["Value"] = website
+                            found = True
+                        new_urls.append(u)
+                    if not found:
+                        new_urls.append({"Value": website, "Rank": 1, "Description": "Website"})
+                    contact_data["Urls"] = new_urls
+                else:
+                    contact_data["Urls"] = [{"Value": website, "Rank": 1, "Description": "Website"}]
+                
+                # Also set main field if empty
+                if not current_url:
+                    contact_data["UrlAddress"] = website
+
+                # Write back full object
+                so_client._put(f"Contact/{contact_id}", contact_data)
+            else:
+                logger.info(f"Website for Contact {contact_id} already in sync.")
+                
+        except Exception as e:
+            logger.error(f"Failed to sync website for Contact {contact_id}: {e}")
+    """
+
+    # 3. Update SuperOffice (Only if person_id is present)
+    if not person_id:
+        logger.info("Sync complete (Company only). No texts to write back.")
+        return "SUCCESS"
+
+    texts = provisioning_data.get("texts") or {}  # also covers an explicit null in the payload
+    if not any(texts.values()):
+        logger.info("No texts returned from Matrix (yet). Skipping write-back.")
+        return "SUCCESS"
+
+    udf_update = {}
+    for key in ("subject", "intro", "social_proof"):
+        if texts.get(key):
+            udf_update[UDF_MAPPING[key]] = texts[key]
+    
+    if udf_update:
+        # Loop Prevention
+        try:
+            current_person = so_client.get_person(person_id)
+            current_udfs = current_person.get("UserDefinedFields", {})
+            # Only write when at least one UDF differs from what is already stored
+            needs_update = any(
+                current_udfs.get(key, "") != new_val
+                for key, new_val in udf_update.items()
+            )
+            
+            if needs_update:
+                logger.info(f"Applying update to Person {person_id} (Changes detected).")
+                success = so_client.update_entity_udfs(person_id, "Person", udf_update)
+                if not success:
+                    raise RuntimeError("Failed to update SuperOffice UDFs")
+            else:
+                logger.info(f"Skipping update for Person {person_id}: Values match (Loop Prevention).")
+                
+        except Exception as e:
+            logger.error(f"Failed to sync UDFs for Person {person_id}: {e}")
+            raise
+
+    logger.info("Job successfully processed.")
+    return "SUCCESS"
+
+def run_worker():
+    queue = JobQueue()
+    
+    # Initialize SO Client with retry
+    so_client = None
+    while not so_client:
+        try:
+            so_client = SuperOfficeClient()
+        except Exception as e:
+            logger.critical(f"Failed to initialize SuperOffice Client: {e}. Retrying in 30s...")
+            time.sleep(30)
+
+    logger.info("Worker started. Polling queue...")
+    
+    while True:
+        try:
+            job = queue.get_next_job()
+            if job:
+                try:
+                    result = process_job(job, so_client)
+                    if result == "RETRY":
+                        queue.retry_job_later(job['id'], delay_seconds=120)
+                    else:
+                        queue.complete_job(job['id'])
+                except Exception as e:
+                    logger.error(f"Job {job['id']} failed: {e}", exc_info=True)
+                    queue.fail_job(job['id'], str(e))
+            else:
+                time.sleep(POLL_INTERVAL)
+                
+        except Exception as e:
+            logger.error(f"Worker loop error: {e}", exc_info=True)
+            time.sleep(POLL_INTERVAL)
+
+if __name__ == "__main__":
+    run_worker()