[30388f42] Infrastructure Hardening: Repaired CE/Connector DB schema, fixed frontend styling build, implemented robust echo shield in worker v2.1.1, and integrated Lead Engine into gateway.

2026-03-07 14:08:42 +00:00
parent efcaa57cf0
commit ae2303b733
404 changed files with 24100 additions and 13301 deletions


@@ -1,90 +1,64 @@
# SuperOffice Connector ("The Muscle") - GTM Engine
# SuperOffice Connector README
This is the "dumb" microservice that connects **SuperOffice CRM** to the **Company Explorer Intelligence**.
The connector acts purely as a messenger ("Muscle"): it receives webhook events, asks the "brain" (Company Explorer) for instructions, and carries them out in the CRM.
## Overview
This directory contains Python scripts designed to integrate with the SuperOffice CRM API, primarily for data enrichment and lead generation automation.
## 1. Architecture: "The Intelligent Hub & The Loyal Messenger"
## 🚀 Production Deployment (March 2026)
We chose an **event-driven architecture** to ensure scalability and real-time processing.
**Status:** ✅ Live & Operational
**Environment:** `online3` (Production)
**Tenant:** `Cust26720`
**The data flow:**
1. **Trigger:** A user changes a contact in SuperOffice (e.g. status -> `Init`).
2. **Transport:** SuperOffice sends a `POST` event to our webhook endpoint (`:8003/webhook`).
3. **Queueing:** The `Webhook Receiver` validates the event and immediately places it in a local `SQLite` queue (`connector_queue.db`).
4. **Processing:** A separate `Worker` process picks up the job.
5. **Provisioning:** The worker asks the **Company Explorer** (`POST /api/provision/superoffice-contact`): "What should I do with person ID 123?"
6. **Write-Back:** The Company Explorer returns the finished text package (Subject, Intro, Proof). The worker writes it to the SuperOffice UDF fields via the REST API.
### 1. Architecture & Flow
1. **Trigger:** SuperOffice sends a webhook (`contact.created`, `contact.changed`, `person.created`, `person.changed`) to `https://floke-ai.duckdns.org/connector/webhook`.
2. **Reception:** `webhook_app.py` (FastAPI) receives the event, validates the `WEBHOOK_TOKEN`, and pushes a job to the SQLite queue (`connector_queue.db`).
3. **Processing:** `worker.py` (v1.9.1) polls the queue, filters for relevance, fetches details from SuperOffice, and calls the **Company Explorer** for AI analysis.
4. **Sync:** Results (Vertical, Summary, hyper-personalized Openers) are patched back into SuperOffice `UserDefinedFields`.
## 2. Core Components
### 2. Operational Resilience (Lessons Learned)
* **`webhook_app.py` (FastAPI):**
* Listens on port `8000` (external: `8003`).
* Accepts events and checks the token (`WEBHOOK_SECRET`).
* Writes jobs to the queue.
* Endpoint: `POST /webhook`.
#### 🛡️ A. Noise Reduction & Loop Prevention
* **Problem:** Every data update (`PATCH`) triggers a new `contact.changed` webhook, creating infinite loops (Ping-Pong effect).
* **Solution 1 (Whitelist):** Only changes to `name`, `urladdress`, `urls`, `orgnr`, or `userdef_id` (UDFs) trigger processing.
* **Solution 2 (Circuit Breaker):** The system explicitly ignores any events triggered by its own API User (**Associate ID 528**). This stops the echo effect immediately.
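
The two rules above can be sketched as a single predicate. This is a minimal illustration, not the actual `worker.py` code: the function name `should_process` is hypothetical, and the payload keys `ChangedByAssociateId` and `Changes` are assumptions about the webhook body that should be checked against a real event.

```python
# Fields whose changes count as relevant (taken from the whitelist above).
RELEVANT_FIELDS = {"name", "urladdress", "urls", "orgnr", "userdef_id"}
# Associate ID of the connector's own API user (taken from the text above).
OWN_ASSOCIATE_ID = 528


def should_process(event: dict) -> bool:
    """Return True if a webhook event should become a job (hypothetical helper)."""
    # Circuit breaker: drop echoes of our own PATCH calls first.
    # "ChangedByAssociateId" is an assumed payload key.
    if event.get("ChangedByAssociateId") == OWN_ASSOCIATE_ID:
        return False
    # Whitelist: at least one changed field must be relevant.
    # "Changes" is an assumed payload key holding the changed field names.
    changed = {str(field).lower() for field in event.get("Changes") or []}
    return bool(changed & RELEVANT_FIELDS)
```

Checking the associate ID before the whitelist means an echo is rejected even when it touches a whitelisted field, which is the case that would otherwise cause the ping-pong effect.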
* **`queue_manager.py` (SQLite):**
* Manages the local job queue.
* Status: `PENDING` -> `PROCESSING` -> `COMPLETED` / `FAILED`.
* Persists jobs across container restarts.
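
The status lifecycle of such a queue can be sketched with the standard `sqlite3` module. This is an illustration only: the class `JobQueue`, its methods, and the `jobs` table layout are assumptions, not the real `queue_manager.py`, and the claim step assumes a single worker process.

```python
import json
import sqlite3


class JobQueue:
    """Minimal SQLite-backed queue (hypothetical; schema is an assumption)."""

    def __init__(self, path="connector_queue.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "status TEXT NOT NULL DEFAULT 'PENDING')"
        )
        self.conn.commit()

    def enqueue(self, payload):
        # New jobs start as PENDING via the column default.
        cur = self.conn.execute(
            "INSERT INTO jobs (payload) VALUES (?)", (json.dumps(payload),)
        )
        self.conn.commit()
        return cur.lastrowid

    def claim_next(self):
        # Oldest PENDING job moves to PROCESSING (single-worker assumption).
        row = self.conn.execute(
            "SELECT id, payload FROM jobs WHERE status = 'PENDING' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.conn.execute(
            "UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", (row[0],)
        )
        self.conn.commit()
        return row[0], json.loads(row[1])

    def finish(self, job_id, ok):
        # Terminal states: COMPLETED on success, FAILED otherwise.
        status = "COMPLETED" if ok else "FAILED"
        self.conn.execute(
            "UPDATE jobs SET status = ? WHERE id = ?", (status, job_id)
        )
        self.conn.commit()

    def status(self, job_id):
        row = self.conn.execute(
            "SELECT status FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return row[0] if row else None
```

Because the jobs live in a file on disk, a job that was `PENDING` before a container restart is still there afterwards, provided the database file sits on a persistent volume.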
#### 👥 B. Multi-Tenant Filtering (Roboplanet vs. Wackler)
* **Challenge:** The SuperOffice tenant is shared with Wackler. We must only process Roboplanet accounts.
* **Solution:** **Hybrid Whitelist Filtering**. We maintain a list of Roboplanet Associate IDs and Shortnames (e.g., `485`, `528`, `RKAB`, `RCGO`) in `config.py`.
* **Execution:** The worker fetches contact details and skips any account whose owner is not in this whitelist. This is faster and more reliable than querying group memberships via the API (which often returns 500 errors).
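
A minimal sketch of the hybrid whitelist check follows. The IDs and shortnames are the examples named above; the function name `is_roboplanet_account` and the contact keys `Associate`, `AssociateId`, and `Name` are assumptions about the shape of the fetched contact record, not confirmed field names.

```python
# Example values from the text above; the real list lives in config.py.
ROBOPLANET_ASSOCIATE_IDS = {485, 528}
ROBOPLANET_SHORTNAMES = {"RKAB", "RCGO"}


def is_roboplanet_account(contact: dict) -> bool:
    """Return True if the contact's owner is whitelisted (hypothetical helper)."""
    # "Associate" is an assumed key holding the owner of the contact.
    owner = contact.get("Associate") or {}
    # Match on the numeric ID first ...
    if owner.get("AssociateId") in ROBOPLANET_ASSOCIATE_IDS:
        return True
    # ... then fall back to the shortname, compared case-insensitively.
    shortname = str(owner.get("Name") or "").upper()
    return shortname in ROBOPLANET_SHORTNAMES
```

Matching on either the ID or the shortname is what makes the filter "hybrid": it still works when a response carries only one of the two identifiers.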
* **`worker.py`:**
* Runs as a background process.
* Polls the queue every 5 seconds.
* Communicates with the Company Explorer (internal: `http://company-explorer:8000`) and the SuperOffice API.
* Handles errors and retries.
#### 📊 C. Dashboard Persistence & Prioritization
* **Challenge:** Webhooks often only contain IDs, leaving the dashboard with "Entity C123".
* **Solution:** **Late Name Resolution**. The worker persists the resolved Company Name and Associate Shortname to the SQLite database as soon as it fetches them from SuperOffice.
* **Status Priority:** Success (`COMPLETED`) now "outshines" subsequent ignored echoes (`SKIPPED`). Once an account is green, it stays green in the dashboard cluster for 15 minutes.
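
One way to read the status-priority rule is sketched below. It assumes the 15 minutes are counted from the time of the `COMPLETED` event and that each account has a list of `(timestamp, status)` pairs; both are assumptions, and `display_status` is a hypothetical name rather than the dashboard's actual function.

```python
from datetime import datetime, timedelta

# How long a success keeps overriding later SKIPPED echoes (from the text).
STICKY_WINDOW = timedelta(minutes=15)


def display_status(events, now):
    """Pick the status to show for one account.

    events: list of (timestamp, status) tuples in any order (assumed shape).
    """
    if not events:
        return None
    # Default: the most recent event decides.
    _, latest_status = max(events, key=lambda e: e[0])
    if latest_status == "SKIPPED":
        # A COMPLETED event inside the window outshines the echo.
        for ts, status in events:
            if status == "COMPLETED" and now - ts <= STICKY_WINDOW:
                return "COMPLETED"
    return latest_status
```

Only `SKIPPED` is overridden in this sketch, so a later `FAILED` would still surface immediately.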
* **`superoffice_client.py`:**
* Wraps the SuperOffice REST API (auth, GET, PUT).
* Manages refresh tokens.
### 3. Advanced API Handling (Critical Fixes)
## 3. Setup & Configuration
* **OData Pagination:** We implemented `odata.nextLink` support (Manuel Zierl's advice) to correctly handle large result sets (>1000 records).
* **Case Sensitivity:** We learned that the casing of JSON keys in SuperOffice API responses differs by endpoint (e.g., `contactId` vs `ContactId`). Our code now handles both.
* **Auth Consistency:** Always use `load_dotenv(override=True)` to prevent stale environment variables (like expired tokens) from lingering in the shell process.
* **Identity Crisis:** The `Associate/Me` and `Associate/{id}` endpoints return 500 errors if the API user lacks a linked Person record. This is a known configuration blocker for automated mailing.
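
The pagination and casing fixes above can be sketched as two small helpers. Both names (`iter_odata`, `get_ci`) are hypothetical, and `fetch` stands in for whatever function performs the authenticated GET and returns the decoded JSON. The sketch also accepts `@odata.nextLink` as an assumption, since the link key differs between OData versions.

```python
def iter_odata(fetch, first_url):
    """Yield every record across all pages of an OData result set.

    fetch: callable taking a URL and returning the decoded JSON page
    (an assumed interface for this sketch).
    """
    url = first_url
    while url:
        page = fetch(url)
        # Records of the current page sit under "value".
        yield from page.get("value", [])
        # Follow the next-page link until the server stops sending one.
        url = page.get("odata.nextLink") or page.get("@odata.nextLink")


def get_ci(record, key, default=None):
    """Look up a key regardless of casing (e.g. contactId vs ContactId)."""
    wanted = key.lower()
    for name, value in record.items():
        if name.lower() == wanted:
            return value
    return default
```

Passing `fetch` in as a parameter keeps the loop independent of the HTTP client, so it can be exercised with a plain dictionary of fake pages.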
### Docker Service
The service runs in the `connector-superoffice` container.
`start.sh` starts both the web server and the worker.
### 4. Docker Optimization
### Configuration (`.env`)
The connector requires the following variables (set in `docker-compose.yml`):
* **Multi-Stage builds:** Reduced build time from 8+ minutes to seconds (for code changes) by separating the build environment (compilers) from the runtime environment.
* **Size reduction:** Removed `build-essential` and Node.js runtimes from final images, drastically shrinking the footprint.
```yaml
environment:
API_USER: "admin"
API_PASSWORD: "..."
COMPANY_EXPLORER_URL: "http://company-explorer:8000" # Internal Docker address
WEBHOOK_SECRET: "changeme" # Must match the SuperOffice webhook config
# Plus the SuperOffice credentials (Client ID, Secret, Refresh Token)
```
### 5. Tooling & Diagnosis
Located in `connector-superoffice/tools/`:
## 4. API Interface (Internal)
* `verify_latest_roboplanet.py`: The "Ocular Proof" script that finds the youngest target account.
* `check_filter_counts.py`: Compares API counts with UI counts.
* `find_latest_roboplanet_account.py`: Diagnostic tool for group filtering.
* `cleanup_test_data.py`: Safely removes all session-created test objects.
The connector calls the Company Explorer and passes along **live data** from the CRM for the "Double Truth" reconciliation:
### 6. Open Todos
* [x] **List ID Mapping:** Identified Production IDs (1613-1637). Hardcoded in `config.py`.
* [ ] **Mailing Identity:** Resolve 500 error on `Associate/Me`. (Meeting scheduled for Monday).
* [x] **Docker Build:** Multi-stage builds implemented.
* [x] **Dashboard:** Names and Associate shortnames now visible.
**Request:** `POST /api/provision/superoffice-contact`
```json
{
"so_contact_id": 12345,
"so_person_id": 67890,
"crm_name": "RoboPlanet GmbH",
"crm_website": "www.roboplanet.de",
"job_title": "Geschäftsführer"
}
```
---
**Response:**
```json
{
"status": "success",
"texts": {
"subject": "Optimierung Ihrer Logistik...",
"intro": "Als Logistikleiter kennen Sie...",
"social_proof": "Wir helfen bereits Firma X..."
}
}
```
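
As a rough illustration of how a caller might build this request and read the response, the sketch below uses only the field names shown in the two JSON samples. The helper names `build_provision_request` and `extract_texts` are hypothetical, and treating any status other than `"success"` as "nothing to write" is an assumption.

```python
TEXT_KEYS = ("subject", "intro", "social_proof")


def build_provision_request(contact_id, person_id, name, website, job_title):
    """Assemble the request body with live CRM data (hypothetical helper)."""
    return {
        "so_contact_id": contact_id,
        "so_person_id": person_id,
        "crm_name": name,
        "crm_website": website,
        "job_title": job_title,
    }


def extract_texts(response: dict):
    """Return the three text fields, or None if the call did not succeed."""
    # Assumption: anything but "success" means there is nothing to write back.
    if response.get("status") != "success":
        return None
    texts = response.get("texts") or {}
    # Missing fields become empty strings so the caller can write all three.
    return {key: texts.get(key, "") for key in TEXT_KEYS}
```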
## 5. Open To-Dos (Roadmap)
* [ ] **UDF mapping:** The `ProgId`s (e.g. `SuperOffice:5`) are currently hardcoded in `worker.py`. They need to be moved into a config.
* [ ] **Error handling:** What happens when the Company Explorer returns "404 Not Found"? (Currently: log a warning and skip.)
* [ ] **Redis:** Under very high load (>100 events/second), the SQLite queue should be replaced with Redis.
## Authentication (Legacy)
Authentication is handled via the `AuthHandler` class, which uses a refresh token flow to obtain access tokens. Ensure that the `.env` file in the project root is correctly configured. Always pass `override=True` to `load_dotenv` when loading settings, so stale values in the shell environment do not take precedence over the file.