Enhance: Address/VAT Sync & Infrastructure Hardening [30e88f42]

- Implemented Address (City) and VAT (OrgNumber) sync back to SuperOffice.
- Hardened Infrastructure: Removed Pydantic dependency in config for better Docker compatibility.
- Improved SuperOffice Client error logging and handled empty SO_ENVIRONMENT variables.
- Updated Matrix Generator: Switched to gemini-2.0-flash, added industry filtering, and robust JSON parsing.
- Updated Documentation with session findings and troubleshooting steps.
2026-02-21 21:26:57 +00:00
parent 1acdad9923
commit b41b6c38b8
8 changed files with 169 additions and 51 deletions

@@ -107,10 +107,28 @@ The connector is the messenger that brings this data into the CRM.
2. Run the sync script: `python3 backend/scripts/sync_notion_industries.py`.
3. Recompute the matrix: `python3 backend/scripts/generate_matrix.py --live`.
### Prompt-Tuning
The prompts for the matrix and the opener are located in:
* Matrix: `backend/scripts/generate_matrix.py`
* Opener: `backend/services/classification.py` (or `enrichment.py`)
### End-to-End Tests
An automated integration test (`tests/test_e2e_flow.py`) covers the entire cycle:
1. **Company Creation:** Webhook -> CE Provisioning -> Write-back (Vertical).
2. **Person Creation:** Webhook -> CE Matrix Lookup -> Write-back (texts).
3. **Vertical Change:** Change in the CRM -> CE Update -> Cascade to persons -> New texts.
Run it with:
```bash
python3 connector-superoffice/tests/test_e2e_flow.py
```
## 7. Troubleshooting & Known Issues
### Authentication "URL has an invalid label"
Occurs when `SO_ENVIRONMENT` is empty. The client now falls back to `sod` automatically.
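A minimal sketch of the fallback, assuming the environment name ends up as the first label of the login hostname. The helpers `resolve_environment` and `token_url` and the URL pattern are illustrative assumptions, not taken from the client code, and the whitespace stripping goes slightly beyond what the config does. With an empty value the host would start with a dot, which is an empty (invalid) label:

```python
import os


def resolve_environment(raw, default="sod"):
    """Treat None, empty and whitespace-only values as 'not set'."""
    value = (raw or "").strip()
    return value or default


def token_url(env):
    # Hypothetical URL pattern, used here for illustration only.
    return f"https://{env}.superoffice.com/login/common/oauth/tokens"


env = resolve_environment(os.getenv("SO_ENVIRONMENT"))
print(token_url(env))
```

Note that `os.getenv("SO_ENVIRONMENT", "sod")` alone would not help here: the default only applies when the variable is missing, not when it is set to an empty string.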
### Pydantic V2 Compatibility
`config.py` was switched to plain Python (`os.getenv`) to avoid conflicts with `pydantic-settings` in Docker containers.
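Without Pydantic, type coercion has to be done by hand. The following is a sketch of how boolean and JSON-valued variables could be read; `env_bool` and `env_json` are hypothetical helper names and not part of `config.py`:

```python
import json
import os


def env_bool(name, default=False):
    """Read a boolean flag; unset or empty values yield the default."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in ("true", "1", "t")


def env_json(name, default):
    """Read a JSON-encoded value; unset, empty or malformed values yield the default."""
    raw = os.getenv(name)
    if not raw:
        return default
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return default
```

For example, `env_json("VERTICAL_MAP_JSON", {})` would return a dict mapping vertical names to list item IDs, or `{}` if the variable is missing or not valid JSON.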
### Address & VAT Sync (WIP)
The worker was extended to also write back `City` and `OrgNumber` (VAT).
**Status (21 Feb 2026):** Implemented, but still being refined. Logs show occasional re-queueing while enrichment is running.
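The write-back follows a compare-then-write pattern: a field is only touched when the Company Explorer value differs from the CRM value, so an unchanged contact causes no update request. A minimal sketch of that comparison as a pure function is shown below. `apply_standard_fields` is a hypothetical name, and the dict layout mirrors the worker's simplified check, so the actual SuperOffice payload structure should be verified separately:

```python
def apply_standard_fields(contact, city=None, vat=None):
    """Update City and OrgNumber in place; return True if anything changed."""
    changed = False
    if city:
        # Tolerate a missing or null PostalAddress entry.
        address = contact.get("PostalAddress") or {}
        if address.get("City", "") != city:
            address["City"] = city
            contact["PostalAddress"] = address
            changed = True
    if vat and contact.get("OrgNumber", "") != vat:
        contact["OrgNumber"] = vat
        changed = True
    return changed


contact = {"PostalAddress": {"City": "Berlin"}, "OrgNumber": ""}
first = apply_standard_fields(contact, city="Berlin", vat="DE123")
second = apply_standard_fields(contact, city="Berlin", vat="DE123")
print(first, second)  # True False
```

The second call reports no change, which is the property that keeps repeated jobs from issuing redundant writes.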
## Appendix: The "First Sentence" Prompt

@@ -1,44 +1,37 @@
import os
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
    # --- Infrastructure ---
    # Internal Docker URL for Company Explorer
    COMPANY_EXPLORER_URL: str = "http://company-explorer:8000"
    # --- SuperOffice API Credentials ---
    SO_ENVIRONMENT: str = "sod" # 'sod' or 'online'
    SO_CLIENT_ID: str = ""
    SO_CLIENT_SECRET: str = ""
    SO_REFRESH_TOKEN: str = ""
    SO_REDIRECT_URI: str = "http://localhost"
    SO_CONTEXT_IDENTIFIER: str = "Cust55774" # e.g. Cust12345
    # --- Feature Flags ---
    ENABLE_WEBSITE_SYNC: bool = False # Disabled by default to prevent loops
    # --- Mappings (IDs from SuperOffice) ---
    # Vertical IDs (List Items)
    # Default values match the current hardcoded DEV IDs
    # Format: "Name In Explorer": ID_In_SuperOffice
    VERTICAL_MAP_JSON: str = '{"Logistics - Warehouse": 23, "Healthcare - Hospital": 24, "Infrastructure - Transport": 25, "Leisure - Indoor Active": 26}'
class Settings:
    def __init__(self):
        # --- Infrastructure ---
        # Internal Docker URL for Company Explorer
        self.COMPANY_EXPLORER_URL = os.getenv("COMPANY_EXPLORER_URL", "http://company-explorer:8000")
        # --- SuperOffice API Credentials ---
        # Fallback for empty string in env var
        env_val = os.getenv("SO_ENVIRONMENT")
        self.SO_ENVIRONMENT = env_val if env_val else "sod"
        self.SO_CLIENT_ID = os.getenv("SO_CLIENT_ID", "")
        self.SO_CLIENT_SECRET = os.getenv("SO_CLIENT_SECRET", "")
        self.SO_REFRESH_TOKEN = os.getenv("SO_REFRESH_TOKEN", "")
        self.SO_REDIRECT_URI = os.getenv("SO_REDIRECT_URI", "http://localhost")
        self.SO_CONTEXT_IDENTIFIER = os.getenv("SO_CONTEXT_IDENTIFIER", "Cust55774") # e.g. Cust12345
        # --- Feature Flags ---
        self.ENABLE_WEBSITE_SYNC = os.getenv("ENABLE_WEBSITE_SYNC", "False").lower() in ("true", "1", "t")
        # --- Mappings (IDs from SuperOffice) ---
        # Vertical IDs (List Items)
        self.VERTICAL_MAP_JSON = os.getenv("VERTICAL_MAP_JSON", '{"Logistics - Warehouse": 23, "Healthcare - Hospital": 24, "Infrastructure - Transport": 25, "Leisure - Indoor Active": 26}')
    # Persona / Job Role IDs (List Items for "Position" field)
    # To be filled after discovery
    PERSONA_MAP_JSON: str = '{}'
        # Persona / Job Role IDs (List Items for "Position" field)
        self.PERSONA_MAP_JSON = os.getenv("PERSONA_MAP_JSON", '{}')
    # User Defined Fields (ProgIDs)
    # The technical names of the fields in SuperOffice
    # Default values match the current hardcoded DEV UDFs
    UDF_SUBJECT: str = "SuperOffice:5"
    UDF_INTRO: str = "SuperOffice:6"
    UDF_SOCIAL_PROOF: str = "SuperOffice:7"
    UDF_VERTICAL: str = "SuperOffice:5" # NOTE: Currently same as Subject in dev? Need to verify. worker.py had 'SuperOffice:5' for vertical AND 'SuperOffice:5' for subject in the map?
    class Config:
        env_file = ".env"
        env_file_encoding = "utf-8"
        extra = "ignore" # Ignore extra fields in .env
        # User Defined Fields (ProgIDs)
        self.UDF_SUBJECT = os.getenv("UDF_SUBJECT", "SuperOffice:5")
        self.UDF_INTRO = os.getenv("UDF_INTRO", "SuperOffice:6")
        self.UDF_SOCIAL_PROOF = os.getenv("UDF_SOCIAL_PROOF", "SuperOffice:7")
        self.UDF_VERTICAL = os.getenv("UDF_VERTICAL", "SuperOffice:5")
# Global instance
settings = Settings()
settings = Settings()

@@ -19,6 +19,8 @@ class SuperOfficeClient:
        self.env = settings.SO_ENVIRONMENT
        self.cust_id = settings.SO_CONTEXT_IDENTIFIER
        logger.info(f"DEBUG CONFIG: Env='{self.env}', CustID='{self.cust_id}', ClientID='{self.client_id[:4]}...'")
        if not all([self.client_id, self.client_secret, self.refresh_token]):
            # Graceful failure: Log error but allow init (for help/docs/discovery scripts)
            logger.error("❌ SuperOffice credentials missing in .env file (or environment variables).")
@@ -57,7 +59,8 @@ class SuperOfficeClient:
            resp.raise_for_status()
            return resp.json().get("access_token")
        except requests.exceptions.HTTPError as e:
            logger.error(f"❌ Token Refresh Error: {e.response.text}")
            logger.error(f"❌ Token Refresh Error (Status: {e.response.status_code}): {e.response.text}")
            logger.debug(f"Response Headers: {e.response.headers}")
            return None
        except Exception as e:
            logger.error(f"❌ Connection Error during token refresh: {e}")
@@ -71,7 +74,8 @@ class SuperOfficeClient:
            resp.raise_for_status()
            return resp.json()
        except requests.exceptions.HTTPError as e:
            logger.error(f"❌ API GET Error for {endpoint}: {e.response.text}")
            logger.error(f"❌ API GET Error for {endpoint} (Status: {e.response.status_code}): {e.response.text}")
            logger.debug(f"Response Headers: {e.response.headers}")
            return None
    def _put(self, endpoint, payload):

@@ -210,6 +210,44 @@ def process_job(job, so_client: SuperOfficeClient):
            else:
                logger.warning(f"Vertical '{vertical_name}' not found in internal mapping.")
        # 2b.2 Sync Address & VAT (Standard Fields)
        # Check if we have address data to sync
        ce_city = provisioning_data.get("address_city")
        ce_country = provisioning_data.get("address_country") # Assuming 'DE' code or similar
        ce_vat = provisioning_data.get("vat_id")
        if ce_city or ce_vat:
            try:
                # Re-fetch contact to be safe (or use cached if optimal)
                contact_data = so_client.get_contact(contact_id)
                changed = False
                # City (PostalAddress)
                if ce_city:
                    # SuperOffice Address structure is complex. Simplified check on PostalAddress.
                    # Address: { "PostalAddress": { "City": "..." } }
                    current_city = contact_data.get("PostalAddress", {}).get("City", "")
                    if current_city != ce_city:
                        if "PostalAddress" not in contact_data: contact_data["PostalAddress"] = {}
                        contact_data["PostalAddress"]["City"] = ce_city
                        changed = True
                        logger.info(f"Updating City: {current_city} -> {ce_city}")
                # VAT (OrgNumber)
                if ce_vat:
                    current_vat = contact_data.get("OrgNumber", "")
                    if current_vat != ce_vat:
                        contact_data["OrgNumber"] = ce_vat
                        changed = True
                        logger.info(f"Updating VAT: {current_vat} -> {ce_vat}")
                if changed:
                    logger.info(f"Pushing standard field updates for Contact {contact_id}...")
                    so_client._put(f"Contact/{contact_id}", contact_data)
            except Exception as e:
                logger.error(f"Failed to sync Address/VAT for Contact {contact_id}: {e}")
        # 2c. Sync Website (Company Level)
        # TEMPORARILY DISABLED TO PREVENT LOOP (SO API Read-after-Write latency or field mapping issue)
        # Re-enable via config if needed