
SuperOffice Connector README

Overview

This directory contains Python scripts designed to integrate with the SuperOffice CRM API, primarily for data extraction and analysis related to sales and customer product information.

🚀 Production Deployment (March 2026)

Status: Live (with active workarounds)
Environment: online3 (Production)
Tenant: Cust26720

1. Architecture & Flow

  1. Trigger: SuperOffice sends a webhook (contact.created, contact.changed) to https://floke-ai.duckdns.org/connector/webhook.
  2. Reception: webhook_app.py (FastAPI) receives the event, validates the token, and pushes a job to the SQLite queue (connector_queue.db).
  3. Processing: worker.py polls the queue, fetches contact details from SuperOffice, and sends them to the Company Explorer.
  4. Enrichment: Company Explorer analyzes the data and returns enrichment info (Vertical, Summary, etc.).
  5. Sync: worker.py patches the data back into SuperOffice (UserDefinedFields).
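
The queue hand-off between steps 2 and 3 can be sketched as follows. This is a minimal illustration using only the standard library, not the actual code in webhook_app.py or worker.py: the table name jobs, its columns, the status values, and the helper names init_queue, enqueue, claim_next, and finish are all assumptions.

```python
import json
import sqlite3


def init_queue(conn):
    """Create the (hypothetical) jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    conn.commit()


def enqueue(conn, event, payload):
    """Webhook side: store the event and its JSON payload as a pending job."""
    cur = conn.execute(
        "INSERT INTO jobs (event, payload) VALUES (?, ?)",
        (event, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    """Worker side: take the oldest pending job and mark it as in progress."""
    row = conn.execute(
        "SELECT id, event, payload FROM jobs "
        "WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None  # queue is empty, the worker would sleep and poll again
    job_id, event, payload = row
    conn.execute("UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", (job_id,))
    conn.commit()
    return {"id": job_id, "event": event, "payload": json.loads(payload)}


def finish(conn, job_id, status="COMPLETED"):
    """Worker side: record the final status of a job."""
    conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
    conn.commit()
```

In production the connection would point at connector_queue.db; sqlite3.connect(":memory:") is enough to try the sketch locally.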

2. Critical Issues & Workarounds (Lessons Learned)

🛑 A. The "Unhashable Dict" API Bug (Critical)

  • Symptom: When fetching contact details (GET /Contact/{id}), Python crashes with TypeError: unhashable type: 'dict' when accessing UserDefinedFields.
  • Cause: The SuperOffice Production API (online3) returns a malformed structure for UserDefinedFields for this tenant. It appears that one of the keys in the JSON response is being parsed as a dictionary object instead of a string, rendering the entire dictionary invalid for standard Python lookups. This behavior did not occur on the DEV (sod) environment.
  • Workaround (Active): The worker.py implements a "Fail Open" strategy (safe_get_udfs).
    • It catches the TypeError.
    • It treats the existing UDFs as empty.
    • It proceeds with the enrichment and overwrites/patches the fields blindly.
    • Consequence: We cannot check if a field is already set before writing. We always write.
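
A minimal sketch of the fail-open idea, assuming the contact is available as a Python dict with a UserDefinedFields entry. The real safe_get_udfs in worker.py may have a different signature, and the exact point at which the TypeError is raised for this tenant is not reproduced here; the sketch only shows the catch-and-continue pattern.

```python
def safe_get_udfs(contact):
    """Return the contact's UDFs as a plain dict, or {} if they cannot be read."""
    try:
        raw = contact.get("UserDefinedFields") or {}
        # dict() hashes every key while building the mapping, so an
        # unhashable key raises TypeError here instead of later in the worker.
        return dict(raw)
    except TypeError:
        # Fail open: treat the existing UDFs as empty. The caller then
        # writes its enrichment values without comparing them first.
        return {}
```

Because the function never raises for unreadable UDFs, the caller cannot tell "no UDFs set" apart from "UDFs unreadable", which is exactly the blind-write consequence described above.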

🔄 B. The "Ping-Pong" Loop (Resolved)

  • Symptom: Accounts stuck in a loop (Processing -> Completed -> Processing -> ...).
  • Cause: Due to Workaround A (Blind Write), the worker always sends a PATCH request to SuperOffice. Every PATCH triggers a new contact.changed webhook. Since we can't read the value to see "it's already done", we write again -> new webhook -> infinite loop.
  • Solution: Implemented a Circuit Breaker in worker.py.
    • The worker checks the ChangedByAssociateId in the webhook payload.
    • If the ID matches our API User (528), the job is immediately marked as SUCCESS and skipped.
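
The circuit breaker could look roughly like this. It is a sketch under assumptions: ChangedByAssociateId is taken to sit at the top level of the webhook payload, and is_own_change and handle_job are hypothetical names, not necessarily those used in worker.py.

```python
API_ASSOCIATE_ID = 528  # associate ID of our API user, as stated above


def is_own_change(payload, api_associate_id=API_ASSOCIATE_ID):
    """True if the webhook was caused by a write from our own API user."""
    changed_by = payload.get("ChangedByAssociateId")
    try:
        return int(changed_by) == int(api_associate_id)
    except (TypeError, ValueError):
        # Missing or non-numeric ID: not provably ours, so do not skip.
        return False


def handle_job(payload, process):
    """Skip self-triggered events; otherwise run the normal processing."""
    if is_own_change(payload):
        return "SUCCESS"  # marked done without calling SuperOffice again
    process(payload)
    return "SUCCESS"
```

Treating an unknown ID as "not ours" keeps genuine user edits from being dropped, at the cost of processing an event once more if the payload is incomplete.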

🔐 C. Environment Variables in Docker

  • Lesson: The docker-compose env_file directive makes variables available to the container, but in our setup Python scripts run manually via docker exec do NOT see them unless load_dotenv is called explicitly.
  • Fix: All scripts in tools/ now explicitly load the .env from the project root. config.py defaults to online3 to prevent accidental dev-connections.
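
To illustrate what "explicitly load the .env from the project root" involves, here is a standard-library stand-in for python-dotenv's load_dotenv. It is not the code used in tools/: find_env_file and load_env_file are hypothetical helpers, and the parser only understands simple KEY=VALUE lines. Like load_dotenv with its default settings, it does not overwrite variables that are already set.

```python
import os
from pathlib import Path


def find_env_file(start):
    """Walk upwards from `start` and return the first .env file found."""
    start = Path(start).resolve()
    for folder in (start, *start.parents):
        candidate = folder / ".env"
        if candidate.is_file():
            return candidate
    return None


def load_env_file(path, environ=None):
    """Copy KEY=VALUE pairs from the file into environ, keeping existing keys."""
    environ = os.environ if environ is None else environ
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        value = value.strip().strip('"').strip("'")
        environ.setdefault(key.strip(), value)
    return environ
```

A script in tools/ would call find_env_file(Path(__file__).parent) at start-up and pass the result to load_env_file (or to load_dotenv when python-dotenv is installed).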

3. Tooling (New)

Located in connector-superoffice/tools/:

  • create_company.py: Creates a test company ("Bremer Abenteuerland") in Prod to trigger the webhook flow.
  • verify_enrichment.py: Checks if the enrichment data (Vertical, Summary) actually landed in SuperOffice (bypassing the UDF crash).
  • debug_raw_response.py: Saves the raw JSON response from SuperOffice to raw_api_response.json for debugging API structure issues.
  • who_am_i.py: Attempts to identify the current API user (Associate ID).

4. Open Todos

  • Support Ticket: Wait for SuperOffice to fix the UserDefinedFields JSON structure on Cust26720.
  • Docker Optimization: The connector-superoffice build takes >8 minutes due to compiling C-extensions. Implement Multi-Stage Build.
  • Hardcoded ID: Make the Circuit Breaker ID (528) configurable via .env (SO_API_ASSOCIATE_ID).
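
For the last todo, reading the ID from the environment could be as small as the following sketch. SO_API_ASSOCIATE_ID is the variable name proposed above; the helper name get_api_associate_id and the fallback to 528 are assumptions.

```python
import os

DEFAULT_API_ASSOCIATE_ID = 528  # current hardcoded value, kept as fallback


def get_api_associate_id(environ=None):
    """Read SO_API_ASSOCIATE_ID, falling back to the current default."""
    environ = os.environ if environ is None else environ
    raw = str(environ.get("SO_API_ASSOCIATE_ID", "")).strip()
    if not raw:
        return DEFAULT_API_ASSOCIATE_ID
    try:
        return int(raw)
    except ValueError:
        # Fail loudly: a wrong ID would silently disable the circuit breaker.
        raise ValueError(
            f"SO_API_ASSOCIATE_ID must be an integer, got {raw!r}"
        ) from None
```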

Authentication (Legacy)

Authentication is handled via the AuthHandler class, which uses a refresh token flow to obtain access tokens. Ensure that the .env file in the project root is correctly configured with SO_CLIENT_ID, SO_CLIENT_SECRET, SO_REFRESH_TOKEN, SO_REDIRECT_URI, SO_ENVIRONMENT, and SO_CONTEXT_IDENTIFIER.
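
As a rough illustration of the refresh token flow, the sketch below only builds the token request and does not send it. It is not the AuthHandler implementation: build_refresh_request is a hypothetical helper, and the token URL pattern is an assumption that should be checked against the SuperOffice documentation. SO_CONTEXT_IDENTIFIER is not needed for this request in the sketch; it is assumed to be used only for the API base URL.

```python
from urllib.parse import urlencode

REQUIRED_KEYS = (
    "SO_CLIENT_ID",
    "SO_CLIENT_SECRET",
    "SO_REFRESH_TOKEN",
    "SO_REDIRECT_URI",
    "SO_ENVIRONMENT",
)


def build_refresh_request(env):
    """Return (url, form_body) for exchanging a refresh token."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing settings in .env: " + ", ".join(missing))
    # Assumed URL pattern; SO_ENVIRONMENT would be e.g. "sod" or "online3".
    url = f"https://{env['SO_ENVIRONMENT']}.superoffice.com/login/common/oauth/tokens"
    body = urlencode(
        {
            "grant_type": "refresh_token",
            "client_id": env["SO_CLIENT_ID"],
            "client_secret": env["SO_CLIENT_SECRET"],
            "refresh_token": env["SO_REFRESH_TOKEN"],
            "redirect_uri": env["SO_REDIRECT_URI"],
        }
    )
    return url, body
```

The body would be sent as a form-encoded POST; the JSON response is expected to carry the new access token.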

Key SuperOffice Entities and Data Model Observations

During the development of reporting functionalities, the following observations were made regarding the SuperOffice data model:

1. Sale Entity

  • Primary Source for Product Information: Contrary to initial expectations, product information is frequently stored as free-text within the Heading field of the Sale object (e.g., "2xOmnie CD-01 mit Nachlass"). This appears to be a common practice in the system, rather than utilizing structured product catalog items linked via quotes.
  • Contact Association: A significant number of Sale objects are not linked to a Contact object (Contact: null), which makes it difficult to attribute sales to specific customers programmatically. Our reporting scripts therefore only request sales where a Contact is present (Sale?$filter=Contact ne null).
  • No Direct Quote/QuoteLine Linkage: Attempts to retrieve Quote or QuoteLine objects directly via Sale/{saleId}/Quotes, Contact/{contactId}/Quotes, or Sale/{saleId}/Activities resulted in 500 Internal Server Errors or empty result sets. This indicates that direct, API-accessible linkages between Sales and structured QuoteLines are often absent or not exposed via these endpoints.

2. Product Information Storage (Hypothesis & Workaround)

  • Free-Text in Heading: The primary source for identifying products associated with a sale is the Heading field of the Sale entity itself. This field often contains product codes, descriptions, and other relevant details as free-text.
  • User-Defined Fields (UDFs): While UserDefinedFields were inspected for structured product data (e.g., RR-02-017-OMNIE), no such patterns were found in the sale_id=342243 example. This suggests that UDFs are either not consistently used for product codes or are named in a way that doesn't align with common product terminology.

Scripts

list_products.py

  • Purpose: Fetches and displays a list of all defined product families from the SuperOffice product catalog (/List/ProductFamily/Items).
  • Usage: python3 list_products.py

generate_customer_product_report.py

  • Purpose: Generates a CSV report of customer sales, extracting product information from the Sale.Heading field using keyword matching.
  • Methodology:
    1. Retrieves the latest SALE_LIMIT (e.g., 1000) Sale objects, filtering only those with an associated Contact ($filter=Contact ne null).
    2. Extracts SaleId, CustomerName, and SaleHeading for each relevant sale.
    3. Searches the SaleHeading for predefined PRODUCT_KEYWORDS (e.g., OMNIE, CD-01, Service).
    4. Outputs the results to product_report.csv.
  • Usage: python3 generate_customer_product_report.py
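
The keyword-matching step can be sketched like this. It is an illustration, not the script itself: the JSON field names (SaleId, Heading, Contact with a Name entry), the case-insensitive substring match, and the extra MatchedKeywords column are assumptions, and the keyword list only contains the examples mentioned above.

```python
import csv

PRODUCT_KEYWORDS = ["OMNIE", "CD-01", "Service"]
FIELDNAMES = ["SaleId", "CustomerName", "SaleHeading", "MatchedKeywords"]


def match_keywords(heading, keywords=PRODUCT_KEYWORDS):
    """Return the keywords that occur in the heading, ignoring case."""
    text = (heading or "").casefold()
    return [kw for kw in keywords if kw.casefold() in text]


def build_report_rows(sales, keywords=PRODUCT_KEYWORDS):
    """Turn Sale dicts into report rows, skipping sales without a Contact."""
    rows = []
    for sale in sales:
        contact = sale.get("Contact")
        if not contact:
            continue  # mirrors the server-side filter "Contact ne null"
        heading = sale.get("Heading") or ""
        rows.append(
            {
                "SaleId": sale.get("SaleId"),
                "CustomerName": contact.get("Name", ""),
                "SaleHeading": heading,
                "MatchedKeywords": ";".join(match_keywords(heading, keywords)),
            }
        )
    return rows


def write_report(rows, file_handle):
    """Write the rows as CSV to an open text file handle."""
    writer = csv.DictWriter(file_handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
```

Keeping rows with an empty MatchedKeywords column, rather than dropping them, makes it easier to see which headings the keyword list does not yet cover.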

Future Work

  • Analyze the empty product_report.csv: Investigate why product_report.csv remains empty even after filtering for Sale objects with a linked Contact. It is essential to determine whether no such sales exist or whether there is a problem in data retrieval or processing.
  • Manually inspect filtered Sale objects: If the report is empty, inspect a few Sale objects that satisfy Contact ne null by hand to understand their structure and to determine whether the Heading field or other fields contain product information.
  • Refine PRODUCT_KEYWORDS: The list of product keywords may need to be extended, based on a more detailed manual analysis of the sale headings.
  • Explore alternative API paths: If the current approach continues to cause difficulties, dig deeper into the SuperOffice API to find structured product data, even if it is not directly linked to sales.