SuperOffice Connector README
Overview
This directory contains Python scripts designed to integrate with the SuperOffice CRM API, primarily for data extraction and analysis related to sales and customer product information.
🚀 Production Deployment (March 2026)
Status: ✅ Live (with active workarounds)
Environment: online3 (Production)
Tenant: Cust26720
1. Architecture & Flow
- Trigger: SuperOffice sends a webhook (contact.created, contact.changed) to https://floke-ai.duckdns.org/connector/webhook.
- Reception: webhook_app.py (FastAPI) receives the event, validates the token, and pushes a job to the SQLite queue (connector_queue.db).
- Processing: worker.py polls the queue, fetches contact details from SuperOffice, and sends them to the Company Explorer.
- Enrichment: Company Explorer analyzes the data and returns enrichment info (Vertical, Summary, etc.).
- Sync: worker.py patches the data back into SuperOffice (UserDefinedFields).
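The queue hand-off between webhook_app.py and worker.py can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the jobs table, its columns, the PENDING/PROCESSING status values, and the function names init_queue, enqueue, claim_next, and finish are all assumptions made for the example.

```python
import json
import sqlite3


def init_queue(conn):
    """Create the (hypothetical) jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    conn.commit()


def enqueue(conn, event, payload):
    """Webhook side: store the event as a pending job and return its id."""
    cur = conn.execute(
        "INSERT INTO jobs (event, payload) VALUES (?, ?)",
        (event, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    """Worker side: take the oldest pending job, or return None if idle."""
    row = conn.execute(
        "SELECT id, event, payload FROM jobs "
        "WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, event, payload = row
    conn.execute("UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", (job_id,))
    conn.commit()
    return {"id": job_id, "event": event, "payload": json.loads(payload)}


def finish(conn, job_id, status="SUCCESS"):
    """Record the final status of a job."""
    conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
    conn.commit()
```

In this sketch the webhook handler would call enqueue once per event, while the worker loops over claim_next and sleeps briefly whenever it returns None.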
2. Critical Issues & Workarounds (Lessons Learned)
🛑 A. The "Unhashable Dict" API Bug (Critical)
- Symptom: When fetching contact details (GET /Contact/{id}), Python crashes with TypeError: unhashable type: 'dict' when accessing UserDefinedFields.
- Cause: The SuperOffice Production API (online3) returns a malformed structure for UserDefinedFields for this tenant. It appears that one of the keys in the JSON response is being parsed as a dictionary object instead of a string, rendering the entire dictionary invalid for standard Python lookups. This behavior did not occur on the DEV (sod) environment.
- Workaround (Active): worker.py implements a "Fail Open" strategy (safe_get_udfs).
  - It catches the TypeError.
  - It treats the existing UDFs as empty.
  - It proceeds with the enrichment and overwrites/patches the fields blindly.
  - Consequence: We cannot check if a field is already set before writing. We always write.
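A fail-open helper along these lines might look like the sketch below. It is a hypothetical reconstruction, not the real safe_get_udfs from worker.py: it assumes the contact is already a parsed dict and that the crash surfaces while the UDFs are copied into a plain dict.

```python
def safe_get_udfs(contact):
    """Return the contact's UDFs as a dict, or {} if they cannot be read.

    Fail open: a malformed UserDefinedFields structure must not stop the job,
    so any TypeError is swallowed and the UDFs are treated as empty.
    """
    try:
        udfs = contact.get("UserDefinedFields") or {}
        # Copying forces every key to be hashed, so a bad key fails here
        # instead of later in the enrichment code.
        return dict(udfs)
    except TypeError:
        return {}
```

Because the helper returns an empty dict on failure, callers cannot distinguish "no UDFs set" from "UDFs unreadable", which is exactly why the worker ends up writing blindly.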
🔄 B. The "Ping-Pong" Loop (Resolved)
- Symptom: Accounts stuck in a loop (Processing -> Completed -> Processing -> ...).
- Cause: Due to Workaround A (Blind Write), the worker always sends a PATCH request to SuperOffice. Every PATCH triggers a new contact.changed webhook. Since we can't read the value to see "it's already done", we write again -> new webhook -> infinite loop.
- Solution: Implemented a Circuit Breaker in worker.py.
  - The worker checks the ChangedByAssociateId in the webhook payload.
  - If the ID matches our API User (528), the job is immediately marked as SUCCESS and skipped.
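The circuit-breaker check can be sketched as below. This is an illustration under assumptions: the function name is_self_triggered is hypothetical, ChangedByAssociateId is assumed to sit at the top level of the payload, and reading the ID from SO_API_ASSOCIATE_ID reflects the open todo rather than current behavior.

```python
import os

# Falls back to the currently hardcoded API user ID when the variable is unset.
API_ASSOCIATE_ID = int(os.environ.get("SO_API_ASSOCIATE_ID", "528"))


def is_self_triggered(payload, api_associate_id=API_ASSOCIATE_ID):
    """Return True if the webhook was caused by our own API user's PATCH."""
    changed_by = payload.get("ChangedByAssociateId")
    try:
        return int(changed_by) == api_associate_id
    except (TypeError, ValueError):
        # Missing or non-numeric ID: treat as a genuine external change.
        return False
```

A worker loop would call this before doing any work and mark the job as SUCCESS right away when it returns True, which breaks the PATCH -> webhook -> PATCH cycle.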
🔐 C. Environment Variables in Docker
- Lesson: The docker-compose env_file directive makes variables available to the container, but Python scripts run manually via docker exec do NOT see them unless load_dotenv is used explicitly.
- Fix: All scripts in tools/ now explicitly load the .env from the project root. config.py defaults to online3 to prevent accidental dev-connections.
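To make the explicit loading step concrete, here is a stdlib-only stand-in for what load_dotenv does. It is not the code used in tools/ (which relies on load_dotenv); the function name load_project_env and the simplified KEY=VALUE parsing are assumptions for illustration.

```python
import os
from pathlib import Path


def load_project_env(env_path):
    """Copy KEY=VALUE pairs from env_path into os.environ.

    Existing variables are never overridden. Returns the pairs that were set.
    """
    loaded = {}
    env_path = Path(env_path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded


# A script in tools/ would typically resolve the project root relative to
# itself, for example: load_project_env(Path(__file__).resolve().parents[1] / ".env")
```

Resolving the path relative to the script rather than the current working directory keeps the behavior the same no matter where the script is started from.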
3. Tooling (New)
Located in connector-superoffice/tools/:
- create_company.py: Creates a test company ("Bremer Abenteuerland") in Prod to trigger the webhook flow.
- verify_enrichment.py: Checks if the enrichment data (Vertical, Summary) actually landed in SuperOffice (bypassing the UDF crash).
- debug_raw_response.py: Saves the raw JSON response from SuperOffice to raw_api_response.json for debugging API structure issues.
- who_am_i.py: Attempts to identify the current API user (Associate ID).
4. Open Todos
- Support Ticket: Wait for SuperOffice to fix the UserDefinedFields JSON structure on Cust26720.
- Docker Optimization: The connector-superoffice build takes >8 minutes due to compiling C-extensions. Implement Multi-Stage Build.
- Hardcoded ID: Make the Circuit Breaker ID (528) configurable via .env (SO_API_ASSOCIATE_ID).
Authentication (Legacy)
Authentication is handled via the AuthHandler class, which uses a refresh token flow to obtain access tokens. Ensure that the .env file in the project root is correctly configured with SO_CLIENT_ID, SO_CLIENT_SECRET, SO_REFRESH_TOKEN, SO_REDIRECT_URI, SO_ENVIRONMENT, and SO_CONTEXT_IDENTIFIER.
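A quick pre-flight check for these settings could look like the following sketch. The helper name missing_auth_vars is hypothetical and not part of AuthHandler; only the variable names come from this README.

```python
import os

REQUIRED_AUTH_VARS = (
    "SO_CLIENT_ID",
    "SO_CLIENT_SECRET",
    "SO_REFRESH_TOKEN",
    "SO_REDIRECT_URI",
    "SO_ENVIRONMENT",
    "SO_CONTEXT_IDENTIFIER",
)


def missing_auth_vars(env=None):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AUTH_VARS if not str(env.get(name, "")).strip()]
```

Calling missing_auth_vars() before starting the token refresh gives a clear list of what to add to .env instead of an opaque authentication failure.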
Key SuperOffice Entities and Data Model Observations
During the development of reporting functionalities, the following observations were made regarding the SuperOffice data model:
1. Sale Entity
- Primary Source for Product Information: Contrary to initial expectations, product information is frequently stored as free-text within the Heading field of the Sale object (e.g., "2xOmnie CD-01 mit Nachlass"). This appears to be a common practice in the system, rather than utilizing structured product catalog items linked via quotes.
- Contact Association: A significant number of Sale objects are not directly linked to a Contact object (Contact: null), making it challenging to attribute sales to specific customers programmatically. Our reporting scripts therefore only request sales where a Contact is present (Sale?$filter=Contact ne null).
- No Direct Quote/QuoteLine Linkage: Attempts to retrieve Quote or QuoteLine objects directly via Sale/{saleId}/Quotes, Contact/{contactId}/Quotes, or Sale/{saleId}/Activities resulted in 500 Internal Server Error responses or empty result sets. This indicates that direct, API-accessible linkages between Sales and structured QuoteLines are often absent or not exposed via these endpoints.
2. Product Information Storage (Hypothesis & Workaround)
- Free-Text in Heading: The primary source for identifying products associated with a sale is the Heading field of the Sale entity itself. This field often contains product codes, descriptions, and other relevant details as free-text.
- User-Defined Fields (UDFs): While UserDefinedFields were inspected for structured product data (e.g., RR-02-017-OMNIE), no such patterns were found in the sale_id=342243 example. This suggests that UDFs are either not consistently used for product codes or are named in a way that doesn't align with common product terminology.
Scripts
list_products.py
- Purpose: Fetches and displays a list of all defined product families from the SuperOffice product catalog (/List/ProductFamily/Items).
- Usage: python3 list_products.py
generate_customer_product_report.py
- Purpose: Generates a CSV report of customer sales, extracting product information from the Sale.Heading field using keyword matching.
- Methodology:
  - Retrieves the latest SALE_LIMIT (e.g., 1000) Sale objects, filtering only those with an associated Contact ($filter=Contact ne null).
  - Extracts SaleId, CustomerName, and SaleHeading for each relevant sale.
  - Searches the SaleHeading for predefined PRODUCT_KEYWORDS (e.g., OMNIE, CD-01, Service).
  - Outputs the results to product_report.csv.
- Usage: python3 generate_customer_product_report.py
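The keyword-matching and CSV steps of the methodology can be sketched as below. This is not the script's actual code: the function names, the MatchedProducts column, and the choice of case-insensitive substring matching are assumptions, and the sales are assumed to be dicts that already carry SaleId, CustomerName, and SaleHeading.

```python
import csv
import io

PRODUCT_KEYWORDS = ["OMNIE", "CD-01", "Service"]
REPORT_FIELDS = ["SaleId", "CustomerName", "SaleHeading", "MatchedProducts"]


def match_keywords(heading, keywords=PRODUCT_KEYWORDS):
    """Return the keywords found in the heading (case-insensitive substring match)."""
    text = (heading or "").lower()
    return [kw for kw in keywords if kw.lower() in text]


def build_report_rows(sales, keywords=PRODUCT_KEYWORDS):
    """Keep only sales whose heading mentions at least one keyword."""
    rows = []
    for sale in sales:
        matched = match_keywords(sale.get("SaleHeading"), keywords)
        if matched:
            rows.append({
                "SaleId": sale.get("SaleId"),
                "CustomerName": sale.get("CustomerName"),
                "SaleHeading": sale.get("SaleHeading"),
                "MatchedProducts": "; ".join(matched),
            })
    return rows


def write_report(rows, stream):
    """Write the rows as CSV (with header) to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=REPORT_FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Example: render a report in memory instead of writing product_report.csv.
example_sales = [{"SaleId": 1, "CustomerName": "Example GmbH",
                  "SaleHeading": "2xOmnie CD-01 mit Nachlass"}]
buffer = io.StringIO()
write_report(build_report_rows(example_sales), buffer)
```

Matching case-insensitively matters for headings such as "2xOmnie CD-01 mit Nachlass", where the product name does not appear in the same casing as the keyword.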
Future Work
- Analyze the empty product_report.csv: Investigate why product_report.csv remains empty even after filtering for Sale objects that are linked to a Contact. It is essential to understand whether no such sales exist or whether there is a problem with data retrieval or processing.
- Manually inspect filtered Sale objects: If the report is empty, we need to manually inspect several Sale objects that satisfy the condition Contact ne null to understand their structure and determine whether the Heading field or other fields contain product information.
- Refine PRODUCT_KEYWORDS: The list of product keywords may need to be extended, based on a more detailed manual analysis of the sale headings.
- Explore alternative API paths: If the current approach continues to cause difficulties, we need to dig deeper into the SuperOffice API to find structured product data, even if it is not directly linked to the sales.