docs(connector): [31188f42] Comprehensive documentation update with multi-tenant filtering, echo protection, and API lessons
@@ -12,47 +12,53 @@ This directory contains Python scripts designed to integrate with the SuperOffic
|
|||||||
### 1. Architecture & Flow
|
### 1. Architecture & Flow
|
||||||
1. **Trigger:** SuperOffice sends a webhook (`contact.created`, `contact.changed`, `person.created`, `person.changed`) to `https://floke-ai.duckdns.org/connector/webhook`.
|
1. **Trigger:** SuperOffice sends a webhook (`contact.created`, `contact.changed`, `person.created`, `person.changed`) to `https://floke-ai.duckdns.org/connector/webhook`.
|
||||||
2. **Reception:** `webhook_app.py` (FastAPI) receives the event, validates the `WEBHOOK_TOKEN`, and pushes a job to the SQLite queue (`connector_queue.db`).
|
2. **Reception:** `webhook_app.py` (FastAPI) receives the event, validates the `WEBHOOK_TOKEN`, and pushes a job to the SQLite queue (`connector_queue.db`).
|
||||||
3. **Processing:** `worker.py` (v1.9) polls the queue, filters for relevance, fetches details from SuperOffice, and calls the **Company Explorer** for AI analysis.
|
3. **Processing:** `worker.py` (v1.9.1) polls the queue, filters for relevance, fetches details from SuperOffice, and calls the **Company Explorer** for AI analysis.
|
||||||
4. **Sync:** Results (Vertical, Summary, hyper-personalized Openers) are patched back into SuperOffice `UserDefinedFields`.
|
4. **Sync:** Results (Vertical, Summary, hyper-personalized Openers) are patched back into SuperOffice `UserDefinedFields`.
|
||||||
|
|
||||||
### 2. Operational Resilience (Lessons Learned)
|
### 2. Operational Resilience (Lessons Learned)
|
||||||
|
|
||||||
#### 🛡️ A. Noise Reduction & Loop Prevention
|
#### 🛡️ A. Noise Reduction & Loop Prevention
|
||||||
* **Problem:** Every data update (`PATCH`) triggers a new `contact.changed` webhook, potentially creating infinite loops (Ping-Pong effect). Also, irrelevant changes (e.g., phone numbers) waste AI processing resources.
|
* **Problem:** Every data update (`PATCH`) triggers a new `contact.changed` webhook, creating infinite loops (Ping-Pong effect).
|
||||||
* **Solution:** Strict **Whitelist Filtering** in `worker.py`.
|
* **Solution 1 (Whitelist):** Only changes to `name`, `urladdress`, `urls`, `orgnr`, or `userdef_id` (UDFs) trigger processing.
|
||||||
* **Contacts:** Only changes to `name`, `urladdress`, `urls`, `orgnr`, or `userdef_id` (UDFs) trigger processing.
|
* **Solution 2 (Circuit Breaker):** The system explicitly ignores any events triggered by its own API User (**Associate ID 528**). This stops the echo effect immediately.
|
||||||
* **Persons:** Only changes to `jobtitle`, `position_id`, or `userdef_id` trigger processing.
|
|
||||||
* All other events are instantly marked as **`SKIPPED`** (visible in Dashboard).
|
|
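The whitelist and circuit-breaker rules above can be sketched as a single predicate. This is a minimal sketch, not the actual `worker.py` code: the function name `should_process` is hypothetical, and the payload keys `Event`, `Changes`, and `ChangedByAssociateId` are assumptions about the webhook body. The field names and Associate ID 528 are taken from the text.

```python
from typing import Any

# Field whitelists quoted in the README text.
CONTACT_FIELDS = {"name", "urladdress", "urls", "orgnr", "userdef_id"}
PERSON_FIELDS = {"jobtitle", "position_id", "userdef_id"}

# Associate ID of the connector's own API user (from the README).
OWN_ASSOCIATE_ID = 528


def should_process(event: dict[str, Any]) -> bool:
    """Return True if the webhook event warrants AI processing.

    Assumed payload keys (not confirmed by the README):
    "Event", "Changes", "ChangedByAssociateId".
    """
    # Circuit breaker: drop echoes of our own PATCH calls first.
    if event.get("ChangedByAssociateId") == OWN_ASSOCIATE_ID:
        return False

    # "contact.changed" -> "contact", "person.created" -> "person"
    entity = str(event.get("Event", "")).split(".", 1)[0]
    whitelist = {"contact": CONTACT_FIELDS, "person": PERSON_FIELDS}.get(entity)
    if whitelist is None:
        return False

    # Process only if at least one changed field is whitelisted.
    changed = {str(field).lower() for field in event.get("Changes", [])}
    return bool(changed & whitelist)
```

Checking the circuit breaker before the whitelist means an echo is rejected even when it touches a whitelisted field such as `userdef_id`, which is the case the connector's own `PATCH` calls produce.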
||||||
|
|
||||||
#### 📊 B. Dashboard & Sync-Runs
|
#### 👥 B. Multi-Tenant Filtering (Roboplanet vs. Wackler)
|
||||||
* **Problem:** Continuous webhooks created hundreds of messy dashboard entries for the same account.
|
* **Challenge:** The SuperOffice tenant is shared with Wackler. We must only process Roboplanet accounts.
|
||||||
* **Solution:** **Sync-Run Clustering**.
|
* **Solution:** **Hybrid Whitelist Filtering**. We maintain a list of Roboplanet Associate IDs and Shortnames (e.g., `485`, `528`, `RKAB`, `RCGO`) in `config.py`.
|
||||||
* Jobs for the same account occurring within **15 minutes** of each other are grouped into a single "Sync-Run" row.
|
* **Execution:** The worker fetches contact details and skips any account whose owner is not in this whitelist. This is faster and more reliable than querying group memberships via the API (which often returns 500 errors).
|
||||||
* **Status Prioritization:** A row shows `COMPLETED` if at least one job was successful, even if subsequent "echo" webhooks were `SKIPPED`.
|
|
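The hybrid whitelist check described above might look like the following. It is a sketch under assumptions: the function name `is_roboplanet_account` is hypothetical, the owner is assumed to arrive as an `Associate` object with `AssociateId` and `Name` keys, and the sets hold only the example values quoted in the text (the real list lives in `config.py`).

```python
from typing import Any

# Example entries quoted in the README; the full list is in config.py.
ROBOPLANET_ASSOCIATE_IDS = {485, 528}
ROBOPLANET_SHORTNAMES = {"RKAB", "RCGO"}


def is_roboplanet_account(contact: dict[str, Any]) -> bool:
    """Check the account owner against the ID list, then the shortname list.

    Assumes the contact payload carries the owner under "Associate"
    with "AssociateId" and "Name" keys (an assumption, not confirmed here).
    """
    owner = contact.get("Associate") or {}
    if owner.get("AssociateId") in ROBOPLANET_ASSOCIATE_IDS:
        return True
    shortname = str(owner.get("Name", "")).upper()
    return shortname in ROBOPLANET_SHORTNAMES
```

Because the check runs purely on data the worker has already fetched, it needs no extra API call per account.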
||||||
|
|
||||||
#### 🐞 C. The "Unhashable Dict" Mystery
|
#### 📊 C. Dashboard Persistence & Prioritization
|
||||||
* **Incident:** During deployment, a `TypeError: unhashable type: 'dict'` occurred. Initial suspicion was a SuperOffice API bug.
|
* **Challenge:** Webhooks often contain only IDs, so the dashboard shows placeholder labels such as "Entity C123" instead of names.
|
||||||
* **Resolution:** Deep analysis of raw JSON proved the API is healthy. The error was caused by a bug in the *test/dashboard scripts* (creating a dictionary literal using another dictionary as a key).
|
* **Solution:** **Late Name Resolution**. The worker persists the resolved Company Name and Associate Shortname to the SQLite database as soon as it fetches them from SuperOffice.
|
||||||
* **Lesson:** Always verify raw API output (`debug_raw_response.py`) before blaming the vendor.
|
* **Status Priority:** Success (`COMPLETED`) now "outshines" subsequent ignored echoes (`SKIPPED`). Once an account is green, it stays green in the dashboard cluster for 15 minutes.
|
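The 15-minute clustering and the status priority can be illustrated together. This is a simplified sketch, not the dashboard's actual code: `cluster_sync_runs` and the tuple layout `(account_id, timestamp, status)` are hypothetical, and only the `COMPLETED` and `SKIPPED` statuses come from the text.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)
# Higher number wins when jobs are merged into one row.
PRIORITY = {"SKIPPED": 0, "COMPLETED": 1}


def cluster_sync_runs(jobs):
    """Group (account_id, timestamp, status) tuples into sync-run rows."""
    runs = []
    latest = {}  # account_id -> most recent run for that account
    for account_id, ts, status in sorted(jobs, key=lambda job: job[1]):
        run = latest.get(account_id)
        if run is not None and ts - run["last_seen"] <= WINDOW:
            # Same cluster: extend it and keep the higher-priority status.
            run["last_seen"] = ts
            if PRIORITY.get(status, 0) > PRIORITY.get(run["status"], 0):
                run["status"] = status
        else:
            # Gap larger than the window (or first job): start a new row.
            run = {"account_id": account_id, "first_seen": ts,
                   "last_seen": ts, "status": status}
            runs.append(run)
            latest[account_id] = run
    return runs


t0 = datetime(2024, 1, 1, 12, 0)
example_runs = cluster_sync_runs([
    (1, t0, "COMPLETED"),
    (1, t0 + timedelta(minutes=5), "SKIPPED"),   # echo, merged into the first row
    (1, t0 + timedelta(minutes=40), "SKIPPED"),  # outside the window, new row
])
```

In the example, the first row keeps `COMPLETED` despite the later `SKIPPED` echo, while the job 35 minutes after it opens a second row.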
||||||
|
|
||||||
### 3. Tooling & Diagnosis
|
### 3. Advanced API Handling (Critical Fixes)
|
||||||
|
|
||||||
|
* **OData Pagination:** We implemented `odata.nextLink` support (Manuel Zierl's advice) to correctly handle large result sets (>1000 records).
|
||||||
|
* **Case Sensitivity:** The casing of JSON keys in SuperOffice API responses varies by endpoint (e.g., `contactId` vs `ContactId`). Our code now handles both.
|
||||||
|
* **Auth Consistency:** Always use `load_dotenv(override=True)` to prevent stale environment variables (like expired tokens) from lingering in the shell process.
|
||||||
|
* **Identity Crisis:** The `Associate/Me` and `Associate/{id}` endpoints return 500 errors if the API user lacks a linked Person record. This is a known configuration blocker for automated mailing.
|
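For the OData pagination point above, a minimal sketch of following `odata.nextLink` is shown here. The helper name `iter_odata` and the `fetch` callable are hypothetical. `fetch` stands in for whatever authenticated GET the connector uses and is assumed to return the decoded JSON page, with records under `value`.

```python
from typing import Any, Callable, Iterator


def iter_odata(first_url: str,
               fetch: Callable[[str], dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every record, following `odata.nextLink` until it is absent."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("value", [])
        # The last page has no nextLink, which ends the loop.
        url = page.get("odata.nextLink")


# Usage with two fake pages instead of real HTTP calls.
fake_pages = {
    "page1": {"value": [{"id": 1}, {"id": 2}], "odata.nextLink": "page2"},
    "page2": {"value": [{"id": 3}]},
}
all_ids = [row["id"] for row in iter_odata("page1", fake_pages.__getitem__)]
```

Passing the fetch function in as a parameter keeps the loop testable without network access.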
||||||
|
|
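One way to cope with the key-casing differences noted under "Case Sensitivity" is a case-insensitive lookup helper. This is an illustrative sketch; `get_ci` is a hypothetical name and not necessarily how the connector handles it.

```python
from typing import Any


def get_ci(record: dict[str, Any], key: str, default: Any = None) -> Any:
    """Look up `key` in `record`, ignoring the casing of the key."""
    if key in record:  # fast path: exact match
        return record[key]
    wanted = key.lower()
    for name, value in record.items():
        if name.lower() == wanted:
            return value
    return default
```

With this helper, `get_ci(row, "ContactId")` returns the same value whether the endpoint sent `contactId` or `ContactId`.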
||||||
|
### 4. Docker Optimization
|
||||||
|
|
||||||
|
* **Multi-stage builds:** Reduced build time for code-only changes from 8+ minutes to seconds by separating the build environment (compilers) from the runtime environment.
|
||||||
|
* **Size reduction:** Removed `build-essential` and Node.js runtimes from final images, drastically shrinking the footprint.
|
||||||
|
|
||||||
|
### 5. Tooling & Diagnosis
|
||||||
Located in `connector-superoffice/tools/`:
|
Located in `connector-superoffice/tools/`:
|
||||||
|
|
||||||
* `verify_enrichment.py`: Verifies if AI data actually landed in SuperOffice (using raw text search).
|
* `verify_latest_roboplanet.py`: The "Ocular Proof" script that finds the newest target account.
|
||||||
* `create_company.py`: Creates a test company in Prod to trigger the flow.
|
* `check_filter_counts.py`: Compares API counts with UI counts.
|
||||||
* `full_discovery.py`: Lists all available UDFs and Lists on the current tenant.
|
* `find_latest_roboplanet_account.py`: Diagnostic tool for group filtering.
|
||||||
* `final_truth_check.py`: Validates JSON structure of API responses.
|
* `cleanup_test_data.py`: Safely removes all session-created test objects.
|
||||||
* `who_am_i.py`: Identifies the API user's Associate ID.
|
|
||||||
|
|
||||||
### 4. Open Todos
|
### 6. Open Todos
|
||||||
* [ ] **List ID Mapping:** Identify the correct List ID for the "Vertical" field (ProgID `SuperOffice:83`). `udlist331` returned 404.
|
* [x] **List ID Mapping:** Identified Production IDs (1613-1637). Hardcoded in `config.py`.
|
||||||
* [ ] **Mailing Identity:** Resolve 500 error on `Associate/Me`. Required for automated "Send As" mailing functionality.
|
* [ ] **Mailing Identity:** Resolve 500 error on `Associate/Me`. (Meeting scheduled for Monday).
|
||||||
* [ ] **Docker Build:** Optimize `connector-superoffice` build time (>8 mins) using Multi-Stage Dockerfiles to avoid C-compilation on every deploy.
|
* [x] **Docker Build:** Multi-stage builds implemented.
|
||||||
|
* [x] **Dashboard:** Names and Associate shortnames now visible.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Authentication (Legacy)
|
## Authentication (Legacy)
|
||||||
Authentication is handled via the `AuthHandler` class, which uses a refresh token flow to obtain access tokens. Ensure that the `.env` file in the project root is correctly configured with `SO_CLIENT_ID`, `SO_CLIENT_SECRET`, `SO_REFRESH_TOKEN`, `SO_REDIRECT_URI`, `SO_ENVIRONMENT`, and `SO_CONTEXT_IDENTIFIER`.
|
Authentication is handled via the `AuthHandler` class, which uses a refresh token flow to obtain access tokens. Ensure that the `.env` file in the project root is correctly configured. Always call `load_dotenv(override=True)` when loading settings so that stale shell variables (such as expired tokens) do not shadow the `.env` values.
|
||||||
|
|
||||||
## Key SuperOffice Entities and Data Model Observations
|
|
||||||
(See full file for historical notes on Sale Entity and Product Information Storage)
|
|
||||||
|
|||||||