Fix GTM Architect: Robust prompt syntax, Gemini API migration, Docker logging
This commit is contained in:
@@ -233,3 +233,21 @@ Achtung beim Routing. Wenn die App unter `/app/` laufen soll, muss der Trailing
|
||||
- [ ] `docker-compose.yml` mountet auch `helpers.py` und `config.py`?
|
||||
- [ ] Leere `.db` Datei auf dem Host erstellt?
|
||||
- [ ] Dockerfile nutzt Multi-Stage Build?
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: GTM Architect Fixes & Gemini Migration (Jan 2026)
|
||||
|
||||
### A.1 Problemstellung
|
||||
- **SyntaxError bei großen Prompts:** Python-Parser (3.11) hatte massive Probleme mit f-Strings, die 100+ Zeilen lang waren und Sonderzeichen enthielten.
|
||||
- **Library Deprecation:** `google.generativeai` hat Support eingestellt? Nein, aber die Fehlermeldung im Log deutete auf einen Konflikt zwischen alten `openai`-Wrappern und neuen Gemini-Paketen hin.
|
||||
- **Lösung:**
|
||||
1. **Prompts ausgelagert:** System-Prompts liegen jetzt in `gtm_prompts.json` und werden zur Laufzeit geladen. Kein Code-Parsing mehr notwendig.
|
||||
2. **Native Gemini Lib:** Statt OpenAI-Wrapper nutzen wir jetzt `google.generativeai` direkt via `helpers.call_gemini_flash`.
|
||||
3. **Config:** `gtm-architect/Dockerfile` kopiert nun explizit `gtm_prompts.json`.
|
||||
|
||||
### A.2 Neuer Standard für KI-Apps
|
||||
Für zukünftige Apps gilt:
|
||||
1. **Prompts in JSON/Text-Files:** Niemals riesige Strings im Python-Code hardcoden.
|
||||
2. **`helpers.call_gemini_flash` nutzen:** Diese Funktion ist nun der Gold-Standard für einfache, stateless Calls.
|
||||
3. **JSON im Dockerfile:** Vergesst nicht, die externen Prompt-Files mit `COPY` in den Container zu holen!
|
||||
|
||||
14
GEMINI.md
14
GEMINI.md
@@ -51,3 +51,17 @@ The application will be available at `http://localhost:8080`.
|
||||
* **API:** The Flask application in `app.py` provides an API to interact with the system.
|
||||
* **Logging:** The project uses the `logging` module to log information and errors.
|
||||
* **Error Handling:** The `readme.md` indicates a critical error related to the `openai` library. The next step is to downgrade the library to a compatible version.
|
||||
|
||||
## Current Status (Jan 2026) - GTM Architect & Core Updates
|
||||
|
||||
* **GTM Architect Fixed & Stabilized:** The "GTM Architect" service is now fully operational.
|
||||
* **Prompt Engineering:** Switched from fragile f-strings to robust `"\n".join([...])` list construction for all system prompts. This eliminates syntax errors caused by mixed quotes and multi-line strings in Python.
|
||||
* **AI Backend:** Migrated to `google.generativeai` via `helpers.call_gemini_flash` with JSON mode support.
|
||||
* **Scraping:** Implemented robust URL scraping using `requests` and `BeautifulSoup` (with `html.parser` to avoid dependency issues).
|
||||
* **Docker:** Container (`gtm-app`) updated with correct volume mounts for logs and config.
|
||||
* **Gemini Migration:** The codebase is moving towards `google-generativeai` as the primary driver. `helpers.py` now supports this natively.
|
||||
* **Deployment:** To apply these fixes, a rebuild of the `gtm-app` container is required.
|
||||
|
||||
## Next Steps
|
||||
* **Monitor Logs:** Check `Log_from_docker/` for detailed execution traces of the GTM Architect.
|
||||
* **Feedback Loop:** Verify the quality of the generated GTM strategies and adjust prompts in `gtm_architect_orchestrator.py` if necessary.
|
||||
|
||||
@@ -104,6 +104,9 @@ services:
|
||||
- ./gtm_projects.db:/app/gtm_projects.db
|
||||
# Mount the API key to the location expected by config.py
|
||||
- ./gemini_api_key.txt:/app/api_key.txt
|
||||
# Mount Logs
|
||||
- ./Log_from_docker:/app/Log_from_docker
|
||||
environment:
|
||||
- OPENAI_API_KEY=${OPENAI_API_KEY}
|
||||
- SERPAPI_API_KEY=${SERPAPI_API_KEY}
|
||||
- PYTHONUNBUFFERED=1
|
||||
@@ -33,6 +33,9 @@ RUN apt-get update && \
|
||||
COPY gtm-architect/requirements.txt .
|
||||
RUN pip install --no-cache-dir -r requirements.txt
|
||||
|
||||
# Google Generative AI Library installieren
|
||||
RUN pip install --no-cache-dir google-generativeai
|
||||
|
||||
# Copy the Node.js server script and package.json for runtime deps
|
||||
COPY gtm-architect/server.cjs .
|
||||
COPY gtm-architect/package.json .
|
||||
@@ -45,6 +48,7 @@ COPY --from=frontend-builder /app/dist ./dist
|
||||
|
||||
# Copy the Python Orchestrator and shared modules
|
||||
COPY gtm_architect_orchestrator.py .
|
||||
COPY gtm_prompts.json .
|
||||
COPY helpers.py .
|
||||
COPY config.py .
|
||||
COPY market_db_manager.py .
|
||||
@@ -56,4 +60,4 @@ COPY market_db_manager.py .
|
||||
EXPOSE 3005
|
||||
|
||||
# Start the server
|
||||
CMD ["node", "server.cjs"]
|
||||
|
||||
@@ -1,259 +1,431 @@
|
||||
import os
import sys
import json
import argparse
import logging
import re
from pathlib import Path
from datetime import datetime

import requests
from bs4 import BeautifulSoup

from config import Config

# Make the project root and this file's directory importable before pulling
# in the shared modules (helpers / market_db_manager live alongside/above us).
project_root = Path(__file__).resolve().parents[1]
sys.path.append(str(project_root))
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

import market_db_manager
# call_gemini_flash is the current driver; call_openai_chat is kept importable
# for any remaining legacy call sites.
from helpers import call_openai_chat
from helpers import call_gemini_flash

print("--- PYTHON ORCHESTRATOR V_20260101_1200_FINAL ---", file=sys.stderr)

# Log to a dated file under Log_from_docker/ (volume-mounted by docker-compose)
# and mirror everything to stderr so `docker logs` stays useful.
LOG_DIR = "Log_from_docker"
if not os.path.exists(LOG_DIR):
    os.makedirs(LOG_DIR)

__version__ = "1.4.0_UltimatePromptFix"
timestamp = datetime.now().strftime("%Y-%m-%d")
log_file = os.path.join(LOG_DIR, f"{timestamp}_gtm_architect.log")

# Ensure the project DB schema exists before any handler runs.
market_db_manager.init_db()

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file, mode='a', encoding='utf-8'),
        logging.StreamHandler(sys.stderr)
    ]
)
|
||||
|
||||
def log_to_stderr(msg):
    """Write a tagged progress message to stderr, flushed immediately so it
    shows up in `docker logs` without buffering delay."""
    stream = sys.stderr
    stream.write(f"[GTM-ORCHESTRATOR] {msg}\n")
    stream.flush()
|
||||
|
||||
# --- SCRAPING HELPER ---
def get_text_from_url(url):
    """Fetch *url* and return its visible text content, capped at 30k chars.

    Boilerplate elements (scripts, styles, nav chrome, ...) are stripped
    before extraction. Returns "" on any failure so callers can fall back
    to using the raw input.
    """
    try:
        log_to_stderr(f"Scraping URL: {url}")
        request_headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'}
        resp = requests.get(url, headers=request_headers, timeout=15)
        resp.raise_for_status()

        # html.parser is pure stdlib: avoids an lxml dependency in the image.
        soup = BeautifulSoup(resp.content, 'html.parser')

        # Drop non-content elements before extracting text.
        noise_tags = ['script', 'style', 'noscript', 'iframe', 'svg', 'header', 'footer', 'nav', 'aside']
        for node in soup(noise_tags):
            node.decompose()

        page_text = soup.get_text(separator=' ', strip=True)
        log_to_stderr(f"Scraping success. Length: {len(page_text)}")
        return page_text[:30000]  # Limit length

    except Exception as e:
        log_to_stderr(f"Scraping failed: {e}")
        logging.warning(f"Could not scrape URL {url}: {e}")
        return ""
|
||||
|
||||
# --- SYSTEM PROMPTS (Constructed reliably) ---
def get_system_instruction(lang):
    """Return the system instruction for the GTM Architect engine.

    Prompts are assembled via "\n".join over short single-line literals
    instead of one giant triple-quoted f-string: the old multi-line prompt
    blocks caused parser trouble (mixed quotes, 100+ line literals) and
    were the source of the merge corruption this replaces — the previous
    version contained both the legacy triple-quoted prompt (making the
    join unreachable in the German branch) and an unterminated string in
    the English branch.

    Args:
        lang: 'de' selects the German prompt; any other value yields English.

    Returns:
        The full system prompt as a single newline-joined string.
    """
    if lang == 'de':
        return "\n".join([
            "# IDENTITY & PURPOSE",
            'Du bist die "GTM Architect Engine" für Roboplanet. Deine Aufgabe ist es, für neue technische Produkte (Roboter) eine präzise Go-to-Market-Strategie zu entwickeln.',
            "Du handelst nicht als kreativer Werbetexter, sondern als strategischer Analyst. Dein oberstes Ziel ist Product-Market-Fit und operative Umsetzbarkeit.",
            "Antworte IMMER auf DEUTSCH.",
            "",
            "# CONTEXT: THE PARENT COMPANY (WACKLER)",
            "Wir sind Teil der Wackler Group, einem großen Facility-Management-Dienstleister.",
            'Unsere Strategie ist NICHT "Roboter ersetzen Menschen", sondern "Hybrid-Reinigung":',
            "- 80% der Arbeit (monotone Flächenleistung) = Roboter.",
            "- 20% der Arbeit (Edge Cases, Winterdienst, Treppen, Grobschmutz) = Manuelle Reinigung durch Wackler.",
            "",
            "# STRICT ANALYSIS RULES (MUST FOLLOW):",
            "1. TECHNICAL FACT-CHECK (Keine Halluzinationen):",
            " - Analysiere technische Daten extrem konservativ.",
            ' - Vakuumsystem = Kein "Winterdienst" (Schnee) und keine "Schwerindustrie" (Metallspäne), außer explizit genannt.',
            " - Erfinde keine Features, nur um eine Zielgruppe passend zu machen.",
            "",
            "2. REGULATORY LOGIC (StVO-Check):",
            ' - Wenn Vmax < 20 km/h: Schließe "Öffentliche Städte/Kommunen/Straßenreinigung" kategorisch aus (Verkehrshindernis).',
            ' - Fokusänderung: Konzentriere dich stattdessen ausschließlich auf "Große, zusammenhängende Privatflächen" (Gated Areas).',
            "",
            "3. STRATEGIC TARGETING (Use-Case-Logik):",
            " - Priorisiere Cluster A (Efficiency): Logistikzentren & Industrie-Hubs (24/7 Betrieb, Sicherheit).",
            " - Priorisiere Cluster B (Experience): Shopping Center, Outlets & Freizeitparks (Sauberkeit als Visitenkarte).",
            " - Entferne reine E-Commerce-Händler ohne physische Kundenfläche.",
            "",
            '4. THE "HYBRID SERVICE" LOGIC (RULE 5):',
            'Wann immer du ein "Hartes Constraint" oder eine technische Limitierung identifizierst (z.B. "Kein Winterdienst" oder "Kommt nicht in Ecken"), darfst du dies niemals als reines "Nein" stehen lassen.',
            'Wende stattdessen die **"Yes, and..." Logik** an:',
            ' 1. **Identifiziere die Lücke:** (z.B. "Roboter kann bei Schnee nicht fahren").',
            ' 2. **Fülle die Lücke mit Service:** Schlage explizit vor, diesen Teil durch "Wackler Human Manpower" abzudecken.',
            ' 3. **Formuliere den USP:** Positioniere das Gesamtpaket als "100% Coverage" (Roboter + Mensch aus einer Hand).'
        ])

    # Default: English prompt.
    return "\n".join([
        "# IDENTITY & PURPOSE",
        'You are the "GTM Architect Engine" for Roboplanet. Your task is to develop a precise Go-to-Market strategy for new technical products (robots).',
        "You do not act as a creative copywriter, but as a strategic analyst. Your top goal is product-market fit and operational feasibility.",
        "ALWAYS respond in ENGLISH.",
        "",
        "# CONTEXT: THE PARENT COMPANY (WACKLER)",
        "We are part of the Wackler Group, a major facility management service provider.",
        'Our strategy is NOT "Robots replace humans", but "Hybrid Cleaning":',
        "- 80% of work (monotonous area coverage) = Robots.",
        "- 20% of work (Edge cases, winter service, stairs, heavy debris) = Manual cleaning by Wackler.",
        "",
        "# STRICT ANALYSIS RULES (MUST FOLLOW):",
        "1. TECHNICAL FACT-CHECK (No Hallucinations):",
        " - Analyze technical data extremely conservatively.",
        ' - Vacuum System = No "Winter Service" (snow) and no "Heavy Industry" (metal shavings), unless explicitly stated.',
        " - Do not invent features just to fit a target audience.",
        "",
        "2. REGULATORY LOGIC (Traffic Regs):",
        ' - If Vmax < 20 km/h: Categorically exclude "Public Cities/Streets" (traffic obstruction).',
        ' - Change Focus: Concentrate exclusively on "Large, contiguous private areas" (Gated Areas).',
        "",
        "3. STRATEGIC TARGETING (Use Case Logic):",
        " - Prioritize Cluster A (Efficiency): Logistics Centers & Industrial Hubs (24/7 ops, safety).",
        " - Prioritize Cluster B (Experience): Shopping Centers, Outlets & Theme Parks (Cleanliness as a calling card).",
        " - Remove pure E-commerce retailers without physical customer areas.",
        "",
        '4. THE "HYBRID SERVICE" LOGIC (RULE 5):',
        'Whenever you identify a "Hard Constraint" or technical limitation (e.g., "No winter service" or "Cannot reach corners"), never let this stand as a simple "No".',
        'Instead, apply the **"Yes, and..." logic**:',
        ' 1. **Identify the gap:** (e.g., "Robot cannot operate in snow").',
        ' 2. **Fill the gap with service:** Explicitly suggest covering this part with "Wackler Human Manpower".',
        ' 3. **Formulate the USP:** Position the total package as "100% Coverage" (Robot + Human from a single source).'
    ])
|
||||
|
||||
# --- Database Handlers ---

def save_project_handler(data):
    """Persist a project record; if no name was given, derive one from the
    first line of the product input (max 30 chars)."""
    if 'name' not in data:
        raw_text = data.get('productInput', '')
        if raw_text:
            data['name'] = raw_text.split('\n')[0][:30]
        else:
            data['name'] = "Untitled Strategy"

    outcome = market_db_manager.save_project(data)
    print(json.dumps(outcome))
|
||||
|
||||
def list_projects_handler(data):
    """Print all stored projects as a JSON array (``data`` is unused but
    kept for the uniform handler signature)."""
    print(json.dumps(market_db_manager.get_all_projects()))
|
||||
|
||||
def load_project_handler(data):
    """Look up a project by its 'id' and print it as JSON, or print an
    error object if it does not exist."""
    project = market_db_manager.load_project(data.get('id'))
    if not project:
        print(json.dumps({"error": "Project not found"}))
    else:
        print(json.dumps(project))
|
||||
|
||||
def delete_project_handler(data):
    """Delete the project identified by data['id'] and print the DB
    manager's result object as JSON."""
    outcome = market_db_manager.delete_project(data.get('id'))
    print(json.dumps(outcome))
|
||||
|
||||
# --- AI Handlers ---

def analyze_product(product_input, lang):
    """Phase 1: technical extraction plus portfolio conflict check.

    If *product_input* looks like an http(s) URL it is scraped first;
    otherwise the text is analyzed directly.

    Args:
        product_input: Free-form product description or a URL.
        lang: 'de' or 'en'; selects the system prompt language.

    Returns:
        Dict with 'features', 'constraints' and 'rawAnalysis'; augmented
        with 'conflictCheck' data when the second model call returns
        valid JSON.
    """
    # 1. Scraping if URL
    content = product_input
    if re.match(r'^https?://', product_input.strip()):
        logging.info(f"Detected URL: {product_input}. Scraping...")
        scraped_text = get_text_from_url(product_input.strip())
        if scraped_text:
            content = scraped_text
            logging.info(f"Scraped {len(content)} chars.")
        else:
            logging.warning("Scraping failed, using URL as input.")

    sys_instr = get_system_instruction(lang)

    # 2. Phase 1-A: extract features/constraints as JSON.
    prompt_extract = "\n".join([
        "PHASE 1-A: TECHNICAL EXTRACTION",
        f'Input Product Description: "{content[:25000]}"',
        "",
        "Task:",
        "1. Extract key technical features (specs, capabilities).",
        '2. Derive "Hard Constraints". IMPORTANT: Check Vmax (<20km/h = Private Grounds) and Cleaning Type (Vacuum != Heavy Debris/Snow).',
        "3. Create a short raw analysis summary.",
        "",
        "Output JSON format ONLY:",
        "{",
        ' "features": ["feature1", "feature2"],',
        ' "constraints": ["constraint1", "constraint2"],',
        ' "rawAnalysis": "summary text"',
        "}"
    ])

    log_to_stderr("Starting Phase 1-A: Technical Extraction...")
    raw_response = call_gemini_flash(prompt_extract, system_instruction=sys_instr, json_mode=True)
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        logging.error(f"Failed to parse Phase 1 JSON: {raw_response}")
        return {"features": [], "constraints": [], "rawAnalysis": "Error parsing AI response."}

    # 3. Phase 1-B: check overlap with the existing product portfolio.
    prompt_conflict = "\n".join([
        "PHASE 1-B: PORTFOLIO CONFLICT CHECK",
        "",
        f"New Product Features: {json.dumps(data.get('features'))}",
        f"New Product Constraints: {json.dumps(data.get('constraints'))}",
        "",
        "Existing Portfolio:",
        '1. "Indoor Scrubber 50": Indoor cleaning, hard floor, supermarkets.',
        '2. "Service Bot Bella": Service/Gastro, indoor, restaurants.',
        "",
        "Task:",
        "Check if the new product overlaps significantly with existing ones (is it just a clone?).",
        "",
        "Output JSON format ONLY:",
        "{",
        ' "conflictCheck": {',
        '  "hasConflict": true/false,',
        '  "details": "explanation",',
        '  "relatedProduct": "name or null"',
        " }",
        "}"
    ])

    log_to_stderr("Starting Phase 1-B: Conflict Check...")
    conflict_response = call_gemini_flash(prompt_conflict, system_instruction=sys_instr, json_mode=True)
    try:
        conflict_data = json.loads(conflict_response)
        data.update(conflict_data)
    except json.JSONDecodeError:
        # Conflict check is best-effort: keep the Phase 1-A result.
        logging.warning("Conflict check returned invalid JSON; skipping merge.")

    return data
||||
def discover_icps(phase1_result, lang):
    """Phase 2: derive Ideal Customer Profiles and data proxies.

    Args:
        phase1_result: Dict from analyze_product (reads 'features' and
            'constraints').
        lang: Prompt language ('de'/'en').

    Returns:
        Parsed JSON dict with 'icps' and 'dataProxies'.

    Raises:
        json.JSONDecodeError: If the model response is not valid JSON.
    """
    sys_instr = get_system_instruction(lang)
    # Note: the JSON template lines below were previously unterminated
    # string literals (a SyntaxError); the template is now closed properly.
    prompt = "\n".join([
        "PHASE 2: ICP DISCOVERY & DATA PROXIES",
        f"Based on the product features: {json.dumps(phase1_result.get('features'))}",
        f"And constraints: {json.dumps(phase1_result.get('constraints'))}",
        "",
        "Task:",
        "1. Negative Selection: Which industries are impossible? (Remember Vmax & Vacuum rules!)",
        "2. High Pain: Identify Cluster A (Logistics/Industry) and Cluster B (Shopping/Outlets).",
        "3. Data Proxy Generation: How to find them digitally via data traces (e.g. satellite, registries).",
        "",
        "Output JSON format ONLY:",
        "{",
        ' "icps": [',
        '  { "name": "Industry Name", "rationale": "Why this is a good fit" }',
        " ],",
        ' "dataProxies": [',
        '  { "target": "Specific criteria", "method": "How to find" }',
        " ]",
        "}"
    ])
    log_to_stderr("Starting Phase 2: ICP Discovery...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)
|
||||
def hunt_whales(phase2_result, lang):
    """Phase 3: identify concrete key accounts ("whales") per ICP industry.

    Args:
        phase2_result: Dict from discover_icps (reads 'icps').
        lang: Prompt language ('de'/'en').

    Returns:
        Parsed JSON dict with 'whales' (grouped by industry) and 'roles'
        (buying-center job titles).

    Raises:
        json.JSONDecodeError: If the model response is not valid JSON.
    """
    sys_instr = get_system_instruction(lang)
    # The JSON template is built from closed single-line literals; the
    # previous version had unterminated strings and a missing closing brace.
    prompt = "\n".join([
        "PHASE 3: WHALE HUNTING",
        f"Target ICPs (Industries): {json.dumps(phase2_result.get('icps'))}",
        "",
        "Task:",
        "1. Group 'Whales' (Key Accounts) strictly by the identified ICP industries.",
        "2. Identify 3-5 concrete top companies in the DACH market per industry.",
        "3. Define Buying Center Roles.",
        "",
        "Output JSON format ONLY:",
        "{",
        ' "whales": [',
        '  { "industry": "Name of ICP Industry", "accounts": ["Company A", "Company B"] }',
        " ],",
        ' "roles": ["Job Title 1", "Job Title 2"]',
        "}"
    ])
    log_to_stderr("Starting Phase 3: Whale Hunting...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)
|
||||
def develop_strategy(phase3_result, phase1_result, lang):
    """Phase 4: develop a marketing angle per target segment.

    Args:
        phase3_result: Dict from hunt_whales (reads 'whales' to collect
            all account names).
        phase1_result: Dict from analyze_product (reads 'features').
        lang: Prompt language ('de'/'en').

    Returns:
        Parsed JSON dict with 'strategyMatrix'.

    Raises:
        json.JSONDecodeError: If the model response is not valid JSON.
    """
    sys_instr = get_system_instruction(lang)

    # Flatten all whale accounts across industries for the prompt.
    all_accounts = []
    for w in phase3_result.get('whales', []):
        all_accounts.extend(w.get('accounts', []))

    # JSON template uses closed single-line literals; the previous version
    # had an unterminated string and missing closing brackets.
    prompt = "\n".join([
        "PHASE 4: STRATEGY & ANGLE DEVELOPMENT",
        f"Accounts: {json.dumps(all_accounts)}",
        f"Product Features: {json.dumps(phase1_result.get('features'))}",
        "",
        "Task:",
        "1. Develop specific 'Angle' per target/industry.",
        "2. Consistency Check against Product Matrix.",
        '3. **IMPORTANT:** Apply "Hybrid Service Logic" if technical constraints exist!',
        "",
        "Output JSON format ONLY:",
        "{",
        ' "strategyMatrix": [',
        "  {",
        '   "segment": "Target Segment",',
        '   "painPoint": "Specific Pain",',
        '   "angle": "Our Marketing Angle",',
        '   "differentiation": "How it differs"',
        "  }",
        " ]",
        "}"
    ])
    log_to_stderr("Starting Phase 4: Strategy...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)
|
||||
def generate_assets(phase4_result, phase3_result, phase2_result, phase1_result, lang):
    """Phase 5: produce the final GTM strategy report.

    Unlike the JSON phases, this returns the model's raw Markdown string,
    which is what the frontend expects.
    """
    system_prompt = get_system_instruction(lang)
    report_prompt_lines = [
        "PHASE 5: ASSET GENERATION & FINAL REPORT",
        "",
        "CONTEXT DATA:",
        f"- Technical: {json.dumps(phase1_result)}",
        f"- ICPs: {json.dumps(phase2_result)}",
        f"- Targets (Whales): {json.dumps(phase3_result)}",
        f"- Strategy: {json.dumps(phase4_result)}",
        "",
        "TASK:",
        '1. Create a "GTM STRATEGY REPORT" in Markdown.',
        "2. Report Structure: Executive Summary, Product Analysis, Target Audience, Target Accounts, Strategy Matrix, Assets.",
        '3. Hybrid-Check: Ensure "Hybrid Service Logic" is visible.',
        "",
        "Output:",
        'Return strictly MARKDOWN formatted text. Start with "# GTM STRATEGY REPORT".',
    ]
    log_to_stderr("Starting Phase 5: Asset Generation...")
    # Phase 5 expects TEXT (Markdown), not JSON -> json_mode=False.
    markdown_report = call_gemini_flash(
        "\n".join(report_prompt_lines),
        system_instruction=system_prompt,
        json_mode=False,
    )
    # The frontend consumes this as a plain string, not a wrapped JSON object.
    return markdown_report
|
||||
|
||||
def generate_sales_enablement(phase4_result, phase3_result, phase1_result, lang):
    """Phase 6: battlecards (objection handling) and visual prompt ideas.

    Args:
        phase4_result: Dict from develop_strategy (reads 'strategyMatrix').
        phase3_result: Dict from hunt_whales (reads 'roles').
        phase1_result: Dict from analyze_product (reads 'features').
        lang: Prompt language ('de'/'en').

    Returns:
        Parsed JSON dict with 'battlecards' and 'visualPrompts'.

    Raises:
        json.JSONDecodeError: If the model response is not valid JSON.
    """
    sys_instr = get_system_instruction(lang)
    # JSON template rebuilt from closed single-line literals; the previous
    # version contained unterminated strings and never closed the object.
    prompt = "\n".join([
        "PHASE 6: SALES ENABLEMENT & VISUALS",
        "",
        "CONTEXT:",
        f"- Product Features: {json.dumps(phase1_result.get('features'))}",
        f"- Accounts (Personas): {json.dumps(phase3_result.get('roles'))}",
        f"- Strategy: {json.dumps(phase4_result.get('strategyMatrix'))}",
        "",
        "TASK:",
        "1. Anticipate Friction & Objections.",
        "2. Formulate Battlecards.",
        "3. Create Visual Prompts.",
        "",
        "Output JSON format ONLY:",
        "{",
        ' "battlecards": [',
        "  {",
        '   "persona": "Role",',
        '   "objection": "Objection quote",',
        '   "responseScript": "Response"',
        "  }",
        " ],",
        ' "visualPrompts": [',
        "  {",
        '   "title": "Title",',
        '   "context": "Context",',
        '   "prompt": "Prompt Code"',
        "  }",
        " ]",
        "}"
    ])
    log_to_stderr("Starting Phase 6: Sales Enablement...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)
|
||||
|
||||
|
||||
# --- MAIN ---

def main():
    """CLI entry point: parse --mode/--data, dispatch to the matching phase
    handler, and print the result as JSON on stdout.

    The payload comes from --data (JSON string) or, as a fallback, stdin.
    Phase handlers return dicts that are printed here; the legacy DB
    commands (save/list/load/delete) print their own JSON and return.
    On any error a JSON error object is printed and the process exits 1
    so server.cjs can handle the failure gracefully.
    """
    parser = argparse.ArgumentParser(description='GTM Architect main entry point.')
    parser.add_argument('--mode', required=True, help='The command to execute.')
    parser.add_argument('--data', help='JSON data for the command.')
    args = parser.parse_args()

    log_to_stderr("--- GTM Orchestrator Starting ---")

    # --- CRITICAL FIX FOR API KEY LOADING ---
    # Load API keys up front: helpers.call_gemini_flash relies on the
    # Config class state being initialized.
    try:
        Config.load_api_keys()
        log_to_stderr("API Keys loaded.")
        logging.info("Config.load_api_keys() called successfully.")
    except Exception as e:
        log_to_stderr(f"CRITICAL: Failed to load API keys: {e}")
        logging.critical(f"Failed to load API keys: {e}")
    # ----------------------------------------

    try:
        # Payload from --data arg, falling back to stdin.
        if args.data:
            data_in = json.loads(args.data)
        else:
            try:
                raw_stdin = sys.stdin.read()
                data_in = json.loads(raw_stdin) if raw_stdin.strip() else {}
            except Exception:
                data_in = {}

        mode = args.mode
        lang = data_in.get('language', 'de')

        log_to_stderr(f"Processing mode: {mode} in language: {lang}")
        logging.info(f"Processing mode: {mode} in language: {lang}")

        result = {}

        if mode == 'analyze_product':
            result = analyze_product(data_in.get('productInput'), lang)

        elif mode == 'discover_icps':
            result = discover_icps(data_in.get('phase1Result'), lang)

        elif mode == 'hunt_whales':
            result = hunt_whales(data_in.get('phase2Result'), lang)

        elif mode == 'develop_strategy':
            result = develop_strategy(data_in.get('phase3Result'), data_in.get('phase1Result'), lang)

        elif mode == 'generate_assets':
            # Phase 5 returns a string (Markdown), not a dict.
            markdown_report = generate_assets(
                data_in.get('phase4Result'),
                data_in.get('phase3Result'),
                data_in.get('phase2Result'),
                data_in.get('phase1Result'),
                lang,
            )
            print(json.dumps(markdown_report))
            log_to_stderr("Finished Phase 5. Output sent to stdout.")
            return

        elif mode == 'generate_sales_enablement':
            result = generate_sales_enablement(
                data_in.get('phase4Result'),
                data_in.get('phase3Result'),
                data_in.get('phase1Result'),
                lang,
            )

        elif mode in ('save', 'list', 'load', 'delete'):
            # Legacy DB commands: each handler prints its own JSON.
            db_handlers = {
                'save': save_project_handler,
                'list': list_projects_handler,
                'load': load_project_handler,
                'delete': delete_project_handler,
            }
            db_handlers[mode](data_in)
            log_to_stderr("Finished. Output sent to stdout.")
            return

        else:
            logging.error(f"Unknown mode: {mode}")
            result = {"error": f"Unknown mode: {mode}"}

        print(json.dumps(result))
        log_to_stderr("Finished. Output sent to stdout.")

    except Exception as e:
        log_to_stderr(f"CRITICAL ERROR: {e}")
        logging.error(f"Error in orchestrator: {e}", exc_info=True)
        # Return error as JSON so server.cjs can handle it gracefully
        print(json.dumps({"error": str(e)}))
        sys.exit(1)


if __name__ == '__main__':
    print(f"GTM Architect Engine v{__version__}_FORCE_RELOAD_2 running.", file=sys.stderr)
    main()
|
||||
|
||||
1572
helpers.py
1572
helpers.py
File diff suppressed because one or more lines are too long
Reference in New Issue
Block a user