Fix GTM Architect: Robust prompt syntax, Gemini API migration, Docker logging

2026-01-01 22:39:43 +00:00
parent 05753edbb1
commit 1fe17d88bc
6 changed files with 598 additions and 1651 deletions


@@ -233,3 +233,21 @@ Careful with routing: if the app is supposed to run under `/app/`, the trailing
- [ ] Does `docker-compose.yml` also mount `helpers.py` and `config.py`?
- [ ] Empty `.db` file created on the host?
- [ ] Does the Dockerfile use a multi-stage build?
---
## Appendix A: GTM Architect Fixes & Gemini Migration (Jan 2026)
### A.1 Problem Statement
- **SyntaxError on large prompts:** The Python 3.11 parser had massive problems with f-strings that ran to 100+ lines and contained special characters.
- **Library deprecation:** Has `google.generativeai` been discontinued? No, but the error message in the log pointed to a conflict between the old `openai` wrappers and the new Gemini packages.
- **Solution:**
1. **Prompts externalized:** System prompts now live in `gtm_prompts.json` and are loaded at runtime. No more code parsing required.
2. **Native Gemini lib:** Instead of the OpenAI wrapper, we now use `google.generativeai` directly via `helpers.call_gemini_flash`.
3. **Config:** `gtm-architect/Dockerfile` now explicitly copies `gtm_prompts.json`.
### A.2 New Standard for AI Apps
For future apps:
1. **Prompts in JSON/text files:** Never hardcode huge strings in Python code.
2. **Use `helpers.call_gemini_flash`:** This function is now the gold standard for simple, stateless calls.
3. **JSON in the Dockerfile:** Don't forget to `COPY` the external prompt files into the container!
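The runtime-loading pattern from point 1 can be sketched as follows. This is a minimal sketch, not the project's actual loader: the key layout of `gtm_prompts.json` (one list of lines per language code) is an assumption made for illustration.

```python
import json

def load_system_instruction(path: str, lang: str) -> str:
    """Load the prompt lines for `lang` from a JSON file and join them."""
    with open(path, encoding="utf-8") as f:
        prompts = json.load(f)
    # Fall back to German, the project default, if the language is missing.
    lines = prompts.get(lang, prompts["de"])
    return "\n".join(lines)

# Demo: write a tiny prompt file and load it back.
sample = {
    "de": ["# IDENTITY", "Antworte IMMER auf DEUTSCH."],
    "en": ["# IDENTITY", "ALWAYS respond in ENGLISH."],
}
with open("gtm_prompts_demo.json", "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False)

print(load_system_instruction("gtm_prompts_demo.json", "en"))
```

Because the prompt text is data rather than source code, no amount of embedded quotes or multi-line content can ever break the Python parser again.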


@@ -51,3 +51,17 @@ The application will be available at `http://localhost:8080`.
* **API:** The Flask application in `app.py` provides an API to interact with the system.
* **Logging:** The project uses the `logging` module to log information and errors.
* **Error Handling:** The `readme.md` indicates a critical error related to the `openai` library. The next step is to downgrade the library to a compatible version.
## Current Status (Jan 2026) - GTM Architect & Core Updates
* **GTM Architect Fixed & Stabilized:** The "GTM Architect" service is now fully operational.
* **Prompt Engineering:** Switched from fragile f-strings to robust `"\n".join([...])` list construction for all system prompts. This eliminates syntax errors caused by mixed quotes and multi-line strings in Python.
* **AI Backend:** Migrated to `google.generativeai` via `helpers.call_gemini_flash` with JSON mode support.
* **Scraping:** Implemented robust URL scraping using `requests` and `BeautifulSoup` (with `html.parser` to avoid dependency issues).
* **Docker:** Container (`gtm-app`) updated with correct volume mounts for logs and config.
* **Gemini Migration:** The codebase is moving towards `google-generativeai` as the primary driver. `helpers.py` now supports this natively.
* **Deployment:** To apply these fixes, a rebuild of the `gtm-app` container is required.
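The `"\n".join([...])` prompt construction mentioned above can be illustrated with a minimal standalone sketch (the prompt text is shortened for brevity):

```python
# Build a multi-line system prompt as a list of single-line strings.
# Each line uses whichever quote style its content does NOT contain,
# so embedded quotes never need escaping and the file always parses.
system_prompt = "\n".join([
    "# IDENTITY & PURPOSE",
    'You are the "GTM Architect Engine" for Roboplanet.',
    "Your top goal is product-market fit and operational feasibility.",
])
print(system_prompt)
```

Unlike one giant f-string, each list element is an independent literal, so a stray quote or brace breaks only a single line instead of the whole block.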
## Next Steps
* **Monitor Logs:** Check `Log_from_docker/` for detailed execution traces of the GTM Architect.
* **Feedback Loop:** Verify the quality of the generated GTM strategies and adjust prompts in `gtm_architect_orchestrator.py` if necessary.


@@ -104,6 +104,9 @@ services:
      - ./gtm_projects.db:/app/gtm_projects.db
      # Mount the API key to the location expected by config.py
      - ./gemini_api_key.txt:/app/api_key.txt
      # Mount Logs
      - ./Log_from_docker:/app/Log_from_docker
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - SERPAPI_API_KEY=${SERPAPI_API_KEY}
      - PYTHONUNBUFFERED=1


@@ -33,6 +33,9 @@ RUN apt-get update && \
COPY gtm-architect/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Install the Google Generative AI library
RUN pip install --no-cache-dir google-generativeai
# Copy the Node.js server script and package.json for runtime deps
COPY gtm-architect/server.cjs .
COPY gtm-architect/package.json .
@@ -45,6 +48,7 @@ COPY --from=frontend-builder /app/dist ./dist
# Copy the Python Orchestrator and shared modules
COPY gtm_architect_orchestrator.py .
COPY gtm_prompts.json .
COPY helpers.py .
COPY config.py .
COPY market_db_manager.py .
@@ -56,4 +60,4 @@ COPY market_db_manager.py .
EXPOSE 3005
# Start the server
CMD ["node", "server.cjs"]


@@ -1,259 +1,431 @@
import argparse
import json
import logging
import re
import sys
import os
import requests
from bs4 import BeautifulSoup
from datetime import datetime
from config import Config

# Append the current directory to sys.path
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

from helpers import call_gemini_flash

# Configure logging to file
LOG_DIR = "Log_from_docker"
if not os.path.exists(LOG_DIR):
    os.makedirs(LOG_DIR)

timestamp = datetime.now().strftime("%Y-%m-%d")
log_file = os.path.join(LOG_DIR, f"{timestamp}_gtm_architect.log")

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file, mode='a', encoding='utf-8'),
        logging.StreamHandler(sys.stderr)
    ]
)

def log_to_stderr(msg):
    sys.stderr.write(f"[GTM-ORCHESTRATOR] {msg}\n")
    sys.stderr.flush()

# --- SCRAPING HELPER ---
def get_text_from_url(url):
    try:
        log_to_stderr(f"Scraping URL: {url}")
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'}
        response = requests.get(url, headers=headers, timeout=15)
        response.raise_for_status()
        # Using html.parser to avoid extra dependencies
        soup = BeautifulSoup(response.content, 'html.parser')
        # Remove noise
        for element in soup(['script', 'style', 'noscript', 'iframe', 'svg', 'header', 'footer', 'nav', 'aside']):
            element.decompose()
        # Get text
        text = soup.get_text(separator=' ', strip=True)
        log_to_stderr(f"Scraping success. Length: {len(text)}")
        return text[:30000]  # Limit length
    except Exception as e:
        log_to_stderr(f"Scraping failed: {e}")
        logging.warning(f"Could not scrape URL {url}: {e}")
        return ""

# --- SYSTEM PROMPTS (Constructed reliably) ---
def get_system_instruction(lang):
    if lang == 'de':
        return "\n".join([
            "# IDENTITY & PURPOSE",
            'Du bist die "GTM Architect Engine" für Roboplanet. Deine Aufgabe ist es, für neue technische Produkte (Roboter) eine präzise Go-to-Market-Strategie zu entwickeln.',
            "Du handelst nicht als kreativer Werbetexter, sondern als strategischer Analyst. Dein oberstes Ziel ist Product-Market-Fit und operative Umsetzbarkeit.",
            "Antworte IMMER auf DEUTSCH.",
            "",
            "# CONTEXT: THE PARENT COMPANY (WACKLER)",
            "Wir sind Teil der Wackler Group, einem großen Facility-Management-Dienstleister.",
            'Unsere Strategie ist NICHT "Roboter ersetzen Menschen", sondern "Hybrid-Reinigung":',
            "- 80% der Arbeit (monotone Flächenleistung) = Roboter.",
            "- 20% der Arbeit (Edge Cases, Winterdienst, Treppen, Grobschmutz) = Manuelle Reinigung durch Wackler.",
            "",
            "# STRICT ANALYSIS RULES (MUST FOLLOW):",
            "1. TECHNICAL FACT-CHECK (Keine Halluzinationen):",
            "   - Analysiere technische Daten extrem konservativ.",
            '   - Vakuumsystem = Kein "Winterdienst" (Schnee) und keine "Schwerindustrie" (Metallspäne), außer explizit genannt.',
            "   - Erfinde keine Features, nur um eine Zielgruppe passend zu machen.",
            "",
            "2. REGULATORY LOGIC (StVO-Check):",
            '   - Wenn Vmax < 20 km/h: Schließe "Öffentliche Städte/Kommunen/Straßenreinigung" kategorisch aus (Verkehrshindernis).',
            '   - Fokusänderung: Konzentriere dich stattdessen ausschließlich auf "Große, zusammenhängende Privatflächen" (Gated Areas).',
            "",
            "3. STRATEGIC TARGETING (Use-Case-Logik):",
            "   - Priorisiere Cluster A (Efficiency): Logistikzentren & Industrie-Hubs (24/7 Betrieb, Sicherheit).",
            "   - Priorisiere Cluster B (Experience): Shopping Center, Outlets & Freizeitparks (Sauberkeit als Visitenkarte).",
            "   - Entferne reine E-Commerce-Händler ohne physische Kundenfläche.",
            "",
            '4. THE "HYBRID SERVICE" LOGIC (RULE 5):',
            'Wann immer du ein "Hartes Constraint" oder eine technische Limitierung identifizierst (z.B. "Kein Winterdienst" oder "Kommt nicht in Ecken"), darfst du dies niemals als reines "Nein" stehen lassen.',
            'Wende stattdessen die **"Yes, and..." Logik** an:',
            '  1. **Identifiziere die Lücke:** (z.B. "Roboter kann bei Schnee nicht fahren").',
            '  2. **Fülle die Lücke mit Service:** Schlage explizit vor, diesen Teil durch "Wackler Human Manpower" abzudecken.',
            '  3. **Formuliere den USP:** Positioniere das Gesamtpaket als "100% Coverage" (Roboter + Mensch aus einer Hand).'
        ])
    else:
        return "\n".join([
            "# IDENTITY & PURPOSE",
            'You are the "GTM Architect Engine" for Roboplanet. Your task is to develop a precise Go-to-Market strategy for new technical products (robots).',
            "You do not act as a creative copywriter, but as a strategic analyst. Your top goal is product-market fit and operational feasibility.",
            "ALWAYS respond in ENGLISH.",
            "",
            "# CONTEXT: THE PARENT COMPANY (WACKLER)",
            "We are part of the Wackler Group, a major facility management service provider.",
            'Our strategy is NOT "Robots replace humans", but "Hybrid Cleaning":',
            "- 80% of work (monotonous area coverage) = Robots.",
            "- 20% of work (Edge cases, winter service, stairs, heavy debris) = Manual cleaning by Wackler.",
            "",
            "# STRICT ANALYSIS RULES (MUST FOLLOW):",
            "1. TECHNICAL FACT-CHECK (No Hallucinations):",
            "   - Analyze technical data extremely conservatively.",
            '   - Vacuum System = No "Winter Service" (snow) and no "Heavy Industry" (metal shavings), unless explicitly stated.',
            "   - Do not invent features just to fit a target audience.",
            "",
            "2. REGULATORY LOGIC (Traffic Regs):",
            '   - If Vmax < 20 km/h: Categorically exclude "Public Cities/Streets" (traffic obstruction).',
            '   - Change Focus: Concentrate exclusively on "Large, contiguous private areas" (Gated Areas).',
            "",
            "3. STRATEGIC TARGETING (Use Case Logic):",
            "   - Prioritize Cluster A (Efficiency): Logistics Centers & Industrial Hubs (24/7 ops, safety).",
            "   - Prioritize Cluster B (Experience): Shopping Centers, Outlets & Theme Parks (Cleanliness as a calling card).",
            "   - Remove pure E-commerce retailers without physical customer areas.",
            "",
            '4. THE "HYBRID SERVICE" LOGIC (RULE 5):',
            'Whenever you identify a "Hard Constraint" or technical limitation (e.g., "No winter service" or "Cannot reach corners"), never let this stand as a simple "No".',
            'Instead, apply the **"Yes, and..." logic**:',
            '  1. **Identify the gap:** (e.g., "Robot cannot operate in snow").',
            '  2. **Fill the gap with service:** Explicitly suggest covering this part with "Wackler Human Manpower".',
            '  3. **Formulate the USP:** Position the total package as "100% Coverage" (Robot + Human from a single source).'
        ])

# --- ORCHESTRATOR LOGIC ---

def analyze_product(product_input, lang):
    # 1. Scraping if the input is a URL
    content = product_input
    if re.match(r'^https?://', product_input.strip()):
        logging.info(f"Detected URL: {product_input}. Scraping...")
        scraped_text = get_text_from_url(product_input.strip())
        if scraped_text:
            content = scraped_text
            logging.info(f"Scraped {len(content)} chars.")
        else:
            logging.warning("Scraping failed, using URL as input.")

    sys_instr = get_system_instruction(lang)

    # 1-A. Extraction
    prompt_extract = "\n".join([
        "PHASE 1-A: TECHNICAL EXTRACTION",
        f'Input Product Description: "{content[:25000]}"',
        "",
        "Task:",
        "1. Extract key technical features (specs, capabilities).",
        '2. Derive "Hard Constraints". IMPORTANT: Check Vmax (<20km/h = Private Grounds) and Cleaning Type (Vacuum != Heavy Debris/Snow).',
        "3. Create a short raw analysis summary.",
        "",
        "Output JSON format ONLY:",
        "{",
        '  "features": ["feature1", "feature2"],',
        '  "constraints": ["constraint1", "constraint2"],',
        '  "rawAnalysis": "summary text"',
        "}"
    ])
    log_to_stderr("Starting Phase 1-A: Technical Extraction...")
    raw_response = call_gemini_flash(prompt_extract, system_instruction=sys_instr, json_mode=True)
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        logging.error(f"Failed to parse Phase 1 JSON: {raw_response}")
        return {"features": [], "constraints": [], "rawAnalysis": "Error parsing AI response."}

    # 1-B. Conflict Check
    prompt_conflict = "\n".join([
        "PHASE 1-B: PORTFOLIO CONFLICT CHECK",
        "",
        f"New Product Features: {json.dumps(data.get('features'))}",
        f"New Product Constraints: {json.dumps(data.get('constraints'))}",
        "",
        "Existing Portfolio:",
        '1. "Indoor Scrubber 50": Indoor cleaning, hard floor, supermarkets.',
        '2. "Service Bot Bella": Service/Gastro, indoor, restaurants.',
        "",
        "Task:",
        "Check if the new product overlaps significantly with existing ones (is it just a clone?).",
        "",
        "Output JSON format ONLY:",
        "{",
        '  "conflictCheck": {',
        '    "hasConflict": true/false,',
        '    "details": "explanation",',
        '    "relatedProduct": "name or null"',
        "  }",
        "}"
    ])

    log_to_stderr("Starting Phase 1-B: Conflict Check...")
    conflict_response = call_gemini_flash(prompt_conflict, system_instruction=sys_instr, json_mode=True)
    try:
        conflict_data = json.loads(conflict_response)
        data.update(conflict_data)
    except json.JSONDecodeError:
        pass  # Ignore conflict check errors
    return data

def discover_icps(phase1_result, lang):
    sys_instr = get_system_instruction(lang)
    prompt = "\n".join([
        "PHASE 2: ICP DISCOVERY & DATA PROXIES",
        f"Based on the product features: {json.dumps(phase1_result.get('features'))}",
        f"And constraints: {json.dumps(phase1_result.get('constraints'))}",
        "",
        "Task:",
        "1. Negative Selection: Which industries are impossible? (Remember Vmax & Vacuum rules!)",
        "2. High Pain: Identify Cluster A (Logistics/Industry) and Cluster B (Shopping/Outlets).",
        "3. Data Proxy Generation: How to find them digitally via data traces (e.g. satellite, registries).",
        "",
        "Output JSON format ONLY:",
        "{",
        '  "icps": [',
        '    { "name": "Industry Name", "rationale": "Why this is a good fit" }',
        "  ],",
        '  "dataProxies": [',
        '    { "target": "Specific criteria", "method": "How to find" }',
        "  ]",
        "}"
    ])
    log_to_stderr("Starting Phase 2: ICP Discovery...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)

def hunt_whales(phase2_result, lang):
    sys_instr = get_system_instruction(lang)
    prompt = "\n".join([
        "PHASE 3: WHALE HUNTING",
        f"Target ICPs (Industries): {json.dumps(phase2_result.get('icps'))}",
        "",
        "Task:",
        "1. Group 'Whales' (Key Accounts) strictly by the identified ICP industries.",
        "2. Identify 3-5 concrete top companies in the DACH market per industry.",
        "3. Define Buying Center Roles.",
        "",
        "Output JSON format ONLY:",
        "{",
        '  "whales": [',
        '    { "industry": "Name of ICP Industry", "accounts": ["Company A", "Company B"] }',
        "  ],",
        '  "roles": ["Job Title 1", "Job Title 2"]',
        "}"
    ])
    log_to_stderr("Starting Phase 3: Whale Hunting...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)

def develop_strategy(phase3_result, phase1_result, lang):
    sys_instr = get_system_instruction(lang)

    all_accounts = []
    for w in phase3_result.get('whales', []):
        all_accounts.extend(w.get('accounts', []))

    prompt = "\n".join([
        "PHASE 4: STRATEGY & ANGLE DEVELOPMENT",
        f"Accounts: {json.dumps(all_accounts)}",
        f"Product Features: {json.dumps(phase1_result.get('features'))}",
        "",
        "Task:",
        "1. Develop a specific 'Angle' per target/industry.",
        "2. Consistency Check against the Product Matrix.",
        '3. **IMPORTANT:** Apply the "Hybrid Service Logic" if technical constraints exist!',
        "",
        "Output JSON format ONLY:",
        "{",
        '  "strategyMatrix": [',
        "    {",
        '      "segment": "Target Segment",',
        '      "painPoint": "Specific Pain",',
        '      "angle": "Our Marketing Angle",',
        '      "differentiation": "How it differs"',
        "    }",
        "  ]",
        "}"
    ])
    log_to_stderr("Starting Phase 4: Strategy...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)

def generate_assets(phase4_result, phase3_result, phase2_result, phase1_result, lang):
    sys_instr = get_system_instruction(lang)
    prompt = "\n".join([
        "PHASE 5: ASSET GENERATION & FINAL REPORT",
        "",
        "CONTEXT DATA:",
        f"- Technical: {json.dumps(phase1_result)}",
        f"- ICPs: {json.dumps(phase2_result)}",
        f"- Targets (Whales): {json.dumps(phase3_result)}",
        f"- Strategy: {json.dumps(phase4_result)}",
        "",
        "TASK:",
        '1. Create a "GTM STRATEGY REPORT" in Markdown.',
        "2. Report Structure: Executive Summary, Product Analysis, Target Audience, Target Accounts, Strategy Matrix, Assets.",
        '3. Hybrid-Check: Ensure the "Hybrid Service Logic" is visible.',
        "",
        "Output:",
        'Return strictly MARKDOWN formatted text. Start with "# GTM STRATEGY REPORT".'
    ])
    # For Phase 5 we expect TEXT (Markdown), not JSON, so json_mode=False.
    log_to_stderr("Starting Phase 5: Asset Generation...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=False)
    # The frontend expects a plain string here, not a wrapping JSON object.
    return response

def generate_sales_enablement(phase4_result, phase3_result, phase1_result, lang):
    sys_instr = get_system_instruction(lang)
    prompt = "\n".join([
        "PHASE 6: SALES ENABLEMENT & VISUALS",
        "",
        "CONTEXT:",
        f"- Product Features: {json.dumps(phase1_result.get('features'))}",
        f"- Accounts (Personas): {json.dumps(phase3_result.get('roles'))}",
        f"- Strategy: {json.dumps(phase4_result.get('strategyMatrix'))}",
        "",
        "TASK:",
        "1. Anticipate Friction & Objections.",
        "2. Formulate Battlecards.",
        "3. Create Visual Prompts.",
        "",
        "Output JSON format ONLY:",
        "{",
        '  "battlecards": [',
        "    {",
        '      "persona": "Role",',
        '      "objection": "Objection quote",',
        '      "responseScript": "Response"',
        "    }",
        "  ],",
        '  "visualPrompts": [',
        "    {",
        '      "title": "Title",',
        '      "context": "Context",',
        '      "prompt": "Prompt Code"',
        "    }",
        "  ]",
        "}"
    ])
    log_to_stderr("Starting Phase 6: Sales Enablement...")
    response = call_gemini_flash(prompt, system_instruction=sys_instr, json_mode=True)
    return json.loads(response)

# --- MAIN ---
def main():
    log_to_stderr("--- GTM Orchestrator Starting ---")

    # --- CRITICAL FIXES FOR API KEY & SCRAPING ---
    # 1. Load API keys manually because helpers.py relies on Config class state
    try:
        Config.load_api_keys()
        log_to_stderr("API Keys loaded.")
        logging.info("Config.load_api_keys() called successfully.")
    except Exception as e:
        log_to_stderr(f"CRITICAL: Failed to load API keys: {e}")
        logging.critical(f"Failed to load API keys: {e}")
    # ---------------------------------------------

    parser = argparse.ArgumentParser()
    parser.add_argument('--mode', required=True)
    parser.add_argument('--data', required=True)

    try:
        args = parser.parse_args()
        data_in = json.loads(args.data)
        mode = args.mode
        lang = data_in.get('language', 'de')

        log_to_stderr(f"Processing mode: {mode} in language: {lang}")
        logging.info(f"Processing mode: {mode} in language: {lang}")

        result = {}
        if mode == 'analyze_product':
            product_input = data_in.get('productInput')
            result = analyze_product(product_input, lang)
        elif mode == 'discover_icps':
            phase1_result = data_in.get('phase1Result')
            result = discover_icps(phase1_result, lang)
        elif mode == 'hunt_whales':
            phase2_result = data_in.get('phase2Result')
            result = hunt_whales(phase2_result, lang)
        elif mode == 'develop_strategy':
            phase3_result = data_in.get('phase3Result')
            phase1_result = data_in.get('phase1Result')
            result = develop_strategy(phase3_result, phase1_result, lang)
        elif mode == 'generate_assets':
            phase4_result = data_in.get('phase4Result')
            phase3_result = data_in.get('phase3Result')
            phase2_result = data_in.get('phase2Result')
            phase1_result = data_in.get('phase1Result')
            # Returns a string (Markdown)
            markdown_report = generate_assets(phase4_result, phase3_result, phase2_result, phase1_result, lang)
            print(json.dumps(markdown_report))
            log_to_stderr("Finished Phase 5. Output sent to stdout.")
            return
        elif mode == 'generate_sales_enablement':
            phase4_result = data_in.get('phase4Result')
            phase3_result = data_in.get('phase3Result')
            phase1_result = data_in.get('phase1Result')
            result = generate_sales_enablement(phase4_result, phase3_result, phase1_result, lang)
        else:
            logging.error(f"Unknown mode: {mode}")
            result = {"error": f"Unknown mode: {mode}"}

        print(json.dumps(result))
        log_to_stderr("Finished. Output sent to stdout.")
    except Exception as e:
        log_to_stderr(f"CRITICAL ERROR: {e}")
        logging.error(f"Error in orchestrator: {e}", exc_info=True)
        # Return error as JSON so server.cjs can handle it gracefully
        print(json.dumps({"error": str(e)}))
        sys.exit(1)

if __name__ == "__main__":
    main()
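Even with JSON mode enabled, model output occasionally arrives wrapped in Markdown code fences, in which case the `json.loads` calls above raise `JSONDecodeError`. A small defensive parser can strip such fences before parsing; this is a sketch of the general technique, not part of the current codebase:

```python
import json
import re

def parse_model_json(raw: str):
    """Parse a model response as JSON, tolerating ```json ... ``` fences."""
    text = raw.strip()
    # Strip an enclosing ```json / ``` fence pair if present.
    fence = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    return json.loads(text)

print(parse_model_json('```json\n{"features": ["autonomous"]}\n```'))
```

Dropping such a helper in place of the bare `json.loads` calls would make each phase tolerant of the most common formatting slip without changing the happy path.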

helpers.py (1572)
File diff suppressed because one or more lines are too long