feat(ca): Finalize v5 pipeline - Hybrid Matrix, CoT Enrichment & User Repair Mode

This commit is contained in:
2026-01-12 15:29:43 +00:00
parent b5e6c415c7
commit 70e689384e
15 changed files with 547 additions and 1224 deletions

View File

@@ -81,6 +81,59 @@ Um die Analyse-Ergebnisse optimal nutzbar zu machen, wurde ein intelligenter Imp
- *Cleaning (Indoor/Outdoor), Transport/Logistics, Service/Gastro, Security, Software*. - *Cleaning (Indoor/Outdoor), Transport/Logistics, Service/Gastro, Security, Software*.
* **Dual-Way Relations:** Alle Datenbanken sind bidirektional verknüpft. Auf einer Produktkarte sieht man sofort den Hersteller; auf einer Herstellerkarte sieht man das gesamte (kategorisierte) Portfolio. * **Dual-Way Relations:** Alle Datenbanken sind bidirektional verknüpft. Auf einer Produktkarte sieht man sofort den Hersteller; auf einer Herstellerkarte sieht man das gesamte (kategorisierte) Portfolio.
--- ### 🚀 Next-Gen Analyse-Strategie (v5) - Beschluss vom 12. Jan. 2026
*Dokumentation aktualisiert am 11.01.2026 nach Implementierung der semantischen Klassifizierung (Level 4).*
Die Analyse der v4-Ergebnisse zeigte fundamentale Schwächen im monolithischen Analyse-Ansatz. Probleme wie die falsche Gruppierung von Produktlisten (z.B. "A, B, C" als ein Produkt) und die fehlende Differenzierung desselben Produkts über verschiedene Anbieter hinweg (z.B. "BellaBot" vs. "Pudu Bella Bot") machten eine strategische Neuausrichtung notwendig.
Der neue Prozess basiert auf einer **"Extract-Normalize-Enrich" Pipeline in drei Phasen:**
**Phase 1: Datensammlung (pro Wettbewerber)**
1. **Gezieltes Scraping:** Fokussierte Suche nach Seiten mit Keywords wie "Produkte", "Portfolio", "Lösungen". Der Rohtext dieser Seiten wird extrahiert.
2. **Rohe Produkt-Extraktion (1. LLM-Call):** Ein einfacher, isolierter Prompt extrahiert nur eine unstrukturierte Liste potenzieller Produktnamen.
3. **Deterministische Säuberung (Python):** Die rohe Liste wird per Code aufbereitet (Trennung bei Kommas, "und", etc.), um einzelne Produkte zu isolieren.
4. **Allgemeine Analyse (2. LLM-Call):** Ein paralleler Prompt analysiert den Rohtext auf allgemeine, nicht-produktbezogene Merkmale (`delivery_model`, `target_industries`, `differentiators`).
**Phase 2: Marktweite Produkt-Kanonisierung (Global & Einmalig)**
5. **Master-Liste erstellen:** Die bereinigten Produktlisten aller Wettbewerber werden zu einer globalen Master-Liste zusammengefasst.
6. **Grounded Truth (Hersteller-Daten):** Eine vordefinierte, kanonische Produktliste der wichtigsten Hersteller (Pudu, Gausium etc.) wird als Referenz geladen.
7. **Kanonisierungs-Prompt (3. LLM-Call):** Ein einziger, globaler Call gleicht die Master-Liste mit der "Grounded Truth" der Hersteller ab. Er gruppiert alle Namensvarianten ("Bella Bot", "Pudu BellaBot") unter einem kanonischen Namen ("BellaBot").
**Phase 3: Gezielte Anreicherung & Assemblierung (pro Wettbewerber)**
8. **Portfolio zusammenstellen:** Das System iteriert durch die kanonische Produkt-Map aus Phase 2.
9. **Anreicherungs-Prompt (Serie von Micro-LLM-Calls):** Für jedes kanonische Produkt, das ein Wettbewerber anbietet, wird ein kleiner, gezielter Prompt abgesetzt, um spezifische Details (`purpose`, `category`) aus dem ursprünglichen Rohtext zu extrahieren.
10. **Finale Assemblierung:** Die Ergebnisse der allgemeinen Analyse (Schritt 4) und des angereicherten Portfolios (Schritt 9) werden zum finalen, sauberen und normalisierten JSON-Objekt für den Wettbewerber zusammengefügt.
Dieser Ansatz trennt klar zwischen Datensammlung, marktweiter Normalisierung und spezifischer Anreicherung, was zu deutlich präziseren, konsistenteren und wertvolleren Analyse-Ergebnissen führt.
### ✅ Status: Jan 12, 2026 - v5 STABLE & PRODUCTION READY
Die Implementierung der v5-Pipeline ist abgeschlossen. Die initialen Kinderkrankheiten (SDK-Konflikte, Token-Limits bei Matrizen) wurden durch radikale Architektur-Entscheidungen behoben.
**Die 3 Säulen der finalen Lösung:**
1. **Hybrid Intelligence (The "Matrix Fix"):**
* **Problem:** LLMs scheiterten daran, große Matrizen (Produkte x Wettbewerber) konsistent als JSON zu generieren (Abbruch wegen Token-Limit).
* **Lösung:** **Python übernimmt die Struktur, KI den Inhalt.** Die Matrizen (`product_matrix`, `industry_matrix`) werden nun **deterministisch im Python-Code berechnet**, basierend auf den in Phase 3 gesammelten Daten. Die KI wird nur noch für die *textliche* Zusammenfassung (`summary`, `opportunities`) gerufen. Das Ergebnis ist 100% fehlerfrei und stabil.
2. **User-In-The-Loop (Data Rescue):**
* **Feature:** In Schritt 4 wurde ein "Repair"-Modus implementiert. Nutzer können nun für jeden Wettbewerber **manuelle Produkt-URLs** nachreichen und eine **isolierte Neu-Analyse** (Re-Scrape & Re-Enrich) für diesen einen Wettbewerber auslösen, ohne den gesamten Prozess neu zu starten. Dies löst das Problem von "versteckten" Produktseiten (z.B. Giobotics).
3. **Quality Boost (CoT):**
* **Feature:** Der Enrichment-Prompt (Phase 3) nutzt nun **Chain-of-Thought (CoT)**. Anstatt stumpf Daten zu extrahieren, wird die KI angewiesen: *"Suche alle Erwähnungen -> Synthetisiere eine Beschreibung -> Bestimme die Kategorie"*. Dies hat die inhaltliche Qualität der Produktbeschreibungen wieder auf das hohe Niveau von v2 gehoben, bei gleichzeitiger Beibehaltung der sauberen Struktur von v3.
**Fazit:** Der Competitor Analysis Agent ist nun ein robustes, professionelles BI-Tool, das bereit für den produktiven Einsatz und den Import nach Notion ist.
### 🏆 Final Validation (Jan 12, 2026 - 14:00)
Der Validierungslauf mit `analysis_robo-planet.de-4.json` bestätigt den Erfolg aller Maßnahmen:
* **Lückenlose Erfassung:** Dank des "Repair Mode" (Manuelle URLs) konnte das Portfolio von *Giobotics*, das zuvor leer war, vollständig mit 9 Produkten (PuduBot 2, KettyBot, etc.) erfasst werden.
* **Hohe Datentiefe:** Die Produktbeschreibungen sind dank **Chain-of-Thought** detailliert und wertstiftend (z.B. genaue Funktionsweise des *Phantas* Roboters bei TCO Robotics), statt nur generische Einzeiler zu sein.
* **Perfekte Matrizen:** Die Python-generierten Matrizen sind vollständig und fehlerfrei. Das Token-Limit-Problem gehört der Vergangenheit an.
* **Stabilität:** Der gesamte Prozess lief ohne Abstürze oder API-Fehler durch.
**Status:****MIGRATION COMPLETE & VERIFIED.**
---
*Dokumentation finalisiert am 12.01.2026.*

12
check_syntax.py Normal file
View File

@@ -0,0 +1,12 @@
import py_compile
import sys
try:
py_compile.compile('/app/competitor-analysis-app/competitor_analysis_orchestrator.py', doraise=True)
print("Syntax OK")
except py_compile.PyCompileError as e:
print(f"Syntax Error: {e}")
sys.exit(1)
except Exception as e:
print(f"General Error: {e}")
sys.exit(1)

5
commit.sh Normal file
View File

@@ -0,0 +1,5 @@
#!/bin/bash
git status
git add .
git commit -m "fix(competitor-analysis): final migration fixes and documentation updates"
git push origin main

View File

@@ -1,6 +1,6 @@
import React, { useState, useCallback, useEffect, useRef } from 'react'; import React, { useState, useCallback, useEffect, useRef } from 'react';
import type { AppState, CompetitorCandidate, Product, TargetIndustry, Keyword, SilverBullet, Battlecard, ReferenceAnalysis } from './types'; import type { AppState, CompetitorCandidate, Product, TargetIndustry, Keyword, SilverBullet, Battlecard, ReferenceAnalysis } from './types';
import { fetchStep1Data, fetchStep2Data, fetchStep3Data, fetchStep4Data, fetchStep5Data_SilverBullets, fetchStep6Data_Conclusion, fetchStep7Data_Battlecards, fetchStep8Data_ReferenceAnalysis } from './services/geminiService'; import { fetchStep1Data, fetchStep2Data, fetchStep3Data, fetchStep4Data, fetchStep5Data_SilverBullets, fetchStep6Data_Conclusion, fetchStep7Data_Battlecards, fetchStep8Data_ReferenceAnalysis, reanalyzeCompetitor } from './services/geminiService';
import { generatePdfReport } from './services/pdfService'; import { generatePdfReport } from './services/pdfService';
import InputForm from './components/InputForm'; import InputForm from './components/InputForm';
import StepIndicator from './components/StepIndicator'; import StepIndicator from './components/StepIndicator';
@@ -91,6 +91,15 @@ const App: React.FC = () => {
} }
}, []); }, []);
const handleUpdateAnalysis = useCallback((index: number, updatedAnalysis: any) => {
setAppState(prevState => {
if (!prevState) return null;
const newAnalyses = [...prevState.analyses];
newAnalyses[index] = updatedAnalysis;
return { ...prevState, analyses: newAnalyses };
});
}, []);
const handleConfirmStep = useCallback(async () => { const handleConfirmStep = useCallback(async () => {
if (!appState) return; if (!appState) return;
@@ -112,7 +121,11 @@ const App: React.FC = () => {
case 3: case 3:
const shortlist = [...appState.competitor_candidates] const shortlist = [...appState.competitor_candidates]
.sort((a, b) => b.confidence - a.confidence) .sort((a, b) => b.confidence - a.confidence)
.slice(0, appState.initial_params.max_competitors); .slice(0, appState.initial_params.max_competitors)
.map(c => ({
...c,
manual_urls: c.manual_urls ? c.manual_urls.split('\n').map(u => u.trim()).filter(u => u) : []
}));
const { analyses } = await fetchStep4Data(appState.company, shortlist, lang); const { analyses } = await fetchStep4Data(appState.company, shortlist, lang);
newState = { competitors_shortlist: shortlist, analyses, step: 4 }; newState = { competitors_shortlist: shortlist, analyses, step: 4 };
break; break;
@@ -158,7 +171,7 @@ const App: React.FC = () => {
case 1: return <Step1Extraction products={appState.products} industries={appState.target_industries} onProductsChange={(p) => handleUpdateState('products', p)} onIndustriesChange={(i) => handleUpdateState('target_industries', i)} t={t.step1} lang={appState.initial_params.language} />; case 1: return <Step1Extraction products={appState.products} industries={appState.target_industries} onProductsChange={(p) => handleUpdateState('products', p)} onIndustriesChange={(i) => handleUpdateState('target_industries', i)} t={t.step1} lang={appState.initial_params.language} />;
case 2: return <Step2Keywords keywords={appState.keywords} onKeywordsChange={(k) => handleUpdateState('keywords', k)} t={t.step2} />; case 2: return <Step2Keywords keywords={appState.keywords} onKeywordsChange={(k) => handleUpdateState('keywords', k)} t={t.step2} />;
case 3: return <Step3Competitors candidates={appState.competitor_candidates} onCandidatesChange={(c) => handleUpdateState('competitor_candidates', c)} maxCompetitors={appState.initial_params.max_competitors} t={t.step3} />; case 3: return <Step3Competitors candidates={appState.competitor_candidates} onCandidatesChange={(c) => handleUpdateState('competitor_candidates', c)} maxCompetitors={appState.initial_params.max_competitors} t={t.step3} />;
case 4: return <Step4Analysis analyses={appState.analyses} t={t.step4} />; case 4: return <Step4Analysis analyses={appState.analyses} company={appState.company} onAnalysisUpdate={handleUpdateAnalysis} t={t.step4} />;
case 5: return <Step5SilverBullets silver_bullets={appState.silver_bullets} t={t.step5} />; case 5: return <Step5SilverBullets silver_bullets={appState.silver_bullets} t={t.step5} />;
case 6: return <Step6Conclusion appState={appState} t={t.step6} />; case 6: return <Step6Conclusion appState={appState} t={t.step6} />;
case 7: return <Step7_Battlecards appState={appState} t={t.step7} />; case 7: return <Step7_Battlecards appState={appState} t={t.step7} />;

File diff suppressed because it is too large Load Diff

View File

@@ -27,9 +27,10 @@ const Step3Competitors: React.FC<Step3CompetitorsProps> = ({ candidates, onCandi
fieldConfigs={[ fieldConfigs={[
{ key: 'name', label: t.nameLabel, type: 'text' }, { key: 'name', label: t.nameLabel, type: 'text' },
{ key: 'url', label: 'URL', type: 'text' }, { key: 'url', label: 'URL', type: 'text' },
{ key: 'manual_urls', label: t.manualUrlsLabel, type: 'textarea' },
{ key: 'why', label: t.whyLabel, type: 'textarea' }, { key: 'why', label: t.whyLabel, type: 'textarea' },
]} ]}
newItemTemplate={{ name: '', url: '', confidence: 0.8, why: '', evidence: [] }} newItemTemplate={{ name: '', url: '', confidence: 0.8, why: '', evidence: [], manual_urls: '' }}
renderDisplay={(item, index) => ( renderDisplay={(item, index) => (
<div> <div>
<div className="flex items-center justify-between"> <div className="flex items-center justify-between">
@@ -39,6 +40,11 @@ const Step3Competitors: React.FC<Step3CompetitorsProps> = ({ candidates, onCandi
{t.visitButton} {t.visitButton}
</a> </a>
<EvidencePopover evidence={item.evidence} /> <EvidencePopover evidence={item.evidence} />
{item.manual_urls && item.manual_urls.trim() && (
<span className="ml-2 inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-blue-100 text-blue-800 dark:bg-blue-900 dark:text-blue-200">
{item.manual_urls.split('\n').filter(u => u.trim()).length} Manual URLs
</span>
)}
</div> </div>
<span className="text-xs font-mono bg-light-accent dark:bg-brand-accent text-light-text dark:text-white py-1 px-2 rounded-full"> <span className="text-xs font-mono bg-light-accent dark:bg-brand-accent text-light-text dark:text-white py-1 px-2 rounded-full">
{(item.confidence * 100).toFixed(0)}% {(item.confidence * 100).toFixed(0)}%

View File

@@ -1,13 +1,17 @@
import React from 'react'; import React, { useState } from 'react';
import type { Analysis } from '../types'; import type { Analysis, AppState } from '../types';
import EvidencePopover from './EvidencePopover'; import EvidencePopover from './EvidencePopover';
import { reanalyzeCompetitor } from '../services/geminiService';
interface Step4AnalysisProps { interface Step4AnalysisProps {
analyses: Analysis[]; analyses: Analysis[];
company: AppState['company'];
onAnalysisUpdate: (index: number, analysis: Analysis) => void;
t: any; t: any;
} }
const DownloadIcon = () => (<svg xmlns="http://www.w3.org/2000/svg" className="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor"><path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4" /></svg>); const DownloadIcon = () => (<svg xmlns="http://www.w3.org/2000/svg" className="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor"><path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4" /></svg>);
const RefreshIcon = () => (<svg xmlns="http://www.w3.org/2000/svg" className="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor"><path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15" /></svg>);
const downloadJSON = (data: any, filename: string) => { const downloadJSON = (data: any, filename: string) => {
const jsonStr = JSON.stringify(data, null, 2); const jsonStr = JSON.stringify(data, null, 2);
@@ -28,8 +32,46 @@ const OverlapBar: React.FC<{ score: number }> = ({ score }) => (
</div> </div>
); );
const Step4Analysis: React.FC<Step4AnalysisProps> = ({ analyses, t }) => { const Step4Analysis: React.FC<Step4AnalysisProps> = ({ analyses, company, onAnalysisUpdate, t }) => {
const sortedAnalyses = [...analyses].sort((a, b) => b.overlap_score - a.overlap_score); const sortedAnalyses = [...analyses].sort((a, b) => b.overlap_score - a.overlap_score);
const [editingIndex, setEditingIndex] = useState<number | null>(null);
const [manualUrls, setManualUrls] = useState<string>("");
const [isReanalyzing, setIsReanalyzing] = useState<boolean>(false);
const handleEditStart = (index: number) => {
setEditingIndex(index);
setManualUrls(""); // Reset
};
const handleReanalyze = async (index: number, competitor: Analysis['competitor']) => {
setIsReanalyzing(true);
try {
const urls = manualUrls.split('\n').map(u => u.trim()).filter(u => u);
// Construct a partial CompetitorCandidate object as expected by the service
const candidate = {
name: competitor.name,
url: competitor.url,
confidence: 0, // Not needed for re-analysis
why: "",
evidence: []
};
const updatedAnalysis = await reanalyzeCompetitor(company, candidate, urls);
// Find the original index in the unsorted 'analyses' array to update correctly
const originalIndex = analyses.findIndex(a => a.competitor.name === competitor.name);
if (originalIndex !== -1) {
onAnalysisUpdate(originalIndex, updatedAnalysis);
}
setEditingIndex(null);
} catch (error) {
console.error("Re-analysis failed:", error);
alert("Fehler bei der Re-Analyse. Bitte Logs prüfen.");
} finally {
setIsReanalyzing(false);
}
};
return ( return (
<div> <div>
@@ -49,6 +91,13 @@ const Step4Analysis: React.FC<Step4AnalysisProps> = ({ analyses, t }) => {
</a> </a>
</div> </div>
<div className="flex items-center space-x-2"> <div className="flex items-center space-x-2">
<button
onClick={() => handleEditStart(index)}
title="Daten anreichern / Korrigieren"
className="text-light-subtle dark:text-brand-light hover:text-blue-500 p-1 rounded-full"
>
<RefreshIcon />
</button>
<button <button
onClick={() => downloadJSON(analysis, `analysis_${analysis.competitor.name.replace(/ /g, '_')}.json`)} onClick={() => downloadJSON(analysis, `analysis_${analysis.competitor.name.replace(/ /g, '_')}.json`)}
title={t.downloadJson_title} title={t.downloadJson_title}
@@ -60,6 +109,39 @@ const Step4Analysis: React.FC<Step4AnalysisProps> = ({ analyses, t }) => {
</div> </div>
</div> </div>
{/* Re-Analysis UI */}
{editingIndex === index && (
<div className="mb-6 bg-light-primary dark:bg-brand-primary p-4 rounded-md border border-brand-highlight">
<h4 className="font-bold text-sm mb-2">Manuelle Produkt-URLs ergänzen (optional)</h4>
<p className="text-xs text-light-subtle dark:text-brand-light mb-2">
Falls Produkte fehlen, fügen Sie hier direkte Links zu den Produktseiten ein (eine pro Zeile).
</p>
<textarea
className="w-full bg-light-secondary dark:bg-brand-secondary text-light-text dark:text-brand-text border border-light-accent dark:border-brand-accent rounded-md px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-brand-highlight mb-2"
rows={3}
placeholder="https://example.com/product-a&#10;https://example.com/product-b"
value={manualUrls}
onChange={(e) => setManualUrls(e.target.value)}
/>
<div className="flex justify-end space-x-2">
<button
onClick={() => setEditingIndex(null)}
className="px-3 py-1 text-sm bg-gray-500 hover:bg-gray-600 text-white rounded"
disabled={isReanalyzing}
>
Abbrechen
</button>
<button
onClick={() => handleReanalyze(index, analysis.competitor)}
className="px-3 py-1 text-sm bg-brand-highlight hover:bg-blue-600 text-white rounded flex items-center"
disabled={isReanalyzing}
>
{isReanalyzing ? "Analysiere..." : "Neu scrapen & Analysieren"}
</button>
</div>
</div>
)}
<div className="grid grid-cols-1 md:grid-cols-2 gap-6"> <div className="grid grid-cols-1 md:grid-cols-2 gap-6">
<div> <div>
<h4 className="font-semibold mb-2">{t.portfolio}</h4> <h4 className="font-semibold mb-2">{t.portfolio}</h4>

View File

@@ -72,6 +72,20 @@ export const fetchStep4Data = async (company: AppState['company'], competitors:
return response.json(); return response.json();
}; };
export const reanalyzeCompetitor = async (company: AppState['company'], competitor: CompetitorCandidate, manualUrls: string[]): Promise<Analysis> => {
const response = await fetch(`${API_BASE_URL}api/reanalyzeCompetitor`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ company, competitor, manual_urls: manualUrls }),
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
return response.json();
};
export const fetchStep5Data_SilverBullets = async (company: AppState['company'], analyses: Analysis[], language: 'de' | 'en'): Promise<{ silver_bullets: SilverBullet[] }> => { export const fetchStep5Data_SilverBullets = async (company: AppState['company'], analyses: Analysis[], language: 'de' | 'en'): Promise<{ silver_bullets: SilverBullet[] }> => {
const response = await fetch(`${API_BASE_URL}api/fetchStep5Data_SilverBullets`, { const response = await fetch(`${API_BASE_URL}api/fetchStep5Data_SilverBullets`, {
method: 'POST', method: 'POST',

View File

@@ -58,6 +58,7 @@ export const translations = {
cardTitle: "Wettbewerber-Kandidaten", cardTitle: "Wettbewerber-Kandidaten",
nameLabel: "Name", nameLabel: "Name",
whyLabel: "Begründung", whyLabel: "Begründung",
manualUrlsLabel: "Zusätzliche Produkt-URLs (eine pro Zeile)",
visitButton: "Besuchen", visitButton: "Besuchen",
shortlistBoundary: "SHORTLIST-GRENZE", shortlistBoundary: "SHORTLIST-GRENZE",
editableCard: { add: "Hinzufügen", cancel: "Abbrechen", save: "Speichern" } editableCard: { add: "Hinzufügen", cancel: "Abbrechen", save: "Speichern" }
@@ -200,6 +201,7 @@ export const translations = {
cardTitle: "Competitor Candidates", cardTitle: "Competitor Candidates",
nameLabel: "Name", nameLabel: "Name",
whyLabel: "Justification", whyLabel: "Justification",
manualUrlsLabel: "Additional Product URLs (one per line)",
visitButton: "Visit", visitButton: "Visit",
shortlistBoundary: "SHORTLIST BOUNDARY", shortlistBoundary: "SHORTLIST BOUNDARY",
editableCard: { add: "Add", cancel: "Cancel", save: "Save" } editableCard: { add: "Add", cancel: "Cancel", save: "Save" }

View File

@@ -27,6 +27,7 @@ export interface CompetitorCandidate {
confidence: number; confidence: number;
why: string; why: string;
evidence: Evidence[]; evidence: Evidence[];
manual_urls?: string;
} }
export interface ShortlistedCompetitor { export interface ShortlistedCompetitor {

View File

@@ -4,15 +4,15 @@ import requests
import sys import sys
# Configuration # Configuration
JSON_FILE = 'analysis_robo-planet.de.json' JSON_FILE = 'analysis_robo-planet.de-2.json'
TOKEN_FILE = 'notion_token.txt' TOKEN_FILE = 'notion_token.txt'
PARENT_PAGE_ID = "2e088f42-8544-8024-8289-deb383da3818" PARENT_PAGE_ID = "2e088f42-8544-8024-8289-deb383da3818"
# Database Titles # Database Titles
DB_TITLE_HUB = "📦 Competitive Radar (Companies) v4" DB_TITLE_HUB = "📦 Competitive Radar (Companies) v5"
DB_TITLE_LANDMINES = "💣 Competitive Radar (Landmines) v4" DB_TITLE_LANDMINES = "💣 Competitive Radar (Landmines) v5"
DB_TITLE_REFS = "🏆 Competitive Radar (References) v4" DB_TITLE_REFS = "🏆 Competitive Radar (References) v5"
DB_TITLE_PRODUCTS = "🤖 Competitive Radar (Products) v4" DB_TITLE_PRODUCTS = "🤖 Competitive Radar (Products) v5"
def load_json_data(filepath): def load_json_data(filepath):
with open(filepath, 'r') as f: with open(filepath, 'r') as f:
@@ -80,7 +80,7 @@ def main():
for prod in analysis.get('portfolio', []): for prod in analysis.get('portfolio', []):
p_props = { p_props = {
"Product": {"title": [{"text": {"content": prod['product'][:100]}}]}, "Product": {"title": [{"text": {"content": prod['product'][:100]}}]},
"Category": {"select": {"name": prod.get('category', 'Other')}}, "Category": {"select": {"name": prod.get('purpose', 'Other')[:100]}},
"Related Competitor": {"relation": [{"id": pid}]} "Related Competitor": {"relation": [{"id": pid}]}
} }
create_page(token, prod_id, p_props) create_page(token, prod_id, p_props)

View File

@@ -1,200 +0,0 @@
import json
import os
import requests
import sys
# Configuration
JSON_FILE = 'analysis_robo-planet.de.json'
TOKEN_FILE = 'notion_token.txt'
# Root Page ID from notion_integration.md
PARENT_PAGE_ID = "2e088f42-8544-8024-8289-deb383da3818"
DB_TITLE = "Competitive Radar 🎯"
def load_json_data(filepath):
try:
with open(filepath, 'r') as f:
return json.load(f)
except Exception as e:
print(f"Error loading JSON: {e}")
sys.exit(1)
def load_notion_token(filepath):
try:
with open(filepath, 'r') as f:
return f.read().strip()
except Exception as e:
print(f"Error loading token: {e}")
sys.exit(1)
def create_competitor_database(token, parent_page_id):
url = "https://api.notion.com/v1/databases"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
payload = {
"parent": {"type": "page_id", "page_id": parent_page_id},
"title": [{"type": "text", "text": {"content": DB_TITLE}}],
"properties": {
"Competitor Name": {"title": {}},
"Website": {"url": {}},
"Target Industries": {"multi_select": {}},
"USPs / Differentiators": {"rich_text": {}},
"Silver Bullet": {"rich_text": {}},
"Landmines": {"rich_text": {}},
"Strengths vs Weaknesses": {"rich_text": {}},
"Portfolio": {"rich_text": {}},
"Known References": {"rich_text": {}}
}
}
print(f"Creating database '{DB_TITLE}'...")
response = requests.post(url, headers=headers, json=payload)
if response.status_code != 200:
print(f"Error creating database: {response.status_code}")
print(response.text)
sys.exit(1)
db_data = response.json()
print(f"Database created successfully! ID: {db_data['id']}")
return db_data['id']
def format_list_as_bullets(items):
"""Converts a python list of strings into a Notion rich_text text block with bullets."""
if not items:
return ""
text_content = ""
for item in items:
text_content += f"{item}\n"
return text_content.strip()
def get_competitor_data(data, comp_name):
"""Aggregates data from different sections of the JSON for a single competitor."""
# Init structure
comp_data = {
"name": comp_name,
"url": "",
"industries": [],
"differentiators": [],
"portfolio": [],
"silver_bullet": "",
"landmines": [],
"strengths_weaknesses": [],
"references": []
}
# 1. Basic Info & Portfolio (from 'analyses')
for analysis in data.get('analyses', []):
c = analysis.get('competitor', {})
if c.get('name') == comp_name:
comp_data['url'] = c.get('url', '')
comp_data['industries'] = analysis.get('target_industries', [])
comp_data['differentiators'] = analysis.get('differentiators', [])
# Format Portfolio
for prod in analysis.get('portfolio', []):
p_name = prod.get('product', '')
p_purpose = prod.get('purpose', '')
comp_data['portfolio'].append(f"{p_name}: {p_purpose}")
break
# 2. Battlecards
for card in data.get('battlecards', []):
if card.get('competitor_name') == comp_name:
comp_data['silver_bullet'] = card.get('silver_bullet', '')
comp_data['landmines'] = card.get('landmine_questions', [])
comp_data['strengths_weaknesses'] = card.get('strengths_vs_weaknesses', [])
break
# 3. References
for ref_entry in data.get('reference_analysis', []):
if ref_entry.get('competitor_name') == comp_name:
for ref in ref_entry.get('references', []):
r_name = ref.get('name', 'Unknown')
r_ind = ref.get('industry', '')
entry = r_name
if r_ind:
entry += f" ({r_ind})"
comp_data['references'].append(entry)
break
return comp_data
def add_competitor_entry(token, db_id, c_data):
url = "https://api.notion.com/v1/pages"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
# Prepare properties
props = {
"Competitor Name": {"title": [{"text": {"content": c_data['name']}}]},
"USPs / Differentiators": {"rich_text": [{"text": {"content": format_list_as_bullets(c_data['differentiators'])}}]},
"Silver Bullet": {"rich_text": [{"text": {"content": c_data['silver_bullet']}}]},
"Landmines": {"rich_text": [{"text": {"content": format_list_as_bullets(c_data['landmines'])}}]},
"Strengths vs Weaknesses": {"rich_text": [{"text": {"content": format_list_as_bullets(c_data['strengths_weaknesses'])}}]},
"Portfolio": {"rich_text": [{"text": {"content": format_list_as_bullets(c_data['portfolio'])}}]},
"Known References": {"rich_text": [{"text": {"content": format_list_as_bullets(c_data['references'])}}]}
}
if c_data['url']:
props["Website"] = {"url": c_data['url']}
# Multi-select for industries
# Note: Notion options are auto-created, but we must ensure no commas or weird chars break it
ms_options = []
for ind in c_data['industries']:
# Simple cleanup
clean_ind = ind.replace(',', '')
ms_options.append({"name": clean_ind})
props["Target Industries"] = {"multi_select": ms_options}
payload = {
"parent": {"database_id": db_id},
"properties": props
}
try:
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(f" - Added: {c_data['name']}")
except requests.exceptions.HTTPError as e:
print(f" - Failed to add {c_data['name']}: {e}")
# print(response.text)
def main():
token = load_notion_token(TOKEN_FILE)
data = load_json_data(JSON_FILE)
# 1. Create DB
db_id = create_competitor_database(token, PARENT_PAGE_ID)
# 2. Collect List of Competitors
# We use the shortlist or candidates list to drive the iteration
competitor_list = data.get('competitors_shortlist', [])
if not competitor_list:
competitor_list = data.get('competitor_candidates', [])
print(f"Importing {len(competitor_list)} competitors...")
for comp in competitor_list:
c_name = comp.get('name')
if not c_name: continue
# Aggregate Data
c_data = get_competitor_data(data, c_name)
# Push to Notion
add_competitor_entry(token, db_id, c_data)
print("Import complete.")
if __name__ == "__main__":
main()

View File

@@ -1,143 +0,0 @@
import json
import os
import requests
import sys
# Configuration
JSON_FILE = 'analysis_robo-planet.de.json'
TOKEN_FILE = 'notion_token.txt'
# Root Page ID from notion_integration.md
PARENT_PAGE_ID = "2e088f42-8544-8024-8289-deb383da3818"
DB_TITLE = "Competitive Radar - References"
def load_json_data(filepath):
"""Loads the analysis JSON data."""
try:
with open(filepath, 'r') as f:
return json.load(f)
except FileNotFoundError:
print(f"Error: File '{filepath}' not found.")
sys.exit(1)
except json.JSONDecodeError:
print(f"Error: Failed to decode JSON from '{filepath}'.")
sys.exit(1)
def load_notion_token(filepath):
"""Loads the Notion API token."""
try:
with open(filepath, 'r') as f:
return f.read().strip()
except FileNotFoundError:
print(f"Error: Token file '{filepath}' not found.")
sys.exit(1)
def create_database(token, parent_page_id):
"""Creates the References database in Notion."""
url = "https://api.notion.com/v1/databases"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
payload = {
"parent": {"type": "page_id", "page_id": parent_page_id},
"title": [{"type": "text", "text": {"content": DB_TITLE}}],
"properties": {
"Reference Name": {"title": {}},
"Competitor": {"select": {}},
"Industry": {"select": {}},
"Snippet": {"rich_text": {}},
"Case Study URL": {"url": {}}
}
}
print(f"Creating database '{DB_TITLE}'...")
response = requests.post(url, headers=headers, json=payload)
if response.status_code != 200:
print(f"Error creating database: {response.status_code}")
print(response.text)
sys.exit(1)
db_data = response.json()
print(f"Database created successfully! ID: {db_data['id']}")
return db_data['id']
def add_reference(token, db_id, competitor, reference):
"""Adds a single reference entry to the database."""
url = "https://api.notion.com/v1/pages"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
ref_name = reference.get("name", "Unknown Reference")
industry = reference.get("industry", "Unknown")
snippet = reference.get("testimonial_snippet", "")
case_url = reference.get("case_study_url")
# Truncate snippet if too long (Notion limit is 2000 chars for text block, but good practice anyway)
if len(snippet) > 2000:
snippet = snippet[:1997] + "..."
properties = {
"Reference Name": {"title": [{"text": {"content": ref_name}}]},
"Competitor": {"select": {"name": competitor}},
"Industry": {"select": {"name": industry}},
"Snippet": {"rich_text": [{"text": {"content": snippet}}]}
}
if case_url and case_url.startswith("http"):
properties["Case Study URL"] = {"url": case_url}
payload = {
"parent": {"database_id": db_id},
"properties": properties
}
try:
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(f" - Added: {ref_name}")
except requests.exceptions.HTTPError as e:
print(f" - Failed to add {ref_name}: {e}")
# print(response.text) # Uncomment for debugging
def main():
token = load_notion_token(TOKEN_FILE)
data = load_json_data(JSON_FILE)
# Check if we have reference analysis data
ref_analysis = data.get("reference_analysis", [])
if not ref_analysis:
print("Warning: No reference analysis data found in JSON.")
# We proceed anyway to create the DB and show it works,
# but in a real scenario we might want to exit or use dummy data.
# 1. Create Database (You could also check if it exists, but for this script we create new)
# Ideally, we would search for an existing DB first to avoid duplicates.
# For this task, we create it.
db_id = create_database(token, PARENT_PAGE_ID)
# 2. Iterate and Import
print("Importing references...")
count = 0
for entry in ref_analysis:
competitor = entry.get("competitor_name", "Unknown Competitor")
refs = entry.get("references", [])
if not refs:
print(f"Skipping {competitor} (No references found)")
continue
print(f"Processing {competitor} ({len(refs)} references)...")
for ref in refs:
add_reference(token, db_id, competitor, ref)
count += 1
print(f"\nImport complete! {count} references added.")
if __name__ == "__main__":
main()

View File

@@ -1,265 +0,0 @@
import json
import os
import requests
import sys
# Configuration
JSON_FILE = 'analysis_robo-planet.de.json'
TOKEN_FILE = 'notion_token.txt'
PARENT_PAGE_ID = "2e088f42-8544-8024-8289-deb383da3818"
# Database Titles
DB_TITLE_HUB = "📦 Competitive Radar (Companies)"
DB_TITLE_LANDMINES = "💣 Competitive Radar (Landmines & Intel)"
DB_TITLE_REFS = "🏆 Competitive Radar (References)"
DB_TITLE_PRODUCTS = "🤖 Competitive Radar (Products)"
def load_json_data(filepath):
try:
with open(filepath, 'r') as f:
return json.load(f)
except Exception as e:
print(f"Error loading JSON: {e}")
sys.exit(1)
def load_notion_token(filepath):
try:
with open(filepath, 'r') as f:
return f.read().strip()
except Exception as e:
print(f"Error loading token: {e}")
sys.exit(1)
def create_database(token, parent_page_id, title, properties):
url = "https://api.notion.com/v1/databases"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
payload = {
"parent": {"type": "page_id", "page_id": parent_page_id},
"title": [{"type": "text", "text": {"content": title}}],
"properties": properties
}
response = requests.post(url, headers=headers, json=payload)
if response.status_code != 200:
print(f"Error creating DB '{title}': {response.status_code}")
print(response.text)
sys.exit(1)
db_data = response.json()
print(f"✅ Created DB '{title}' (ID: {db_data['id']})")
return db_data['id']
def create_page(token, db_id, properties):
url = "https://api.notion.com/v1/pages"
headers = {
"Authorization": f"Bearer {token}",
"Notion-Version": "2022-06-28",
"Content-Type": "application/json"
}
payload = {
"parent": {"database_id": db_id},
"properties": properties
}
response = requests.post(url, headers=headers, json=payload)
if response.status_code != 200:
print(f"Error creating page: {response.status_code}")
# print(response.text)
return None
return response.json()['id']
def format_list_as_bullets(items):
if not items: return ""
return "\n".join([f"{item}" for item in items])
def main():
token = load_notion_token(TOKEN_FILE)
data = load_json_data(JSON_FILE)
print("🚀 Starting Relational Import (Level 3 - Full Radar)...")
# --- STEP 1: Define & Create Competitors Hub DB ---
props_hub = {
"Name": {"title": {}},
"Website": {"url": {}},
"Target Industries": {"multi_select": {}},
"Portfolio Summary": {"rich_text": {}},
"Silver Bullet": {"rich_text": {}},
"USPs": {"rich_text": {}}
}
hub_db_id = create_database(token, PARENT_PAGE_ID, DB_TITLE_HUB, props_hub)
# --- STEP 2: Define & Create Satellite DBs (Linked to Hub) ---
# Landmines DB
props_landmines = {
"Statement / Question": {"title": {}},
"Type": {"select": {
"options": [
{"name": "Landmine Question", "color": "red"},
{"name": "Competitor Weakness", "color": "green"},
{"name": "Competitor Strength", "color": "orange"}
]
}},
"Related Competitor": {
"relation": {
"database_id": hub_db_id,
"dual_property": {"synced_property_name": "Related Landmines & Intel"}
}
}
}
landmines_db_id = create_database(token, PARENT_PAGE_ID, DB_TITLE_LANDMINES, props_landmines)
# References DB
props_refs = {
"Customer Name": {"title": {}},
"Industry": {"select": {}},
"Snippet": {"rich_text": {}},
"Case Study URL": {"url": {}},
"Related Competitor": {
"relation": {
"database_id": hub_db_id,
"dual_property": {"synced_property_name": "Related References"}
}
}
}
refs_db_id = create_database(token, PARENT_PAGE_ID, DB_TITLE_REFS, props_refs)
# Products DB
props_products = {
"Product Name": {"title": {}},
"Purpose / Description": {"rich_text": {}},
"Related Competitor": {
"relation": {
"database_id": hub_db_id,
"dual_property": {"synced_property_name": "Related Products"}
}
}
}
products_db_id = create_database(token, PARENT_PAGE_ID, DB_TITLE_PRODUCTS, props_products)
# --- STEP 3: Import Competitors (and store IDs) ---
competitor_map = {} # Maps Name -> Notion Page ID
competitors = data.get('competitors_shortlist', []) or data.get('competitor_candidates', [])
print(f"\nImporting {len(competitors)} Competitors...")
for comp in competitors:
c_name = comp.get('name')
if not c_name: continue
# Gather Data
c_url = comp.get('url', '')
# Find extended analysis data
analysis_data = next((a for a in data.get('analyses', []) if a.get('competitor', {}).get('name') == c_name), {})
battlecard_data = next((b for b in data.get('battlecards', []) if b.get('competitor_name') == c_name), {})
industries = analysis_data.get('target_industries', [])
portfolio = analysis_data.get('portfolio', [])
portfolio_text = "\n".join([f"{p.get('product')}: {p.get('purpose')}" for p in portfolio])
usps = format_list_as_bullets(analysis_data.get('differentiators', []))
silver_bullet = battlecard_data.get('silver_bullet', '')
# Create Page
props = {
"Name": {"title": [{"text": {"content": c_name}}]},
"Portfolio Summary": {"rich_text": [{"text": {"content": portfolio_text[:2000]}}]}, # Limit length
"USPs": {"rich_text": [{"text": {"content": usps[:2000]}}]},
"Silver Bullet": {"rich_text": [{"text": {"content": silver_bullet[:2000]}}]},
"Target Industries": {"multi_select": [{"name": i.replace(',', '')} for i in industries]}
}
if c_url: props["Website"] = {"url": c_url}
page_id = create_page(token, hub_db_id, props)
if page_id:
competitor_map[c_name] = page_id
print(f" - Created: {c_name}")
# --- STEP 4: Import Landmines & Intel ---
print("\nImporting Landmines & Intel...")
for card in data.get('battlecards', []):
c_name = card.get('competitor_name')
comp_page_id = competitor_map.get(c_name)
if not comp_page_id: continue
# 1. Landmines
for q in card.get('landmine_questions', []):
props = {
"Statement / Question": {"title": [{"text": {"content": q}}]},
"Type": {"select": {"name": "Landmine Question"}},
"Related Competitor": {"relation": [{"id": comp_page_id}]}
}
create_page(token, landmines_db_id, props)
# 2. Weaknesses
for point in card.get('strengths_vs_weaknesses', []):
p_type = "Competitor Weakness"
props = {
"Statement / Question": {"title": [{"text": {"content": point}}]},
"Type": {"select": {"name": p_type}},
"Related Competitor": {"relation": [{"id": comp_page_id}]}
}
create_page(token, landmines_db_id, props)
print(" - Landmines imported.")
# --- STEP 5: Import References ---
print("\nImporting References...")
count_refs = 0
for ref_group in data.get('reference_analysis', []):
c_name = ref_group.get('competitor_name')
comp_page_id = competitor_map.get(c_name)
if not comp_page_id: continue
for ref in ref_group.get('references', []):
r_name = ref.get('name', 'Unknown')
r_industry = ref.get('industry', 'Unknown')
r_snippet = ref.get('testimonial_snippet', '')
r_url = ref.get('case_study_url', '')
props = {
"Customer Name": {"title": [{"text": {"content": r_name}}]},
"Industry": {"select": {"name": r_industry}},
"Snippet": {"rich_text": [{"text": {"content": r_snippet[:2000]}}]},
"Related Competitor": {"relation": [{"id": comp_page_id}]}
}
if r_url and r_url.startswith('http'):
props["Case Study URL"] = {"url": r_url}
create_page(token, refs_db_id, props)
count_refs += 1
print(f" - {count_refs} References imported.")
# --- STEP 6: Import Products (Portfolio) ---
print("\nImporting Products...")
count_prods = 0
for analysis in data.get('analyses', []):
c_name = analysis.get('competitor', {}).get('name')
comp_page_id = competitor_map.get(c_name)
if not comp_page_id: continue
for prod in analysis.get('portfolio', []):
p_name = prod.get('product', 'Unknown Product')
p_purpose = prod.get('purpose', '')
props = {
"Product Name": {"title": [{"text": {"content": p_name}}]},
"Purpose / Description": {"rich_text": [{"text": {"content": p_purpose[:2000]}}]},
"Related Competitor": {"relation": [{"id": comp_page_id}]}
}
create_page(token, products_db_id, props)
count_prods += 1
print(f" - {count_prods} Products imported.")
print("\n✅ Relational Import Complete (Level 3)!")
if __name__ == "__main__":
main()

View File

@@ -1,35 +0,0 @@
import asyncio
import json
import os
import sys
# Path to the orchestrator
sys.path.append(os.path.join(os.getcwd(), 'competitor-analysis-app'))
from competitor_analysis_orchestrator import analyze_single_competitor_references
async def refresh_references():
json_path = 'analysis_robo-planet.de.json'
with open(json_path, 'r') as f:
data = json.load(f)
competitors = data.get('competitors_shortlist', [])
if not competitors:
competitors = data.get('competitor_candidates', [])
print(f"Refreshing references for {len(competitors)} competitors...")
tasks = [analyze_single_competitor_references(c) for c in competitors]
results = await asyncio.gather(*tasks)
# Filter and update
data['reference_analysis'] = [r for r in results if r is not None]
with open(json_path, 'w') as f:
json.dump(data, f, indent=2)
print(f"Successfully updated {json_path} with grounded reference data.")
if __name__ == "__main__":
asyncio.run(refresh_references())