# Gemini Code Assistant Context

## Important Notes

- **Project documentation:** The primary and most comprehensive documentation for this project is in `readme.md`. Consult that file for a detailed understanding of the architecture and the individual modules.
- **Git repository:** This project is managed in a Git repository, and all code changes are versioned. See the "Git Workflow & Conventions" section for our working rules.

## Project Overview

This project is a Python-based system for automated company data enrichment and lead generation. It enriches company data from a CRM system using several data sources, including web scraping, Wikipedia, and the OpenAI API. The project is designed to run in a Docker container and can be controlled via a Flask API.

The system is modular and consists of the following key components:

* **`brancheneinstufung_167.py`:** The core module for data enrichment, including web scraping, Wikipedia lookups, and AI-based analysis.
* **`company_deduplicator.py`:** A module for intelligent duplicate checking, both for external lists and internal CRM data.
* **`generate_marketing_text.py`:** An engine for creating personalized marketing texts.
* **`app.py`:** A Flask application that provides an API to run the different modules.

## Git Workflow & Conventions

- **Commit messages:** Commits should follow a clear format:
  - Title: A concise summary under 100 characters.
  - Description: Detailed changes as a list, with each line starting with `- ` (no bullet characters).
- **File renames:** To preserve a file's Git history, it must be renamed with `git mv alter_name.py neuer_name.py`.
- **Commit & push process:** Changes are committed locally first. Pushing to the remote server happens only after explicit confirmation from you.
- **Viewing history:** Web interfaces such as Gitea may not show the complete history of a renamed file. The correct and complete history can be viewed on the command line with `git log --follow` followed by the file's path.

## Building and Running

The project is designed to be run in a Docker container. The `Dockerfile` contains the instructions to build the container.

**To build the Docker container:**

```bash
docker build -t company-enrichment .
```

**To run the Docker container:**

```bash
docker run -p 8080:8080 company-enrichment
```

The application will be available at `http://localhost:8080`.

## Development Conventions

* **Configuration:** The project uses a `config.py` file to manage configuration settings.
* **Dependencies:** Python dependencies are listed in the `requirements.txt` file.
* **Modularity:** The code is modular and well-structured, with helper functions and classes to handle specific tasks.
* **API:** The Flask application in `app.py` provides an API to interact with the system.
* **Logging:** The project uses the `logging` module to log information and errors.
* **Error Handling:** The `readme.md` indicates a critical error related to the `openai` library. The next step is to downgrade the library to a compatible version.

## Current Status (Jan 05, 2026) - GTM & Market Intel Fixes

* **GTM Architect (v2.4) - UI/UX Refinement:**
    * **Corporate Design Integration:** A central, customizable `CORPORATE_DESIGN_PROMPT` was introduced in `config.py` to ensure all generated images strictly follow a "clean, professional, photorealistic" B2B style, avoiding comic aesthetics.
    * **Aspect Ratio Control:** Implemented user-selectable aspect ratios (16:9, 9:16, 1:1, 4:3) in the frontend (Phase 6), which are passed through to the Google Imagen/Gemini 2.5 API.
    * **Frontend Fix:** Resolved a double-declaration bug in `App.tsx` that prevented the build.
* **Market Intelligence Tool (v1.2) - Backend Hardening:**
    * **"Failed to fetch" Resolved:** Fixed a critical Nginx routing issue by forcing the frontend to use relative API paths (`./api`) instead of absolute ports, ensuring requests correctly pass through the reverse proxy in Docker.
    * **Large Payload Fix:** Increased `client_max_body_size` to 50M in both Nginx configurations (`nginx-proxy.conf` and frontend `nginx.conf`) to prevent 413 errors when uploading large knowledge base files during campaign generation.
    * **JSON Stability:** The Python Orchestrator and Node.js bridge were hardened against invalid JSON output. The system now robustly handles stdout noise and logs full raw output to `/app/Log/server_dump.txt` in case of errors.
    * **Language Support:** Implemented a `--language` flag. The tool now correctly respects the frontend language selection (defaulting to German) and forces the LLM to output German text for signals, ICPs, and outreach campaigns.
    * **Logging:** Fixed log volume mounting paths to ensure debug logs are persisted and accessible.

## Current Status (Jan 2026) - GTM Architect & Core Updates

* **GTM Architect (v2.2) - FULLY OPERATIONAL:**
    * **Image Generation Fixed:** Successfully implemented a hybrid image generation pipeline.
        * **Text-to-Image:** Uses `imagen-4.0-generate-001` for generic scenes.
        * **Image-to-Image:** Uses `gemini-2.5-flash-image` with reference image upload for product-consistent visuals.
        * **Prompt Engineering:** Strict prompts ensure the product design remains unaltered.
    * **Library Upgrade:** Migrated core AI logic to `google-genai` (v1.x) to resolve deprecation warnings and access newer models. `Pillow` added for image processing.
    * **Model Update:** Switched text generation to `gemini-2.0-flash` due to regional unavailability of 1.5.
    * **Frontend Stability:** Fixed a critical React crash in Phase 3 by handling object-based role descriptions robustly.
    * **Infrastructure:** Updated Docker configurations (`gtm-architect/requirements.txt`) to support new dependencies.

## Next Steps

* **Monitor Logs:** Check `Log_from_docker/` for detailed execution traces of the GTM Architect.
* **Feedback Loop:** Verify the quality of the generated GTM strategies and adjust prompts in `gtm_architect_orchestrator.py` if necessary.
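
When reading the execution traces, it can help to keep in mind how a JSON payload might be recovered from noisy stdout, as described under "JSON Stability" for the Market Intelligence Tool. The following is a minimal sketch of one possible approach, not the project's actual implementation: the function name `extract_last_json` is hypothetical, and the sketch assumes that the payload of interest is the last valid JSON object or array in the output.

```python
import json
from typing import Any


def extract_last_json(stdout_text: str) -> Any:
    """Return the last JSON object or array embedded in noisy stdout text.

    Hypothetical helper for illustration only. Raises ValueError when
    no JSON object or array can be decoded from the text.
    """
    decoder = json.JSONDecoder()
    result: Any = None
    found = False
    pos = 0
    while pos < len(stdout_text):
        # Locate the next character that could open a JSON object or array.
        candidates = [
            i
            for i in (stdout_text.find("{", pos), stdout_text.find("[", pos))
            if i != -1
        ]
        if not candidates:
            break
        start = min(candidates)
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the value plus the index just past its end.
            value, end = decoder.raw_decode(stdout_text, start)
        except json.JSONDecodeError:
            # Not valid JSON here (e.g. a "[warn]" log prefix); move on.
            pos = start + 1
            continue
        result, found = value, True
        pos = end
    if not found:
        raise ValueError("no JSON value found in output")
    return result


if __name__ == "__main__":
    noisy = (
        "DEBUG: starting\n"
        "[warn] deprecated call\n"
        '{"status": "ok", "items": [1, 2]}\n'
        "trailing log line"
    )
    print(extract_last_json(noisy))
```

A caller could catch the `ValueError` and write the full raw output to a dump file such as `/app/Log/server_dump.txt` for later inspection, which matches the behavior the status notes describe.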