Gemini Code Assistant Context

Important Notes

Project documentation: The primary and most comprehensive documentation for this project is the file readme.md. Please consult it for a detailed understanding of the architecture and the individual modules.
Git repository: This project is managed in a Git repository. All code changes are versioned. See the section "Git Workflow & Conventions" for our working rules.

Project Overview

This project is a Python-based system for automated company data enrichment and lead generation. It uses a variety of data sources, including web scraping, Wikipedia, and the OpenAI API, to enrich company data from a CRM system. The project is designed to run in a Docker container and can be controlled via a Flask API.

The system is modular and consists of the following key components:

brancheneinstufung_167.py: The core module for data enrichment, including web scraping, Wikipedia lookups, and AI-based analysis.
company_deduplicator.py: A module for intelligent duplicate checking, both for external lists and internal CRM data.
generate_marketing_text.py: An engine for creating personalized marketing texts.
app.py: A Flask application that provides an API to run the different modules.

Git Workflow & Conventions

Commit messages: Commits should follow a clear format:
- Title: A concise summary under 100 characters.
- Description: Detailed changes as a list, with - at the start of each line (no bullet characters).
File renames: To preserve a file's Git history, it must be renamed with git mv alter_name.py neuer_name.py.
Commit & push process: Changes are committed locally first. Pushing to the remote server happens only after your explicit confirmation.
Viewing history: Web interfaces such as Gitea may not show the complete history of a renamed file. The correct, complete history can be viewed on the command line with git log --follow <dateiname>.
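
The commit message format above can be checked mechanically. The following is a minimal sketch, not part of the repository: the function name check_commit_message is hypothetical, and it assumes that blank lines between title and description are allowed and ignored.

```python
from __future__ import annotations


def check_commit_message(message: str) -> list[str]:
    """Return a list of format problems; an empty list means the message passes."""
    problems = []
    lines = message.strip("\n").split("\n")
    title = lines[0].strip()

    # Rule 1: the title is a concise summary under 100 characters.
    if not title:
        problems.append("title is empty")
    elif len(title) >= 100:
        problems.append("title must be under 100 characters")

    # Rule 2: every description line starts with "- ".
    # Assumption: blank lines (e.g. the separator after the title) are ignored.
    for line in lines[1:]:
        if line.strip() and not line.startswith("- "):
            problems.append("description line does not start with '- ': " + repr(line))

    return problems
```

Such a check could be called from a commit-msg hook or run by hand before committing. It only reports problems and never rewrites the message.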

Building and Running

The project is designed to be run in a Docker container. The Dockerfile contains the instructions to build the container.

To build the Docker container:

docker build -t company-enrichment .

To run the Docker container:

docker run -p 8080:8080 company-enrichment

The application will be available at http://localhost:8080.

Development Conventions

Configuration: The project uses a config.py file to manage configuration settings.
Dependencies: Python dependencies are listed in the requirements.txt file.
Modularity: The code is modular and well-structured, with helper functions and classes to handle specific tasks.
API: The Flask application in app.py provides an API to interact with the system.
Logging: The project uses the logging module to log information and errors.
Error Handling: The readme.md indicates a critical error related to the openai library. The next step is to downgrade the library to a compatible version.
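
As an illustration of the logging convention, here is a minimal sketch using only the standard logging module. The logger name company_enrichment, the format string, and the helper setup_logging are assumptions for this example. The project's actual setup may differ.

```python
import logging


def setup_logging(level: str = "INFO") -> logging.Logger:
    """Configure and return a named logger that writes to stderr."""
    # Hypothetical logger name; the real modules may use their own names.
    logger = logging.getLogger("company_enrichment")

    # Fall back to INFO if the given level name is unknown.
    numeric_level = getattr(logging, level.upper(), None)
    if not isinstance(numeric_level, int):
        numeric_level = logging.INFO
    logger.setLevel(numeric_level)

    # Attach the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A module would then call logger = setup_logging() once and use logger.info(...) for normal progress and logger.exception(...) inside except blocks.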

Current Status (Jan 2026) - GTM Architect & Core Updates

GTM Architect Fixed & Stabilized: The "GTM Architect" service is now fully operational.
- Prompt Engineering: Switched from fragile f-strings to robust "\n".join([...]) list construction for all system prompts. This eliminates syntax errors caused by mixed quotes and multi-line strings in Python.
- AI Backend: Migrated to google.generativeai via helpers.call_gemini_flash with JSON mode support.
- Scraping: Implemented robust URL scraping using requests and BeautifulSoup (with html.parser to avoid dependency issues).
- Docker: Container (gtm-app) updated with correct volume mounts for logs and config.
Gemini Migration: The codebase is moving towards google-generativeai as the primary driver. helpers.py now supports this natively.
Deployment: To apply these fixes, a rebuild of the gtm-app container is required.
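
The prompt construction pattern mentioned under Prompt Engineering can be sketched as follows. The function name build_system_prompt, the prompt wording, and the JSON keys are invented for illustration. The real prompts live in the project's own modules.

```python
import json


def build_system_prompt(company: str, language: str = "English") -> str:
    """Assemble a multi-line system prompt from a list of plain strings."""
    # Hypothetical output shape, serialized with json.dumps so quoting is always valid.
    example = {"segments": ["..."], "value_proposition": "..."}

    parts = [
        "You are a go-to-market strategist.",
        "Analyze the company: " + company,
        "Answer in: " + language,
        'Return only valid JSON with the keys "segments" and "value_proposition".',
        json.dumps(example),
    ]
    # Each list entry is one line, so no multi-line string literals
    # or nested quotes inside f-strings are needed.
    return "\n".join(parts)
```

Because every line is its own list element, single and double quotes can be mixed freely per line, and adding or removing an instruction is a one-line change.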

Next Steps

Monitor Logs: Check Log_from_docker/ for detailed execution traces of the GTM Architect.
Feedback Loop: Verify the quality of the generated GTM strategies and adjust prompts in gtm_architect_orchestrator.py if necessary.
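
For the log monitoring step, a small helper can list suspicious lines. This is a sketch under assumptions: it assumes the files in Log_from_docker/ end in .log and that error lines contain the text ERROR, neither of which is confirmed by this document. The name find_error_lines is hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def find_error_lines(log_dir: str, marker: str = "ERROR") -> list[tuple[str, int, str]]:
    """Return (file name, line number, line text) for every line containing marker."""
    hits = []
    # Assumption: log files use the .log extension.
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if marker in line:
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits
```

Calling find_error_lines("Log_from_docker") after a GTM Architect run and printing the result gives a quick overview before reading the full traces.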
