# Gemini Code Assistant Context

## Important Notes

- **Project documentation:** The primary and most comprehensive documentation for this project is the file `readme.md`. Please consult it for a detailed understanding of the architecture and the individual modules.
- **Git repository:** This project is managed in a Git repository. All code changes are versioned. See the "Git Workflow & Conventions" section for our working rules.

## Project Overview

This project is a Python-based system for automated company data enrichment and lead generation. It draws on several data sources, including web scraping, Wikipedia, and the OpenAI API, to enrich company data from a CRM system. The project is designed to run in a Docker container and can be controlled via a Flask API.

The system is modular and consists of the following key components:

* **`brancheneinstufung_167.py`:** The core module for data enrichment, including web scraping, Wikipedia lookups, and AI-based analysis.
* **`company_deduplicator.py`:** A module for intelligent duplicate checking, both for external lists and internal CRM data.
* **`generate_marketing_text.py`:** An engine for creating personalized marketing texts.
* **`app.py`:** A Flask application that provides an API to run the different modules.

## Git Workflow & Conventions

- **Commit messages:** Commits should follow a clear format:
  - Title: A concise summary under 100 characters.
  - Description: Detailed changes as a list with `- ` at the start of each line (no bullet-point characters).
- **File renames:** To preserve a file's Git history, it must be renamed with `git mv alter_name.py neuer_name.py`.
- **Commit & push process:** Changes are committed locally first. Pushing to the remote server happens only after your explicit confirmation.
- **Viewing history:** Web interfaces such as Gitea may not show the complete history of a renamed file. The correct, complete history can be viewed on the command line with `git log --follow` for that file.

## Building and Running

The project is designed to be run in a Docker container. The `Dockerfile` contains the instructions to build the container.

**To build the Docker container:**

```bash
docker build -t company-enrichment .
```

**To run the Docker container:**

```bash
docker run -p 8080:8080 company-enrichment
```

The application will be available at `http://localhost:8080`.

## Development Conventions

* **Configuration:** The project uses a `config.py` file to manage configuration settings.
* **Dependencies:** Python dependencies are listed in the `requirements.txt` file.
* **Modularity:** The code is modular, with helper functions and classes that each handle a specific task.
* **API:** The Flask application in `app.py` provides an API to interact with the system.
* **Logging:** The project uses the `logging` module to log information and errors.
* **Error Handling:** The `readme.md` reports a critical error related to the `openai` library. The next step is to downgrade the library to a compatible version.

## Current Status (Jan 2026) - GTM Architect & Core Updates

* **GTM Architect Fixed & Stabilized:** The "GTM Architect" service is now fully operational.
    * **Prompt Engineering:** Switched from fragile f-strings to `"\n".join([...])` list construction for all system prompts. This eliminates the syntax errors caused by mixed quotes and multi-line strings in Python.
    * **AI Backend:** Migrated to `google.generativeai` via `helpers.call_gemini_flash` with JSON mode support.
    * **Scraping:** Implemented URL scraping using `requests` and `BeautifulSoup` (with `html.parser` to avoid dependency issues).
    * **Docker:** Container (`gtm-app`) updated with correct volume mounts for logs and config.
* **Market Intelligence Fixed:** Added `google-generativeai` to `market-intel.requirements.txt`. This was necessary because `helpers.py` (which is shared) now relies on this library for all AI operations.
* **Gemini Migration:** The codebase is moving towards `google-generativeai` as the primary driver. `helpers.py` now supports this natively.
* **Deployment:** To apply these fixes, a rebuild of the `gtm-app` container is required. A rebuild of `market-backend` is also required to pick up the new dependency.

## Next Steps

* **Monitor Logs:** Check `Log_from_docker/` for detailed execution traces of the GTM Architect.
* **Feedback Loop:** Verify the quality of the generated GTM strategies and adjust prompts in `gtm_architect_orchestrator.py` if necessary.
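
## Example: List-Based Prompt Construction

The prompt-engineering change noted under Current Status (building system prompts with `"\n".join([...])` instead of multi-line f-strings) could look roughly like the sketch below. It is only an illustration: the function name `build_system_prompt`, its parameters, and the prompt wording are hypothetical and are not taken from `gtm_architect_orchestrator.py`.

```python
import json
from typing import Sequence


def build_system_prompt(company_name: str, focus_areas: Sequence[str]) -> str:
    """Assemble a multi-line system prompt from a list of single lines.

    Hypothetical helper for illustration; the real prompts in the project
    may be structured differently.
    """
    # Each entry is one short, single-line string, so quotes inside a line
    # never interact with a surrounding multi-line string literal.
    lines = [
        "You are a go-to-market strategist.",
        'Target company: "' + company_name + '"',
        "Focus areas:",
    ]
    # Dynamic content is appended as additional list items.
    for area in focus_areas:
        lines.append("- " + area)
    lines.append("Respond with a single JSON object with the keys:")
    # json.dumps handles the quoting of the key names.
    lines.append(json.dumps(["summary", "channels"]))
    # The newline is added exactly once, at join time.
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt("Example GmbH", ["pricing", "channels"]))
```

Because every line is its own list element, adding, removing, or reordering prompt sections is a one-line change, and a quoting mistake stays confined to the line that contains it.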