[34588f42] Performance: Massive speed-up of the analysis via SQLite synchronization

- New table JobParticipant stores detailed CSV data from Fotograf.de.
- process_reminder_analysis and process_statistics now use the local database instead of Selenium crawling.
- New 'Daten abgleichen' (sync data) button integrated into the preparation tab.
- Automatic quick-login link generator based on access codes.
2026-04-18 13:49:03 +00:00
parent e6061868e6
commit 472f392107
4 changed files with 417 additions and 255 deletions

View File

@@ -1,6 +1,6 @@
# Fotograf.de Scraper & Management UI
**Status:** Production-Ready Microservice (Core Features: PDF List Generation, QR Cards, Shooting Schedule, **SQLite Data Sync**, **Gmail API Integration** & **Automated Release Requests**)
This service modernizes the old `Fotograf.de` scripts by providing a robust, web-based UI for managing and automating photo jobs. It is designed as a standalone microservice that runs independently of the main stack.
@@ -10,16 +10,22 @@ The service consists of two main components:
1. **Backend (Python / FastAPI / Selenium / SQLAlchemy):**
    * **Automation:** Uses Selenium to scrape `fotograf.de`.
    * **Persistence:** An SQLite database (`fotograf_jobs.db`) stores the job list, OAuth tokens (`GmailToken`), discount codes (`DiscountCode`), participant data (`ReleaseParticipant`), **job participants (`JobParticipant`)**, and the **sending history (`ReleaseHistory`)**.
    * **PDF engine:** Uses WeasyPrint for participant lists and ReportLab/PyPDF2 for precise PDF overlays (QR cards).
    * **API integration:** Direct connection to the **Calendly API (v2)** as well as the **Gmail API** for direct e-mail dispatch and automated webhook replies.
2. **Frontend (TypeScript / React / Vite / TailwindCSS):**
    * **Modern UI:** A fully responsive dashboard built with Tailwind CSS (tile layout, tabs for kindergarten/school).
    * **Workflow:** Tools are grouped into logical phases (Preparation, Follow-Up, Statistics) within each job's detail view.
## ✨ Core Features
### 🚀 Performance Optimization (SQLite Sync)
Instead of laboriously crawling through every photo album as before, the system now uses an intelligent synchronization:
* **One-click sync:** Via the "Daten von Fotograf.de abgleichen" button, the system downloads the detailed name list (CSV).
* **Local database:** All relevant information (parents' e-mail, login counts, order status, access codes) is stored in the `job_participants` table.
* **Lightning-fast analysis:** Follow-up e-mails and statistics are now generated directly from the database in seconds (instead of minutes).
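The speed-up comes from replacing a Selenium crawl with a local query. A minimal, self-contained sketch of that query against a reduced `job_participants` schema (illustrative data, plain `sqlite3` instead of the service's SQLAlchemy session):

```python
import sqlite3

# In-memory sketch of the `job_participants` table (subset of columns)
# and the follow-up query: non-buyers with at most one login and a parent e-mail.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_participants (
        job_id TEXT, vorname_kind TEXT, email_eltern TEXT,
        zugangscode TEXT, logins INTEGER, has_orders INTEGER
    )
""")
conn.executemany(
    "INSERT INTO job_participants VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("job1", "Mia",  "mia@example.com",  "ABC123", 0, 0),  # candidate
        ("job1", "Ben",  "ben@example.com",  "DEF456", 3, 1),  # already bought
        ("job1", "Lena", "lena@example.com", "GHI789", 1, 0),  # candidate
    ],
)
rows = conn.execute(
    "SELECT vorname_kind, email_eltern, zugangscode FROM job_participants "
    "WHERE job_id = ? AND has_orders = 0 AND logins <= 1 AND email_eltern != ''",
    ("job1",),
).fetchall()
print(rows)  # two candidates, answered in milliseconds instead of a Selenium crawl
```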
### Feature 1: Participant Lists (Complete)
Automated workflow for downloading and formatting the registration lists from `fotograf.de` as a sorted PDF, including "Kinderfotos Erding" branding.
@@ -28,43 +34,37 @@ Special module for family mini shootings:
* **QR card imprint:** Precise overlay of name, number of children, and time slot, including an automatic **consent checkbox (☑)** from Calendly data.
* **Schedule overview:** Generates an A4 table for the shooting day in 6-minute slots, including gap fillers.
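The 6-minute grid with gap fillers could be generated along these lines (a sketch; the function name and the `--- frei ---` gap label are illustrative, not taken from the service):

```python
from datetime import datetime, timedelta

def build_schedule(start: datetime, end: datetime, bookings: dict[str, str]) -> list[tuple[str, str]]:
    """Build a 6-minute-slot day plan; unbooked slots become gap fillers."""
    slots = []
    t = start
    while t < end:
        key = t.strftime("%H:%M")
        slots.append((key, bookings.get(key, "--- frei ---")))  # gap filler
        t += timedelta(minutes=6)
    return slots

plan = build_schedule(
    datetime(2026, 4, 18, 9, 0), datetime(2026, 4, 18, 9, 30),
    {"09:00": "Familie Huber (2 Kinder)", "09:12": "Familie Maier (1 Kind)"},
)
# 5 slots: 09:00 and 09:12 booked, the rest filled with the gap label
```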
### Feature 3: Follow-Up E-Mails & Direct Gmail Dispatch (Optimized)
Identifies non-buyers (0-1 logins, no order) based on the synchronized database data.
* **Preview mode:** Lets you click through the personalized e-mail for each recipient before the actual dispatch.
* **Quick-login automation:** The login links (`https://www.kinderfotos-erding.de/a/{code}`) are generated automatically.
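The quick-login links follow the pattern shown above; a tiny helper that builds the URL and the HTML snippet embedded in the follow-up e-mails might look like this (helper names are hypothetical):

```python
def quick_login_link(zugangscode: str) -> str:
    """Build the quick-login URL for an access code (pattern from the README)."""
    return f"https://www.kinderfotos-erding.de/a/{zugangscode.strip()}"

def quick_login_html(child_name: str, zugangscode: str) -> str:
    # HTML snippet as embedded in the follow-up e-mails
    return f'<a href="{quick_login_link(zugangscode)}">Fotos von {child_name}</a>'

print(quick_login_html("Mia", " ABC123 "))
# <a href="https://www.kinderfotos-erding.de/a/ABC123">Fotos von Mia</a>
```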
### Feature 4: Sales Statistics (Optimized)
Detailed analysis of purchasing behavior per group/class based on the local database entries.
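The grouping behind these statistics can be sketched on plain dicts (a simplified stand-in for the database rows; the key names follow the result columns used by the backend):

```python
def group_statistics(participants: list[dict]) -> list[dict]:
    """Aggregate per-group participant and purchase counts, as the DB-backed statistics do."""
    groups: dict[str, dict] = {}
    for p in participants:
        name = p.get("gruppe") or "Unbekannt"
        g = groups.setdefault(name, {"Album": name, "Kinder_insgesamt": 0, "Kinder_mit_Käufen": 0})
        g["Kinder_insgesamt"] += 1
        if p.get("has_orders"):
            g["Kinder_mit_Käufen"] += 1
    # Sort alphabetically by group name, like the backend result
    return sorted(groups.values(), key=lambda g: g["Album"])

stats = group_statistics([
    {"gruppe": "Igel", "has_orders": 1},
    {"gruppe": "Igel", "has_orders": 0},
    {"gruppe": "Bären", "has_orders": 1},
])
```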
### Feature 5: Siblings List (Per Institution) (Complete)
Tool for identifying sibling groups within an institution, including a cross-check against Calendly bookings and special sibling QR cards.
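One plausible heuristic for finding sibling groups is to bucket children by the parents' e-mail address (a sketch under that assumption; the real tool additionally cross-checks Calendly bookings):

```python
from collections import defaultdict

def sibling_groups(participants: list[dict]) -> list[list[str]]:
    """Group children sharing the same parent e-mail; buckets of 2+ are sibling groups."""
    by_parent: dict[str, list[str]] = defaultdict(list)
    for p in participants:
        # Normalize the e-mail so case/whitespace variants land in one bucket
        email = (p.get("email_eltern") or "").strip().lower()
        if email:
            by_parent[email].append(p["vorname_kind"])
    return [kids for kids in by_parent.values() if len(kids) >= 2]

groups = sibling_groups([
    {"vorname_kind": "Mia",  "email_eltern": "huber@example.com"},
    {"vorname_kind": "Ben",  "email_eltern": "Huber@example.com "},
    {"vorname_kind": "Lena", "email_eltern": "maier@example.com"},
])
print(groups)  # [['Mia', 'Ben']]
```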
### Feature 6: Release Requests & Voucher Automation (Complete)
Fully automated GDPR workflow for obtaining publication consent:
* **Lean dispatch:** Manual entry of recipients (e-mail, first name, children's names) with an **e-mail preview**.
* **Intelligent personalization:** Automatic clean-up of institution names (removes "Kindergarten" and year numbers).
* **Scheduled sending:** Configurable dispatch time (Berlin timezone) via background tasks.
* **Webhook integration:** Direct connection to **Google Forms**. When the release form is submitted, a voucher code is automatically reserved and a thank-you e-mail is sent.
* **Reply overview:** Table of all received releases, including the assigned code and timestamp.
* **Voucher management:** UI for uploading and monitoring the voucher pool.
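The webhook's voucher step (reserve the first free code and mark it as used) can be sketched with a minimal pool table; table and column names are illustrative:

```python
import sqlite3

# Hypothetical sketch of the voucher pool with two free codes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discount_codes (code TEXT PRIMARY KEY, used INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO discount_codes (code) VALUES (?)", [("GUT-001",), ("GUT-002",)])

def reserve_voucher(conn: sqlite3.Connection):
    """Reserve the first unused code, or return None if the pool is exhausted."""
    row = conn.execute("SELECT code FROM discount_codes WHERE used = 0 LIMIT 1").fetchone()
    if row is None:
        return None  # pool exhausted; the real service should alert on this
    conn.execute("UPDATE discount_codes SET used = 1 WHERE code = ?", (row[0],))
    return row[0]

first = reserve_voucher(conn)
second = reserve_voucher(conn)
third = reserve_voucher(conn)
print(first, second, third)  # GUT-001 GUT-002 None
```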
---
## 🛠️ Technical Details & Security
* **BCC control:** Every e-mail sent by the system automatically includes a blind copy (BCC) to `kontakt@kinderfotos-erding.de`.
* **Sending history:** All dispatches (number of recipients, timestamp) are logged in the `release_history` table.
* **Safe test mode:** Via `DEV_MODE_EMAIL_RECIPIENT`, all outgoing e-mails can be globally redirected to a test address.
* **Time zones:** Consistent use of `Europe/Berlin`.
* **Gmail OAuth:** Refresh tokens are stored persistently in the database.
## 🚀 Deployment & Configuration
The service is managed via the project's main `docker-compose.yml`.
### Environment Variables (`.env`)
Important new variables in `/fotograf-de-scraper/.env`:
* `DEV_MODE_EMAIL_RECIPIENT`: (Optional) E-mail address for redirection during testing.
* `google_fotograf_client_id` / `google_fotograf_secret`: OAuth credentials.
* `CALENDLY_TOKEN`: API access.
### URLs
* **Frontend:** `https://floke-ai.duckdns.org/fotograf-de/`
* **Webhook for Google Forms:** `https://floke-ai.duckdns.org/fotograf-de-api/api/publish-request/webhook`

View File

@@ -49,6 +49,22 @@ class ReleaseHistory(Base):
    recipient_count = Column(Integer)
    scheduled_time = Column(String, nullable=True)

class JobParticipant(Base):
    __tablename__ = "job_participants"
    id = Column(Integer, primary_key=True)
    job_id = Column(String, index=True)
    child_id = Column(String, nullable=True)
    vorname_kind = Column(String, nullable=True)
    nachname_kind = Column(String, nullable=True)
    vorname_eltern = Column(String, nullable=True)
    nachname_eltern = Column(String, nullable=True)
    email_eltern = Column(String, nullable=True)
    zugangscode = Column(String, index=True)
    gruppe = Column(String, nullable=True)
    logins = Column(Integer, default=0)
    has_orders = Column(Integer, default=0)  # 0 for false, 1 for true
    last_synced = Column(DateTime, default=datetime.datetime.utcnow)

Base.metadata.create_all(bind=engine)

def get_db():

View File

@@ -16,7 +16,7 @@ from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse
from typing import List, Dict, Any, Optional
from sqlalchemy.orm import Session
from database import get_db, Job as DBJob, engine, Base, JobParticipant, SessionLocal
import math
import uuid
@@ -141,6 +141,120 @@ def get_logo_base64():
    logger.warning(f"Logo file not found at {logo_path}")
    return None

def sync_job_participants(job_id: str, account_type: str, db: Session):
    logger.info(f"Syncing participants for job {job_id} ({account_type})")
    username = os.getenv(f"{account_type.upper()}_USER")
    password = os.getenv(f"{account_type.upper()}_PW")
    with tempfile.TemporaryDirectory() as temp_dir:
        driver = setup_driver(download_path=temp_dir)
        try:
            if not login(driver, username, password):
                raise Exception("Login failed during sync.")
            # Navigate to job names list
            job_url = f"https://app.fotograf.de/config_jobs_settings/index/{job_id}"
            driver.get(job_url)
            wait = WebDriverWait(driver, 20)
            personen_tab = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "[data-qa-id='link:photo-jobs-tabs-names_list']")))
            driver.execute_script("arguments[0].click();", personen_tab)
            export_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_dropdown"])))
            driver.execute_script("arguments[0].click();", export_btn)
            time.sleep(1)
            csv_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_csv_link"])))
            driver.execute_script("arguments[0].click();", csv_btn)
            # Wait for download
            csv_file = None
            for _ in range(30):
                files = [f for f in os.listdir(temp_dir) if f.endswith('.csv')]
                if files:
                    csv_file = os.path.join(temp_dir, files[0])
                    break
                time.sleep(1)
            if not csv_file:
                raise Exception("CSV download timeout during sync.")
            # Parse CSV
            df = None
            for sep in [";", ","]:
                try:
                    df = pd.read_csv(csv_file, sep=sep, encoding="utf-8-sig")
                    if len(df.columns) > 1: break
                except: continue
            if df is None:
                df = pd.read_csv(csv_file, sep=";", encoding="latin1")
            # Clean columns
            df.columns = df.columns.str.strip().str.replace("\"", "")
            # Map columns - based on user feedback
            # Expected columns: Child ID, Email der Eltern (1), Vorname Eltern (1), Nachname Eltern (1), Vorname Kind, Zugangscode (1), Logins (1), Bestellungen
            def get_col(df, patterns):
                for p in patterns:
                    for col in df.columns:
                        if p.lower() in col.lower():
                            return col
                return None
            col_child_id = get_col(df, ["Child ID"])
            col_email = get_col(df, ["Email der Eltern", "E-Mail der Eltern"])
            col_parent_vn = get_col(df, ["Vorname Eltern", "Parent First Name"])
            col_parent_nn = get_col(df, ["Nachname Eltern", "Parent Last Name"])
            col_child_vn = get_col(df, ["Vorname Kind", "Child First Name"])
            col_child_nn = get_col(df, ["Nachname Kind", "Child Last Name"])
            col_code = get_col(df, ["Zugangscode", "Access Code"])
            col_group = get_col(df, ["Gruppe", "Klasse", "Group", "Class"])
            col_logins = get_col(df, ["Logins"])
            col_orders = get_col(df, ["Bestellungen", "Orders"])
            # Delete old entries for this job
            db.query(JobParticipant).filter(JobParticipant.job_id == job_id).delete()
            added = 0
            for _, row in df.iterrows():
                try:
                    logins_val = 0
                    try: logins_val = int(row[col_logins]) if col_logins and pd.notna(row[col_logins]) else 0
                    except: pass
                    orders_val = 0
                    if col_orders and pd.notna(row[col_orders]):
                        val = str(row[col_orders]).lower()
                        if val and val != "0" and val != "nein" and val != "false":
                            orders_val = 1
                    participant = JobParticipant(
                        job_id=job_id,
                        child_id=str(row[col_child_id]) if col_child_id and pd.notna(row[col_child_id]) else None,
                        vorname_kind=str(row[col_child_vn]) if col_child_vn and pd.notna(row[col_child_vn]) else None,
                        nachname_kind=str(row[col_child_nn]) if col_child_nn and pd.notna(row[col_child_nn]) else None,
                        vorname_eltern=str(row[col_parent_vn]) if col_parent_vn and pd.notna(row[col_parent_vn]) else None,
                        nachname_eltern=str(row[col_parent_nn]) if col_parent_nn and pd.notna(row[col_parent_nn]) else None,
                        email_eltern=str(row[col_email]).strip().lower() if col_email and pd.notna(row[col_email]) else None,
                        zugangscode=str(row[col_code]) if col_code and pd.notna(row[col_code]) else None,
                        gruppe=str(row[col_group]) if col_group and pd.notna(row[col_group]) else None,
                        logins=logins_val,
                        has_orders=orders_val
                    )
                    db.add(participant)
                    added += 1
                except Exception as e:
                    logger.warning(f"Error adding participant row: {e}")
            db.commit()
            logger.info(f"Sync complete. {added} participants stored for job {job_id}")
            return added
        finally:
            driver.quit()

def generate_pdf_from_csv(csv_path: str, institution: str, date_info: str, list_type: str, output_path: str):
    logger.info(f"Generating PDF for {institution} from {csv_path}")
    df = None
@@ -488,264 +602,142 @@ def get_jobs_list(driver) -> List[Dict[str, Any]]:
task_store: Dict[str, Dict[str, Any]] = {}

def process_statistics(task_id: str, job_id: str, account_type: str):
    logger.info(f"Task {task_id}: Starting fast statistics calculation for job {job_id}")
    task_store[task_id] = {"status": "running", "progress": "Synchronisiere Daten von Fotograf.de...", "result": None}
    db = SessionLocal()
    try:
        # 1. Sync data from CSV
        try:
            sync_participants(job_id, account_type, db)
        except Exception as sync_err:
            logger.error(f"Sync failed during statistics: {sync_err}")
            count = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).count()
            if count == 0:
                task_store[task_id] = {"status": "error", "progress": f"Synchronisierung fehlgeschlagen: {str(sync_err)}"}
                return
        # 2. Query DB and group by 'gruppe'
        task_store[task_id]["progress"] = "Berechne Statistiken..."
        # Get all participants for this job
        participants = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).all()
        # Group by group
        groups = {}
        for p in participants:
            g_name = p.gruppe or "Unbekannt"
            if g_name not in groups:
                groups[g_name] = {
                    "Album": g_name,
                    "Kinder_insgesamt": 0,
                    "Kinder_mit_Käufen": 0,
                    "Kinder_Alle_Bilder_gekauft": 0  # Not available in CSV, setting to 0 or estimates
                }
            groups[g_name]["Kinder_insgesamt"] += 1
            if p.has_orders:
                groups[g_name]["Kinder_mit_Käufen"] += 1
        statistics = list(groups.values())
        statistics.sort(key=lambda x: x["Album"])
        task_store[task_id] = {
            "status": "completed",
            "progress": "Statistik erfolgreich berechnet!",
            "result": statistics
        }
    except Exception as e:
        logger.exception(f"Unexpected error in statistics task {task_id}")
        task_store[task_id] = {"status": "error", "progress": f"Unerwarteter Fehler: {str(e)}"}
    finally:
        db.close()
def process_reminder_analysis(task_id: str, job_id: str, account_type: str):
    logger.info(f"Task {task_id}: Starting fast reminder analysis for job {job_id}")
    task_store[task_id] = {"status": "running", "progress": "Synchronisiere Daten von Fotograf.de...", "result": None}
    db = SessionLocal()
    try:
        # 1. Sync data from CSV (This takes ~20s and gets all parent emails, logins and orders)
        try:
            sync_participants(job_id, account_type, db)
        except Exception as sync_err:
            logger.error(f"Sync failed during reminder analysis: {sync_err}")
            # Continue anyway if we have some data, or fail if we have none
            count = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).count()
            if count == 0:
                task_store[task_id] = {"status": "error", "progress": f"Synchronisierung fehlgeschlagen: {str(sync_err)}"}
                return
        # 2. Query DB for potential candidates (Logins <= 1 and No Orders)
        task_store[task_id]["progress"] = "Analysiere Datenbank-Einträge..."
        candidates = db.query(JobParticipant).filter(
            JobParticipant.job_id == job_id,
            JobParticipant.has_orders == 0,
            JobParticipant.logins <= 1,
            JobParticipant.email_eltern != "",
            JobParticipant.email_eltern != None
        ).all()
        if not candidates:
            task_store[task_id] = {
                "status": "completed",
                "progress": "Keine passenden Empfänger (0-1 Logins, keine Bestellung) gefunden.",
                "result": []
            }
            return
        # 3. Aggregate results by Email
        aggregation = {}
        for c in candidates:
            email = c.email_eltern
            if email not in aggregation:
                aggregation[email] = {
                    "email": email,
                    "parent_name": c.vorname_eltern if c.vorname_eltern else "Liebe Eltern",
                    "children": [],
                    "links": []
                }
            # Add child name
            child_name = c.vorname_kind or ""
            child_label = "Familienbilder" if child_name.lower() == "familie" else child_name
            if child_label and child_label not in aggregation[email]["children"]:
                aggregation[email]["children"].append(child_label)
            # Add Quick Login Link
            link = f"https://www.kinderfotos-erding.de/a/{c.zugangscode}"
            html_link = f'<a href="{link}">Fotos von {child_label}</a>'
            if html_link not in aggregation[email]["links"]:
                aggregation[email]["links"].append(html_link)
        # 4. Format for Supermailer/Gmail
        final_result = []
        for email, data in aggregation.items():
            children_str = " und ".join(data["children"]) if len(data["children"]) > 1 else (data["children"][0] if data["children"] else "Eurem Kind")
            links_html = "".join([f"{l}<br>" for l in data["links"]])
            final_result.append({
                "E-Mail-Adresse Käufer": email,
                "Name Käufer": data["parent_name"],
                "Kindernamen": children_str,
                "Anzahl Kinder": len(data["children"]),
                "LinksHTML": links_html
            })
        task_store[task_id] = {
            "status": "completed",
            "progress": f"Analyse fertig! {len(final_result)} Empfänger identifiziert.",
            "result": final_result
        }
    except Exception as e:
        logger.exception(f"Error in task {task_id}")
        task_store[task_id] = {"status": "error", "progress": f"Fehler: {str(e)}"}
    finally:
        db.close()
from fastapi import FastAPI, HTTPException, Depends, BackgroundTasks, UploadFile, File, Form
from fastapi.middleware.cors import CORSMiddleware
@@ -1092,6 +1084,124 @@ async def send_bulk_emails(request: BulkEmailRequest, db: Session = Depends(get_
        "failed": failed_emails
    }
def sync_participants(job_id: str, account_type: str, db: Session):
    logger.info(f"Syncing participants for job {job_id} ({account_type})")
    username = os.getenv(f"{account_type.upper()}_USER")
    password = os.getenv(f"{account_type.upper()}_PW")
    with tempfile.TemporaryDirectory() as temp_dir:
        driver = setup_driver(download_path=temp_dir)
        try:
            if not login(driver, username, password):
                raise Exception("Login failed.")
            # Navigate to the Persons tab
            job_url = f"https://app.fotograf.de/config_jobs_settings/index/{job_id}"
            driver.get(job_url)
            wait = WebDriverWait(driver, 30)
            personen_tab = wait.until(EC.presence_of_element_located(
                (By.CSS_SELECTOR, "[data-qa-id='link:photo-jobs-tabs-names_list']")))
            driver.execute_script("arguments[0].click();", personen_tab)
            # Click Export -> CSV
            export_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_dropdown"])))
            driver.execute_script("arguments[0].click();", export_btn)
            time.sleep(1)
            csv_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_csv_link"])))
            driver.execute_script("arguments[0].click();", csv_btn)
            # Wait up to 45 seconds for the download to appear
            csv_file = None
            for _ in range(45):
                csv_files = [f for f in os.listdir(temp_dir) if f.endswith(".csv")]
                if csv_files:
                    csv_file = os.path.join(temp_dir, csv_files[0])
                    break
                time.sleep(1)
            if not csv_file:
                raise Exception("CSV download timed out.")
            # Read the CSV with pandas, trying both common separators
            df = None
            for sep in [";", ","]:
                try:
                    df = pd.read_csv(csv_file, sep=sep, encoding="utf-8-sig")
                    if len(df.columns) > 1:
                        break
                except Exception:
                    continue
            if df is None:
                raise Exception("Could not parse CSV.")
            # Clean column names
            df.columns = df.columns.str.strip().str.replace('"', "")
            logger.debug(f"Sync CSV Columns: {list(df.columns)}")
            # Column mapping (reference for the expected CSV headers;
            # the upsert loop below reads the raw column names directly)
            mapping = {
                "Child ID": "child_id",
                "Email der Eltern (1)": "email_eltern",
                "Vorname Eltern (1)": "vorname_eltern",
                "Nachname Eltern (1)": "nachname_eltern",
                "Vorname Kind": "vorname_kind",
                "Nachname Kind": "nachname_kind",
                "Zugangscode (1)": "zugangscode",
                "Logins (1)": "logins",
                "Bestellungen": "has_orders",
                "Gruppe": "gruppe",
                "Klasse": "gruppe",
            }

            def clean_val(val):
                v = str(val).strip()
                return "" if v.lower() == "nan" else v

            # Upsert into database; rows without an access code are skipped
            synced = 0
            for _, row in df.iterrows():
                code = str(row.get("Zugangscode (1)", "")).strip()
                if not code or code == "nan":
                    continue
                # Determine order status ("0", "nan" and empty mean no orders)
                orders_val = str(row.get("Bestellungen", "0")).lower()
                has_orders = 0 if orders_val in ("0", "nan", "") else 1
                # Determine login count (the CSV may contain floats like "3.0")
                try:
                    logins = int(float(row.get("Logins (1)", 0)))
                except (TypeError, ValueError):
                    logins = 0
                participant = db.query(JobParticipant).filter(
                    JobParticipant.job_id == job_id,
                    JobParticipant.zugangscode == code).first()
                if not participant:
                    participant = JobParticipant(job_id=job_id, zugangscode=code)
                    db.add(participant)
                participant.child_id = clean_val(row.get("Child ID"))
                participant.vorname_kind = clean_val(row.get("Vorname Kind"))
                participant.nachname_kind = clean_val(row.get("Nachname Kind"))
                participant.vorname_eltern = clean_val(row.get("Vorname Eltern (1)"))
                participant.nachname_eltern = clean_val(row.get("Nachname Eltern (1)"))
                participant.email_eltern = clean_val(row.get("Email der Eltern (1)")).lower()
                participant.gruppe = clean_val(row.get("Gruppe", row.get("Klasse")))
                participant.logins = logins
                participant.has_orders = has_orders
                participant.last_synced = datetime.datetime.utcnow()
                synced += 1
            db.commit()
            logger.info(f"Successfully synced {synced} of {len(df)} rows for job {job_id}")
            return synced
        finally:
            driver.quit()
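The upsert above presupposes a `JobParticipant` ORM model. A minimal sketch of what that model likely looks like, with field names inferred from the sync code; the table name, column types, and defaults are assumptions, not the project's actual definition:

```python
# Hedged sketch of the JobParticipant model assumed by sync_participants.
# Field names come from the upsert code; __tablename__, types and defaults
# are guesses.
import datetime

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class JobParticipant(Base):
    __tablename__ = "job_participants"  # assumed name
    id = Column(Integer, primary_key=True)
    job_id = Column(String, index=True)
    zugangscode = Column(String, index=True)  # lookup key together with job_id
    child_id = Column(String, default="")
    vorname_kind = Column(String, default="")
    nachname_kind = Column(String, default="")
    vorname_eltern = Column(String, default="")
    nachname_eltern = Column(String, default="")
    email_eltern = Column(String, default="")
    gruppe = Column(String, default="")
    logins = Column(Integer, default=0)
    has_orders = Column(Integer, default=0)
    last_synced = Column(DateTime, default=datetime.datetime.utcnow)
```

Because `(job_id, zugangscode)` is the lookup key of the upsert, a unique constraint over that pair would be a natural addition.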
@app.post("/api/jobs/{job_id}/sync-participants")
async def sync_participants_api(job_id: str, account_type: str, db: Session = Depends(get_db)):
    try:
        count = sync_participants(job_id, account_type, db)
        return {"status": "success", "count": count}
    except Exception as e:
        logger.exception("Sync failed")
        raise HTTPException(status_code=500, detail=str(e))
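The commit message also mentions an automatic quick-login link generator based on the access codes this sync stores. A minimal sketch of that idea; the URL pattern below is an illustrative assumption, not fotograf.de's verified login route:

```python
# Hedged sketch: build a quick-login link from a participant's Zugangscode.
# The route "https://fotograf.de/login/<code>" is an assumption.
from urllib.parse import quote

def quick_login_url(zugangscode: str) -> str:
    """Return a direct gallery-login link for one access code."""
    return f"https://fotograf.de/login/{quote(zugangscode.strip())}"

print(quick_login_url(" ABC-123 "))
# → https://fotograf.de/login/ABC-123
```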
@app.get("/api/jobs/{job_id}/generate-pdf")
async def generate_pdf(job_id: str, account_type: str, db: Session = Depends(get_db)):
    logger.info(f"API Request: Generate PDF for job {job_id} ({account_type})")


@@ -45,6 +45,24 @@ function App() {
const [isReminderRunning, setIsReminderRunning] = useState(false);
const [latestFile, setLatestFile] = useState<any>(null);
const [isGmailAuthenticated, setIsGmailAuthenticated] = useState(false);
const [isSyncing, setIsSyncing] = useState(false);

const handleSyncParticipants = async (job: Job) => {
  setIsSyncing(true);
  try {
    const response = await fetch(`${API_BASE_URL}/api/jobs/${job.id}/sync-participants?account_type=${activeTab}`, {
      method: 'POST'
    });
    if (response.ok) {
      alert("Daten erfolgreich mit Fotograf.de synchronisiert!");
    } else {
      alert("Synchronisierung fehlgeschlagen.");
    }
  } catch (e) {
    alert("Netzwerkfehler.");
  } finally {
    setIsSyncing(false);
  }
};
// Email States
const [reminderResult, setReminderResult] = useState<any[] | null>(null);
@@ -958,7 +976,25 @@ function App() {
    <option key={et.uri} value={et.name}>{et.name}</option>
  ))}
</select>
<p className="text-xs text-gray-500 mt-2 mb-4">Wird für QR-Karten und die Terminübersicht benötigt.</p>
<div className="pt-4 border-t border-gray-100">
  <button
    onClick={() => handleSyncParticipants(selectedJob)}
    disabled={isSyncing}
    className="w-full px-3 py-2 bg-white border border-indigo-200 text-indigo-600 text-xs font-bold rounded-lg hover:bg-indigo-50 transition-colors flex items-center justify-center gap-2"
  >
    {isSyncing ? (
      <>
        <svg className="animate-spin h-3 w-3" viewBox="0 0 24 24" fill="none"><circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" /><path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" /></svg>
        Sync läuft...
      </>
    ) : (
      <>🔄 Daten von Fotograf.de abgleichen</>
    )}
  </button>
  <p className="text-[10px] text-gray-400 mt-2">Aktualisiert E-Mails, Logins & Bestellstatus.</p>
</div>
</div>
{/* Actions */}