[34588f42] Performance: Massive speed-up of the analysis via SQLite synchronization

- New table JobParticipant stores detailed CSV data from Fotograf.de.
- process_reminder_analysis and process_statistics now use the local database instead of Selenium crawling.
- New 'Daten abgleichen' (sync data) button integrated into the preparation tab.
- Automatic quick-login link generator based on access codes.
2026-04-18 13:49:03 +00:00
parent e6061868e6
commit 472f392107
4 changed files with 417 additions and 255 deletions

View File

@@ -1,6 +1,6 @@
# Fotograf.de Scraper & Management UI
**Status:** Production-Ready Microservice (Core Features: PDF List Generation, QR Cards, Shooting Schedule, **SQLite Data Sync**, **Gmail API Integration** & **Automated Release Requests**)
This service modernizes the old `Fotograf.de` scripts by providing a robust, web-based UI for managing and automating photo jobs. It is designed as a standalone microservice that runs independently of the main stack.
@@ -10,16 +10,22 @@ The service consists of two main components:
1. **Backend (Python / FastAPI / Selenium / SQLAlchemy):**
    * **Automation:** Uses Selenium to scrape `fotograf.de`.
    * **Persistence:** An SQLite database (`fotograf_jobs.db`) stores the job list, OAuth tokens (`GmailToken`), discount codes (`DiscountCode`), participant data (`ReleaseParticipant`), **job participants (`JobParticipant`)**, and the **sending history (`ReleaseHistory`)**.
    * **PDF engine:** Uses WeasyPrint for participant lists and ReportLab/PyPDF2 for precise PDF overlays (QR cards).
    * **API integration:** Direct connection to the **Calendly API (v2)** as well as the **Gmail API** for direct e-mail dispatch and automated webhook replies.
2. **Frontend (TypeScript / React / Vite / TailwindCSS):**
    * **Modern UI:** A fully responsive dashboard built with Tailwind CSS (tile layout, tabs for kindergarten/school).
    * **Workflow:** Tools are grouped into logical phases (Preparation, Follow-Up, Statistics) within each job's detail view.
## ✨ Core Features
### 🚀 Performance Optimization (SQLite Sync)
Instead of laboriously crawling through every photo album as before, the system now uses an intelligent synchronization:
* **One-click sync:** Via the "Daten von Fotograf.de abgleichen" button, the system downloads the detailed name list (CSV).
* **Local database:** All relevant information (parents' e-mail, login counts, order status, access codes) is stored in the `job_participants` table.
* **Lightning-fast analysis:** Follow-up e-mails and statistics are now generated directly from the database in seconds (instead of minutes).
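The speed-up comes from replacing a Selenium crawl with a local query. A minimal, self-contained sketch of that query against a reduced `job_participants` schema (illustrative data, plain `sqlite3` instead of the service's SQLAlchemy session):

```python
import sqlite3

# In-memory sketch of the `job_participants` table (subset of columns)
# and the follow-up query: non-buyers with at most one login and a parent e-mail.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_participants (
        job_id TEXT, vorname_kind TEXT, email_eltern TEXT,
        zugangscode TEXT, logins INTEGER, has_orders INTEGER
    )
""")
conn.executemany(
    "INSERT INTO job_participants VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("job1", "Mia",  "mia@example.com",  "ABC123", 0, 0),  # candidate
        ("job1", "Ben",  "ben@example.com",  "DEF456", 3, 1),  # already bought
        ("job1", "Lena", "lena@example.com", "GHI789", 1, 0),  # candidate
    ],
)
rows = conn.execute(
    "SELECT vorname_kind, email_eltern, zugangscode FROM job_participants "
    "WHERE job_id = ? AND has_orders = 0 AND logins <= 1 AND email_eltern != ''",
    ("job1",),
).fetchall()
print(rows)  # two candidates, answered in milliseconds instead of a Selenium crawl
```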
### Feature 1: Participant Lists (Complete)
Automated workflow for downloading and formatting the registration lists from `fotograf.de` as a sorted PDF, including "Kinderfotos Erding" branding.
@@ -28,43 +34,37 @@ Special module for family mini shootings:
* **QR card imprint:** Precise overlay of name, number of children, and time slot, including an automatic **consent checkbox (☑)** from Calendly data.
* **Schedule overview:** Generates an A4 table for the shooting day in 6-minute slots, including gap fillers.
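The 6-minute grid with gap fillers could be generated along these lines (a sketch; the function name and the `--- frei ---` gap label are illustrative, not taken from the service):

```python
from datetime import datetime, timedelta

def build_schedule(start: datetime, end: datetime, bookings: dict[str, str]) -> list[tuple[str, str]]:
    """Build a 6-minute-slot day plan; unbooked slots become gap fillers."""
    slots = []
    t = start
    while t < end:
        key = t.strftime("%H:%M")
        slots.append((key, bookings.get(key, "--- frei ---")))  # gap filler
        t += timedelta(minutes=6)
    return slots

plan = build_schedule(
    datetime(2026, 4, 18, 9, 0), datetime(2026, 4, 18, 9, 30),
    {"09:00": "Familie Huber (2 Kinder)", "09:12": "Familie Maier (1 Kind)"},
)
# 5 slots: 09:00 and 09:12 booked, the rest filled with the gap label
```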
### Feature 3: Follow-Up E-Mails & Direct Gmail Dispatch (Optimized)
Identifies non-buyers (0-1 logins, no order) based on the synchronized database data.
* **Preview mode:** Lets you click through the personalized e-mail for each recipient before the actual dispatch.
* **Quick-login automation:** The login links (`https://www.kinderfotos-erding.de/a/{code}`) are generated automatically.
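The quick-login links follow the pattern shown above; a tiny helper that builds the URL and the HTML snippet embedded in the follow-up e-mails might look like this (helper names are hypothetical):

```python
def quick_login_link(zugangscode: str) -> str:
    """Build the quick-login URL for an access code (pattern from the README)."""
    return f"https://www.kinderfotos-erding.de/a/{zugangscode.strip()}"

def quick_login_html(child_name: str, zugangscode: str) -> str:
    # HTML snippet as embedded in the follow-up e-mails
    return f'<a href="{quick_login_link(zugangscode)}">Fotos von {child_name}</a>'

print(quick_login_html("Mia", " ABC123 "))
# <a href="https://www.kinderfotos-erding.de/a/ABC123">Fotos von Mia</a>
```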
### Feature 4: Sales Statistics (Optimized)
Detailed analysis of purchasing behavior per group/class based on the local database entries.
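The grouping behind these statistics can be sketched on plain dicts (a simplified stand-in for the database rows; the key names follow the result columns used by the backend):

```python
def group_statistics(participants: list[dict]) -> list[dict]:
    """Aggregate per-group participant and purchase counts, as the DB-backed statistics do."""
    groups: dict[str, dict] = {}
    for p in participants:
        name = p.get("gruppe") or "Unbekannt"
        g = groups.setdefault(name, {"Album": name, "Kinder_insgesamt": 0, "Kinder_mit_Käufen": 0})
        g["Kinder_insgesamt"] += 1
        if p.get("has_orders"):
            g["Kinder_mit_Käufen"] += 1
    # Sort alphabetically by group name, like the backend result
    return sorted(groups.values(), key=lambda g: g["Album"])

stats = group_statistics([
    {"gruppe": "Igel", "has_orders": 1},
    {"gruppe": "Igel", "has_orders": 0},
    {"gruppe": "Bären", "has_orders": 1},
])
```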
### Feature 5: Siblings List (Per Institution) (Complete)
Tool for identifying sibling groups within an institution, including a cross-check against Calendly bookings and special sibling QR cards.
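One plausible heuristic for finding sibling groups is to bucket children by the parents' e-mail address (a sketch under that assumption; the real tool additionally cross-checks Calendly bookings):

```python
from collections import defaultdict

def sibling_groups(participants: list[dict]) -> list[list[str]]:
    """Group children sharing the same parent e-mail; buckets of 2+ are sibling groups."""
    by_parent: dict[str, list[str]] = defaultdict(list)
    for p in participants:
        # Normalize the e-mail so case/whitespace variants land in one bucket
        email = (p.get("email_eltern") or "").strip().lower()
        if email:
            by_parent[email].append(p["vorname_kind"])
    return [kids for kids in by_parent.values() if len(kids) >= 2]

groups = sibling_groups([
    {"vorname_kind": "Mia",  "email_eltern": "huber@example.com"},
    {"vorname_kind": "Ben",  "email_eltern": "Huber@example.com "},
    {"vorname_kind": "Lena", "email_eltern": "maier@example.com"},
])
print(groups)  # [['Mia', 'Ben']]
```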
### Feature 6: Release Requests & Voucher Automation (Complete)
Fully automated GDPR workflow for obtaining publication consent:
* **Lean dispatch:** Manual entry of recipients (e-mail, first name, children's names) with an **e-mail preview**.
* **Intelligent personalization:** Automatic clean-up of institution names (removes "Kindergarten" and year numbers).
* **Scheduled sending:** Configurable dispatch time (Berlin timezone) via background tasks.
* **Webhook integration:** Direct connection to **Google Forms**. When the release form is submitted, a voucher code is automatically reserved and a thank-you e-mail is sent.
* **Reply overview:** Table of all received releases, including the assigned code and timestamp.
* **Voucher management:** UI for uploading and monitoring the voucher pool.
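The webhook's voucher step (reserve the first free code and mark it as used) can be sketched with a minimal pool table; table and column names are illustrative:

```python
import sqlite3

# Hypothetical sketch of the voucher pool with two free codes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discount_codes (code TEXT PRIMARY KEY, used INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO discount_codes (code) VALUES (?)", [("GUT-001",), ("GUT-002",)])

def reserve_voucher(conn: sqlite3.Connection):
    """Reserve the first unused code, or return None if the pool is exhausted."""
    row = conn.execute("SELECT code FROM discount_codes WHERE used = 0 LIMIT 1").fetchone()
    if row is None:
        return None  # pool exhausted; the real service should alert on this
    conn.execute("UPDATE discount_codes SET used = 1 WHERE code = ?", (row[0],))
    return row[0]

first = reserve_voucher(conn)
second = reserve_voucher(conn)
third = reserve_voucher(conn)
print(first, second, third)  # GUT-001 GUT-002 None
```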
---
## 🛠️ Technical Details & Security
* **BCC control:** Every e-mail sent by the system automatically includes a blind copy (BCC) to `kontakt@kinderfotos-erding.de`.
* **Sending history:** All dispatches (number of recipients, timestamp) are logged in the `release_history` table.
* **Safe test mode:** Via `DEV_MODE_EMAIL_RECIPIENT`, all outgoing e-mails can be globally redirected to a test address.
* **Time zones:** Consistent use of `Europe/Berlin`.
* **Gmail OAuth:** Refresh tokens are stored persistently in the database.
## 🚀 Deployment & Configuration
The service is managed via the project's main `docker-compose.yml`.
### Environment Variables (`.env`)
Important new variables in `/fotograf-de-scraper/.env`:
* `DEV_MODE_EMAIL_RECIPIENT`: (Optional) E-mail address for redirection during testing.
* `google_fotograf_client_id` / `google_fotograf_secret`: OAuth credentials.
* `CALENDLY_TOKEN`: API access.
### URLs
* **Frontend:** `https://floke-ai.duckdns.org/fotograf-de/`
* **Webhook for Google Forms:** `https://floke-ai.duckdns.org/fotograf-de-api/api/publish-request/webhook`

View File

@@ -49,6 +49,22 @@ class ReleaseHistory(Base):
    recipient_count = Column(Integer)
    scheduled_time = Column(String, nullable=True)

class JobParticipant(Base):
    __tablename__ = "job_participants"
    id = Column(Integer, primary_key=True)
    job_id = Column(String, index=True)
    child_id = Column(String, nullable=True)
    vorname_kind = Column(String, nullable=True)
    nachname_kind = Column(String, nullable=True)
    vorname_eltern = Column(String, nullable=True)
    nachname_eltern = Column(String, nullable=True)
    email_eltern = Column(String, nullable=True)
    zugangscode = Column(String, index=True)
    gruppe = Column(String, nullable=True)
    logins = Column(Integer, default=0)
    has_orders = Column(Integer, default=0)  # 0 for false, 1 for true
    last_synced = Column(DateTime, default=datetime.datetime.utcnow)

Base.metadata.create_all(bind=engine)

def get_db():

View File

@@ -16,7 +16,7 @@ from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse
from typing import List, Dict, Any, Optional
from sqlalchemy.orm import Session
from database import get_db, Job as DBJob, engine, Base, JobParticipant, SessionLocal
import math
import uuid
@@ -141,6 +141,120 @@ def get_logo_base64():
    logger.warning(f"Logo file not found at {logo_path}")
    return None

def sync_job_participants(job_id: str, account_type: str, db: Session):
    logger.info(f"Syncing participants for job {job_id} ({account_type})")
    username = os.getenv(f"{account_type.upper()}_USER")
    password = os.getenv(f"{account_type.upper()}_PW")
    with tempfile.TemporaryDirectory() as temp_dir:
        driver = setup_driver(download_path=temp_dir)
        try:
            if not login(driver, username, password):
                raise Exception("Login failed during sync.")
            # Navigate to job names list
            job_url = f"https://app.fotograf.de/config_jobs_settings/index/{job_id}"
            driver.get(job_url)
            wait = WebDriverWait(driver, 20)
            personen_tab = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "[data-qa-id='link:photo-jobs-tabs-names_list']")))
            driver.execute_script("arguments[0].click();", personen_tab)
            export_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_dropdown"])))
            driver.execute_script("arguments[0].click();", export_btn)
            time.sleep(1)
            csv_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_csv_link"])))
            driver.execute_script("arguments[0].click();", csv_btn)
            # Wait for download
            csv_file = None
            for _ in range(30):
                files = [f for f in os.listdir(temp_dir) if f.endswith('.csv')]
                if files:
                    csv_file = os.path.join(temp_dir, files[0])
                    break
                time.sleep(1)
            if not csv_file:
                raise Exception("CSV download timeout during sync.")
            # Parse CSV
            df = None
            for sep in [";", ","]:
                try:
                    df = pd.read_csv(csv_file, sep=sep, encoding="utf-8-sig")
                    if len(df.columns) > 1: break
                except: continue
            if df is None:
                df = pd.read_csv(csv_file, sep=";", encoding="latin1")
            # Clean columns
            df.columns = df.columns.str.strip().str.replace("\"", "")
            # Map columns - based on user feedback
            # Expected columns: Child ID, Email der Eltern (1), Vorname Eltern (1), Nachname Eltern (1), Vorname Kind, Zugangscode (1), Logins (1), Bestellungen
            def get_col(df, patterns):
                for p in patterns:
                    for col in df.columns:
                        if p.lower() in col.lower():
                            return col
                return None
            col_child_id = get_col(df, ["Child ID"])
            col_email = get_col(df, ["Email der Eltern", "E-Mail der Eltern"])
            col_parent_vn = get_col(df, ["Vorname Eltern", "Parent First Name"])
            col_parent_nn = get_col(df, ["Nachname Eltern", "Parent Last Name"])
            col_child_vn = get_col(df, ["Vorname Kind", "Child First Name"])
            col_child_nn = get_col(df, ["Nachname Kind", "Child Last Name"])
            col_code = get_col(df, ["Zugangscode", "Access Code"])
            col_group = get_col(df, ["Gruppe", "Klasse", "Group", "Class"])
            col_logins = get_col(df, ["Logins"])
            col_orders = get_col(df, ["Bestellungen", "Orders"])
            # Delete old entries for this job
            db.query(JobParticipant).filter(JobParticipant.job_id == job_id).delete()
            added = 0
            for _, row in df.iterrows():
                try:
                    logins_val = 0
                    try: logins_val = int(row[col_logins]) if col_logins and pd.notna(row[col_logins]) else 0
                    except: pass
                    orders_val = 0
                    if col_orders and pd.notna(row[col_orders]):
                        val = str(row[col_orders]).lower()
                        if val and val != "0" and val != "nein" and val != "false":
                            orders_val = 1
                    participant = JobParticipant(
                        job_id=job_id,
                        child_id=str(row[col_child_id]) if col_child_id and pd.notna(row[col_child_id]) else None,
                        vorname_kind=str(row[col_child_vn]) if col_child_vn and pd.notna(row[col_child_vn]) else None,
                        nachname_kind=str(row[col_child_nn]) if col_child_nn and pd.notna(row[col_child_nn]) else None,
                        vorname_eltern=str(row[col_parent_vn]) if col_parent_vn and pd.notna(row[col_parent_vn]) else None,
                        nachname_eltern=str(row[col_parent_nn]) if col_parent_nn and pd.notna(row[col_parent_nn]) else None,
                        email_eltern=str(row[col_email]).strip().lower() if col_email and pd.notna(row[col_email]) else None,
                        zugangscode=str(row[col_code]) if col_code and pd.notna(row[col_code]) else None,
                        gruppe=str(row[col_group]) if col_group and pd.notna(row[col_group]) else None,
                        logins=logins_val,
                        has_orders=orders_val
                    )
                    db.add(participant)
                    added += 1
                except Exception as e:
                    logger.warning(f"Error adding participant row: {e}")
            db.commit()
            logger.info(f"Sync complete. {added} participants stored for job {job_id}")
            return added
        finally:
            driver.quit()

def generate_pdf_from_csv(csv_path: str, institution: str, date_info: str, list_type: str, output_path: str):
    logger.info(f"Generating PDF for {institution} from {csv_path}")
    df = None
@@ -488,264 +602,142 @@ def get_jobs_list(driver) -> List[Dict[str, Any]]:
task_store: Dict[str, Dict[str, Any]] = {}

def process_statistics(task_id: str, job_id: str, account_type: str):
    logger.info(f"Task {task_id}: Starting fast statistics calculation for job {job_id}")
    task_store[task_id] = {"status": "running", "progress": "Synchronisiere Daten von Fotograf.de...", "result": None}
    db = SessionLocal()
    try:
        # 1. Sync data from CSV
        try:
            sync_participants(job_id, account_type, db)
        except Exception as sync_err:
            logger.error(f"Sync failed during statistics: {sync_err}")
            count = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).count()
            if count == 0:
                task_store[task_id] = {"status": "error", "progress": f"Synchronisierung fehlgeschlagen: {str(sync_err)}"}
                return
        # 2. Query DB and group by 'gruppe'
        task_store[task_id]["progress"] = "Berechne Statistiken..."
        # Get all participants for this job
        participants = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).all()
        # Group by group
        groups = {}
        for p in participants:
            g_name = p.gruppe or "Unbekannt"
            if g_name not in groups:
                groups[g_name] = {
                    "Album": g_name,
                    "Kinder_insgesamt": 0,
                    "Kinder_mit_Käufen": 0,
                    "Kinder_Alle_Bilder_gekauft": 0  # Not available in CSV, setting to 0 or estimates
                }
            groups[g_name]["Kinder_insgesamt"] += 1
            if p.has_orders:
                groups[g_name]["Kinder_mit_Käufen"] += 1
        statistics = list(groups.values())
        statistics.sort(key=lambda x: x["Album"])
        task_store[task_id] = {
            "status": "completed",
            "progress": "Statistik erfolgreich berechnet!",
            "result": statistics
        }
    except Exception as e:
        logger.exception(f"Unexpected error in statistics task {task_id}")
        task_store[task_id] = {"status": "error", "progress": f"Unerwarteter Fehler: {str(e)}"}
    finally:
        db.close()
def process_reminder_analysis(task_id: str, job_id: str, account_type: str):
    logger.info(f"Task {task_id}: Starting fast reminder analysis for job {job_id}")
    task_store[task_id] = {"status": "running", "progress": "Synchronisiere Daten von Fotograf.de...", "result": None}
    db = SessionLocal()
    try:
        # 1. Sync data from CSV (This takes ~20s and gets all parent emails, logins and orders)
        try:
            sync_participants(job_id, account_type, db)
        except Exception as sync_err:
            logger.error(f"Sync failed during reminder analysis: {sync_err}")
            # Continue anyway if we have some data, or fail if we have none
            count = db.query(JobParticipant).filter(JobParticipant.job_id == job_id).count()
            if count == 0:
                task_store[task_id] = {"status": "error", "progress": f"Synchronisierung fehlgeschlagen: {str(sync_err)}"}
                return
        # 2. Query DB for potential candidates (Logins <= 1 and No Orders)
        task_store[task_id]["progress"] = "Analysiere Datenbank-Einträge..."
        candidates = db.query(JobParticipant).filter(
            JobParticipant.job_id == job_id,
            JobParticipant.has_orders == 0,
            JobParticipant.logins <= 1,
            JobParticipant.email_eltern != "",
            JobParticipant.email_eltern != None
        ).all()
        if not candidates:
            task_store[task_id] = {
                "status": "completed",
                "progress": "Keine passenden Empfänger (0-1 Logins, keine Bestellung) gefunden.",
                "result": []
            }
            return
        # 3. Aggregate results by Email
        aggregation = {}
        for c in candidates:
            email = c.email_eltern
            if email not in aggregation:
                aggregation[email] = {
                    "email": email,
                    "parent_name": c.vorname_eltern if c.vorname_eltern else "Liebe Eltern",
                    "children": [],
                    "links": []
                }
            # Add child name
            child_name = c.vorname_kind or ""
            child_label = "Familienbilder" if child_name.lower() == "familie" else child_name
            if child_label and child_label not in aggregation[email]["children"]:
                aggregation[email]["children"].append(child_label)
            # Add Quick Login Link
            link = f"https://www.kinderfotos-erding.de/a/{c.zugangscode}"
            html_link = f'<a href="{link}">Fotos von {child_label}</a>'
            if html_link not in aggregation[email]["links"]:
                aggregation[email]["links"].append(html_link)
        # 4. Format for Supermailer/Gmail
        final_result = []
        for email, data in aggregation.items():
            children_str = " und ".join(data["children"]) if len(data["children"]) > 1 else (data["children"][0] if data["children"] else "Eurem Kind")
            links_html = "".join([f"{l}<br>" for l in data["links"]])
            final_result.append({
                "E-Mail-Adresse Käufer": email,
                "Name Käufer": data["parent_name"],
                "Kindernamen": children_str,
                "Anzahl Kinder": len(data["children"]),
                "LinksHTML": links_html
            })
        task_store[task_id] = {
            "status": "completed",
            "progress": f"Analyse fertig! {len(final_result)} Empfänger identifiziert.",
            "result": final_result
        }
    except Exception as e:
        logger.exception(f"Error in task {task_id}")
        task_store[task_id] = {"status": "error", "progress": f"Fehler: {str(e)}"}
    finally:
        db.close()
from fastapi import FastAPI, HTTPException, Depends, BackgroundTasks, UploadFile, File, Form
from fastapi.middleware.cors import CORSMiddleware
@@ -1092,6 +1084,124 @@ async def send_bulk_emails(request: BulkEmailRequest, db: Session = Depends(get_
        "failed": failed_emails
    }
def sync_participants(job_id: str, account_type: str, db: Session):
    logger.info(f"Syncing participants for job {job_id} ({account_type})")
    username = os.getenv(f"{account_type.upper()}_USER")
    password = os.getenv(f"{account_type.upper()}_PW")
    with tempfile.TemporaryDirectory() as temp_dir:
        driver = setup_driver(download_path=temp_dir)
        try:
            if not login(driver, username, password):
                raise Exception("Login failed.")
            # Navigate to the Persons tab
            job_url = f"https://app.fotograf.de/config_jobs_settings/index/{job_id}"
            driver.get(job_url)
            wait = WebDriverWait(driver, 30)
            personen_tab = wait.until(EC.presence_of_element_located(
                (By.CSS_SELECTOR, "[data-qa-id='link:photo-jobs-tabs-names_list']")))
            driver.execute_script("arguments[0].click();", personen_tab)
            # Click Export -> CSV
            export_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_dropdown"])))
            driver.execute_script("arguments[0].click();", export_btn)
            time.sleep(1)
            csv_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["export_csv_link"])))
            driver.execute_script("arguments[0].click();", csv_btn)
            # Wait up to 45 seconds for the download to appear
            csv_file = None
            for _ in range(45):
                csv_files = [f for f in os.listdir(temp_dir) if f.endswith(".csv")]
                if csv_files:
                    csv_file = os.path.join(temp_dir, csv_files[0])
                    break
                time.sleep(1)
            if not csv_file:
                raise Exception("CSV download timed out.")
            # Read the CSV with pandas, trying both common separators
            df = None
            for sep in [";", ","]:
                try:
                    df = pd.read_csv(csv_file, sep=sep, encoding="utf-8-sig")
                    if len(df.columns) > 1:
                        break
                except Exception:
                    continue
            if df is None:
                raise Exception("Could not parse CSV.")
            # Clean column names
            df.columns = df.columns.str.strip().str.replace('"', "")
            logger.debug(f"Sync CSV Columns: {list(df.columns)}")
            # Column mapping (reference for the expected CSV headers;
            # the upsert loop below reads the raw column names directly)
            mapping = {
                "Child ID": "child_id",
                "Email der Eltern (1)": "email_eltern",
                "Vorname Eltern (1)": "vorname_eltern",
                "Nachname Eltern (1)": "nachname_eltern",
                "Vorname Kind": "vorname_kind",
                "Nachname Kind": "nachname_kind",
                "Zugangscode (1)": "zugangscode",
                "Logins (1)": "logins",
                "Bestellungen": "has_orders",
                "Gruppe": "gruppe",
                "Klasse": "gruppe",
            }

            def clean_val(val):
                v = str(val).strip()
                return "" if v.lower() == "nan" else v

            # Upsert into database; rows without an access code are skipped
            synced = 0
            for _, row in df.iterrows():
                code = str(row.get("Zugangscode (1)", "")).strip()
                if not code or code == "nan":
                    continue
                # Determine order status ("0", "nan" and empty mean no orders)
                orders_val = str(row.get("Bestellungen", "0")).lower()
                has_orders = 0 if orders_val in ("0", "nan", "") else 1
                # Determine login count (the CSV may contain floats like "3.0")
                try:
                    logins = int(float(row.get("Logins (1)", 0)))
                except (TypeError, ValueError):
                    logins = 0
                participant = db.query(JobParticipant).filter(
                    JobParticipant.job_id == job_id,
                    JobParticipant.zugangscode == code).first()
                if not participant:
                    participant = JobParticipant(job_id=job_id, zugangscode=code)
                    db.add(participant)
                participant.child_id = clean_val(row.get("Child ID"))
                participant.vorname_kind = clean_val(row.get("Vorname Kind"))
                participant.nachname_kind = clean_val(row.get("Nachname Kind"))
                participant.vorname_eltern = clean_val(row.get("Vorname Eltern (1)"))
                participant.nachname_eltern = clean_val(row.get("Nachname Eltern (1)"))
                participant.email_eltern = clean_val(row.get("Email der Eltern (1)")).lower()
                participant.gruppe = clean_val(row.get("Gruppe", row.get("Klasse")))
                participant.logins = logins
                participant.has_orders = has_orders
                participant.last_synced = datetime.datetime.utcnow()
                synced += 1
            db.commit()
            logger.info(f"Successfully synced {synced} of {len(df)} rows for job {job_id}")
            return synced
        finally:
            driver.quit()
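The upsert above presupposes a `JobParticipant` ORM model. A minimal sketch of what that model likely looks like, with field names inferred from the sync code; the table name, column types, and defaults are assumptions, not the project's actual definition:

```python
# Hedged sketch of the JobParticipant model assumed by sync_participants.
# Field names come from the upsert code; __tablename__, types and defaults
# are guesses.
import datetime

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class JobParticipant(Base):
    __tablename__ = "job_participants"  # assumed name
    id = Column(Integer, primary_key=True)
    job_id = Column(String, index=True)
    zugangscode = Column(String, index=True)  # lookup key together with job_id
    child_id = Column(String, default="")
    vorname_kind = Column(String, default="")
    nachname_kind = Column(String, default="")
    vorname_eltern = Column(String, default="")
    nachname_eltern = Column(String, default="")
    email_eltern = Column(String, default="")
    gruppe = Column(String, default="")
    logins = Column(Integer, default=0)
    has_orders = Column(Integer, default=0)
    last_synced = Column(DateTime, default=datetime.datetime.utcnow)
```

Because `(job_id, zugangscode)` is the lookup key of the upsert, a unique constraint over that pair would be a natural addition.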
@app.post("/api/jobs/{job_id}/sync-participants")
async def sync_participants_api(job_id: str, account_type: str, db: Session = Depends(get_db)):
    try:
        count = sync_participants(job_id, account_type, db)
        return {"status": "success", "count": count}
    except Exception as e:
        logger.exception("Sync failed")
        raise HTTPException(status_code=500, detail=str(e))
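The commit message also mentions an automatic quick-login link generator based on the access codes this sync stores. A minimal sketch of that idea; the URL pattern below is an illustrative assumption, not fotograf.de's verified login route:

```python
# Hedged sketch: build a quick-login link from a participant's Zugangscode.
# The route "https://fotograf.de/login/<code>" is an assumption.
from urllib.parse import quote

def quick_login_url(zugangscode: str) -> str:
    """Return a direct gallery-login link for one access code."""
    return f"https://fotograf.de/login/{quote(zugangscode.strip())}"

print(quick_login_url(" ABC-123 "))
# → https://fotograf.de/login/ABC-123
```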
@app.get("/api/jobs/{job_id}/generate-pdf")
async def generate_pdf(job_id: str, account_type: str, db: Session = Depends(get_db)):
    logger.info(f"API Request: Generate PDF for job {job_id} ({account_type})")


@@ -45,6 +45,24 @@ function App() {
const [isReminderRunning, setIsReminderRunning] = useState(false);
const [latestFile, setLatestFile] = useState<any>(null);
const [isGmailAuthenticated, setIsGmailAuthenticated] = useState(false);
const [isSyncing, setIsSyncing] = useState(false);

const handleSyncParticipants = async (job: Job) => {
  setIsSyncing(true);
  try {
    const response = await fetch(`${API_BASE_URL}/api/jobs/${job.id}/sync-participants?account_type=${activeTab}`, {
      method: 'POST'
    });
    if (response.ok) {
      alert("Daten erfolgreich mit Fotograf.de synchronisiert!");
    } else {
      alert("Synchronisierung fehlgeschlagen.");
    }
  } catch (e) {
    alert("Netzwerkfehler.");
  } finally {
    setIsSyncing(false);
  }
};
// Email States
const [reminderResult, setReminderResult] = useState<any[] | null>(null);
@@ -958,7 +976,25 @@ function App() {
    <option key={et.uri} value={et.name}>{et.name}</option>
  ))}
</select>
<p className="text-xs text-gray-500 mt-2 mb-4">Wird für QR-Karten und die Terminübersicht benötigt.</p>
<div className="pt-4 border-t border-gray-100">
  <button
    onClick={() => handleSyncParticipants(selectedJob)}
    disabled={isSyncing}
    className="w-full px-3 py-2 bg-white border border-indigo-200 text-indigo-600 text-xs font-bold rounded-lg hover:bg-indigo-50 transition-colors flex items-center justify-center gap-2"
  >
    {isSyncing ? (
      <>
        <svg className="animate-spin h-3 w-3" viewBox="0 0 24 24" fill="none"><circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" /><path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" /></svg>
        Sync läuft...
      </>
    ) : (
      <>🔄 Daten von Fotograf.de abgleichen</>
    )}
  </button>
  <p className="text-[10px] text-gray-400 mt-2">Aktualisiert E-Mails, Logins & Bestellstatus.</p>
</div>
</div>
{/* Actions */}