feat(company-explorer): force-refresh analysis and refine extraction logic

- Enforced fresh scrape on 'Analyze' request to bypass stale cache.
- Implemented 2-Hop Impressum scraping strategy (via Kontakt page).
- Refined numeric extraction for German locale (thousands separators).
- Updated documentation with Lessons Learned.
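
For reviewers unfamiliar with the German number format, the sketch below shows the kind of normalization the numeric-extraction item refers to: `.` groups thousands and `,` marks decimals. It is illustrative only. `parse_german_number` is a hypothetical helper name and the regex is an assumption, not the extraction code in this commit.

```python
import re
from typing import Optional


def parse_german_number(text: str) -> Optional[float]:
    """Return the first German-formatted number found in `text`, or None.

    Hypothetical helper for illustration. Assumes '.' is only ever a
    thousands separator and ',' a decimal separator; English-style
    decimals such as '1.5' are NOT handled correctly.
    """
    # First alternative: grouped digits such as 1.234.567 with optional ',89'.
    # Second alternative: plain digits with an optional decimal comma.
    match = re.search(r"\d{1,3}(?:\.\d{3})+(?:,\d+)?|\d+(?:,\d+)?", text)
    if match is None:
        return None
    raw = match.group(0)
    # Drop thousands separators, then turn the decimal comma into a point.
    normalized = raw.replace(".", "").replace(",", ".")
    return float(normalized)


print(parse_german_number("ca. 1.500 Mitarbeiter"))  # 1500.0
print(parse_german_number("Umsatz: 12,5 Mio. EUR"))  # 12.5
```

Matching the grouped form first keeps `1.500` from being read as the single digit `1` followed by stray text.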
@@ -324,6 +324,19 @@ def analyze_company(req: AnalysisRequest, background_tasks: BackgroundTasks, db:
     if not company.website or company.website == "k.A.":
         return {"error": "No website to analyze. Run Discovery first."}
 
+    # FORCE SCRAPE LOGIC
+    # If explicit force_scrape is requested OR if we want to ensure fresh data for debugging
+    # We delete the old scrape data.
+    # For now, let's assume every manual "Analyze" click implies a desire for fresh results if previous failed.
+    # But let's respect the flag from frontend if we add it later.
+
+    # Always clearing scrape data for now to fix the "stuck cache" issue reported by user
+    db.query(EnrichmentData).filter(
+        EnrichmentData.company_id == company.id,
+        EnrichmentData.source_type == "website_scrape"
+    ).delete()
+    db.commit()
+
     background_tasks.add_task(run_analysis_task, company.id, company.website)
     return {"status": "queued"}
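
The comments in the hunk mention possibly honoring a `force_scrape` flag from the frontend later. A minimal sketch of how that could look is given below, using an in-memory stand-in instead of the real SQLAlchemy session so it runs on its own. `ScrapeCache`, `queue_analysis`, and the `force_scrape` parameter are assumptions for illustration and are not part of this commit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ScrapeCache:
    """In-memory stand-in for the EnrichmentData table (illustration only)."""
    rows: List[Dict] = field(default_factory=list)

    def delete_scrapes(self, company_id: int) -> int:
        """Remove cached website scrapes for one company; return how many were removed."""
        before = len(self.rows)
        self.rows = [
            row for row in self.rows
            if not (row["company_id"] == company_id
                    and row["source_type"] == "website_scrape")
        ]
        return before - len(self.rows)


def queue_analysis(
    cache: ScrapeCache,
    company_id: int,
    website: str,
    enqueue: Callable[[int, str], None],
    force_scrape: bool = True,  # hypothetical flag; True mirrors the current "always clear" behavior
) -> Dict[str, str]:
    """Validate the website, optionally clear stale scrape data, then queue the task."""
    if not website or website == "k.A.":
        return {"error": "No website to analyze. Run Discovery first."}
    if force_scrape:
        # Only website scrapes are dropped; other enrichment sources stay cached.
        cache.delete_scrapes(company_id)
    enqueue(company_id, website)
    return {"status": "queued"}
```

Defaulting the flag to `True` would keep today's behavior unchanged while letting a later frontend change opt out of the forced refresh.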