- Fixed a critical in the company-explorer by forcing a database re-initialization with a new file (). This ensures the application code is in sync with the database schema.
- Documented the schema mismatch incident and its resolution in MIGRATION_PLAN.md.
- Restored and enhanced BUILDER_APPS_MIGRATION.md by recovering extensive, valuable content from the git history that was accidentally deleted. The guide now again includes detailed troubleshooting steps and code templates for common migration pitfalls.
- Added logic to automatically flatten list-wrapped JSON responses from LLM in Impressum extraction.
- Fixed 'Unknown Legal Name' issue by ensuring property access on objects, not lists.
- Finalized v0.3.0 features and updated documentation with Lessons Learned.
- Increased logging verbosity in to track raw input to LLM and raw LLM response.
- This helps diagnose why Impressum data extraction might be failing for specific company websites.
- Enforced fresh scrape on 'Analyze' request to bypass stale cache.
- Implemented 2-Hop Impressum scraping strategy (via Kontakt page).
- Refined numeric extraction for German locale (thousands separators).
- Updated documentation with Lessons Learned.
- Updated version to v0.3.0 (UI & Backend) to clear potential caching confusion.
- Enhanced Impressum scraper to extract VAT ID (Umsatzsteuer-ID).
- Implemented 2-Hop scraping strategy: Looks for 'Kontakt' page if Impressum isn't on the start page.
- Added VAT ID display to the Legal Data block in Inspector.
- Implemented Impressum scraping with Root-URL fallback and enhanced keyword detection.
- Added 'clean_json_response' helper to strip Markdown from LLM outputs, preventing JSONDecodeErrors.
- Improved numeric extraction for German formatting (thousands separators vs decimals).
- Updated Inspector UI with Polling logic for auto-refresh and display of AI Dossier and Legal Data.
- Added Manual Override for Website URL.
- Ported robust Wikipedia extraction logic (categories, first paragraph) from legacy system.
- Implemented database-driven Robotics Category configuration with frontend settings UI.
- Updated Robotics Potential analysis to use Chain-of-Thought infrastructure reasoning.
- Added Manual Override features for Wikipedia URL (with locking) and Website URL (with re-scrape trigger).
- Enhanced Inspector UI with Wikipedia profile, category tags, and action buttons.
This commit introduces the foundational elements for the new "Company Explorer" web application, marking a significant step away from the legacy Google Sheets / CLI system.
Key changes include:
- Project Structure: A new directory with separate (FastAPI) and (React/Vite) components.
- Data Persistence: Migration from Google Sheets to a local SQLite database () using SQLAlchemy.
- Core Utilities: Extraction and cleanup of essential helper functions (LLM wrappers, text utilities) into .
- Backend Services: , , for AI-powered analysis, and logic.
- Frontend UI: Basic React application with company table, import wizard, and dynamic inspector sidebar.
- Docker Integration: Updated and for multi-stage builds and sideloading.
- Deployment & Access: Integrated into central Nginx proxy and dashboard, accessible via .
Lessons Learned & Fixed during development:
- Frontend Asset Loading: Addressed issues with Vite's path and FastAPI's .
- TypeScript Configuration: Added and .
- Database Schema Evolution: Solved errors by forcing a new database file and correcting override.
- Logging: Implemented robust file-based logging ().
This new foundation provides a powerful and maintainable platform for future B2B robotics lead generation.