Commit Graph

393 Commits

Author SHA1 Message Date
e1bfa5d853 bugfix 2025-04-25 13:05:14 +00:00
26508f945d bugfix 2025-04-25 12:55:53 +00:00
3d4e06867d bugfix 2025-04-25 12:44:42 +00:00
c6c0257407 v1.7.0: Refactor into modular classes & flexible UI
- Restructured codebase into modular classes (DataProcessor, Handlers)
- Centralized processing logic in DataProcessor class
- Implemented flexible step selection via flags in single row processing
- Added detailed timestamp/status checks for conditional step execution
- Integrated batch processing methods into DataProcessor
- Developed robust CLI/interactive user interface for mode selection
- Added new modes: find_wiki_serp, website_details, wiki_reextract_missing_an, combined_all
- Enhanced reeval & full_run modes with granular step control
- Improved logging with file output and better detail
- Consolidated & refined helper functions and external API calls
- Updated column mapping with new timestamps
- Revised ML model loading, prediction, and training data prep
2025-04-25 12:40:18 +00:00
73a258daf0 bugfix 2025-04-24 18:27:41 +00:00
1da3bf82d1 bugfix 2025-04-24 18:24:43 +00:00
e882ea226f bugfix 2025-04-24 18:20:15 +00:00
2b0b1ade79 bugfix 2025-04-24 18:12:21 +00:00
495c883a60 bugfix 2025-04-24 18:10:23 +00:00
5b81eb9851 bugfix 2025-04-24 18:08:24 +00:00
3c5bc42f75 bugfix 2025-04-24 18:06:28 +00:00
4fcec7c2c9 bugfix 2025-04-24 18:04:30 +00:00
f0084493e2 bugfix 2025-04-24 18:02:08 +00:00
36cf9e28f0 bugfix 2025-04-24 18:00:24 +00:00
e26067d815 bugfix 2025-04-24 17:58:25 +00:00
26a5b42d32 bugfix 2025-04-24 17:55:45 +00:00
98d015cc37 bugfix 2025-04-24 17:28:08 +00:00
c0702fce97 bugfix 2025-04-24 17:08:48 +00:00
95f7e4a061 bugfix 2025-04-24 17:01:31 +00:00
4948c4f52e bugfix 2025-04-24 16:55:18 +00:00
ba6e1ede9f bugfix 2025-04-24 16:49:43 +00:00
47bc20b0a4 bugfix 2025-04-24 16:36:07 +00:00
ac50b760e9 bugfix 2025-04-24 16:29:53 +00:00
d86bcf3725 bugfix 2025-04-24 16:17:17 +00:00
f8d2f9a7d6 bugfix 2025-04-24 15:45:46 +00:00
1c277928ea bugfix 2025-04-24 15:24:45 +00:00
485ba1cc70 bugfix 2025-04-24 15:18:25 +00:00
eace029dda debug 2025-04-24 15:05:35 +00:00
b83b584e5b debug 2025-04-24 14:57:48 +00:00
55e712733b bugfix 2025-04-24 14:42:12 +00:00
cf2e611212 bugfix 2025-04-24 14:41:06 +00:00
8557144b1e bugfix 2025-04-24 14:39:50 +00:00
8f3cb16db4 bugfix 2025-04-24 06:33:57 +00:00
37100fe8a8 bugfix 2025-04-24 06:32:25 +00:00
ea015edb57 bugfix 2025-04-24 06:12:04 +00:00
43476983a4 bugfix 2025-04-24 05:59:51 +00:00
76f1c72cc3 v1.7.0: Major refactoring for flexible processing modes and UI.
This version introduces significant structural changes to improve code maintainability and user flexibility by centralizing processing logic within the DataProcessor class and implementing a new menu-driven user interface with granular control over processing steps and row selection.

- Increment version number to v1.7.0.
- Major Structural Refactoring:
    - DataProcessor Centralization: Move the core processing logic for sequential runs, re-evaluation, batch modes, and specific data lookups/updates into the `DataProcessor` class as methods.
    - Resolve AttributeErrors: Correct the indentation for all methods belonging to the `DataProcessor` class to ensure they are correctly defined within the class scope.
    - Fix DataProcessor Initialization: Update the `DataProcessor.__init__` signature and implementation to accept and store required handler instances (e.g., `GoogleSheetHandler`, `WikipediaScraper`).
- New User Interface:
    - Menu-Driven Dispatcher: Implement a new `run_user_interface` function to replace the old `main` logic block. This function provides an interactive, multi-level numeric menu for selecting processing modes and parameters. It can also process direct CLI arguments.
    - Simplified Main: The `main` function is reduced to handling initial setup (Config, Logging, Handlers, DataProcessor instantiation) and then calling `run_user_interface`.
- Granular Processing Control:
    - Step Selection: Implement the ability for users to select specific processing steps (grouped logically, e.g., 'website', 'wiki', 'chatgpt') for execution within sequential, re-evaluation, and criteria-based modes.
    - Flags for Steps: Adapt the `_process_single_row` method and the methods that call it (`process_reevaluation_rows`, `process_sequential`, `process_rows_matching_criteria`) to accept and utilize flags (e.g., `process_wiki`, `process_chatgpt`) to control which processing blocks are attempted for a given row.
    - Refined Step Logic: Ensure processing blocks within `_process_single_row` correctly check their corresponding step flag *and* the necessary timestamp/status conditions (unless `force_reeval` is active).
- New Processing Modes:
    - Criteria Mode: Implement the `process_rows_matching_criteria` method and its UI integration, allowing users to select a predefined criterion function (e.g., 'M filled and AN empty') to filter rows for processing.
    - Wiki Re-Extraction (Criteria-based): Integrate the logic for processing rows where Wiki URL (M) is filled and Wiki Timestamp (AN) is empty, likely as a specific option within the new Criteria mode.
- Fixes and Improvements:
    - SyntaxError Resolution: Resolve persistent `SyntaxError`s related to complex f-string formatting in logging calls by constructing message parts separately.
    - `find_wiki_serp` Filter Logic: Ensure the `process_find_wiki_serp` method correctly uses the `get_numeric_filter_value` helper to apply the Umsatz OR Mitarbeiter threshold filter logic based on the correct data units.
    - Timestamp/Status Logic: Consolidate and clarify the logic for checking process necessity based on timestamps, status flags (like S='X'), and the `force_reeval` parameter in helper methods like `_is_step_processing_needed`.
    - ML Integration: Ensure `prepare_data_for_modeling` and `train_technician_model` are correctly integrated as `DataProcessor` methods and function within the new structure.
    - Consistency: Address inconsistencies in timestamp setting (e.g., ensuring AP is set by batch modes) and parameter handling across different methods where identified during the refactoring.
    - Helper Functions: Define or confirm the global scope of necessary helper functions (`get_numeric_filter_value`, criteria functions, `_process_batch`, etc.).
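
The step-flag and timestamp gating described under "Granular Processing Control" could look roughly like the sketch below. It is not the script's actual code: `is_step_needed` is a hypothetical stand-in for `_is_step_processing_needed`, the row keys `wiki_timestamp` and `chatgpt_timestamp` are invented for illustration, and it assumes `force_reeval` bypasses only the timestamp check, not the step flag.

```python
from typing import Dict, List, Optional


def is_step_needed(step_enabled: bool,
                   timestamp: Optional[str],
                   force_reeval: bool = False) -> bool:
    """Hypothetical stand-in for `_is_step_processing_needed`."""
    if not step_enabled:
        # The user did not select this step, so it never runs.
        return False
    if force_reeval:
        # Assumption: force_reeval skips the timestamp check only.
        return True
    # Without force_reeval, run the step only if no timestamp was written yet.
    return not (timestamp or "").strip()


def plan_steps(row: Dict[str, str],
               process_wiki: bool = True,
               process_chatgpt: bool = True,
               force_reeval: bool = False) -> List[str]:
    """Return the names of the steps that would be attempted for one row."""
    flags = {"wiki": process_wiki, "chatgpt": process_chatgpt}
    return [
        step
        for step, enabled in flags.items()
        if is_step_needed(enabled, row.get(f"{step}_timestamp"), force_reeval)
    ]
```

Keeping the flag check and the timestamp check in one helper means every caller (`process_sequential`, `process_reevaluation_rows`, and so on) would apply the same rule.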

This version marks a significant milestone in making the script more modular, maintainable, and user-controllable, laying the groundwork for further enhancements like the ML estimation mode.
2025-04-24 05:04:51 +00:00
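
The `SyntaxError` fix mentioned in commit 76f1c72cc3 (building message parts separately instead of nesting complex expressions inside an f-string) might be sketched as follows. The function name, parameters, and message layout are illustrative assumptions, not taken from the script.

```python
from typing import Dict, Iterable


def build_row_message(row_num: int,
                      steps: Iterable[str],
                      status: Dict[str, str]) -> str:
    """Assemble a log message from precomputed parts (hypothetical helper)."""
    # Compute each part as a plain variable first. Before Python 3.12,
    # reusing the enclosing quote character or a backslash inside the
    # braces of an f-string raises a SyntaxError.
    steps_text = ", ".join(sorted(steps)) or "none"
    status_text = status.get("wiki", "n/a")
    # The f-string now only interpolates simple names.
    return f"Row {row_num}: steps=[{steps_text}] wiki_status={status_text}"
```

The resulting string can then be passed to a logging call such as `logging.info(...)` unchanged.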
d190188d04 bugfix 2025-04-23 12:47:54 +00:00
0c799ded26 bugfix 2025-04-23 12:27:01 +00:00
99afe4681d bugfix 2025-04-23 05:36:57 +00:00
bf5db76232 v1.6.7: Fix structural/syntax errors; adjust filter for Wiki search via SerpAPI
- Increment version number to v1.6.7.
- Fix critical AttributeError: Correct the indentation of several processing methods (_process_single_row, process_reevaluation_rows, process_serp_website_lookup_for_empty, process_website_details_for_marked_rows, prepare_data_for_modeling, process_rows_sequentially, process_find_wiki_with_serp) so that they are correctly defined as methods inside the DataProcessor class.
- Fix SyntaxError: Resolve the problem with complex f-strings in _process_single_row and potentially elsewhere by separating string construction from expressions inside the f-string syntax.
- Adjust filter logic for mode 'find_wiki_serp': The SerpAPI search for missing Wiki URLs (M=k.A./empty) is now triggered when (CRM Umsatz (J) > 200 million OR CRM Anzahl Mitarbeiter (K) > 500). Implement robust numeric extraction for J and K within the filter logic.
- Ensure that the SerpAPI Wiki Search Timestamp (AY) is always set after a search attempt in mode 'find_wiki_serp', regardless of the result.
- Various logging adjustments for clarity and debugging (e.g. in the Wiki processing step).
2025-04-23 05:18:30 +00:00
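
The 'find_wiki_serp' filter from v1.6.7 (missing Wiki URL, and revenue above 200 million OR headcount above 500) could be sketched as below. This is an illustration under stated assumptions: `parse_number` and `needs_wiki_search` are hypothetical names, column J is assumed to already hold revenue in millions, and cell text is assumed to use German number formatting (dot as thousands separator, comma as decimal separator).

```python
import re
from typing import Optional


def parse_number(raw) -> Optional[float]:
    """Extract a number from a cell value; return None if none is found."""
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    match = re.search(r"\d[\d.,]*", str(raw))
    if not match:
        return None  # e.g. "k.A." or an empty cell
    # Assumption: German formatting, so "1.200,5" means 1200.5.
    normalized = match.group(0).replace(".", "").replace(",", ".")
    try:
        return float(normalized)
    except ValueError:
        return None


def needs_wiki_search(wiki_url, revenue_mio, employees,
                      revenue_min: float = 200.0,
                      employees_min: float = 500.0) -> bool:
    """True if the Wiki URL is missing and either threshold is exceeded."""
    if (wiki_url or "").strip().lower() not in ("", "k.a."):
        return False  # a Wiki URL is already present
    revenue = parse_number(revenue_mio)
    staff = parse_number(employees)
    return ((revenue is not None and revenue > revenue_min)
            or (staff is not None and staff > employees_min))
```

Treating unparseable cells as `None` rather than zero keeps a bad value in one column from masking a valid value in the other.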
808a015904 bugfix 2025-04-22 14:17:22 +00:00
27bdabc0b8 bugfix 2025-04-22 14:10:50 +00:00
66ad46665c bugfix 2025-04-22 14:03:42 +00:00
dcf33501e9 bugfix 2025-04-22 13:58:36 +00:00
21e8fa21d8 bugfix 2025-04-22 12:42:49 +00:00
e5741a23c9 bugfix 2025-04-22 12:29:03 +00:00
564aef9d20 bugfix 2025-04-22 12:21:33 +00:00
cca2191e2f bugfix 2025-04-22 11:18:10 +00:00
8284b5fa9f bugfix 2025-04-22 09:54:08 +00:00