6cf9afca875acd26b9c6940cb0e93392fb786d91
- Replace custom `debug_print` function with standard Python `logging` module calls throughout the codebase.
- Use appropriate logging levels (DEBUG, INFO, WARNING, ERROR, CRITICAL), plus `logging.exception` where a traceback should be recorded.
- Refactor logging setup in `main` for clarity and proper handler initialization.
- Integrate updated `WikipediaScraper` class (previously developed as v1.6.5 logic):
  - Implement more robust infobox parsing (`_extract_infobox_value`) using flexible selectors, keyword checking (`in`), and improved value cleaning (including `sup` removal).
  - Remove old infobox fallback functions.
  - Enhance article validation (`_validate_article`) with better link checking via `_get_page_soup`.
  - Improve reliability of article search (`search_company_article`) with direct match attempt and better error handling.
  - Apply `@retry_on_failure` decorator to network-dependent scraper methods (`_get_page_soup`, `search_company_article`).
- Ensure `Config.VERSION` reflects the logical state (v1.6.5 for this commit).
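
For illustration only, the logging and retry changes listed above might look roughly like the sketch below. This is not the code from this commit: the body of `retry_on_failure`, the `setup_logging` helper, the logger name, and the `MAX_ATTEMPTS` / `RETRY_DELAY_SECONDS` constants are all assumptions made for the example; only the decorator name and its bare `@retry_on_failure` usage come from the message.

```python
import functools
import logging
import time

# Hypothetical logger name and retry settings; the real project may differ.
logger = logging.getLogger("scraper_sketch")
MAX_ATTEMPTS = 3
RETRY_DELAY_SECONDS = 0.1


def setup_logging(level=logging.INFO):
    """Attach one stream handler to the root logger, only if none exists yet."""
    root = logging.getLogger()
    root.setLevel(level)
    if not root.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        root.addHandler(handler)


def retry_on_failure(func):
    """Retry func up to MAX_ATTEMPTS times, logging each failure."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                if attempt == MAX_ATTEMPTS:
                    # logger.exception logs at ERROR level and adds the traceback.
                    logger.exception(
                        "%s failed after %d attempts", func.__name__, attempt
                    )
                    raise
                logger.warning(
                    "%s failed (attempt %d/%d): %s",
                    func.__name__, attempt, MAX_ATTEMPTS, exc,
                )
                time.sleep(RETRY_DELAY_SECONDS)
    return wrapper
```

A network-dependent method such as `_get_page_soup` would then be decorated with `@retry_on_failure`, and `setup_logging()` would be called once at the start of `main`. A production version would likely catch only network-related exceptions rather than `Exception`.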