Eat System - Google Maps Crawler
File: cogs/eat/providers/googlemap_crawler.py
The GoogleMapCrawler class is responsible for fetching real-time restaurant data from Google Maps. It uses the Selenium library to automate a Chrome browser, enabling it to perform searches and scrape information from the resulting pages.
GoogleMapCrawler Class
__init__(self)
Initializes the Selenium WebDriver. It configures Chrome to run in headless mode (without a visible UI) for efficiency.
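The headless setup described above might look roughly like the following sketch. The helper names `chrome_arguments` and `create_driver`, the window size, and the choice of the `--headless=new` flag are assumptions for illustration; the actual constructor in `googlemap_crawler.py` may configure Chrome differently.

```python
def chrome_arguments(headless=True):
    """Return the Chrome command-line switches for the driver (hypothetical helper)."""
    # A fixed window size is an assumption; it keeps the page layout predictable.
    args = ["--window-size=1280,1024"]
    if headless:
        # "--headless=new" is the headless switch in recent Chrome releases;
        # older Chrome versions use plain "--headless" instead.
        args.insert(0, "--headless=new")
    return args


def create_driver(headless=True):
    """Build a Chrome WebDriver with the switches above (hypothetical helper)."""
    # Imported lazily so chrome_arguments() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Passing `headless=False` to the same helper opens a visible browser window, which can help when debugging selectors.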
search(self, keyword)
This is the main method of the class. It performs a search on Google Maps and returns detailed information about a randomly selected restaurant.
- Parameters:
  - keyword (str): The food type or restaurant name to search for (e.g., "pizza", "Taverna Siguenza").
- Returns: A tuple containing the following information about the restaurant:
  - title (str): The name of the restaurant.
  - rating (str): The star rating (e.g., "4.5 stars").
  - category (str): The type of restaurant (e.g., "Italian restaurant").
  - address (str): The physical address.
  - url (str): The Google Maps URL for the restaurant.
  - reviews (str): A string containing snippets of user reviews.
  - menu (str): A URL to the menu image, if available.
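Because the result is a plain seven-element tuple, callers have to rely on field order. One way to make that order explicit is to wrap the tuple in a named tuple, as in the sketch below. `RestaurantInfo` and `as_restaurant_info` are hypothetical names that do not exist in the module, and the sample values are made-up placeholders, not real crawler output.

```python
from typing import NamedTuple


class RestaurantInfo(NamedTuple):
    """Hypothetical wrapper mirroring the documented order of the search() tuple."""
    title: str
    rating: str
    category: str
    address: str
    url: str
    reviews: str
    menu: str


def as_restaurant_info(result):
    """Convert the tuple returned by search() into a RestaurantInfo."""
    return RestaurantInfo(*result)


# Placeholder values standing in for a real search() result.
sample = (
    "Example Pizzeria",
    "4.5 stars",
    "Italian restaurant",
    "123 Example Street",
    "https://example.com/place",
    "Great crust. Friendly staff.",
    "https://example.com/menu.jpg",
)
info = as_restaurant_info(sample)
print(info.title, info.rating)
```

A named tuple still unpacks and indexes like the original tuple, so existing code that reads fields by position would keep working.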
Crawling Process
- Initial Search: Navigates to https://www.google.com/maps/search/{keyword} restaurant.
- Result Selection: It finds all the search result links on the page and randomly clicks one to navigate to the restaurant's specific page.
- Data Extraction: It uses BeautifulSoup to parse the HTML of the restaurant's page and extracts key information like the title, rating, address, and category.
- Review Scraping: It scrolls down the page to load user reviews and then extracts the text from them.
- Menu Scraping: It attempts to find and click the "Menu" button to reveal and scrape the URL of the menu image.
- Cleanup: The close() method is called to shut down the WebDriver instance when it's no longer needed.
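The first two steps and the cleanup step can be sketched without a browser, as below. The function names are hypothetical, and the percent-encoding of the query and the `None` result for an empty page are assumptions rather than documented behavior. `run_search` accepts any object that exposes `search()` and `close()`, which is how the crawler is described here.

```python
import random
from contextlib import closing
from urllib.parse import quote

SEARCH_URL = "https://www.google.com/maps/search/{query}"


def build_search_url(keyword):
    """Append ' restaurant' to the keyword and percent-encode it for the URL path."""
    return SEARCH_URL.format(query=quote(f"{keyword} restaurant"))


def pick_random_result(links, rng=random):
    """Return one randomly chosen search result, or None if there were none."""
    return rng.choice(links) if links else None


def run_search(crawler, keyword):
    """Run one search and always call crawler.close(), even if search() raises."""
    with closing(crawler):
        return crawler.search(keyword)
```

Wrapping the call in `contextlib.closing` is one way to make sure the WebDriver is shut down when a page fails to load or a selector is missing, so that headless Chrome processes are not left running.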