Open
Description
Hi,
I’m trying to use GoogleImageCrawler with the following script:
from icrawler.builtin import GoogleImageCrawler
from icrawler import ImageDownloader
import os

# Custom image downloader without format filtering
class CustomDownloader(ImageDownloader):
    def _filter(self, task):
        return super()._filter(task)  # No extension/type filtering now

def download_images(plant_name, category, num_images=300):
    save_dir = os.path.join('dataset', category.replace(' ', '_'), plant_name.replace(' ', '_'))
    os.makedirs(save_dir, exist_ok=True)
    crawler = GoogleImageCrawler(
        downloader_cls=CustomDownloader,
        storage={'root_dir': save_dir}
    )
    crawler.crawl(
        keyword=f"{plant_name} plant images",
        max_num=num_images,
        min_size=(512, 512),
        file_idx_offset=0
    )

if __name__ == '__main__':
    plant_name = input("Enter plant name (e.g., 'Loropetalum 'Plum''): ").strip()
    category = input("Enter category (e.g., 'Small or Mass Planting Shrub'): ").strip()
    download_images(plant_name, category)
    print("✅ Done! Check your dataset folder.")
Error / Logs:
2025-08-21 14:47:15,712 - INFO - parser - parsing result page https://www.google.com/search?q=coconut+palm+plant+images&ijn=0&start=0&tbs=&tbm=isch
Exception in thread parser-001:
Traceback (most recent call last):
  ...
  File ".../icrawler/parser.py", line 93, in worker_exec
    for task in self.parse(response, **kwargs):
TypeError: 'NoneType' object is not iterable
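For what it's worth, the failure mode in the traceback can be reproduced without icrawler: a parse function that returns None instead of a list breaks any `for` loop over its result. The sketch below is only an illustration under that assumption. `fake_google_parse` and `safe_parse` are hypothetical names, not part of icrawler, and the stand-in parser does not reflect how the real Google parser works.

```python
from typing import Callable, List, Optional


def fake_google_parse(response: str) -> Optional[List[dict]]:
    """Hypothetical stand-in for a parser that may find no image URLs."""
    urls = [token for token in response.split() if token.startswith("http")]
    if not urls:
        # Mirrors the failure: nothing iterable is returned.
        return None
    return [{"file_url": url} for url in urls]


def safe_parse(parse_fn: Callable[..., Optional[List[dict]]],
               response: str, **kwargs) -> List[dict]:
    """Call parse_fn and normalize a None result to an empty task list."""
    tasks = parse_fn(response, **kwargs)
    return [] if tasks is None else list(tasks)


if __name__ == "__main__":
    page_without_images = "<html></html>"

    # Iterating the raw result raises the same TypeError as in the log.
    try:
        for task in fake_google_parse(page_without_images):
            print(task)
    except TypeError as exc:
        print("unguarded:", exc)

    # With the guard, the loop simply runs zero times.
    for task in safe_parse(fake_google_parse, page_without_images):
        print(task)
    print("guarded: no tasks, no crash")
```

A guard like this would only stop the parser thread from crashing; it would not make Google results parse again. If the crawler lets you supply your own parser class, the same None check could presumably go into an overridden parse method, but I have not verified that against the installed version.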
Environment:
- OS: Ubuntu 20.04
- Python: 3.10
- icrawler version: (please confirm with pip show icrawler)
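As an alternative to pip show, the installed version can also be read from Python itself using the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and it only reports what is installed in the interpreter that runs it.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("icrawler version:", installed_version("icrawler") or "not installed")
```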
Notes:
- Baidu and Bing crawlers work fine.
- Only GoogleImageCrawler fails with the above error.
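- The traceback shows self.parse() returning None, so it looks like Google changed its HTML/response format and the parser no longer finds any image URLs.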
Is there any fix or workaround planned for Google support?
Thanks!