An automated ETL pipeline that collects and analyzes job market data from the 1111, Yes123, 104, and cake job banks, powered by Apache Airflow.

    /airflow
    ├── 1111&yes123/   # ETL for 1111 and yes123; merges and cleans data from all sources
    ├── 104/           # ETL for 104; dbt processing
    ├── cake/          # ETL for cake
    └── README.md      # Project documentation

For details, see the README in each directory.
- Python: Core development
- Apache Airflow: Workflow management
- Google Cloud Platform:
  - Cloud Storage: Data storage
  - BigQuery: Data warehousing
- BeautifulSoup4: Web parsing
- Pandas: Data processing
- dbt: Data transformation
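
The merge-and-clean step that the `1111&yes123/` directory is described as handling could look roughly like the sketch below. This is an illustration only, not the project's actual code: the function name `merge_sources`, the column names `title` and `company`, and the sample rows are all hypothetical, and the real pipeline may use different fields and deduplication rules.

```python
import pandas as pd


def merge_sources(frames: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Combine per-source job listings into one cleaned table.

    Assumes (hypothetically) that every frame has 'title' and 'company' columns.
    """
    # Tag each row with the job bank it came from, then stack all sources.
    tagged = [df.assign(source=name) for name, df in frames.items()]
    merged = pd.concat(tagged, ignore_index=True)

    # Normalize whitespace so equal values compare equal across sources.
    merged["title"] = merged["title"].str.strip()
    merged["company"] = merged["company"].str.strip()

    # Drop rows with a missing or empty title.
    merged = merged.dropna(subset=["title"])
    merged = merged[merged["title"] != ""]

    # Treat the same title at the same company as one posting; keep the first seen.
    merged = merged.drop_duplicates(subset=["title", "company"], keep="first")
    return merged.reset_index(drop=True)


# Made-up sample data for two of the sources.
frames = {
    "104": pd.DataFrame(
        {"title": [" Data Engineer ", None], "company": ["Acme", "Acme"]}
    ),
    "cake": pd.DataFrame(
        {"title": ["Data Engineer", "Analyst"], "company": ["Acme ", "Beta"]}
    ),
}
result = merge_sources(frames)
print(result)
```

In an Airflow deployment, a function like this would typically run as a single task after the per-source extraction tasks finish, with the result then loaded to Cloud Storage or BigQuery.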