Data Upload Automation Project
Customer
The company is one of Australia’s largest independent real estate investment platforms.
Detailed information about the client cannot be disclosed under the terms of an NDA.
Role
Data Engineer / Data Scientist
Industry
Real estate
Challenge
The company's web platform did not scale well and lacked automation: the team spent many hours each week manually uploading and cleaning data in the backend system.
Solution
I built a cloud-based data ingestion pipeline.
Core pipeline components:
>Azure Data Factory (ADF) - built the data ingestion and orchestration layer using ADF's low-code interface
>Azure Batch Service - an ADF pipeline runs an Azure Batch workload; a Python script on the Batch nodes reads comma-separated value (CSV) input from an Azure Blob Storage container, transforms the data, and writes the output to a different storage container (see the sketch after this list)
>Azure Data Lake Storage Gen 2 - stores the extracted raw data
>ADF Data Flows - data transformation
>Azure SQL Database - back-end database
>Elasticsearch - front-end database
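For illustration, here is a minimal sketch of the kind of Python script that runs on the Batch nodes, assuming authentication through a storage connection string. The container names and the transform step are hypothetical placeholders; the actual cleaning logic is covered by the NDA.

```python
import io
import os

import pandas as pd
from azure.storage.blob import BlobServiceClient

# Hypothetical container names for illustration only.
INPUT_CONTAINER = "raw-csv"
OUTPUT_CONTAINER = "processed-csv"


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Placeholder cleaning step: drop empty rows and normalise headers."""
    df = df.dropna(how="all")
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    return df


def main() -> None:
    # Assumes the Batch node receives a connection string as an
    # environment variable; a managed identity would also work.
    service = BlobServiceClient.from_connection_string(
        os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    )
    src = service.get_container_client(INPUT_CONTAINER)
    dst = service.get_container_client(OUTPUT_CONTAINER)

    for blob in src.list_blobs():
        if not blob.name.endswith(".csv"):
            continue
        # Download the raw CSV, clean it, and write the result
        # to the output container under the same blob name.
        raw = src.download_blob(blob.name).readall()
        cleaned = transform(pd.read_csv(io.BytesIO(raw)))
        dst.upload_blob(
            name=blob.name,
            data=cleaned.to_csv(index=False),
            overwrite=True,
        )


if __name__ == "__main__":
    main()
```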
The data automation service enabled the company to streamline data uploads and saved 20 hours of manual work each week.