As a Data Engineer at GoodHabitz, you’ll be part of an exciting journey as we migrate to AWS, enhancing our data infrastructure to support our growing business. In our scale-up environment, adaptability and problem-solving are key. This role is crucial in designing, building, and optimizing our data architecture to meet evolving needs. With expertise in data pipeline stacks (both open-source and AWS), you’ll develop scalable, high-performance solutions that drive efficiency and reliability. Collaborating closely with engineering teams, you’ll help shape robust data pipelines that align with our business objectives. If you're eager to make a real impact and be part of this transformation, we’d love to hear from you.
Key Responsibilities:
- Data Pipeline Design: Design and implement scalable, high-performance data architectures using AWS services and OLAP databases (ClickHouse/Snowflake).
- Data Pipeline Development: Design, build, and maintain robust ETL pipelines that efficiently handle large-scale data ingestion, transformation, and storage using solutions like Databricks.
- Cloud Infrastructure: Combine open-source data-stack tools with AWS technologies to build and optimize data workflows.
- Data Governance & Quality: Ensure data accuracy and consistency through best practices in data governance, lineage, and monitoring.
- Performance Optimization: Optimize data storage, retrieval, and processing to support high-performance analytical workloads using partitioning, indexing, and query optimization techniques.
- Collaboration & Leadership: Work closely with data analysts and software engineers to understand requirements and deliver data-driven solutions, and mentor junior engineers.
- Automation & CI/CD: Implement automated data pipeline deployment and monitoring strategies.
Requirements:
- 5+ years of experience in data engineering, with a solid background in open-source data stacks and cloud-native platforms.
- Deep understanding of ETL processes, data modeling, and data warehousing (experience with the medallion architecture and Delta Lake).
- Strong experience in designing and architecting large-scale data systems.
- Proficiency in Python and PySpark (or comparable scripting languages and data-processing libraries) for data processing and pipeline development.
- Experience with orchestration tools such as Apache Airflow, AWS Step Functions, or Dagster.
- Hands-on experience with infrastructure-as-code (Terraform, CloudFormation, CDK).
- Strong problem-solving skills and ability to work in a fast-paced environment.
- Knowledge of SQL query performance tuning, materialized views, and sharding strategies for large datasets.
Nice to have:
- Expertise in ClickHouse, Snowflake, or similar OLAP databases.
- Familiarity with containerization (Docker, Kubernetes) and serverless computing.
- Experience with monitoring and observability tools such as Prometheus, Grafana, or AWS CloudWatch.
Here's a glimpse of what's waiting for you:
- A competitive salary package that rewards your hard work.
- 25 paid vacation days. And if that's not enough, you can purchase up to 10 more.
- A world of growth and development opportunities to enhance your skills. You'll have unlimited access to our treasure trove of GoodHabitz resources and MyAcademy.
- Access to mental coaching through our partner, OpenUp, to keep your mind in top shape.
- An annual do-good-day, fully paid, so you can contribute to a cause you're passionate about.
- Travel and expense reimbursement because we've got your journey covered.
- Pension and disability insurance, securing your financial well-being in the long run.
- A hybrid way of working.
- Working in a company that welcomes artificial intelligence and uses it to improve internal processes and push AI-powered features quickly.
- A MacBook Pro.