Data Engineer – OpenShift

Smart It Frame

  • Full Time

To apply for this job, please visit smartitframe.com.

Job Summary

We are looking for a highly experienced Senior Data Engineer with 8+ years of expertise in enterprise data engineering and platform development. The ideal candidate will have strong hands-on experience in Apache Airflow DAG development, dbt Core implementation, and containerized environments using Kubernetes or OpenShift.

This role is central to designing, building, and optimizing scalable, reliable data pipelines that power financial and accounting systems, including large-scale data migrations and high-volume processing workloads.

You will be responsible for ensuring high performance, reliability, and scalability across modern data platforms while working closely with engineering and business teams.

Required Skills & Qualifications

  • 8–10+ years of experience in Data Engineering, Analytics Engineering, or Platform Engineering roles
  • Proven experience building and maintaining enterprise-grade data platforms in production
  • Advanced expertise in Apache Airflow (DAG design, scheduling, optimization, and monitoring)
  • Advanced expertise in dbt Core (data modeling, testing, macros, and deployment practices)
  • Strong proficiency in Python for data engineering and automation
  • Deep hands-on experience with Kubernetes and/or OpenShift in production environments
  • Strong understanding of distributed systems, workload optimization, and performance tuning
  • Excellent SQL skills for complex transformations and analytical processing
  • Experience working with cloud-based data platforms
  • Familiarity with CI/CD pipelines, Git-based workflows, and containerized deployments

Key Responsibilities

1. Data Pipeline & Orchestration

  • Design, develop, and maintain scalable Airflow DAGs for batch and event-driven pipelines
  • Implement best practices for scheduling, dependency management, retries, SLA monitoring, and alerting
  • Optimize Airflow components (scheduler, executor, workers) for high-throughput workloads
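In Airflow, retries, retry delays, and failure alerting are declarative task arguments (`retries`, `retry_delay`, `on_failure_callback`) rather than hand-written loops. As a library-free sketch of the semantics those settings provide (function and task names here are illustrative, not from the posting):

```python
import time

def run_with_retries(task, retries=3, retry_delay=1.0, on_failure=print):
    """Run a task callable, retrying on failure with a fixed delay.

    Airflow provides this behavior via the `retries`, `retry_delay`,
    and `on_failure_callback` task arguments; this stdlib-only sketch
    just illustrates the retry/alerting semantics.
    """
    for attempt in range(1, retries + 2):  # initial try + `retries` retries
        try:
            return task()
        except Exception as exc:
            if attempt > retries:
                on_failure(f"task failed after {retries} retries: {exc}")
                raise
            time.sleep(retry_delay)

# Example: a flaky task that succeeds on its third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient error")
    return "ok"

result = run_with_retries(flaky, retries=3, retry_delay=0.01)
```

In a real DAG the same behavior is configured once in `default_args` and inherited by every task, which keeps retry policy consistent across the pipeline.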

2. dbt Core & Data Modeling

  • Lead end-to-end implementation of dbt Core projects, including structure, environments, and CI/CD integration
  • Design scalable data models (staging, intermediate, and marts) following analytics engineering standards
  • Develop and maintain dbt tests, macros, documentation, and incremental models
  • Optimize dbt performance for large-scale datasets and downstream reporting needs

3. Kubernetes / OpenShift & Cloud Platforms

  • Deploy, manage, and optimize data workloads on Kubernetes/OpenShift
  • Implement scaling strategies including autoscaling, resource allocation, and pod scheduling
  • Configure and tune CPU/memory requests and limits for optimal performance
  • Troubleshoot container-level performance and resource contention issues
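Requests and limits are set per container in the pod spec: requests are what the scheduler reserves when placing the pod, limits are hard caps enforced at runtime. A minimal sketch, with placeholder names and sizes rather than values from this posting:

```yaml
# Illustrative pod-spec fragment; workload name, image, and sizes
# are placeholders, not values from this posting.
apiVersion: v1
kind: Pod
metadata:
  name: airflow-worker                 # hypothetical workload name
spec:
  containers:
    - name: worker
      image: example.com/airflow-worker:latest   # placeholder image
      resources:
        requests:        # reserved at scheduling time
          cpu: "500m"
          memory: "1Gi"
        limits:          # hard caps; exceeding the memory limit OOM-kills the pod
          cpu: "2"
          memory: "4Gi"
```

Tuning these values is the usual first step in resolving the resource-contention issues mentioned above: requests set too low cause noisy-neighbor contention, limits set too low cause throttling and OOM kills.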

4. Performance, Monitoring & Reliability

  • Monitor and optimize end-to-end data pipeline performance across Airflow, dbt, and infrastructure
  • Identify and resolve bottlenecks in processing, orchestration, and query execution
  • Implement observability solutions including logging, metrics, and alerting systems
  • Ensure high availability, fault tolerance, and resiliency of data pipelines
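A common building block for pipeline observability is wrapping each stage so its duration and outcome are emitted as a structured log line and a metric. A stdlib-only sketch, assuming an in-memory list as a stand-in for a real metrics backend (StatsD, Prometheus, etc.):

```python
import json
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")
metrics = []  # stand-in for a real metrics backend

@contextmanager
def timed_stage(name):
    """Record a stage's duration and outcome as a structured log + metric."""
    start = time.perf_counter()
    status = "ok"
    try:
        yield
    except Exception:
        status = "error"
        raise
    finally:
        elapsed = time.perf_counter() - start
        metrics.append({"stage": name, "seconds": elapsed, "status": status})
        log.info(json.dumps(metrics[-1]))

with timed_stage("extract"):
    time.sleep(0.01)  # placeholder for real stage work
```

Emitting one structured record per stage makes it straightforward to alert on error status or duration regressions, which is the bottleneck-identification work described above.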

5. Collaboration & Governance

  • Collaborate with data architects, platform engineers, and business stakeholders
  • Support financial reporting, accounting, and regulatory data requirements
  • Enforce engineering standards, security practices, and data governance policies