Job Description
A Cloud Software & Data Engineer is responsible for developing data engineering applications using third-party and in-house frameworks, leveraging a broad set of development skills covering data engineering and data accessibility. The Cloud Software & Data Engineer is responsible for the complete software lifecycle: analysis, design, development, testing, implementation and support, as well as troubleshooting issues, deployment/upgrade of services and associated data, performance tuning and other maintenance work. In addition to typical full-stack development, this role focuses on data engineering (large-scale data transformation and manipulation, ETL, etc.) and on infrastructure fine-tuning for optimization purposes. The position reports to the software project manager.
Responsibilities
• Work with subject matter experts to clarify requirements and use cases
• Turn requirements and user stories into functionality: design, build and maintain efficient, reusable and reliable code and high-quality software, with documentation and traceability
• Develop server-side services to be elastically scalable and secure by design to support high volume & high velocity data processing. Services should be backward and forward compatible to ease deployment.
• Ensure the solution is deployable, operable and secure by default.
• Write and maintain provisioning, deployment, CI/CD and maintenance scripts for the services you develop
• Write unit tests, automated tests and data simulations
• Support, maintain, troubleshoot and fine-tune working cloud environments and the software running within them
• Build prototypes, products and systems that meet the project quality standards and requirements
• Act as an individual contributor, providing technical leadership and documentation to developers and stakeholders
• Provide timely corrective actions on all assigned defects and issues.
• Contribute to the development plan by providing task estimates.
• Fulfil organizational responsibilities (sharing knowledge & experience with other teams/groups)
• Conduct technical training sessions and write whitepapers, case studies, blogs, etc.
Background
Bachelor's degree or higher in Computer Science or a related field, with minimum years working experience
Skills and knowledge
Mandatory
• 5+ years of software development experience in Big Data technologies (Spark/, Database & Data Lakes)
• Experience with SQL and NoSQL databases and with JSON, CSV and Parquet data formats
• Advanced knowledge of large scale parallel computing engines (Spark) – provisioning, deployment, development of computing pipelines, operation and support, performance tuning (3y+)
• Good experience in building/tuning Spark pipelines in Python
• Experience designing, building and maintaining data processing pipelines with Apache NiFi and Spark jobs
• Extensive knowledge of data structures, patterns and algorithms (5y+)
• Expertise with several back-end development languages and their associated frameworks – Python (3y+)
• In-depth knowledge of application, cloud networking and security as well as related development best-practices and patterns (3y+)
• Cloud platform knowledge – Azure public cloud expertise (3y+)
• Advanced knowledge of DevOps, CI/CD and cloud deployment practices (5y+)
• Advanced knowledge of containerization and virtualization (Kubernetes), including scaling clusters and debugging issues on high-volume/high-velocity data jobs, and related best practices (3y+)
• Advanced skills in setting up and operating databases (relational and non-relational) (3y+)
• Good experience in Databricks, Spark on Kubernetes
• Good programming experience with Python
• Experienced in application profiling, bottleneck analysis and performance tuning
• Good communication and cross-functional skills.
• Problem-solving skills; team player, adaptable and self-driven
• Experience working on highly Agile projects
Nice to have
• Experience building, testing and maintaining tools and infrastructure to support data science initiatives
• Exposure to Power BI, Spotfire and Dataiku
• Knowledge and experience with version control tools (Git preferred but not mandatory)
• In-country cloud providers – Azure Stack (3y+)
• Experience deploying machine learning models into a production environment.
• Experience with ML training/retraining, Model Registry, ML model performance measurement
• Oil and gas industry experience
• Architectural expertise