Big Data Engineer

eTeam UK
London, Greater London

Job Description

Job Title: Big Data Engineer
Location: London, UK
Duration: 6-month contract

Job Purpose and Primary Objectives:

* Experience in Spark and the Apache big data ecosystem.
* SQL skills for pipeline development and analysis.
* Python for pipeline development and analysis.

Key Responsibilities:

* Participate in the daily scrum to update progress on tasks.
* Participate in sprint planning to decompose stories/PBIs into tasks and provide estimates.
* Maintain and enhance the data engineering SDLC and framework.
* Support, maintain and enhance the build and deploy framework for the SDLC.
* Support the development and test activity of use case delivery teams (BP consumers).
* Support performance testing and provide tuning examples for BP consumers.
* Maintain and develop transitioned data pipelines under enterprise ownership.
* Maintain and develop core systems pipelines and reporting feeds for enterprise platforms.
* Provide operational support for transitioned data pipelines, including issue investigation and remediation.
* Support, maintain and enhance the reference implementation and documentation for key design patterns and best practice.
* Maintain and develop DataOps test frameworks, utilities and related assets.
* Develop additional data engineering utilities for BP consumers.
* Demonstrate and document evidence against acceptance criteria.

Key Skills/Knowledge:

General:

* Analytical mindset: a data-driven ability to solve problems and refine engineering requirements.
* Service-oriented: understands the principles of DevOps and ITIL and of maintaining the provision of products and services to consumers; can work with ticketing systems.
* Attention to detail: a focus on rigour and completeness of investigation, with consideration for the needs of BP customers.
* Good communicator: able to deliver messages effectively, with strong verbal communication skills, acting as an independent voice within a collaborative team.
* Self-motivated and self-starting: able to work with little supervision, pick up unfamiliar concepts quickly, and be confident in their own judgement.
* Proficiency in a language such as Java, Python or Go.
* Deep familiarity with container schedulers and related security best practices.
* Experience working with a cloud provider and with in-house data centres.
* Experience managing and developing highly available, distributed software.
* Exposure to infrastructure-as-code frameworks such as Terraform.
* Development toolchain support: CI/CD pipelines, container schedulers and custom applications.
* Infrastructure-as-code deployment tooling, supporting services on multiple cloud providers and on-premise virtualisation.
* Metrics, logging, analytics and alerting for performance and security across all endpoints and applications.
* Building applications that enable development teams to test against petabytes of test data.

Technical:

* Exposure to Spark and the Apache big data ecosystem.
* SQL skills for pipeline development and analysis.
* Python for pipeline development and analysis (a short sketch follows this list).
* Experience of continuous integration/deployment and development techniques.
* Exposure to Scrum/Agile delivery methodologies.
* Experience with toolsets for agile delivery management and DevOps, for example Azure DevOps (ADO) or Jira.
* Experience with integrated toolsets supporting CI/CD, for example Git or Bitbucket, and with cloud deployment services.
* Experience of a big data engineering environment delivering data pipelines, and of ETL concepts.
* In-depth understanding of MapReduce and Spark patterns for delivering data pipelines, including evidence of best-practice implementation.
* Programming skills: Python, Scala, Java.
* Tools: Git, Jenkins, Ansible, Docker, etc.
* Data processing: Hive, Spark, etc.
* Messaging systems: Kafka, Storm, SNS, SQS.
* Scheduling and workflows: Oozie, Azkaban, Luigi.
* NoSQL databases: HBase, Cassandra, MongoDB.
* Exposure to cloud deployments and data pipelines.
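As an illustration of the Spark, SQL and Python pipeline skills listed above, here is a minimal PySpark sketch of one ETL step: read raw data, apply the business logic in SQL, and write a partitioned output. It is a sketch only; the paths, the table name raw_trades and the column names are hypothetical placeholders, not details taken from the role.

    # Minimal PySpark ETL sketch. All paths, table and column names below
    # are hypothetical placeholders chosen for illustration.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

    # Extract: read the raw source data into a DataFrame.
    raw = spark.read.parquet("s3://example-bucket/raw/trades/")
    raw.createOrReplaceTempView("raw_trades")

    # Transform: express the business logic in SQL, since the role pairs
    # SQL skills with Python for pipeline development.
    daily_summary = spark.sql("""
        SELECT trade_date,
               instrument_id,
               COUNT(*)      AS trade_count,
               SUM(notional) AS total_notional
        FROM raw_trades
        WHERE status = 'SETTLED'
        GROUP BY trade_date, instrument_id
    """)

    # Load: write the result partitioned by date for downstream consumers.
    (daily_summary.write
        .mode("overwrite")
        .partitionBy("trade_date")
        .parquet("s3://example-bucket/curated/daily_summary/"))

    spark.stop()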
Desirable Skills/Experience:

* Experience of support and diagnosis in a data pipeline environment.
* Basic Scrum Master qualification.
* Data migration delivery experience.
* Exposure to BDD frameworks.
* Demonstrable experience of requirements elicitation and documentation, carried through to design and implementation.
* Excel for data analysis.
* Use of Visio / wiki / SharePoint / Confluence for documentation of architecture, design and implementation.
* Familiarity with scheduling and orchestration tools.
* Knowledge of toolsets for service management, for example ServiceNow.
* Experience with Kafka and streaming data solutions (in addition to Spark), and with Flink/Flume frameworks for managing streaming data or collecting log data (see the streaming sketch after this list).

Good to Have:

* Exposure to an AWS data lake.
* Hands-on Kinesis, AWS Glue, Redshift, etc.
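To ground the streaming requirement, below is a minimal Spark Structured Streaming sketch that consumes JSON events from a Kafka topic. The broker address, topic name, message schema and checkpoint path are all hypothetical placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath.

    # Minimal Spark Structured Streaming sketch reading JSON events from
    # Kafka. Broker, topic, schema and checkpoint path are hypothetical
    # placeholders; requires the spark-sql-kafka connector package.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("example-stream").getOrCreate()

    # Assumed shape of the JSON messages on the topic.
    schema = StructType([
        StructField("event_id", StringType()),
        StructField("value", DoubleType()),
    ])

    # Subscribe to the topic; Kafka delivers message values as bytes,
    # so cast to string before parsing the JSON payload.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "events")
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    # Write to the console for demonstration; a production pipeline would
    # target a durable sink such as a data lake table.
    query = (events.writeStream
             .format("console")
             .option("checkpointLocation", "/tmp/checkpoints/example-stream")
             .start())

    query.awaitTermination()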