Data Scientist - Cyber

Capital One
London, Greater London

Overview

Job Description

White Collar Factory (95009), United Kingdom, London, London

At Capital One, we're building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.

Data Scientist - Cyber

Threat modelling is the practice of examining software and systems from the design phase forward to identify and eliminate insecure patterns before they manifest in production code. Threat modelling ensures that software and systems continue to evolve securely as new features are added. Capital One aims to fully embed threat modelling across the software delivery lifecycle, ensuring that all Capital One software is measurably secure by design. As part of that initiative, we aim to develop software tools to enable engineers to threat model.

As a data scientist working in the Enterprise Threat Modelling team, you will be delivering data-driven tools that empower engineers to threat model. These tools aim to capture and track all threat modelling activity across the enterprise, and deliver on our mission to provide meaningful measurements of the quality and value of threat models.
The features you will be adding to our tooling ecosystem will aim to drive down the cost of threat modelling to engineering teams, while also driving up the quality and visibility of the threat models that are produced. You will be working with a diverse team including Threat Modelling Engineers, Data Scientists, and front-end and back-end software engineers to deliver data-driven tools for discovering and identifying threats in architectures and software designs.

As a data scientist on this project, you will develop the data-driven aspects of the tools. This will involve working on a diverse set of data-centric problems, including:

* Establishing key metrics for threat model quality, with associated visualisations for end-user consumption, that drive improvements in threat modelling practice across the enterprise
* Using data collected from the user interface to understand key areas of friction for the user
* Leveraging NLP technology to automate the discovery, from issue trackers, code repositories and wiki pages, of the architectural constructs and patterns in use in a project that is being threat modelled
* Building a threat recommendation engine that automates the suggestion of relevant threats to a team based on the similarity of an architecture to other known architectures

About you

* You thrive in a collaborative and collegial team that employs Agile and Scrum practices to deliver at pace with high visibility
* You are a self-starter who seeks opportunities to innovate and bring new ways of thinking to a team
* You have a product mindset: you seek customer and stakeholder feedback on your work and use that feedback to improve the features you deliver
* You value high-quality, secure code with good documentation and full test coverage
* You value and seek to understand the context of your work
* You are naturally curious, always looking to learn new technologies and stay at the leading edge of technology. You enjoy bringing new technologies and practices to the team

This role requires a range of skills, including:

* Experience with Python and the data science ecosystem, including pandas, NumPy, SciPy, scikit-learn, NLTK, etc.
* Experience with linked data and semantic web technologies for building knowledge graphs
* Experience with building recommendation engines
* The ability to design visualisations and dashboards that provide meaningful and actionable insights for users
* Expertise in working with structured and unstructured data from a variety of sources, including APIs, object stores, and relational and non-relational databases
* The ability to produce data pipelines using tools such as Airflow
* Capability to build APIs for exposing analytics to other consumers
* A drive to work with front-end and back-end engineers to deliver analytic products from inception to production
* Agile and Scrum working practices and ceremonies

Experience in some of the following is preferred:

* Experience with leveraging Python frameworks for backend API services
* Experience with using JavaScript for front-end visualisations and dashboards
* Expe