Director, Site Reliability Engineering (SRE) Job at Binary Defense

Binary Defense Remote

Description:

Overview

Binary Defense, headquartered in Stow, Ohio, is a rapidly growing cybersecurity software and services firm with solutions that include best-in-class Managed Detection & Response powered by a Managed Open XDR platform. The company has a 24/7 Security Operations Center that monitors its own proprietary managed EDR software and supports leading network, cloud and identity solutions. Advanced threat hunting, defense validation and counterintelligence services provide additional layers of security. Our expert security staff and technology help shield businesses from cyberattacks.

Binary Defense is a fast-paced business that enjoys a relaxed culture (from anywhere in the continental United States) and flexible remote work options. For the fourth year in a row, Binary Defense has been recognized as one of the fastest-growing private companies in the US on the Inc. 5000 list! At the 2022 Greater Cleveland Partnership’s “Best of Tech Awards,” Binary Defense was recognized as the “Best Technology Solution” for the third year in a row. We’ve also been named “North American Partner of the Year” by AT&T Cybersecurity, providing best-in-class SIEM technology and service. Binary Defense recently completed a $36 million growth equity round of funding from Invictus Growth Partners to accelerate our growth and technology and service delivery offerings.

Binary Defense offers competitive medical, dental and vision coverage for employees and dependents, a 401(k) match which vests every payroll, a flexible and remote-friendly work environment, and training opportunities to expand your skill set (to name a few!). If you’re interested in joining a growing team with great perks, we encourage you to apply!

About the Role

The Director of Site Reliability Engineering (SRE) will lead our SRE Team, serve as a key player in the operational excellence of our current EDR product, and will be pivotal in both informing and executing on the vision of our next generation capabilities. We believe that production stability is the responsibility of the entire delivery team, and that excellent software is created through the proximity of development and operations activities. Keep reading if you are a software engineering leader with a passion for automation, enjoy short release cycles, appreciate working with software delivery teams, and relentlessly focus on continuous improvement.

Our delivery cycles are highly variable – with some activities needing a rapid response and delivery to all customers within hours and others requiring deep research, planning, and validation over many weeks. The successful candidate for this role must have experience leading teams through similar release models and possess the mental flexibility to deal with such complexity. You will be responsible for automating, monitoring, and improving both system reliability and availability. You will be a Subject Matter Expert in evaluating the performance and risk of outgoing software features. You will lead the effort of monitoring, tracking, reporting, and improving trends for Service Level Indicators (SLIs) and performance against Service Level Objectives (SLOs) within agreed-upon error budgets. Additionally, you will partner with the Corporate IT Team to ensure the environments supporting our products are aligned to standards and ownership is clear.

Binary Defense is looking for a talented, high-energy, collaborative person with deep expertise in building cloud-native enterprise applications to lead a team maintaining and evolving a complex product. In this leadership role you will be responsible for solutions focused on the success of internal and external customers, and building deep partnerships within our organization (e.g. Product Management, Customer Service, Implementation, Security Operations, and Security Engineering).

As a remote-friendly team, we default to trust and expect the best from each other. We thrive when we cooperate with each other to deliver timely and effective work. We do our best to help everyone bring their whole selves to work, encourage diversity, and support family-friendliness and flexibility in our schedule.

Key Responsibilities

  • Engage as a player-coach, capable and interested in splitting time across hands-on and leadership work.
  • Focus on reliability, performance, efficiency improvements, and monitoring of the various environments supporting our internal team members and external customers.
  • Effectively manage deployments and monitor environments in the cloud (Azure and AWS) and on-premises.
  • Participate in Architecture, Design, and Proof of Concept activities to inform decisions around automation, environments, and tooling.
  • Coach and mentor your team while they are members of a delivery team, involved in design, development, testing, capacity planning, and readiness reviews.
  • Identify and implement enablement tools that support our continuous integration (CI) / continuous delivery (CD) system and automation framework. Ensure the tools are selected with input from the delivery team and are then adopted consistently.
  • Partner with architects, developers, product management, and other internal subject matter experts to ensure the team has a strategy to achieve desired service level objectives (SLOs).
  • Lead effort to monitor, alert, & report on overall system health by tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
  • Share the status of key performance indicators and metrics via dashboards during regular reviews with the Software Engineering Leadership Team.
  • Use data to advocate for changes that will measurably improve reliability and increase velocity.
  • Reduce manual activities with limited long-term value through automation. Measure, establish goals, and celebrate the improvements achieved.
  • Participate in the on-call rotation as a point of escalation.
  • Perform initial review and triage of production issues, escalating as appropriate.
  • Champion sustainable incident response and blameless root cause analysis.
  • Other projects and responsibilities, as assigned by the direct manager.
Requirements:

Education/Experience

  • Must be a US Citizen and reside in the continental US.
  • Deep expertise in Azure (prefer both AWS and Azure experience).
  • Computer Science, Software Engineering, or similar degree. Equivalent real-world experience would be acceptable in lieu of degree.
  • 10+ years of hands-on experience architecting, contributing to code bases, and successfully delivering customer-facing software.
  • 7+ years of experience managing, mentoring, coaching, and leading senior and mid-level Architects, Managers, and Engineers.
  • Experience successfully leading distributed teams.
  • Expert knowledge of software engineering best practices.
  • Strong communication and collaboration skills, including the ability to clearly express technical concepts in verbal and written forms.
  • Ability to successfully define and drive adoption of tools, processes, and frameworks across multiple teams.
  • Deep knowledge and experience with:
      • Managed Kubernetes offerings such as AKS, EKS, GKE
      • GitLab Build Pipelines
      • Docker
      • Terraform

Other Knowledge, Skills and Abilities

  • Balanced business and technical background. Sufficient level of technical background to provide highly-credible leadership to technology teams. Ability to accurately and objectively evaluate complex risks and issues, and communicate these effectively to business stakeholders.
  • Successfully achieved positive outcomes executing software engineering initiatives applying Agile methodologies (Scrum, Kanban, XP, etc.) in a pragmatic way.
  • Proven track record of motivating teams, instilling accountability for high quality delivery.
  • History of leading teams with varied release horizons and correctly implementing strategies that solve problems at the right level, taking into consideration all relevant factors.
  • Technologist - Knowledge and interest in the latest system architecture, automation, cloud, and advanced technology trends with the ability to rapidly learn and apply new technology. Strong ability to share and teach to accelerate the team's adoption of new technologies.
  • Calculated Risk Taker - Understands that end user satisfaction is a balance between features, service, and performance.
  • Collaborative - Works closely with team members and stakeholders to understand needs, gain perspective, and collectively deliver solutions with a shared purpose.
  • Enthusiastic - Must be high-energy and a passionate advocate for quickly delivering value.
  • Adaptive and Inclusive - Works with team members to understand pain points and adjust standards, tools, and best practices accordingly. Learns from the team and adjusts with a focus on enablement.
  • Attitude of transparency - Must desire to bring disclosure and transparency.
  • Creativity, initiative, and flexibility - tempered by pragmatism, patience, and attention to detail.
  • Honest, humble, friendly, and collegial.
  • Creative problem-solver - Ability to look at solutions in creative and unconventional ways, recognize opportunities to innovate, and engage partners in a vision and strategy while maintaining the "big picture" view.
  • Commitment to continuous improvement. Ability to dynamically adjust the plan, to resolve impediments as well as to meet changing business needs.
  • Accountable - must embody a strong sense of responsibility for the timely completion of tasks, as well as the responsibility to ensure a shared understanding of shared tasks.

Preferred

  • Experience with a mix of processes (Kanban, Scrum, XP, LeSS, SAFe, Waterfall, etc.).
  • Azure and AWS related certifications.
  • Security industry experience.
  • Experience with Kubernetes-based tooling:
      • Cert Manager
      • KEDA
      • Ingress Controllers such as NGINX Ingress and HAProxy Ingress
      • AWS Controllers for Kubernetes (ACK) or Azure Service Operator (ASO)
      • Argo

  • Mastery of secrets management.
  • Experience monitoring with:
      • Prometheus
      • Grafana
      • Datadog

  • Experience administering and scaling:
      • .NET Core
      • Python
      • Postgres
      • Redis

  • Experience with Windows and Linux operating systems.
  • Experience managing messaging infrastructures.
  • Experience with AWS solution architecture and management, including:
      • EC2
      • ECS
      • S3
      • VPC

  • Experience with:
      • Debian/Linux
      • Ansible

PM21



