William Lipford

Senior Automation Engineer | Site Reliability Engineer

Currently at Foresight Tech Sourcing | 25+ Years Engineering Resilient Systems

Transforming reliability from reactive firefighting to proactive architecture. Achieved Five 9s uptime for mission-critical financial systems through chaos engineering, automation, and intelligent observability.

25+ Years Experience | 99.999% Uptime Achievement | 82% MTTR Reduction | 500+ Systems Managed

Core Competencies

Two decades of hands-on experience building and scaling reliable infrastructure

Site Reliability Engineering

20 yrs experience

Expert

SLO/SLI Design | Incident Response | Capacity Planning | Chaos Engineering | Five 9s HA Architecture

Observability & Monitoring

20 yrs experience

Expert

OpenNMS (v22-27) | Grafana Dashboards | Prometheus | Splunk | Apache Kafka

DevOps & Automation

15 yrs experience

Expert

Ansible | GitLab CI/CD | Python Scripting | MuleSoft Integration | Infrastructure as Code

Cloud & Platform Engineering

15 yrs experience

Advanced

Multi-Cloud Architecture | Container Orchestration | Network Automation (Nautobot) | Platform Engineering

Database & Data Systems

15 yrs experience

Advanced

PostgreSQL Clusters | Cassandra Multi-node | Redis | Database HA/DR

Security & Compliance

10 yrs experience

Advanced

DISA STIG Implementation | RHEL Hardening | SSL/TLS Config | SELinux | Compliance Automation

AI/ML Infrastructure (Emerging)

1 yr experience

Advanced

MLOps Best Practices | AI-driven Observability | Platform SRE for AI | Emerging Technologies

Certifications & Training

ITIL 4 Foundation

IT Service Management (2024-2027)

SolarWinds Certified Professional

Network Management (2015-2020)

Network to Code Training

Network Automation, Nautobot, Python

Continuous Learning

MLOps, Kubernetes (CKA path)

Featured Projects

Real-world solutions delivering measurable business impact through reliability engineering

⭐ Flagship Project

Five 9s Achievement: NMS Customer Portal

April 2020 - October 2021

Achieved 99.999% availability for an enterprise monitoring platform serving 500+ critical systems across global financial infrastructure. Custom-compiled OpenNMS with an NGINX SSL reverse proxy and integrated it with 6 data sources, including BMC Remedy, Cassandra, Salesforce, Kafka, and Splunk. Implemented a complete DR/HA stack with a PostgreSQL 12 cluster, a 3-node Cassandra cluster, and a hot/cold standby architecture.

[Figure: NMS Customer Portal system architecture, a multi-tier infrastructure with NGINX reverse proxy]

Key Achievements:

  • Reduced MTTR from 45 minutes to 8 minutes (82% improvement)
  • Real-time alerting via Apache Kafka for immediate incident response
  • Custom compilation of OpenNMS for optimized performance
  • Topology and geographical dashboards with automated geo-location
  • Self-contained dashboard sets for enhanced security and navigation
  • Complete high-availability across entire stack
OpenNMS | Grafana | PostgreSQL 12 | Cassandra | Apache Kafka | NGINX | BMC Remedy | Splunk
99.999% Uptime Achieved | 8 min MTTR | 6+ Data Sources | 500+ Systems Monitored
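
The headline figures for this project can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a 365-day year when converting an availability target into a downtime budget.

```python
# Illustrative arithmetic only; function names are hypothetical.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Maximum downtime per year, in minutes, allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def mttr_reduction_pct(before_min: float, after_min: float) -> float:
    """Percentage reduction in mean time to recovery (MTTR)."""
    return (before_min - after_min) / before_min * 100


if __name__ == "__main__":
    print(f"99.999% allows {downtime_budget_minutes(99.999):.2f} min of downtime per year")
    print(f"MTTR 45 min -> 8 min is a {mttr_reduction_pct(45, 8):.1f}% reduction")
```

Under these assumptions, Five 9s leaves roughly 5.26 minutes of downtime per year, and the drop from 45 to 8 minutes works out to about 82%, matching the figure quoted above.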

DevOps Transformation: Source of Truth Implementation

Network to Code - July 2023 - September 2025

Led a comprehensive transformation of the company's core DevOps structure, overhauling the source-of-truth data repositories (IPAM, DNS, SOT, Code Revision). Implemented standardized orchestrations using Ansible for device configuration management and MuleSoft for application integration. The large-scale implementation replaced a significant portion of core systems and aligned with a zero-touch provisioning strategy.

Key Achievements:

  • Standardized documentation practices through GitLab and MkDocs
  • Compliance management and activity tracking via Splunk
  • Maintained stringent high-security standards throughout transformation
  • Zero-touch provisioning strategy implementation
  • Self-service infrastructure provisioning for development teams
Nautobot | Ansible | GitLab | MuleSoft | Splunk | MkDocs | Python | Docker | Grafana
-89% Provisioning Time | -90% Config Drift | 98% Compliance Score | 80% Automation Coverage
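
Config drift, as tracked above, is the gap between the configuration recorded in the source of truth and the configuration actually running on a device. The following is a minimal sketch of that comparison using only Python's standard library. It is not the project's actual tooling; the function name `config_drift` and the sample configuration lines are hypothetical.

```python
import difflib


def config_drift(intended: str, actual: str) -> list[str]:
    """Return the lines that differ between intended and actual config.

    Lines prefixed "- " exist only in the intended config; lines prefixed
    "+ " exist only in the actual (running) config.
    """
    diff = difflib.ndiff(intended.splitlines(), actual.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    intended = "hostname edge-01\nntp server 10.0.0.1\nlogging host 10.0.0.5"
    actual = "hostname edge-01\nntp server 10.0.0.9\nlogging host 10.0.0.5"
    for line in config_drift(intended, actual):
        print(line)
```

An empty result means the device matches the source of truth; any returned lines are candidates for remediation.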

Event-Driven Password Management with Kafka Streams

March 2021 - May 2022

Designed and implemented an event-driven password management solution using Apache Kafka Streams. Password State triggers automated account password rotation workflows with real-time event processing. Integrated with Grafana for monitoring and GitHub for version-controlled playbook management.

Key Achievements:

  • Real-time password rotation triggered by state changes
  • Event-driven architecture for scalable credential management
  • Grafana dashboards for monitoring password lifecycle
  • Automated compliance reporting and audit trails
Apache Kafka | Kafka Streams | Grafana | GitHub | Python | Password State | Event-Driven Architecture
100% Automated Rotations | <2s Response Time | 1000+ Accounts Managed | 14+ Integration Points
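
The event-driven pattern described above can be sketched in a few lines. This is a simplified illustration, not the production design: Kafka Streams is a Java library, so an in-memory `queue.Queue` stands in for the Kafka topic here, and the event shape (`PasswordEvent`), the `"expired"` state value, and the `vault` dictionary are all hypothetical.

```python
import secrets
from dataclasses import dataclass
from queue import Empty, Queue


@dataclass(frozen=True)
class PasswordEvent:
    """Hypothetical event shape; the real schema is not described here."""
    account: str
    state: str  # assumed values: "expired" or "ok"


def rotate(account: str) -> str:
    """Generate a new random credential (stand-in for a real rotation workflow)."""
    return secrets.token_urlsafe(24)


def process_events(topic: Queue, vault: dict[str, str]) -> list[str]:
    """Drain the queue, rotating each account whose event reports an expired state."""
    rotated = []
    while True:
        try:
            event = topic.get_nowait()
        except Empty:
            break  # no more events waiting
        if event.state == "expired":
            vault[event.account] = rotate(event.account)
            rotated.append(event.account)  # simple audit trail of rotated accounts
    return rotated


if __name__ == "__main__":
    topic: Queue = Queue()  # in-memory stand-in for a Kafka topic
    topic.put(PasswordEvent("svc-backup", "expired"))
    topic.put(PasswordEvent("svc-web", "ok"))
    vault = {"svc-backup": "old-secret", "svc-web": "unchanged"}
    print("rotated:", process_events(topic, vault))
```

The point of the design is that rotation happens in reaction to a state-change event rather than on a polling schedule, which is what keeps the response time short as the number of accounts grows.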

Nautobot on RHEL: Production-Grade Documentation

Comprehensive installation guide for Nautobot (Network Source of Truth) with DISA STIG security hardening, air-gapped deployment procedures, and PostgreSQL security best practices. Community-referenced implementation guide.

Nautobot | RHEL 8 | PostgreSQL | SELinux | Ansible
XX+ GitHub Stars | DISA STIG Security Standard | XX+ Organizations Using | 50+ Documentation Pages

Professional Experience

25+ years driving infrastructure reliability and automation excellence

✓ Current Position

Senior Automation Engineer

Foresight Tech Sourcing Inc.

January 2025 - Present
Remote
  • Serving as technical lead and architect driving innovative automation engineering solutions
  • Leveraging cutting-edge technology to enhance site reliability and performance
  • Leading CI/CD pipeline development and DevOps transformation initiatives
  • Fostering excellence in every project through innovative automation strategies
Continuous Integration (CI) | DevOps | Automation Engineering | Site Reliability

Technical Lead, Automation DevOps

Transaction Network Services

January 2022 - September 2025
Virginia, United States (Remote)
  • Spearheaded comprehensive transformation of DevOps core structure
  • Overhauled source of truth data repositories (IPAM, DNS, SOT, Code Revision)
  • Implemented standardized DevOps orchestrations using Ansible and MuleSoft
  • Established documentation standards through GitLab and MkDocs
  • Led large-scale implementation aligning with zero-touch provisioning strategy
  • Maintained stringent high-security standards for data integrity
Network Automation | DevOps | Ansible | GitLab | MuleSoft | Splunk
⭐ Achieved Five 9s (99.999%) Uptime

Sr. NMS Engineer

Transaction Network Services

January 2015 - January 2021
Reston, Virginia (Hybrid)
  • Deployed SolarWinds solution for SNMP/ICMP polling and NCM management
  • Developed Network Automation engine with MongoDB asynchronous communication
  • Achieved 99.999% availability for enterprise monitoring platform
  • Designed and deployed OpenNMS (v22.01 → v27.0.1) with DR/HA architecture
  • Built Grafana integration with 6+ data sources (Remedy, Cassandra, Kafka, Splunk)
  • Implemented Cassandra 3-node cluster and PostgreSQL 12 cluster with DR failover
  • Reduced MTTR from 45 minutes to 8 minutes (82% improvement)
  • Provided 24/7 operational support for mission-critical financial systems
OpenNMS | Grafana | SolarWinds | PostgreSQL | Cassandra | Apache Kafka | BMC Remedy | Network Management

Senior Application Architect

Transaction Network Services

June 2000 - March 2015
Reston, Virginia (Hybrid)
  • Deployed IT solutions for financial and operations sectors
  • Primary integration developer for 6+ major applications
  • Provided consultative guidance to junior architects
  • Managed end-to-end website development for custom applications
  • Maintained applications with 24/7 support for SLA commitments
Integration | Full-Stack Development | Architecture | Database Management

Certifications & Professional Development

Automating Networks with Python II

Network to Code

April 2025 • Current

ID: 193_5_13317_1744980730

Automating Nautobot with Python & Ansible Workshop

Network to Code

December 2024 • Current

ID: 155_5_13317_1734473903

Nautobot Config Compliance & Remediation w/ Golden Config

Network to Code

December 2024 • Expires December 2034

ID: I-D19PVG

Nautobot Extensibility Workshop

Network to Code

December 2024 • Current

ID: 154_5_13317_1734447661

ITIL 4 Foundation Certificate in IT Service Management

PeopleCert

November 2024 • Expires November 2027

ID: GR67170837OWL

Collaborative Workflows with Git and GitHub

Network to Code

October 2023 • Current

Network Programming and Automation

Network to Code

October 2023 • Current

SolarWinds Certified Professional

SolarWinds

October 2015 • Expired October 2020

ID: SCP4816

Continuous Learning Path

Currently expanding expertise in MLOps, Kubernetes (CKA certification path), and Platform Engineering to stay at the forefront of SRE evolution.

Get In Touch

Interested in discussing SRE challenges, DevOps transformation, or potential collaboration? Let's connect.