Available for opportunities

Hello, I'm William Valdez

Platform Engineer | Site Reliability Engineering | Agentic AI Systems

I have spent over a decade building and running cloud and hybrid infrastructure for enterprise organizations. I care about making systems that stay up, stay secure, and make life easier for the teams that depend on them.

infrastructure.tf
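terraform {
  # Multi-cloud infrastructure: provider versions are pinned here
  required_providers {
    aws     = { source = "hashicorp/aws", version = "~> 5.0" }
    azurerm = { source = "hashicorp/azurerm", version = "~> 3.0" }
    google  = { source = "hashicorp/google", version = "~> 5.0" }
  }
}

module "platform" {
  # Local path sources do not accept a version argument
  source = "./modules/core"

  # Enterprise features
  observability = true
  zero_trust    = true
  dr_automation = true
  security_hub  = true
  compliance    = ["SOC2", "HIPAA"]
}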

Technical Skills

These are the tools and technologies I use daily to build and maintain reliable infrastructure. I enjoy working across the full stack, from cloud platforms to application code.

Cloud and Platform Engineering

  • AWS (EC2, ECS, EKS, Lambda, RDS, S3, CloudFormation)
  • Azure (AZ-104) with Entra ID, VMs, AKS
  • GCP with Compute, GKE, Cloud Functions
  • Terraform for Infrastructure as Code
  • Multi-Cloud Architecture
  • High Availability and Fault Tolerance
  • Cloud Networking and VPC Design

APIs, Integrations and Events

  • REST APIs and API Gateway
  • Webhooks and Event-Driven Architecture
  • JSON/HTTP Protocols
  • OAuth2 and JWT Authentication
  • SSO Integration
  • Third-Party APIs (Stripe, Supabase)
  • Cloud Services Integration

Data and Persistence Layer

  • PostgreSQL and SQL Schema Design
  • Relational Databases
  • Row-Level Security (RLS)
  • Data Modeling and Query Optimization
  • State and Metadata Storage
  • DynamoDB and NoSQL
  • Database Replication and DR

DevOps, CI/CD and Automation

  • Azure DevOps and GitHub Actions
  • Jenkins CI/CD Pipelines
  • GitOps Workflows
  • Automated Deployments
  • Environment Promotion (Dev/Prod)
  • Secrets Management and Vault
  • Infrastructure Automation

SRE and Production Operations

  • Site Reliability Engineering
  • Prometheus, Grafana, Loki
  • Monitoring and Alerting
  • Incident Response and Root Cause Analysis
  • Patch and Fleet Management (Automox)
  • Backup and Disaster Recovery
  • 99.9%+ Uptime SLAs

Security, Identity and Zero Trust

  • IAM, RBAC, and Entitlement Management
  • Active Directory and Azure AD Administration
  • Security Attack Surface Analyzer
  • Firewall Implementation and Patch Coordination
  • SiteLock Website Security Monitoring
  • Symantec Anti-Virus Deployment
  • Security Hub and GuardDuty

GRC and Identity Governance

  • Varonis and SailPoint Administration
  • Separation of Duties (SoD) Policies
  • User Access Reviews and Certifications
  • Risk Analysis and Remediation (RAR)
  • Emergency Access Management (Firefighter)
  • AD Security Groups Compliance Auditing
  • Service Account Management and Auditing

AI and Application Engineering

  • AI-Driven Applications
  • Agentic Workflows
  • OpenAI / Claude Integrations
  • Mobile and Web App Development (iOS, Android, PWA)
  • Real-Time Interactive UIs
  • Subscription and Entitlement Systems
  • MCP Server Integration

Networking and Systems

  • TCP/IP and DNS
  • Citrix Access Management
  • Distributed File System (DFS)
  • VPC Architecture
  • Windows Systems Administration
  • Transit Gateway and Peering
  • Route53 and Global Accelerator

Certifications & Credentials

Azure: AZ-104 Azure Administrator
Security: SSCP (Systems Security Certified Practitioner)
Cloud: Multi-Cloud Architecture
Identity: Active Directory & Entra ID

Projects I Have Built

Here are some of the infrastructure and platform projects I have worked on. Each one taught me something new about building reliable systems at scale.

Full-Stack

Terraform Academy Platform

A learning platform I built for teaching Infrastructure as Code. It has interactive labs, progress tracking, and supports multiple organizations with SSO.

Architecture

  • React SPA (frontend)
  • Vercel Edge (CDN)
  • Serverless Functions (API)
  • PostgreSQL (database)
  • Stripe (payments)
  • Docker Labs (compute)

Key Features

  • Interactive IaC coding labs
  • Progress tracking and certifications
  • Multi-tenant organizations with SSO
  • Subscription management
React | TypeScript | Serverless | Stripe | Docker

AWS Infrastructure

AWS DR Automation

Cross-region disaster recovery that fails over in under 15 minutes. I built this using Route53, RDS replication, and Step Functions to automate the entire process.
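
As an illustration of the health-based DNS failover described above, here is a minimal Terraform sketch. It is not the project's actual configuration: the variable names, the `/healthz` path, and `app.example.com` are assumed placeholders, and it covers only the Route53 piece, not RDS replication or the Step Functions runbook.

```hcl
# Hypothetical inputs; real values depend on the environment.
variable "zone_id" { type = string }
variable "primary_endpoint" { type = string }   # e.g. the primary region's load balancer DNS name
variable "secondary_endpoint" { type = string } # e.g. the standby region's load balancer DNS name

# Route53 probes the primary endpoint and marks it unhealthy after 3 failed checks.
resource "aws_route53_health_check" "primary" {
  fqdn              = var.primary_endpoint
  type              = "HTTPS"
  port              = 443
  resource_path     = "/healthz" # assumed health endpoint
  request_interval  = 30
  failure_threshold = 3
}

# Served while the health check passes.
resource "aws_route53_record" "primary" {
  zone_id         = var.zone_id
  name            = "app.example.com"
  type            = "CNAME"
  ttl             = 60
  records         = [var.primary_endpoint]
  set_identifier  = "primary"
  health_check_id = aws_route53_health_check.primary.id

  failover_routing_policy {
    type = "PRIMARY"
  }
}

# Served automatically once the primary is reported unhealthy.
resource "aws_route53_record" "secondary" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "CNAME"
  ttl            = 60
  records        = [var.secondary_endpoint]
  set_identifier = "secondary"

  failover_routing_policy {
    type = "SECONDARY"
  }
}
```

A short TTL keeps the DNS portion of the failover time small, at the cost of more queries against the hosted zone.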

Architecture

  • Route53 (DNS)
  • Global Accelerator (edge)
  • ECS/EKS (compute)
  • RDS Multi-AZ (database)
  • S3 CRR (storage)
  • Step Functions (orchestration)

Key Features

  • RTO under 15 min, RPO under 5 min
  • Health-based DNS failover
  • Cross-region RDS replication
  • Automated runbook execution
Terraform | AWS | Step Functions | Route53 | RDS

Kubernetes

AWS EKS Observability

A complete observability stack for Kubernetes. Prometheus for metrics, Loki for logs, Tempo for traces, all tied together with Grafana dashboards.
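
The IRSA feature listed for this stack can be sketched in Terraform roughly as follows. This is an assumption-laden illustration, not the project's code: the variables, the `monitoring` namespace, the `loki` service account, and the role name are all hypothetical.

```hcl
# Hypothetical inputs taken from the EKS cluster's OIDC provider.
variable "oidc_provider_arn" { type = string }
variable "oidc_issuer_host" { type = string } # issuer URL without the https:// prefix

# Trust policy: only the named Kubernetes service account may assume the role.
data "aws_iam_policy_document" "loki_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_issuer_host}:sub"
      values   = ["system:serviceaccount:monitoring:loki"] # assumed namespace and account
    }
  }
}

# Role the Loki pods would use for S3 access; permissions policy omitted here.
resource "aws_iam_role" "loki" {
  name               = "eks-loki-s3"
  assume_role_policy = data.aws_iam_policy_document.loki_trust.json
}
```

Scoping the trust policy to a single service account means pods get AWS access without node-wide instance credentials.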

Architecture

  • Prometheus/Thanos (metrics)
  • Loki (logs)
  • Tempo (traces)
  • Grafana (visualization)
  • Alertmanager (alerting)
  • S3 Backend (storage)

Key Features

  • Unified metrics, logs, traces
  • IRSA for secure AWS access
  • Long-term S3 storage
  • Pre-built dashboards
EKS | Prometheus | Grafana | Loki | Terraform

Security

AWS Security Compliance

Security scanning and auto-remediation built on Security Hub, GuardDuty, and Config. When something drifts out of compliance, Lambda functions fix it automatically.
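
One common way to wire the routing step between Security Hub and a remediation Lambda is an EventBridge rule, sketched below. This is a hedged example under stated assumptions, not the project's implementation: the rule name, the filter on failed findings, and `remediation_lambda_arn` are hypothetical.

```hcl
# Hypothetical input: ARN of an existing remediation function.
variable "remediation_lambda_arn" { type = string }

# Match active, new Security Hub findings whose compliance check failed.
resource "aws_cloudwatch_event_rule" "failed_findings" {
  name        = "securityhub-failed-findings"
  description = "Route failed Security Hub findings to remediation"

  event_pattern = jsonencode({
    source        = ["aws.securityhub"]
    "detail-type" = ["Security Hub Findings - Imported"]
    detail = {
      findings = {
        Compliance  = { Status = ["FAILED"] }
        RecordState = ["ACTIVE"]
        Workflow    = { Status = ["NEW"] }
      }
    }
  })
}

# Send matching events to the Lambda function.
resource "aws_cloudwatch_event_target" "remediate" {
  rule = aws_cloudwatch_event_rule.failed_findings.name
  arn  = var.remediation_lambda_arn
}

# Allow EventBridge to invoke the function for this rule only.
resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = var.remediation_lambda_arn
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.failed_findings.arn
}
```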

Architecture

  • Security Hub (aggregation)
  • GuardDuty (detection)
  • AWS Config (compliance)
  • EventBridge (routing)
  • Lambda (remediation)
  • SNS/Slack (notifications)

Key Features

  • CIS/PCI-DSS/SOC2/HIPAA compliance
  • Auto-remediation workflows
  • Real-time security alerts
  • Compliance dashboards
Security Hub | GuardDuty | Config | Lambda | Terraform

Governance

AWS Landing Zone

Multi-account setup using AWS Organizations and Control Tower. I automated account provisioning and set up SCPs to keep everything secure by default.
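
To show what a preventive SCP can look like in Terraform, here is a small sketch. The two deny statements, the policy name, and `workload_ou_id` are illustrative assumptions, not the guardrails this landing zone actually uses.

```hcl
# Hypothetical input: ID of the organizational unit to protect.
variable "workload_ou_id" { type = string }

# Deny actions that would weaken the account baseline.
data "aws_iam_policy_document" "guardrails" {
  statement {
    sid       = "DenyLeavingOrganization"
    effect    = "Deny"
    actions   = ["organizations:LeaveOrganization"]
    resources = ["*"]
  }

  statement {
    sid       = "ProtectCloudTrail"
    effect    = "Deny"
    actions   = ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"]
    resources = ["*"]
  }
}

resource "aws_organizations_policy" "guardrails" {
  name    = "baseline-guardrails"
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.guardrails.json
}

# Attaching at the OU level applies the policy to every account beneath it.
resource "aws_organizations_policy_attachment" "workloads" {
  policy_id = aws_organizations_policy.guardrails.id
  target_id = var.workload_ou_id
}
```

Because SCPs only restrict and never grant permissions, accounts created under the OU inherit these limits without any per-account setup.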

Architecture

Key Features

  • OU hierarchy management
  • Preventive SCPs
  • Account factory automation
  • Centralized logging
Organizations | Control Tower | SCPs | IAM | Terraform

DevOps

AWS CI/CD Pipeline

Production deployment pipeline using CodePipeline and CodeBuild. Supports blue-green and canary deployments with automatic rollback if something goes wrong.
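
For the ECS case, blue-green traffic shifting with automatic rollback can be expressed in a CodeDeploy deployment group, roughly as sketched below. This is an assumed configuration for illustration only: every variable, the application name, and the choice of the canary deployment config are hypothetical rather than taken from the project.

```hcl
# Hypothetical inputs describing existing resources.
variable "codedeploy_role_arn" { type = string }
variable "ecs_cluster_name" { type = string }
variable "ecs_service_name" { type = string }
variable "listener_arn" { type = string }
variable "blue_target_group_name" { type = string }
variable "green_target_group_name" { type = string }

resource "aws_codedeploy_app" "web" {
  compute_platform = "ECS"
  name             = "web"
}

resource "aws_codedeploy_deployment_group" "web" {
  app_name               = aws_codedeploy_app.web.name
  deployment_group_name  = "web-blue-green"
  service_role_arn       = var.codedeploy_role_arn
  deployment_config_name = "CodeDeployDefault.ECSCanary10Percent5Minutes" # 10% first, rest after 5 minutes

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  # Roll back to the previous task set if the deployment fails.
  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE"]
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout = "CONTINUE_DEPLOYMENT"
    }

    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5
    }
  }

  ecs_service {
    cluster_name = var.ecs_cluster_name
    service_name = var.ecs_service_name
  }

  # CodeDeploy shifts the listener between the two target groups.
  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [var.listener_arn]
      }

      target_group {
        name = var.blue_target_group_name
      }

      target_group {
        name = var.green_target_group_name
      }
    }
  }
}
```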

Architecture

  • CodeCommit/GitHub (source)
  • CodeBuild (build)
  • ECR (registry)
  • CodeDeploy (deploy)
  • ECS/EKS (runtime)
  • CloudWatch (monitoring)

Key Features

  • Blue-green deployments
  • Cross-account promotion
  • Container scanning
  • Automated rollback
CodePipeline | CodeBuild | CodeDeploy | ECR | Terraform

Plus 4 more projects including AWS Cost Optimizer, Instance Scheduler, Server Tagging Module, and Homeschool Planner PWA.

View All on GitHub

Professional Experience

Here is a bit about my journey and what drives me.

Enterprise Cloud Infrastructure

I have spent over a decade building and running cloud systems for large organizations. My focus has always been on creating infrastructure that stays up when it matters most.

Identity and Access Management

I administer Active Directory and Microsoft Entra ID (formerly Azure AD) across multiple domains. I manage user identities, roles, service accounts, and access privileges while keeping cloud identities integrated with on-premises AD.

Security and Compliance

I run Security Attack Surface Analyzer on new implementations, audit AD Security Groups for compliance, and work with security teams on firewall patches and policy enforcement.

Platform Leadership

I lead platform work across both on-premises and cloud environments. I work closely with security, compliance, virtualization, and networking teams.

Career Timeline

Current

Site Reliability and Platform Engineer

Enterprise

I design and operate cloud infrastructure across AWS, Azure, and hybrid environments. I lead our identity management, observability, and compliance automation efforts.

  • Multi-cloud infrastructure (AWS, Azure, GCP)
  • Identity governance with Entra ID
  • Automated security compliance
  • High availability platforms (99.9%+ SLA)

Previous

Platform Engineer

Enterprise

I built and maintained critical infrastructure with a focus on automation, monitoring, and making sure we could recover quickly from any issue.

  • Terraform infrastructure as code
  • CI/CD pipeline automation
  • Monitoring and alerting systems
  • Disaster recovery implementation

Previous

IAM and GRC Engineer

Enterprise

I managed identity governance, AWS account provisioning, and compliance controls. I worked with Varonis and SailPoint to enforce Separation of Duties policies and conduct access certifications.

  • Active Directory and Azure AD administration with hybrid integration
  • User and service account management and auditing
  • Security Attack Surface Analyzer for vendor reviews
  • AD Security Groups auditing for compliance standards
  • GRC process management with Varonis and SailPoint
  • SoD policy enforcement and risk analysis
  • User Access Reviews and access certifications
  • Emergency Access Management (Firefighter access)

Earlier

Systems Engineer

Various

This is where I built my foundation in systems, networking, and security. I worked across IT Security, Virtualization, DFS, Citrix, and networking teams.

  • Windows Server and Citrix administration
  • Firewall implementation and patch coordination
  • SiteLock website monitoring and malware protection
  • Symantec Anti-Virus deployment
  • Security policy enforcement and escalation
  • DFS and network access coordination

10+ Years Experience | 99.9% Platform Uptime | 3 Cloud Platforms | 1000+ Identities Managed

Let's Talk

If you want to chat about platform engineering, cloud infrastructure, or potential opportunities, I would love to hear from you.

Open to Opportunities

I am looking for my next challenge in platform engineering or SRE. I want to join a team where I can help build infrastructure that makes a real difference.

Platform Engineering | Site Reliability | Cloud Architecture | DevOps | Agentic AI
Send me an email