DevOps, SRE, and platform work from Jakarta

I'm Faisal. I do the work that keeps production boring.

I lead infrastructure and reliability work for a multi-brand fintech operation in Jakarta. Most days that means AWS, deployments, migrations, and making sure real-time trading systems stay steady when the pressure is on.

I did not get into infrastructure through some neat master plan. I started with networks, Linux boxes, and the usual "can you just fix this one thing" work. That turned into cloud engineering, SRE, and eventually the kind of role where a quiet weekend usually means the prep was good.

Right now I lead infrastructure and platform reliability for a fintech operation with multiple brands and multiple AWS accounts. I spend a lot of time around Kubernetes, ECS, Terraform, CI pipelines, and observability tooling, but the real work is usually about safer change, clearer rollback paths, and fewer surprises in production.

The part I care about most is turning messy operational knowledge into something a team can actually use. Runbooks people trust. Migration plans with real stop points. RCAs that explain what actually happened instead of just satisfying a template.

Writing

Notes from infra work, usually written after the deploy is done and the pager is quiet again.

Latest · March 14, 2026 · 6 min read

Can We Stop With the LeetCode for DevOps Roles?

If a platform or DevOps interview wants to test real skill, it should look more like debugging, trade-offs, and change management than a whiteboard binary tree exercise.

devops · interviews · career · platform-engineering

7 min read · devops

DevOps Is Not a Side Task

Teams ask for calmer deploys, safer infra, and fewer incidents, then try to squeeze that work into somebody else's sprint. That arrangement usually breaks.

6 min read · devops

Nobody Memorizes YAML

The useful skill is not memorizing Kubernetes YAML. It is understanding the system, knowing what to look up, and staying comfortable with the docs.

5 min read · sre

Before You Promise 99.9% Uptime, Do the Math

An uptime target sounds simple in a meeting until you turn it into downtime minutes, breach conversations, and the engineering work needed to support it.
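
As a rough illustration of that math, here is a minimal Python sketch that turns an availability target into a downtime budget. The function name, the 30-day window, and the sample targets are assumptions made for this example, not figures taken from the post.

```python
def allowed_downtime_minutes(target_percent: float, window_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an availability target.

    Assumes a fixed window (30 days by default) and that every minute
    in the window counts toward the target.
    """
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - target_percent) / 100


if __name__ == "__main__":
    # Sample targets chosen for illustration only.
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% over 30 days allows about {budget:.1f} minutes of downtime")
```

Under these assumptions, 99.9% over 30 days leaves roughly 43 minutes of downtime, which is a useful number to have in hand before the target goes into a contract.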

Read note

What I work with

Tools I touch often enough to have opinions about. Not everything, just the ones that show up in the real work.

Cloud & Infrastructure

AWS: Still where a lot of my week goes · Terraform: Useful until state reminds you who's in charge · CloudFormation: I use it when the job calls for it · AliCloud: Part of my MIFX years · Ansible: Still handy when the servers are not containerized · Secrets Manager: A better home for credentials

Containers

Kubernetes: Powerful, but not a place to bluff · K3s: Nice when you want less ceremony · Docker: Still the quickest way to explain a runtime problem · Portainer: Helpful when a simple UI saves time · ECS: A calmer container story for some workloads · ECR: Plain, reliable, does what it needs to

CI/CD

GitHub Actions: Where a lot of release discipline lives · GitLab CI/CD: Used it long enough to know its habits · OIDC Roles: Better than passing credentials around

Observability

Prometheus: Great once the alerts earn their keep · Grafana: Useful when the dashboards stay honest · CloudWatch: Basic, noisy, still unavoidable · Uptime Kuma: Simple and honest · EventBridge: More useful than it looks at first

Networking

Cloudflare: Helpful at the edge, especially on bad days · WireGuard: Fast and clean · OpenVPN: Not pretty, but dependable · Tailscale: Saves a lot of setup time · Nginx: Still the reverse proxy I reach for first · Traefik: Good when discovery matters

Databases

PostgreSQL: The database I trust most · Aurora: Managed convenience with trade-offs · TimescaleDB: Very nice when time-series fits the problem · MySQL/MariaDB: Familiar territory · Redis/Valkey: Fast until everyone treats it like a database · Memcached: Perfectly fine when simple is enough

Languages

Bash: Still the fastest path from idea to automation · Python: My usual glue language · Go: What I reach for when the tool should feel solid

Operations & SRE

Incident Response: Calm matters more than heroics · Runbooks: Best written before the page · Root Cause Analysis: Where the real learning should happen · On-call: Teaches you what is actually brittle

Journey

The short version of how I got here, working backward.

Engineering Operations Lead

Fulk Tech

Nov 2023 - Present

Promoted into the lead role and now own the infrastructure side of a fast-moving fintech platform.

I look after infrastructure, CI/CD, and reliability across more than five AWS accounts. A lot of the work is technical, but a lot of it is coordination too: internal engineers, consultants, and business teams all touching the same production path.

-- Moved database access away from shared static root passwords and into short-lived IAM-backed auth. Built a small onboarding helper so new engineers could get into RDS without the usual credential scramble
-- Led a migration from Memcached to Valkey (the Redis-compatible engine) on AWS, with rollback-safe API routing, separate secret paths per brand, and Multi-AZ failover built in
-- Proposed and led a migration from API Gateway REST APIs to HTTP APIs to cut cost and improve latency, then worked through the phased rollout with the backend team
-- Built the Terraform foundations for multi-account work and wrote a Go CLI that helped standardize repo onboarding, workflows, environments, and AWS discovery across five repositories
-- Designed GitHub Actions pipelines with OIDC auth, matrix deploys, dry runs, smoke checks, and approval gates across five AWS accounts
-- Built Ansible playbooks for non-containerized EC2 services over SSM, covering restarts, artifact rollout, config templating, and health checks
-- Built a Jira to Slack notification bot with shift rotation, acknowledgement tracking, escalation after four hours, and follow-up monitoring after issues were marked resolved
-- Wrote RCAs for production incidents involving CloudFront, RDS, service crashes, and release regressions, with a focus on what actually changed and why it mattered
-- Ran large migration programs with clear run sheets, stop points, and rollback criteria, and spent plenty of time digging into tagging strategy and spend spikes along the way
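
The four-hour escalation rule from the notification bot above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, and it assumes escalation is triggered by a ticket staying unacknowledged, which may not match how the real bot decides.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed threshold, based on the "escalation after four hours" detail above.
ESCALATION_AFTER = timedelta(hours=4)


def needs_escalation(
    created_at: datetime,
    acknowledged_at: Optional[datetime],
    now: datetime,
    threshold: timedelta = ESCALATION_AFTER,
) -> bool:
    """Return True when a ticket is still unacknowledged past the threshold.

    Hypothetical helper: an acknowledged ticket never escalates here,
    regardless of how long the acknowledgement took.
    """
    if acknowledged_at is not None:
        return False
    return now - created_at >= threshold


if __name__ == "__main__":
    opened = datetime(2026, 3, 14, 9, 0, tzinfo=timezone.utc)
    # Three hours in and unacknowledged: still inside the window.
    print(needs_escalation(opened, None, opened + timedelta(hours=3)))
    # Five hours in and unacknowledged: past the window.
    print(needs_escalation(opened, None, opened + timedelta(hours=5)))
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the rule easy to test against fixed timestamps.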

Site Reliability Engineer

MIFX (Monex Investindo Futures)

Jan - Nov 2023

This was where I built a lot of the migration and monitoring muscle I still rely on.

I worked as an SRE on a multi-cloud fintech platform running across AWS and AliCloud. The job touched reliability, Kubernetes operations, databases, monitoring, and the kind of migration work that can quietly take over your weekends.

-- Built CloudWatch alarms for the EC2 fleet, installed the agent across instances, tuned thresholds after OOM incidents, and later helped move the stack toward Prometheus and Grafana
-- Led a Tableau Server migration from AliCloud to AWS, including Terraform provisioning, Ansible-based install, backup and restore planning, and cron alignment
-- Handled PostgreSQL to Aurora migration work, TimescaleDB restore and rename operations, and more than a few MariaDB production headaches
-- Managed K8s and K3s cron jobs, rollouts, scaling, OOM debugging, and image pull tuning across namespaces
-- Wrote Terraform for VPC, EC2, and security groups from scratch, imported existing resources into state, and joined migration sprints across more than six environments
-- Configured VPN routing across AWS and AliCloud, plus Cloudflare Zero Trust DNS and site-to-site VPN links with local banking partners

Cloud Engineer

PT Maha Karya Perwira

2020 - 2022

This was the stretch where infrastructure stopped being abstract and became daily practice.

I worked across cloud and on-prem infrastructure, set up CI/CD around self-hosted GitLab and Nexus, handled migration work with Ansible, and built out monitoring with Prometheus and Grafana.

-- Managed AWS infrastructure with Docker, Portainer, and Kubernetes in the mix
-- Handled DNS, load balancing, WAF, and VPN setup across Route 53, Nginx, Traefik, OpenVPN, and Tailscale
-- Put disaster recovery and backup flows in place with S3 and archival storage

IT Support & Networking

PT Transformasi Mindset Indonesia

2016 - 2018

This is where I realized I liked infrastructure work more than anything else in IT.

I managed network infrastructure and worked on system upgrades that improved transfer speeds by about 40%. More importantly, this was where I figured out what kind of work I actually wanted to keep doing.

Education

Bachelor of Information Technology

BINUS University, Jakarta, 2017 - 2023

Get in touch

If you have an infrastructure problem, a role to discuss, or something interesting to build, send me a note.

Jakarta time: UTC+7

LinkedIn GitLab

The best emails tell me what you're building, what feels risky, and where you want help.