  1. Jan 2025
    1. How we migrated onto K8s in less than 12 months
      • Figma's Initial Infrastructure Challenges:

        • Figma's monolithic architecture struggled with resource allocation inefficiencies and limited scalability.
        • High traffic spikes from collaborative design workflows required more robust solutions for resource autoscaling and failover.
      • Why Kubernetes Was Chosen:

        • Kubernetes' container orchestration capabilities promised better resource management and service isolation.
        • Features like Horizontal Pod Autoscaling (HPA), robust networking via Kubernetes Services, and support for StatefulSets made it an ideal fit for Figma’s needs.
        • The team also wanted better alignment with cloud-native practices and modern CI/CD workflows.
      • Incremental Migration Approach:

        • Step 1: Non-Critical Services: Figma migrated stateless services first, allowing experimentation without risking core functionality.
        • Step 2: Custom Tooling: Internal tooling was built to manage Kubernetes manifests and automate Helm chart creation for standardization.
        • Step 3: Stateful Services: For databases and other stateful components, Figma relied on Kubernetes' StatefulSets and persistent volumes (PVs) to ensure data integrity during the migration.
        • Step 4: Observability Enhancements: Kubernetes-native tools like Prometheus and Grafana were integrated to provide detailed metrics and system insights.
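To make Step 2 concrete, here is a minimal sketch of what manifest-standardizing tooling could look like. It is not Figma's actual tooling, and it emits a plain Deployment manifest rather than a Helm chart. The function name `build_deployment`, its parameters, the default resource values, and the image URL are all hypothetical; only the manifest field names come from the Kubernetes `apps/v1` Deployment schema.

```python
import json


def build_deployment(name, image, replicas=2, cpu_request="250m",
                     memory_request="256Mi", cpu_limit="500m",
                     memory_limit="512Mi"):
    """Return a Deployment manifest as a plain dict.

    Hypothetical helper: every parameter name and default is an
    illustrative assumption, not taken from the source article.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "resources": {
                            "requests": {"cpu": cpu_request,
                                         "memory": memory_request},
                            "limits": {"cpu": cpu_limit,
                                       "memory": memory_limit},
                        },
                    }],
                },
            },
        },
    }


# Hypothetical service name and image, for illustration only.
manifest = build_deployment("thumbnailer",
                            "registry.example.com/thumbnailer:1.0.0")
# JSON is valid YAML, so this output can be piped to `kubectl apply -f -`.
print(json.dumps(manifest, indent=2))
```

Generating every service's manifest from one function is one way to get the standardization the step describes: labels, selectors, and resource settings cannot drift between services.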
      • Key Technical Adjustments During Migration:

        • Service Discovery: Transitioned to Kubernetes-native DNS for internal service communication, replacing legacy methods.
        • Load Balancing: Leveraged Kubernetes Ingress and external load balancers (e.g., NGINX or cloud-native solutions) for traffic routing.
        • Networking Complexity: Resolved challenges around multi-cluster networking using Kubernetes CNI plugins like Calico.
        • Resource Management: Used ResourceQuotas together with per-container resource requests and limits to prevent node overcommitment and optimize cluster utilization.
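Two of the adjustments above can be illustrated with a short sketch. The first helper builds the DNS name that Kubernetes assigns to a Service, `<service>.<namespace>.svc.<cluster-domain>`, assuming the default cluster domain `cluster.local`. The second checks whether a set of CPU requests fits under a namespace CPU quota. The helper names and the example service, namespace, and quota values are hypothetical, and the quantity parser handles only plain cores (`"2"`, `"0.5"`) and millicores (`"500m"`).

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def cpu_millicores(quantity):
    """Convert a CPU quantity such as "500m" or "2" to millicores.

    Simplified: only the plain-core and millicore forms are supported.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def fits_cpu_quota(pod_requests, quota):
    """True if the summed CPU requests do not exceed the quota."""
    total = sum(cpu_millicores(q) for q in pod_requests)
    return total <= cpu_millicores(quota)


# Hypothetical service and namespace names.
print(service_dns_name("payments", "prod"))
print(fits_cpu_quota(["500m", "250m", "1"], "2"))   # 1750m against 2000m
print(fits_cpu_quota(["1500m", "750m"], "2"))       # 2250m against 2000m
```

In a real cluster the API server enforces the quota at admission time; the check here only mirrors the arithmetic.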
      • Challenges Faced:

        • Stateful Services: Ensuring zero-downtime migration for databases required careful orchestration of PersistentVolumeClaims (PVCs) and StatefulSets.
        • Networking: Handling cross-region traffic and external dependencies required tweaking Kubernetes Ingress configurations.
        • Resource Constraints: Balancing costs and performance involved tuning cluster-autoscaler configurations and evaluating node pool setups.
      • Benefits Realized Post-Migration:

        • Scalability: Kubernetes' HPA allowed Figma to scale pods dynamically based on traffic patterns, ensuring consistent performance.
        • Deployment Efficiency: CI/CD pipelines integrated seamlessly with Kubernetes, enabling faster and more reliable rollouts using tools like Argo CD.
        • Reliability: Self-healing capabilities, such as pod restarts and node failover, reduced downtime during failures.
        • Observability: Improved system monitoring with the Kubernetes Metrics Server and integrations with Prometheus and Grafana.
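The scaling behaviour credited to HPA above follows a simple rule documented by Kubernetes: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below reproduces only that core rule plus the min/max clamp. The function name and the default replica bounds are assumptions, and it omits stabilization windows, unready-pod handling, and multiple metrics.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA replica calculation for a single metric."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))   # → 6
# 62% against a 60% target is inside the 10% tolerance.
print(desired_replicas(4, 62, 60))   # → 4
# Load drops to half the target, so the pod count halves.
print(desired_replicas(10, 30, 60))  # → 5
```

Because the result is rounded up, the autoscaler errs toward extra capacity rather than running below the target.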
      • Future Enhancements Planned:

        • Service Mesh Integration: Adoption of Istio or Linkerd to enhance observability, security (e.g., mutual TLS), and traffic management.
        • Cost Optimization: Further tuning autoscaling policies and resource limits to minimize waste.
        • Edge Improvements: Deploying Kubernetes clusters closer to end-users for reduced latency, potentially using Kubernetes' Cluster Federation.
  2. Jan 2023
    1. tl;dw (best DevOps tools in 2023)

      1. Low-budget cloud computing: Civo (close to Scaleway)
      2. Infrastructure and Service Management: Crossplane
      3. App Management - manifests : cdk8s (yes, not Kustomize or Helm)
      4. App Management - k8s operators: tie between Knative and Crossplane
      5. App Management - managed services: Google Cloud Run
      6. Dev Envs: Okteto (yep, not GitPod)
      7. CI/CD: GitHub Actions (as it's simplest to use)
      8. GitOps (CD): Argo CD (wins over Flux due to its adoption rate)
      9. Policy Management: Kyverno (simpler to use than the industry's most powerful tool: OPA / Gatekeeper)
      10. Observability: OpenTelemetry (instrumentation of apps), VictoriaMetrics (metrics; yes, not Prometheus), Grafana / Loki (logs), Grafana Tempo (tracing), Grafana (dashboards), Robusta (alerting), Komodor (troubleshooting)