Platform Engineering: Building Internal Developer Platforms That Developers Actually Want to Use
Platform Engineering creates Internal Developer Platforms that enable self-service infrastructure through golden paths and standardized workflows. This guide covers IDP architecture, implementation patterns, tooling options (Backstage, Port, AWS services), success metrics, and common pitfalls for teams building platforms developers will actually adopt.
Understanding Platform Engineering
Working with platform teams over the years has shown me that Platform Engineering is fundamentally about designing toolchains and workflows that enable self-service capabilities for software engineering organizations. The key shift is treating internal developers as customers rather than users to be controlled.
What makes Platform Engineering different from traditional DevOps?
The core difference is the platform-as-a-product mindset. Instead of DevOps teams acting as service providers responding to tickets, platform teams become product owners who build tools that developers voluntarily choose because they provide superior experiences.
This isn't just semantics; it's a fundamental change in approach:
- DevOps: "We'll deploy that for you (ticket required)"
- Platform Engineering: "Here's how you deploy it yourself in 5 minutes"
Why Platform Engineering is trending in 2025:
The market data tells the story. The platform engineering services market is projected to reach $40.17B by 2032 (a 23.99% CAGR). Gartner predicts that 80% of large software engineering organizations will have platform teams by 2026 (up from 45% in 2022).
The driver? A developer productivity crisis. Research shows 75% of developers lose 6+ hours weekly to tool fragmentation. Organizations need to scale DevOps practices without creating bottlenecks, while managing ever-increasing cloud-native complexity.
Core Philosophy:
Working with various platform implementations has taught me these principles matter most:
- Developer empowerment through self-service (not ticket-driven workflows)
- Standardization without reducing flexibility (golden paths, not rigid mandates)
- Reducing cognitive load (one way that works well beats ten options)
- Enabling speed AND safety simultaneously (not trading one for the other)
- Product thinking applied to internal tools (developers are customers)
Core Components of an Internal Developer Platform
Here's what actually makes up an effective IDP, based on what works in practice:
Software Catalog / Service Catalog
A central registry of all services, APIs, resources, and teams. This isn't just documentation; it's a living inventory that answers "who owns this?" and "what depends on what?"
Backstage uses YAML-based catalog entities (Component, System, API, Resource, User, Group, Domain). The key is version control integration with GitHub or GitLab for catalog-as-code:
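A catalog-as-code entry might look like the following sketch, which assumes Backstage's `backstage.io/v1alpha1` Component format. In practice the entity lives as a `catalog-info.yaml` file in the service's repository; here it is built as a Python dict so the ownership fields can be checked before the file is committed. The service name, owner, and system are hypothetical, and `missing_fields` is an illustrative helper, not part of Backstage.

```python
import json

# Hypothetical Component entity, mirroring what a catalog-info.yaml would hold.
entity = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Component",
    "metadata": {
        "name": "payments-api",            # hypothetical service name
        "description": "Handles card payments",
    },
    "spec": {
        "type": "service",
        "lifecycle": "production",
        "owner": "team-payments",          # hypothetical Group entity
        "system": "checkout",              # hypothetical System entity
    },
}

# Fields this sketch treats as mandatory for a Component.
REQUIRED = [
    ("metadata", "name"),
    ("spec", "type"),
    ("spec", "lifecycle"),
    ("spec", "owner"),
]


def missing_fields(candidate: dict) -> list[str]:
    """Return dotted paths of required fields that are absent or empty."""
    return [
        f"{section}.{key}"
        for section, key in REQUIRED
        if not candidate.get(section, {}).get(key)
    ]


if __name__ == "__main__":
    print(json.dumps(entity, indent=2))
    print("missing:", missing_fields(entity))
```

A check like this could run in CI so that no service merges without an owner, which is what keeps the catalog able to answer "who owns this?".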
Golden Paths / Paved Roads
The term "Golden Paths" comes from Spotify (Netflix calls the same concept "Paved Roads"). The definition that works best: "An opinionated, well-documented, supported way of building and deploying software".
Characteristics of effective golden paths:
- Fully self-serviceable (no ticket filing required)
- Minimal cognitive load (sensible defaults, clear documentation)
- Discoverable by anyone in the organization
- Optional but most convenient (adoption through superiority, not mandate)
- Drives standardization naturally (people choose it because it's better)
Technical examples of golden paths:
- Standardized service templates (new microservice in 30 minutes)
- Pre-configured CI/CD pipelines (testing + deployment wired up)
- Infrastructure-as-Code modules (common patterns ready to use)
- Containerization blueprints (security baseline included)
- Monitoring and observability pre-wired (no after-the-fact instrumentation)
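To make the "sensible defaults" idea concrete, here is a minimal sketch of a service template using only Python's standard library. The template fields, default values, and the `scaffold` function are assumptions made for illustration; a real golden path would generate a repository, pipeline, and infrastructure rather than a short text file.

```python
from string import Template

# Hypothetical defaults a platform team might bake into its service template.
DEFAULTS = {"runtime": "python3.12", "port": "8080", "replicas": "2"}

SERVICE_TEMPLATE = Template(
    "service: $name\n"
    "owner: $owner\n"
    "runtime: $runtime\n"
    "port: $port\n"
    "replicas: $replicas\n"
)


def scaffold(name: str, owner: str, **overrides: str) -> str:
    """Render a service definition; callers supply only what differs from the defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Customization is allowed, but only for options the template supports.
        raise ValueError(f"unsupported options: {sorted(unknown)}")
    values = {**DEFAULTS, **overrides, "name": name, "owner": owner}
    return SERVICE_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(scaffold("orders", "team-orders"))
    print(scaffold("orders", "team-orders", replicas="3"))
```

The developer provides two values and gets a working definition. Everything else is an override, which keeps cognitive load low while still leaving room to deviate within supported options.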
Self-Service Actions / Developer Portal
The UI layer for platform capabilities. This is where developers interact with the platform to create services, deploy environments, and provision databases, all without waiting on another team.
Common implementations include web portals, CLI tools, IDE plugins, and Slack/ChatOps integrations. The best platforms support all of these, letting developers work however they prefer.
Infrastructure as Code (IaC) Orchestration
Centralized IaC templates and modules enable consistent infrastructure provisioning. Version-controlled infrastructure definitions ensure repeatability.
Examples include AWS CDK constructs, Terraform modules, and Pulumi components. Note that AWS Proton is scheduled to be discontinued on October 7, 2026, so teams relying on it should plan alternatives.
CI/CD Integration
Standardized pipeline templates with automated testing, quality gates, and progressive deployment strategies (canary, blue-green). Integration with GitHub Actions, GitLab CI, Jenkins, or CircleCI.
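As an illustration of a quality gate in a canary rollout, the sketch below decides whether to promote, roll back, or keep waiting based on error rates. The function name, thresholds, and the choice of error rate as the only signal are assumptions; real pipelines usually also look at latency and other signals.

```python
def canary_decision(
    baseline_error_rate: float,
    canary_error_rate: float,
    canary_requests: int,
    min_requests: int = 500,
    tolerance: float = 0.005,
) -> str:
    """Return 'wait', 'promote', or 'rollback' for a canary release."""
    if canary_requests < min_requests:
        return "wait"  # not enough traffic yet to judge the canary
    if canary_error_rate > baseline_error_rate + tolerance:
        return "rollback"  # canary is measurably worse than the stable version
    return "promote"


if __name__ == "__main__":
    print(canary_decision(0.010, 0.011, canary_requests=2000))
    print(canary_decision(0.010, 0.030, canary_requests=2000))
    print(canary_decision(0.010, 0.030, canary_requests=100))
```

Baking a gate like this into the pipeline template means every team gets progressive delivery without designing its own promotion rules.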
Observability and Monitoring
Pre-configured monitoring dashboards, standardized logging and tracing, alerting templates, and cost tracking. Tools like Prometheus, Datadog, New Relic, or CloudWatch integrate here.
Security and Compliance
Policy-as-code enforcement, security scanning automation (SAST, DAST, dependency scanning), compliance guardrails built into golden paths, and secret management integration.
Tools like Snyk, Dependabot, AWS Security Hub, and OPA (Open Policy Agent) automate security at platform level.
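The idea behind policy-as-code can be sketched in plain Python. This is not OPA (which evaluates policies written in Rego) but a simplified stand-in showing the shape of a guardrail: a request goes in, a list of violations comes out. The `BucketRequest` type and the three rules are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BucketRequest:
    """Hypothetical self-service request for a storage bucket."""
    name: str
    encrypted: bool
    public: bool
    tags: dict[str, str] = field(default_factory=dict)


def violations(req: BucketRequest) -> list[str]:
    """Evaluate a request against illustrative guardrails; an empty list means allowed."""
    found = []
    if not req.encrypted:
        found.append("encryption at rest must be enabled")
    if req.public:
        found.append("public access is not allowed")
    if "owner" not in req.tags:
        found.append("an 'owner' tag is required")
    return found


if __name__ == "__main__":
    ok = BucketRequest("reports", encrypted=True, public=False, tags={"owner": "team-data"})
    bad = BucketRequest("scratch", encrypted=False, public=True)
    print(violations(ok))
    print(violations(bad))
```

Returning every violation at once, instead of failing on the first, gives developers actionable feedback in a single pass.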
Platform as a Product Mindset
Here's what I've learned matters most when building platforms: developers are customers, not users to be controlled.
Platform success is measured by voluntary adoption. If you mandate platform usage, you've lost the most important feedback signal: whether developers actually find it valuable.
Critical practices that work:
- Research developer pain points before building
- Build continuous feedback loops (surveys, office hours, Slack channels)
- Measure developer satisfaction quarterly (NPS or custom metrics)
- Don't mandate platform adoption (it closes feedback loops)
- Build what developers need, not what you think they need
- Maintain a product roadmap and communicate changes transparently
Product thinking checklist:
- Defined user personas (different dev team types)
- User research conducted (surveys, interviews)
- Clear value proposition for developers
- Onboarding experience designed
- Documentation written for humans
- Support channels established
- Success metrics defined and tracked
- Roadmap shared transparently
- Feedback mechanisms in place
Implementation Patterns and Best Practices
Starting Your Platform Journey
Don't start with:
- Biggest, most critical service (too risky for nascent platform)
- Portal/UI first (need solid backend APIs first)
- Top-down mandates (kills adoption)
- "Build it and they will come" mentality
- Everything in-house ("Not Invented Here" syndrome)
Do start with:
- Minimum viable platform (MVP) with core value
- Pilot with friendly, interested team
- Backend APIs and orchestration first, UI second
- One golden path done really well
- Existing pain point that everyone feels
- Developer research and feedback
Build vs. Buy Considerations
Build custom when:
- Highly specific organizational requirements
- Deep integration with legacy systems needed
- Have dedicated platform engineering team (3+ engineers)
- Long-term investment commitment
- React/TypeScript expertise available (for Backstage)
Buy/adopt existing when:
- Standard requirements covered by existing tools
- Limited platform engineering resources
- Need fast time-to-value (weeks, not months)
- Prefer managed solutions over self-hosting
- Want active community and ecosystem
Team Structure Recommendations
Platform Engineering Team Composition:
- Former infrastructure engineers (already have expertise)
- Product-minded engineers (user empathy)
- Developer experience (DevEx) specialists
- Technical writers (documentation matters)
Avoid: Moving all senior engineers to platform team (creates knowledge gaps in dev teams)
Team size by organization:
- Small (< 50 engineers): 1-2 platform engineers
- Medium (50-200 engineers): 3-5 platform engineers
- Large (200-500 engineers): 5-10 platform engineers
- Enterprise (500+ engineers): 10-20+ platform engineers
Common guideline (not industry standard): ~1 platform engineer per 30-50 application developers. This ratio varies significantly based on platform maturity, organizational complexity, and automation level.
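The ratio above works out as follows. This is a rough sizing sketch under the stated 1:30 to 1:50 assumption, with a hypothetical function name; it does not account for platform maturity or automation level.

```python
import math


def platform_team_range(
    app_developers: int, low_ratio: int = 30, high_ratio: int = 50
) -> tuple[int, int]:
    """Return (min, max) platform engineers for 1 per high_ratio up to 1 per low_ratio developers."""
    if app_developers <= 0:
        raise ValueError("app_developers must be positive")
    smallest = max(1, math.ceil(app_developers / high_ratio))
    largest = max(1, math.ceil(app_developers / low_ratio))
    return smallest, largest


if __name__ == "__main__":
    for developers in (40, 200, 500):
        print(developers, platform_team_range(developers))
```

For 200 application developers this gives a range of 4 to 7 platform engineers, close to but not identical to the size bands listed above, a reminder that both are heuristics.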
Tooling Landscape
Backstage (Open Source by Spotify)
Open-sourced in 2020, Backstage has the largest ecosystem and market share among developer portal frameworks. It has a plugin-based architecture with a React/TypeScript frontend and a Node.js backend.
Strengths:
- Highly customizable for complex requirements
- Rich plugin ecosystem (100+ plugins)
- Free and open source
- Large community support
- Unified developer experience
Challenges:
- Significant implementation effort (3-6 months typical)
- Requires React/TypeScript/SAML expertise
- Self-hosting required (or use managed service like Roadie)
- Steep learning curve
- Ongoing maintenance burden
Important reality check: Organizations often underestimate the effort involved. Gartner has warned that Backstage is "not ready-to-use". It's a framework, not a product: you're building your IDP using Backstage components.
Data point: Companies report roughly 2x as many code changes and a 17% decrease in cycle time after implementing Backstage.
When to choose Backstage:
- Large-scale, complex environments
- Need deep customization
- Have dedicated development resources
- React/TypeScript expertise available
- Open source preference
Port (Commercial Platform)
Founded in 2022, Port has raised $60M. It's a no-code platform for IDP creation, hosted as SaaS.
Strengths:
- Much faster time-to-value (weeks vs months)
- No React/TypeScript expertise required
- Excellent onboarding experience
- Auto-import from GitHub/GitLab
- Less maintenance overhead
- Built-in tutorials and guides
- Dynamic inventorying (CI/CD flows, clusters, environments)
- Advanced search and RBAC
Challenges:
- Full rollout can still be lengthy (3-6 months reported), despite the faster initial setup
- Significant licensing costs (higher than competitors)
- Less customizable than Backstage
- Vendor lock-in concerns
- Higher total cost of ownership
When to choose Port:
- Want user-friendly, hassle-free IDP
- Limited technical expertise for platform development
- Prefer SaaS over self-hosted
- Need fast deployment
- Budget for commercial tooling
AWS Services for Platform Engineering
AWS Proton (Note: scheduled for discontinuation on October 7, 2026)
- Managed service for infrastructure template vending
- Platform engineers define standards, developers self-serve
- Organizations should plan alternatives
Amazon EKS (Elastic Kubernetes Service)
- Fully managed Kubernetes for platform foundations
- EKS Blueprints: Curated templates for complete EKS setup
- Platform engineering patterns well-supported
- Integrates with Backstage, Port, custom IDPs
AWS CDK (Cloud Development Kit)
Infrastructure as Code written in general-purpose programming languages (TypeScript, Python, Java, and others), which makes it well suited to packaging golden paths as reusable constructs.
AWS Service Catalog
- Vending machine for approved AWS resources
- Pre-configured product portfolios
- Budget controls and governance
- Alternative to AWS Proton
Other Notable Tools
- Humanitec: Platform orchestration, focuses on application configuration
- Kratix: Platform-as-a-product framework on Kubernetes
- Crossplane: Infrastructure composition using Kubernetes APIs
- Terraform Cloud/Enterprise: Workspace management for teams
- Pulumi: Multi-language IaC with state management
- ArgoCD / Flux: GitOps for Kubernetes deployments
- Cortex: Developer scorecards and standards tracking
Measuring Platform Success
DORA Metrics (Traditional DevOps)
Four Key Metrics:
- Deployment Frequency: How often code deploys to production
- Lead Time for Changes: Time from commit to production
- Time to Restore Service: How long to recover from failure
- Change Failure Rate: Percentage of deployments causing issues
Elite Performers (DORA 2024):
- Deploy multiple times per day
- Lead time < 1 day
- Recovery time < 1 hour
- Failure rate < 5%
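The four metrics can be computed from deployment records along these lines. The `Deployment` record and its field names are assumptions for the sketch; real data would come from your CI/CD and incident tooling, and DORA's own definitions have more nuance than this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # caused an incident or rollback
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_summary(deploys: list[Deployment], window_days: int) -> dict[str, float]:
    """Summarize the four DORA metrics over a window of deployments."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    restore_times = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(hours(d.deployed_at - d.committed_at) for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "median_restore_hours": median(restore_times) if restore_times else 0.0,
    }


if __name__ == "__main__":
    t0 = datetime(2025, 1, 1, 9, 0)
    sample = [
        Deployment(t0, t0 + timedelta(hours=2)),
        Deployment(t0, t0 + timedelta(hours=4)),
        Deployment(t0, t0 + timedelta(hours=6)),
        Deployment(t0, t0 + timedelta(hours=8), failed=True,
                   restored_at=t0 + timedelta(hours=8, minutes=30)),
    ]
    print(dora_summary(sample, window_days=2))
```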
DORA Limitations for Platform Engineering
Critical gap I've observed: DORA measures software delivery performance, NOT platform engineering effectiveness.
What DORA misses:
- Infrastructure management quality
- Security and compliance improvements
- Platform usability and developer happiness
- Tech debt reduction (unless it causes failures)
- Scalability and maintainability work
- Day 2-N operations improvements
Key insight: You cannot evaluate platform engineering teams on DORA metrics alone.
Platform-Specific Metrics
Developer Experience (DevEx) Metrics:
- Platform Adoption Rate: % of teams using IDP vs alternatives
- Self-Service Success Rate: % of self-service actions completed without help
- Time to First Deployment: How long for new team to deploy using platform
- Developer Satisfaction Score: Quarterly surveys (NPS or custom)
- Tool Fragmentation Score: Number of tools developers must use
- Onboarding Time: Days to productivity for new engineers
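For the Developer Satisfaction Score, NPS is computed from 0-10 survey answers as the percentage of promoters (9-10) minus the percentage of detractors (0-6). A small sketch, with a hypothetical function name and made-up survey data:

```python
def net_promoter_score(scores: list[int]) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6), on 0-10 survey answers."""
    if not scores:
        raise ValueError("no survey responses")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


if __name__ == "__main__":
    # Made-up answers to "How likely are you to recommend the platform to a colleague?"
    survey = [10, 9, 8, 7, 6, 3, 10, 9, 9, 5]
    print(net_promoter_score(survey))
```

The score ranges from -100 to 100. Tracking its trend quarter over quarter is usually more informative than any single value.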
Platform Health Metrics:
- Golden Path Usage: % of deployments using standard templates
- Support Ticket Volume: Platform-related help requests (downward trend = success)
- Platform Uptime: Availability of platform services
- Template Update Velocity: How quickly platform capabilities improve
- Documentation Coverage: % of platform features documented
Business Impact Metrics:
- Cost Optimization: Infrastructure spend reduction through standardization
- Security Posture: Vulnerability reduction, compliance improvements
- Velocity Impact: Team throughput before/after platform adoption
- Operational Efficiency: Reduced toil, automation coverage
Recommended Approach (DX Core 4 Framework):
Combine DORA metrics with platform-specific measures across four dimensions:
- Speed: Deployment frequency, lead time
- Effectiveness: Self-service success, time to value
- Quality: Change failure rate, security posture
- Business Impact: Cost savings, team efficiency
Organizations using this balanced approach report 3-12% efficiency gains, 14% increases in R&D focus, and 15% improvements in developer engagement.
Platform metrics tracking example:
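The sketch below shows one way to roll up a few of the platform-specific metrics listed earlier (adoption rate, self-service success rate, golden path usage). The `TeamUsage` record, its field names, and the sample figures are all assumptions; where the underlying counts come from depends on your portal and pipeline tooling.

```python
from dataclasses import dataclass


@dataclass
class TeamUsage:
    """Hypothetical per-team record collected for one quarter."""
    team: str
    uses_platform: bool
    self_service_attempts: int
    self_service_unassisted: int   # completed without a support request
    deployments: int
    golden_path_deployments: int


def platform_scorecard(usage: list[TeamUsage]) -> dict[str, float]:
    """Aggregate team records into platform-level ratios between 0 and 1."""
    if not usage:
        raise ValueError("no team usage records")
    attempts = sum(t.self_service_attempts for t in usage)
    deployments = sum(t.deployments for t in usage)
    unassisted = sum(t.self_service_unassisted for t in usage)
    golden = sum(t.golden_path_deployments for t in usage)
    return {
        "adoption_rate": sum(t.uses_platform for t in usage) / len(usage),
        "self_service_success_rate": unassisted / attempts if attempts else 0.0,
        "golden_path_usage": golden / deployments if deployments else 0.0,
    }


if __name__ == "__main__":
    quarter = [
        TeamUsage("payments", True, 40, 36, 100, 90),
        TeamUsage("search", True, 10, 9, 50, 30),
        TeamUsage("mobile", True, 50, 40, 40, 40),
        TeamUsage("legacy-billing", False, 0, 0, 10, 0),
    ]
    for metric, value in platform_scorecard(quarter).items():
        print(f"{metric}: {value:.0%}")
```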
Common Anti-Patterns and Pitfalls
Strategic Anti-Patterns
Confusing Platform with Portal
- Reality: Platform is backend APIs, orchestration, and golden paths
- Portal is just the UI layer
- Fix: Build solid backend first, UI second
Not Treating Platform as Product
- Symptom: Low adoption, developer frustration
- Root cause: Building without user research
- Fix: Apply product management practices, treat developers as customers
Top-Down Mandates Without Buy-In
- Problem: Forces developers to use tools they don't want
- Impact: Resistance, workarounds, shadow IT
- Fix: Make platform best option through superior experience
"Field of Dreams" Mentality
- Mistake: "Build it and they will come"
- Reality: Platform must solve actual developer pain
- Fix: Start with research, validate with pilots
Implementation Anti-Patterns
Starting with Biggest/Most Critical Service
- Risk: Too much pressure on nascent platform
- Impact: Failed pilot damages platform credibility
- Fix: Start with friendly team, non-critical service
Overly Complex Platforms
- Symptoms: Unfamiliar config formats, no documentation, inconsistent APIs
- Impact: Developers avoid platform
- Fix: Simplicity first, consistency always, document everything
Over-Reliance on Ticket Systems
- Problem: Tickets create bottlenecks, reduce autonomy
- Impact: Slow delivery, developer frustration
- Fix: True self-service, minimal approval workflows
Templates-as-a-Service Only
- Problem: Rigid templates, no customization
- Impact: Workarounds, shadow IT, abandoned templates
- Fix: Templates as starting point, allow customization within boundaries
Organizational Anti-Patterns
Skill Concentration Trap
- Mistake: Moving all senior engineers to platform team
- Impact: Knowledge gaps in development teams
- Fix: Balanced team distribution, rotate people
Underinvested Platforms
- Symptom: Platform team disbands after initial delivery
- Impact: Platform becomes unmaintained anchor
- Fix: Long-term investment commitment, ongoing team
Lack of Cost Controls
- Problem: Self-service without spending limits
- Impact: Runaway cloud costs
- Fix: Budget controls, automated spending limits, cost visibility
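An automated spending limit can be as simple as checking each self-service request against the team's remaining budget. A minimal sketch, assuming a hypothetical `TeamBudget` record, an 80% warning threshold, and a cost estimate supplied by the provisioning workflow:

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    monthly_limit_usd: float
    spent_usd: float


def review_request(
    budget: TeamBudget, estimated_monthly_usd: float, warn_at: float = 0.8
) -> str:
    """Return 'approve', 'warn', or 'block' for a self-service provisioning request."""
    projected = budget.spent_usd + estimated_monthly_usd
    if projected > budget.monthly_limit_usd:
        return "block"    # would exceed the team's monthly limit
    if projected >= warn_at * budget.monthly_limit_usd:
        return "warn"     # allowed, but the team should see a cost warning
    return "approve"


if __name__ == "__main__":
    budget = TeamBudget(monthly_limit_usd=1000, spent_usd=500)
    for estimate in (100, 350, 600):
        print(estimate, review_request(budget, estimate))
```

The request stays self-service in the common case. Only requests that would breach the limit need a human decision.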
Getting Started: Practical Roadmap
Phase 1: Foundation (Weeks 1-4)
- Identify 3-5 biggest developer pain points (surveys, interviews)
- Define platform vision and success metrics
- Form initial platform team (2-3 people)
- Choose pilot team (friendly, non-critical service)
- Document current state (tools, workflows, pain points)
Phase 2: MVP Development (Weeks 5-12)
- Build one golden path (e.g., standard API service template)
- Create basic service catalog (manual is fine)
- Implement self-service workflow (CLI or simple UI)
- Write clear documentation
- Deploy pilot with friendly team
Phase 3: Validation (Weeks 13-16)
- Gather pilot feedback (what works, what doesn't)
- Measure baseline metrics (time to deploy, satisfaction)
- Iterate on golden path based on feedback
- Expand to 2-3 more teams
- Document learnings and adjust roadmap
Phase 4: Scale (Months 5-12)
- Add 2-3 more golden paths (common use cases)
- Build or adopt developer portal (Backstage/Port decision)
- Integrate observability and security
- Implement cost controls
- Scale to 25-50% of organization
- Establish support channels
Phase 5: Mature (Year 2+)
- Continuous improvement based on metrics
- Advanced features (AI assistance, automated optimization)
- Cross-team collaboration features
- Platform API stability and versioning
- Community building (internal user groups)
Quick win ideas for early momentum:
- Standardized service template: New service in < 30 minutes
- One-click environment: Ephemeral dev/test environments
- Automated security scanning: Build into pipelines
- Cost dashboards: Show teams their spend
- Onboarding automation: New engineer productivity in < 1 day
Key Takeaways
For Engineering Leaders:
- Platform engineering is product management applied to internal tools
- Measure success through developer satisfaction, not just DORA metrics
- Long-term investment required (people, time, budget)
- Start small, validate, scale based on feedback
For Platform Engineers:
- Backend APIs before UI/portal
- One golden path done well beats many mediocre ones
- Developer research prevents building wrong things
- Documentation and onboarding are product features, not afterthoughts
- Security and cost controls from day one
For Development Teams:
- Platforms succeed when developers choose them voluntarily
- Provide feedback to platform teams (they need it)
- Golden paths are shortcuts, not cages
- Self-service reduces friction but requires platform investment
Critical Success Factors:
- Product mindset (developers as customers)
- Start with real pain points
- Measure what matters (not just DORA)
- Avoid common anti-patterns
- Long-term commitment and iteration
Platform Engineering is fundamentally about enabling developers to do their best work. When you treat your platform as a product and your developers as customers, you create something they'll actually want to use. And that makes all the difference.