The Evaluation Framework
Why Most CIAM Evaluations Fail
Most teams evaluate CIAM solutions the way they evaluate any SaaS tool: they read a few G2 reviews, sit through vendor demos, and pick the one with the slickest presentation. Six months later, they discover the pricing model doesn't scale, the SDK has undocumented limitations, or migration off the platform requires rewriting half their auth flows.
I've been on both sides of this process - as a vendor building LoginRadius and as an advisor helping companies select identity solutions. The companies that make good decisions follow a structured methodology. The ones that don't end up switching vendors within 18-24 months, burning engineering cycles and budget in the process.
This chapter gives you that methodology. It's built from evaluating hundreds of CIAM deployments across startups, mid-market companies, and enterprises serving hundreds of millions of users.
The Five Evaluation Dimensions
Every CIAM solution should be evaluated across five dimensions. The weight you assign to each depends on your stage, industry, and use case - but ignoring any of them is a mistake.
Dimension 1: Security
Security isn't a feature checkbox. It's the foundation. A CIAM solution that handles authentication for your customers is, by definition, your most critical attack surface.
What to evaluate:
- Credential storage: How are passwords hashed? Bcrypt with a work factor of 12+ is the minimum. Argon2id is better. If a vendor can't tell you their hashing algorithm, that's a red flag.
- Token management: Are JWTs signed with an asymmetric algorithm (RS256 or ES256) rather than HS256, which relies on a shared secret? Is token rotation implemented? What's the refresh token lifetime?
- Brute force protection: Configurable account lockout policies, rate limiting per IP and per account, CAPTCHA integration after failed attempts.
- Breached password detection: Does the system check passwords against known breach databases (like Have I Been Pwned) at registration and login?
- Bot detection: Beyond CAPTCHA - behavioral analysis, device fingerprinting, IP reputation scoring.
- Vulnerability management: What's their CVE response time? Do they have a bug bounty program? When was their last pen test?
- Encryption: Data at rest (AES-256) and in transit (TLS 1.2+). Key management practices.
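Two of the checks above are easy to automate during a proof of concept. The sketch below is a minimal, stdlib-only illustration: it reads the work factor out of a bcrypt hash string (for example, from a test export) and reads the `alg` field from a JWT header. The function names are mine, not any vendor's API, and the JWT helper only inspects the header; it does not verify the signature.

```python
import base64
import json

# Asymmetric algorithms named in the checklist above.
ASYMMETRIC_ALGS = {"RS256", "ES256"}


def bcrypt_cost(hash_str: str) -> int:
    """Return the work factor encoded in a bcrypt hash such as $2b$12$..."""
    parts = hash_str.split("$")
    # A bcrypt hash splits into ['', '2b', '12', '<salt and digest>'].
    if len(parts) != 4 or not parts[1].startswith("2"):
        raise ValueError("not a bcrypt hash")
    return int(parts[2])


def jwt_alg(token: str) -> str:
    """Return the 'alg' header value of a JWT without verifying it."""
    header_b64 = token.split(".")[0]
    # base64url segments omit padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return header["alg"]


def uses_asymmetric_signing(token: str) -> bool:
    """True if the token header declares RS256 or ES256."""
    return jwt_alg(token) in ASYMMETRIC_ALGS
```

Run `jwt_alg` against a token issued by the vendor's sandbox and `bcrypt_cost` against a hash from a sample export. If the answers don't match what the sales team told you, that's worth a follow-up question.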
Ask vendors for their SOC 2 Type II report, not just their Type I. Type I is a point-in-time snapshot - it confirms controls exist. Type II covers 6-12 months and confirms controls actually work consistently. Many vendors will proudly wave a Type I report while hoping you don't notice the difference.
Dimension 2: Scalability
The question isn't whether the vendor can handle your current user base. It's whether they can handle 10x your current user base during a Black Friday traffic spike with no degradation.
What to evaluate:
- Authentication throughput: How many logins per second can the system handle? Get specific numbers, not "we auto-scale." Ask for P99 latency under load.
- Geographic distribution: Where are their data centers or edge nodes? Latency matters for login - a 300ms auth request feels instant, a 2-second one feels broken.
- Multi-tenancy architecture: If you serve multiple customers (B2B2C), does the platform support logical tenant isolation? Can each tenant have its own configuration?
- Rate limits: What are the API rate limits? Are they per-tenant, per-endpoint, or global? Rate limits that seem generous at 10,000 MAU can be crippling at 1 million.
- Data storage limits: Is there a cap on custom user attributes? Profile data size limits? Event log retention?
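If a vendor hands you raw load-test numbers instead of percentiles, you can compute them yourself. This is a minimal sketch using the nearest-rank method; the latency samples are made up for illustration, and a real load test would have thousands of them.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Made-up login latencies in milliseconds.
latencies_ms = [80, 95, 110, 120, 130, 150, 180, 240, 400, 1900]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"P50 = {p50} ms, P99 = {p99} ms")
```

The point of asking for P99 is visible even in this toy data: the median looks healthy while a single slow request dominates the tail.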
Scalability benchmarks to request:
| Metric | Acceptable | Good | Excellent |
|---|---|---|---|
| Login latency (P50) | < 500ms | < 200ms | < 100ms |
| Login latency (P99) | < 2s | < 500ms | < 200ms |
| Auth throughput | 1,000 rps | 5,000 rps | 10,000+ rps |
| Uptime SLA | 99.9% | 99.95% | 99.99% |
| Failover time | < 60s | < 30s | < 10s |
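The uptime column is easier to reason about as a downtime budget. The arithmetic below is a quick sketch that assumes a 365-day year and that the SLA is measured over the full year; real contracts often measure per month and exclude scheduled maintenance.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(sla_percent: float) -> float:
    """Downtime allowed per year by an uptime SLA given in percent."""
    return MINUTES_PER_YEAR * (1 - sla_percent / 100)


for sla in (99.9, 99.95, 99.99):
    minutes = downtime_minutes_per_year(sla)
    print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

That works out to roughly 8.8 hours a year at 99.9%, 4.4 hours at 99.95%, and under an hour at 99.99%.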
Dimension 3: Developer Experience
Your developers will live with this SDK every day. A CIAM solution with great features but a terrible developer experience will slow your team down and breed resentment.
What to evaluate:
- SDK quality: Are SDKs available for your tech stack? Are they actively maintained (check GitHub commit history)? Do they have TypeScript types? Are they tree-shakable?
- Documentation: Is the documentation complete, accurate, and up to date? Try following a quickstart guide before signing a contract. If you can't get a basic login working in under 30 minutes, the documentation has problems.
- API design: RESTful? GraphQL? Is the API consistent and well-structured? Are error messages helpful or cryptic?
- Customization depth: Can you customize the login UI without forking the SDK? Can you inject custom logic into auth flows (pre-login hooks, post-registration webhooks)?
- Local development: Can developers run the auth flow locally, or does every test require hitting the vendor's servers? Is there a local emulator or sandbox?
- Migration tooling: Can you import existing users without forcing password resets? Does the vendor support bulk import with hashed passwords?
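When you test hooks and webhooks during the proof of concept, check how the vendor lets you authenticate the calls. Schemes vary by vendor; the sketch below assumes a common one, an HMAC-SHA256 signature over the raw request body sent as a hex string. The function names and the signing scheme are assumptions for illustration, not any specific vendor's API.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_webhook(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)
```

If the vendor's webhooks carry no signature at all, or the docs don't explain how to verify one, note that as a friction point in your evaluation.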
Here's a practical test: have one of your engineers implement a basic auth flow using the vendor's SDK and documentation, from scratch, without vendor assistance. Time it. If it takes more than 4 hours to get email/password login and social login working, the developer experience needs improvement. If it takes more than 8 hours, consider it a red flag.
Dimension 4: Compliance
Compliance requirements vary dramatically by industry and geography. A CIAM solution that works for a consumer social app may be completely inadequate for a healthcare platform or a financial services product.
What to evaluate:
- Certifications: SOC 2 Type II, ISO 27001, HIPAA BAA (if healthcare), PCI DSS (if payments), FedRAMP (if government). Ask for current certificates, not "in progress."
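- Data residency: Can user data be stored in specific regions? GDPR restricts transfers of personal data outside the EU rather than mandating local storage, but in practice regional storage is often a hard requirement from customers and regulators in the EU, Canada (PIPEDA), Australia, and increasingly other jurisdictions.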
- Consent management: Does the platform support granular consent collection, versioned privacy policies, and consent withdrawal? Can you track consent history per user?
- Data portability: Can you export all user data in a standard format? GDPR Article 20 requires this.
- Right to erasure: Can you fully delete a user's data, including from backups and logs? How long does deletion take?
- Audit logging: What events are logged? How long are logs retained? Can they be exported to your SIEM?
- Age verification: If you serve minors, does the platform support age gates and parental consent (COPPA, UK Age Appropriate Design Code)?
Compliance matrix by industry:
| Requirement | SaaS (General) | Healthcare | Finance | E-Commerce | Government |
|---|---|---|---|---|---|
| SOC 2 Type II | Required | Required | Required | Recommended | Required |
| HIPAA BAA | - | Required | - | - | Sometimes |
| PCI DSS | - | - | Required | Required | - |
| GDPR | If EU users | If EU users | If EU users | If EU users | If EU users |
| FedRAMP | - | - | - | - | Required |
| Data residency | Recommended | Required | Required | Recommended | Required |
| Audit logs (12mo+) | Recommended | Required | Required | Recommended | Required |
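The matrix above can be turned into a simple gap check against what a vendor actually holds. This is a sketch that transcribes the table as-is; the `gaps` helper and the example vendor's certifications are hypothetical.

```python
INDUSTRIES = ["SaaS (General)", "Healthcare", "Finance", "E-Commerce", "Government"]

# One row per requirement, one cell per industry, copied from the matrix.
ROWS = {
    "SOC 2 Type II":      ["Required", "Required", "Required", "Recommended", "Required"],
    "HIPAA BAA":          ["-", "Required", "-", "-", "Sometimes"],
    "PCI DSS":            ["-", "-", "Required", "Required", "-"],
    "GDPR":               ["If EU users"] * 5,
    "FedRAMP":            ["-", "-", "-", "-", "Required"],
    "Data residency":     ["Recommended", "Required", "Required", "Recommended", "Required"],
    "Audit logs (12mo+)": ["Recommended", "Required", "Required", "Recommended", "Required"],
}


def requirements_for(industry: str, level: str = "Required") -> list[str]:
    """Requirements marked with the given level for an industry."""
    col = INDUSTRIES.index(industry)
    return [req for req, cells in ROWS.items() if cells[col] == level]


def gaps(industry: str, vendor_has: set[str]) -> list[str]:
    """Required items the vendor does not currently hold."""
    return [req for req in requirements_for(industry) if req not in vendor_has]


# Hypothetical vendor evaluated for a finance product.
print(gaps("Finance", {"SOC 2 Type II", "Data residency"}))
```

Anything this returns for your industry is a disqualifier until the vendor can show a current certificate, not an "in progress" one.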
Dimension 5: Pricing
CIAM pricing is where vendors get creative - and where buyers get burned. The initial quote is almost never the actual cost at scale.
What to evaluate:
- Pricing model: Per MAU, per authentication, flat rate, or hybrid? Each has different scaling characteristics.
- MAU definition: How does the vendor define "monthly active user"? Some count any user who exists in the system. Others count only users who authenticate. This difference can be 5-10x in cost.
- Feature gating: Which features are included in your tier? MFA, SSO, breached password detection, and adaptive auth are sometimes locked behind enterprise tiers that cost 3-5x the base price.
- Overage charges: What happens when you exceed your MAU limit? Some vendors charge per-user overages. Others throttle your service. Others auto-upgrade you to the next tier.
- Contract terms: Annual vs. monthly? What's the minimum commitment? What are the cancellation terms?
- Hidden costs: Custom domains, additional environments (staging, QA), premium support, data migration assistance, dedicated infrastructure.
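To see why the MAU definition matters, count the same user base both ways. The account IDs and login events below are made up, and the two function names are mine; the sketch only illustrates the two definitions described above.

```python
from datetime import date

# Made-up data: five stored accounts, four login events.
accounts = {"u1", "u2", "u3", "u4", "u5"}
logins = [
    ("u1", date(2024, 3, 2)),
    ("u1", date(2024, 3, 9)),   # same user again, still one active user
    ("u2", date(2024, 3, 15)),
    ("u3", date(2024, 2, 27)),  # previous month, not active in March
]


def mau_by_stored_accounts(accounts: set[str]) -> int:
    """Definition 1: every account that exists counts as active."""
    return len(accounts)


def mau_by_authentication(logins: list[tuple[str, date]], year: int, month: int) -> int:
    """Definition 2: distinct users who authenticated in the given month."""
    return len({uid for uid, day in logins if (day.year, day.month) == (year, month)})


print(mau_by_stored_accounts(accounts))        # billed under definition 1
print(mau_by_authentication(logins, 2024, 3))  # billed under definition 2
```

Run the same comparison on your own login logs before the pricing call, so you know which definition you can afford.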
Get pricing in writing for your projected scale at 12, 24, and 36 months. Several vendors offer attractive startup pricing that increases 5-10x as you grow. A vendor that costs $500/month at 10,000 MAU might cost $15,000/month at 500,000 MAU. Model the growth curve before you sign.
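Modeling the growth curve can be as simple as the sketch below. Only the 10,000 and 500,000 MAU price points come from the example above; the middle tiers, the starting MAU, and the 10% monthly growth rate are made-up placeholders you should replace with the vendor's written quote and your own forecast.

```python
import bisect

# (MAU ceiling, monthly price in USD). The two middle tiers are hypothetical.
TIERS = [(10_000, 500), (50_000, 2_000), (100_000, 4_500), (500_000, 15_000)]


def monthly_cost(mau: int) -> int:
    """Price of the smallest tier whose ceiling covers the given MAU."""
    ceilings = [ceiling for ceiling, _ in TIERS]
    i = bisect.bisect_left(ceilings, mau)
    if i == len(TIERS):
        raise ValueError("beyond the quoted tiers; ask the vendor for a number")
    return TIERS[i][1]


def projected_mau(start_mau: int, monthly_growth: float, months: int) -> int:
    """Compound growth projection, rounded to whole users."""
    return round(start_mau * (1 + monthly_growth) ** months)


for months in (12, 24, 36):
    mau = projected_mau(8_000, 0.10, months)
    print(f"month {months}: ~{mau:,} MAU -> ${monthly_cost(mau):,}/month")
```

The `ValueError` is deliberate: if your projection runs past the last tier you were quoted, you don't have a price for that scale yet.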
The Weighted Scoring Template
Not all dimensions matter equally for every organization. Here's a template with two example weightings:
For a B2B SaaS Company (Enterprise Customers)
| Dimension | Weight | Vendor A Score (1-5) | Weighted | Vendor B Score (1-5) | Weighted |
|---|---|---|---|---|---|
| Security | 30% | 4 | 1.20 | 5 | 1.50 |
| Scalability | 15% | 3 | 0.45 | 4 | 0.60 |
| Developer Experience | 20% | 5 | 1.00 | 3 | 0.60 |
| Compliance | 25% | 3 | 0.75 | 5 | 1.25 |
| Pricing | 10% | 4 | 0.40 | 2 | 0.20 |
| Total | 100% | | 3.80 | | 4.15 |
For a Consumer App (High-Volume B2C)
| Dimension | Weight | Vendor A Score (1-5) | Weighted | Vendor B Score (1-5) | Weighted |
|---|---|---|---|---|---|
| Security | 20% | 4 | 0.80 | 5 | 1.00 |
| Scalability | 30% | 5 | 1.50 | 3 | 0.90 |
| Developer Experience | 20% | 5 | 1.00 | 3 | 0.60 |
| Compliance | 10% | 3 | 0.30 | 4 | 0.40 |
| Pricing | 20% | 4 | 0.80 | 2 | 0.40 |
| Total | 100% | | 4.40 | | 3.30 |
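Notice how the ranking flips between the two examples. The weights change, and so do some of the scores, because a dimension like scalability is scored against your own requirements rather than in the abstract. A vendor that wins for enterprise B2B might be the wrong choice for a high-volume consumer app.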
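The arithmetic behind both tables is a weighted sum, and it's worth keeping in a script so you can rerun it as scores change. The sketch below reproduces the two example tables; the dictionary layout and function name are my own choices.

```python
DIMENSIONS = ["Security", "Scalability", "Developer Experience", "Compliance", "Pricing"]

# Weights and 1-5 scores copied from the two example tables, in DIMENSIONS order.
PROFILES = {
    "B2B SaaS": {
        "weights": [0.30, 0.15, 0.20, 0.25, 0.10],
        "Vendor A": [4, 3, 5, 3, 4],
        "Vendor B": [5, 4, 3, 5, 2],
    },
    "Consumer App": {
        "weights": [0.20, 0.30, 0.20, 0.10, 0.20],
        "Vendor A": [4, 5, 5, 3, 4],
        "Vendor B": [5, 3, 3, 4, 2],
    },
}


def weighted_total(weights: list[float], scores: list[int]) -> float:
    """Sum of weight * score, rounded to two decimals."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 100%")
    return round(sum(w * s for w, s in zip(weights, scores)), 2)


for profile, data in PROFILES.items():
    for vendor in ("Vendor A", "Vendor B"):
        print(profile, vendor, weighted_total(data["weights"], data[vendor]))
```

The weight check matters more than it looks: a template whose weights quietly sum to 95% or 110% will still produce plausible-looking totals.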
Must-Have vs. Nice-to-Have Feature Taxonomy
Must-Haves (Non-Negotiable)
These are features that, if missing, should eliminate a vendor from consideration:
- Email/password authentication with secure hashing
- Social login (at minimum Google and Apple)
- Multi-factor authentication (TOTP at minimum, WebAuthn preferred)
- Password reset and account recovery flows
- JWT/OAuth 2.0 token management
- Rate limiting and brute force protection
- HTTPS everywhere (TLS 1.2+)
- User management API (CRUD operations, search)
- Webhook support for auth events
- SOC 2 Type II certification (or equivalent)
Strong Preferences (Important but Negotiable)
- Passwordless authentication (magic links, passkeys)
- SAML 2.0 and OIDC federation (essential if selling to enterprises)
- Adaptive/risk-based authentication
- Breached password detection
- Custom domains for login pages
- Data residency options
- Pre-built UI components with customization
- User migration tools (bulk import with hashed passwords)
Nice-to-Haves (Differentiators)
- Progressive profiling
- Account linking (merge identities)
- Bot detection and behavioral analytics
- Built-in consent management
- A/B testing for auth flows
- No-code/low-code flow builders
- Multi-tenancy support
- Offline authentication
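Must-haves work as an elimination filter, not as points toward a score. A minimal sketch of that rule follows; the short feature labels paraphrase the must-have list above, and the two vendors are hypothetical.

```python
# Short labels for the must-have list above.
MUST_HAVES = {
    "email_password", "social_login", "mfa", "account_recovery",
    "jwt_oauth2", "brute_force_protection", "tls_1_2_plus",
    "user_management_api", "webhooks", "soc2_type_2",
}


def missing_must_haves(vendor_features: set[str]) -> set[str]:
    """Must-have features the vendor does not offer."""
    return MUST_HAVES - vendor_features


def shortlist(vendors: dict[str, set[str]]) -> list[str]:
    """Vendors that survive elimination, in alphabetical order."""
    return sorted(name for name, feats in vendors.items() if not missing_must_haves(feats))


# Hypothetical vendors: the second lacks webhooks and SOC 2 Type II.
vendors = {
    "Vendor A": MUST_HAVES | {"passkeys", "custom_domains"},
    "Vendor B": (MUST_HAVES - {"webhooks", "soc2_type_2"}) | {"flow_builder"},
}
print(shortlist(vendors))
print(sorted(missing_must_haves(vendors["Vendor B"])))
```

Note that Vendor B's nice-to-have flow builder doesn't offset the gaps. Only vendors that pass this filter should go into the weighted scoring.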
Questions Your Vendor Does Not Want You to Ask
These questions expose the gaps that sales teams are trained to gloss over. Ask them anyway.
On Pricing
- "What's my cost at 1M MAU, 5M MAU, and 10M MAU?" - Force them to give specific numbers at scale, not just the startup tier.
- "How do you define MAU? If a user logs in once and never returns, are they counted for the entire month?" - Some vendors count created accounts, not active ones.
- "What features are gated behind enterprise pricing?" - Get the full feature matrix for every tier.
- "If I exceed my MAU limit by 20% for one month, what happens?" - Overages can be brutal.
- "What's the price increase history for existing customers?" - Auth0's post-Okta-acquisition price increases caught many customers off guard.
On Lock-In
- "Can I export all user data, including hashed passwords?" - If they can't export password hashes, every user will need to reset their password during migration. That's a 20-40% user churn event.
- "What's the average migration timeline for a customer leaving your platform?" - If they can't answer this, it means nobody leaves (unlikely) or they've made leaving painful (more likely).
- "Do you use proprietary token formats or standard JWT?" - Proprietary formats increase switching costs.
- "What happens to my data if I stop paying?" - Grace periods, data retention, and deletion policies matter.
On Security
- "When was your last penetration test, and can I see the executive summary?" - If the last pen test was more than 12 months ago, that's concerning.
- "How quickly do you patch critical vulnerabilities? What's your SLA?" - Get specific commitments, not "as fast as possible."
- "Have you ever had a data breach? How was it handled?" - Transparency here is more important than a perfect record.
On Reality
- "Can I talk to three customers in my industry and at my scale?" - Not hand-picked references - customers they didn't prepare.
- "What's your engineering team size dedicated to the CIAM product?" - If the CIAM product is a side project within a larger platform, expect slower feature development and bug fixes.
- "What percentage of your revenue comes from CIAM vs. other products?" - If CIAM is 5% of their revenue, it's not their priority.
Red Flags in Vendor Evaluations
After hundreds of evaluations, these are the patterns that consistently predict problems:
Pricing Red Flags
- No public pricing page: If you can't find pricing without talking to sales, expect enterprise pricing regardless of your size.
- "Contact us for pricing" on every tier: They're pricing based on what they think you can pay, not what the product costs.
- Annual-only contracts with no monthly option: They're optimizing for lock-in, not confidence in their product.
- Minimum commitments above your current usage: They're protecting their revenue, not aligning with your growth.
Technical Red Flags
- SDKs with no recent commits: Check GitHub. If the SDK for your language hasn't been updated in 6+ months, expect bugs and missing features.
- Documentation that references deprecated features: If the docs mention features that no longer exist, the documentation team is under-resourced.
- No staging/sandbox environment: If you can't test without affecting production, the infrastructure isn't mature.
- Custom login page requires iframes: This is a security anti-pattern and signals an older architecture.
- No webhook or event system: You'll need to poll APIs to react to auth events, which is both fragile and expensive.
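The SDK staleness check is simple date arithmetic once you have the last commit date from the repository. This sketch uses the 6-month threshold from the list above; the function names are mine, and you would supply the commit date yourself.

```python
from datetime import date


def months_since(last_commit: date, today: date) -> int:
    """Whole calendar months elapsed between two dates."""
    months = (today.year - last_commit.year) * 12 + (today.month - last_commit.month)
    if today.day < last_commit.day:
        months -= 1  # the current month has not fully elapsed yet
    return months


def sdk_is_stale(last_commit: date, today: date, threshold_months: int = 6) -> bool:
    """True if the SDK has gone threshold_months or more without a commit."""
    return months_since(last_commit, today) >= threshold_months
```

Run it per SDK, not per vendor: a vendor's JavaScript SDK can be active while the one for your language has been quiet for a year.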
Organizational Red Flags
- High customer support turnover: If you talk to three different support contacts during the evaluation, internal churn is high.
- Recent acquisition or strategy pivot: Post-acquisition, products often get deprioritized, merged, or sunset. Okta's acquisition of Auth0 is the canonical example - product strategy shifted, pricing changed, and some features were consolidated.
- "We're building that" for critical features: If a must-have feature is on the roadmap but not shipped, treat it as if it doesn't exist. Roadmaps change.
- They can't explain their architecture: If the sales engineer can't clearly describe how authentication requests flow through their system, the engineering team may not fully understand it either.
No vendor will be perfect across all dimensions. The goal isn't to find a vendor with zero red flags - it's to ensure the red flags are in areas that don't matter for your use case. A vendor with weak mobile SDKs is fine if you're building a web-only product. A vendor with no HIPAA BAA is disqualifying if you're in healthcare.
Putting the Framework Into Practice
Here's the process I recommend:
- Define your requirements (Week 1): List your must-haves, strong preferences, and nice-to-haves using the taxonomy above. Be honest about what you actually need today vs. what you might need in two years.
- Weight your dimensions (Week 1): Use the scoring template and adjust weights for your industry, scale, and stage.
- Create a shortlist (Week 2): Narrow to 3-4 vendors based on public information, documentation review, and the solution landscape in the next chapter.
- Run technical evaluations (Weeks 3-4): Have your engineers implement a proof of concept with each shortlisted vendor. Time the implementation. Note friction points.
- Ask the hard questions (Week 4): Use the vendor questions list above during sales calls and technical deep-dives.
- Score and decide (Week 5): Apply your weighted scoring, factor in pricing at projected scale, and make a decision.
Total evaluation timeline: 5 weeks. It's tempting to compress this, but rushing a CIAM decision is how you end up migrating 18 months later.
For more guidance on evaluating authentication approaches, see my articles on the essential guide to modern authentication and on understanding MFA implementation.