Self-Service Data Infrastructure
A good self-service product makes your life easier, but not messier.
Motivation
Your data infrastructure evolved over time — first you had 10 users, now you have 100, or even 1,000. With your first few users, incoming requests for new tables, roles, etc. are easy to handle. But with more than 100 users, these requests take valuable time out of your day.
Enter self-service data infrastructure.
Maybe you start to use a tool like Terraform to manage users, objects, and permissions. This is great at first, but over time your data infra becomes a complicated web, unless you have an opinionated framework for your users to work within.
Successful Self-Service at Scale
Some of our observations about what makes for good self-service:
User-centric. Focus on who needs to use these workflows regularly. Are they technical? What tools do they know how to use? What is important to them?
Opinionated. Don’t allow users to do everything — it’s often unnecessary and only makes your service harder to use. Keep the API simple. Your users will thank you, and it will keep your infra more uniform as it grows.
Debuggable. People love to solve problems. Give them the tools to understand common issues and fix them themselves.
Reversible. Expect people to make mistakes, especially when moving fast. Make sure to have an “undo” button.
Together, these are the qualities of a good “decentralized” system: control stays centralized while the bulk of the activity is distributed. Remove yourself from the hot path of any frequent user action.
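To make "opinionated" and "reversible" concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `AccessService` class, the `Change` record, and the two allowed actions are assumptions, not the API of any particular tool. The idea is that the service accepts only a small set of actions, and records each change that actually took effect so it can be undone.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical, deliberately small set of actions users may request.
# Each action maps to its inverse so an applied change can be undone.
INVERSE = {"grant_role": "revoke_role", "revoke_role": "grant_role"}


@dataclass(frozen=True)
class Change:
    action: str
    user: str
    role: str


@dataclass
class AccessService:
    # membership[user] is the set of roles the user currently holds.
    membership: dict = field(default_factory=dict)
    # Only changes that actually modified state are recorded here.
    history: list = field(default_factory=list)

    def apply(self, change: Change, record: bool = True) -> bool:
        """Apply a change; return True if membership was modified."""
        if change.action not in INVERSE:
            # Opinionated: anything outside the supported actions is rejected.
            raise ValueError(f"unsupported action: {change.action!r}")
        roles = self.membership.setdefault(change.user, set())
        before = set(roles)
        if change.action == "grant_role":
            roles.add(change.role)
        else:
            roles.discard(change.role)
        changed = roles != before
        if changed and record:
            self.history.append(change)
        return changed

    def undo(self) -> Optional[Change]:
        """Reverse the most recent recorded change, if there is one."""
        if not self.history:
            return None
        last = self.history.pop()
        inverse = Change(INVERSE[last.action], last.user, last.role)
        self.apply(inverse, record=False)
        return inverse


service = AccessService()
service.apply(Change("grant_role", "ana", "SALES_ANALYST"))
service.undo()  # "ana" no longer holds SALES_ANALYST
```

Because no-op changes are not recorded, undoing a redundant grant cannot strip a role the user already held before the request.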
Data Access and RBAC
When it comes to data access requests and role-based access control (RBAC) in data platforms like Snowflake, things can get complicated quickly. To make a request, users often need to know:
Which role do I need in order to view this dashboard?
Why am I seeing this error?
002003 (42S02): SQL compilation error: Object does not exist or not authorized
Can I get temporary access to these tables?
Who can actually approve whether I can join this role?
Should I create a new role for this dbt pipeline?
Of course, Snowflake provides data to answer all these questions, but your users won’t be able to make sense of that data unless they’re experts. Your self-service program should combine and simplify this data in order to empower users to take action.
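As an illustration of "combine and simplify", here is a minimal Python sketch that answers the first question above: which roles would let me read this table? The sample data, its tuple shapes, and the `roles_with_access` function are all assumptions made for this example. In practice the grants would come from the platform's own metadata rather than hand-written lists. The sketch walks the role hierarchy so that roles inheriting a privilege are included alongside roles that hold it directly.

```python
from collections import defaultdict, deque

# Hypothetical sample data; the shapes below are assumptions for illustration.
# (role, privilege, object)
PRIVILEGE_GRANTS = [
    ("SALES_READ", "SELECT", "ANALYTICS.SALES.ORDERS"),
    ("FINANCE_READ", "SELECT", "ANALYTICS.FINANCE.INVOICES"),
]
# (granted_role, grantee_role): the grantee inherits the granted role's privileges.
ROLE_GRANTS = [
    ("SALES_READ", "SALES_ANALYST"),
    ("SALES_ANALYST", "SALES_MANAGER"),
    ("FINANCE_READ", "FINANCE_ANALYST"),
]


def roles_with_access(obj, privilege, privilege_grants, role_grants):
    """Return every role holding `privilege` on `obj`, directly or by inheritance."""
    inheritors = defaultdict(set)
    for granted, grantee in role_grants:
        inheritors[granted].add(grantee)

    # Start from the roles that hold the privilege directly...
    direct = {r for r, p, o in privilege_grants if p == privilege and o == obj}
    found = set(direct)
    queue = deque(direct)
    # ...then walk up the hierarchy breadth-first to collect inheriting roles.
    while queue:
        role = queue.popleft()
        for parent in inheritors[role]:
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return sorted(found)


print(roles_with_access("ANALYTICS.SALES.ORDERS", "SELECT",
                        PRIVILEGE_GRANTS, ROLE_GRANTS))
# prints ['SALES_ANALYST', 'SALES_MANAGER', 'SALES_READ']
```

An empty result is useful too: it tells the user that no existing role covers the object, which is a different request from asking to join a role.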
About Spyglass
Since you’re here, let me tell you what we’ve cooked up at Spyglass. In short, we make Snowflake data access controls easy by providing an automated, better way to do all of the above.
If you’ve nodded your head while reading this, reach out at spyglass.software (or demo@spyglass.software) and we’ll show you a product demo to give you a taste of the future of data access management.