By Megan Lim, Consultant – Altis NZ
Within the cybersecurity realm, authentication is ensuring parties (e.g. users and service accounts) are who they say they are; they are not pretending to be someone else. Whereas authorisation is granting permissions to an already authenticated party to specify what data can the party access, and what they can do with that data. Both authentication and authorisation are required for the correct parties to have the appropriate permissions to access data.
As we can see, authentication is paramount for data sources and ingestion processes to be secure; it is also the step before authorisation. This blog explores the journey of how Altis Consulting can help organisations authenticate to Snowflake securely, for both service accounts and users. We strive to walk hand-in-hand with our clients for them to have a smooth authentication setup.
Federated Authentication (Fed Auth) VS Single-Sign-On (SSO)
Both Fed Auth and SSO are used for the same purpose – for users to access multiple services and resources with fewer login attempts and help reduce the need to remember passwords. From the user’s perspective, both Fed Auth and SSO have the same behaviour. However, they are different behind the scenes.
Federated Authentication or Federated Identity Management (FIM) provides a single access to multiple services and resources. When a user tries to log into a service or resource, the authentication request is routed to a trusted Identity Provider (IdP) (e.g. OKTA). The user will then authenticate themselves to the IdP if required. Once successfully authenticated, the IdP will indicate to the service provider that this user has been authenticated. This works because the service provider trusts the IdP – they trust that the IdP that the user has indeed been authenticated, and the service provider can move on to the next step.
For Fed Auth, multiple service providers and organisations across different domains can trust the same IdP. With one password and one authentication event, the user can access multiple services across different domains; and user credentials are never directly shared between organisations. Thus, Fed Auth is the single authentication to multiple services across different domains.
Meanwhile SSO uses a single credential across multiple services within one organisation because SSO is a token-based system. When a user tries to log into a service or resource, the authentication request is routed to a trusted IdP. The user will then authenticate themselves to the IdP if required. Once authenticated, the identity provider will send an authentication token to the service provider. This token represents that this user has been authenticated, and therefore the service will allow the user access.
Another difference between the two is that SSO is designed to authenticate a single credential for multiple systems within a single organisation. For example, Azure AD‘s SSO feature allows users to only need one credential to access all their Microsoft apps and cloud or on-prem resources. Whereas FIM uses a single authentication to multiple systems across different organisations. For example, if a user is already logged into Gmail, they can have access to Canva, Spotify and other popular platforms. In this case, these platforms use Google as their federated partner for Fed Auth. Sometimes a service (e.g. OKTA) can provide both SSO and Fed Auth features.
Snowflake is a popular data warehousing platform designed for the cloud. It is a fully managed SaaS that separates its storage from compute, and customers can let go the stress for maintaining or installing hardware or software. Other features include time travel, native SQL and ability to define file formats.
WhereScape RED is a data warehouse automation platform that provides a simple interface with code templates to auto-generate code and documentation. Traditional data warehousing is slow, manual and complex. However, with data warehouse automation, warehouse processing is more efficient for design, development, testing and deployment.
The idea is to setup authentication for Snowflake for both service accounts and users (developers). This means ETL tools, reporting tools and developers need to be authenticated so that the data warehouse solution can be implemented and tested. For this article, we will use WhereScape RED as our chosen ETL tool and Power BI as the reporting tool.
There are several challenges to look out for when setting up authentication for Snowflake which we will cover in the next section.
WhereScape has limited documentation on the public internet as they only share them with paid customers. Altis has experts with years of WhereScape experience, and we can help guide you in your data warehouse developing journey and deblur the documentation mystery.
Authenticate service account to Snowflake securely
A service account is like a normal Snowflake user account, but it is used to represent an external application or machine that can run scheduled tasks and retrieve and access data.
There are several methods to authenticate service accounts to Snowflake, for example:
- Key-Pair authentication
- Username and password
With a service account, we can immediately rule out ‘username and password’ as it raises two major problems:
- Problem 1: The password is set to never expire. In most cases, the service account’s password is set to never expire because they do not have an interactive login. This could pose a security risk.
- Problem 2: A service account cannot do MFA. Since service accounts do not have a human behind it, it is not physically possible to do MFA. Thus, some organisations may agree to disable MFA for services accounts.
That leaves us with Key-Pair authentication and OAuth – choosing either one will depend on the customer’s use case. For instance, OAuth’s token-based authentication may be better suited for API requests to Snowflake SQL REST API especially since the access tokens are temporary.
We also need to consider what driver or connector the service account should use to connect to Snowflake. The following lists the available drivers and connectors that Snowflake supports:
- ODBC driver
- SQL API
- Node.js driver
- Go driver (Go Snowflake driver)
- .NET driver
- JDBC driver
- Python connector
- PHP driver
Authenticating users (developers) to Snowflake securely
There are several methods to authenticate users to Snowflake: Using usernames and passwords, or more commonly via SSO or Fed Auth. If SSO or Fed Auth was used, then it may be preferred to additionally disable users from manually logging into Snowflake’s login page using their password; in other words, they must login via their IdP.
Moreover, since Snowflake has now upgraded to using SAML 2.0, there could be a situation where the existing SAML_IDENTITY_PROVIDER needs to be migrated too.
Another requirement to consider is authenticating users in bulk load, as it is not efficient to authenticate each user one-by-one. In fact, multiple users may require authentication, such as Azure AD groups. On top of that, these users need to be assigned the appropriate Snowflake user roles for RBAC management.
As you can see there are many considerations to think about when authenticating users to Snowflake. But don’t let this stress you out. Here at Altis, we have gone through this before with other clients and strive to make this a smooth process. Successful authentication can be tested via a preview page, or through the IdP’s platform. As an example, if OKTA was the IdP, users can test to see if they can use the Snowflake application in their OKTA account to access Snowflake.
SCIM integration for Snowflake
Your organisation may also want to configure SCIM integration. This is where supported external platforms are allowed to manage the user lifecycle (create, update and delete), role lifecycle (create, update and date) and how users are assigned to roles in Snowflake. With these managed by the external platform, it will provide a more convenient user experience rather than coding it inside a Snowflake worksheet.
Authenticating Power BI to access Snowflake
When creating a Snowflake dataset or dataflow in Power BI, one of the source credential methods is using a username and password. However, this means the developer would need to type in the same username and password. To make it more convenient, SSO via Azure AD can be used instead. And with SSO in place, developers can still work as normal with Power BI Desktop and Power BI Service (e.g. Import and DirectQuery are supported for both Desktop and Service).
Outcomes of one of our previous clients
- Deployed WhereScape to Dev, Test, Pre-prod and Prod servers
- Configured Key-Pair authentication for the WhereScape service account to connect to Snowflake
- Users successfully connected to Snowflake using OKTA as the IdP with the SAML 2.0.
- Users must login to Snowflake via OKTA SSO
- Configured SCIM with Snowflake using OKTA
- Connected Power BI Service and Desktop to Snowflake using SSO via Azure AD
- Implemented a script to automatically rotate Key-Pair authentication keys
Other points to consider when setting up authentication with Snowflake are:
- Network policies
- Migrating SSRS reports to access Snowflake data, especially there are any existing internal stored procedures, queries, expressions and/or parameters.
Altis has a wealth of experience on using Snowflake with clients, and authentication is vital for secure data analysis and implementation. If you would like to know more about this solution, or how we can help in your next data journey, please contact us.