A guide to threat hunting and monitoring in Snowflake

Snowflake is a cloud-based data platform that provides a fully managed, scalable solution for storing, processing, and analyzing large data volumes on AWS, Microsoft Azure, and Google Cloud.

In light of an emerging threat currently targeting Snowflake customers, we're releasing a detailed guide to threat hunting in your Snowflake environment. This post will give you the queries and context needed to conduct a thorough analysis of your Snowflake instances and create threat detections based on your findings.

This post builds on top of the excellent work done by the team at Rezonate in their post "Threat-Hunting for Identity Threats in Snowflake" and also includes the prospective guidance that Snowflake has released to help customers track down the recent threat mentioned above.

Threat model

An attacker who has successfully compromised Snowflake credentials can access and exfiltrate sensitive data stored within Snowflake. In order to investigate initial access attempts and subsequent behaviors, there are specific Snowflake logs that you should monitor for abuse.

Each query section will contain the purpose of the query, the query itself, and guidance on what to look for in the returned data.

Threat hunting queries for identifying malicious activity in Snowflake environments

In this section, we'll describe the queries you can use to detect signs of attacker behavior in your Snowflake instances.

Initial access

These queries search for activity that suggest an attacker may have successfully gained access to your Snowflake environment.

Successful web and CLI logins with GeoIP analysis (T1078 - Valid Accounts)

Logins directly to the Snowflake UI and CLI should come from user accounts and expected IP addresses. For example, if your workforce is remote, traffic should be coming from residential ISPs.

SELECT *,
FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE
    IS_SUCCESS = 'YES' AND 
    FIRST_AUTHENTICATION_FACTOR='PASSWORD' AND
    EVENT_TIMESTAMP >= '2024-04-01' AND
    (REPORTED_CLIENT_TYPE = 'SNOWFLAKE_UI' OR REPORTED_CLIENT_TYPE = 'SNOWSQL_CLI');

What to look for in the results: Monitor logins via password to the Snowflake UI and CLI tooling to ensure they are performing as expected. If you are ingesting Snowflake logs into your SIEM and enriching them with GeoIP data, you can also look for logins from known proxy or VPN services that are not approved by the company.

Suspicious external access integration (T1199 - Trusted Relationship)

Review any suspicious entries of external access performed by procedure or user-defined functions (UDF) handler code within the last 365 days through the External Access History table.

SELECT *
FROM snowflake.account_usage.external_access_history;

What to look for in the results: Look for evidence of unusual source and target cloud destinations. You can cross-reference any query ID available in external access logs with the Query History view for more details on whether the query includes sensitive information. Unexpected use of external access for your environment is a potential indicator of compromise.

Logins from known bad IP addresses (T1078 - Valid Accounts)

Review your login history for any login attempts from known potentially malicious IP addresses. The Snowflake team has provided the query below to help users hunt for known bad actors---you can learn more about it in their recent posting.

SELECT
    *
FROM
    snowflake.account_usage.login_history
WHERE
    event_timestamp >= '2024-04-01' AND
    client_ip IN (
'104.223.91.28',
'198.54.135.99',
'184.147.100.29',
'146.70.117.210',
'198.54.130.153',
'169.150.203.22',
'185.156.46.163',
'146.70.171.99',
'206.217.206.108',
'45.86.221.146',
'193.32.126.233',
'87.249.134.11',
'66.115.189.247',
'104.129.24.124',
'146.70.171.112',
'198.54.135.67',
'146.70.124.216',
'45.134.142.200',
'206.217.205.49',
'146.70.117.56',
'169.150.201.25',
'66.63.167.147',
'194.230.144.126',
'146.70.165.227',
'154.47.30.137',
'154.47.30.150',
'96.44.191.140',
'146.70.166.176',
'198.44.136.56',
'176.123.6.193',
'192.252.212.60',
'173.44.63.112',
'37.19.210.34',
'37.19.210.21',
'185.213.155.241',
'198.44.136.82',
'93.115.0.49',
'204.152.216.105',
'198.44.129.82',
'185.248.85.59',
'198.54.131.152',
'102.165.16.161',
'185.156.46.144',
'45.134.140.144',
'198.54.135.35',
'176.123.3.132',
'185.248.85.14',
'169.150.223.208',
'162.33.177.32',
'194.230.145.67',
'5.47.87.202',
'194.230.160.5',
'194.230.147.127',
'176.220.186.152',
'194.230.160.237',
'194.230.158.178',
'194.230.145.76',
'45.155.91.99',
'194.230.158.107',
'194.230.148.99',
'194.230.144.50',
'185.204.1.178',
'79.127.217.44',
'104.129.24.115',
'146.70.119.24',
'138.199.34.144'
    )
ORDER BY
event_timestamp;

What to look for in the results: The IP addresses in the query above are specific to the current threat actor campaign from April to June 2024. However, you can modify this query to target any IP address known to be associated with a threat actor. Note that IP addresses are often dynamic and can be re-assigned, so a query of this type is prone to false positives. We suggest using any positive hits as signals to trigger an investigation into logs containing those specific IP addresses. If all other activity related to the IP appears benign, this is potentially a false positive. However, if this IP address is associated with active queries and other behavior, this might be a strong indicator of compromise.

Anomalous client application activity (T1199 - Trusted Relationship)

Review all client applications in the Sessions view for any unusual usage. The Snowflake team has provided this query to assist with hunting for known bad actors; it's described further in their recent posting.

SELECT
    COUNT(*) AS client_app_count,
    PARSE_JSON(client_environment) :APPLICATION :: STRING AS client_application,
    PARSE_JSON(client_environment) :OS :: STRING AS client_os,
    PARSE_JSON(client_environment) :OS_VERSION :: STRING AS client_os_version
FROM snowflake.account_usage.sessions
WHERE
    1 = 1
    AND sessions.created_on >= '2024-04-01'
GROUP BY
    ALL
ORDER BY
    1 ASC;

What to look for in the results: Client applications are set up to allow for automation and integrations. Investigate any unapproved or unrecognized client applications and the associated OS. An attacker may have set up a session from an outside tool in order to access and exfiltrate data.

Known suspicious client application activity (T1199 - Trusted Relationship)

As described a recent posting by Snowflake, there are known application clients associated with attackers. You can search for them with the below query.

SELECT *
FROM snowflake.account_usage.sessions
WHERE
    (PARSE_JSON(CLIENT_ENVIRONMENT):APPLICATION LIKE '%DBeaver%'
    OR PARSE_JSON(CLIENT_ENVIRONMENT):APPLICATION LIKE '%rapeflake%')
   AND CREATED_ON >= '2024-01-01';

What to look for in the results: Look for activity from known attacker application clients in your environment. For example, DBeaver is a free database tool that could be used legitimately depending on the use cases---however, a combination of `DBeaver` and Windows Server 2002 or use of `rapeflake` is a strong indicator of compromise.

Defense evasion

These queries can help you detect activity that could indicate an attacker is attempting to evade established defense protocols in your environment.

Network policy additions, alterations, or deletions (T1484 - Domain or Tenant Policy Modification)

Snowflake recommends setting network policies on specific accounts as a security measure. An attacker may attempt to change network policies to allow traffic from a specific set of IP addresses.

SELECT *
from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE (QUERY_TYPE IN ('CREATE_NETWORK_POLICY', 'ALTER_NETWORK_POLICY', 'DROP_NETWORK_POLICY') OR
       QUERY_TEXT ilike any ('% set network_policy%', '% unset network_policy%') ) AND
      START_TIME>='2024-01-01';

What to look for in the results: Network policies should only be set and altered by your company's Snowflake administrator team, who likely have a ticketing system to track these changes. Review any changes to network policies without internal documentation for change management. A suspicious login coupled with a network policy change by an administrator is a strong indicator of compromise that should be investigated. For another example query of this behavior, refer to previous work by Rezonate.

Persistence

Here's a look at how to identify activity that could indicate an attacker has gained persistent access to your Snowflake environment.

Admin-level access granted to user (T1098 - Account Manipulation)

If an attacker gains administrative access, they may grant admin-level access to a second compromised user in an attempt to fly under the radar. The following admin permissions are available: AccountAdmin, OrgAdmin, SysAdmin, and SecurityAdmin.

SELECT *,
FROM
 snowflake.account_usage.grants_to_users
WHERE 
    (role='SYSADMIN' OR role='SECURITYADMIN' OR role='ORGADMIN' OR role='ACCOUNTADMIN') 
    AND created_on >= '2024-04-01';

What to look for in the results: To understand potential privilege escalation actions, review logs that indicate a user has been granted an administrator role. Investigate any grants of administrator access outside of expected users, such as business technology or IT team members. An administrator grant combined with suspicious login behavior could indicate an account compromise and privilege escalation and should be investigated further. For another example query of this behavior, refer to previous work by Rezonate.

Credential access

Use these queries to detect potential access of Snowflake credentials by an attacker.

Successful logins via OAUTH access tokens (T1528 - Steal Application Access Token)

Review use of OAuth access tokens to ensure they align with expected client type and IP address behavior from your environment.

SELECT *,
from SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE 
    IS_SUCCESS = 'YES' AND 
    FIRST_AUTHENTICATION_FACTOR='OAUTH_ACCESS_TOKEN' AND
    EVENT_TIMESTAMP >= '2024-04-01';

What to look for in the results: When reviewing your OAuth access token usage, look for evidence that an OAuth access token was used from unexpected IP addresses or to log in via an unexpected client type. For example, if you haven't implemented OAuth access tokens for use with the Snowflake CLI, then monitor for uses where reported_client_type = "SNOWSQL_CLI". This can be an indicator that an access token was compromised and is being used to obtain access to Snowflake.

A brute-force attack is a method used by hackers to gain unauthorized access to a system by systematically trying all possible combinations of passwords until the correct one is found. Evidence of a brute-force attack followed by a successful login may indicate that the attacker was successful in obtaining the credentials and bypassing any other security controls.

WITH failed_logins AS (
  SELECT
    user_name,
    event_timestamp,
    COUNT(*) OVER (
      PARTITION BY user_name
      ORDER BY event_timestamp
    ) AS failed_count,
    REPORTED_CLIENT_TYPE
  FROM
    SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
  WHERE
    IS_SUCCESS = 'NO'
    AND FIRST_AUTHENTICATION_FACTOR = 'PASSWORD'
    -- Date filter for login attempts after April 1
    AND event_timestamp >= TO_TIMESTAMP('2024-04-01')
),
potential_brute_force AS (
  SELECT
    user_name,
    event_timestamp,
    REPORTED_CLIENT_TYPE  -- Include REPORTED_CLIENT_TYPE from failed_logins
  FROM
    failed_logins
  WHERE
    failed_count >= 5 -- Assuming 5 or more failed attempts in an hour indicate a brute force attempt
),
brute_force_count AS (
  SELECT
    user_name,
    COUNT(*) AS brute_force_count,  -- Count occurrences of potential brute force per user
    REPORTED_CLIENT_TYPE  -- Include REPORTED_CLIENT_TYPE from potential_brute_force
  FROM
    potential_brute_force
  GROUP BY
    user_name,
    REPORTED_CLIENT_TYPE
)
SELECT
  bfc.user_name,
  bfc.brute_force_count,
  bfc.REPORTED_CLIENT_TYPE
FROM
  brute_force_count bfc
ORDER BY
  bfc.user_name;

What to look for in the results: Modify the query above to look at various brute-force attack thresholds, with the current default set at five failed logins followed by a successful login. If you see several failed attempts followed by a successful login, investigate the IP address associated with the successful login attempt to ensure it's been seen previously and can be tied back to the user's behavior through another data source.

Collection

These queries can help you identify potential attacker attempts to collect data from your Snowflake instances.

Anomalies in the amount of data queried (T1530 - Data from Cloud Storage)

Look for users who have queried large amounts of data as an indicator of suspicious activity.

SELECT user_name, date(start_time), rows_produced
FROM
snowflake.account_usage.query_history
WHERE
start_time >= '2024-04-01'
AND bytes_scanned > 0
AND (rows_produced > 0 OR rows_inserted > 0 OR rows_updated > 0 OR rows_unloaded > 0 OR bytes_deleted > 0)
GROUP BY user_name, date(start_time), rows_produced
LIMIT 1000;

What to look for in the results: User behavior varies greatly depending on business use case and corporate environments. Edit the above query to look for large numbers of rows produced from a data query, benchmarking against a baseline specific to your environment. Looking at an example of 100 or 1,000 query examples can give you an idea of what an outlier would be in your environment for a service account and a typical user account. Large numbers of rows produced in the query above indicate a high amount of data searched and could be an indicator of collection prior to exfiltration. Review anomalies and validate whether that behavior is expected for your environment.

Exfiltration

Use these methods to detect activity that could reflect an attacker exfiltrating data from your environment.

COPY query usage to external location (T1567 - Exfiltration Over Web Service)

An attacker may use the COPY INTO feature in Snowflake querying to copy the data to an external location.

SELECT *,
FROM
    snowflake.account_usage.query_history
WHERE
    CONTAINS(QUERY_TEXT, 'COPY INTO') AND CONTAINS(QUERY_TEXT, 'http')
    AND START_TIME >= '2024-04-01';

What to look for in the results: The use of COPY INTO followed by a URL should be reviewed, as it indicates data is being moved to another location. Validate that all URLs are expected and resolve to resources owned by the company. An unknown URL in query results should be investigated further as a potential sign of data exfiltration. For another example query of this behavior, refer to previous work by Rezonate.

Data export to external cloud storage (T1567 - Exfiltration Over Web Service)

Snowflake facilitates data export to cloud storage services through the "stage" functionality. You can find detailed information about stages on Snowflake's official documentation page. Use the below query to go through all staging URLs that have been sent data; to filter out expected transfers, you can use the NOT CONTAINS field.

SELECT *
FROM
  snowflake.account_usage.stages
WHERE
 NOT CONTAINS(stage_url,'companynamingconvention')
AND created >= '2024-04-01';

What to look for in the results: There may be cases in which a stage being set to an external stage URL is expected behavior---however, all URLs in the logs should be populated from your own infrastructure resources, such as a company-owned S3 bucket. An unknown external stage URL can be an indicator that an attacker has exfiltrated data to that URL. For another example query of this behavior, refer to previous work by Rezonate.

Unique stage URL export by cloud provider (T1567 - Exfiltration Over Web Service)

Similar to the above example, Snowflake data exports through the "stage" functionality, which should be monitored for unique URLs that are unexpected in your environment. If you have a large number of staging URLs expected, use this query to visualize unique URLs that have been seen two times or less, and count them into the appropriate cloud service provider for an easier hunt experience.

SELECT stage_name, stage_url, stage_owner, 
    count(stage_url) as unique_stage_url,
    count_if(stage_url ilike 's3%') as aws_usage,
    count_if(stage_url ilike 'gcs%') as gcp_usage,
    count_if(stage_url ilike 'azure%') as azure_usage,
FROM
  snowflake.account_usage.stages
WHERE
 NOT CONTAINS(stage_url,'companynamingconvention')
 AND created >= '2024-04-01'
GROUP by 1,2,3,
HAVING unique_stage_url <=2;

What to look for in the results: Results including specific cloud providers can be an indicator of compromise if your organization typically uses AWS but there are URLs to Google Cloud or Azure environments. In addition, review any unique staging URLs after filtering out the expected buckets. This may be an indicator that an anomalous storage location was added to the stage, and you'll want to investigate further to determine if this is evidence of data exfiltration.

Suspicious data transfer activity (T1567 - Exfiltration Over Web Service)

Review data transfers in the Data Transfer History table for any evidence that an attacker triggered a data transfer event.

SELECT *
FROM
    snowflake.account_usage.data_transfer_history
WHERE
    start_time >= '2024-04-01';

What to look for in the results: Data transfers include a source_cloud, bytes_transferred, and source_region that can be compared with the target_cloud and target_region. This behavior should be able to be traced to known data transfers from expected cloud environments. If the source and target destinations do not match, consider further investigation to determine if this is a potential indicator of data exfiltration.

How Datadog can help

You can use Datadog's new API-based Snowflake integration to collect query history logs. Other sources mentioned in this post and additional OOTB detections are planned for release at DASH on June 25th, 2024. Any threat hunts that are performed on login and query history can be moved into Datadog Cloud SIEM as custom detections.

Best practices for securing Snowflake environments

To proactively minimize security risks, consider these best practices for your Snowflake instances:

Secure user accounts: Enforce strong authentication by implementing single sign-on (SSO) and multi-factor authentication (MFA) for individual users.
Control service account access: Utilize secure methods like OAuth 2.0 or key pairs to manage access for service accounts. Additionally, define network policy rules to restrict activity for these accounts.

While preventative measures are crucial, a comprehensive security strategy should also include:

Threat detection: Implement threat detections to identify potential attacker behaviors.
An incident response playbook: Develop a clear incident response playbook outlining steps your team should take if a security incident occurs specific to Snowflake and the data you store there. This ensures a swift and coordinated response, minimizing potential damage.