About
BigQuery datasets can be made public, allowing anyone to query them. This is useful for open data projects, but it is typically unintended and can lead to data leaks when the dataset contains sensitive data.
Understanding Impact
Business Impact
BigQuery datasets typically contain business-sensitive data. Making them publicly accessible can lead to data leaks.
Technical Impact
BigQuery datasets can be assigned resource-based permissions. When a dataset IAM policy contains the allUsers or allAuthenticatedUsers special identities, the dataset is publicly accessible.
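As a rough illustration of what such a grant looks like, the sketch below filters a dataset's access list (for example, the JSON printed by bq show --format=prettyjson) for entries that name either special identity. It assumes jq is installed. The function name list_public_entries and the sample JSON are hypothetical, and the field that carries the identity (iamMember or specialGroup) is an assumption, so the filter matches on any string field of an entry.

```shell
# Print access entries that grant a role to allUsers or allAuthenticatedUsers.
# Reads dataset JSON on stdin; assumes jq is available.
list_public_entries() {
  jq -c '.access[]
         | select([.[] | strings]
                  | any(. == "allUsers" or . == "allAuthenticatedUsers"))'
}

# Illustrative (made-up) dataset metadata with one public grant.
list_public_entries <<'EOF'
{"access": [
  {"role": "OWNER",  "userByEmail": "owner@example.com"},
  {"role": "READER", "iamMember": "allUsers"}
]}
EOF
```

Any entry printed by the function means the dataset can be read by principals outside your organization.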
Identify affected resources
Use the following gcloud CLI commands to identify BigQuery datasets that are publicly accessible:
gcloud alpha bq datasets list --format='value(datasetReference.datasetId)' |
while read -r dataset; do
  echo "Checking dataset $dataset"
  # Fetch the dataset description (expected to include its access entries)
  iamPolicy=$(gcloud alpha bq datasets describe "$dataset")
  # Flag the dataset if any entry references a public identity
  if echo "$iamPolicy" | grep -iqE '(allUsers|allAuthenticatedUsers)'; then
    echo "WARNING: $dataset is publicly accessible. Its IAM policy is shown below"
    echo
    echo "$iamPolicy" | grep -C3 -E '(allUsers|allAuthenticatedUsers)'
    echo "--"
  fi
done
Remediate vulnerable resources
Remove the permission grant to allUsers or allAuthenticatedUsers in the IAM policy of the BigQuery dataset.
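One possible way to do this from the command line is sketched below. It assumes the bq CLI and jq are installed; the helper name strip_public_access and the my-project:my_dataset identifier are placeholders, not part of the original guidance. The helper only rewrites JSON, and the commands that touch the dataset are left commented out so the change can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: drop public principals from a BigQuery dataset's access list.
# Assumes bq and jq are installed; my-project:my_dataset is a placeholder.

# Read dataset JSON on stdin and remove every access entry that names
# allUsers or allAuthenticatedUsers in any of its string fields.
strip_public_access() {
  jq '.access |= map(select(
        [.[] | strings]
        | any(. == "allUsers" or . == "allAuthenticatedUsers")
        | not))'
}

# Intended workflow (commented out: review the diff before updating):
# bq show --format=prettyjson my-project:my_dataset > dataset.json
# strip_public_access < dataset.json > dataset.fixed.json
# diff dataset.json dataset.fixed.json
# bq update --source dataset.fixed.json my-project:my_dataset
```

Filtering on the identity rather than on a specific field keeps the sketch independent of how a given entry encodes the public principal. All other grants are left untouched.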
How Datadog can help
Cloud Security Management
Datadog Cloud Security Management detects this vulnerability using the out-of-the-box rule "BigQuery Dataset should not be publicly accessible".
References
Introduction to IAM in BigQuery (GCP documentation)
BigQuery public datasets (GCP documentation)