Google BigQuery

The Google BigQuery connector for rudol allows you to connect your Google BigQuery data warehouse instances.

caution

Because Google BigQuery charges per use, we recommend choosing a Scanning Frequency that matches your team's workflow to minimize your warehouse costs.

Connection parameters

Name | Type | Description
Service Account Keyfile | json | JSON Service Account key file content; see Generate Service Account keyfile
Dataset location | text | Dataset region; single-region and multi-region locations are supported (e.g. us-central1 or US)

Generate Service Account keyfile

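The following steps show how to create a service account that gives rudol read access to BigQuery and how to download its JSON key file. The project IDs used below, your-storage-project and your-composer-project, are placeholders; substitute the ID of the project that hosts your BigQuery datasets.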

Create a service account in your-storage-project

  1. In the Google Cloud console, go to the IAM & Admin -> Service Accounts page.
  2. Select your project.
  3. Click + CREATE SERVICE ACCOUNT.
  4. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
  5. Click CREATE AND CONTINUE.
  6. Click the Select a role field and add the required roles.
info

A metadata scan requires at least the BigQuery User and BigQuery Metadata Viewer roles.

For Data Quality Validations the service account should also have the BigQuery Data Viewer role.

  7. Click Done to finish creating the service account.
  8. Do not close your browser window. You will use it in the next step.
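
The roles named in the note above can also be granted from the command line instead of the console. The sketch below is an illustration only and is not part of rudol: it prints the gcloud projects add-iam-policy-binding commands for a given project and service account, assuming the gcloud CLI is installed and that you are allowed to change the project's IAM policy. The function name grant_commands, the project ID, and the email address are hypothetical placeholders.

```python
import shlex

# Role IDs of the predefined BigQuery roles named in this guide.
METADATA_SCAN_ROLES = ["roles/bigquery.user", "roles/bigquery.metadataViewer"]
DATA_QUALITY_ROLES = ["roles/bigquery.dataViewer"]


def grant_commands(project_id, service_account_email, data_quality=False):
    """Return one gcloud command string per role the service account needs."""
    roles = METADATA_SCAN_ROLES + (DATA_QUALITY_ROLES if data_quality else [])
    member = f"serviceAccount:{service_account_email}"
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ])
        for role in roles
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own project and account.
    for command in grant_commands(
        "your-composer-project",
        "my-service-account@your-composer-project.iam.gserviceaccount.com",
        data_quality=True,
    ):
        print(command)
```

The script only prints the commands so that you can review them before running anything. Pass data_quality=True only if you plan to use Data Quality Validations, since that adds the BigQuery Data Viewer role.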

Download a JSON key for the service account you just created

  1. In the Google Cloud console, click the email address for the service account that you created.
  2. Click KEYS.
  3. Click ADD KEY, then click Create new key.
  4. Click CREATE. A JSON key file will be downloaded to your computer.

Your JSON Service Account key file should look like this:

{
  "type": "service_account",
  "project_id": "your-composer-project",
  "private_key_id": "132asd123asd123asd",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "my-service-account@your-composer-project.iam.gserviceaccount.com",
  "client_id": "1234567890123456789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/my-service-account%40your-composer-project.iam.gserviceaccount.com"
}
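
Before pasting the key file content into the connector, it can help to check that the file is intact. The following is a minimal sketch using only the Python standard library; it is not how rudol validates the key, and the function name keyfile_problems and the chosen set of required fields are assumptions made for this example. It checks that the content parses as JSON, that the main fields are present, and that the private key has the expected PEM markers, which also catches keys whose newlines were escaped twice when copied.

```python
import json

# Fields this sketch treats as required (an assumption for illustration).
REQUIRED_FIELDS = (
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "token_uri",
)


def keyfile_problems(text):
    """Return a list of problems found in the key file text; empty if none."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if not key.get(name)]

    if key.get("type") and key["type"] != "service_account":
        problems.append('"type" is not "service_account"')

    private_key = key.get("private_key")
    if isinstance(private_key, str) and private_key:
        if not private_key.startswith("-----BEGIN PRIVATE KEY-----"):
            problems.append("private_key does not start with the PEM header")
        if not private_key.strip().endswith("-----END PRIVATE KEY-----"):
            problems.append("private_key does not end with the PEM footer")

    email = key.get("client_email")
    if isinstance(email, str) and email and not email.endswith(".gserviceaccount.com"):
        problems.append("client_email is not a service account address")

    return problems
```

A typical use is to read the downloaded file and print whatever the function returns, for example keyfile_problems(open("keyfile.json").read()), where keyfile.json is a placeholder path. An empty list means the file passed these basic checks; it does not confirm that the key is active or that the roles were granted.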

For more details on the steps above, see the BigQuery documentation.