Firestore to BigQuery Validation Workflow

This document provides a step-by-step guide to run the GitHub Actions workflow for validating Firestore data against BigQuery schema and data.


📌 Overview

The GitHub Actions workflow titled:

<env> – Firestore to BigQuery Enquiry Validation (Schema/Data)

is used to:

  • ✅ Validate Firestore data against BigQuery schema.
  • ✅ Validate Firestore data against BigQuery data.

The same workflow handles both schema and data validation. You can select the mode using the validationType input.


🧭 Navigate & Launch the Workflow

1. Open the GitHub Repository

Go to the visn-devops-scripts repository.

2. Access the Workflow

  • Click the Actions tab.
  • Select the workflow named:
    <env> – Firestore to BigQuery Enquiry Validation (Schema/Data)

3. Run the Workflow

  • Click the Run workflow button (top-right).
  • Select the target branch (e.g., sprint123-feature-narwhal).
  • A form will appear with input fields.

🧾 Required Inputs

The form requires the following parameters:

Parameter | Required | Description
Lease Company ID | ✅ | Firestore document ID for the leasing company.
Account ID | ✅ | The account ID.
Service Unit ID | ✅ | ID of the service unit related to the enquiry.
Enquiry ID | ✅ | Firestore document ID of the enquiry to validate.
Validation Type | ✅ | Select either schema or data based on what you want to validate.
Generate Schema if Fields Missing | ⚠️ Only for schema | When set to true, the script auto-generates the BigQuery schema for the missing fields. If false, it only reports them.

⚠️ The "Generate Schema" field only appears when validationType is set to schema.
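The input rules above can be sketched as a small pre-flight check. This is an illustration only, not the workflow's own code: apart from validationType, every key name (leaseCompanyId, accountId, serviceUnitId, enquiryId, generateSchema) is a hypothetical stand-in for whatever the workflow file actually declares.

```python
from typing import Dict, Optional

VALID_TYPES = {"schema", "data"}


def build_inputs(
    lease_company_id: str,
    account_id: str,
    service_unit_id: str,
    enquiry_id: str,
    validation_type: str,
    generate_schema: Optional[bool] = None,
) -> Dict[str, str]:
    """Assemble the form values and enforce the rules from the inputs table."""
    # Key names below are hypothetical; only validationType appears in this guide.
    ids = {
        "leaseCompanyId": lease_company_id,
        "accountId": account_id,
        "serviceUnitId": service_unit_id,
        "enquiryId": enquiry_id,
    }
    missing = [name for name, value in ids.items() if not value.strip()]
    if missing:
        raise ValueError(f"Missing required inputs: {', '.join(missing)}")
    if validation_type not in VALID_TYPES:
        raise ValueError("validationType must be 'schema' or 'data'")
    if validation_type == "data" and generate_schema is not None:
        raise ValueError("Schema generation is not available in data mode")

    inputs = dict(ids, validationType=validation_type)
    if validation_type == "schema":
        # The form offers true/false; the value is kept as a string here.
        inputs["generateSchema"] = "true" if generate_schema else "false"
    return inputs
```

Calling build_inputs("lc-1", "acc-1", "su-1", "enq-1", "schema", True) returns a dictionary that includes the schema toggle, while the same call with "data" omits it.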


✅ Validation Modes

1. Schema Validation (validationType = schema)

This mode checks whether the Firestore document fields match the BigQuery schema.

You will be prompted with:

Generate schema if fields are missing?

Choose:

  • true → The workflow will automatically generate the schema for the missing fields.
  • false → The workflow will only report missing fields without altering the schema.
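As a rough mental model of this mode, the check can be thought of as a set difference between the document's field names and the BigQuery column names. The sketch below is an assumption about the general approach, not the validator's actual implementation: the function names and the type-inference rules are hypothetical, and nested maps and arrays are deliberately not handled.

```python
from typing import Any, Dict, List


def infer_bq_type(value: Any) -> str:
    """Map a Python value to a BigQuery column type (simplified, top-level only)."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "BOOL"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "FLOAT64"
    return "STRING"  # fallback for strings and anything this sketch does not cover


def validate_schema(
    document: Dict[str, Any],
    bq_field_names: List[str],
    generate_schema: bool,
) -> Dict[str, Any]:
    """Report Firestore fields absent from BigQuery; optionally propose schema entries."""
    missing = sorted(set(document) - set(bq_field_names))
    generated = []
    if generate_schema:
        generated = [
            {"name": name, "type": infer_bq_type(document[name]), "mode": "NULLABLE"}
            for name in missing
        ]
    return {"missing_fields": missing, "generated_schema": generated}
```

With generate_schema set to False the function only lists the missing fields, which mirrors the report-only behaviour described for false.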

2. Data Validation (validationType = data)

This mode checks that the values in Firestore documents exist in BigQuery and match exactly. It detects:

  • Missing documents
  • Mismatched values

▶️ Executing the Workflow

Once all input fields are completed:

  1. Click Run workflow again to start execution.
  2. GitHub Actions will queue and run the job with your selected parameters.
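For readers who would rather script the run than use the form, GitHub's REST API exposes a workflow dispatch endpoint that takes a branch (ref) and an inputs object. The sketch below only builds the request and does not send it. The owner name, workflow file name, and input keys in the usage comment are hypothetical placeholders, and it assumes the workflow is configured for manual dispatch, as the Run workflow button suggests.

```python
import json
import urllib.request
from typing import Dict


def build_dispatch_request(
    owner: str,
    repo: str,
    workflow_file: str,
    branch: str,
    inputs: Dict[str, str],
    token: str,
) -> urllib.request.Request:
    """Build (but do not send) a workflow dispatch request for the GitHub REST API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": branch, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Usage (all values except the repository and branch names are hypothetical):
# request = build_dispatch_request(
#     "example-org", "visn-devops-scripts", "example-validation.yml",
#     "sprint123-feature-narwhal", {"validationType": "data"}, token="<token>",
# )
# urllib.request.urlopen(request)  # sends the request
```

Building the request separately from sending it makes it easy to inspect the payload before anything runs against a real environment.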

📊 Reviewing the Results

1. Open the Workflow Run

  • Go to the Actions tab.
  • Select the most recent run of the workflow for your branch and environment.

2. Locate the Validation Step

Find the step titled: Run Firestore Validator based on selected validation type

3. Analyze the Logs

If validationType = schema:

  • Logs will show:
    • Any missing fields.
    • Whether fields were auto-generated (if enabled).

[Image: Auto-generated Schema]

If validationType = data:

  • Logs will show:

    • Fields present in Firestore but missing in BigQuery.
    • Mismatched values between Firestore and BigQuery rows.

🖼️ UI Reference

Below is an example of how the workflow input form appears in GitHub Actions:

[Image: Run Firestore to BigQuery Validation Workflow]

Features visible in the UI:

  • Workflow name with environment: DEV – Firestore to BigQuery Enquiry Validation (Schema/Data)
  • Manual branch selector (e.g., sprint123-feature-narwhal)
  • Input fields for:
    • Lease Company ID
    • Account ID
    • Service Unit ID
    • Enquiry ID
    • Validation Type
    • Schema generation toggle (for schema validation)

💡 Tips

  • Always verify you're selecting the correct environment and branch before running.

📘 Notes

  • <env> refers to your environment: DEV, DEMO, or PROD.
  • Only users with appropriate permissions can trigger workflows and view logs.
  • Schema generation is not available in data mode.

Status: Approved
Category: Protected
Authored By: Jeyakumar Arunagiri on July 14, 2025