Datastream doesn't process any events

Hello!

I've set up a stream in Datastream with MariaDB (using the MySQL profile) as the source and BigQuery as the destination. The connection test on the source profile passed. But when I launched the stream, no data was processed: the processed events count is zero.

There is also some activity in Logs Explorer, with no errors or warnings at all. Just 0 events, although I know this table is updated frequently.

Could someone help me understand what the problem might be in this case?

ACCEPTED SOLUTION

When a Google Cloud Datastream stream shows no processed events, there are several areas you can check to troubleshoot the issue:

Stream Settings:

  • Start State: Ensure the stream is "Started" (or "Running") in the UI.
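  • Backfill vs. CDC: Datastream combines a historical backfill with ongoing change data capture (CDC). Verify that the tables are selected for backfill if you expect existing rows to be copied, and that new changes are arriving for CDC to pick up.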
  • Table Inclusion: Confirm desired tables are included with correct names.
  • Start Timestamp: Set it before the latest data to replicate. Avoid future timestamps that prevent processing.
  • Additional Settings: Check for any stream-specific settings affecting data capture, such as filters or transformations.
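
For the table inclusion check, a small helper can make name mismatches (wrong database, wrong case) easy to spot. This is only a sketch: the dictionary shape is modelled on the `includeObjects` block of a MySQL source config as I understand it, the function name `is_table_included` is made up for illustration, and treating an empty list as "everything is included" is an assumption you should confirm against your own stream definition.

```python
def is_table_included(include_objects: dict, database: str, table: str) -> bool:
    """Return True if database.table is covered by an include list.

    Assumed shape (hypothetical, modelled on a MySQL source config):
    {"mysqlDatabases": [{"database": "shop",
                         "mysqlTables": [{"table": "orders"}]}]}
    A database entry without tables is treated as "all tables";
    an empty include list is treated as "everything".
    """
    databases = include_objects.get("mysqlDatabases", [])
    if not databases:
        return True
    for db in databases:
        # Exact, case-sensitive comparison: "Shop" does not match "shop".
        if db.get("database") != database:
            continue
        tables = db.get("mysqlTables", [])
        if not tables:
            return True
        if any(t.get("table") == table for t in tables):
            return True
    return False


if __name__ == "__main__":
    config = {"mysqlDatabases": [{"database": "shop",
                                  "mysqlTables": [{"table": "orders"}]}]}
    for name in ("orders", "Orders", "customers"):
        print(name, is_table_included(config, "shop", name))
```

The comparison is deliberately strict, since a stream that includes `Orders` while the source table is `orders` is one plausible way to end up with zero events and no errors.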

Source Database Configuration:

  • Data Changes: Ensure changes are logged in binary logs (e.g., INSERT, UPDATE, DELETE).
  • Binlog Configuration: Enable binlog in ROW format.
  • Database User Access: Grant the Datastream user at least the REPLICATION SLAVE, REPLICATION CLIENT, and SELECT privileges. BigQuery-side permissions are granted separately through IAM, not on the source database.
  • Database Version: Confirm compatibility with Datastream.
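
The source database checks above can be sketched as a small validation function. It does not connect to anything: you paste in the output of `SHOW VARIABLES` (as a dict) and `SHOW GRANTS FOR ...` (as a list of strings) yourself. The expected values are assumptions based on the usual CDC requirements for MySQL-compatible sources (`log_bin=ON`, `binlog_format=ROW`, `binlog_row_image=FULL`), REPLICATION CLIENT is included on the same assumption, and `BINLOG MONITOR` is accepted because newer MariaDB versions report REPLICATION CLIENT under that name. The function name and the example user are hypothetical.

```python
# Assumed requirements for a MySQL/MariaDB CDC source; verify against the docs.
REQUIRED_VARIABLES = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}
# Each privilege maps to the names it may appear under in SHOW GRANTS.
REQUIRED_PRIVILEGES = {
    "REPLICATION CLIENT": ("REPLICATION CLIENT", "BINLOG MONITOR"),
    "REPLICATION SLAVE": ("REPLICATION SLAVE",),
    "SELECT": ("SELECT",),
}


def check_source(variables: dict, grants: list) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, expected in REQUIRED_VARIABLES.items():
        actual = str(variables.get(name, "<missing>")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual}, expected {expected}")
    granted = " ".join(grants).upper()
    if "ALL PRIVILEGES" not in granted:
        for privilege, aliases in REQUIRED_PRIVILEGES.items():
            # Plain substring match: good enough for a quick manual check.
            if not any(alias in granted for alias in aliases):
                problems.append(f"missing privilege: {privilege}")
    return problems


if __name__ == "__main__":
    variables = {"log_bin": "ON", "binlog_format": "MIXED",
                 "binlog_row_image": "FULL"}
    grants = ["GRANT SELECT, REPLICATION SLAVE ON *.* TO `datastream`@`%`"]
    for problem in check_source(variables, grants):
        print(problem)
```

A `binlog_format` of MIXED or STATEMENT is worth ruling out early, because the connection test can still pass while no row-level events are ever produced.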

Network and Firewall Restrictions:

  • Connectivity: Confirm connectivity between source and Datastream service.
  • SSL Certificates: Validate certificate validity and recognition.
  • IP Whitelist: Ensure required Datastream IP ranges are allowed in firewall rules.
  • Private Connections: Verify VPC peering or Cloud Interconnect configuration.
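
As a first sanity check on connectivity, you can confirm that the database port accepts TCP connections at all. This is a minimal sketch with a hypothetical helper name and placeholder host; note that it only tests reachability from the machine where you run it, so it cannot prove that Datastream's own IP ranges or private connection can reach the source.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder host; 3306 is the default MySQL/MariaDB port.
    print(can_connect("db.example.internal", 3306))
```

Running the same check from a VM inside the peered VPC gets you closer to what Datastream sees when a private connection is used.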

Data Type Mismatch:

  • Check for incompatible data types between MariaDB and BigQuery.
  • Verify supported character sets and collations and their mapping between the two systems.
  • Avoid unsupported MariaDB features.

Advanced Troubleshooting:

  • Logs and Monitoring:
    • Inspect Datastream logs in Logs Explorer, looking for patterns like specific times when errors occur.
    • Check relevant metrics in Google Cloud Monitoring.
    • Analyze the stream activity chart for anomalies.
  • Quotas: Check for exceeded Datastream quotas in the Google Cloud Console (IAM & Admin -> Quotas).
  • CDC Events: Confirm the source database actively receives transactions that modify data.
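
To confirm that the source is actually writing changes to the binary log, one option is to run `SHOW MASTER STATUS` twice, a minute or so apart, and compare the `File` and `Position` columns. The helper below is a sketch of that comparison only; the function names are made up, and it assumes binlog file names end in a numeric suffix such as `mysql-bin.000003`.

```python
import re


def binlog_coordinate(file_name: str, position: int) -> tuple:
    """Turn ('mysql-bin.000003', 1200) into a sortable (3, 1200) tuple."""
    match = re.search(r"\.(\d+)$", file_name)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {file_name!r}")
    return int(match.group(1)), int(position)


def binlog_advanced(before: tuple, after: tuple) -> bool:
    """Return True if the second (file, position) snapshot is later."""
    return binlog_coordinate(*after) > binlog_coordinate(*before)


if __name__ == "__main__":
    first = ("mysql-bin.000003", 1200)   # values copied from the first run
    second = ("mysql-bin.000003", 1850)  # values copied from the second run
    print(binlog_advanced(first, second))
```

If the position does not move while the table is being updated, the changes are not reaching the binary log, and the problem is on the source side rather than in the stream.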

Troubleshooting Steps:

  1. Start with common, easily checked issues.
  2. Follow the troubleshooting guide in a logical and methodical order.
  3. Contact Google Cloud Support for further assistance if needed.

Remember:

  • Restart the Stream after any configuration changes.
  • Refer to the latest documentation for accuracy and updates.
  • Update the stream configuration if the source database structure changes (e.g., adding or dropping columns).


2 REPLIES

Thanks a lot for such a great answer!