Skip to content

feat(jdbc): centralize connection properties from BigQueryConnection and DataSource#4116

Draft
keshavdandeva wants to merge 2 commits intomainfrom
jdbc/refactor-connection-properties
Draft

feat(jdbc): centralize connection properties from BigQueryConnection and DataSource#4116
keshavdandeva wants to merge 2 commits intomainfrom
jdbc/refactor-connection-properties

Conversation

@keshavdandeva
Copy link
Contributor

b/484306563

@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigquery Issues related to the googleapis/java-bigquery API. labels Feb 25, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @keshavdandeva, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the way connection properties are handled within the BigQuery JDBC driver. By introducing a dedicated DataSource class, the parsing and management of all connection-related parameters are centralized. This change aims to improve the robustness, consistency, and maintainability of the driver's configuration logic, making it easier to add new properties and ensure their correct interpretation across different components.

Highlights

  • Centralized Connection Property Parsing: Refactored connection property parsing logic from being scattered across BigQueryConnection and BigQueryDriver into a new dedicated DataSource class, promoting a single source of truth for connection configuration.
  • Introduction of DataSource Class: A new DataSource class was introduced to encapsulate all connection properties and provide a standardized way to parse them from a JDBC URL, simplifying property management and access.
  • Streamlined Property Access: Updated BigQueryConnection and BigQueryDriver to retrieve connection properties directly from the DataSource object, eliminating redundant URL parsing logic in multiple places.
  • Removal of Redundant Parsing Utilities: Many property parsing methods in BigQueryJdbcUrlUtility were removed or refactored, as their functionality is now handled by the DataSource class, reducing code duplication and improving maintainability.
Changelog
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryConnection.java
    • Added connectionPoolSize and listenerPoolSize fields.
    • Refactored numerous connection property parsing calls to use the new DataSource object.
    • Added getter methods for connectionPoolSize and listenerPoolSize.
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryDriver.java
    • Updated connect method to use DataSource.fromUrl(connectionUri) for log level and path parsing.
    • Modified getPropertyInfo to use DataSource.fromUrl(url) for OAuth property parsing.
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOAuthUtility.java
    • Overloaded parseOAuthProperties to accept a DataSource object.
    • Updated existing parseOAuthProperties(String url, ...) to call the new overload.
    • Changed property parsing within parseOAuthProperties to use DataSource getters.
    • Updated logic for EXTERNAL_ACCOUNT_AUTH to use DataSource getters for BYOID properties.
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcUrlUtility.java
    • Removed several parse*Property methods (e.g., parseQueryProperties, parseLabels, parseIntProperty, parseLongProperty, parseBooleanProperty, parseOverrideProperties, parseJobCreationMode, parseBYOIDProperty, getConnectionPropertyDefaultValue, and various parse*InSecs and parse*PoolSize methods).
    • Modified parsePropertiesMap to parsePropertiesMapFromValue with an updated signature.
    • Removed unused imports related to URL decoding.
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryPooledConnection.java
    • Modified constructor to retrieve listenerPoolSize directly from the BigQueryConnection object instead of parsing from the URL.
  • google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/DataSource.java
    • Added new fields for various connection properties, including OAuth, BYOID, endpoint overrides, private service connect, and connection/listener pool sizes.
    • Introduced a static PROPERTY_SETTERS map to dynamically set DataSource properties from URL parameters.
    • Added a static fromUrl(String url) factory method to parse a JDBC URL into a DataSource instance.
    • Updated createProperties to include new connection pool properties.
    • Implemented comprehensive getters and setters for all connection properties, often including default value logic.
  • google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOAuthUtilityTest.java
    • Updated OAuth utility tests to reflect the new DataSource-based property parsing and override property handling.
  • google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcUrlUtilityTest.java
    • Removed numerous tests for URL parsing methods that were deprecated or moved to the DataSource class.
  • google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryPooledConnectionTest.java
    • Updated pooled connection tests to mock BigQueryConnection's getListenerPoolSize getter.
  • google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/PooledConnectionDataSourceTest.java
    • Modified pooled connection data source tests to use DataSource properties and mock BigQueryConnection getters.
Activity
  • No human activity (comments, reviews) has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request centralizes connection property handling by introducing DataSource.fromUrl and refactoring BigQueryConnection to use it, aiming to improve maintainability and reduce code duplication. However, it introduces a critical functional regression in endpoint overrides, which could bypass private endpoints and violate security policies. There is also a potential NullPointerException in BigQueryJdbcOAuthUtility during authentication type retrieval, leading to application crashes. Additionally, consider improving consistency in default value handling and refactoring a large constructor for better readability.

BigQueryJdbcUrlUtility.OAUTH_TYPE_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_OAUTH_TYPE_VALUE,
callerClassName));
authType = AuthType.fromValue(ds.getOAuthType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The call to ds.getOAuthType() will trigger a NullPointerException if the OAuthType property is missing from the JDBC URL. This occurs because DataSource.getOAuthType() returns a primitive int, but the underlying oAuthType field is an Integer that can be null if not explicitly set. Auto-unboxing a null value causes a crash, leading to a Denial of Service (DoS) when establishing a connection without an explicit OAuthType. The previous implementation handled this safely. Consider modifying DataSource.java to safely handle the null case, for example, by returning a default value.

Comment on lines +172 to +180
if (ds.getEndpointOverrides() != null) {
this.overrideProperties.put(
BigQueryJdbcUrlUtility.ENDPOINT_OVERRIDES_PROPERTY_NAME, ds.getEndpointOverrides());
}
if (ds.getPrivateServiceConnect() != null) {
this.overrideProperties.put(
BigQueryJdbcUrlUtility.PRIVATE_SERVICE_CONNECT_PROPERTY_NAME,
ds.getPrivateServiceConnect());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The centralization of connection properties has introduced a regression in the handling of endpoint overrides (e.g., BIGQUERY, READ_API, STS). The new logic in the BigQueryConnection constructor stores the entire EndpointOverrides string in the overrideProperties map under a single key, instead of parsing it into individual components as the previous implementation did. Consequently, methods like getBigQueryConnection() and getBigQueryReadClientConnection() will fail to find the expected individual override keys (like BIGQUERY or READ_API), causing the driver to silently fall back to default public endpoints. This can lead to security policy violations in environments where traffic is required to stay within a private network (e.g., via Private Service Connect).

Comment on lines +209 to +219
if (jobCreationMode == null || jobCreationMode == 2) {
this.useStatelessQueryMode = true;
} else if (jobCreationMode == 1) {
this.useStatelessQueryMode = false;
} else {
throw new NumberFormatException(
String.format(
"Invalid value for %s. Use 1 for JOB_CREATION_REQUIRED and 2 for"
+ " JOB_CREATION_OPTIONAL.",
BigQueryJdbcUrlUtility.JOB_CREATION_MODE_PROPERTY_NAME));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The jobCreationMode == null check is redundant because ds.getJobCreationMode() provides a default value, so it will never return null. You can simplify this logic.

    if (jobCreationMode == 2) {
      this.useStatelessQueryMode = true;
    } else if (jobCreationMode == 1) {
      this.useStatelessQueryMode = false;
    } else {
      throw new NumberFormatException(
          String.format(
              "Invalid value for %s. Use 1 for JOB_CREATION_REQUIRED and 2 for"_                  + " JOB_CREATION_OPTIONAL.",
              BigQueryJdbcUrlUtility.JOB_CREATION_MODE_PROPERTY_NAME));
    }

Comment on lines +227 to +267
Map<String, String> proxyProperties = new java.util.HashMap<>();
if (ds.getProxyHost() != null)
proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_HOST_PROPERTY_NAME, ds.getProxyHost());
if (ds.getProxyPort() != null)
proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_PORT_PROPERTY_NAME, ds.getProxyPort());
if (ds.getProxyUid() != null)
proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_USER_ID_PROPERTY_NAME, ds.getProxyUid());
if (ds.getProxyPwd() != null)
proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_PASSWORD_PROPERTY_NAME, ds.getProxyPwd());
String proxyHost = ds.getProxyHost();
String proxyPort = ds.getProxyPort();
String proxyUid = ds.getProxyUid();
String proxyPwd = ds.getProxyPwd();

if (proxyPort != null
&& !java.util.regex.Pattern.compile(
"^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$")
.matcher(proxyPort)
.find()) {
throw new IllegalArgumentException(
"Illegal port number provided %s. Please provide a valid port number.");
}

boolean isMissingProxyHostOrPortWhenProxySet =
(proxyHost == null && proxyPort != null) || (proxyHost != null && proxyPort == null);
if (isMissingProxyHostOrPortWhenProxySet) {
throw new IllegalArgumentException(
"Both ProxyHost and ProxyPort parameters need to be specified. No defaulting behavior"
+ " occurs.");
}
boolean isMissingProxyUidOrPwdWhenAuthSet =
(proxyUid == null && proxyPwd != null) || (proxyUid != null && proxyPwd == null);
if (isMissingProxyUidOrPwdWhenAuthSet) {
throw new IllegalArgumentException(
"Both ProxyUid and ProxyPwd parameters need to be specified for authentication.");
}
boolean isProxyAuthSetWithoutProxySettings = proxyUid != null && proxyHost == null;
if (isProxyAuthSetWithoutProxySettings) {
throw new IllegalArgumentException(
"Proxy authentication provided via connection string with no proxy host or port set.");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The constructor has become quite large with the addition of proxy validation logic. To improve readability and separation of concerns, consider encapsulating this validation logic in a separate private helper method within this class or moving it to a more appropriate location, such as a new method in BigQueryJdbcProxyUtility that accepts a DataSource object.

Comment on lines 294 to +324
this.enableWriteAPI =
BigQueryJdbcUrlUtility.parseBooleanProperty(
url,
BigQueryJdbcUrlUtility.ENABLE_WRITE_API_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE,
this.connectionClassName);
ds.getEnableWriteAPI() != null
? ds.getEnableWriteAPI()
: BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE;
this.writeAPIActivationRowCount =
BigQueryJdbcUrlUtility.parseIntProperty(
url,
BigQueryJdbcUrlUtility.SWA_ACTIVATION_ROW_COUNT_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_SWA_ACTIVATION_ROW_COUNT_VALUE,
this.connectionClassName);
ds.getSwaActivationRowCount() != null
? ds.getSwaActivationRowCount()
: BigQueryJdbcUrlUtility.DEFAULT_SWA_ACTIVATION_ROW_COUNT_VALUE;
this.writeAPIAppendRowCount =
BigQueryJdbcUrlUtility.parseIntProperty(
url,
BigQueryJdbcUrlUtility.SWA_APPEND_ROW_COUNT_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_SWA_APPEND_ROW_COUNT_VALUE,
this.connectionClassName);
ds.getSwaAppendRowCount() != null
? ds.getSwaAppendRowCount()
: BigQueryJdbcUrlUtility.DEFAULT_SWA_APPEND_ROW_COUNT_VALUE;

String additionalProjectsStr = ds.getAdditionalProjects();
this.additionalProjects =
BigQueryJdbcUrlUtility.parseStringListProperty(
url,
BigQueryJdbcUrlUtility.ADDITIONAL_PROJECTS_PROPERTY_NAME,
this.connectionClassName);
(additionalProjectsStr == null || additionalProjectsStr.isEmpty())
? java.util.Collections.emptyList()
: java.util.Arrays.asList(additionalProjectsStr.split(","));

this.filterTablesOnDefaultDataset =
BigQueryJdbcUrlUtility.parseBooleanProperty(
url,
BigQueryJdbcUrlUtility.FILTER_TABLES_ON_DEFAULT_DATASET_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_FILTER_TABLES_ON_DEFAULT_DATASET_VALUE,
this.connectionClassName);
ds.getFilterTablesOnDefaultDataset() != null
? ds.getFilterTablesOnDefaultDataset()
: BigQueryJdbcUrlUtility.DEFAULT_FILTER_TABLES_ON_DEFAULT_DATASET_VALUE;
this.requestGoogleDriveScope =
BigQueryJdbcUrlUtility.parseIntProperty(
url,
BigQueryJdbcUrlUtility.REQUEST_GOOGLE_DRIVE_SCOPE_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_REQUEST_GOOGLE_DRIVE_SCOPE_VALUE,
this.connectionClassName);
ds.getRequestGoogleDriveScope() != null
? ds.getRequestGoogleDriveScope()
: BigQueryJdbcUrlUtility.DEFAULT_REQUEST_GOOGLE_DRIVE_SCOPE_VALUE;
this.metadataFetchThreadCount =
BigQueryJdbcUrlUtility.parseIntProperty(
url,
BigQueryJdbcUrlUtility.METADATA_FETCH_THREAD_COUNT_PROPERTY_NAME,
BigQueryJdbcUrlUtility.DEFAULT_METADATA_FETCH_THREAD_COUNT_VALUE,
this.connectionClassName);
this.requestReason =
BigQueryJdbcUrlUtility.parseStringProperty(
url,
BigQueryJdbcUrlUtility.REQUEST_REASON_PROPERTY_NAME,
null,
this.connectionClassName);
ds.getMetadataFetchThreadCount() != null
? ds.getMetadataFetchThreadCount()
: BigQueryJdbcUrlUtility.DEFAULT_METADATA_FETCH_THREAD_COUNT_VALUE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's an inconsistent pattern for handling default property values. For some properties, BigQueryConnection performs a null check and applies a default value (e.g., enableWriteAPI, writeAPIActivationRowCount). For others, the default is handled within the DataSource getter.

To improve consistency and centralize the logic for default values, consider updating the getters in the DataSource class to always provide a default value when a property is not set. This would simplify the BigQueryConnection constructor by allowing direct assignment without null checks.

For example, for enableWriteAPI:
In DataSource.java, the getter could be:

public Boolean getEnableWriteAPI() {
  return enableWriteAPI != null
      ? enableWriteAPI
      : BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE;
}

Then, in BigQueryConnection.java, you could simply have:

this.enableWriteAPI = ds.getEnableWriteAPI();

This pattern should be applied to other properties like writeAPIActivationRowCount, writeAPIAppendRowCount, filterTablesOnDefaultDataset, requestGoogleDriveScope, and metadataFetchThreadCount.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigquery Issues related to the googleapis/java-bigquery API. size: xl Pull request size is extra large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant