feat(jdbc): centralize connection properties from BigQueryConnection and DataSource#4116
feat(jdbc): centralize connection properties from BigQueryConnection and DataSource#4116keshavdandeva wants to merge 2 commits intomainfrom
BigQueryConnection and DataSource#4116Conversation
Summary of ChangesHello @keshavdandeva, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the way connection properties are handled within the BigQuery JDBC driver. By introducing a dedicated Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request centralizes connection property handling by introducing DataSource.fromUrl and refactoring BigQueryConnection to use it, aiming to improve maintainability and reduce code duplication. However, it introduces a critical functional regression in endpoint overrides, which could bypass private endpoints and violate security policies. There is also a potential NullPointerException in BigQueryJdbcOAuthUtility during authentication type retrieval, leading to application crashes. Additionally, consider improving consistency in default value handling and refactoring a large constructor for better readability.
| BigQueryJdbcUrlUtility.OAUTH_TYPE_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_OAUTH_TYPE_VALUE, | ||
| callerClassName)); | ||
| authType = AuthType.fromValue(ds.getOAuthType()); |
There was a problem hiding this comment.
The call to ds.getOAuthType() will trigger a NullPointerException if the OAuthType property is missing from the JDBC URL. This occurs because DataSource.getOAuthType() returns a primitive int, but the underlying oAuthType field is an Integer that can be null if not explicitly set. Auto-unboxing a null value causes a crash, leading to a Denial of Service (DoS) when establishing a connection without an explicit OAuthType. The previous implementation handled this safely. Consider modifying DataSource.java to safely handle the null case, for example, by returning a default value.
| if (ds.getEndpointOverrides() != null) { | ||
| this.overrideProperties.put( | ||
| BigQueryJdbcUrlUtility.ENDPOINT_OVERRIDES_PROPERTY_NAME, ds.getEndpointOverrides()); | ||
| } | ||
| if (ds.getPrivateServiceConnect() != null) { | ||
| this.overrideProperties.put( | ||
| BigQueryJdbcUrlUtility.PRIVATE_SERVICE_CONNECT_PROPERTY_NAME, | ||
| ds.getPrivateServiceConnect()); | ||
| } |
There was a problem hiding this comment.
The centralization of connection properties has introduced a regression in the handling of endpoint overrides (e.g., BIGQUERY, READ_API, STS). The new logic in the BigQueryConnection constructor stores the entire EndpointOverrides string in the overrideProperties map under a single key, instead of parsing it into individual components as the previous implementation did. Consequently, methods like getBigQueryConnection() and getBigQueryReadClientConnection() will fail to find the expected individual override keys (like BIGQUERY or READ_API), causing the driver to silently fall back to default public endpoints. This can lead to security policy violations in environments where traffic is required to stay within a private network (e.g., via Private Service Connect).
| if (jobCreationMode == null || jobCreationMode == 2) { | ||
| this.useStatelessQueryMode = true; | ||
| } else if (jobCreationMode == 1) { | ||
| this.useStatelessQueryMode = false; | ||
| } else { | ||
| throw new NumberFormatException( | ||
| String.format( | ||
| "Invalid value for %s. Use 1 for JOB_CREATION_REQUIRED and 2 for" | ||
| + " JOB_CREATION_OPTIONAL.", | ||
| BigQueryJdbcUrlUtility.JOB_CREATION_MODE_PROPERTY_NAME)); | ||
| } |
There was a problem hiding this comment.
The jobCreationMode == null check is redundant because ds.getJobCreationMode() provides a default value, so it will never return null. You can simplify this logic.
if (jobCreationMode == 2) {
this.useStatelessQueryMode = true;
} else if (jobCreationMode == 1) {
this.useStatelessQueryMode = false;
} else {
throw new NumberFormatException(
String.format(
"Invalid value for %s. Use 1 for JOB_CREATION_REQUIRED and 2 for"_ + " JOB_CREATION_OPTIONAL.",
BigQueryJdbcUrlUtility.JOB_CREATION_MODE_PROPERTY_NAME));
}| Map<String, String> proxyProperties = new java.util.HashMap<>(); | ||
| if (ds.getProxyHost() != null) | ||
| proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_HOST_PROPERTY_NAME, ds.getProxyHost()); | ||
| if (ds.getProxyPort() != null) | ||
| proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_PORT_PROPERTY_NAME, ds.getProxyPort()); | ||
| if (ds.getProxyUid() != null) | ||
| proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_USER_ID_PROPERTY_NAME, ds.getProxyUid()); | ||
| if (ds.getProxyPwd() != null) | ||
| proxyProperties.put(BigQueryJdbcUrlUtility.PROXY_PASSWORD_PROPERTY_NAME, ds.getProxyPwd()); | ||
| String proxyHost = ds.getProxyHost(); | ||
| String proxyPort = ds.getProxyPort(); | ||
| String proxyUid = ds.getProxyUid(); | ||
| String proxyPwd = ds.getProxyPwd(); | ||
|
|
||
| if (proxyPort != null | ||
| && !java.util.regex.Pattern.compile( | ||
| "^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$") | ||
| .matcher(proxyPort) | ||
| .find()) { | ||
| throw new IllegalArgumentException( | ||
| "Illegal port number provided %s. Please provide a valid port number."); | ||
| } | ||
|
|
||
| boolean isMissingProxyHostOrPortWhenProxySet = | ||
| (proxyHost == null && proxyPort != null) || (proxyHost != null && proxyPort == null); | ||
| if (isMissingProxyHostOrPortWhenProxySet) { | ||
| throw new IllegalArgumentException( | ||
| "Both ProxyHost and ProxyPort parameters need to be specified. No defaulting behavior" | ||
| + " occurs."); | ||
| } | ||
| boolean isMissingProxyUidOrPwdWhenAuthSet = | ||
| (proxyUid == null && proxyPwd != null) || (proxyUid != null && proxyPwd == null); | ||
| if (isMissingProxyUidOrPwdWhenAuthSet) { | ||
| throw new IllegalArgumentException( | ||
| "Both ProxyUid and ProxyPwd parameters need to be specified for authentication."); | ||
| } | ||
| boolean isProxyAuthSetWithoutProxySettings = proxyUid != null && proxyHost == null; | ||
| if (isProxyAuthSetWithoutProxySettings) { | ||
| throw new IllegalArgumentException( | ||
| "Proxy authentication provided via connection string with no proxy host or port set."); | ||
| } |
There was a problem hiding this comment.
The constructor has become quite large with the addition of proxy validation logic. To improve readability and separation of concerns, consider encapsulating this validation logic in a separate private helper method within this class or moving it to a more appropriate location, such as a new method in BigQueryJdbcProxyUtility that accepts a DataSource object.
| this.enableWriteAPI = | ||
| BigQueryJdbcUrlUtility.parseBooleanProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.ENABLE_WRITE_API_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE, | ||
| this.connectionClassName); | ||
| ds.getEnableWriteAPI() != null | ||
| ? ds.getEnableWriteAPI() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE; | ||
| this.writeAPIActivationRowCount = | ||
| BigQueryJdbcUrlUtility.parseIntProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.SWA_ACTIVATION_ROW_COUNT_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_SWA_ACTIVATION_ROW_COUNT_VALUE, | ||
| this.connectionClassName); | ||
| ds.getSwaActivationRowCount() != null | ||
| ? ds.getSwaActivationRowCount() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_SWA_ACTIVATION_ROW_COUNT_VALUE; | ||
| this.writeAPIAppendRowCount = | ||
| BigQueryJdbcUrlUtility.parseIntProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.SWA_APPEND_ROW_COUNT_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_SWA_APPEND_ROW_COUNT_VALUE, | ||
| this.connectionClassName); | ||
| ds.getSwaAppendRowCount() != null | ||
| ? ds.getSwaAppendRowCount() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_SWA_APPEND_ROW_COUNT_VALUE; | ||
|
|
||
| String additionalProjectsStr = ds.getAdditionalProjects(); | ||
| this.additionalProjects = | ||
| BigQueryJdbcUrlUtility.parseStringListProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.ADDITIONAL_PROJECTS_PROPERTY_NAME, | ||
| this.connectionClassName); | ||
| (additionalProjectsStr == null || additionalProjectsStr.isEmpty()) | ||
| ? java.util.Collections.emptyList() | ||
| : java.util.Arrays.asList(additionalProjectsStr.split(",")); | ||
|
|
||
| this.filterTablesOnDefaultDataset = | ||
| BigQueryJdbcUrlUtility.parseBooleanProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.FILTER_TABLES_ON_DEFAULT_DATASET_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_FILTER_TABLES_ON_DEFAULT_DATASET_VALUE, | ||
| this.connectionClassName); | ||
| ds.getFilterTablesOnDefaultDataset() != null | ||
| ? ds.getFilterTablesOnDefaultDataset() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_FILTER_TABLES_ON_DEFAULT_DATASET_VALUE; | ||
| this.requestGoogleDriveScope = | ||
| BigQueryJdbcUrlUtility.parseIntProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.REQUEST_GOOGLE_DRIVE_SCOPE_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_REQUEST_GOOGLE_DRIVE_SCOPE_VALUE, | ||
| this.connectionClassName); | ||
| ds.getRequestGoogleDriveScope() != null | ||
| ? ds.getRequestGoogleDriveScope() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_REQUEST_GOOGLE_DRIVE_SCOPE_VALUE; | ||
| this.metadataFetchThreadCount = | ||
| BigQueryJdbcUrlUtility.parseIntProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.METADATA_FETCH_THREAD_COUNT_PROPERTY_NAME, | ||
| BigQueryJdbcUrlUtility.DEFAULT_METADATA_FETCH_THREAD_COUNT_VALUE, | ||
| this.connectionClassName); | ||
| this.requestReason = | ||
| BigQueryJdbcUrlUtility.parseStringProperty( | ||
| url, | ||
| BigQueryJdbcUrlUtility.REQUEST_REASON_PROPERTY_NAME, | ||
| null, | ||
| this.connectionClassName); | ||
| ds.getMetadataFetchThreadCount() != null | ||
| ? ds.getMetadataFetchThreadCount() | ||
| : BigQueryJdbcUrlUtility.DEFAULT_METADATA_FETCH_THREAD_COUNT_VALUE; |
There was a problem hiding this comment.
There's an inconsistent pattern for handling default property values. For some properties, BigQueryConnection performs a null check and applies a default value (e.g., enableWriteAPI, writeAPIActivationRowCount). For others, the default is handled within the DataSource getter.
To improve consistency and centralize the logic for default values, consider updating the getters in the DataSource class to always provide a default value when a property is not set. This would simplify the BigQueryConnection constructor by allowing direct assignment without null checks.
For example, for enableWriteAPI:
In DataSource.java, the getter could be:
public Boolean getEnableWriteAPI() {
return enableWriteAPI != null
? enableWriteAPI
: BigQueryJdbcUrlUtility.DEFAULT_ENABLE_WRITE_API_VALUE;
}Then, in BigQueryConnection.java, you could simply have:
this.enableWriteAPI = ds.getEnableWriteAPI();This pattern should be applied to other properties like writeAPIActivationRowCount, writeAPIAppendRowCount, filterTablesOnDefaultDataset, requestGoogleDriveScope, and metadataFetchThreadCount.
b/484306563