storage: support resumable uploads#299
Overall this looks good. No big issues. RETRY_LIMIT might want to be increased if we put in exponential backoff as suggested. 5 seems like a more sane default (as suggested here).
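The exponential backoff idea floated here could be sketched like this (the constants and helper names are illustrative assumptions, not values from the library):

```javascript
// Sketch only: a retry budget with exponential backoff. RETRY_LIMIT uses
// the "5" suggested in the review; the 1-second base delay is an assumption.
var RETRY_LIMIT = 5;

// Delay (in ms) before retry attempt `attempt` (0-indexed): 1s, 2s, 4s, ...
function backoffDelayMs(attempt) {
  return Math.pow(2, attempt) * 1000;
}

function shouldRetry(attempt) {
  return attempt < RETRY_LIMIT;
}
```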
Recent best practices have emerged, so I figured we should put these in place before merging. I've added a task list in the initial post with the intended revisions so far; more will likely be coming. One of the revisions is allowing a user to specify a preference for a simple or resumable upload. I'm seeking opinions on converting the current API.

Current:

```js
myBucket.upload("./photo.jpg", myFile, { my: "metadata" }, function (err, file) {})
```

Suggested:

```js
myBucket.upload("./photo.jpg", {
  destination: myFile,
  metadata: {
    my: "metadata"
  },
  resumable: (true || false)
}, function (err, file) {})
```

Current:

```js
myFile.createWriteStream({ my: "metadata" })
```

Suggested:

```js
myFile.createWriteStream({
  resumable: false, // default: true
  metadata: {
    my: "metadata"
  }
})
```

Any better ideas?
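Under the suggested options object, the technique selection might be sketched as follows (`chooseTechnique` and the threshold constant are hypothetical names for illustration, not the actual implementation):

```javascript
// Sketch: pick the upload technique from the proposed options object.
// An explicit `resumable` flag wins; otherwise fall back to a size-based
// default (5 MB threshold, per the discussion in this thread).
var RESUMABLE_THRESHOLD = 5 * 1024 * 1024; // bytes

function chooseTechnique(options, fileSizeBytes) {
  if (typeof options.resumable === "boolean") {
    return options.resumable ? "resumable" : "simple";
  }
  return fileSizeBytes > RESUMABLE_THRESHOLD ? "resumable" : "simple";
}
```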
Looks good, but I'd like the user to be able to change the `resumable_threshold` (which defaults to 5 MB). Could we expose a configuration for storage, or add setters for similar values? In the future we might need it for the chunk size as well, and we could use it to allow the user to change the default for createWriteStream at a global level.
Config on the storage instance could look like either of these:

```js
var gcloud = require("gcloud")({ /* conn info */ })
gcloud.storage({ resumableThreshold: n })
```

or:

```js
var gcloud = require("gcloud")
var storage = gcloud.storage({ /* conn info */, resumableThreshold: n })
```
Bytes. The header is in bytes, so this seems like a simple choice.

Wait, why not?
In a stream, we can't stat a file for its size. It comes to us in small KB-sized chunks, meaning we don't know if it's over a threshold until after we've already formed the request. I suppose if we wanted to, we could buffer a threshold's worth of bytes into memory before beginning the request (which is the point at which we have to choose resumable vs. simple), but that seems like a dangerous approach.

+1 on bytes.
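The buffer-then-decide approach being dismissed here would look roughly like this (a sketch of the rejected idea, with hypothetical names):

```javascript
// Sketch: hold incoming chunks in memory until either the threshold is
// crossed (choose resumable) or the stream ends first (choose simple).
// This is the "dangerous" part: up to `threshold` bytes sit buffered per
// upload before a single request byte is sent.
function decideByBuffering(readable, threshold, callback) {
  var buffered = [];
  var total = 0;
  var decided = false;
  readable.on("data", function (chunk) {
    buffered.push(chunk);
    total += chunk.length;
    if (!decided && total > threshold) {
      decided = true;
      readable.pause(); // stop consuming; hand off to the resumable path
      callback("resumable", Buffer.concat(buffered));
    }
  });
  readable.on("end", function () {
    if (!decided) {
      decided = true;
      callback("simple", Buffer.concat(buffered));
    }
  });
}
```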
Fair enough. Plus, you don't really know that the readable stream is a file at all. That being said, should resumable even work with streams unless they explicitly give us the filename to use?
That's a great question, but I think it's impossible to answer. Still, I anticipate resumable will be a desirable default, and speaking technically, we have a solution for when we resume an upload but are sent different data than we were sent originally (we bail and start a new upload). In any case, the user knows best what they are doing, so we will allow them to be explicit about what type of upload to use at the time of the upload.
Can we get access to the readable stream that is piping their data to our writable stream? In theory, if we can, we could try to yank the …
With a stream, we should only be aware of the data coming in, and not how/where it originates. It would also be a bit magical if we tried to implement something like that. Usually, whenever there's magic, the solution is to add an option or variation of the method that gives the user explicit control of the outcome. We will have both of those things (`{ resumable: false }` and `bucket.upload`).
Yeah, that would be too much magic, agreed. Getting back to the original question, I think it's safe to say that if the developer is uploading using a stream, they know that …
This teaches `upload` and `createWriteStream` to choose between the simple and resumable upload techniques:

- `upload`: stat the incoming file for its size; default to simple for files < 5 MB, resumable for files > 5 MB
- `createWriteStream`: default to resumable uploads

Fixes #298
`createWriteStream` uses the Resumable Upload API: http://goo.gl/jb0e9D. The process involves these steps:

If the initial upload operation is interrupted, the next time the user uploads the file, these steps occur:

If the user tries to upload entirely different data to the remote file:
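Based on the public Resumable Upload API linked above, the request shapes involved look roughly like this (the helper functions are illustrative; the header names and the `uploadType=resumable` parameter are part of the documented API):

```javascript
// Sketch: the three kinds of requests the resumable flow uses.

// 1. Initiate: POST that asks the API for a resumable session URI
//    (returned in the response's Location header).
function initiateRequest(bucket, file, contentType) {
  return {
    method: "POST",
    uri: "https://www.googleapis.com/upload/storage/v1/b/" + bucket +
         "/o?uploadType=resumable&name=" + encodeURIComponent(file),
    headers: { "X-Upload-Content-Type": contentType }
  };
}

// 2. Status check after an interruption: a Content-Range of "bytes */N"
//    asks the server how many bytes it has committed so far.
function statusRequest(sessionUri, totalBytes) {
  return {
    method: "PUT",
    uri: sessionUri,
    headers: { "Content-Range": "bytes */" + totalBytes }
  };
}

// 3. Resume: upload the remaining bytes, declaring their position.
function resumeRequest(sessionUri, firstByte, lastByte, totalBytes) {
  return {
    method: "PUT",
    uri: sessionUri,
    headers: {
      "Content-Range": "bytes " + firstByte + "-" + lastByte + "/" + totalBytes
    }
  };
}
```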