Rewrite a lot of the internals of Blobs to make reading more explicit.#154
Rewrite a lot of the internals of Blobs to make reading more explicit.#154mkruisselbrink wants to merge 2 commits intomainfrom
Conversation
|
This at least matches how chrome implements blobs backed by files, and how slicing and/or combining those blobs with other blobs works. I haven't verified yet if all browsers actually fully agree on this though. There also is still plenty of room for improvement here, and of course follow-up work to have HTML actually call out to the new "create a file backed File object" to create files in drag&drop and upload algorithms. @annevk do you have time to review this? not urgent, since I won't have much time in the near future to follow up anyway, but definitely would welcome your input about both the high-level idea of how this tries to define how things work, as well as details. |
This redefines how blobs work by adding internal slots and hooks to support Blob objects backed by various sources of data. By doing so this tries to make more explicit how various edge cases behave, as well as making it clear that blobs are immutable. This fixes #75, fixes #99, and fixes #47. It also lays the ground work to address w3c/webappsec-clear-site-date#49. This also fixes #71.
40f442d to
030d980
Compare
inexorabletash
left a comment
There was a problem hiding this comment.
In progress. Made it to line ~400 or so
| </div> | ||
|
|
||
| <div algorithm="blob deserialize"> | ||
| Their [=deserialization step=] (the <dfn>blob deserialization steps</dfn>), |
| </dl> | ||
| 1. Let |read state| be a new [=bytes blob read state=]. | ||
| 1. If |offset| is larger than |snapshot state|[<code><a for="bytes blob snapshot state">"data"</a></code>]'s [=byte sequence/length=], | ||
| set |read state|.[=bytes blob read state/bytes=] to an empty [=byte sequence=]. |
There was a problem hiding this comment.
FWIW, I don't see the struct.name syntax called out in Infra. Do you know if there's plan to add it?
|
It would help a lot to have an enumeration of the edge cases this attempts to cover as well as some kind of overview of the design to solve them. |
inexorabletash
left a comment
There was a problem hiding this comment.
I probably didn't catch everything, but I made it to the end!
| To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> given a sequence of {{BlobPart}}'s |parts| | ||
| and {{BlobPropertyBag}} |options|, | ||
| To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> | ||
| given a sequence of {{BlobPart}}'s |blobParts| and {{BlobPropertyBag}} |options|, |
There was a problem hiding this comment.
Nit: {{BlobPart}}s (plural s, not a possessive s)
There was a problem hiding this comment.
Also, maybe list rather than sequence? (I'm hazy on when to switch from IDL terms to Infra terms)
| 1. Let |parts| be an empty [=list=]. | ||
|
|
||
| 1. For each |element| in |parts|: | ||
| 1. Let |bytes| be an empty [=byte sequence=]. |
There was a problem hiding this comment.
This algorithm concatenates adjacent non-blob parts rather than just ending up with more parts. Although this probably matches implementations, it doesn't seem to be observable and makes the algorithm more complicated. Is there a good reason for it?
| @@ -361,16 +458,179 @@ run the following steps: | |||
| 1. If |element| is a {{BufferSource}}, <a lt="get a copy of the buffer source">get | |||
There was a problem hiding this comment.
Not new, but would be more readable with "Otherwise, if ..."
|
|
||
| 1. If |element| is a {{Blob}}, | ||
| append the bytes it represents to |bytes|. | ||
| 1. If |element| is a {{Blob}}: |
| :: A number, representing the byte offset in the remaining blob parts | ||
| from which to start returning data. | ||
| : <dfn>nested blob data</dfn> | ||
| :: `undefined` or a [=blob data description=]. This is `undefined` unless otherwise specified. |
There was a problem hiding this comment.
I don't think we typically style undefined (or null, true, false) as code ?
(Likely not consistent across specs)
| let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. | ||
| If it is not provided, | ||
| set |d| to the current date and time | ||
| with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. |
There was a problem hiding this comment.
spelling: should be "ambiguity"
| 2. Let |n| be a new string of the same size as the {{fileName}} argument to the constructor. | ||
| Copy every character from {{fileName}} to |n|, | ||
| 1. Let |n| be a new string of the same size as |fileName|. | ||
| 1. Copy every character from |fileName| to |n|, |
There was a problem hiding this comment.
Not new: use "code point" not "character" here and elsewhere.
| with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. | ||
|
|
||
| 1. If |options|.{{FilePropertyBag/lastModified}} member is provided: | ||
| 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. |
There was a problem hiding this comment.
consistency: "member" or "dictionary member"
| 1. If |options|.{{FilePropertyBag/lastModified}} member is provided: | ||
| 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. | ||
| 1. Otherwise: | ||
| 1. Let |d| be the current date and time |
There was a problem hiding this comment.
Current date / time (as millis since epoch) comes up several times in the spec. Factor it out into a definition.
| represented as the number of milliseconds since the <a>Unix Epoch</a> | ||
| (which is the equivalent of <code>Date.now()</code> [[ECMA-262]]). | ||
|
|
||
| Note: Since ECMA-262 {{Date}} objects convert to <code>long long</code> values |
There was a problem hiding this comment.
Maybe move this into the domintro ?
There was a problem hiding this comment.
I guess File constructor doesn't have a domintro yet. Add one? Blob too?
That'd be a good place to call out (in giant blinking text...) that the first argument is a array. :)
pwnall
left a comment
There was a problem hiding this comment.
Posting my current comments, since @inexorabletash made it to the end before me. Happy to take a second look after revisions, if you'd like me to.
|
|
||
| </div> | ||
|
|
||
| Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of |
There was a problem hiding this comment.
The storage API used to persist |serialized| needs a way to
| </div> | ||
|
|
||
| Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of | ||
| modifying the read algorithms for deserialized blobs. I.e. a Blob that was |
There was a problem hiding this comment.
I.e. -> E.g., or For example,
(I think the IndexedDB sentence here is an example of the above, not an elaboration.)
|
|
||
| 1. Set |serialized|.\[[SnapshotState]] to |value|'s [=snapshot state=]. | ||
| A {{Blob}} has an associated <dfn export for=Blob>\[[data]]</dfn> internal slot, | ||
| a [=blob data description=]. |
There was a problem hiding this comment.
internal slot, which holds/is a [=blob data description=] ?
(I'm trying to avoid a reparse. Without the extra words, my first parse of the sentence is that there is an internal slot and a blob data description.)
| Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of | ||
| modifying the read algorithms for deserialized blobs. I.e. a Blob that was | ||
| deserialized from IndexedDB should start throwing in its read steps after | ||
| clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize |
| modifying the read algorithms for deserialized blobs. I.e. a Blob that was | ||
| deserialized from IndexedDB should start throwing in its read steps after | ||
| clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize | ||
| pass along a hook from the storage API to here? |
| ### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs} | ||
|
|
||
| The [=snapshot state=] for a byte sequence backed blob contains a | ||
| <dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=]. |
There was a problem hiding this comment.
(Disclaimer: I'm not versed in spec-ese) I would have expected (map) entry instead of member here.
The [=snapshot state=] ... has a "data" entry that stores a [=byte sequence=]. ?
|
|
||
| </div> | ||
|
|
||
| A <dfn>bytes blob read state</dfn> is a [=struct=] conssting of |
There was a problem hiding this comment.
conssting -> consisting (typo)
or a [=struct=] that has a single <dfn for="bytes blob read state">bytes</dfn> member, which is a [=byte sequence=]
| where each part could be a blob itself. | ||
|
|
||
| The [=snapshot state=] for a multipart blob contains a | ||
| <dfn for="multipart blob snapshot state"><code>"parts"</code></dfn> member, |
There was a problem hiding this comment.
same comment as above regarding member vs (map) entry
| 1. For each |element| in |blobParts|: | ||
| 1. If |element| is a {{USVString}}, run the following substeps: | ||
|
|
||
| 1. Let |s| be |element|. |
There was a problem hiding this comment.
(not introduced by this PR)
I think it would be nice to replace |s| with |str|. This would avoid [=UTF-8 encoding=] |s| which reads like it could be a possessive pronoun.
| ### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs} | ||
|
|
||
| The [=snapshot state=] for a byte sequence backed blob contains a | ||
| <dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=]. |
There was a problem hiding this comment.
WDYT about renaming the "data" map entry to "bytes" to avoid the naming conflict with the [[data]] internal slot?
This redefines how blobs work by adding internal slots and hooks to
support Blob objects backed by various sources of data. By doing so this
tries to make more explicit how various edge cases behave, as well as
making it clear that blobs are immutable.
This fixes #75, fixes #99, and fixes #47.
It also lays the ground work to address w3c/webappsec-clear-site-data#49.
This also fixes #71.
For normative changes, the following tasks have been completed:
Implementation commitment:
Preview | Diff