Make write(IO, Char) actually return the amount of printed bytes instead of the attempted written bytes. #56980
Isn't this aligning the implementation with the documented behavior? So this should actually be a bugfix, no?
Yes, @topolarity and I found this while griping about how hard it is to know whether write truncated bytes when taking in non-string-like things, and I got confused as to why writing to a full buffer always succeeded.
This should definitely get a regression test before it's merged though. Perhaps something like:
    io = IOBuffer(; maxsize=1)
    write(io, 'a')
    @test write(io, 'a') == 0
      while true
    -     write(io, u % UInt8)
    +     n += write(io, u % UInt8)
          (u >>= 8) == 0 && return n
This currently unconditionally advances the given character, but what happens in case the first write fails, and the second succeeds? Now there's suddenly a torn write involved here, and even though you can theoretically know that not all of the given Char has been written (e.g. getting a return value of 3 when a 4-byte Char is passed), you still wouldn't know which byte was dropped.
I think it would be good to return after the first failing write, so that it's at least knowable that a valid prefix has been written (if the return value is nonzero).
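A minimal sketch of what "return after the first failing write" could look like, assuming a byte write signals failure by returning 0. `write_char_prefix` and `BoundedSink` are hypothetical names used only for illustration; this is not the code in the PR.

```julia
# Hypothetical sink that accepts at most `capacity` bytes and then returns 0.
struct BoundedSink <: IO
    bytes::Vector{UInt8}
    capacity::Int
end
BoundedSink(capacity::Integer) = BoundedSink(UInt8[], capacity)

function Base.write(s::BoundedSink, b::UInt8)
    length(s.bytes) >= s.capacity && return 0  # full: report that nothing was written
    push!(s.bytes, b)
    return 1
end

# Write the bytes of `c` in order and stop at the first byte write that
# returns 0, so the return value always counts a valid prefix of the Char.
function write_char_prefix(io::IO, c::Char)
    u = bswap(reinterpret(UInt32, c))  # first encoded byte ends up in the low bits
    n = 0
    while true
        w = write(io, u % UInt8)
        w == 0 && return n             # abort: no later byte is attempted
        n += w
        (u >>= 8) == 0 && return n
    end
end
```

With this shape, a return value of `k` means exactly the first `k` bytes of the Char reached the sink, which is the property argued for above.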
Is there any Julia IO type where writing a byte can fail, return zero, and then succeed, without some error being thrown?
For example, writing a byte with TranscodingStreams.jl will either return 1 or throw an error.
Sure, a non-blocking buffered IO whose buffer is temporarily full, for example. I don't know whether there currently is such a type in the ecosystem, but the point is that it could exist and would be a valid IO, as far as I can tell.
Here's a (slightly contrived) example:
julia> io = IOBuffer(; maxsize=1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=1, ptr=1, mark=-1)
julia> write(io, 'a')
1
julia> write(io, 'a') # should be 0 with this PR, since the write doesn't succeed
1
julia> seekstart(io); # simulate a read-end on some other process, for example
julia> read(io, Char) # happens on the read-end
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> write(io, 'a') # continue writing
1
It's a bit awkward to do this with an IOBuffer, but the principle is the same for some IO type that has an actual read end that's distinct from the write end. For arbitrary I/O, it's usually preferable to drop data on the write end and retry later once the buffer is ready to send again. With the current behavior, the writer wouldn't know what to retransmit over the I/O, since it's impossible to know which byte(s) of the Char were not transmitted correctly. Effectively, the number returned by write becomes irrelevant and only matters when it matches sizeof(Char), at which point we might as well return true/false.
If we instead abort as soon as any internal write fails, we know that at least a correct prefix of the Char (or any data, in the general case) was written, and we can retry with only the data that we haven't attempted to transmit at all yet.
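To make the retry argument concrete, here is a hedged sketch assuming a sink that holds a bounded number of pending bytes until a reader drains it. `DrainableSink`, `drain!`, `write_prefix` and `write_with_retry` are all hypothetical names; no such type exists in Base.

```julia
# Hypothetical sink: buffers at most `capacity` pending bytes until drained.
struct DrainableSink <: IO
    pending::Vector{UInt8}
    delivered::Vector{UInt8}
    capacity::Int
end
function DrainableSink(capacity::Integer)
    capacity >= 1 || throw(ArgumentError("capacity must be at least 1"))
    return DrainableSink(UInt8[], UInt8[], Int(capacity))
end

function Base.write(s::DrainableSink, b::UInt8)
    length(s.pending) >= s.capacity && return 0  # temporarily full
    push!(s.pending, b)
    return 1
end

# Stand-in for the read end consuming everything that is pending.
function drain!(s::DrainableSink)
    append!(s.delivered, s.pending)
    empty!(s.pending)
    return s
end

# Stop at the first failing byte write, so the result counts a valid prefix.
function write_prefix(io::IO, bytes::AbstractVector{UInt8})
    n = 0
    for b in bytes
        write(io, b) == 0 && return n
        n += 1
    end
    return n
end

# Because only a prefix is ever written, resuming is just "skip `offset` bytes".
function write_with_retry(io::DrainableSink, bytes::AbstractVector{UInt8})
    offset = 0
    while offset < length(bytes)
        n = write_prefix(io, @view bytes[offset+1:end])
        offset += n
        n == 0 && drain!(io)  # simulate waiting for the read end
    end
    return offset
end
```

The retry loop never needs to know which individual bytes failed. That bookkeeping would be impossible if later bytes could be written after an earlier one was dropped.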
I don't think write ever errors for us, given asyncio and other stuff?
It does, e.g. trying to write to a read-only stream. write itself has a synchronous API, i.e. it is (task-)blocking.
zero is not a "valid" return value in that case.
There is interesting historical data suggesting that some implementations of libc write were indeed able to return 0: https://stackoverflow.com/a/41970485
For quite a bunch of kinds of files, the behavior is unspecified, so more or less anything goes either way 🤷
> I don't think write ever errors for us, given asyncio and other stuff?
Right, and for a non-blocking buffered IO it would be incredibly awkward to throw actual errors just because it's full. That possibility would be incredibly detrimental in the common case of success. I admit that having 0 signal this is quite a bad API though. I guess this is yet another case where something like a Result{Int, Err} sum type would be nice, to distinguish success from errors 🤔
Maybe let me put it another way - would this be a valid IO subtype (barring some other missing methods)?
    struct FlakyIO <: IO
        io::IO
    end
    Base.write(fio::FlakyIO, b::UInt8) = rand(Bool) ? write(fio.io, b) : 0
You could get very fancy and record which writes succeeded and which ones failed for introspection later on, or use some more complicated scheme for deciding when exactly it "fails" to write anything. This kind of type would be incredibly useful for fuzzing code that accidentally depends on writes to IO always succeeding (like the fallback method of write in Base does, for example).
One issue I see with just throwing an error for partial writes/write failures of parts of larger types is that then the return value of write becomes meaningless - either we always get a full write, or we get an error. There would be no more room for partial writes, which can happen in a bunch of cases.
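A deterministic variant of the FlakyIO idea can show the torn write directly. This is only a sketch: `ScriptedIO` and `write_all_attempts` are hypothetical names, and the loop mirrors the shape of the loop under review (accumulate every byte write, never stop early) rather than quoting Base.

```julia
# Hypothetical wrapper: each byte write succeeds or "fails" (returns 0)
# according to a fixed script, and the outcome of every attempt is logged.
struct ScriptedIO{T<:IO} <: IO
    io::T
    script::Vector{Bool}  # true = the next byte write goes through
    log::Vector{Bool}     # outcome of each attempted byte write
end
ScriptedIO(io::IO, script) = ScriptedIO(io, collect(Bool, script), Bool[])

function Base.write(s::ScriptedIO, b::UInt8)
    ok = isempty(s.script) ? true : popfirst!(s.script)
    push!(s.log, ok)
    return ok ? write(s.io, b) : 0
end

# Sums the results of all byte writes and keeps going after a failure.
function write_all_attempts(io::IO, c::Char)
    u = bswap(reinterpret(UInt32, c))
    n = 0
    while true
        n += write(io, u % UInt8)
        (u >>= 8) == 0 && return n
    end
end

buf = IOBuffer()
sio = ScriptedIO(buf, [false, true])  # first byte dropped, second accepted
n = write_all_attempts(sio, 'é')      # 'é' encodes as 0xc3 0xa9
written = take!(buf)
```

Here `n` is 1, but the byte that reached `buf` is the second byte of the Char, not the first. The count alone cannot tell the caller which byte was dropped.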
This conversation is worth continuing, but for the purposes of fixing this bug I think it's orthogonal.
Our AbstractArray write method can also suffer "torn" writes in the same way:
    function unsafe_write(s::IO, p::Ptr{UInt8}, n::UInt)
        written::Int = 0
        for i = 1:n
            written += write(s, unsafe_load(p, i))
        end
        return written
    end
This is probably worth splitting into a separate issue and fixing across the board. The only thing I think this needs to merge, @gbaraldi, is a test.
This might break some tests but I want to see which