Closed
Labels
bug: Something isn't working
Description
bytes fields in both HashModel and JsonModel fail with UnicodeDecodeError when storing actual binary data (bytes that are not valid UTF-8). Only byte sequences that happen to be valid UTF-8, such as ASCII-only bytes, work.
Steps to Reproduce
from aredis_om import JsonModel, Field, Migrator
import asyncio

class File(JsonModel, index=True):
    filename: str
    content: bytes

async def main():
    await Migrator().run()

    # This works (ASCII-only bytes):
    f1 = File(filename="text.txt", content=b"Hello World")
    await f1.save()  # ✅ OK

    # This fails (binary data):
    f2 = File(filename="image.png", content=b"\x89PNG\r\n\x1a\n")
    await f2.save()  # ❌ UnicodeDecodeError

asyncio.run(main())

Error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Stack Trace
File "aredis_om/model/model.py", line 3130, in save
    data = jsonable_encoder(data)
File "aredis_om/model/encoders.py", line 134, in jsonable_encoder
    encoded_value = jsonable_encoder(
File "aredis_om/model/encoders.py", line 171, in jsonable_encoder
    return ENCODERS_BY_TYPE[type(obj)](obj)
File "pydantic/deprecated/json.py", line 55, in <lambda>
    bytes: lambda o: o.decode(),
Root Cause
The jsonable_encoder in aredis_om/model/encoders.py uses Pydantic's deprecated encoder mapping (pydantic/deprecated/json.py), which calls bytes.decode() without an encoding argument and therefore defaults to UTF-8. This raises UnicodeDecodeError for any bytes value that is not valid UTF-8.
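The failing call can be reproduced without redis-om or Redis at all. The snippet below is a minimal sketch using only the standard library; the variable names are illustrative and not part of the redis-om codebase.

```python
# Standalone reproduction of the decode step, independent of redis-om.
png_header = b"\x89PNG\r\n\x1a\n"

# ASCII-only bytes are valid UTF-8, so the default decode succeeds.
ascii_text = b"Hello World".decode()

# Non-ASCII bytes also succeed as long as they form valid UTF-8.
accented = "\u00e9".encode("utf-8").decode()

# 0x89 cannot start a UTF-8 sequence, so the same call raises.
try:
    png_header.decode()
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(ascii_text, accented, decode_error)
```

This suggests the limitation is UTF-8 validity rather than ASCII specifically.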
Expected Behavior
bytes fields should be able to store arbitrary binary data, not just UTF-8 compatible bytes. Options:
- Base64-encode bytes automatically: store as a base64 string and decode on retrieval
- Use latin-1 encoding: it can represent any byte value (0-255)
- Document the limitation: if intentional, document that bytes must be valid UTF-8
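For comparison, here is a rough sketch of how the first two options round-trip arbitrary bytes. The helper names (encode_b64, decode_b64, encode_latin1, decode_latin1) are hypothetical and do not exist in redis-om; this only illustrates the encodings themselves, not how they would be wired into jsonable_encoder.

```python
import base64

# Hypothetical helpers, for illustration only (not redis-om API).

def encode_b64(data: bytes) -> str:
    # Base64 output is pure ASCII, so decoding to str is always safe.
    return base64.b64encode(data).decode("ascii")

def decode_b64(text: str) -> bytes:
    return base64.b64decode(text)

def encode_latin1(data: bytes) -> str:
    # latin-1 maps each byte 0-255 to the code point with the same number.
    return data.decode("latin-1")

def decode_latin1(text: str) -> bytes:
    return text.encode("latin-1")

# Every possible byte value, including ones that are invalid UTF-8.
all_bytes = bytes(range(256))

b64_text = encode_b64(all_bytes)
latin1_text = encode_latin1(all_bytes)

assert decode_b64(b64_text) == all_bytes
assert decode_latin1(latin1_text) == all_bytes
```

Both round-trip losslessly. Base64 produces a longer but plain ASCII string (about 4 characters per 3 bytes), while latin-1 keeps one character per byte but yields strings containing control and non-ASCII characters.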
Current Workaround
Store binary data as base64-encoded strings:
import base64

from aredis_om import JsonModel

class File(JsonModel, index=True):
    filename: str
    content_b64: str  # Store as base64 string

    @classmethod
    def create(cls, filename: str, content: bytes):
        # Encode the raw bytes to an ASCII-safe base64 string before storing
        return cls(filename=filename, content_b64=base64.b64encode(content).decode())

    @property
    def content(self) -> bytes:
        # Decode back to the original bytes on access
        return base64.b64decode(self.content_b64)

Environment
- redis-om version: 1.0.0 (current main branch)
- Python version: 3.12
- Pydantic version: 2.x