Open
Description
I'm trying to set up a clean separation between RL policy and RL environment execution using this project. My basic flow is:
- Open a new env on remote
- Run `env.step` until the episode ends
- Call `env.close`
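For reference, a minimal sketch of that loop. `StubEnv` is a hypothetical local stand-in for the remote environment (it is not part of zero), and the random action is only a placeholder policy; in the real setup each method would be a remote call.

```python
import random
from typing import Tuple


class StubEnv:
    """Hypothetical stand-in for a remote env; the episode ends after max_steps."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.steps = 0
        self.closed = False

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Returns (observation, reward, done), in the style of a gym-like env.
        self.steps += 1
        done = self.steps >= self.max_steps
        return self.steps, 1.0, done

    def close(self) -> None:
        self.closed = True


def run_episode(env: StubEnv) -> float:
    """Step until the episode ends, then close the env."""
    total_reward = 0.0
    done = False
    try:
        while not done:
            action = random.choice([0, 1])  # placeholder policy
            _, reward, done = env.step(action)
            total_reward += reward
    finally:
        env.close()  # release the env even if a step raises
    return total_reward
```

The `try`/`finally` is there so the env is closed even when a step fails partway through an episode.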
I ran into some issues:
- No numpy support in zero.
  I solved this by writing a new encoder that handles numpy arrays:

  ```python
  import asyncio
  import io
  from typing import Any, Type

  import numpy as np
  from zero import AsyncZeroClient
  from zero.encoder.generic import GenericEncoder
  from zero.encoder.msgspc import T


  class GenericEncoderWithNumpySupport(GenericEncoder):
      def encode(self, data) -> bytes:
          # Serialize arrays to .npy bytes, then let the base encoder wrap them.
          if isinstance(data, np.ndarray):
              buffer = io.BytesIO()
              np.save(buffer, data)
              return super().encode(buffer.getvalue())
          return super().encode(data)

      def decode(self, data: bytes) -> Any:
          decoded_data = super().decode(data)
          # .npy payloads start with b"\x93NUMPY"; only bytes can carry that magic string.
          if isinstance(decoded_data, bytes) and decoded_data[1:6] == b"NUMPY":
              buffer = io.BytesIO(decoded_data)
              return np.load(buffer)
          return decoded_data

      def decode_type(self, data: bytes, typ: Type[T]) -> T:
          if issubclass(typ, np.ndarray):
              # self.decode already returns the loaded array, so no second np.load.
              return self.decode(data)
          return super().decode_type(data, typ)

      def is_allowed_type(self, typ: Type) -> bool:
          return super().is_allowed_type(typ) or typ is np.ndarray
  ```
- NoneType return disallowed.
  It's an annoying restriction; I worked around it by returning a dummy value.
- No support for multiple arguments.
  This was a deal breaker for me. Not supporting multiple arguments made it pretty hard to proceed, so I stopped here.
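On the numpy point, here is a small standalone check of the magic-string assumption the encoder relies on. It uses only `np.save` and `np.load` and does not touch zero; the helper names `to_npy_bytes` and `from_npy_bytes` are made up for this sketch.

```python
import io

import numpy as np


def to_npy_bytes(arr: np.ndarray) -> bytes:
    """Serialize an array to .npy bytes in memory."""
    buffer = io.BytesIO()
    np.save(buffer, arr)
    return buffer.getvalue()


def from_npy_bytes(payload: bytes) -> np.ndarray:
    """Load an array back from .npy bytes."""
    return np.load(io.BytesIO(payload))


arr = np.arange(6, dtype=np.float32).reshape(2, 3)
payload = to_npy_bytes(arr)

# .npy payloads begin with b"\x93NUMPY", so bytes 1..5 spell "NUMPY".
has_magic = payload[1:6] == b"NUMPY"

# Shape and dtype are stored in the .npy header, so both survive the round trip.
restored = from_npy_bytes(payload)
```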
Would be nice if these issues could get some attention. The numpy part is easily fixable, but the lack of multi-argument support is really limiting.
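One possible stopgap for the multi-argument limitation is to pack everything into a single payload on the caller side and unpack it inside the handler. The sketch below is plain Python and makes no claim about zero's API: `pack_args` and `single_arg` are hypothetical helpers, and how the wrapped functions would be registered with zero (including whether its type-hint checks accept them) is not shown.

```python
import functools
from typing import Any, Callable, Dict


def pack_args(*args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Bundle positional and keyword arguments into one payload."""
    return {"args": list(args), "kwargs": kwargs}


def single_arg(func: Callable[..., Any]) -> Callable[[Dict[str, Any]], Any]:
    """Adapt a multi-argument function so it takes one packed payload."""

    @functools.wraps(func)
    def wrapper(payload: Dict[str, Any]) -> Any:
        result = func(*payload["args"], **payload["kwargs"])
        # Return a dummy value instead of None, as in the NoneType workaround.
        return True if result is None else result

    return wrapper


@single_arg
def step(env_id: str, action: int) -> Dict[str, Any]:
    # Placeholder body; a real handler would advance the environment.
    return {"env_id": env_id, "action": action, "done": False}


@single_arg
def close(env_id: str) -> None:
    # Placeholder body; a real handler would release the environment.
    return None
```

A caller would then use `step(pack_args("env-0", action=1))`. This keeps each call to one argument at the cost of losing per-argument type information at the call boundary.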