This guide covers MADSci's structured logging system and the EventClient context system for hierarchical logging.
MADSci uses the EventClient class for structured logging throughout the system. The EventClient:
- Logs events locally to files with rotation
- Sends events to the EventManager for centralized storage and querying
- Integrates with OpenTelemetry for distributed tracing
- Supports hierarchical context propagation
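The first bullet, local file logging with rotation, is standard Python logging machinery. A stdlib sketch of the idea (illustrative only, using `RotatingFileHandler`; this is not EventClient's internal implementation):

```python
# Illustrative: "log to files with rotation" via stdlib logging.
# All names here are for the demo, not MADSci internals.
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path

log_dir = Path(tempfile.mkdtemp())
handler = RotatingFileHandler(
    log_dir / "events.log",
    maxBytes=1024,   # rotate once the file exceeds ~1 KB
    backupCount=3,   # keep events.log.1 .. events.log.3, drop older
)
logger = logging.getLogger("demo.events")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

for i in range(100):
    logger.info("event %d with some structured payload", i)

# The active file plus up to backupCount rotated files remain on disk
rotated = sorted(p.name for p in log_dir.glob("events.log*"))
print(rotated)
```

EventClient manages an equivalent rotation policy for you; you never touch handlers directly.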
- Use structured logging: put data in kwargs, not f-strings
- Set `event_type`: use the most specific `EventType` available
- Include context: include relevant IDs (workflow_id, node_id, resource_id, etc.)
- Use `exc_info=True` on exception logs
- Use the context system for hierarchical logging across components
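The `exc_info=True` guidance is worth seeing concretely. Since MADSci bridges to Python's stdlib logging, a minimal stdlib sketch shows what the flag captures (EventClient methods accept the same flag):

```python
# Illustrative: exc_info=True attaches the full traceback to the log record.
import io
import logging

buf = io.StringIO()
logging.basicConfig(stream=buf, level=logging.INFO, force=True)
log = logging.getLogger("demo")

try:
    1 / 0
except ZeroDivisionError:
    # Without exc_info the traceback is lost; with it, the full stack is logged
    log.error("Step failed", exc_info=True)

output = buf.getvalue()
```

The logged output contains both the message and the `Traceback (most recent call last):` block, so the failure is diagnosable from the event store alone.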
```python
# Good - structured data is queryable
self.event_client.info(
    "Workflow step completed",
    event_type=EventType.WORKFLOW_STEP_COMPLETE,
    workflow_id=workflow.workflow_id,
    step_index=step_index,
    step_action=step.action,
    duration_ms=elapsed_ms,
)

# Don't do this - data is not queryable
self.event_client.info(
    f"Workflow {workflow.workflow_id} step {step_index} completed in {elapsed_ms}ms"
)
```

| Category | Event Types | Emitted By |
|---|---|---|
| Manager lifecycle | `MANAGER_*` | Manager services |
| Workflow lifecycle | `WORKFLOW_*`, `WORKFLOW_STEP_*` | Workcell/workflow execution layer |
| Node operations | `NODE_*`, `ACTION_*` | Node module layer |
| Domain events | `RESOURCE_*`, `LOCATION_*`, `DATA_*` | Respective managers |
MADSci provides a hierarchical logging context system that automatically propagates context through your code. This enables logs from related operations to share common identifiers for easier debugging and analysis.
```python
from madsci.common.context import get_event_client, event_client_context

# Get a logger (uses context if available, creates new if not)
logger = get_event_client()
logger.info("Processing request")

# Establish context at entry points
with event_client_context(name="my_operation", operation_id="op-123") as logger:
    logger.info("Starting operation")
    # All logs within this block include operation_id

    # Nested context adds more metadata
    with event_client_context(name="substep", substep_id="sub-456") as step_logger:
        step_logger.info("Executing substep")
        # Logs include both operation_id and substep_id
```

Establish context at:
- Application entry points (main functions, CLI commands)
- Experiment runs
- Workflow executions
- Request handlers (via middleware)
- Long-running operations
Use get_event_client() in:
- Library code
- Utility functions
- Classes that may be used in different contexts
- Any code that should inherit parent context
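Conceptually, `get_event_client()` is a context lookup with a create-if-missing fallback, which is why library code can call it blindly. A stdlib `contextvars` sketch of the pattern (an illustration, not MADSci's actual implementation):

```python
# Illustrative "inherit if present, create if missing" accessor pattern.
import contextvars
import logging
from contextlib import contextmanager

_current_logger: contextvars.ContextVar = contextvars.ContextVar("logger", default=None)

def get_logger(name: str = "default") -> logging.Logger:
    """Return the context's logger, or create a standalone one."""
    logger = _current_logger.get()
    return logger if logger is not None else logging.getLogger(name)

@contextmanager
def logger_context(name: str):
    """Establish a logger for everything called inside the block."""
    token = _current_logger.set(logging.getLogger(name))
    try:
        yield _current_logger.get()
    finally:
        _current_logger.reset(token)

def library_function() -> str:
    # Library code: inherits whatever context the caller established
    return get_logger().name

with logger_context("my_operation"):
    inside = library_function()   # inherits "my_operation"
outside = library_function()      # falls back to "default"
```

Library code never needs to know whether it is running inside a workflow, a CLI command, or a test; the accessor resolves the right client either way.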
- Prefer `get_event_client()` over `EventClient()`:

  ```python
  # Good - participates in context
  logger = get_event_client()

  # Legacy - creates isolated client
  logger = EventClient()
  ```

- Add meaningful context metadata:

  ```python
  with event_client_context(
      name="workflow",
      workflow_id=workflow.id,
      workflow_name=workflow.name,
      user_id=user.id,
  ):
      ...  # All nested logs include this context
  ```

- Use descriptive hierarchy names:

  ```python
  # Good - clear hierarchy
  "experiment.workflow.step.action"

  # Avoid - too generic
  "process.subprocess.task"
  ```
| Level | Pattern | Example |
|---|---|---|
| Experiment | `experiment` | `experiment` |
| Workflow | `experiment.workflow` | `experiment.workflow` |
| Step | `experiment.workflow.step.{name}` | `experiment.workflow.step.transfer` |
| Node (server-side) | `node.{node_name}` | `node.robot` |
| Action (server-side) | `node.{node_name}.action.{action}` | `node.robot.action.grab` |
| Manager | `manager.{manager_name}` | `manager.event_manager` |
| CLI Command | `cli.{command_name}` | `cli.run_workflow` |
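The dotted names in this table come from nesting contexts: each `event_client_context(name=...)` level appends its name to the parent's. A small `contextvars` simulation of the naming scheme (an illustration of the pattern, not MADSci internals):

```python
# Illustrative: nested contexts compose a dotted hierarchy name.
import contextvars
from contextlib import contextmanager

_hierarchy: contextvars.ContextVar = contextvars.ContextVar("hierarchy", default="")

@contextmanager
def named_context(name: str):
    parent = _hierarchy.get()
    full = f"{parent}.{name}" if parent else name
    token = _hierarchy.set(full)
    try:
        yield full
    finally:
        _hierarchy.reset(token)

with named_context("experiment"):
    with named_context("workflow"):
        with named_context("step.transfer") as full_name:
            pass  # a logger created here would be named full_name

print(full_name)
```

Because each level only supplies its own segment, components compose cleanly: the step code does not need to know it is running inside a particular experiment.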
Before (Without Context):

```
[madsci.client.node.rest_node_client] Starting action
[madsci.client.node.rest_node_client] Action complete
[madsci.client.resource_client] Updating resource
```

After (With Context):

```
[experiment.workflow.step.transfer] Starting action | experiment_id=exp-123 workflow_id=wf-456 step_id=step-1
[experiment.workflow.step.transfer] Action complete | experiment_id=exp-123 workflow_id=wf-456 step_id=step-1
[experiment.workflow] Updating resource | experiment_id=exp-123 workflow_id=wf-456
```
The context system is fully backward compatible. Your existing code will continue to work without changes.
Your existing code continues to work:

```python
# Still works
logger = EventClient(name="my_module")
logger.info("Hello")
```

Replace direct `EventClient()` instantiation:
```python
# Before
class MyComponent:
    def __init__(self):
        self.logger = EventClient()

# After
from madsci.common.context import get_event_client

class MyComponent:
    def __init__(self):
        self.logger = get_event_client()
```

Wrap your main operations in context:
```python
# Before
def process_batch(batch_id: str):
    logger = EventClient(name="batch_processor")
    logger.info(f"Processing batch {batch_id}")
    for item in items:
        process_item(item)

# After
from madsci.common.context import event_client_context

def process_batch(batch_id: str):
    with event_client_context(name="batch", batch_id=batch_id) as logger:
        logger.info("Processing batch")
        for item in items:
            process_item(item)  # Inherits batch context
```

For CLI commands:

```python
import click
from madsci.common.context import event_client_context

@click.command()
def my_command():
    with event_client_context(name="cli.my_command") as logger:
        logger.info("Running command")
        do_work()  # Inherits context
```

For standalone scripts:

```python
from madsci.common.context import event_client_context
from madsci.client.workcell_client import WorkcellClient

with event_client_context(name="my_script", script_name="transfer_samples") as logger:
    logger.info("Starting script")

    # All clients now share the script context
    workcell = WorkcellClient()
    workcell.start_workflow("my_workflow.yaml")
```

API reference:

```python
def get_event_client(
    name: Optional[str] = None,
    create_if_missing: bool = True,
    **context_kwargs: Any,
) -> EventClient:
    """
    Get the current EventClient from context, or create one if none exists.

    Args:
        name: Optional name override (used when creating a new client)
        create_if_missing: If False, raises RuntimeError when no context exists
        **context_kwargs: Additional context to bind to the returned client

    Returns:
        EventClient with inherited context
    """
```

```python
@contextlib.contextmanager
def event_client_context(
    name: Optional[str] = None,
    client: Optional[EventClient] = None,
    inherit: bool = True,
    **context_metadata: Any,
) -> Generator[EventClient, None, None]:
    """
    Establish or extend an EventClient context.

    Args:
        name: Name for this context level (added to hierarchy)
        client: Explicit EventClient to use
        inherit: If True, inherit parent context; if False, create fresh
        **context_metadata: Additional context to bind to all logs

    Yields:
        EventClient for this context
    """
```

```python
def has_event_client_context() -> bool:
    """Check if an EventClient context is currently active."""
```

Symptom: Logs don't include expected context metadata.
Cause: Code is using EventClient() directly instead of get_event_client().
Fix: Replace EventClient() with get_event_client().
Symptom: Many separate log files being created.
Cause: Multiple EventClient() instances with different names.
Fix: Use context system to share single root client.
```python
# Establish context once at entry point
with event_client_context(name="my_app") as logger:
    # All components within this context share the same underlying logger
    run_operations()
```

Note: Context propagates automatically with async/await but does NOT propagate across thread or process boundaries. For threading or multiprocessing, establish context in each thread/process:
```python
import multiprocessing
from madsci.common.context import event_client_context

def worker_process(task_id: str):
    # Must establish context within each process
    with event_client_context(name="worker", task_id=task_id) as logger:
        logger.info("Worker starting")
        do_work()
```

For cleaner code, use decorators instead of context managers:
Wrap functions with EventClient context:
```python
from madsci.common.context import with_event_client

@with_event_client(name="my_workflow", workflow_id="wf-123")
def my_workflow(event_client=None):
    event_client.info("Running workflow")
    do_work()

# Called normally - context is established automatically
my_workflow()
```

Wrap all methods of a class with context:

```python
from madsci.common.context import event_client_class

@event_client_class(component_type="processor")
class DataProcessor:
    def process(self, data):
        # self.event_client is available in all methods
        self.event_client.info("Processing data", data_size=len(data))
        return transform(data)

    def get_context_overrides(self) -> dict:
        # Optional: provide instance-specific context
        return {"processor_id": self.id}
```

When OpenTelemetry is enabled, logs are automatically correlated with distributed traces.
Enable OTEL per-manager or per-EventClient:
```python
from madsci.client.event_client import EventClient, EventClientConfig

config = EventClientConfig(
    otel_enabled=True,
    otel_service_name="my_service",
    otel_exporter="otlp",
    otel_endpoint="http://localhost:4317",
    otel_protocol="grpc",
)
client = EventClient(name="traced_component", config=config)
```

Or via environment variables:

```shell
EVENT_OTEL_ENABLED=true
EVENT_OTEL_SERVICE_NAME="madsci.event"
EVENT_OTEL_EXPORTER="otlp"
EVENT_OTEL_ENDPOINT="http://localhost:4317"
```

With OTEL enabled, MADSci uses LoggingInstrumentor from the opentelemetry-instrumentation-logging package to automatically bridge Python's stdlib logging to the OTEL log pipeline. This injects trace context into every log record, so events automatically include:
- `trace_id`: W3C trace identifier
- `span_id`: Current span identifier
- `parent_span_id`: Parent span identifier (if available)
This enables:
- Clicking on a trace ID in logs to jump to the full trace in Jaeger
- Seeing which logs were generated during a specific request
- Understanding the full flow of a workflow across all managers
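The mechanism behind this injection is record-factory patching: the instrumentor swaps Python's log record factory so every record carries trace fields before formatting. A simplified sketch of that mechanism with a hard-coded (hypothetical) trace ID standing in for the active span's real one:

```python
# Simplified illustration of trace-context injection into log records.
# Real instrumentation reads the IDs from the active span; this demo hard-codes one.
import io
import logging

FAKE_TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"  # W3C format: 32 hex chars

old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    record = old_factory(*args, **kwargs)
    record.trace_id = FAKE_TRACE_ID  # injected on every record
    return record

logging.setLogRecordFactory(record_factory)

buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(message)s trace_id=%(trace_id)s"))
log = logging.getLogger("traced")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("Workflow step complete")
line = buf.getvalue().strip()
```

Because the factory runs for every record, even third-party library logs emitted during a traced request pick up the same trace ID.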
For explicit span creation:
```python
from madsci.common.otel import span_context, with_span

# Context manager
with span_context("process_data", attributes={"data.size": 100}) as span:
    result = process(data)
    span.set_attribute("result.count", len(result))

# Decorator
@with_span(name="fetch_user")
def get_user(user_id: str):
    return api.fetch(user_id)
```

See OBSERVABILITY.md for the full observability stack setup.