AI Software Engineer | LLM Systems · Production AI · Backend
Montreal, Canada
Portfolio · LinkedIn · Google Scholar · Email
AI Software Engineer with 5+ years of experience shipping production AI systems. I architect LLM orchestration pipelines, gRPC microservices, and automated testing strategies at scale. Published researcher in edge computing and neural architecture search.
- 9× inference throughput — Edge BERT inference via pipeline parallelism + neural architecture search
- 99.9% service availability — gRPC multi-agent orchestration for automotive assistant workflows
- 95%+ code coverage — Full automated testing strategy (unit, contract, integration, E2E)
- 60% deployment lead time reduction — CI/CD automation and testing infrastructure
- 12–14 TOPs/s/W energy efficiency — Analog memory training acceleration (IBM Research)
| Project | Description | Stack |
|---|---|---|
| A2A Samples | Agent-to-Agent protocol samples for multi-agent AI systems | Python, Jupyter |
| Analog HW Acceleration | IBM analog hardware acceleration for in-memory computing ⭐ 7 | Python, Jupyter |
| Super-Convergence in Analog HW | Exploring super-convergence for analog in-memory computing ⭐ 5 | Python, Jupyter |
| DataBrick Learning | Databricks platform learning and data engineering workflows | Python |
Apps & Products (Private repos — see portfolio for demos)
| Project | Description | Stack |
|---|---|---|
| Puppy Step | Pet companion app with real-time growth tracking and milestone gallery | React Native, Firebase |
| Subscribe Manager | Subscription tracking platform with intelligent expense insights | React, Node.js |
| Canada Count Down | Timeline tracker for Canadian PR-to-Citizenship journey | React Native |
| Coach Book IQ | Team management and scheduling for professional coaching staff | React, Firebase |
- PipeBERT: High-throughput BERT Inference for ARM Big.LITTLE Multi-core Processors. Journal of Signal Processing Systems (IEEE SiPS 2022). H.-Y. Chang, S. Mozafari, C. Chen, J. Clark, B. Meyer, W. Gross
- High-Throughput Edge Inference for BERT Models via Neural Architecture Search and Pipeline. GLSVLSI 2023 (Poster). H.-Y. Chang, S. Mozafari, J. Clark, B. Meyer, W. Gross
- AI Hardware Acceleration with Analog Memory. IBM Journal of Research and Development. H.-Y. Chang, G.W. Burr, P. Narayanan, S. Ambrogio et al.
- A Novel Architecture to Build Ideal-linearity Neuromorphic Synapses on a Pure Logic FinFET Platform. 2019 Symposium on VLSI Technology (Oral). E.R. Hsieh, H.-Y. Chang, S.S. Chung, S.S. Wong et al.
AI/LLM Systems: LangChain · RAG Systems · FAISS · Vector Databases · Function Calling · Multi-Agent Orchestration
ML & Inference: PyTorch · TensorFlow · TVM · Neural Architecture Search · Edge Inference
Backend & Infra: Python · FastAPI · gRPC · Docker · Kubernetes · GitLab CI/CD · Async Python
App Development: React · React Native · Firebase · Node.js
Research & Data: Pytest · Iceberg Catalog · LaTeX · MATLAB · SystemC
- 🔧 Scaling multi-agent LLM orchestration at Cerence for automotive AI
- 🧪 Exploring agentic AI patterns and tool-use architectures
- 📱 Building side projects and mobile apps
- ✍️ Writing about LLM systems and production AI
- AI/LLM engineering roles (senior level)
- Technical collaborations on LLM tooling and orchestration
- Speaking and writing about production AI systems
hychangee@gmail.com · Montreal, Canada

