Staff Software Engineer — AI Advice Centre
Royal Bank of Canada, Toronto, ON
- Architected end-to-end AI call center platform integrating SIP telephony, real-time speech-to-text transcription, and LLM summarization pipelines processing thousands of daily banking interactions.
- Engineered LLM ensemble labelling pipeline using GPT-4o mini, GPT-4.1 mini, and GPT-5 mini with consensus voting and tie-breaking escalation to larger thinking models; generated high-quality training labels at scale for fine-tuning.
- Implemented knowledge distillation workflow: used LLM ensemble as teacher to fine-tune Arctic Embed 2.0 Large as domain-specific student model for banking intent semantics.
- Built continuous training MLOps pipeline on S3 + Apache Airflow; automated dataset ingestion, model retraining, evaluation, and deployment to OpenShift.
- Benchmarked SVM (RBF kernel), logistic regression, SetFit (head and full fine-tune), centroid, and fine-tuned Arctic Embed classifiers for production intent routing.
- Led development of BART intent classification system using Jina v3 / multilingual-e5 with ONNX Runtime; reduced inference latency from 8 s to 50 ms on CPU-only OpenShift 4.
- Built Spring Boot microservices with k-NN routing of 14,000+ banking intents across 110 workflow categories; achieved 95% semantic naming accuracy via LLM-assisted clustering.