Hi, I'm Saif Mahmoud
AI Engineer | Applied AI/ML & Inference
I'm an AI Engineer specializing in cost-effective inference pipelines. I'm currently researching structured pruning in Vision Transformers at Al Ain University.
About Me
Learn more about my background, education, and courses
Education
Bachelor of Science in Software Engineering
Al Ain University
Sep 2023 – May 2027 (Expected)
Courses
HarvardX
July 2025
DeepLearning.AI & Stanford
July 2025
DeepLearning.AI & Stanford
January 2026
DeepLearning.AI & Stanford
January 2026
Work Experience
My professional journey and contributions
Undergraduate Research Assistant
Al Ain University
Conducting a comparative analysis of cost-accuracy tradeoffs in SOTA structured pruning for Vision Transformers, synthesizing data across 69+ papers. To accelerate the literature review, I engineered a custom Chrome extension pipeline to log, track, and automatically grade over 550 papers based on strict inclusion criteria.
AI Engineering Intern
LUXAI
Architected Reelwise, an inference pipeline utilizing GraphQL interception. Engineered a triage system with sub-35ms CPU-based filtration to reduce inference load by 60%, followed by an Int8-quantized Faster-Whisper workflow. Customized sentence-transformers by implementing FlashAttention in Triton, cutting cold-start latency by 71% while maintaining 0.96x sustained throughput. Kept the memory footprint under 48MB VRAM per video, which allowed scaling to two concurrent workers and boosted throughput by 60% with minimal memory overhead.
Software Engineering Intern
Smart Navigation Systems
Built the core backend for Himaya71 (UAEU I2P Top Startup), a smart campus safety system. I designed an edge-to-cloud architecture where local devices run YOLOv8s for head-count detection and transmit lightweight JSON payloads to a centralized PostgreSQL backend via REST endpoints, ensuring real-time occupancy and fire alert tracking.
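As a rough illustration of the edge-to-cloud flow described above, an edge device could serialize each detection result into a compact JSON payload before sending it to the backend. This is a minimal sketch, not Himaya71's actual schema: the helper `build_payload` and the field names `device_id`, `head_count`, `fire_alert`, and `ts` are all hypothetical.

```python
import json
import time
from typing import Optional


def build_payload(device_id: str, head_count: int, fire_alert: bool,
                  ts: Optional[float] = None) -> str:
    """Serialize one occupancy reading into a compact JSON string.

    Every field name here is an illustrative assumption.
    """
    if head_count < 0:
        raise ValueError("head_count must be non-negative")
    reading = {
        "device_id": device_id,
        "head_count": head_count,
        "fire_alert": fire_alert,
        # Use the caller's timestamp if given, otherwise the current time.
        "ts": time.time() if ts is None else ts,
    }
    # Compact separators drop the whitespace json.dumps adds by default,
    # which keeps the payload small on constrained links.
    return json.dumps(reading, separators=(",", ":"))
```

The device would then POST this string to a REST endpoint on the backend; sending only counts and flags rather than frames is what keeps the payload lightweight.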
Skills & Technologies
Tools and technologies I work with
Languages & Frameworks
Inference & Optimization
AI & Machine Learning
DevOps & Tools
Featured Projects
A showcase of my work across different domains
Himaya71 Platform
A distributed edge-to-cloud safety system. I architected the Django backend to ingest high-frequency telemetry from edge devices running YOLOv8s. Features include a Next.js command center with real-time occupancy heatmaps and a React Native mobile unit for indoor navigation.
Reelwise
High-fidelity content intelligence pipeline. I intercepted internal network traffic to fetch raw metadata, then routed the videos through a custom VADER triage gate (sub-35ms latency). This filtering reduced GPU load by 60%, allowing the heavier Int8 Faster-Whisper model to run within a strict 190MB VRAM constraint. I also authored a custom Triton kernel to increase cold-start throughput by 71%, bypassing caching overhead and maintaining 0.96x sustained throughput.
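The triage idea above can be sketched as a cheap CPU-side score that decides which items reach the GPU stage at all. This is only an illustration under stated assumptions: it does not use the real VADER lexicon, the words and valence values in `LEXICON` are made up, and the choice of caption text as input, the helper names `cheap_score` and `triage`, and the threshold are all hypothetical.

```python
import re
from typing import List

# Hypothetical mini-lexicon standing in for a VADER-style valence lexicon.
# The words and values are invented for this sketch.
LEXICON = {"great": 3.1, "love": 3.2, "useful": 1.9, "boring": -1.3, "scam": -2.6}


def cheap_score(text: str) -> float:
    """Mean valence of the known words in text; 0.0 if none are known."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def triage(captions: List[str], threshold: float = 1.0) -> List[str]:
    """Keep only captions whose cheap score clears the threshold.

    Anything dropped here never reaches the expensive transcription stage.
    """
    return [c for c in captions if cheap_score(c) >= threshold]
```

Because the gate is a dictionary lookup over a short string, it runs on the CPU in very little time, and every item it rejects is one fewer transcription job for the GPU.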
Oryx Intelligence
Bilingual SEO engine that drove a 288% traffic surge. I engineered a pipeline combining unsupervised K-Means clustering for niche identification and a perplexity-based validation gate (Sentence-Transformers) to act as an 'LLM-as-a-judge,' preventing low-quality content generation.
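For readers unfamiliar with the clustering step, here is a minimal pure-Python sketch of K-Means (Lloyd's algorithm). It is not the Oryx Intelligence implementation: the function `kmeans` is hypothetical, the inputs are assumed to be small numeric vectors standing in for keyword embeddings, and centroids are seeded from the first `k` points purely to keep the example deterministic.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iters: int = 20) -> List[int]:
    """Return a cluster index for each point using Lloyd's algorithm."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # Deterministic seeding (an assumption made for this sketch).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: _dist2(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels
```

Each resulting cluster can then be inspected as a candidate niche. A production pipeline would more likely rely on an established library implementation with better seeding.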
Resume Optimizer
Hybrid-inference career platform (client-side WebLLM plus the external Gemini API). I implemented an input sanitization layer with 40+ regex patterns to block prompt injections before they reach the LLM. Optimized for speed, the system uses two-phase prompt chaining and Server-Sent Events (SSE) to achieve a 72% reduction in Time-To-First-Token.
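A regex-based sanitization layer of the kind described above might look roughly like this. The three patterns are hypothetical examples (the project's actual 40+ patterns are not listed here), and the helper names `is_suspicious` and `sanitize` are assumptions made for the sketch.

```python
import re

# Hypothetical patterns for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(your\s+)?system\s+prompt", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+", re.IGNORECASE),
]


def is_suspicious(text: str) -> bool:
    """True if any known injection pattern appears anywhere in text."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def sanitize(text: str) -> str:
    """Return text unchanged if it looks clean, otherwise raise ValueError.

    Rejecting early means the suspicious input never reaches the LLM.
    """
    if is_suspicious(text):
        raise ValueError("input rejected: possible prompt injection")
    return text
```

Pattern lists like this only catch known phrasings, so they work best as a first filter in front of other safeguards rather than as the sole defense.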
Get In Touch
I'm always open to new opportunities and collaborations
Let's connect!
Whether you have a question, want to discuss a project, or just want to say hi, I'll try my best to get back to you!