Hi, I'm Saif Mahmoud

AI Engineer | Applied AI/ML & Inference

I'm an AI Engineer specializing in cost-effective inference pipelines. I'm currently researching structured pruning in Vision Transformers at Al Ain University.

About Me

Learn more about my background, education, and courses

Education

Bachelor of Science in Software Engineering

Al Ain University

Abu Dhabi, UAE

Sep 2023 – May 2027 (Expected)

GPA: 3.81 (Honors List)

Courses

Work Experience

My professional journey and contributions

Undergraduate Research Assistant

Al Ain University

UAE
November 2025 – Present

Conducting a comparative analysis of cost-accuracy tradeoffs in state-of-the-art structured pruning methods for Vision Transformers, synthesizing data from 69+ papers. To accelerate the literature review, I built a custom Chrome extension pipeline that logs, tracks, and automatically grades over 550 papers against strict inclusion criteria.

Python
Vision Transformers
Structured Pruning
Chrome Extension
Research

AI Engineering Intern

LUXAI

UAE
July 2025 – Present

Architected Reelwise, an inference pipeline built on GraphQL interception. Engineered a triage system with sub-35 ms CPU-based filtration that reduces inference load by 60%, followed by an INT8-quantized Faster-Whisper workflow. Customized sentence-transformers by implementing FlashAttention in Triton, cutting cold-start latency by 71% while maintaining 0.96x sustained throughput. Kept the memory footprint under 48 MB of VRAM per video, which allowed scaling to two concurrent workers and boosted throughput by 60% with minimal memory overhead.

Triton
ONNX Runtime
Faster-Whisper
VADER
GraphQL
FastAPI
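
The triage idea described above (a cheap CPU filter that decides which items reach the expensive model) can be illustrated with a minimal sketch. Everything here is hypothetical: the `Clip` type, the keyword-overlap score, and the threshold are stand-ins for illustration only and are not Reelwise's actual filter or scoring logic.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Clip:
    clip_id: str
    caption: str  # cheap metadata that is available before any model runs


def cheap_score(clip: Clip, keywords: Iterable[str]) -> float:
    """Fraction of keywords found in the caption (hypothetical CPU-only heuristic)."""
    words = set(clip.caption.lower().split())
    kws = [k.lower() for k in keywords]
    if not kws:
        return 0.0
    return sum(k in words for k in kws) / len(kws)


def triage(
    clips: Iterable[Clip],
    keywords: Iterable[str],
    threshold: float,
    expensive_stage: Callable[[Clip], str],
) -> List[str]:
    """Run the expensive stage only on clips that pass the cheap filter."""
    kws = list(keywords)
    results = []
    for clip in clips:
        if cheap_score(clip, kws) >= threshold:
            results.append(expensive_stage(clip))
    return results


clips = [
    Clip("a", "Fast GPU inference tips"),
    Clip("b", "my cat video"),
    Clip("c", "gpu unboxing"),
]
# The lambda stands in for transcription or embedding; here it just echoes the id.
kept = triage(clips, ["gpu", "inference"], 0.5, lambda c: c.clip_id)
```

In this toy run only clips "a" and "c" reach the expensive stage, so the costly model sees two of the three inputs.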

Software Engineering Intern

Smart Navigation Systems

UAE
May 2025 – November 2025

Built the core backend for Himaya71 (UAEU I2P Top Startup), a smart campus safety system. I designed an edge-to-cloud architecture where local devices run YOLOv8s for head-count detection and transmit lightweight JSON payloads to a centralized PostgreSQL backend via REST endpoints, ensuring real-time occupancy and fire alert tracking.

Python
Django
YOLOv8s
PostgreSQL
IoT
Edge Computing
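
As a rough sketch of the edge-to-cloud pattern described above, an edge device can reduce its detections to a small JSON payload before sending it to the backend. The field names, the `OccupancyReading` type, and the confidence threshold are assumptions made for illustration, not the Himaya71 schema; the detector output is mocked as a plain list of confidence scores.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional, Sequence


@dataclass
class OccupancyReading:
    device_id: str
    room: str
    head_count: int
    fire_alert: bool
    timestamp: str  # ISO 8601, UTC


def build_payload(
    device_id: str,
    room: str,
    detections: Sequence[float],
    fire_alert: bool,
    min_conf: float = 0.5,
    now: Optional[datetime] = None,
) -> str:
    """Summarize detector confidences into a compact JSON string.

    `detections` holds one confidence score per detected head
    (a hypothetical stand-in for real detector output).
    """
    now = now or datetime.now(timezone.utc)
    head_count = sum(1 for conf in detections if conf >= min_conf)
    reading = OccupancyReading(
        device_id=device_id,
        room=room,
        head_count=head_count,
        fire_alert=fire_alert,
        timestamp=now.isoformat(),
    )
    # Compact separators keep the payload small for constrained links.
    return json.dumps(asdict(reading), separators=(",", ":"))


payload = build_payload(
    "edge-01",
    "lab-2",
    [0.9, 0.4, 0.75],
    fire_alert=False,
    now=datetime(2025, 6, 1, tzinfo=timezone.utc),
)
```

The resulting string could then be sent in the body of a POST request to a REST endpoint; sending only counts and flags rather than frames keeps bandwidth low and keeps raw video on the device.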

Skills & Technologies

Tools and technologies I work with

Languages & Frameworks

Python
TypeScript
Bash
C++
Java
FastAPI
Django
Next.js
Node.js

Inference & Optimization

Triton
CUDA
Quantization
ONNX Runtime
CTranslate2
tiktoken

AI & Machine Learning

PyTorch
spaCy
Hugging Face
OpenCV
scikit-learn
NLTK
NumPy
Pandas

DevOps & Tools

Docker
Linux
GitHub Actions
Alembic
Ruff
PostgreSQL
pgvector
Playwright
Redis

Featured Projects

A showcase of my work across different domains

Get In Touch

I'm always open to new opportunities and collaborations

Let's connect!

Whether you have a question, want to discuss a project, or just want to say hi, I'll try my best to get back to you!