Victor Salcedo

AI Infrastructure Engineer · LLM Inference · GPU Optimization · Distributed ML Systems · Founder, SalixLogic · San Diego, CA

About

I'm an AI and machine learning engineer focused on high-performance model inference, GPU efficiency, and the engineering systems behind modern large-scale AI. My work centers on designing and building infrastructure that improves model performance, reliability, and reproducibility.

Through SalixLogic, my AI engineering business, I build practical systems for LLM inference, model optimization, and scalable ML workflows. This includes work with PyTorch, vLLM, Triton, and Ray to develop optimized inference pipelines, benchmark GPU performance, and create structured tools for evaluation and experiment management.

Long term, I'm focused on AI systems engineering, inference optimization, and distributed ML infrastructure, with an emphasis on tools and workflows that accelerate research and make large-model deployment efficient.

Research & Engineering Focus

01 LLM Inference & Optimization
High-throughput inference pipelines with vLLM, Triton, and ONNX. Reducing latency and memory footprint for large-model deployment.

02 GPU & Edge Performance
Benchmarking GPU efficiency, quantization, model compression, and optimized deployment on embedded hardware including NVIDIA Jetson.

03 Distributed ML Systems
Scalable training and inference infrastructure with Ray and PyTorch. Reproducible workflows via CI/CD pipelines and automated testing.

04 Computer Vision & Deep Learning
Object detection (YOLOv8), CNN-LSTM sequence modeling, time-series forecasting, and reinforcement learning for autonomous systems.

Featured Project

Blind Spot Detection System — YOLOv8 + LISA Dataset

Developed a real-time blind spot detection model using YOLOv8 trained on the LISA traffic detection dataset. Engineered the full ML pipeline from data preprocessing and augmentation through training, validation, and performance benchmarking. Optimized for embedded edge deployment on NVIDIA Jetson hardware via ONNX conversion and quantization, achieving production-ready inference on resource-constrained devices.

YOLOv8 · LISA Dataset · ONNX · Quantization · NVIDIA Jetson · Edge Deployment · PyTorch · Real-Time Inference

Technical Skills

Inference Frameworks	PyTorch, vLLM, Triton, ONNX, TensorFlow, scikit-learn, YOLOv8
Distributed Systems	Ray, CI/CD Pipelines, Model Monitoring, Version Control, Automated Testing
Edge & MLOps	NVIDIA Jetson, Model Quantization, ONNX Conversion, Reproducible Workflows
Languages	Python, SQL, C++, C, Bash
Data & Visualization	Pandas, NumPy, Power BI, Plotly, Matplotlib, Seaborn, SQL Server

Connect

Education

M.S., Applied Artificial Intelligence · University of San Diego · 2026
B.S., Cognitive Science (Machine Learning & Neural Computation) · UC San Diego · 2024 · Minor: Mathematics

Contact

Business SalixLogic — AI Engineering & Consulting
Location San Diego, CA
For Consulting Inquiries
Phone (619) 504-3651

Graduate Coursework