Victor Salcedo
AI Infrastructure Engineer · LLM Inference · GPU Optimization · Distributed ML Systems · Founder, SalixLogic · San Diego, CA
I'm an AI and machine learning engineer focused on high-performance model inference, GPU efficiency, and the engineering systems behind modern large-scale AI. My work centers on designing and building infrastructure that improves model performance, reliability, and reproducibility.
Through SalixLogic, my AI engineering business, I build practical systems for LLM inference, model optimization, and scalable ML workflows. This includes work with PyTorch, vLLM, Triton, and Ray to develop optimized inference pipelines, benchmark GPU performance, and create structured tools for evaluation and experiment management.
My long-term focus is AI systems engineering, model inference optimization, and distributed ML infrastructure, with an emphasis on building tools and workflows that accelerate research and enable efficient large-model deployment.
Developed a real-time blind spot detection model using YOLOv8 trained on the LISA traffic detection dataset. Engineered the full ML pipeline from data preprocessing and augmentation through training, validation, and performance benchmarking. Optimized for embedded edge deployment on NVIDIA Jetson hardware via ONNX conversion and quantization, achieving production-ready inference on resource-constrained devices.

| Category | Tools |
| --- | --- |
| Inference Frameworks | PyTorch, vLLM, Triton, ONNX, TensorFlow, scikit-learn, YOLOv8 |
| Distributed Systems | Ray, CI/CD Pipelines, Model Monitoring, Version Control, Automated Testing |
| Edge & MLOps | NVIDIA Jetson, Model Quantization, ONNX Conversion, Reproducible Workflows |
| Languages | Python, SQL, C++, C, Bash |
| Data & Visualization | Pandas, NumPy, Power BI, Plotly, Matplotlib, Seaborn, SQL Server |