Senior Application Engineer – Lightbits Inferra

Lightbits is the pioneer of software-defined storage (SDS) over NVMe/TCP. As the AI landscape
shifts from training to massive-scale, multi-agent inference, the traditional memory hierarchy is
breaking down. We are launching our next-generation AI inference KV Cache.

We are seeking a hands-on, customer-facing Senior Application Engineer to help customers
deploy, validate, and operate Lightbits Inferra for AI inference workloads. In this role, you will work
directly with customers on POVs, production deployments, performance optimization,
troubleshooting, and operational readiness. You will also collaborate closely with Product and
Engineering to improve deployment automation, documentation, and overall customer experience.

This position is located in Israel.

Responsibilities

  • Lead technical customer engagements from discovery through production deployment.
  • Design, deploy, and support Kubernetes-based AI inference environments, including GPU and KV Cache infrastructure.
  • Validate AI inference workloads, troubleshoot performance bottlenecks, and optimize system configurations.
  • Own customer POVs, including success criteria, test plans, issue resolution, and production handoff.
  • Diagnose complex issues across Linux, Kubernetes, networking, GPU infrastructure, and application layers.
  • Partner with Product and Engineering teams to drive product improvements based on customer feedback.
  • Develop automation, tools, scripts, and technical documentation to improve deployment efficiency and repeatability.
  • Create customer-facing technical content, including deployment guides, runbooks, best practices, and troubleshooting documentation.

Qualifications

  • Experience leading AI, HPC, Kubernetes, or infrastructure POCs into production environments.
  • Strong Linux and Kubernetes administration, troubleshooting, and operational experience.
  • Understanding of AI inference infrastructure, GPU-based deployments, and performance optimization.
  • Ability to troubleshoot distributed systems using logs, metrics, dashboards, and observability tools.
  • Fluency with AI tools, including Claude, Copilot, and others.
  • Excellent written and verbal communication skills with customer-facing experience.
  • Ability to manage technical discussions, define success criteria, and communicate risks and tradeoffs effectively.
  • Experience with NVIDIA Triton, vLLM, TensorRT-LLM, SGLang, KServe, Ray Serve, or similar inference platforms.
  • Familiarity with GPU infrastructure, CUDA, RDMA, NCCL, and high-performance networking.
  • Experience with Kubernetes Operators, Helm, and observability platforms.
  • Background in Solutions Engineering, Field Engineering, SRE, Platform Engineering, or Customer Success Engineering.
  • Experience creating technical blogs, tutorials, reference architectures, or customer enablement content.

Fill out the form to apply.