Efficiency deserves its own reward

The Power of Two

Fast-track your AI journey

The NVIDIA Deep Learning Inference Platform running on Google Cloud delivers the performance, efficiency, and responsiveness you need to power next-generation artificial intelligence (AI) products and services.

End-to-end scalability

Build your future on solid foundations

Operating within a unified architecture, this platform enables neural networks from every deep learning framework to be trained, optimized, and deployed for real-time inference.

The result is an end-to-end, fully scalable deep learning platform that unlocks the power of NVIDIA Tensor Core GPUs and delivers high throughput while minimizing latency.

Exponential performance

Explore further, deeper, longer

With NVIDIA TensorRT™, a high-performance inference platform, you can optimize neural network models, calibrate them for lower precision while maintaining high accuracy, and deploy them to Google Cloud, resulting in:

  • 40X higher throughput
  • Up to 50X faster inference than CPUs
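
To make "calibrate for lower precision" concrete, here is a minimal sketch of the general idea behind INT8 calibration. It is plain Python, not TensorRT's API or its actual calibration algorithm. The function names, the sample values, and the simple max-based scaling rule are all assumptions chosen for illustration: a scale factor is derived from representative data, values are rounded to 8-bit integers, and the round-trip error shows how much accuracy is given up.

```python
# Illustrative only: symmetric INT8 quantization with max calibration.
# All names here are hypothetical; this is not TensorRT code.

def calibrate_scale(samples, num_bits=8):
    """Pick a scale so the largest observed magnitude maps to the top integer."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Round each value to the nearest integer step and clamp to the INT8 range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """Map integers back to approximate floating-point values."""
    return [q * scale for q in qvalues]

# Made-up "activation" values standing in for a calibration dataset.
calibration_activations = [-1.27, -0.5, 0.0, 0.25, 0.8, 1.27]

scale = calibrate_scale(calibration_activations)
q = quantize(calibration_activations, scale)
restored = dequantize(q, scale)

# Worst-case round-trip error; for in-range values it is at most half a step.
max_error = max(abs(a - b) for a, b in zip(calibration_activations, restored))
print(q, max_error)
```

The trade-off the sketch shows is the one calibration manages: fewer bits per value means cheaper arithmetic, and a scale chosen from representative data keeps the rounding error small.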

Dramatic acceleration

Get the right balance of cost and capability

With NVIDIA GPUs tightly integrated with Google Cloud and AI Platform, you can run trained models for inference more efficiently, with higher throughput and lower latency for more responsive user experiences:
  • Accelerate inference for all frameworks and networks using NVIDIA TensorRT
  • Integrate TensorRT Inference Server (TRTIS) to maximize GPU utilization
  • Work with Google Cloud AI Platform, Google Kubernetes Engine (GKE), and other Google Cloud tools

Your inference platform

A fusion of distinct elements and business benefits

Featuring NVIDIA T4, NVIDIA V100, NVIDIA P4, and NVIDIA P100 GPUs, this is an inference platform that offers customers a host of key differentiators, including:

  • Direct acceleration of TensorFlow with NVIDIA TensorRT integration
  • NVIDIA CUDA-X Libraries that accelerate all parts of the end-to-end pipeline to speed up frameworks
  • NGC Software Hub to containerize all NVIDIA software offerings for deep learning

The Power of Two and you

Book your discovery session now.

Snap Inc.

Monetizing rapid platform growth

Advertising revenue is critical to the success of Snap – the company behind Snapchat – and NVIDIA GPUs on Google Cloud provide the fast, accurate, and cost-effective inference it needs to give advertisers access to the right targets.

Snap’s monetization algorithms have the single biggest impact to our advertisers and shareholders. NVIDIA T4 powered GPUs for inference on Google Cloud Platform will enable us to increase advertising efficacy while at the same time lower costs when compared to a CPU only implementation.

Nima Khajehnouri, Senior Director, Monetization at Snap Inc.

Unleash the Power of Two

Executive guide

Artificial intelligence from NVIDIA and Google Cloud: The business benefits

Technology showcase

Build and deploy AI with NVIDIA and Google Cloud: Technology showcase

Executive guide

High performance computing from NVIDIA and Google Cloud: The business benefits

How to guide

Accelerate your Deep Learning projects with VM Images: NVIDIA and Google Cloud
