
NVIDIA NIM on Hugging Face Inference Endpoints: Simplified AI Deployment

Curiosity: How can we simplify generative AI model deployment? What happens when NVIDIA’s inference services meet Hugging Face’s platform?

At COMPUTEX, Jensen Huang announced NVIDIA NIM on Hugging Face Inference Endpoints. NVIDIA NIM is a set of inference microservices designed to streamline and accelerate the deployment of generative AI models.

Learn More: https://developer.nvidia.com/blog/nvidia-collaborates-with-hugging-face-to-simplify-generative-ai-model-deployments

Key Features

Retrieve: NVIDIA NIM simplifies AI model deployment.

| Feature | Description | Benefit |
| --- | --- | --- |
| 1-Click Deployment | From the Hugging Face Hub | ⬆️ Ease of use |
| High Performance | Up to 9000 tokens/sec | ⬆️ Speed |
| Cloud Support | AWS, GCP | ⬆️ Flexibility |
| Model Variety | Multiple models supported | ⬆️ Options |

Available Models

Retrieve: Current and upcoming model support.

Currently Available:

  • Llama 3 8B
  • Llama 3 70B

Coming Soon:

  • Mixtral 8x22B
  • Phi-3
  • Gemma
  • More models
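Once one of these models is deployed, the resulting endpoint can be queried like any dedicated Inference Endpoint. Here is a minimal sketch using `huggingface_hub`'s `InferenceClient`; the endpoint URL and token are placeholders you would replace with your own deployment's values.

```python
from huggingface_hub import InferenceClient

# Placeholder endpoint URL and token -- substitute the values from your own
# deployed Inference Endpoint.
client = InferenceClient(
    model="https://<your-endpoint>.endpoints.huggingface.cloud",
    token="hf_...",
)

# Simple text-generation request against the deployed Llama 3 8B endpoint.
output = client.text_generation(
    "Explain NVIDIA NIM in one sentence.",
    max_new_tokens=100,
)
print(output)
```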

Performance

Innovate: Impressive inference speeds.

Llama 3 8B:

  • Up to 9000 tokens/sec
  • High throughput
  • Low latency
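For a rough client-side sanity check of generation speed, you can time a single request, as in the sketch below. Keep in mind that the headline 9000 tokens/sec figure refers to aggregate server-side throughput under load; a single sequential request measured from the client, including network overhead, will come in much lower.

```python
import time

from huggingface_hub import InferenceClient

client = InferenceClient(
    model="https://<your-endpoint>.endpoints.huggingface.cloud",  # placeholder URL
    token="hf_...",
)

n_tokens = 256  # number of new tokens to request
start = time.perf_counter()
client.text_generation("Write a short story about GPUs.", max_new_tokens=n_tokens)
elapsed = time.perf_counter() - start

# Client-side estimate only: assumes the full n_tokens were generated (the
# model may stop early at an EOS token) and includes network round-trip time.
print(f"~{n_tokens / elapsed:.0f} tokens/sec ({elapsed:.2f}s total)")
```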

Deployment Workflow

Retrieve: Simple deployment process.

```mermaid
graph LR
    A[Hugging Face Hub] --> B[1-Click Deploy]
    B --> C[NVIDIA NIM]
    C --> D[Inference Endpoint]
    D --> E[Production Ready]

    F[AWS/GCP] --> C

    style A fill:#e1f5ff
    style B fill:#fff3cd
    style E fill:#d4edda
```

Steps:

  1. Select model from Hugging Face Hub
  2. One-click deployment
  3. Automatic setup on AWS/GCP
  4. Production-ready inference endpoint
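The announced flow is a 1-click deploy in the Hugging Face web UI, but `huggingface_hub` also exposes a programmatic path through `create_inference_endpoint`. The sketch below shows that route; the endpoint name, region, and instance size/type are illustrative assumptions, and the NIM-specific container options offered in the catalog may differ from what a plain call configures.

```python
from huggingface_hub import create_inference_endpoint

# Illustrative values throughout: check the Inference Endpoints catalog for
# the configurations actually offered with NVIDIA NIM.
endpoint = create_inference_endpoint(
    name="llama-3-8b-nim-demo",                        # hypothetical name
    repository="meta-llama/Meta-Llama-3-8B-Instruct",  # gated repo; requires access
    framework="pytorch",
    task="text-generation",
    accelerator="gpu",
    vendor="aws",              # or "gcp"
    region="us-east-1",        # illustrative region
    instance_size="x1",        # illustrative size/type values
    instance_type="nvidia-a100",
    type="protected",
)

# Block until provisioning finishes, then print the endpoint URL.
endpoint.wait()
print(endpoint.url)
```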

Benefits

Innovate: Why NVIDIA NIM matters.

For Developers:

  • ✅ Simplified deployment
  • ✅ High performance
  • ✅ Cloud flexibility
  • ✅ Easy scaling

For Organizations:

  • ✅ Faster time to production
  • ✅ Reduced infrastructure complexity
  • ✅ Cost-effective scaling
  • ✅ Enterprise-ready

Key Takeaways

Retrieve: NVIDIA NIM on Hugging Face Inference Endpoints enables 1-click deployment of generative AI models with high performance (up to 9000 tokens/sec) on AWS and GCP.

Innovate: By combining NVIDIA’s inference optimization with Hugging Face’s platform, developers can deploy production-ready AI models in minutes instead of days, accelerating AI adoption.

Curiosity → Retrieve → Innovate: Start with curiosity about simplified AI deployment, retrieve insights from NVIDIA NIM’s capabilities, and innovate by deploying your models with unprecedented ease and performance.

Next Steps:

  • Explore Hugging Face Inference Endpoints
  • Try NVIDIA NIM deployment
  • Test performance
  • Deploy your models
Original Announcement Post

Yesterday at COMPUTEX, Jensen Huang announced the release of NVIDIA NIM on Hugging Face Inference Endpoints!

🚀 NVIDIA NIM is a set of inference microservices designed to streamline and accelerate the deployment of generative AI models. 👀

  • 1️⃣ 1-click deployment from the Hugging Face Hub to Inference Endpoints
  • 🆕 Starting with Llama 3 8B and Llama 3 70B on AWS and GCP
  • 🚀 Up to 9000 tokens/sec (Llama 3 8B)
  • 🔜 Mixtral 8x22B, Phi-3, Gemma, and more coming soon

Learn more: https://developer.nvidia.com/blog/nvidia-collaborates-with-hugging-face-to-simplify-generative-ai-model-deployments
