AI & ML interests

None defined yet.

Recent Activity

aws-neuron's activity

jeffboudierย 
posted an update 9 days ago
view post
Post
2075
Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems ๐Ÿ‘‰ dell.huggingface.co
jeffboudierย 
posted an update 12 days ago
view post
Post
1488
Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 free usage per org member (e.g. an Enterprise org with 1,000 members share $2,000 free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing
  • 2 replies
ยท
pagezyhfย 
posted an update 2 months ago
view post
Post
1739
We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give a read. It is a running document to showcase how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, Sagemaker or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models, DeepSeek-R1 support is coming soon!! I'll keep you posted.

Cheers
  • 1 reply
ยท
pagezyhfย 
posted an update 3 months ago
jeffboudierย 
posted an update 3 months ago
view post
Post
742
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
  • 1 reply
ยท
pagezyhfย 
posted an update 4 months ago
pagezyhfย 
posted an update 4 months ago
view post
Post
982
Itโ€™s 2nd of December , hereโ€™s your Cyber Monday present ๐ŸŽ !

Weโ€™re cutting our price down on Hugging Face Inference Endpoints and Spaces!

Our folks at Google Cloud are treating us with a 40% price cut on GCP Nvidia A100 GPUs for the next 3๏ธโƒฃ months. We have other reductions on all instances ranging from 20 to 50%.

Sounds like the time to give Inference Endpoints a try? Get started today and find in our documentation the full pricing details.
https://ui.endpoints.huggingface.co/
https://huggingface.co/pricing
pagezyhfย 
posted an update 5 months ago
view post
Post
310
Hello Hugging Face Community,

if you use Google Kubernetes Engine to host you ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs, in less than 10 minutes! Thank you @wietse-venema-demo !

To watch in this order:
1. Learn what are Hugging Face Deep Learning Containers
https://youtu.be/aWMp_hUUa0c?si=t-LPRkRNfD3DDNfr

2. Learn how to deploy a LLM with our Deep Learning Container using Text Generation Inference
https://youtu.be/Q3oyTOU1TMc?si=V6Dv-U1jt1SR97fj

3. Learn how to scale your inference endpoint based on traffic
https://youtu.be/QjLZ5eteDds?si=nDIAirh1r6h2dQMD

If you want more of these small tutorials and have any theme in mind, let me know!
jeffboudierย 
posted an update 5 months ago
pagezyhfย 
posted an update 5 months ago
view post
Post
1373
Hello Hugging Face Community,

I'd like to share here a bit more about our Deep Learning Containers (DLCs) we built with Google Cloud, to transform the way you build AI with open models on this platform!

With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:

โšก Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
๐Ÿ› ๏ธ Hassle-free environment setup, no more dependency issues.
๐Ÿ”„ Seamless updates to the latest stable versions.
๐Ÿ’ผ Streamlined workflow, reducing dev and maintenance overheads.
๐Ÿ”’ Robust security features of Google Cloud.
โ˜๏ธ Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
๐Ÿ“ฆ Community examples for easy experimentation and implementation.
๐Ÿ”œ TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!

Find the documentation at https://huggingface.co/docs/google-cloud/en/index
If you need support, open a conversation on the forum: https://discuss.huggingface.co/c/google-cloud/69
jeffboudierย 
posted an update 6 months ago
jeffboudierย 
posted an update 7 months ago
view post
Post
470
Inference Endpoints got a bunch of cool updates yesterday, this is my top 3
jeffboudierย 
posted an update 7 months ago
view post
Post
4086
Pro Tip - if you're a Firefox user, you can set up Hugging Chat as integrated AI Assistant, with contextual links to summarize or simplify any text - handy!

In this short video I show how to set it up
ยท
jeffboudierย 
posted an update 12 months ago
jeffboudierย 
posted an update about 1 year ago