

Efficiently fine-tune large models with minimal parameters
PEFT provides state-of-the-art parameter-efficient fine-tuning methods that cut compute and storage costs while delivering performance comparable to full fine-tuning, and it integrates with Transformers, Diffusers, and Accelerate.

PEFT (Parameter-Efficient Fine-Tuning) provides a suite of methods, such as LoRA, IA³, and soft prompts, that adapt large pretrained models by training only a tiny fraction of their parameters, often under 0.2% of the total. This dramatically reduces GPU memory and checkpoint storage requirements while delivering accuracy comparable to full fine-tuning.
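To make that concrete, here is a minimal sketch, not taken verbatim from the PEFT documentation and using an arbitrarily chosen small base model: each method is exposed as its own config class, and get_peft_model freezes the base weights and injects the small set of trainable parameters.

```python
# Minimal sketch; the base model ("bigscience/bloomz-560m") is an illustrative choice.
from transformers import AutoModelForCausalLM
from peft import (LoraConfig, IA3Config, PromptTuningConfig,
                  TaskType, get_peft_model)

base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

# Different PEFT methods share one workflow; only the config class changes.
lora = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16, lora_dropout=0.05)
ia3 = IA3Config(task_type=TaskType.CAUSAL_LM)               # IA³: learned scaling vectors
prompts = PromptTuningConfig(task_type=TaskType.CAUSAL_LM,  # soft prompts: train only a
                             num_virtual_tokens=20)         # handful of virtual tokens

model = get_peft_model(base, lora)  # freezes the base model, adds low-rank LoRA matrices
model.print_trainable_parameters()  # reports the trainable count, a small fraction of the total
```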
The library plugs directly into the Hugging Face ecosystem: wrap a base model with get_peft_model, train using the standard Trainer or Accelerate for distributed workloads, and save adapters that are only a few megabytes in size. PEFT adapters can be combined with quantization and CPU offloading to run on consumer‑grade hardware, making it practical to fine‑tune 12B‑parameter models on a single A100 or even a 16 GB GPU.
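The end-to-end flow can be sketched as follows; the base model, dataset, and output paths here are placeholders chosen for illustration, not anything prescribed by the library. The wrapped model behaves like a regular Transformers model, so the standard Trainer works unchanged, and save_pretrained writes only the adapter weights.

```python
# Illustrative sketch: "roberta-base", the IMDB split, and the output paths are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model, PeftModel

base_id = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForSequenceClassification.from_pretrained(base_id, num_labels=2)

# Wrap the base model so only the LoRA parameters (and the classifier head) are trained.
model = get_peft_model(base, LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16))

# Tiny tokenized dataset just to make the example runnable end to end.
dataset = load_dataset("imdb", split="train[:1%]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()
model.save_pretrained("lora-out/adapter")  # saves only the adapter, a few megabytes

# Later: reattach the saved adapter to a fresh copy of the base model for inference.
restored = PeftModel.from_pretrained(
    AutoModelForSequenceClassification.from_pretrained(base_id, num_labels=2),
    "lora-out/adapter",
)
```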
When teams consider PEFT, these hosted platforms usually appear on the same shortlist; they are the services engineering teams typically benchmark against before choosing open source.

Amazon SageMaker JumpStart: ML hub with curated foundation models, pretrained algorithms, and solution templates you can deploy and fine-tune in SageMaker

Cohere: Enterprise AI platform providing LLMs (Command, Aya) plus Embed/Rerank for retrieval

API-first platform to run, fine-tune, and deploy AI models without managing infrastructure
Representative use cases for PEFT include:
Sentiment analysis with a 12B LLM on a single A100: achieves near-full-model accuracy while using under 10 GB of GPU memory.
Multilingual ASR fine-tuning of Whisper large with LoRA and 8-bit quantization: enables real-time transcription on a 16 GB GPU.
SaaS product serving multiple tasks from one base model: reduces per-task storage to a few megabytes via adapters (see the sketch after this list).
Instruction-template experimentation on T0-3B using LoRA: improves accuracy with minimal compute overhead.
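For the multi-task serving case above, a hypothetical sketch (the adapter names and paths are invented for illustration) shows how one frozen base model can stay in memory while per-task adapters of a few megabytes are loaded and switched at request time.

```python
# Hypothetical sketch: the adapter directories below are invented for illustration.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

# Attach the first per-task adapter and give it a name.
model = PeftModel.from_pretrained(base, "adapters/summarization",
                                  adapter_name="summarization")
# Load further per-task adapters into the same wrapped model.
model.load_adapter("adapters/sql-generation", adapter_name="sql-generation")

model.set_adapter("summarization")   # route a summarization request
# ... generate ...
model.set_adapter("sql-generation")  # switch tasks without reloading the base model
```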
What does PEFT stand for? Parameter-Efficient Fine-Tuning, a set of techniques that adapt large models by training only a small subset of added parameters.
How much of the model is trained? Often less than 0.2% of the total parameters, e.g., a few million trainable weights in a multi-billion-parameter model.
Can PEFT be combined with quantization? Yes, PEFT adapters can be used together with 8-bit or lower-precision quantization to further reduce memory and compute (see the sketch below).
Which libraries does PEFT integrate with? PEFT works natively with Transformers, Diffusers, and Accelerate for training and inference.
How large are the saved adapters? PEFT adapters are typically a few megabytes, whereas full model checkpoints can be several gigabytes.
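As a sketch of combining PEFT with quantization (assuming the bitsandbytes integration in Transformers is installed; the model choice is again illustrative), the base weights are loaded in 8-bit, prepared for k-bit training, and only the LoRA parameters are trained in higher precision.

```python
# Sketch assuming bitsandbytes is installed; "bigscience/bloomz-560m" is illustrative.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-560m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)  # casts norms/embeddings, enables input grads
model = get_peft_model(base, LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16))
model.print_trainable_parameters()            # only the adapter weights remain trainable
```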
Project at a glance
Active · Last synced 4 days ago