AWS Previews Ability To Customize Anthropic's Claude 3 Haiku AI Model
Amazon Web Services (AWS) and Anthropic are developing a new capability that would let organizations use their proprietary data to customize Anthropic's Claude 3 Haiku model.
The capability, currently in preview, is available to Amazon Bedrock users in the US West (Oregon) AWS Region.
Haiku is the smallest and least expensive of generative AI firm Anthropic's Claude 3 line of AI models. Haiku and its more sophisticated counterparts, Opus and Sonnet, are available on Bedrock, Amazon's managed AI development platform that gives users access to pre-trained AI models via API.
The ability to fine-tune Haiku means developers can adapt the model to their specific business needs by training it on their own organization's data stored in an Amazon S3 bucket. This enables the model to return outputs that are more contextually relevant than a general-purpose model can produce.
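Training data for such a job is a JSON Lines file in which each line pairs a prompt with the desired completion in Claude's messages format. A minimal, illustrative record (the content here is invented for the example; in the actual file, each record occupies a single line):

```json
{"system": "Classify each support ticket as billing, shipping, or technical.", "messages": [{"role": "user", "content": "Ticket: My invoice total doesn't match my order confirmation."}, {"role": "assistant", "content": "billing"}]}
```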
In a detailed blog post last week, AWS explained the fine-tuning process this way:
During fine-tuning, the weights of the pre-trained Anthropic Claude 3 Haiku model will get updated to enhance its performance on a specific target task. Fine-tuning allows the model to adapt its knowledge to the task-specific data distribution and vocabulary. Hyperparameters like learning rate and batch size need to be tuned for optimal fine-tuning.
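In practice, such a job is created through Bedrock's model-customization API, which accepts the hyperparameters AWS mentions along with the S3 locations of the training data. A minimal boto3 sketch, in which the job name, custom model name, IAM role, S3 URIs, and hyperparameter values are all placeholder assumptions:

```python
import boto3

# Control-plane client; the fine-tuning preview runs in US West (Oregon).
bedrock = boto3.client("bedrock", region_name="us-west-2")

response = bedrock.create_model_customization_job(
    jobName="haiku-finetune-demo",              # placeholder job name
    customModelName="haiku-ticket-classifier",  # placeholder model name
    roleArn="arn:aws:iam::111122223333:role/BedrockFineTuneRole",  # placeholder IAM role
    # Confirm the exact customizable Haiku variant ID in the Bedrock console.
    baseModelIdentifier="anthropic.claude-3-haiku-20240307-v1:0",
    customizationType="FINE_TUNING",
    hyperParameters={  # values shown are illustrative, not recommendations
        "epochCount": "2",
        "batchSize": "4",
        "learningRateMultiplier": "1.0",
    },
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},  # placeholder bucket
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},        # placeholder bucket
)
print("Started job:", response["jobArn"])
```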
AWS proposed the following sample use cases for the capability:
- Classification: For example, when you have 10,000 labeled examples and want Anthropic Claude to do really well at this task
- Structured outputs: For example, when you need Anthropic Claude's response to always conform to a given structure
- Industry knowledge: For example, when you need to teach Anthropic Claude how to answer questions about your company or industry
- Tools and APIs: For example, when you need to teach Anthropic Claude how to use your APIs really well
For some tasks, a fine-tuned Haiku model can even be more useful to an organization than Opus or Sonnet, while remaining faster and less expensive.
"This process enhances task-specific model performance, allowing the model to handle custom use cases with task-specific performance metrics that meet or surpass more powerful models like Anthropic Claude 3 Sonnet or Anthropic Claude 3 Opus," said AWS. "As a result, businesses can achieve improved performance with reduced costs and latency."
Organizations may be wary of using their private data to customize an AI model, but Anthropic assured users in a separate blog post that "[p]roprietary training data remains within customers' AWS environment."
Neither AWS nor Anthropic indicated whether, or when, a fine-tuning capability will come to Anthropic's other Claude 3 models.
Currently, the fine-tuning capability is limited to text up to 32,000 tokens in length, though Anthropic said it plans to add "vision capabilities" at some point.