fix #20326 - [Feature]: Support TTL(1h) field in prompt caching for Bedrock Claude 4.5 models #20338
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile OverviewGreptile SummaryThis PR adds support for TTL (Time-To-Live) field in prompt caching for AWS Bedrock Claude 4.5 models, allowing Changes made:
Critical issues found:
Additional concerns:
Confidence Score: 1/5
|
| Filename | Overview |
|---|---|
| litellm/types/llms/bedrock.py | Added ttl field to CachePointBlock TypedDict to support TTL values in prompt caching |
| litellm/llms/bedrock/chat/converse_transformation.py | Modified _get_cache_point_block to extract and validate TTL values ("5m" or "1h") from cache_control before creating cache point blocks |
| litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py | Added Claude 4.5 detection method and modified TTL removal logic, but contains critical logic error that preserves TTL for all models instead of only Claude 4.5 |
Sequence Diagram
sequenceDiagram
participant Client
participant LiteLLM
participant ConverseAPI as Bedrock Converse API
participant InvokeAPI as Bedrock Invoke API
Note over Client,InvokeAPI: TTL Prompt Caching Flow
Client->>LiteLLM: messages with cache_control: {type: "ephemeral", ttl: "1h"}
alt Converse API Path
LiteLLM->>LiteLLM: _get_cache_point_block()
LiteLLM->>LiteLLM: Extract ttl from cache_control
alt ttl is "5m" or "1h"
LiteLLM->>LiteLLM: Add ttl to CachePointBlock
else other ttl value
LiteLLM->>LiteLLM: Create CachePointBlock without ttl
end
LiteLLM->>ConverseAPI: Request with cachePoint: {type: "default", ttl: "1h"}
ConverseAPI->>LiteLLM: Response with cache metrics
else Invoke API Path
LiteLLM->>LiteLLM: _remove_ttl_from_cache_control(model)
LiteLLM->>LiteLLM: _is_claude_4_5_on_bedrock(model)
alt Claude 4.5 model AND ttl is "5m" or "1h"
LiteLLM->>LiteLLM: Keep ttl in cache_control
Note over LiteLLM: BUG: Lines 156-157<br/>always keep "5m"/"1h"<br/>for ALL models
else Other model OR other ttl
LiteLLM->>LiteLLM: Remove ttl from cache_control
end
LiteLLM->>InvokeAPI: Request with modified cache_control
InvokeAPI->>LiteLLM: Response with cache metrics
end
LiteLLM->>Client: Return response with cache tokens
| if ttl in ["5m", "1h"]: | ||
| continue |
There was a problem hiding this comment.
Critical logic error: this code preserves TTL values of "5m" and "1h" for ALL models, not just Claude 4.5 models. This will break the existing test at tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py:108-113 which expects TTL to be removed when no model is passed.
The lines 148-149 already handle the Claude 4.5 case, so lines 156-157 should be deleted. Without them, the logic will be:
- If Claude 4.5 and TTL is "5m" or "1h" → keep it (line 148-149)
- Otherwise, if TTL exists → remove it (line 159)
| if ttl in ["5m", "1h"]: | |
| continue | |
| cache_control.pop("ttl", None) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
Line: 156:157
Comment:
Critical logic error: this code preserves TTL values of `"5m"` and `"1h"` for ALL models, not just Claude 4.5 models. This will break the existing test at `tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py:108-113` which expects TTL to be removed when no model is passed.
The lines 148-149 already handle the Claude 4.5 case, so lines 156-157 should be deleted. Without them, the logic will be:
- If Claude 4.5 and TTL is "5m" or "1h" → keep it (line 148-149)
- Otherwise, if TTL exists → remove it (line 159)
```suggestion
cache_control.pop("ttl", None)
```
How can I resolve this? If you propose a fix, please make it concise.
litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
Outdated
Show resolved
Hide resolved
| Remove `ttl` field from cache_control in messages. | ||
| Bedrock doesn't support the ttl field in cache_control. | ||
|
|
||
| Update: bedock supports `5m` and `1h` for Claude 4.5 models. |
There was a problem hiding this comment.
Typo: "bedock" should be "Bedrock"
| Update: bedock supports `5m` and `1h` for Claude 4.5 models. | |
| Update: Bedrock supports `5m` and `1h` for Claude 4.5 models. |
Prompt To Fix With AI
This is a comment left during a code review.
Path: litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
Line: 125:125
Comment:
Typo: "bedock" should be "Bedrock"
```suggestion
Update: Bedrock supports `5m` and `1h` for Claude 4.5 models.
```
How can I resolve this? If you propose a fix, please make it concise.| cache_point = CachePointBlock(type="default") | ||
| if isinstance(cache_control, dict) and "ttl" in cache_control: | ||
| ttl = cache_control["ttl"] | ||
| if ttl in ["5m", "1h"]: | ||
| cache_point["ttl"] = ttl |
There was a problem hiding this comment.
Verify whether the Converse API supports TTL for all Claude models or only Claude 4.5 models. The Invoke API implementation (in anthropic_claude3_transformation.py) restricts TTL support to Claude 4.5 models. If the Converse API has the same restriction, this code should also check the model version using a method similar to _is_claude_4_5_on_bedrock().
Prompt To Fix With AI
This is a comment left during a code review.
Path: litellm/llms/bedrock/chat/converse_transformation.py
Line: 957:961
Comment:
Verify whether the Converse API supports TTL for all Claude models or only Claude 4.5 models. The Invoke API implementation (in `anthropic_claude3_transformation.py`) restricts TTL support to Claude 4.5 models. If the Converse API has the same restriction, this code should also check the model version using a method similar to `_is_claude_4_5_on_bedrock()`.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
@Lucky-Lodhi2004 Please make fixes based on the reviews
There was a problem hiding this comment.
ok, i made the proposed changes, but i have a question, why do mcp tests always fail when i make a pull request? do failing mcp tests may cause my PR to be rejected (not merged)?
34e5bb2
into
BerriAI:litellm_oss_staging_02_04_2026
Relevant issues
fixes issue #20326
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unitCI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes
added ttl support in claude 4.5 models.