🚨 Severity: HIGH Vulnerability: SSRF in load_web_page tool#4368
🚨 Severity: HIGH Vulnerability: SSRF in load_web_page tool#4368Nandan-D14 wants to merge 2 commits intogoogle:mainfrom
Conversation
🚨 Severity: HIGH 💡 Vulnerability: SSRF in load_web_page tool. The tool allowed fetching arbitrary URLs including internal IPs (localhost, 192.168.x.x) and lacked timeout protection. 🎯 Impact: Attackers could access internal services, metadata services, or cause DoS. 🔧 Fix: Added `_is_safe_url` validation to check scheme, block private IPs (direct and resolved), and added a 10s timeout. ✅ Verification: Added `tests/unittests/tools/test_load_web_page.py` verifying that private IPs are blocked and public URLs are allowed. Co-authored-by: Nandan-D14 <154668653+Nandan-D14@users.noreply.github.com>
…-675239147882500237 🛡️ Sentinel: [HIGH] Fix SSRF in load_web_page
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello @Nandan-D14, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers a critical security enhancement by patching a high-severity Server-Side Request Forgery (SSRF) vulnerability within the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Response from ADK Triaging Agent Hello @Nandan-D14, thank you for creating this PR! Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). You can visit https://cla.developers.google.com/ to see your current agreements or to sign a new one. This information will help reviewers to review your PR more efficiently. Thanks! |
There was a problem hiding this comment.
Code Review
This pull request effectively addresses a high-severity Server-Side Request Forgery (SSRF) vulnerability in the load_web_page tool. The fix is comprehensive, adding checks for URL schemes, direct private IP addresses, and DNS-resolved private IPs, along with a request timeout to mitigate DoS risks. The new unit tests are a great addition and validate the core security fix. My feedback includes a minor refactoring suggestion to improve code clarity and a recommendation for an additional test case to enhance coverage. The other changes are minor stylistic improvements.
| def _is_safe_url(url: str) -> bool: | ||
| """Checks if the url is safe to browse.""" | ||
| try: | ||
| parsed = urlparse(url) | ||
| except ValueError: | ||
| return False | ||
|
|
||
| if parsed.scheme not in ('http', 'https'): | ||
| return False | ||
|
|
||
| hostname = parsed.hostname | ||
| if not hostname: | ||
| return False | ||
|
|
||
| try: | ||
| # Check if the hostname is an IP address | ||
| ip = ipaddress.ip_address(hostname) | ||
| if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_unspecified: | ||
| return False | ||
| except ValueError: | ||
| # Check if the hostname resolves to a private IP address | ||
| try: | ||
| addr_info = socket.getaddrinfo(hostname, None) | ||
| for res in addr_info: | ||
| ip_str = res[4][0] | ||
| ip = ipaddress.ip_address(ip_str) | ||
| if ( | ||
| ip.is_private | ||
| or ip.is_loopback | ||
| or ip.is_link_local | ||
| or ip.is_unspecified | ||
| ): | ||
| return False | ||
| except socket.gaierror: | ||
| return False | ||
|
|
||
| return True |
There was a problem hiding this comment.
To improve readability and reduce code duplication, the IP address validation logic is repeated. You can extract this check into a nested helper function.
def _is_safe_url(url: str) -> bool:
"""Checks if the url is safe to browse."""
def _is_restricted_ip(ip: ipaddress.ip_address) -> bool:
"""Checks if an IP address is in a restricted range."""
return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_unspecified
try:
parsed = urlparse(url)
except ValueError:
return False
if parsed.scheme not in ('http', 'https'):
return False
hostname = parsed.hostname
if not hostname:
return False
try:
# Check if the hostname is an IP address
ip = ipaddress.ip_address(hostname)
if _is_restricted_ip(ip):
return False
except ValueError:
# Check if the hostname resolves to a private IP address
try:
addr_info = socket.getaddrinfo(hostname, None)
for res in addr_info:
ip_str = res[4][0]
ip = ipaddress.ip_address(ip_str)
if _is_restricted_ip(ip):
return False
except socket.gaierror:
return False
return True| # Verify requests.get was called | ||
| mock_get.assert_called_with(url, allow_redirects=False, timeout=10) | ||
| self.assertIn('Some Public Content', result) | ||
|
|
There was a problem hiding this comment.
The new tests are great for verifying the SSRF fix. To make the test suite even more robust, consider adding a test case that explicitly checks for a domain name that resolves to a private IP address. This will ensure the DNS resolution check is working as expected.
| @patch('requests.get') | |
| @patch('socket.getaddrinfo') | |
| def test_load_web_page_blocks_domain_resolving_to_private_ip(self, mock_getaddrinfo, mock_get): | |
| """Verifies that load_web_page blocks fetches to domains resolving to private IPs.""" | |
| # Mock socket resolution to a private IP | |
| mock_getaddrinfo.return_value = [(2, 1, 6, '', ('192.168.1.1', 80))] | |
| url = 'http://internal.service.local' | |
| # Call the function | |
| result = load_web_page(url) | |
| # Verify requests.get was NOT called | |
| mock_get.assert_not_called() | |
| self.assertIn('The url is restricted', result) | |
🚨 Severity: HIGH
💡 Vulnerability: SSRF in load_web_page tool. The tool allowed fetching arbitrary URLs including internal IPs (localhost, 192.168.x.x) and lacked timeout protection.
🎯 Impact: Attackers could access internal services, metadata services, or cause DoS.
🔧 Fix: Added
_is_safe_urlvalidation to check scheme, block private IPs (direct and resolved), and added a 10s timeout.✅ Verification: Added
tests/unittests/tools/test_load_web_page.pyverifying that private IPs are blocked and public URLs are allowed.