@yungyu16

English Version

Describe what this PR does / why we need it

This PR hardens the token server's request handling and adds maximum frame length validation to prevent excessive memory allocation. When a company's security team port-scans with nmap, malformed packets may reach the token server. Such packets can contain abnormally large length fields, which cause the server to allocate extremely large byte arrays, leading to excessive memory consumption and Full GC.

Phenomenon

img

Even with maxFrameLength capped at 1024, decoding a ping packet creates a 16M temporary array, which puts pressure on memory.

Reproduction Steps

  1. Start com.alibaba.csp.sentinel.demo.cluster.ClusterServerDemo locally.
  2. Install the nmap command-line tool locally: brew install nmap
  3. Run the nmap scan locally: nmap -oX - 127.0.0.1 -p 11111 -T4 -sT -sV -Pn -n --host-timeout 300000ms --max-retries 1 --min-parallelism 16 --max-scan-delay 5s
img_1
  4. Set a breakpoint in com.alibaba.csp.sentinel.cluster.server.codec.data.PingRequestDataDecoder.decode to see the oversized decoded length.

Principle

According to nmap's service probe database (nmap-service-probes), the token server mis-decodes probe packets of type DNSVersionBindReqTCP as ping packets.
https://raw.githubusercontent.com/nmap/nmap/refs/heads/master/nmap-service-probes

img_2
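
To make the mis-decoding concrete, here is a minimal sketch that walks the probe bytes through a simplified frame layout. Everything about the layout is an assumption on my part, not taken from this PR: a 2-byte length prefix, then a 4-byte request id, a 1-byte message type with 0 meaning ping, and a 4-byte payload length. The probe bytes are transcribed by hand from the DNSVersionBindReqTCP entry in nmap-service-probes, and the class and method names are hypothetical. It uses java.nio.ByteBuffer as a stand-in for Netty's ByteBuf.

```java
import java.nio.ByteBuffer;

class ProbeMisdecodeDemo {
    // DNSVersionBindReqTCP payload, transcribed by hand from nmap-service-probes (assumption).
    static final byte[] PROBE = {
        0x00, 0x1E,                          // TCP DNS length prefix: 30 bytes follow
        0x00, 0x06, 0x01, 0x00,              // DNS id + flags; read as the request id under the assumed layout
        0x00,                                // high byte of QDCOUNT; read as message type 0 (assumed ping)
        0x01, 0x00, 0x00, 0x00,              // read as the ping payload length
        0x00, 0x00, 0x00,                    // rest of the DNS header
        0x07, 'v', 'e', 'r', 's', 'i', 'o', 'n',
        0x04, 'b', 'i', 'n', 'd',
        0x00, 0x00, 0x10, 0x00, 0x03         // name terminator, QTYPE, QCLASS
    };

    /** Returns the length a naive ping decoder would read, or -1 if the frame is not a ping under the assumed layout. */
    static int claimedPingLength(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);      // big-endian by default
        int frameLength = buf.getShort() & 0xFFFF;   // 2-byte length prefix
        if (frameLength > buf.remaining()) {
            return -1;
        }
        buf.getInt();                                // request id, ignored here
        byte type = buf.get();
        if (type != 0) {
            return -1;
        }
        return buf.getInt();                         // the length field that sizes the byte array
    }

    public static void main(String[] args) {
        System.out.println(claimedPingLength(PROBE)); // prints 16777216
    }
}
```

Under these assumptions the frame itself is only 30 bytes, so it passes a 1024-byte frame limit, yet the inner length field reads as 16777216, which matches the 16M array observed above.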

Does this pull request fix one issue?

Fixes the issue where malformed packets with abnormally large length fields could cause excessive memory allocation and Full GC in the token server.

Describe how you did it

  1. Added a constant NETTY_MAX_FRAME_LENGTH with a value of 1024 in ServerConstants.java to define the maximum frame length allowed.
  2. Modified NettyTransportServer.java to use the NETTY_MAX_FRAME_LENGTH constant in the LengthFieldBasedFrameDecoder instead of a hardcoded value.
  3. Enhanced ParamFlowRequestDataDecoder.java to validate the string parameter length against the maximum frame length and throw an exception if it exceeds the limit.
  4. Improved PingRequestDataDecoder.java to detect and log abnormal packets that may be port scanning attempts.
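
The validation described in items 3 and 4 can be sketched roughly as follows. This is not the code from the PR: it uses java.nio.ByteBuffer instead of Netty's ByteBuf so that it runs without dependencies, and the class name, method names, and log wording are hypothetical. Only the constant name NETTY_MAX_FRAME_LENGTH and its value 1024 come from the description above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BoundedDecodeSketch {
    // Name and value as given in the PR description (defined in ServerConstants.java there).
    static final int NETTY_MAX_FRAME_LENGTH = 1024;

    /** Ping-style decode: returns the decoded string, or null when the packet looks abnormal. */
    static String decodePing(ByteBuffer source) {
        if (source.remaining() < 4) {
            return null;
        }
        int length = source.getInt();
        // Check the claimed length before allocating anything.
        if (length <= 0 || length > NETTY_MAX_FRAME_LENGTH || length > source.remaining()) {
            System.err.println("Abnormal ping length " + length + ", possibly a port scan");
            return null;
        }
        byte[] bytes = new byte[length];
        source.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Param-style decode: rejects an invalid string length with an exception. */
    static String decodeStringParam(ByteBuffer source) {
        if (source.remaining() < 4) {
            throw new IllegalArgumentException("Missing string param length");
        }
        int length = source.getInt();
        if (length < 0 || length > NETTY_MAX_FRAME_LENGTH || length > source.remaining()) {
            throw new IllegalArgumentException("Invalid string param length: " + length);
        }
        byte[] bytes = new byte[length];
        source.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Checking the length against the bytes actually remaining, in addition to the frame limit, means a well-formed frame can never trigger an allocation larger than the frame itself.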

Describe how to verify it

  1. Start the Sentinel cluster token server
  2. Send a normal request to verify that the server functions properly
  3. Send a malformed packet with an abnormally large length field to simulate a port scanning attempt
  4. Verify that the server logs a warning message and handles the packet gracefully without creating large byte arrays
  5. Check that requests with string parameters exceeding the maximum frame length are rejected with an appropriate exception
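
For step 3, a small client along these lines could stand in for the nmap scan. It is only a sketch: the frame layout (2-byte length prefix, 4-byte request id, 1-byte type with 0 as ping, 4-byte payload length) is my assumption about the wire format, and the class and method names are hypothetical. The port 11111 is the one used in the nmap command above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;

class MalformedPingSketch {
    /** Builds a small frame whose inner length field claims far more data than is present. */
    static byte[] buildPingFrame(int requestId, int claimedLength) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 4 + 1 + 4);
        buf.putShort((short) (4 + 1 + 4)); // length prefix: bytes that follow (assumed layout)
        buf.putInt(requestId);
        buf.put((byte) 0);                 // assumed ping message type
        buf.putInt(claimedLength);         // the oversized length field
        return buf.array();
    }

    /** Writes the frame to the token server and closes the connection. */
    static void send(String host, int port, byte[] frame) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(frame);
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires a token server listening locally; claims a 16M payload in an 11-byte packet.
        send("127.0.0.1", 11111, buildPingFrame(1, 16 * 1024 * 1024));
    }
}
```

After sending, the server log and heap usage can be checked as described in step 4.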

Special notes for reviews

This fix addresses a potential security and performance issue where malformed packets could cause excessive memory allocation. The solution introduces proper validation and limits on packet sizes to prevent denial-of-service scenarios.

@CLAassistant

CLAassistant commented Nov 11, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


yungyu16 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@yungyu16
Author

This issue could be exploited maliciously and affect the availability of microservice processes.

@yungyu16 yungyu16 force-pushed the 1.8 branch 2 times, most recently from c7758bf to 8a14d79 Compare November 11, 2025 05:30
@yungyu16
Author

image Why has the bot not updated the status even though I have already signed the CLA?
