Copyright Notice: The materials presented below are for academic use only. Copyright and all rights therein are retained by the authors or by the respective copyright holders.
Programmable switches have revolutionized network operations by enabling the flexible customization of packet processing logic using languages like P4. However, changing the programs running on the switch requires disturbing traffic and suspending other unrelated programs. In this paper, we present P4runpro, which enables runtime data plane updates with dynamic resource allocation. The P4runpro data plane abstracts hardware resources and defines dynamically reconfigurable atomic operations that form packet processing logic. P4runpro provides runtime programming interfaces, called P4runpro primitives, for the operator to write high-level programs. We have designed the P4runpro compiler to automatically and consistently link P4runpro programs to the running data plane. We implement our prototype on a Tofino switch and write 15 example runtime programs using P4runpro to demonstrate its generality and expressiveness. Our evaluation results show that, compared to the state-of-the-art, P4runpro can respond within hundreds of milliseconds, achieve an average of 60% to 80% dynamic resource utilization, concurrently run ≈0.6K to ≈2.8K programs, and introduce lower overhead. Our case studies illustrate the benefit of runtime programming and show that P4runpro programs provide the same functionality as conventional P4 programs.
INFOCOM
AggDeliv: Aggregating Multiple Wireless Links for Efficient Mobile Live Video Delivery
Jinlong E, Lin He, Zongyi Zhao, Yachen Wang, Gonglong Chen, and Wei Chen
In Proceedings of the 43rd IEEE Conference on Computer Communications (INFOCOM) Vancouver, Canada, May 20–23, 2024
Mobile live-streaming applications with stringent latency and bandwidth requirements have gained tremendous attention in recent years. Faced with the bandwidth insufficiency and congestion instability of wireless uplinks, multi-access networking provides opportunities to achieve fast and robust connectivity. However, state-of-the-art multi-path transmission solutions lack adaptivity to the heterogeneous and dynamic nature of wireless networks. Meanwhile, the indispensable video coding and transformation introduce extra latency and make video delivery vulnerable to network throughput fluctuation. This paper presents AggDeliv, a framework that provides efficient and robust multi-path transmission for mobile live video delivery. The key idea is to relate multi-path packet scheduling to congestion control optimization over diverse wireless links and adapt it to the mobile video characteristics. This is achieved by probabilistic packet allocation based on links’ congestion windows, wireless-oriented delay- and loss-aware congestion control, as well as lightweight video frame coding and network-adaptive frame-packet transformation. Real-world evaluations demonstrate that our framework significantly outperforms the state-of-the-art solutions on aggregate goodput and streaming video bitrate.
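As a rough illustration of probabilistic packet allocation based on congestion windows, the sketch below sends each packet over a link chosen with probability proportional to that link's congestion window. It is a minimal sketch, not AggDeliv's actual scheduler: the link names, the window values, and the helper functions `allocation_probabilities` and `pick_link` are all assumptions made for the example.

```python
import random
from collections import Counter


def allocation_probabilities(cwnds):
    """Map each link to a probability proportional to its congestion window."""
    total = sum(cwnds.values())
    if total <= 0:
        raise ValueError("at least one link needs a positive congestion window")
    return {link: cwnd / total for link, cwnd in cwnds.items()}


def pick_link(cwnds, rng):
    """Choose the link for the next packet by weighted random sampling."""
    probs = allocation_probabilities(cwnds)
    links = list(probs)
    weights = [probs[link] for link in links]
    return rng.choices(links, weights=weights, k=1)[0]


# Hypothetical congestion windows (in packets) for two wireless uplinks.
cwnds = {"wifi": 30, "cellular": 10}
probs = allocation_probabilities(cwnds)

# Allocate 10,000 packets with a seeded generator so the run is repeatable.
rng = random.Random(0)
counts = Counter(pick_link(cwnds, rng) for _ in range(10_000))
print(probs, dict(counts))
```

With these assumed windows the Wi-Fi link is expected to carry about three quarters of the packets, and the split shifts automatically whenever a window grows or shrinks.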
ICNP
Luori: Active Probing and Evaluation of Internet-wide IPv6 Fully Responsive Prefixes
Daguo Cheng, Lin He#, Chentian Wei, Qilei Yin, Boran Jin, Zhaoan Wang, Xiaoteng Pan, Sixu Zhou, Ying Liu, Shenglin Zhang, Fuchao Tan, and Wenmao Liu
In Proceedings of the 32nd IEEE International Conference on Network Protocols (ICNP) Charleroi, Belgium, October 28-31, 2024
With the large-scale deployment and application of IPv6, IPv6 network measurements will become increasingly important. However, a special type of IPv6 prefix called the Fully Responsive Prefix (FRP), in which all addresses under the prefix respond to scans, is having a significant impact on IPv6 measurement campaigns. Obviously, there cannot be a real responder behind each of these addresses. To reveal the current status and impact of Internet-wide IPv6 FRPs, we propose, for the first time, an active probing method for Internet-wide IPv6 FRPs, Luori, which transforms the active probing process under the huge IPv6 prefix space (the potential range of prefix presence) into a dynamic search process in a tree based on reinforcement learning, achieving efficient probing of arbitrary routing prefixes. The evaluation results show that Luori found 31.7K largest FRPs in a single Internet-wide probing with 11M budget, covering 1.5×10^30 address space, which is 106× that of existing methods. More importantly, after six months of Internet-wide probing, we have found 516K largest FRPs, which cover 1.3×10^33 address space and 795 ASes, making it the largest publicly known FRP list. Based on this list, we screen out 20% of the addresses covered by FRPs from a well-known IPv6 active address dataset. We further analyze these FRPs and find that their distribution is extensive and their implementation methods are diverse, which can provide beneficial references for the practical application of FRPs. We also make this list publicly available and maintain it long-term for use and study by relevant researchers.
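The FRP definition above (every address under the prefix responds) suggests a simple sampling check, sketched below. This is not Luori's reinforcement-learning tree search; it only illustrates the definition. The `probe` callable is a hypothetical stand-in for a real scanner, and the prefixes come from the 2001:db8::/32 documentation range.

```python
import ipaddress
import random


def random_addresses(prefix, count, rng):
    """Draw `count` random addresses that fall inside the given IPv6 prefix."""
    net = ipaddress.IPv6Network(prefix)
    host_bits = 128 - net.prefixlen
    base = int(net.network_address)
    return [
        ipaddress.IPv6Address(base + (rng.getrandbits(host_bits) if host_bits else 0))
        for _ in range(count)
    ]


def looks_fully_responsive(prefix, probe, count=16, rng=None):
    """Treat a prefix as an FRP candidate if every sampled address responds."""
    rng = rng or random.Random()
    return all(probe(addr) for addr in random_addresses(prefix, count, rng))


# Hypothetical responder: everything under this /48 answers, nothing else does.
SIMULATED_FRP = ipaddress.IPv6Network("2001:db8:1::/48")


def stub_probe(addr):
    """Stand-in for a real probe; returns True when the address 'responds'."""
    return addr in SIMULATED_FRP


rng = random.Random(42)
inside = looks_fully_responsive("2001:db8:1::/48", stub_probe, rng=rng)
outside = looks_fully_responsive("2001:db8::/32", stub_probe, rng=rng)
print(inside, outside)
```

Random sampling can only suggest that a prefix is fully responsive; a larger `count` lowers the chance of mistaking a densely populated prefix for an FRP.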
ICNP
6Vision: Image-encoding-based IPv6 Target Generation in Few-seed Scenarios
Efficient global Internet scanning is crucial for network measurement and security analysis. While existing target generation algorithms demonstrate remarkable performance in large-scale detection, their efficiency notably diminishes in few-seed scenarios. This decline is primarily attributed to the intricate configuration rules and sampling bias of seed addresses. Moreover, instances where BGP prefixes have few seed addresses are widespread, constituting 63.65% of occurrences. We introduce 6Vision to tackle this challenge with a novel approach that encodes IPv6 addresses into images, facilitating comprehensive analysis of intricate configuration rules. Through feature stitching, 6Vision not only improves the learnable features but also amalgamates addresses associated with configuration patterns for enhanced learning. Moreover, it integrates an environmental feedback mechanism to refine model parameters based on identified active addresses, thereby alleviating the sampling bias inherent in seed addresses. As a result, 6Vision achieves high-accuracy detection even in few-seed scenarios. The HitRate of 6Vision is improved by 181% to 2,490% compared to existing algorithms, while its CoverNum is 1.18 to 11.20 times theirs. Additionally, 6Vision can function as a preliminary detection module for existing algorithms, yielding a conversion gain (CG) ranging from 242% to 2,081%. Ultimately, we achieve a conversion rate (CR) of 28.97% for few-seed scenarios. We enrich the IPv6 hitlist, not only enhancing current target generation algorithms for large-scale address detection in few-seed scenarios but also effectively supporting IPv6 network measurement and security analysis.
IPCCC
Overlooked Backdoors: Investigating 6to4 Tunnel Nodes and Their Exploitation in the Wild
As native IPv6 adoption increases, the use of 6to4 tunnels has declined, yet they remain a significant security concern in today’s Internet. This study investigates the real-world deployment of 6to4 tunnels, revealing their current scale, characteristics, and security implications. We identify open 6to4 relays in 216 countries and 13,114 autonomous systems, noting stable short-term counts but a long-term decline. We analyze the security of these nodes and find over 578k nodes vulnerable to address spoofing and packet injection. Additionally, we present several under-emphasized scenarios where open 6to4 nodes are abused, including leveraging services on 6to4 nodes as traffic amplifiers, circumventing restrictions using multiple 6to4 addresses, and connecting 6to4 nodes to render attacks untraceable.
2023
NDSS
Your Router is My Prober: Measuring IPv6 Networks via ICMP Rate Limiting Side Channels
Active Internet measurements face challenges when some measurements require many remote vantage points. In this paper, we propose a novel technique for measuring remote IPv6 networks via side channels in ICMP rate limiting, a required function for IPv6 nodes to limit the rate at which ICMP error messages are generated. This technique, iVantage, can to some extent use 1.1M remote routers distributed in 9.5k autonomous systems and 182 countries as our “vantage points”. We apply iVantage to two different, but equally challenging, measurement tasks: 1) measuring the deployment of inbound source address validation (ISAV) and 2) measuring reachability between arbitrary Internet nodes. We accomplish these two tasks from only one local vantage point without controlling the targets or relying on other services within the target networks. Our large-scale ISAV measurements cover 50% of all IPv6 autonomous systems and find 79% of them are vulnerable to spoofing, making this the largest-scale measurement study of IPv6 ISAV to date. Our method for reachability measurements achieves over 80% precision and recall in our evaluation. Finally, we perform an Internet-wide measurement of ICMP rate limiting implementations, present a detailed discussion of ICMP rate limiting, particularly the potential security and privacy risks in its mechanism, and provide possible mitigation measures. We make our code available to the community.
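To illustrate why ICMP rate limiting can act as a side channel, the sketch below models a router's ICMP error limiter as a token bucket (one common way to implement such a limiter) that is shared across all senders. It is a simplified model rather than iVantage's measurement procedure; the class name, rate, burst size, and traffic counts are all assumptions.

```python
class TokenBucket:
    """Shared limiter for ICMP error messages (a simplified, assumed model)."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s      # tokens refilled per second
        self.burst = burst          # maximum number of stored tokens
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0             # time of the previous update, in seconds

    def allow(self, now):
        """Return True if one ICMP error may be generated at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def count_replies(bucket, n_probes, now):
    """Send `n_probes` error-triggering packets at once and count the replies."""
    return sum(bucket.allow(now) for _ in range(n_probes))


# Baseline: only the prober triggers errors, so a full burst is answered.
baseline = count_replies(TokenBucket(rate_per_s=1, burst=10), 10, now=0.0)

# Side channel: other traffic drains the shared bucket first, so the prober
# receives fewer replies and can infer that the other packets arrived.
router = TokenBucket(rate_per_s=1, burst=10)
count_replies(router, 6, now=0.0)            # errors triggered by other traffic
observed = count_replies(router, 10, now=0.0)
print(baseline, observed)
```

The gap between the baseline and the observed reply count is what leaks information about traffic the prober never sees directly.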
With the desired functionality of moving object tracking, wireless pan-tilt cameras are able to play critical roles in a growing diversity of surveillance environments. However, today’s pan-tilt cameras often underperform when tracking frequently moving objects like humans: they are prone to losing sight of objects and to causing excessive mechanical rotations that are especially detrimental in energy-constrained outdoor scenarios. The ineffectiveness and high cost of state-of-the-art tracking approaches are rooted in their adherence to the industry’s simplicity principle, which leads to their stateless nature, performing gimbal rotations based only on the latest object detection. To address these issues, we design and implement WiseCam, which wisely tunes pan-tilt cameras to minimize mechanical rotation costs while maintaining long-term object tracking. We examine the performance of WiseCam through experiments on two types of pan-tilt cameras with different motors. Results show that WiseCam significantly outperforms the state-of-the-art tracking approaches on both tracking duration and power consumption.
IWQoS
Which Doors Are Open: Reinforcement Learning-based Internet-wide Port Scanning
Internet-wide scanning is a commonly used research technique in various network surveys, such as measuring service deployment and security vulnerabilities. However, these surveys are limited to a given port set, so they fail to capture the real network landscape comprehensively and may even lead to misleading conclusions. In this work, we introduce PMap, a port scanning tool that efficiently discovers the majority of open ports from all 65K ports in the whole network. PMap uses the correlation of ports to build an open port correlation graph of each network, using a reinforcement learning framework to update the correlation graph based on feedback results and dynamically adjust the order of port scanning. Compared to current port scanning methods, PMap achieves better performance on hit rate, coverage, and intrusiveness. Our experiments over real-world networks show that PMap can find 90% of open ports by scanning only 125 ports (90%@125) on each active address, which is 136× fewer than the state-of-the-art port probing methods. PMap reduces the number of scanned ports to decrease the intrusive nature of port scanning. PMap is the first effective practice for scanning open ports using reinforcement learning. It bridges the gap left by existing scanning tools and effectively supports subsequent service discovery and security research.
IWQoS
GraphIoT: Accurate IoT Identification based on Heterogeneous Graph
IoT devices deployed on campus and enterprise networks facilitate people’s lives and work. However, these devices also bring serious network asset management and security management problems. IoT device identification is the premise to solve these problems. Although current IoT identification methods can identify devices with relatively high accuracy in ideal environments, it is difficult to accurately identify devices in real-world complex environments (e.g., campus networks, enterprise networks). Therefore, we propose to use exact features. To solve the problem of different dimensions of exact features, we creatively model the IoT identification problem as a heterogeneous graph representation learning problem and design a new representation learning algorithm. We are the first to propose an approach to accurately identify IoT devices in real-world complex environments and solve this problem through heterogeneous graphs. The evaluation shows that GraphIoT’s macro F1 is on average 13.58% and 12.77% higher than the other methods on two public datasets.
2022
USENIX ATC
AddrMiner: A Comprehensive Global Active IPv6 Address Discovery System
Guanglei Song, Jiahai Yang#, Lin He#, Zhiliang Wang, Guo Li, Chenxin Duan, Yaozhong Liu, and Zhongxiang Sun
In Proceedings of the 2022 USENIX Annual Technical Conference (USENIX ATC) Carlsbad, CA, USA, July 11–13, 2022
Fast Internet-wide scanning is essential for network situational awareness and asset evaluation. However, the vast IPv6 address space makes brute-force scanning infeasible. Although state-of-the-art techniques have made effective attempts, these methods do not work in seedless regions, while the detection efficiency is low in regions with seeds. Moreover, the constructed hitlists with low coverage cannot truly represent the active IPv6 address landscape of the Internet.
This paper introduces AddrMiner, a global active IPv6 address probing system, making IPv6 active address probing systematic, comprehensive, and economical. We divide the IPv6 address space regions into three kinds according to the number of seed addresses and propose a probing algorithm for each of them. For the regions with no seeds, we propose AddrMiner-N, leveraging an organization association strategy to mine active addresses. It finds active addresses covering 86.4K BGP prefixes, accounting for 81.6% of the probed BGP prefixes. For the regions with few seeds, we propose AddrMiner-F, utilizing a similarity matching strategy to probe active addresses further. The hit rate of active address probing is improved by 70%-150% compared to existing algorithms. For the regions with sufficient seeds, we propose AddrMiner-S to generate target addresses based on reinforcement learning dynamically. It nearly doubles the hit rate compared to the state-of-the-art algorithms. Finally, we deploy AddrMiner and discover 2.1 billion active IPv6 addresses, including 1.7 billion de-aliased active addresses and 0.4 billion aliased addresses, through continuous probing for 13 months. We would like to further open the door of IPv6 measurement studies by publicly releasing AddrMiner and sharing our data.
USENIX ATC
Firebolt: Finding Bugs in Programmable Data Plane Generators
Programmable data planes (DP) enable flexible customization of packet processing logic with domain-specific languages such as P4. To relieve developers from lengthy code and tedious hardware details, many studies propose DP program generators that take high-level intents as input and automatically convert intents into DP programs. Generators must be correct; otherwise, they may produce buggy programs or DP logic that is inconsistent with intents. Nevertheless, existing verification tools are designed to verify individual DP programs, not generators. They either cannot achieve high bug coverage or cannot debug generators with high scalability.
This paper presents Firebolt, a blackbox testing tool designed to dig out faults in DP program generators, including security vulnerabilities, intent violations, and generator crashes. Firebolt achieves high bug coverage by using syntax-guided intent generation to construct a comprehensive, syntactically correct, and semantically valid intent set. To avoid intent explosion, Firebolt designs an intent space pruning approach that eliminates redundant intents while preserving representative ones. For high scalability, Firebolt automatically formalizes DP programs and intents for verification. We apply Firebolt to three popular open-source DP generators. Evaluation results demonstrate that Firebolt can detect 2× as many bugs with 0.1% to 0.01% of the human effort compared to existing tools.
GLOBECOM
What Causes Delay Asymmetry: A Large-scale One-way Delay Measurement and Empirical Study
In global communications, severe one-way delay (OWD) asymmetry often occurs. Because OWD measurement is difficult (it requires controlling both ends and synchronizing their clocks), RTT/2 is now commonly used to estimate OWD. However, OWD asymmetry can lead to large errors in this RTT-halving method, which in turn affects end-to-end quality of service (QoS) guarantees. In this paper, we investigate OWD asymmetry through large-scale OWD measurements on a global scale. The measurements show that more than 11% of network paths have OWDs with a relative difference of more than 10% compared to RTT/2. By analyzing the measurement results in depth, we try to explain why the delay asymmetry occurs. We find that 67% is caused by hop inflation or a significant increase in propagation distance, and 33% is caused by variable queuing delays. We also find that AS-level paths between node pairs with significant delay asymmetry are much more likely (∼10×) to violate the well-known valley-free rule.
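The relative difference between an OWD and RTT/2 can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the delay values are assumptions, and the relative difference is taken with respect to RTT/2, following the wording of the abstract.

```python
import math


def relative_difference_to_half_rtt(forward_owd_ms, reverse_owd_ms):
    """Relative gap between the forward OWD and the RTT/2 estimate."""
    rtt_ms = forward_owd_ms + reverse_owd_ms
    half_rtt_ms = rtt_ms / 2
    return abs(forward_owd_ms - half_rtt_ms) / half_rtt_ms


# Hypothetical path: 60 ms in one direction, 40 ms in the other.
# RTT = 100 ms, so RTT/2 = 50 ms and the estimate is off by 10 ms.
asymmetric = relative_difference_to_half_rtt(60.0, 40.0)

# A perfectly symmetric path shows no difference at all.
symmetric = relative_difference_to_half_rtt(50.0, 50.0)

print(f"{asymmetric:.0%}", f"{symmetric:.0%}")
```

Under the 10% threshold quoted above, the first hypothetical path would count as asymmetric and the second would not.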
GLOBECOM
Both Efficient and Accurate: A Large-scale One-way Delay Measurement Scheme
One-way delay (OWD) is one of the essential network performance metrics. In large-scale resilient overlay networks (RONs), OWD measurements can be used for shortest path selection and troubleshooting. However, OWD measurements remain difficult because of the need for precise time synchronization. Especially in large-scale networks, clock synchronization of all nodes has always been a considerable challenge. Therefore, in many cases, people use half of the round-trip time (RTT/2) as a rough substitute for the OWD. This paper presents an efficient and easy-to-deploy scheme for large-scale OWD measurements with the algorithm ClockConverger at its core. The scheme consists of three steps: first, we perform low-precision time synchronization for all the measured nodes, relying on network time protocol daemons (ntpd); then, we use the open-source tool OWPing to perform OWD measurements; finally, we correct the errors of the measured OWDs with our proposed ClockConverger. Theory and experiments show that our scheme’s accuracy is significantly better than that of RTT/2. Meanwhile, the complexity of ClockConverger is O(n^2), which is much lower than the exponential complexity of the existing Maximum-Entropy algorithm.
CNSM
Towards a Behavioral and Privacy Analysis of ECS for IPv6 DNS Resolvers
The Domain Name System (DNS) is critical to Internet communications. EDNS Client Subnet (ECS), a DNS extension, allows recursive resolvers to include client subnet information in DNS queries to improve end-user mapping, extending the visibility of client information to a broader range. Major content delivery network (CDN) vendors, content providers (CP), and public DNS service providers (PDNS) are accelerating their IPv6 infrastructure development. With the increasing deployment of IPv6-enabled services and DNS being the most foundational system of the Internet, it becomes important to analyze the behavioral and privacy status of IPv6 resolvers. However, there is a lack of research on ECS for IPv6 DNS resolvers. In this paper, we study the ECS deployment and compliance status of IPv6 resolvers. Our measurement shows that 11.12% of IPv6 open resolvers implement ECS. We discuss abnormal non-compliant scenarios that exist in both IPv6 and IPv4 and that raise privacy and performance issues. Additionally, we measure whether sacrificing clients’ privacy can enhance IPv6 CDN performance. We find that in some cases ECS helps end-user mapping but with an unnecessary privacy loss. Even worse, the exposure of client address information can sometimes backfire, which deserves attention from both Internet users and PDNSes.
CNSM
PerfTrace: A New Multi-metric Network Performance Monitoring Tool
We present PerfTrace, an end-to-end tool for efficient, real-time, multi-metric network performance monitoring. PerfTrace provides a high integration of different existing measurement functions, supporting the measurement of essential metrics such as latency, jitter, packet loss, and available bandwidth. More importantly, innovative schemes and algorithms are proposed to address the weaknesses of existing tools.
We evaluate PerfTrace on the Internet and on our testbed. We find that (i) PerfTrace measures one-way and two-way latency, jitter, and packet loss ∼9.4× faster and ∼3.6× more data-efficiently; (ii) PerfTrace measures available bandwidth in our testbed with the lowest mean relative error (5.22%), outperforming all the tools compared (ranging from 8.17% to 37.24%); (iii) PerfTrace measures available bandwidth with better accuracy and adaptability in different real-world network environments. Meanwhile, PerfTrace consumes a more constant percentage of bandwidth resources than other tools when monitoring available bandwidth. PerfTrace’s data overhead is always only about 1/600 of the total bandwidth at a measurement frequency of once per minute.
2021
IWQoS
Towards Chain-Aware Scaling Detection in NFV with Reinforcement Learning
Elastic scaling enables dynamic and efficient resource provisioning in Network Function Virtualization (NFV) to serve fluctuating network traffic. Scaling detection determines the appropriate time when a virtual network function (VNF) needs to be scaled, and its precision and agility profoundly affect system performance. Previous heuristics define fixed control rules based on a simplified or inaccurate understanding of deployment environments and workloads. Therefore, they fail to achieve optimal performance across a broad set of network conditions. In this paper, we propose a chain-aware scaling detection mechanism, namely CASD, which learns policies directly from experience using reinforcement learning (RL) techniques. Furthermore, CASD incorporates chain information into control policies to efficiently plan the scaling sequence of VNFs within a service function chain. This paper makes the following two key technical contributions. Firstly, we develop chain-aware representations, which embed global chains of arbitrary sizes and shapes into a set of embedding vectors based on graph embedding techniques. Secondly, we design an RL-based neural network model to make scaling decisions based on chain-aware representations. We implement a prototype of CASD, and its evaluation results demonstrate that CASD reduces the overall system cost and improves system performance over other baseline algorithms across different workloads and chains.
IWQoS
pSAV: A Practical and Decentralized Inter-AS Source Address Validation Service Framework
Jiamin Cao, Ying Liu#, Mingxing Liu, Lin He#, Yihao Jia, and Fei Yang
In Proceedings of the 29th IEEE/ACM International Symposium on Quality of Service (IWQoS) Virtual, June 25-28, 2021
Source IP address spoofing has been a major vulnerability of the Internet for many years. Although much work has been done to study the problem extensively, spoofing continues to occur frequently and has led to many serious network attacks. Inter-AS source address validation (SAV) is considered an important defense method for ASes to filter spoofed packets. However, existing work has been unable to drive inter-AS SAV deployment into practice due to the lack of deployment incentives and a trust foundation. In this paper, we propose a practical and decentralized inter-AS SAV service framework, pSAV, to promote inter-AS SAV deployment. pSAV increases deployment incentives by treating SAV as a payable service and dividing the participant ASes into service subscribers, providers, and auditors. On the control plane, pSAV leverages blockchain as a trust foundation to provide service subscriptions and audits with automatic incentive allocation. On the data plane, pSAV leverages P4-programmable switches to provide flexible and high-performance SAV services. We prototype the pSAV control plane based on Hyperledger Fabric and implement various SAV techniques on Barefoot Tofino switches. The evaluation results show that (1) on the control plane, the pSAV blockchain can provide high-performance service transactions (hundreds of transactions per second with second-level latency), and (2) on the data plane, pSAV can provide various high-throughput (hundreds of Gbps) SAV services using only one programmable switch.
ISSRE
CloudPin: A Root Cause Localization Framework of Shared Bandwidth Package Traffic Anomalies in Public Cloud Networks
Shize Zhang, Yunfeng Zhao, Jianyuan Lu, Biao Lyu, Shunmin Zhu, Zhiliang Wang, Jiahai Yang, Lin He, and Jianping Wu
In Proceedings of the 32nd IEEE International Symposium on Software Reliability Engineering (ISSRE) Wuhan, China, October 25-28, 2021
Due to the sharing nature of public cloud, most of the cloud services use a sharing bandwidth package (sBwp) model to conduct inbound/outbound communication. The sBwp model allows users to purchase a sharing bandwidth for plenty of virtual machines instead of purchasing bandwidth for each virtual machine separately. The advantage of sBwp is that it can provide users with convenient configuration and lower economic cost. However, the sBwp model brings new challenges for operators to localize the root cause of traffic anomalies of a sharing bandwidth, especially for a globally distributed large-scale public cloud with millions of users. In this paper, we first formalize the sBwp problem on the cloud and propose CloudPin, a root cause localization framework for this problem. Our framework solves all the challenges by employing a multi-dimensional algorithm with three sub-models of prediction deviation, anomaly amplitude, and shape similarity, and an overall ranking algorithm. Evaluations on real-world data, from one of the world-renowned public cloud vendors, show that our algorithm precision reaches 97.8% for the top 1 of the ranking list, outperforming multiple baseline algorithms.
ICC
Deception Maze: A Stackelberg Game-Theoretic Defense Mechanism for Intranet Threats
The intranets in modern organizations are facing severe data breaches and critical resource misuses. By reusing user credentials from compromised systems, Advanced Persistent Threat (APT) attackers can move laterally within the internal network. A promising new approach called deception technology enables the network administrator (i.e., the defender) to deploy decoys to deceive the attacker in the intranet and trap the attacker in a honeypot. The defender then ought to reasonably allocate decoys to potentially insecure hosts. Unfortunately, existing APT-related defense resource allocation models are infeasible because they neglect many realistic factors. In this paper, we make the decoy deployment strategy feasible by proposing a game-theoretic model called the APT Deception Game to describe interactions between the defender and the attacker. More specifically, we decompose the decoy deployment problem into two subproblems and make the problem solvable. Considering the best response of the attacker who is aware of the defender’s deployment strategy, we provide an elitist reservation genetic algorithm to solve this game. Simulation results demonstrate the effectiveness of our deployment strategy compared with other heuristic strategies.
MSN
TAP: A Traffic-Aware Probabilistic Packet Marking for Collaborative DDoS Mitigation
Mingxing Liu, Ying Liu, Ke Xu, Lin He, Xiaoliang Wang, Yangfei Guo, and Weiyu Jiang
In Proceedings of the 17th International Conference on Mobility, Sensing and Networking (MSN) Exeter, UK, December 13-15, 2021
In recent years, Distributed Denial-of-Service (DDoS) attacks have become more rampant and continue to be one of the most serious security threats facing network infrastructure. In a classic DDoS attack, the attacker controls numerous bots from many sources to send a significant volume of traffic to flood the victim end or the bottleneck link. In practical networks, it is inefficient and costly to request all partner routers to collaboratively mitigate DDoS attacks. The common feature of DDoS attacks is the abnormal distribution of traffic to the victim. In this paper, we propose TAP, a collaborative DDoS mitigation framework based on traffic-aware probabilistic packet marking (PPM). TAP enables the victim to select a few hit routers as collaborators to mitigate attack traffic efficiently depending on the traffic distribution. Our evaluation results show that TAP greatly reduces attack traffic within seconds and mitigates the damage caused by DDoS with less overhead, which demonstrates that TAP is an effective, efficient, and rapid-response scheme for collaborative DDoS mitigation.
2020
IWQoS
Towards the Construction of Global IPv6 Hitlist and Efficient Probing of IPv6 Address Space
Fast IPv4 scanning has made sufficient progress in network measurement and security research. However, it is infeasible to perform brute-force scanning of the IPv6 address space. We can find active IPv6 addresses through scanning candidate addresses generated by the state-of-the-art algorithms, whose probing efficiency of active IPv6 addresses, however, is still very low. In this paper, we aim to improve the probing efficiency of IPv6 addresses in two ways. Firstly, we perform a longitudinal active measurement study over four months, building a high-quality dataset called hitlist with more than 1.3 billion IPv6 addresses distributed in 45.2k BGP prefixes. Different from previous work, we probe the announced BGP prefixes using a pattern-based algorithm, which makes our dataset overcome the problems of uneven address distribution and low active rate. Secondly, we propose an efficient address generation algorithm DET, which builds a density space tree to learn high-density address regions of the seed addresses in linear time and improves the probing efficiency of active addresses. On the public hitlist and our hitlist, we compare our algorithm DET against state-of-the-art algorithms and find that DET increases the de-aliased active address ratio by 10%, and active address (including aliased addresses) ratio by 14%, by scanning 50 million addresses.
ICC
P4DAD: Securing Duplicate Address Detection Using P4
Duplicate Address Detection (DAD) is an essential part of the Neighbor Discovery Protocol (NDP), which determines whether the IPv6 address of a node conflicts with those of other nodes. Due to the lack of verification of NDP messages, DAD is vulnerable to DoS attacks. Existing solutions suffer from high complexity, need to modify the NDP, or have a single point of failure.
To solve the above problems, we propose P4DAD, a secure DAD mechanism described by P4. By creating and maintaining binding entries between IPv6 address and link-layer property of host, P4DAD can filter spoofed NDP messages in an in-network manner to prevent DoS attacks on DAD without modifications to the NDP or host stack. We implement a prototype of P4DAD and evaluate it in terms of functionality, performance, and scalability. Evaluation results show that P4DAD can prevent DoS attacks on DAD successfully with negligible overhead, and has satisfactory scalability.
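The binding-entry idea can be sketched in a few lines. The example below is written in Python rather than P4 and is only an assumed simplification, not P4DAD's actual pipeline: the class and method names are hypothetical, the link-layer property is modeled as a (switch port, MAC address) pair, and a Neighbor Advertisement is accepted only when its sender matches an existing binding for the claimed address.

```python
class BindingTable:
    """Maps an IPv6 address to the link-layer property of the host that owns it."""

    def __init__(self):
        self._bindings = {}  # address -> (switch port, MAC address)

    def bind(self, address, port, mac):
        """Record ownership once a host has acquired the address (first come wins)."""
        self._bindings.setdefault(address, (port, mac))

    def accepts_advertisement(self, address, port, mac):
        """Forward a Neighbor Advertisement only if the sender owns the address."""
        return self._bindings.get(address) == (port, mac)


table = BindingTable()
tentative = "2001:db8::1234"               # documentation-range address
victim = (1, "02:00:00:00:00:01")          # hypothetical port and MAC
attacker = (2, "02:00:00:00:00:02")

# While the victim runs DAD, nobody owns the address yet, so a spoofed
# advertisement claiming it is dropped instead of aborting the victim's DAD.
spoofed_during_dad = table.accepts_advertisement(tentative, *attacker)

# After DAD completes, the switch binds the address to the victim.
table.bind(tentative, *victim)
legit_after_bind = table.accepts_advertisement(tentative, *victim)
spoofed_after_bind = table.accepts_advertisement(tentative, *attacker)
print(spoofed_during_dad, legit_after_bind, spoofed_after_bind)
```

Because the check runs in the network, the sketch is consistent with the stated goal of leaving the NDP and the host stack unmodified.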
2019
INFOCOM
Bootstrapping Accountability and Privacy to IPv6 Internet without Starting from Scratch
Accountability and privacy are considered valuable but conflicting properties in the Internet, which at present does not provide native support for either. Past efforts to balance accountability and privacy in the Internet have unsatisfactory deployability due to the introduction of new communication identifiers and large-scale modifications to fully deployed infrastructures and protocols. IPv6 is being deployed around the world, and this trend will accelerate. In this paper, we propose a private and accountable scheme based on IPv6, called PAVI, that seeks to bootstrap accountability and privacy to the IPv6 Internet without introducing new communication identifiers or large-scale modifications to the deployed base. A dedicated quantitative analysis shows that PAVI achieves satisfactory levels of accountability and privacy. The evaluation results of a PAVI prototype show that it incurs little performance overhead and is widely deployable.
NAACL
Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision
Jianhao Yan, Lin He, Ruqin Huang, Jian Li, and Ying Liu
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Minneapolis, USA, June 2-7, 2019
Distant supervision (DS) is an important paradigm for automatically extracting relations. It utilizes an existing knowledge base to collect examples for the relation we intend to extract, and then uses these examples to automatically generate the training data. However, the examples collected can be very noisy and pose a significant challenge for obtaining high-quality labels. Previous work has made remarkable progress in predicting the relation from distant supervision, but typically ignores the temporal relations among those supervising instances. This paper formulates the problem of relation extraction with temporal reasoning and proposes a solution to predict whether two given entities participate in a relation at a given point in time. For this purpose, we construct a dataset called WIKI-TIME, which additionally includes the valid period of a certain relation between two entities in the knowledge base. We propose a novel neural model that incorporates both temporal information encoding and sequential reasoning. The experimental results show that, compared with the best existing models, our model achieves better performance on both the WIKI-TIME dataset and the well-studied NYT-10 dataset.
ACML
Multi-modal Representation Learning for Successive POI Recommendation
Lishan Li, Ying Liu, Jianping Wu, Lin He, and Gang Ren
In Proceedings of the 11th Asian Conference on Machine Learning (ACML) Nagoya, Japan, November 17-19, 2019
Successive POI recommendation is a fundamental problem for location-based social networks (LBSNs). POI recommendation takes a variety of POI context information (e.g., spatial location and textual comments) and user preference into consideration. Existing POI recommendation systems mainly focus on part of the POI context and user preference with a specific model, which loses valuable information from other aspects. In this paper, we propose to construct a multi-modal check-in graph, a heterogeneous graph that combines five check-in aspects in a unified way. We further propose a multi-modal representation learning model based on the graph to jointly learn POI and user representations. Finally, we employ an attentional recurrent neural network based on the representations for successive POI recommendation. Experiments on a public dataset study the effects of modeling different aspects of check-in records and demonstrate the effectiveness of the method in improving POI recommendation performance.
2017
IPCCC
Revisiting inter-AS IP Spoofing: Let the Protection Drive Source Address Validation
In Proceedings of the 36th IEEE International Performance Computing and Communications Conference (IPCCC) San Diego, California, USA, December 10-12, 2017
IP spoofing, which is prevalently used for anonymity and reflection attacks, has shown increasing destructive power in recent years. Although certain source address validation solutions have been standardized by the Internet Engineering Task Force, few networks are willing to adopt them because they offer too few deployment benefits. In fact, all source address validation solutions suffer from a lack of deployability. In this paper, we summarize the key points describing deployability and propose a new security service, inter-autonomous-system (AS) Source Address Protection (iSAP). Technically, by increasing the possibility of keeping the source addresses belonging to one AS from being the victim of reflection flooding, iSAP improves the deployers' ability to prevent IP spoofing and increases incremental deployability. In practice, such a service can also be regarded as a new profit opportunity for ASes, and it could progress gradually once it is well commercialized. Based on simulations with real Internet topology data, the results illustrate that iSAP can protect ASes from being reflected with only a few deployers, exhibiting a high potential to mitigate reflection flooding with modest resource consumption.
Journal & Magazine papers
# indicates the corresponding author.
2024
TMC
WiseCam: A Systematic Approach to Intelligent Pan-Tilt Cameras for Moving Object Tracking
Jinlong E, Fangshuo Han, Lin He, Wei Xu, Zhenhua Li, Yunpeng Chai, and Yunhao Liu
With the desired functionality of moving object tracking, wireless pan-tilt cameras are able to play critical roles in a growing diversity of surveillance environments. However, today’s pan-tilt cameras oftentimes underperform when tracking frequently moving objects like humans – they are prone to lose sight of objects and bring about excessive mechanical rotations that are especially detrimental to those energy-constrained outdoor scenarios. The ineffectiveness and high cost of all state-of-the-art tracking approaches are rooted in their adherence to the industry’s simplicity principle, which leads to their stateless nature, performing gimbal rotations based only on the latest object detection. To address the issues, we design and implement WiseCam that wisely tunes the pan-tilt cameras to minimize mechanical rotation costs while maintaining long-term object tracking. This systematic tracking approach also tackles issues of motion-rotation speed gap and scattered moving objects, which is universally applicable to complex tracking scenarios. We examine the performance of WiseCam by experiments on two types of pan-tilt cameras with different motors. Results show that it significantly outperforms the state-of-the-art tracking approaches on both tracking duration and power consumption.
TON
AddrMiner: A Fast, Efficient, and Comprehensive Global Active IPv6 Address Detection System
Fast Internet-wide scanning is essential for network situational awareness and asset evaluation. However, the vast IPv6 address space makes brute-force scanning infeasible. Despite advancements in state-of-the-art methods, they do not work in seedless regions and suffer low detection efficiency and speed in regions with seed addresses. Moreover, the collected active address list (i.e., IPv6 hitlist) with low coverage cannot truly represent the active IPv6 address landscape of the Internet. This paper introduces AddrMiner, a fast, efficient, and comprehensive global active IPv6 address detection system. We design a systematic active IPv6 address detection strategy that divides the IPv6 space into two detection scenarios based on the presence or absence of seed addresses to discover active IPv6 addresses from scratch and from few to many. In the seedless regions, we present AddrMiner-N, leveraging a multi-level association policy to probe active addresses. It fills the gap of address detection in seedless regions and successfully discovers active addresses in 39,899 BGP prefixes without seed addresses, with a 1.03X higher hit rate, 30∼911X higher speed, and 2.7X broader coverage, compared to existing solutions. In the regions with seed addresses, our method AddrMiner-S dynamically generates target addresses using reinforcement learning. Compared to state-of-the-art methods, AddrMiner-S achieves an impressive 56.3% hit rate and a discovery speed of 839.0/s, which is 1.9∼2153X and 1.5∼755X of existing works, respectively. Finally, we deploy AddrMiner and discover 2.1B active IPv6 addresses, including 1.7B de-aliased active addresses and 0.4B aliased addresses, through continuous probing for three years.
2023
TMC
AutoIoT: Automatically Updated IoT Device Identification with Semi-supervised Learning
Linna Fan, Lin He#, Yichao Wu, Shize Zhang, Zhiliang Wang, Jia Li, Jiahai Yang#, Chaocan Xiang, and Xiaoqian Ma
IoT devices bring great convenience to people's lives and to industrial production. However, their rapid proliferation also complicates device management and network security. Network administrators usually need to know how many IoT devices are in the network and whether they behave normally. IoT device identification is the first step to achieving these goals. Previous IoT device identification methods reach high accuracy in a closed environment, but they are not applicable in a continuously changing environment: when new types of devices are plugged in, they cannot update themselves automatically. Besides, they usually rely on supervised learning and need large amounts of labeled data, which is costly. To solve these problems, we propose a novel IoT device identification model named AutoIoT, which updates itself automatically when new types of devices are plugged in. Besides, it needs only a small amount of labeled data and identifies IoT devices with high accuracy. The evaluation on two public datasets shows that AutoIoT can identify new device types using only 1.5∼2.5 hours of traffic and still has high accuracy after updating. Moreover, it performs better than other works when only a small amount of labeled data is available, especially in an environment with scanning traffic.
IEEE Netw.
SAV6: A Novel Inter-AS Source Address Validation Protocol for IPv6 Internet
IP spoofing is prevalently used for anonymity and reflection attacks, e.g., distributed denial of service (DDoS) attacks, which have shown increasingly destructive power in recent years because today’s Internet lacks validation on source addresses. Moreover, the fast deployment of IPv6 on the Internet may further aggravate the damages of DDoS attacks.
This paper proposes a novel source address validation mechanism called SAV6, which leverages the huge IPv6 address space to validate source addresses at an inter-autonomous system (AS) granularity. In SAV6, each IPv6 address contains an AS number (ASN), whose corresponding AS announces the prefix of the address to other ASes. An AS can determine the authenticity of the source address by whether the ASN in the address matches the corresponding prefix after receiving an incoming packet. The performance evaluation of a SAV6 prototype shows that it adds little performance overhead to the deployed infrastructures and is a lightweight and deployable protocol.
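The check described above can be illustrated with a minimal Python sketch. It is not the SAV6 implementation: the abstract does not say where the ASN sits in the address, so the layout below (a 32-bit ASN in bits 32..63) and the `announcements` table of prefix-to-origin-ASN mappings are assumptions made purely for illustration, as are all function names.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical layout (not specified in the abstract): the 32-bit ASN
# occupies bits 32..63 of the 128-bit address, counted from the top.
ASN_SHIFT = 64
ASN_MASK = 0xFFFFFFFF


def embedded_asn(addr: str) -> int:
    """Extract the ASN assumed to be embedded in an IPv6 address."""
    value = int(ipaddress.IPv6Address(addr))
    return (value >> ASN_SHIFT) & ASN_MASK


def origin_asn(addr: str, announcements: Dict[str, int]) -> Optional[int]:
    """Return the ASN announcing the longest matching prefix, if any."""
    ip = ipaddress.IPv6Address(addr)
    best = None  # (prefix length, ASN) of the best match so far
    for prefix, asn in announcements.items():
        net = ipaddress.IPv6Network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return None if best is None else best[1]


def is_valid_source(addr: str, announcements: Dict[str, int]) -> bool:
    """Accept a source address only if its embedded ASN matches the
    AS that announces the covering prefix."""
    origin = origin_asn(addr, announcements)
    return origin is not None and origin == embedded_asn(addr)


# Illustrative table: AS 65000 (0xfde8) announces 2001:db8::/32.
announcements = {"2001:db8::/32": 65000}
print(is_valid_source("2001:db8:0:fde8::1", announcements))
print(is_valid_source("2001:db8:0:fde9::1", announcements))
```

In this sketch the first address carries ASN 65000 and matches the announcing AS, while the second carries a different ASN and is rejected; an address with no covering announcement is rejected as well.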
Efficient Video Distribution Based on Collaboration of Heterogenous Computing Nodes (基于异构算力节点协同的高效视频分发)
Jinlong E, and Lin He#
Journal of Computer Research and Development, 2023
A computing power network connects computing nodes through the network to overcome the computing power limits of a single node, and it has been rapidly developed and applied in more and more business fields in recent years. Popular live video broadcasting relies on the transmission and transcoding of a large number of video frames, so it is of great practical importance to explore computing power networks for efficient video distribution. Compared with traditional large-scale data processing, video applications have higher requirements for transmission delay and bandwidth guarantees. However, the computing power of the nodes provided by each cloud service varies, and the state of the network links between nodes often fluctuates. It is therefore a great challenge to realize low-latency and high-bandwidth video distribution by selecting the nodes with the best combined transmission and transcoding performance. To address this challenge, we design an efficient video distribution scheme based on heterogeneous computing nodes. The scheme plans video transmission paths and selects transcoding nodes through reinforcement learning, uses priority queuing to schedule different video distribution tasks and adaptively adjusts node resources to reduce bursty resource competition, adopts a layered-log-synchronization fault tolerance mechanism to quickly restore data consistency after node failures, and finally deploys distributed nodes across multiple cloud services to realize a complete video distribution system. A large number of live ultra-high-definition video experiments show that the performance of this scheme is significantly improved compared with existing video distribution methods.
2022
TON
DET: Enabling Efficient Probing of IPv6 Active Addresses
Guanglei Song, Jiahai Yang, Zhiliang Wang, Lin He#, Jinlei Lin, Long Pan, Chenxin Duan, and Xiaowen Quan
Fast IPv4 scanning significantly improves network measurement and security research. Nevertheless, it is infeasible to perform brute-force scanning of the IPv6 address space. Alternatively, one can find active IPv6 addresses by scanning the candidate addresses generated by state-of-the-art algorithms. However, the probing efficiency of such algorithms is often very low. In this paper, our objective is to improve the probing efficiency of IPv6 addresses. We first perform a longitudinal active measurement study and build a high-quality dataset, hitlist, including more than 1.95 B IPv6 addresses distributed in 58.2 K BGP prefixes and collected over a 17-month period. Different from previous works, we probe the announced BGP prefixes using a pattern-based algorithm. This results in a dataset without uneven address distribution and low active rates. Further, we propose an efficient address generation algorithm, DET, which builds a density space tree to learn high-density address regions of the seed addresses with linear time complexity and improves the active addresses' probing efficiency. We then compare our algorithm DET against state-of-the-art algorithms on the public hitlist and our hitlist by scanning 50 M addresses. Our analysis shows that DET increases the de-aliased active address ratio and the active address (including aliased addresses) ratio by 10% and 14%, respectively. Furthermore, we develop a fingerprint-based method to detect aliased prefixes. The proposed method for the first time directly verifies whether a prefix is aliased or not. Our method finds that 10.64% of the public aliased prefixes are false positives.
TON
TurboNet: Faithfully Emulating Networks with Programmable Switches
Faithfully emulating networks is critical for verifying the correctness and effectiveness of new networking-related designs. Existing network experiment platforms either cannot faithfully emulate the functionality and performance of production networks or cannot scale well due to cost constraints. In this paper, we propose TurboNet, a new network emulator that utilizes one or more programmable switches to achieve faithful emulation of the network data plane and control plane. For data plane emulation, we propose a series of key designs, such as port mapper, queue mapper, and delayed queue, to emulate network topologies and performance metrics with high flexibility and accuracy. For control plane emulation, we support static routing configurations, distributed routing agents, and the centralized routing controllers. Meanwhile, we provide APIs for operators to simplify network emulation tasks. We implement TurboNet on Tofino switches. Evaluation results show that: (1) On the data plane, TurboNet can flexibly emulate various topologies, such as an 8-ary fat-tree with only one programmable switch and a 10-ary fat-tree with four programmable switches; (2) On the control plane, TurboNet supports about 200 BGP agents on a single programmable switch with a CPU usage of 25%; (3) TurboNet can accurately emulate different network performance metrics such as 10⁻⁸ link loss, and microsecond to millisecond link delay.
TPDS
CoFilter: High-Performance Switch-Accelerated Stateful Packet Filter for Bare-Metal Servers
As one of the most critical cloud services, Bare-Metal Servers (BMS) introduce stringent performance requirements on data center networks (DCN). The stateful packet filter is an integral DCN component for ensuring connection security for BMS. However, off-the-shelf stateful packet filters either are costly for cloud DCNs or introduce significant performance bottlenecks. In this article, we present CoFilter, which leverages low-cost programmable switches to accelerate the stateful packet filter for BMS. CoFilter uses (1) stateful process partition to enable complex stateful packet filtering logic on programmability-limited switching ASICs, (2) state compression to track tens of millions of connections with constrained hardware memory, and (3) per-tenant packet rate limit and tenant-aware flow migration to achieve efficient performance isolation among different tenants. Overall, CoFilter implements a high-performance stateful packet filter via the co-design of programmable switching ASIC and CPU. We evaluate CoFilter under various data center traffic traces with real-world flow distributions. The evaluation results show that CoFilter remarkably outperforms NetFilter, i.e., forwarding packets at line rate (13x throughput of NetFilter), keeping packet delay within 1 μs, and freeing a significant quantity of CPU cores, with rather small memory usage, i.e., accommodating over 10⁷ connections with only 16 MB SRAM.
CN
EvoIoT: An Evolutionary IoT and Non-IoT Classification Model in Open Environments
IoT device identification is essential for both device asset management and security management. IoT device identification in open environments is the key to its application in real environments, where IoT and non-IoT device identification is the first and critical step. However, existing methods are either not applicable to open environments or have poor scalability and sustainability for IoT and non-IoT device identification. This paper presents EvoIoT, an IoT and non-IoT identification model designed for open environments with high scalability and the ability to run sustainably and effectively. EvoIoT achieves high scalability by applying a unified model for all devices. To mine discriminative features from encrypted traffic, EvoIoT extracts original features from packet headers and judges feature importance throughout the entire model update process. For sustainability, EvoIoT proposes a representative device choosing method and a model update method to address the concept drift caused by new types of devices. EvoIoT is the first high-performance model that systematically addresses the IoT and non-IoT device classification problem. We evaluate EvoIoT on two public datasets and a private dataset collected from a laboratory setting. The evaluation results show that EvoIoT is on average 6.27%∼48.89% more accurate than state-of-the-art methods.
IEEE Netw.
Enabling Application-aware Traffic Engineering in IPv6 Networks
Lin He, Shicheng Wang, Yichi Xu, Peng Kuang, Jiamin Cao, Ying Liu, Xing Li, and Shuping Peng
The Internet hosts numerous applications with different requirements for network delay, bandwidth, jitter, packet loss, and so on. However, in the TCP/IP network architecture, the network and application layers are decoupled, which means that the network does not have a fine-grained understanding of the application requirements. Therefore, it is not easy to provide truly fine-grained traffic operations for applications and guarantee their corresponding service level agreement requirements. In this article, we propose ATE6, which is an application-aware traffic engineering (TE) framework. For the control plane, we define a request language that can be used by applications to express their requirements, which allows networks to associate IPv6 addresses with the communication requirements of applications. We also devise an efficient path selection algorithm that allows the network operator to deploy an optimal TE path for an application. For the data plane, ATE6 uses segment routing over the IPv6 data plane to enforce and control network paths based on network policies. We implemented a prototype, and evaluation results show that ATE6 is a flexible, privacy-preserving, lightweight, and cost-efficient application-aware TE framework.
2021
TON
PAVI: Bootstrapping Accountability and Privacy to IPv6 Internet
Accountability and privacy are considered valuable but conflicting properties in the Internet, which at present does not provide native support for either. Past efforts to balance accountability and privacy in the Internet have unsatisfactory deployability due to the introduction of new communication identifiers, and because of large-scale modifications to fully deployed infrastructures and protocols. IPv6 is being deployed around the world, and this trend will accelerate. In this paper, we propose a private and accountable design based on IPv6 called PAVI that seeks to bootstrap accountability and privacy to the IPv6 Internet without introducing new communication identifiers or large-scale modifications to the deployed base. A dedicated quantitative analysis shows that the proposed PAVI achieves satisfactory levels of accountability and privacy. The results of the evaluation of a PAVI prototype show that it incurs little performance overhead and is widely deployable.
CN
Towards Securing Duplicate Address Detection using P4
Duplicate Address Detection (DAD) is one of the functions of the Neighbor Discovery Protocol (NDP), which determines whether the IPv6 address of a node conflicts with those of other nodes. However, due to the lack of verification of NDP messages, DAD is vulnerable to Denial of Service (DoS) attacks. Existing solutions suffer from high complexity and low security, need to modify the NDP, or have a single point of failure, which renders them infeasible to be deployed.
To solve the above problems, we propose P4DAD, which is a secure DAD mechanism based on P4. By creating and maintaining a binding entry between an IPv6 address and a link-layer property of a host’s network attachment, P4DAD can filter spoofed NDP messages in an in-network manner to prevent DoS attacks on DAD without modification to the NDP or host stack. We implement a prototype of P4DAD and evaluate it in terms of functionality, performance, and scalability. Evaluation results show that P4DAD can prevent DoS attacks on DAD successfully with negligible overhead and has satisfactory scalability.
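The binding idea can be sketched in a few lines of Python. This is a simplified illustration rather than the P4DAD data plane, which is written in P4: the choice of (switch port, source MAC) as the link-layer property, the first-come binding rule, and the class and method names are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical link-layer property of a host's network attachment:
# (ingress switch port, source MAC address).
Anchor = Tuple[int, str]


class BindingTable:
    """Toy model of address-to-attachment bindings learned from DAD."""

    def __init__(self) -> None:
        # Keys are IPv6 address strings exactly as passed in.
        self._bindings: Dict[str, Anchor] = {}

    def on_dad_solicitation(self, target: str, anchor: Anchor) -> None:
        # Assumed rule: the first attachment point that probes an
        # unbound address becomes its owner; later probes do not
        # overwrite the existing binding.
        self._bindings.setdefault(target, anchor)

    def allow_advertisement(self, target: str, anchor: Anchor) -> bool:
        # A Neighbor Advertisement claiming `target` is forwarded only
        # if it arrives from the attachment point bound to that address.
        return self._bindings.get(target) == anchor


table = BindingTable()
host = (1, "aa:bb:cc:00:00:01")
attacker = (2, "aa:bb:cc:00:00:02")

table.on_dad_solicitation("2001:db8::10", host)
print(table.allow_advertisement("2001:db8::10", host))
print(table.allow_advertisement("2001:db8::10", attacker))
```

Under these assumptions, an advertisement from the attacker's port that claims the host's address does not match the stored binding and would be dropped, so it cannot make the host's DAD fail.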
2018
SCIS
GAGMS: A Requirement-driven General Address Generation and Management System
IPv6 address generation is closely related to the manageability, security, privacy protection, and traceability of the Internet. There are many kinds of IPv6 address generation and configuration methods in the area of Internet standards and research that may cause certain problems, including the mixed operation problem of multiple IPv6 address generation schemes, the synchronization problem of the change in IPv6 address, the efficiency problem of processing large-scale concurrent IPv6 address requests, and the general model problem for mapping IPv6 addresses to other requirement spaces as identifiers. In this paper, we consider generating and managing IPv6 addresses according to network requirements. After conducting a requirement analysis of most proposed address generation schemes, we propose a general address generation model and a general address management system, which are the cores of the general address generation and management system (GAGMS). This system solves the above problems under the premise of maintaining the diversity and flexibility of the existing IPv6 address generation and configuration methods and allows networks to utilize different address generation schemes according to different requirements in different scenarios. Finally, we design a prototype system and evaluate our GAGMS to demonstrate its effectiveness, manageability, and scalability, and we have conducted trial deployment in campus networks and are trying to standardize this work in IETF.
Today’s Internet is vulnerable to numerous attacks, including source spoofing, distributed denial of service, prefix hijacking, and route forgery. Network-layer accountability is considered as an effective deterrence tool which can be used to address these attacks. Much research has been devoted to improving network-layer accountability of today’s Internet. In this paper, we first investigate the state-of-the-art network-layer accountability research and summarize a general definition of network-layer accountability. Next, we propose a network-layer accountability framework and present a taxonomy of network-layer accountability protocols according to accountability granularity. Furthermore, we compare these protocols and discuss their pros and cons mainly from accountability function, deployability, and security. Finally, some open research questions are emphasized for directing future designs.
RISP: An RPKI-based inter-AS Source Protection Mechanism
IP source address spoofing is regarded as one of the most prevalent components when launching an anonymous invasion, especially a Distributed Denial-of-Service (DDoS) attack. Although Source Address Validations (SAVs) at the access network level are standardized by the Internet Engineering Task Force (IETF), SAV at the inter-Autonomous System (AS) level still remains an important issue. To prevent routing hijacking, the IETF is constructing a Resource Public Key Infrastructure (RPKI) as a united trust anchor to secure interdomain routing. In this study, we creatively use the RPKI to support inter-AS SAV and propose an RPKI-based Inter-AS Source Protection (RISP) mechanism. According to the trust basis provided by the RPKI, RISP offers ASes a more credible source-oriented protection for the IP addresses they own and remains independent of the RPKI. Based on the experiments with real Internet topology, RISP not only provides better incentives, but also improves efficacy and economizes bandwidth with a modest resource consumption.
2015
SCIS
Building an IPv6 Address Generation and Traceback System with NIDTGA in Address Driven Network
In the design and construction of the Next Generation Internet, it is important to accurately identify the source of each forwarded IP packet, especially to support precise fine-grained management, control, and traceability and to improve the trustworthiness of the Internet. This paper designs a scalable Network Identity (NID) scheme for Internet users, proposes NIDTGA (Network Identity and Time Generated Address), an IPv6 address generation algorithm that embeds NID and time information, and then designs and implements an IPv6 address generation and traceback system based on NIDTGA. The design of NIDTGA, which reflects the length, time, and owner attributes of the IP address, provides good support for ADN (Address Driven Network). At the same time, by embedding the key elements of user identity and time in the IPv6 address, and by taking into account both traceability and privacy, NIDTGA can provide a technical basis for establishing a network trust mechanism and achieve the traceability of security events.
Books
2023
PTP
Cyberspace Mapping—Principles, Techniques and Applications