This is the virtual conference schedule for PAM 2021. All times are in Central European Summer Time (CEST). (To ease dealing with time zones, view the sessions in a Google Calendar or subscribe to the ICS calendar file.)
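For readers who prefer to convert the listed CEST times by hand, a minimal Python sketch follows. It is only an illustration under stated assumptions: the `Europe/Berlin` zone stands in for CEST, the date `2021-03-29` is a placeholder that this schedule does not state, and the helper name `to_local` is hypothetical.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumption: "Europe/Berlin" is used as a stand-in for CEST; any IANA zone
# that observes Central European Summer Time would behave the same way.
SCHEDULE_TZ = ZoneInfo("Europe/Berlin")


def to_local(day: str, hhmm: str, target_tz: str) -> datetime:
    """Convert a schedule time given in CEST to the target IANA time zone."""
    naive = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=SCHEDULE_TZ).astimezone(ZoneInfo(target_tz))


# Placeholder date (an assumption, not taken from the schedule) combined with
# the 14:00 start time of the introduction session listed below.
start = to_local("2021-03-29", "14:00", "America/New_York")
print(start.strftime("%Y-%m-%d %H:%M %Z"))
```

Replace the date and the target zone with your own values. The conversion relies on the time zone database available on your system.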

For discussion and Q&A, please join the #pam2021- channels shown below in the SIGCOMM Slack.

Note: the WebEx link has changed on Tuesday. Check your email for the new link.


  • 14:00 - 14:30 - Introduction, Overview, and Awards (Slack Channel: #pam2021-general )
  • #pam2021-keynote1)
    • Abstract: The Internet is an interconnection of independent networks known as Autonomous Systems (ASes). Given that ASes are built on top of hardware and software operated by humans, the Internet is subject to some limitations. For example, humans are error-prone and sometimes make arbitrary decisions, and enterprises are generally greedy from a revenue point of view. Finally, hardware and circuits may fail, requiring maintenance or replacement. All these factors may lead the Internet to have broken pieces, i.e., malfunctioning components, networks facing limitations, and even selfish networks prioritizing their own revenue over the better performance of the Internet. Much of my current work is on measuring the Internet to understand its vulnerabilities. In this talk, I’ll focus on two hidden broken pieces of the Internet. First, I’ll concentrate on the Border Gateway Protocol (BGP), the routing protocol used on the Internet, and study whether ASes carry on BGP lies, where the control plane and the data plane differ. After applying a sequence of filters to remove different artifacts, we find cases where the paths indeed mismatch. One cause for such discrepancy is the presence of detours. We then study how traffic flows inside ASes and focus on the detection of forwarding detours. In case of detours, the forwarding routes do not match the best available routes according to the interior gateway protocol (IGP) in use. We reveal such forwarding detours in multiple ASes.
  • 15:30 - 16:00 - Break
  • #pam2021-session1-covid19)
    • Constantin Sander, Ike Kunze, Klaus Wehrle, and Jan Rüth (RWTH Aachen University)
      Abstract: Congestion control is essential for the stability of the Internet, and the corresponding algorithms are commonly evaluated for interoperability based on flow-rate fairness. In contrast, video conferencing software such as Zoom uses custom congestion control algorithms whose fairness behavior is mostly unknown. To make matters worse, video conferencing has recently seen a drastic increase in use -- partly caused by the COVID-19 pandemic -- and could hence negatively affect how available Internet resources are shared. In this paper, we thus investigate the flow-rate fairness of video conferencing congestion control, using Zoom as an example, and the influence of deploying AQM. We find that Zoom is slow to react to bandwidth changes and uses two to three times the bandwidth of TCP in low-bandwidth scenarios. Moreover, even when Zoom competes with delay-aware congestion control such as BBR, we see high queuing delays. AQM reduces these queuing delays and can equalize the bandwidth use when used with flow-queuing. However, it then introduces high packet loss for Zoom, leaving open the question of how delay and loss affect Zoom's QoE. We hence present a preliminary user study in the appendix, which indicates that the QoE is at least not improved and should be studied further.
    • Shinan Liu (University of Chicago), Paul Schmitt (Princeton University), Francesco Bronzino (Université Savoie Mont Blanc), Nick Feamster (University of Chicago)
      Abstract: The COVID-19 pandemic has resulted in dramatic changes to the daily habits of billions of people. Users increasingly have to rely on home broadband Internet access for work, education, and other activities. These changes have resulted in corresponding changes to Internet traffic patterns. This paper aims to characterize the effects of these changes with respect to Internet service providers in the United States. We study three questions: (1) How did traffic demands change in the United States as a result of the COVID-19 pandemic?; (2) What effects have these changes had on Internet performance?; (3) How did service providers respond to these changes? We study these questions using data from a diverse collection of sources. Our analysis of interconnection data for two large ISPs in the United States shows a 30–60% increase in peak traffic rates in the first quarter of 2020. In particular, we observe that downstream peak traffic volumes for a major ISP increased by 13–20%, while upstream peaks increased by more than 30%. Further, we observe significant variation in performance across ISPs in conjunction with the traffic volume shifts, with evident latency increases after stay-at-home orders were issued, followed by a stabilization of traffic after April. Finally, we observe that in response to changes in usage, ISPs have aggressively augmented capacity at interconnects, at more than twice the rate of normal capacity augmentation. Similarly, video conferencing applications have increased their network footprint, more than doubling their advertised IP address space.
    • Ryo Kawaoka (Waseda University), Daiki Chiba, Takuya Watanabe, and Mitsuaki Akiyama (NTT Secure Platform Laboratories), Tatsuya Mori (Waseda University)
      Abstract: This work takes a first look at domain names related to COVID-19 (Cov19doms for short), using a large-scale registered Internet domain name database, which comprises 260M distinct domain names registered under 1.6K distinct top-level domains. We extracted 167K Cov19doms that were registered between the end of December 2019 and the end of September 2020. We attempt to answer the following research questions through our measurement study: RQ1: Is the number of Cov19doms registrations correlated with the COVID-19 outbreaks? RQ2: For what purpose do people register Cov19doms? Our chief findings are as follows: (1) Similar to the global COVID-19 pandemic observed around April 2020, the number of Cov19doms registrations also experienced drastic growth, which, interestingly, preceded the COVID-19 pandemic by about a month; (2) 70% of active Cov19doms websites with visible content provided useful information such as health, tools, or product sales related to COVID-19; and (3) a non-negligible number of registered Cov19doms were used for malicious purposes. These findings imply that it has become more challenging to distinguish domain names registered for legitimate purposes from others and that it is crucial to pay close attention to how Cov19doms will be used/misused in the future.
  • #pam2021-session2-tls)
    • Olamide Omolola (University of Vienna), Richard Roberts (The University of Maryland), Ishtiaq Ashiq and Taejoong Chung (Virginia Tech), Dave Levin (The University of Maryland), Alan Mislove (Northeastern University)
      Abstract: The Transport Layer Security (TLS) Public Key Infrastructure (PKI) is essential to the security and privacy of users on the internet. Despite its importance, prior work from the mid-2010s has shown that mismanagement of the TLS PKI often led to weakened security guarantees, such as compromised certificates going unrevoked and many internet devices generating self-signed certificates. Many of these problems can be traced to manual processes that were the only option at the time. However, in the intervening years, the TLS PKI has undergone several changes: once-expensive TLS certificates are now freely available, and they can be obtained and reissued via automated programs. In this paper, we examine whether these changes to the TLS PKI have led to improvements in the PKI’s management. We collect data on all certificates issued by Let’s Encrypt (now the largest certificate authority by far) over the past four years. Our analysis focuses on two key questions: First, are administrators making proper use of the automation that modern CAs provide for certificate reissuance? We find that a surprising fraction (40%) of domains do not reissue their certificate on a predictable schedule, indicating a lack of use of automated tools. Second, do administrators that use automated CAs react to large-scale compromises more responsibly? To answer this, we use a recent Let’s Encrypt mis-issuance bug as a natural experiment, and find that a significantly larger fraction of administrators reissued their certificates in a timely fashion compared to previous bugs. Overall, our results demonstrate that when given the right tools, administrators can perform their duties and improve security for all internet users.
    • Nikita Korzhitskii and Niklas Carlsson (Linköping University)
      Abstract: The modern Internet is highly dependent on the trust communicated via X.509 certificates. However, in some cases certificates become untrusted and it is necessary to revoke them. In practice, the problem of secure certificate revocation has not yet been solved, and today no revocation procedure (similar to Certificate Transparency w.r.t. certificate issuance) has been adopted to provide transparent and immutable history of all revocations. Instead, the status of most certificates can only be checked with Online Certificate Status Protocol (OCSP) and/or Certificate Revocation Lists (CRLs). In this paper, we present the first longitudinal characterization of the revocation statuses delivered by CRLs and OCSP servers from the time of certificate expiration to status disappearance. The analysis captures the status history of over 1 million revoked certificates, including 773K certificates mass-revoked by Let's Encrypt. Our characterization provides a new perspective on the Internet's revocation rates, quantifies how short-lived the revocation statuses are, highlights differences in revocation practices within and between different CAs, and captures biases and oddities in the handling of revoked certificates. Combined, the findings motivate the development and adoption of a revocation transparency standard.
    • Trinh Viet Doan, Irina Tsareva, and Vaibhav Bajpai (Technical University of Munich)
      Abstract: The Domain Name System (DNS) is a cornerstone of communication on the Internet. DNS over TLS (DoT) was standardized in 2016 as an extension to the DNS protocol; however, its performance has not yet been extensively studied. In the first study that measures DoT from the edge, we leverage 3.2k RIPE Atlas probes deployed in home networks to assess the adoption, reliability, and response times of DoT in comparison with DNS over UDP/53 (Do53). Each probe issues 200 domain name lookups to 15 public resolvers, five of which support DoT, and to the probes’ local resolvers over a period of one week, resulting in 90M DNS measurements in total. We find that the support for DoT among open resolvers has increased by 23.1% after nine months in comparison with previous studies. However, we observe that DoT is still only supported by local resolvers for 0.4% of the RIPE Atlas probes. In terms of reliability, we find failure rates for DoT to be inflated by 0.4–32.2 percentage points when compared to Do53. While Do53 failure rates for most resolvers individually are consistent across continents, DoT failure rates have much higher variation. As for response times, we see high regional differences for DoT and find that nearly all DoT requests take at least 100 ms to return a response (in large part due to connection and session establishment), showing an inflation in response times of more than 100 ms compared to Do53. Despite the low adoption of DoT among local resolvers, they achieve DoT response times of around 140–150 ms, similar to public resolvers (130–230 ms), although local resolvers also exhibit higher failure rates in comparison.
  • 17:45 - 19:00 - Break / Virtual Coffee Break - join the breakout rooms via WebEx
  • #pam2021-session3-videostreaming)
    • Sina Keshvadi and Carey Williamson (University of Calgary)
      Abstract: Live streaming is one of the most popular Internet activities. Nowadays, there has been an increase in free live streaming (FLS) services that provide unauthorized broadcasting of live events, attracting millions of viewers. These opportunistic providers often have modest network infrastructures, and monetize their services through advertising and data analytics, which raises concerns about the performance, quality of experience, and user privacy when using these services. In this paper, we measure and analyze the behaviour of 20 FLS sports sites on Android smartphones, focusing on packet-level, video player, and privacy aspects. In addition, we compare FLS services with two legitimate online sports networks. Our measurement results show that FLS sites suffer from scalability issues during highly-popular events, deliver lower QoE than legitimate providers, and often use obscure and/or suspicious tracking services. Caution is thus advised when using FLS services.
    • Ishani Sarkar (EasyBroadcast Nantes, France), Guillaume Urvoy-Keller (Université Côte d'Azur, I3S, France), Soufiane Rouibia (EasyBroadcast Nantes, France), Dino Lopez-Pacheco (Université Côte d'Azur, I3S, France)
      Abstract: Video live streaming now represents over 15% of Internet traffic. Typical distribution architectures for this type of service rely heavily on content distribution networks (CDNs), which make it possible to meet the stringent QoS requirements of live video applications. However, such a CDN-based solution is costly to operate, and a number of solutions that complement CDN servers with WebRTC have emerged. WebRTC enables direct communication between browsers (viewers). The key idea is to enable viewer-to-viewer (V2V) video chunk exchanges as far as possible and to revert to the CDN servers only if the video chunk has not been received before the timeout. In this work, we present the study we performed on an operational hybrid live video system that operates over XX channels worldwide. Relying on the per-exchange statistics that the platform collects for each viewer interaction, we first present a high-level overview of the performance of such a system in the wild. As the key performance indicator is the fraction of V2V traffic of the system, we next focus our attention on this metric. We demonstrate that the overall performance is driven by a small fraction of users, presumably featuring the best network access. By further profiling individual clients' upload and download performance, we demonstrate that the clients responsible for the chunk losses, i.e., chunks that are sent to a requesting viewer but were not fully uploaded before the deadline, have poor uplink access. We devised a workaround strategy, where each client evaluates its uplink capacity and refrains from sending to other clients if its past performance is too low. We assess the effectiveness of the approach on the Grid5000 testbed and present live results that confirm the good results achieved in a controlled environment. We are indeed able to achieve a good trade-off between the reduction of chunk loss rate and the reduction of the overall V2V traffic.
    • Vivek Adarsh, Michael Nekrasov, Udit Paul, Alex Ermakov, and Arpit Gupta (University of California, Santa Barbara), Morgan Vigil-Hayes (Northern Arizona University), Ellen Zegura (Georgia Institute of Technology), Elizabeth Belding (University of California, Santa Barbara)
      Abstract: The explosion of mobile broadband as an essential means of Internet connectivity has made the scalable evaluation and inference of quality of experience (QoE) for applications delivered over LTE networks critical. However, direct QoE measurement can be time and resource intensive. Further, the wireless nature of LTE networks necessitates that QoE be evaluated in multiple locations per base station as factors such as signal availability may have significant spatial variation. Based on our observations that quality of service (QoS) metrics are less time and resource-intensive to collect, we investigate how QoS can be used to infer QoE in LTE networks. Using an extensive, novel dataset representing a variety of network conditions, we design several state-of-the-art predictive models for scalable video QoE inference. We demonstrate that our models can accurately predict rebuffering events and resolution switching more than 80% of the time, despite the dataset exhibiting vastly different QoS and QoE profiles for the location types. We also illustrate that our classifiers have a high degree of generalizability across multiple videos from a vast array of genres. Finally, we highlight the importance of low-cost QoS measurements such as reference signal received power (RSRP) and throughput in QoE inference through an ablation study.
  • #pam2021-session4-stayingconnected)
    • Lorenzo Ariemma and Simone Liotta (Roma Tre University), Massimo Candela (University of Pisa), Giuseppe Di Battista (Roma Tre University)
      Abstract: The Border Gateway Protocol (BGP) is the protocol that makes the various networks composing the Internet communicate with each other. Routers speaking BGP exchange updates to keep the routing up-to-date and allow such communication. This usually is done to reflect changes in the routing configurations or as a consequence of link failures. In the Internet as a whole it is normal that BGP updates are continuously exchanged, but for any specific IP prefix, these updates are supposed to be concentrated in a short time interval that is needed to react to a network change. On the contrary, in this paper we show that there are many IP prefixes involved in quite long sequences consisting of a large number of BGP updates. Namely, examining ~30 billion updates collected by 172 observation points distributed worldwide, we estimate that almost 30% of them belong to sequences lasting more than one week. Such sequences involve 222,285 distinct IP prefixes, approximately one fourth of the number of announced prefixes. We detect such sequences using a method based on the Discrete Wavelet Transform. We publish an online tool for the exploration and visualization of such sequences, which is open to the scientific community for further research. We empirically validate the sequences and report the results in the same online resource. The analysis of the sequences shows that almost all the observation points are able to see a large number of sequences, and that 53% of the sequences last at least two weeks.
    • Juno Mayer, Valerie Sahakian, Emilie Hooft, Douglas Toomey, and Ramakrishnan Durairajan (University of Oregon)
      Abstract: The U.S. Pacific Northwest (PNW) is one of the largest Internet infrastructure hubs for several cloud and content providers, research networks, colocation facilities, and submarine cable deployments. Yet, this region is within the Cascadia Subduction Zone and currently lacks a quantitative understanding of the resilience of the Internet infrastructure due to seismic forces. The main goal of this work is to assess the resilience of critical Internet infrastructure in the PNW to shaking from earthquakes. To this end, we have developed a framework called \tool to understand the levels of risk that earthquake-induced shaking poses to wired and wireless infrastructures in the PNW. We take a probabilistic approach to categorize the infrastructures into risk groups based on historical and predictive peak ground acceleration (PGA) data and estimate the extent of shaking-induced damages to Internet infrastructures. Our assessments show the following in the next 50 years: ~65% of the fiber links and cell towers are susceptible to a very strong to violent earthquake; the infrastructures in Seattle-Tacoma-Bellevue and Portland-Vancouver-Hillsboro metropolitan areas have a 10% chance to incur a very strong to a severe earthquake. To mitigate the damages, we have designed a route planner capability in \tool. Using this capability, we show that a dramatic reduction of PGA is possible with a moderate increase in latencies.
  • #pam2021-keynote2)
    • Krishna Gummadi (Max Planck Institute for Software Systems (MPI-SWS))
      Abstract: Over the past two decades, the Internet has enabled (and continues to enable) numerous disruptive socio-technical systems like BitTorrent, Facebook, Amazon, and Bitcoin that have transformed media landscape, personal and corporate communications, trade, and monetary systems. The scale and societal impact of these Internet systems raise fundamental questions about their transparency and potential for unfairness and bias against some of their users. Understanding these threats requires us to define measures and develop methods to quantify unfairness and bias, often via black-box auditing of opaque systems. In this talk, I will discuss some of our attempts to measure bias and unfairness in traffic shaping in the Internet, targeted ads on social media, product recommendations on e-commerce platforms, and transaction prioritization in blockchains. I will also touch upon the challenges with designing fair and unbiased socio-technical systems, while maintaining their innovative potential.
  • #pam2021-session5-websecurity)
    • Yuwei Zeng, Xunxun Chen, and Tianning Zang (University of Chinese Academy of Sciences), Haiwei Tsang (Jilin University)
      Abstract: An increasing number of adversaries tend to cover up their malicious sites by leveraging the elaborate redirection chains. Prior works mostly focused on the specific attacks that users suffered, and seldom considered how users were exposed to such attacks. In this paper, we conduct a comprehensive measurement study on the malicious redirections that leverage squatting domain names as the start point. To this end, we collected 101,186 resolved squatting domain names that targeted 2,302 top brands from the ISP-level DNS traffic. After dynamically crawling these squatting domain names, we pioneered the application of performance log to mine the redirection chains they involved. Afterward, we analyzed the nodes that acted as intermediaries in malicious redirections and found that adversaries preferred to conduct URL redirection via imported JavaScript codes and iframes. Our further investigation indicates that such intermediaries have obvious aggregation, both in the domain name and the Internet infrastructure supporting them.
    • Nurullah Demir and Tobias Urban (Institute for Internet Security - Westphalian University of Applied Sciences), Kevin Wittek (Institute for Internet Security - Westphalian University of Applied Sciences / RWTH Aachen University), Norbert Pohlmann (Institute for Internet Security - Westphalian University of Applied Sciences)
      Abstract: Software updates take an essential role in keeping IT environments secure. If service providers delay or do not install updates, it can cause unwanted security implications for their environments. This paper conducts a large-scale measurement study of the update behavior of websites and their utilized software stacks. Across 18 months, we analyze over 5.6M websites and 246 distinct client- and server-side software distributions. We found that almost all analyzed sites use outdated software. To understand the possible security implications of outdated software, we analyze the potential vulnerabilities that affect the utilized software. We show that software components are getting older and more vulnerable because they are not updated. We find that 95% of the analyzed websites use at least one product for which a vulnerability existed.
    • Vector Guo Li and Gautam Akiwate (University of California San Diego), Kirill Levchenko (University of Illinois at Urbana-Champaign), Geoffrey M. Voelker and Stefan Savage (University of California San Diego)
      Abstract: One of the staples of network defense is blocking traffic to and from a list of "known bad" sites on the Internet. However, few organizations are in a position to produce such a list themselves, so pragmatically this approach depends on the existence of third-party "threat intelligence" providers who specialize in distributing feeds of unwelcome IP addresses. However, the choice to use such a strategy, let alone which data feeds are trusted for this purpose, is rarely made public and thus little is understood about the deployment of these techniques in the wild. To explore this issue, we have designed and implemented a technique to infer proactive traffic blocking on a remote host and, through a series of measurements, to associate that blocking with the use of particular IP blocklists. In a pilot study of 220K US hosts, we find as many as one fourth of the hosts appear to blocklist based on some source of threat intelligence data, and about 2% use one of the 9 particular third-party blocklists that we evaluated.
  • #pam2021-session6-dos)
    • Tiago Heinrich (UFPR), Rafael Rodrigues Obelheiro (UDESC), Carlos Alberto Maziero (UFPR)
      Abstract: Distributed reflection denial of service (DRDoS) attacks are widespread on the Internet. DRDoS attacks exploit mostly UDP-based protocols to achieve traffic amplification and provide an extra layer of indirection between attackers and their victims, and a single attack can reach hundreds of Gbps. Recent trends in DRDoS include multiprotocol amplification attacks, which exploit several protocols at the same time, and carpet bombing attacks, which target multiple IP addresses in the same subnet instead of a single address, in order to evade detection. These kinds of attacks have been reported in the wild, but have not been discussed in the scientific literature so far. This paper describes the first research on the characterization of both multiprotocol and carpet bombing DRDoS attacks. We developed MP-H, a honeypot that implements nine different protocols commonly used in DRDoS attacks, and used it for data collection. Over a period of 731 days, our honeypot received 1.8 TB of traffic, containing nearly 20.7 billion requests, and was involved in more than 1.4 million DRDoS attacks, including over 13.7 thousand multiprotocol attacks. We describe several features of multiprotocol attacks and compare them to monoprotocol attacks that occurred in the same period, and characterize the carpet bombing attacks seen by our honeypot.
    • Daniel Kopp (DE-CIX), Christoph Dietzel (MPI / DE-CIX), Oliver Hohlfeld (Brandenburg University of Technology)
      Abstract: DDoS attacks remain a major security threat to the continuous operation of Internet edge infrastructures, web services, and cloud platforms. While a large body of research focuses on DDoS detection and protection, to date we have ultimately failed to eradicate DDoS altogether. Moreover, the landscape of DDoS attack mechanisms keeps evolving, demanding an updated perspective on DDoS attacks in the wild. In this paper, we identify up to 2608 DDoS amplification attacks in a single day by analyzing multiple Tbps of traffic flows at a major IXP with a rich ecosystem of different networks. We observe the prevalence of well-known amplification attack protocols (e.g., NTP, CLDAP), which should no longer exist given the established mitigation strategies. Nevertheless, they account for the largest fraction of DDoS amplification attacks within our observation, and we witness the emergence of DDoS attacks using recently discovered amplification protocols (e.g., OpenVPN, ARMS, Ubiquity Discovery Protocol). By analyzing the impact of DDoS on core Internet infrastructure, we show that DDoS can overload backbone capacity and that filtering approaches in prior work omit 97% of the attack traffic.
    • Jacob Davis (Sandia National Laboratories), Casey Deccio (Brigham Young University)
      Abstract: The Domain Name System (DNS) has been frequently abused for Distributed Denial of Service (DDoS) attacks and cache poisoning because it relies on the User Datagram Protocol (UDP). Since UDP is connection-less, it is trivial for an attacker to spoof the source of a DNS query or response. DNS Cookies, a protocol standardized in 2016, add pseudo-random values to DNS packets to provide identity management and prevent spoofing attacks. In this paper, we present the first study measuring the deployment of DNS Cookies in nearly all aspects of the DNS architecture. We also provide an analysis of the current benefits of DNS Cookies and the next steps for stricter deployment. Our findings show that cookie use is limited to less than 30% of servers and 10% of recursive clients. We also find several configuration issues that could lead to substantial problems if cookies were strictly required. Overall, DNS Cookies provide limited benefit in a majority of situations, and, given current deployment, do not prevent DDoS or cache poisoning attacks.
  • 17:45 - 19:00 - Break / Virtual Coffee Break - join the breakout rooms via WebEx
  • #pam2021-session7-performance)
    • Georgios P. Katsikas, Tom Barbette, Marco Chiesa, Dejan Kostic, and Gerald Q. Maguire Jr. (KTH Royal Institute of Technology)
      Abstract: Network interface cards (NICs) are fundamental components of modern high-speed networked systems, supporting multi-100 Gbps speeds and increasing programmability. Offloading computation from a server’s CPU to a NIC frees a substantial amount of the server’s CPU resources, making NICs key to offering competitive cloud services. Therefore, understanding the performance benefits and limitations of offloading a networking application to a NIC is of paramount importance. In this paper, we measure the performance of four different NICs from one of the largest NIC vendors worldwide, supporting 100 Gbps and 200 Gbps. We show that while today’s NICs can easily support multi-hundred-gigabit throughputs, performing frequent update operations of a NIC’s packet classifier (as network address translators (NATs) and load balancers would do for each incoming connection) results in a dramatic throughput reduction of up to 70 Gbps or complete denial of service. Our conclusion is that none of the tested NICs can support high-speed networking applications that require keeping track of a large number of frequently arriving incoming connections. Furthermore, we show a variety of counter-intuitive performance artefacts, including the performance impact of using multiple tables to classify flows of packets.
    • Yimeng Zhao (Facebook), Ahmed Saeed (MIT CSAIL), Ellen Zegura and Mostafa Ammar (Georgia Institute of Technology)
      Abstract: To keep up with demand, servers will scale up to handle hundreds of thousands of clients simultaneously. Much of the focus of the community has been on scaling servers in terms of aggregate traffic intensity (packets transmitted per second). However, bottlenecks caused by the increasing number of concurrent clients, resulting in a large number of concurrent flows, have received little attention. In this work, we focus on identifying such bottlenecks. In particular, we define two broad categories of problems; namely, admitting more packets into the network stack than can be handled efficiently, and increasing per-packet overhead within the stack. We show that these problems contribute to high CPU usage and network performance degradation in terms of aggregate throughput and RTT. Our measurement and analysis are performed in the context of the Linux networking stack, the most widely used publicly available networking stack. Further, we discuss the relevance of our findings to other network stacks. The goal of our work is to highlight considerations required in the design of future networking stacks to enable efficient handling of large numbers of clients and flows.
    • Prathy Raman and Marcel Flores (Verizon Media Platform)
      Abstract: Maintaining a performant Internet service is simplified when operators are able to develop an understanding of the path between the service and its end users. A key piece of operational knowledge comes from understanding (i) when segments of a path contribute a significant portion of the path's delay and (ii) when these segments occur across end users. We propose hoplets, an abstraction for describing delay increases between an end user and a content provider built on traceroutes. We present a mechanism for measuring and comparing hoplets to determine when they describe the same underlying network features. Using this mechanism, we construct a methodology to enable wide-scale measurement that requires only limited contextual data. We demonstrate the efficacy of hoplets, showing their ability to effectively describe round-trip-time increases observed from a global content delivery network. Additionally, we perform an Internet-scale measurement and analysis of the hoplets observed from this infrastructure, exploring their nature and topological features, where we find that nearly 20% of bottlenecks occurred along paths with no visible alternative. Finally, we demonstrate the generality of the system by detecting a likely network misconfiguration using data from RIPE Atlas.
  • #pam2021-session8-security)
    • Karl Olson, Jack Wampler, Fan Shen, and Nolen Scaife (University of Colorado Boulder)
      Abstract: Customer edge routers are the primary mode of connection to the Internet for a large portion of non-commercial users. As these consumer networks migrate from IPv4 to IPv6, stateful firewalls are needed to protect devices in the home. However, policy details crucial to the implementation of these inbound access controls are left to the discretion of the device manufacturers. In this paper, we survey ten customer edge routers to evaluate how manufacturers implement firewalls and user controls in IPv6. The result is a systemic, demonstrable failure among all parties to agree upon, implement, and communicate consistent security policies. We conclude with future research directions and recommendations for all parties to address these systemic failures and provide a consistent model for home security.
    • John Kristoff, Mohammad Ghasemisharif, Chris Kanich, and Jason Polakis (UIC)
      Abstract: IPv6 automatic transition mechanisms such as 6to4 and ISATAP endure on a surprising number of Internet hosts. These mechanisms lie in hibernation awaiting someone or something to rouse them awake. In this paper we measure the prevalence and persistence of legacy IPv6 automatic transition mechanisms, together with an evaluation of the potential threat they pose. We begin with a series of DNS-based experiments and analyses including the registration of available domain names, and demonstrate how attackers can conduct man-in-the-middle attacks against all IPv6 traffic for a significant number of end systems. To validate another form of traffic hijacking, we then announce a control set of special-purpose IPv6 prefixes, that cannot be protected by the RPKI, to see these routes go undetected, accepted, and installed in the BGP tables of over 30 other upstream networks. Finally, we survey the Internet IPv4 address space to discover over 1.5 million addresses are open IPv6 tunnel relays in the wild that can be abused to facilitate a variety of unwanted activity such as IPv6 address spoofing attacks. We demonstrate how many attacks can be conducted remotely, anonymously, and without warning by adversaries. Behind the scenes our responsible disclosure has spearheaded network vendor software updates, ISP remediation efforts, and the deployment of new security threat monitoring services.
    • Pegah Torkamandi, Ljubica Kärkkäinen, and Jörg Ott (Technical University of Munich)
      Abstract: Initially envisioned to accelerate association of mobile devices in wireless networks, broadcasting of Wi-Fi probe requests has opened avenues for researchers and network practitioners to exploit information sent out in this type of frame for observing devices' digital footprints and for tracking them. One of the applications for this is crowd estimation. Noticing the privacy risks that this default mode of operation poses, device vendors have introduced MAC address randomization, a privacy-preserving technique by which mobile devices periodically generate random hardware addresses contained in probe requests. In this paper, we propose a method for estimating the number of wireless devices in the environment by means of analyzing Wi-Fi probe requests sent by those devices and in spite of MAC address randomization. Our solution extends previous work that uses Wi-Fi fingerprinting based on the timing information of probe requests. The only additional information we extract from probe requests is the MAC address, making our method minimally privacy-invasive. Our estimation method is also nearly real-time. We conduct several experiments to collect wireless measurements in different static environments and we use these measurements to validate our method. Through an extensive analysis and parameter tuning, we show the robustness of our method.
  • #pam2021-session9-hiddenbehavior)
    • Wenrui Ma (Shantou University), Haitao Xu (Zhejiang University)
      Abstract: Ad networks (e.g., Google Ads and Facebook Ads), advertisers, publishers (websites and mobile apps), and users are the main participants in the online advertising ecosystem. Ad networks dominate the advertising landscape in terms of determining how to pair advertisers with publishers and what ads are shown to a user. Previous works have studied the issues surrounding how ad networks tailor ads to a user (i.e., the ad targeting mechanisms) extensively and mainly from the perspective of users. However, little is known about how ad networks match advertisers with publishers. In this paper, we present a measurement study of how ad networks pair advertisers with publishers, as well as advertisers' preferences for ad networks, from the perspective of advertisers. To do this, we harvest a unique advertising-related dataset from a leading digital market intelligence platform. We conduct paired comparison analysis, i.e., analyzing advertisers and publishers in pairs, to examine whether they are significantly similar or dissimilar to each other. We also investigate whether advertisers in different categories have different preferences for ad networks, whether an advertiser partners with only one ad network for its ad campaign, and how much traffic its ad campaign could bring to its site. Specifically, we found that about a third of advertisers have their ads mostly displayed on publishers in the same category as themselves. In addition, most advertisers partner with multiple ad networks at the same time for their ad campaigns. We also found that the Adult, Romance & Relationships, and Gambling websites rely on advertising to attract visitors more than other advertiser categories. Our study produces insightful findings which provide advertisers more visibility into the complex advertising ecosystem so that they could make better decisions when launching ad campaigns.
    • Aniss Maghsoudlou, Oliver Gasser, and Anja Feldmann (Max Planck Institute for Informatics)
      Abstract: Internet services leverage transport protocol port numbers to specify the source and destination application layer protocols. While using port 0 is not allowed in most transport protocols, we see a non-negligible share of traffic using port 0 in the Internet. In this study, we dissect port 0 traffic to infer its possible origins and causes using five complementary flow-level and packet-level datasets. We observe 73 GB of port 0 traffic in one week of IXP traffic, most of which we identify as an artifact of packet fragmentation. In our packet-level datasets, most traffic originates from a small number of hosts, and while most of the packets have no payload, a major fraction of packets containing payload belong to the BitTorrent protocol. Moreover, we find unique traffic patterns commonly seen in scanning. In addition to analyzing passive traces, we also conduct an active measurement campaign to study how different networks react to port 0 traffic. We find an unexpectedly high response rate for TCP port 0 probes in IPv4, and very low response rates for other protocol types. Finally, we will be running continuous port 0 measurements and providing the results to the measurement community.
    • Alexander Marder and kc claffy (UCSD / CAIDA), Alex C. Snoeren (UCSD)
      Abstract: Public clouds fundamentally changed the Internet landscape, centralizing traffic generation in a handful of networks. Internet performance, robustness, and public policy analyses struggle to properly reflect this centralization, largely because public collections of BGP and traceroute reveal a small portion of cloud connectivity. This paper evaluates and improves our ability to infer cloud connectivity, bootstrapping future measurements and analyses that more accurately reflect the cloud-centric Internet. We also provide a technique for identifying the interconnections that clouds use to reach destinations around the world, allowing edge networks and enterprises to understand how clouds reach them via their public WAN. Finally, we present two techniques for geolocating the interconnections between cloud networks at the city level that can inform assessments of their resilience to link failures and help enterprises build multi-cloud applications and services.
    • Matthew R McNiece (North Carolina State University and Cisco Systems, Inc.), Ruidan Li (Cisco Systems, Inc.), Bradley Reaves (North Carolina State University)
      Abstract: Most desktop applications use the network, and insecure communications can have a significant impact on the application, the system, the user, and the enterprise. Understanding at scale whether desktop applications use the network securely is a challenge because the application provenance of a given network packet is rarely available at centralized collection points. In this paper, we collect flow data from 39,758 MacOS devices on an enterprise network to study the network behaviors of individual applications. We collect flows locally on-device and can definitively identify the application responsible for every flow. We also develop techniques to distinguish "endogenous" flows common to most executions of a program from "exogenous" flows likely caused by unique inputs. We find that popular MacOS applications are in fact using the network securely, with 95.62% of the applications we study using HTTPS. Notably, we observe that security-sensitive services (including certificate management and mobile device management) do not use ports associated with secure communications. Our study provides important insights for users, device and network administrators, and researchers interested in secure communication.
  • #pam2021-session10-capacity)
    • Rob Jansen and Aaron Johnson (U.S. Naval Research Laboratory)
      Abstract: The Tor network estimates its relays’ bandwidths using relay self-measurements of client traffic speeds. These estimates largely determine how existing traffic load is balanced across relays, and they are used to evaluate the network’s capacity to handle future traffic load increases. Thus, their accuracy is important to optimize Tor’s performance and strategize for growth. However, their accuracy has never been measured. We investigate the accuracy of Tor’s capacity estimation with an analysis of public network data and an active experiment run over the entire live network. Our results suggest that the bandwidth estimates underestimate the total network capacity by at least 50% and that the errors are larger for high-bandwidth and low-uptime relays. Our work suggests that improving Tor’s bandwidth measurement system could improve the network’s performance and better inform plans to handle future growth.
    • Saahil Claypool (WPI), Jae Chung (Viasat), Mark Claypool (WPI)
      Abstract: While Internet satellite bitrates have increased, latency can still degrade TCP performance. Realistic assessment of TCP over satellites is lacking, typically done by simulation or emulation only, if at all. This paper presents experiments comparing four TCP congestion control algorithms - BBR, Cubic, Hybla and PCC - on a commercial satellite network. Analysis shows similar steady-state bitrates for all, but with significant differences in start up throughputs and round-trip times caused by queuing of packets in flight. Power analysis combining throughput and latency shows that overall, PCC is the most powerful, due to relatively high throughputs and low, steady round-trip times, while for small downloads Hybla is the most powerful, due to fast throughput ramp-ups. BBR generally fares similarly to Cubic in all cases.
    • Shivang Aggarwal (Northeastern University), Zhaoning Kong (Purdue University), Moinak Ghoshal (Northeastern University), Y. Charlie Hu (Purdue University), Dimitrios Koutsonikolas (Northeastern University)
      Abstract: In the near future, high-quality VR and video streaming at 4K/8K resolutions will require Gigabit throughput to maintain a high user quality of experience (QoE). IEEE 802.11ad, which standardizes the 14 GHz of unlicensed spectrum around 60 GHz, is a prime candidate to fulfil these demands wirelessly. To maintain QoE, applications need to adapt to the ever-changing network conditions by performing quality adaptation. A key component of quality adaptation is throughput prediction. At 60 GHz, because of the much higher frequency, the throughput can vary sharply due to blockage and mobility. Hence, the problem of predicting throughput becomes quite challenging. In this paper, we perform an extensive measurement study of the predictability of the network throughput of an 802.11ad WLAN in downloading data to an 802.11ad-enabled mobile device under varying mobility patterns and orientations of the mobile device. We show that, with carefully designed neural networks, we can predict the throughput of the 60 GHz link with good accuracy at varying timescales, from 10 ms (suitable for VR) up to 2 s (suitable for ABR streaming). We further identify the most important features that affect the neural network prediction accuracy to be past throughput and MCS.
  • #pam2021-session11-dns)
    • Arian Akhavan Niaki (University of Massachusetts Amherst), William Marczak (University of California, Berkeley), Sahand Farhoodi (Boston University), Andrew McGregor and Phillipa Gill (University of Massachusetts Amherst), Nicholas Weaver (University of California, Berkeley)
      Abstract: DNS cache probing infers whether users of a DNS resolver have recently issued a query for a domain name, by determining whether the corresponding resource record (RR) is present in the resolver’s cache. The most common method involves performing DNS queries with the “recursion desired” (RD) flag set to zero, which resolvers typically answer from their caches alone. The answer’s TTL value is then used to infer when the resolver cached the RR, and thus when the domain was last queried. Previous work in this space assumes that DNS resolvers will respond to researchers’ queries. However, an increasingly common policy for resolvers is to ignore queries from outside their networks. In this paper, we demonstrate that many of these DNS resolvers can still be queried indirectly through open DNS forwarders in their network. We apply our technique to localize website filtering appliances sold by Netsweeper, Inc. and to track the global proliferation of stalkerware. We are able to discover Netsweeper devices in ASNs where OONI and Censys fail to detect them, and we observe a regionality effect in the usage of stalkerware apps across the world.
    • Austin Hounsel and Paul Schmitt (Princeton University), Kevin Borgolte (TU Delft), Nick Feamster (University of Chicago)
      Abstract: In this paper, we study the performance of encrypted DNS protocols and conventional DNS from thousands of home networks in the United States, over one month in 2020. We perform these measurements from the homes of 2,693 participating panelists in the Federal Communications Commission’s (FCC) Measuring Broadband America program. We found that clients do not have to trade DNS performance for privacy. For certain resolvers, DoT was able to perform faster than DNS in median response times, even as latency increased. We also found significant variation in DoH performance across recursive resolvers. Based on these results, we recommend that DNS clients (e.g., web browsers) periodically conduct simple latency and response time measurements to determine which protocol and resolver a client should use. No single DNS protocol or resolver performed best for all clients.
    • Giovane C. M. Moura (SIDN Labs), Moritz Müller (SIDN Labs/University of Twente), Marco Davids and Maarten Wullink (SIDN Labs), Cristian Hesselman (SIDN Labs/University of Twente)
      Abstract: The DNS provides one of the core services of the Internet, mapping applications and services to hosts. DNS employs both UDP and TCP as a transport protocol, and currently most DNS queries are sent over UDP. The problem with UDP is that large responses run the risk of not arriving at their destinations, which can ultimately lead to unreachability. However, it remains unclear how much of a problem these large DNS responses over UDP are in the wild. This is the focus of this paper: we analyze 114 billion query/response pairs from more than 43k autonomous systems, covering two months and a one-week period (2019 and 2020), collected at the authoritative servers of the AnonCcTLD, the country-code top-level domain of AnonEUCountry. We show that fragmentation, and the problems that can follow fragmentation, rarely occur at such authoritative servers. Further, we demonstrate that DNS built-in defenses (use of truncation, EDNS0 buffer sizes, reduced responses, and TCP fallback) are effective at reducing and addressing fragmentation. Last, we measure the uptake of the DNS flag day in 2020.
  • 18:45 - 19:00 - Farewell