Noah Apthorpe — Publications

Towards a Real-time Reciprocal Metacognition Feedback Paradigm in Introductory Computer Science Classrooms
Nicholas Diana, Carly Grizzaffi*, Noah Apthorpe
Proceedings of the 20th International Conference of the Learning Sciences (ICLS). 2026

Learning is a continuous process, with a student's level of mastery constantly evolving alongside instruction. While traditional assessment methods excel at assessing mastery at a particular moment in time, they are ill-suited for capturing the incremental nature of learning. We propose a novel system that allows students to provide continuous feedback about their perceived mastery throughout a lecture. With this data, instructors could more effectively respond to individual and class-wide misunderstandings in real-time. This new interaction paradigm would also promote metacognitive reasoning skills and allow the students to "dog-ear" moments of confusion during the lecture that could be returned to while studying. Through a series of user studies with both students and faculty, we explored the unique interaction challenges of this design space. This paper describes the key components of the proposed paradigm, drawing on these empirical insights to address design challenges.

Online Age Gating: An Interdisciplinary Evaluation
Noah Apthorpe, Brett Frischmann, Yan Shvartzshnaider
Yale Journal of Law & Technology (YJoLT). 2026

The recent surge in regulation seeking to establish age-based governance online is part of a decades-long attempt to establish online zoning. It is driven by active development of technologies to estimate or verify user age based on various characteristics of users, their credentials, or their activities. However, these developments have heightened prevailing concerns that online age gating technology will inevitably be abused and misused to cause a variety of privacy harms and rights infringements. This paper examines this ongoing debate by bridging technical and legal scholarship to explore the current state of online age-based governance. We discuss the current legal and policy landscape, the current status of online age gating technologies, and provide recommendations to guide legal and technological scholarship and practice. Our interdisciplinary assessment is particularly important and timely, given the recent flurry of state and federal laws that aim to implement age gating online and ongoing litigation challenging such laws.

Measuring NIST Authentication Standards Compliance by Higher Education Institutions
Noah Apthorpe, Boen Beavers*, Yan Shvartzshnaider, Brett Frischmann
Symposium on Usable Privacy and Security (SOUPS). 2025

Technical standards are a longstanding method of communicating best practice recommendations based on expert consensus. Cybersecurity standards are particularly important for informing policies that protect critical systems and sensitive data. Measuring standards compliance is therefore essential to identify vulnerabilities arising from outdated policies and to determine whether expert advice has effectively diffused to practitioners. In this paper, we examine the authentication policies of a diverse set of 135 colleges and universities in the United States and Canada to determine compliance with four standards from NIST Special Publication 800-63 Digital Identity Guidelines. We find widespread, but not universal, deployment of multi-factor authentication across institutions. We also find prevalent outdated use of password expiration, password composition rules, and knowledge-based authentication. These results support further investment and research into incentive structures for standards compliance and the diffusion of expert guidance to practitioners.

Automating Governing Knowledge Commons and Contextual Integrity (GKC-CI) Privacy Policy Annotations with Large Language Models
Jake Chanenson, Madison Pickering, Noah Apthorpe
Proceedings on Privacy Enhancing Technologies Symposium (PETS). 2025

Identifying contextual integrity (CI) and governing knowledge commons (GKC) parameters in privacy policy texts can facilitate normative privacy analysis. However, GKC-CI annotation has heretofore required manual or crowdsourced effort. This paper demonstrates that high-accuracy GKC-CI parameter annotation of privacy policies can be performed automatically using large language models. We fine-tune 50 open-source and proprietary models on 21,588 ground truth GKC-CI annotations from 16 privacy policies. Our best performing model has an accuracy of 90.65%, which is comparable to the accuracy of experts on the same task. We apply our best performing model to 456 privacy policies from a variety of online services, demonstrating the effectiveness of scaling GKC-CI annotation for privacy policy exploration and analysis. We publicly release our model training code, training and testing data, an annotation visualizer, and all annotated policies for future GKC-CI research.

Privacy Governance Not Included: Analysis of Third Parties in Learning Management Systems
Madelyn Rose Sanfilippo, Noah Apthorpe, Karoline Brehm, Yan Shvartzshnaider
Information and Learning Sciences. 2023

This paper aims to address research gaps around third party data flows in education by investigating governance practices in higher education with respect to learning management system (LMS) ecosystems. The authors answer the following research questions: how are LMS and plugins/learning tools interoperability (LTI) governed at higher education institutions? Who is responsible for data governance activities around LMS? What is the current state of governance over LMS? What is the current state of governance over LMS plugins, LTI, etc.? What governance issues are unresolved in this domain? How are issues of privacy and governance regarding LMS and plugins/LTIs documented or communicated to the public and/or community members?

Automating Internet of Things Network Traffic Collection with Robotic Arm Interactions
Xi Jiang*, Noah Apthorpe
Journal of Communications. 2023

Consumer Internet of Things (IoT) research often involves collecting network traffic sent or received by IoT devices. These data are typically collected via crowdsourcing or while researchers manually interact with IoT devices in a laboratory setting. However, manual interactions and crowdsourcing are often tedious, expensive, inaccurate, or do not provide comprehensive coverage of possible IoT device behaviors. We present a new method for generating IoT network traffic using a robotic arm to automate user interactions with devices. This eliminates manual button pressing and enables permutation-based interaction sequences that rigorously explore the range of possible device behaviors. We test this approach with an Arduino-controlled robotic arm, a smart speaker, and a smart thermostat, using machine learning to demonstrate that collected network traffic contains information about device interactions that could be useful for network, security, or privacy analyses. We also provide source code and documentation allowing researchers to easily automate IoT device interactions and network traffic collection in future studies.

You, Me, and IoT: How Internet-Connected Consumer Devices Affect Interpersonal Relationships
Noah Apthorpe, Pardis Emami-Naeini, Arunesh Mathur, Marshini Chetty, Nick Feamster
ACM Transactions on Internet of Things (TIOT). 2022

Internet-connected consumer devices have rapidly increased in popularity; however, relatively little is known about how these technologies are affecting interpersonal relationships in multi-occupant households. In this study, we conduct 13 semi-structured interviews and survey 508 individuals from a variety of backgrounds to discover and categorize how consumer IoT devices are affecting interpersonal relationships in the United States. We highlight several themes, providing exploratory data about the pervasiveness of interpersonal costs and benefits of consumer IoT devices. These results inform follow-up studies and design priorities for future IoT technologies to amplify positive and reduce negative interpersonal effects.

SkillBot: Identifying Risky Content for Children in Alexa Skills
Tu Le, Danny Yuxing Huang, Noah Apthorpe, Yuan Tian
ACM Transactions on Internet Technology (TOIT). 2022

Many households include children who use voice personal assistants (VPA) such as Amazon Alexa. Children benefit from the rich functionalities of VPAs and third-party apps but are also exposed to new risks in the VPA ecosystem. In this paper, we first investigate "risky" child-directed voice apps that contain inappropriate content or ask for personal information through voice interactions. We build SkillBot, a natural language processing (NLP)-based system that automatically interacts with VPA apps and analyzes the resulting conversations. We find 28 risky child-directed apps and maintain a growing dataset of 31,966 non-overlapping app behaviors collected from 3,434 Alexa apps. Our findings suggest that although child-directed VPA apps are subject to stricter policy requirements and more intensive vetting, children remain vulnerable to inappropriate content and privacy violations. We then conduct a user study showing that parents are concerned about the identified risky apps. Many parents do not believe that these apps are available and designed for families/kids, although these apps are actually published in Amazon's "Kids" product category. We also find that parents often neglect basic precautions such as enabling parental controls on Alexa devices. Finally, we identify a novel risk in the VPA ecosystem: confounding utterances, or voice commands shared by multiple apps that may cause a user to interact with a different app than intended. We identify 4,487 confounding utterances, including 581 shared by child-directed and non-child-directed apps. We find that 27% of these confounding utterances prioritize invoking a non-child-directed app over a child-directed app. This indicates that children are at real risk of accidentally invoking non-child-directed apps due to confounding utterances.
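
The idea of a confounding utterance in the abstract above can be illustrated with a minimal sketch. The app names, sample utterances, and the `find_confounding_utterances` helper are hypothetical and are not taken from SkillBot; the sketch only shows one plausible way to group apps by a shared voice command.

```python
from collections import defaultdict


def find_confounding_utterances(app_utterances):
    """Return {utterance: sorted app names} for utterances shared by 2+ apps."""
    apps_by_utterance = defaultdict(set)
    for app, utterances in app_utterances.items():
        for utterance in utterances:
            # Normalize case and whitespace so trivially different spellings match.
            normalized = " ".join(utterance.lower().split())
            apps_by_utterance[normalized].add(app)
    return {
        utterance: sorted(apps)
        for utterance, apps in apps_by_utterance.items()
        if len(apps) > 1
    }


# Hypothetical sample data (not from the paper's dataset).
SAMPLE_APPS = {
    "kids_story_time": ["tell me a story", "Play a game"],
    "trivia_night": ["play a game", "start trivia"],
    "sleep_sounds": ["play rain sounds"],
}

if __name__ == "__main__":
    print(find_confounding_utterances(SAMPLE_APPS))
```

In this toy data only "play a game" is shared, so it is the single utterance reported. Deciding which app a real assistant would prioritize for such a command is a separate question that this sketch does not address.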

GKC-CI: A Unifying Framework for Contextual Norms and Information Governance
Yan Shvartzshnaider, Madelyn Sanfilippo, Noah Apthorpe
Journal of the Association for Information Science and Technology (JASIST). 2022

Privacy-enhancing technologies that incorporate a socially meaningful conception of privacy, one that meets people's expectations and is ethically defensible, need to factor in contextual privacy norms and information governance as part of their design. This involves understanding what information handling practices users deem acceptable, what factors influence users' perceptions and behaviors, and how informational norms evolve. In this paper, we present GKC-CI, a unifying framework for examining contextual privacy norms and information governance in a given context to help structure research inquiries around these questions.

IoT Inspector: Crowdsourcing Labeled Network Traffic from Smart Home Devices at Scale
Danny Yuxing Huang, Noah Apthorpe, Frank Li, Gunes Acar, Nick Feamster
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (Ubicomp/IMWUT). 2020

The proliferation of smart home devices has created new opportunities for empirical research in ubiquitous computing, ranging from security and privacy to personal health. Yet, data from smart home deployments are hard to come by, and existing empirical studies of smart home devices typically involve only a small number of devices in lab settings. To contribute to data-driven smart home research, we crowdsource the largest known dataset of labeled network traffic from smart home devices from within real-world home networks. To do so, we developed and released IoT Inspector, an open-source tool that allows users to observe the traffic from smart home devices on their own home networks. Since April 2019, 4,322 users have installed IoT Inspector, allowing us to collect labeled network traffic from 44,956 smart home devices across 13 categories and 53 vendors. We demonstrate how this data enables new research into smart homes through two case studies focused on security and privacy. First, we find that many device vendors use outdated TLS versions and advertise weak ciphers. Second, we discover about 350 distinct third-party advertiser and tracking domains on smart TVs. We also highlight other research areas, such as network management and healthcare, that can take advantage of IoT Inspector's dataset. To facilitate future reproducible research in smart homes, we will release the IoT Inspector data to the public.

Going Against the (Appropriate) Flow: A Contextual Integrity Approach to Privacy Policy Analysis
Yan Shvartzshnaider, Noah Apthorpe, Nick Feamster, Helen Nissenbaum
The Seventh AAAI Conference on Human Computation and Crowdsourcing (HCOMP). 2019

We present a method for analyzing privacy policies using the framework of contextual integrity (CI). This method allows for the systematized detection of issues with privacy policy statements that hinder readers' ability to understand and evaluate company data collection practices. These issues include missing contextual details, vague language, and overwhelming possible interpretations of described information transfers. We demonstrate this method in two different settings. First, we compare versions of Facebook's privacy policy from before and after the Cambridge Analytica scandal. Our analysis indicates that the updated policy still contains fundamental ambiguities that limit readers' comprehension of Facebook's data collection practices. Second, we successfully crowdsourced CI annotations of 48 excerpts of privacy policies from 17 companies with 141 crowdworkers. This indicates that regular users are able to reliably identify contextual information in privacy policy statements and that crowdsourcing can help scale our CI analysis method to a larger number of privacy policy statements.

Evaluating the Contextual Integrity of Privacy Regulation: Parents' IoT Toy Privacy Norms Versus COPPA
Noah Apthorpe, Sarah Varghese*, Nick Feamster
Proceedings of the 28th USENIX Security Symposium (USENIX Security). 2019

Increased concern about data privacy has prompted new and updated data protection regulations worldwide. However, there has been no rigorous way to test whether the practices mandated by these regulations actually align with the privacy norms of affected populations. Here, we demonstrate that surveys based on the theory of contextual integrity provide a quantifiable and scalable method for measuring the conformity of specific regulatory provisions to privacy norms. We apply this method to the U.S. Children's Online Privacy Protection Act (COPPA), surveying 195 parents and providing the first data that COPPA's mandates generally align with parents' privacy expectations for Internet-connected "smart" children's toys. Nevertheless, variations in the acceptability of data collection across specific smart toys, information types, parent ages, and other conditions emphasize the importance of detailed contextual factors to privacy norms, which may not be adequately captured by COPPA.

Keeping the Smart Home Private with Smart(er) IoT Traffic Shaping
Noah Apthorpe, Danny Yuxing Huang, Dillon Reisman, Arvind Narayanan, Nick Feamster
Proceedings on Privacy Enhancing Technologies Symposium (PETS). 2019

The proliferation of smart home Internet of Things (IoT) devices presents unprecedented challenges for preserving privacy within the home. In this paper, we demonstrate that a passive network observer (e.g., an Internet service provider) can infer private in-home activities by analyzing Internet traffic from commercially available smart home devices even when the devices use end-to-end transport-layer encryption. We evaluate common approaches for defending against these types of traffic analysis attacks, including firewalls, virtual private networks, and independent link padding, and find that none sufficiently conceal user activities with reasonable data overhead. We develop a new defense, "stochastic traffic padding" (STP), that makes it difficult for a passive network adversary to reliably distinguish genuine user activities from generated traffic patterns designed to look like user interactions. Our analysis provides a theoretical bound on an adversary's ability to accurately detect genuine user activities as a function of the amount of additional cover traffic generated by the defense technique.

User Perceptions of Smart Home IoT Privacy
Serena Zheng*, Noah Apthorpe, Marshini Chetty, Nick Feamster
Proceedings of the 2018 ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW). 2018

Smart home Internet of Things (IoT) devices are rapidly increasing in popularity, with more households including Internet-connected devices that continuously monitor user activities. In this study, we conduct eleven semi-structured interviews with smart home owners, investigating their reasons for purchasing IoT devices, perceptions of smart home privacy risks, and actions taken to protect their privacy from those external to the home who create, manage, track, or regulate IoT devices and/or their data. We note several recurring themes. First, users' desires for convenience and connectedness dictate their privacy-related behaviors for dealing with external entities, such as device manufacturers, Internet Service Providers, governments, and advertisers. Second, user opinions about external entities collecting smart home data depend on perceived benefit from these entities. Third, users trust IoT device manufacturers to protect their privacy but do not verify that these protections are in place. Fourth, users are unaware of privacy risks from inference algorithms operating on data from non-audio/visual devices. These findings motivate several recommendations for device designers, researchers, and industry standards to better match device privacy features to the expectations and preferences of smart home owners.

Security and Privacy Analyses of Internet of Things Children's Toys
Gordon Chu*, Noah Apthorpe, Nick Feamster
IEEE Internet of Things Journal (IoT-J). 2018

This paper investigates the security and privacy of Internet-connected children's smart toys through case studies of three commercially-available products. We conduct network and application vulnerability analyses of each toy using static and dynamic analysis techniques, including application binary decompilation and network monitoring. We discover several publicly undisclosed vulnerabilities that violate the Children's Online Privacy Protection Rule (COPPA) as well as the toys' individual privacy policies. These vulnerabilities, especially security flaws in network communications with first-party servers, are indicative of a disconnect between many IoT toy developers and security and privacy best practices despite increased attention to Internet-connected toy hacking risks.

Discovering IoT Smart Home Privacy Norms using Contextual Integrity
Noah Apthorpe, Yan Shvartzshnaider, Arunesh Mathur, Dillon Reisman, Nick Feamster
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (Ubicomp/IMWUT). 2018

The proliferation of Internet of Things (IoT) devices for consumer "smart" homes raises concerns about user privacy. We present a survey method based on the Contextual Integrity (CI) privacy framework that can quickly and efficiently discover privacy norms at scale. We apply the method to discover privacy norms in the smart home context, surveying 1,731 American adults on Amazon Mechanical Turk. For $2,800 and in less than six hours, we measured the acceptability of 3,840 information flows representing a combinatorial space of smart home devices sending consumer information to first and third-party recipients under various conditions. Our results provide actionable recommendations for IoT device manufacturers, including design best practices and instructions for adopting our method for further research.
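
The "combinatorial space" of information flows mentioned in the abstract above can be pictured as a Cartesian product over contextual integrity parameters. The sketch below is illustrative only: the parameter values, the sentence template, and the `enumerate_flows` helper are assumptions and do not reproduce the survey's actual parameters or wording.

```python
from itertools import product

# Hypothetical parameter values (not the survey's actual lists).
DEVICES = ["a sleep monitor", "a security camera"]
ATTRIBUTES = ["its owner's location", "audio of its owner"]
RECIPIENTS = ["its manufacturer", "an Internet service provider"]
PRINCIPLES = [
    "if its owner has given consent",
    "if the information is used for advertising",
]


def enumerate_flows(devices, attributes, recipients, principles):
    """Build one survey-style sentence per combination of CI parameters."""
    return [
        f"{device} sends {attribute} to {recipient} {principle}"
        for device, attribute, recipient, principle in product(
            devices, attributes, recipients, principles
        )
    ]


if __name__ == "__main__":
    flows = enumerate_flows(DEVICES, ATTRIBUTES, RECIPIENTS, PRINCIPLES)
    print(len(flows))  # 2 * 2 * 2 * 2 combinations
    print(flows[0])
```

The number of flows grows multiplicatively with each parameter list, which is why even modest lists can yield thousands of survey questions.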

Automatic Neuron Detection in Calcium Imaging Data using Convolutional Networks
Noah Apthorpe, Alexander Riordan, Rob Aguilar*, Jan Homann, Yi Gu, David Tank, H. Sebastian Seung
Advances in Neural Information Processing Systems (NIPS). 2016

Calcium imaging is an important technique for monitoring the activity of thousands of neurons simultaneously. As calcium imaging datasets grow in size, automated detection of individual neurons is becoming important. Here we apply a supervised learning approach to this problem and show that convolutional networks can achieve near-human accuracy and superhuman speed. Accuracy is superior to the popular PCA/ICA method based on precision and recall relative to ground truth annotation by a human expert. These results suggest that convolutional networks are an efficient and flexible tool for the analysis of large-scale calcium imaging data.

Treating Software Defined Networks like Disk Arrays
Zhiyuan Teo, Ken Birman, Noah Apthorpe, Robbert Van Renesse, Vasily Kuksenkov
IEEE NetSoft Conference and Workshops (NetSoft). 2016

Data networks require a high degree of performance and reliability as mission-critical IoT deployments increasingly depend on them. Although performance and fault tolerance can be individually addressed at all levels of the networking stack, few solutions tackle these challenges in an elegant and scalable manner. We propose a redundant array of independent network links (RAIL), adapted from RAID, that combines software-defined networking, disjoint network paths and selective packet processing to improve communications bandwidth and latency while simultaneously providing fault tolerance. Our work shows that the implementation of such a system is feasible without necessitating awareness or changes in the operating systems or hardware of IoT and client devices.

Integrating Contextual Integrity and Data Minimization
Noah Apthorpe
Eighth Annual Symposium on Applications of Contextual Integrity. 2026

Measuring the Prevalence and Variety of Online Age Gates
Tajveer Singh Dhesi*, Noah Apthorpe
Ninth Workshop on Technology and Consumer Protection (ConPro). 2025

The legal landscape regarding age-based restrictions (age gates) for online services is rapidly changing. In order to comply with existing and proposed regulations, online services must determine whether users are older or younger than mandated age thresholds. The implementation details of these age gates are highly relevant for consumer protection advocates given the risk of user circumvention and/or chilling effects. We therefore propose a study measuring the prevalence and variety of age gating mechanisms across the Internet. We start with a case study of the e-cigarette industry, finding that nearly all site arrival age gates merely require users to self-attest that they are older than an age threshold. We plan to expand this study to additional industries, website interaction points, and automated classification techniques to produce a comprehensive assessment of online age gating practices.

Fostering a Market for Responsible Data Practices
Noah Apthorpe, Eleanor Birrell, Travis Breaux, Kirsten Martin, Rishab Nithyanand, Sarah Radway, Yan Shvartzshnaider, Maximiliane Windl
Seventh Annual Symposium on Applications of Contextual Integrity. 2025

A Qualitative Analysis of Governing Knowledge Commons and Contextual Integrity (GKC-CI) Privacy Policy Annotations with Large Language Models
Jake Chanenson, Madison Pickering, Noah Apthorpe
Sixth Annual Symposium on Applications of Contextual Integrity. 2024

Designing Effective Privacy-Preserving Age Verification Systems
Yan Shvartzshnaider, Noah Apthorpe, Brett Frischmann
Privacy Law Scholars Conference (PLSC). 2024

Automating GKC-CI Privacy Policy Annotations with LLMs
Jake Chanenson, Madison Pickering, Noah Apthorpe
Fifth Annual Symposium on Applications of Contextual Integrity. 2023

Privacy Not Included: Analysis of Add-ons in Learning Management Systems
Yan Shvartzshnaider, Noah Apthorpe, Madelyn Sanfilippo
Privacy Law Scholars Conference (PLSC). 2023

Automating Contextual Integrity and GKC-CI Privacy Policy Annotations with GPT-3
Noah Apthorpe
Fourth Annual Symposium on Applications of Contextual Integrity. 2022

Practical Assignments for Teaching Contextual Integrity
Noah Apthorpe
Third Annual Symposium on Applications of Contextual Integrity. 2021

This discussion prompt outlines several active learning techniques for teaching Contextual Integrity (CI), including case studies, privacy norm surveys, policy evaluations, formal logic, and system audits. It also highlights challenges to CI pedagogy in technical computer science courses and poses discussion questions to foster collaborative development of practical assignments for teaching CI.

Evaluating the Contextual Integrity of Privacy Regulation: Parents' IoT Toy Privacy Norms Versus COPPA
Noah Apthorpe, Sarah Varghese*, Nick Feamster
Second Annual Symposium on Applications of Contextual Integrity. 2019

A Developer-Friendly Library for Smart Home IoT Privacy-Preserving Traffic Obfuscation
Trisha Datta*, Noah Apthorpe, Nick Feamster
Proceedings of the 2018 Workshop on IoT Security and Privacy (IoT S&P). 2018

The number and variety of Internet-connected devices have grown enormously in the past few years, presenting new challenges to security and privacy. Research has shown that network adversaries can use traffic rate metadata from consumer IoT devices to infer sensitive user activities. Shaping traffic flows to fit distributions independent of user activities can protect privacy, but this approach has seen little adoption due to required developer effort and overhead bandwidth costs. Here, we present a Python library for IoT developers to easily integrate privacy-preserving traffic shaping into their products. The library replaces standard networking functions with versions that automatically obfuscate device traffic patterns through a combination of payload padding, fragmentation, and randomized cover traffic. Our library successfully preserves user privacy and requires approximately 4 KB/s overhead bandwidth for IoT devices with low send rates or high latency tolerances. This overhead is reasonable given normal Internet speeds in American homes and is an improvement on the bandwidth requirements of existing solutions.
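
Two of the techniques named in the abstract above, payload padding and fragmentation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's API: the `pad_and_fragment` and `reassemble` helpers, the 4-byte length header, and the 64-byte packet size are all hypothetical, and randomized cover traffic is omitted.

```python
import os
from typing import List


def pad_and_fragment(payload: bytes, packet_size: int = 64) -> List[bytes]:
    """Split a payload into equal-size packets, padding the final one.

    A 4-byte big-endian length header (an assumption of this sketch) lets the
    receiver strip the padding again.
    """
    data = len(payload).to_bytes(4, "big") + payload
    remainder = len(data) % packet_size
    if remainder:
        # Random filler; a real deployment would also encrypt the packets.
        data += os.urandom(packet_size - remainder)
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]


def reassemble(packets: List[bytes]) -> bytes:
    """Recover the original payload from padded, fragmented packets."""
    data = b"".join(packets)
    length = int.from_bytes(data[:4], "big")
    return data[4:4 + length]


if __name__ == "__main__":
    packets = pad_and_fragment(b"thermostat set to 68F")
    print([len(p) for p in packets])
    print(reassemble(packets))
```

Because every packet has the same size, an observer sees only the packet count rather than exact payload lengths. Hiding the count and timing as well would need the cover traffic that this sketch leaves out.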

Machine Learning DDoS Detection for Consumer Internet of Things Devices
Rohan Doshi*, Noah Apthorpe, Nick Feamster
IEEE Deep Learning and Security Workshop (DLS). 2018

An increasing number of Internet of Things (IoT) devices are connecting to the Internet, yet many of these devices are fundamentally insecure, exposing the Internet to a variety of attacks. Botnets such as Mirai have used insecure consumer IoT devices to conduct distributed denial of service (DDoS) attacks on critical Internet infrastructure. This motivates the development of new techniques to automatically detect consumer IoT attack traffic. In this paper, we demonstrate that using IoT-specific network behaviors (e.g. limited number of endpoints and regular time intervals between packets) to inform feature selection can result in high accuracy DDoS detection in IoT network traffic with a variety of machine learning algorithms, including neural networks. These results indicate that home gateway routers or other network middleboxes could automatically detect local IoT device sources of DDoS attacks using low-cost machine learning algorithms and traffic data that is flow-based and protocol-agnostic.

Cleartext Data Transmissions in Consumer IoT Medical Devices
Daniel Wood*, Noah Apthorpe, Nick Feamster
Workshop on Internet of Things Security and Privacy (IoT S&P). 2017

This paper introduces a method to capture network traffic from medical IoT devices and automatically detect cleartext information that may reveal sensitive medical conditions and behaviors. The research follows a three-step approach involving traffic collection, cleartext detection, and metadata analysis. We analyze four popular consumer medical IoT devices, including one smart medical device that leaks sensitive health information in cleartext. We also present a traffic capture and analysis system that seamlessly integrates with a home network and offers a user-friendly interface for consumers to monitor and visualize data transmissions of IoT devices in their homes.

Closing the Blinds: Four Strategies for Protecting Smart Home Privacy from Network Observers
Noah Apthorpe, Dillon Reisman, Nick Feamster
Workshop on Technology and Consumer Protection (ConPro). 2017

The growing market for smart home IoT devices promises new conveniences for consumers while presenting novel challenges for preserving privacy within the home. Specifically, Internet service providers or neighborhood WiFi eavesdroppers can measure Internet traffic rates from smart home devices and infer consumers' private in-home behaviors. Here we propose four strategies that device manufacturers and third parties can take to protect consumers from side-channel traffic rate privacy threats: 1) blocking traffic, 2) concealing DNS, 3) tunneling traffic, and 4) shaping and injecting traffic. We hope that these strategies, and the implementation nuances we discuss, will provide a foundation for the future development of privacy-sensitive smart homes.

A Smart Home is No Castle: Privacy Vulnerabilities of Encrypted IoT Traffic
Noah Apthorpe, Dillon Reisman, Nick Feamster
Data and Algorithmic Transparency Workshop (DAT). 2016

The increasing popularity of specialized Internet-connected devices and appliances, dubbed the Internet-of-Things (IoT), promises both new conveniences and new privacy concerns. Unlike traditional web browsers, many IoT devices have always-on sensors that constantly monitor fine-grained details of users' physical environments and influence the devices' network communications. Passive network observers, such as Internet service providers, could potentially analyze IoT network traffic to infer sensitive details about users. Here, we examine four IoT smart home devices (a Sense sleep monitor, a Nest Cam Indoor security camera, a WeMo switch, and an Amazon Echo) and find that their network traffic rates can reveal potentially sensitive user interactions even when the traffic is encrypted. These results indicate that a technological solution is needed to protect IoT device owner privacy, and that IoT-specific concerns must be considered in the ongoing policy debate around ISP data collection and usage.

Learning Password Best Practices Through In-Task Instruction
Qian Ma, Yingfan Zhou, Shubhang Kaushik, Aamod Joshi, Aditya Majumdar, Noah Apthorpe, Yan Shvartzshnaider, Sarah Rajtmajer, Brett Frischmann
arXiv Preprint. 2026

Users often make security- and privacy-relevant decisions without a clear understanding of the rules that govern safe behavior. We introduce pedagogical friction, a design approach that inserts brief, instructional interactions at the moment of action. We evaluate this approach in the context of password creation, a familiar task with clear quality criteria. We conducted a randomized study with 128 participants across four interface conditions that varied the depth and interactivity of guidance. We assessed three outcomes: (1) rule compliance in a subsequent password task without guidance, (2) accuracy on survey questions tied to password rules, and (3) behavior-knowledge alignment, which captures whether participants who correctly followed a rule also recognized it on the survey. Across the guided conditions, participants corrected most rule violations in the follow-up task and showed high behavior-knowledge alignment. Survey results suggested clearer advantages for some rule types, especially symbol-related questions. These results position pedagogical friction as a lightweight intervention for security- and privacy-critical interfaces.

Analyzing Privacy Policies Using Contextual Integrity Annotations
Yan Shvartzshnaider, Noah Apthorpe, Nick Feamster, Helen Nissenbaum
arXiv & SSRN Preprint. 2018

In this paper, we demonstrate the effectiveness of using the theory of contextual integrity (CI) to annotate and evaluate privacy policy statements. We perform a case study using CI annotations to compare Facebook's privacy policy before and after the Cambridge Analytica scandal. The updated Facebook privacy policy provides additional details about what information is being transferred, from whom, by whom, to whom, and under what conditions. However, some privacy statements prescribe an incomprehensibly large number of information flows by including many CI parameters in single statements. Other statements result in incomplete information flows due to the use of vague terms or omitting contextual parameters altogether. We then demonstrate that crowdsourcing can effectively produce CI annotations of privacy policies at scale. We test the CI annotation task on 48 excerpts of privacy policies from 17 companies with 141 crowdworkers. The resulting high-precision annotations indicate that crowdsourcing could be used to produce a large corpus of annotated privacy policies for future research.

Detecting Compressed Cleartext Traffic from Consumer Internet of Things Devices
Daniel Hahn*, Noah Apthorpe, Nick Feamster
arXiv Preprint. 2018

Data encryption is the primary method of protecting the privacy of consumer device Internet communications from network observers. The ability to automatically detect unencrypted data in network traffic is therefore an essential tool for auditing Internet-connected devices. Existing methods identify network packets containing cleartext but cannot differentiate packets containing encrypted data from packets containing compressed unencrypted data, which can be easily recovered by reversing the compression algorithm. This makes it difficult for consumer protection advocates to identify devices that risk user privacy by sending sensitive data in a compressed unencrypted format. Here, we present the first technique to automatically distinguish encrypted from compressed unencrypted network transmissions on a per-packet basis. We apply three machine learning models and achieve a maximum 66.9% accuracy with a convolutional neural network trained on raw packet data. This result is a baseline for this previously unstudied machine learning problem, which we hope will motivate further attention and accuracy improvements. To facilitate continuing research on this topic, we have made our training and test datasets available to the public.

Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic
Noah Apthorpe, Dillon Reisman, Srikanth Sundaresan, Arvind Narayanan, Nick Feamster
arXiv Preprint. 2017

The growing market for smart home IoT devices promises new conveniences for consumers while presenting new challenges for preserving privacy within the home. Many smart home devices have always-on sensors that capture users' offline activities in their living spaces and transmit information about these activities on the Internet. In this paper, we demonstrate that an ISP or other network observer can infer privacy-sensitive in-home activities by analyzing Internet traffic from smart homes containing commercially available IoT devices even when the devices use encryption. We evaluate several strategies for mitigating the privacy risks associated with smart home device traffic, including blocking, tunneling, and rate-shaping. Our experiments show that traffic shaping can effectively and practically mitigate many privacy risks associated with smart home IoT devices. We find that 40 KB/s extra bandwidth usage is enough to protect user activities from a passive network adversary. This bandwidth cost is well within the Internet speed limits and data caps for many smart homes.

Using Contextual Integrity as a Gauge for Governing Knowledge Commons
Yan Shvartzshnaider, Madelyn Sanfilippo, Noah Apthorpe
In B. Frischmann, M. R. Sanfilippo, K. J. Strandburg (eds.)
Governing Privacy in Knowledge Commons. 2021
