UW Privacy Office

Education

Training

UW Privacy Office staff members are happy to present on various privacy topics. Please submit your request to uwprivacy@uw.edu.

EU GDPR for Researchers Workshop Sessions

Wednesday, July 15, 2020 from 2 – 3 p.m. via Zoom
By invitation only – RSVP required. Sessions with no RSVPs 24 hours in advance of the session will be canceled.

To request a future virtual (Zoom) session, please send an email to uwprivacy@uw.edu. A member of the UW Privacy Office staff will reply and coordinate scheduling going forward.

Personal Data Processing Agreement Online Training

The Personal Data Processing Agreement (PDPA) supports the UW’s values of integrity, diversity, excellence, collaboration, innovation, and respect. It sets forth the UW’s expectations for protecting University Personal Data and managing privacy-related risk when the UW is contracting with third parties.

View the PDPA Online Training videos to determine whether a PDPA is required, learn how to complete a PDPA, use the self-help resources, and, if needed, get help from the UW Privacy Office (closed captioning available). For more information about the PDPA, see the PDPA webpage.

Twenty-eight prior PDPA information sessions were held from December 2018 through September 2020.

Previous Training

Data Breach Notification Requirement Session

Tuesday, January 14, 2020 from 1:00 – 2:00 in Odegaard 220
Space is limited and available on a first-come, first-served basis

The State of Washington recently modified its data breach notification law. Washington isn’t the only state making changes. Come learn how data breach notification requirements have changed and what to do if your department experiences a potential or actual breach of personal data.
Prior sessions:

  • Wednesday, June 19, 2019 from 2:30 – 3:30 p.m., Gerberding Hall 142
  • Thursday, November 7, 2019 from 2:00 – 3:00 in Odegaard 220

2020 Data Privacy Day Session

  • Tuesday, January 28, 2020 from 10 a.m. – 11:30 a.m.
  • Odegaard Library, Room 220

No RSVP required, but space is limited and available on a first-come, first-served basis.

Join us on Tuesday, January 28 as we honor International Data Privacy Day, an annual effort to elevate the protection of privacy and data as everyone’s responsibility.

Session Agenda:

  • 10:00 – 10:30 a.m. Privacy Trends and Issues – Ann Nagel, AVP/Institutional Privacy Official, UW Privacy Office
  • 10:30 – 11:30 a.m. Two Faces of Artificial Intelligence – Annie Searle, The Information School

Artificial intelligence (AI) has been with us since 1965 when a computer figured out how to beat humans at checkers. AI is now embedded in many commonly used applications. But AI has two faces. We’ll examine a range of AI applications as well as proposed codes of ethics, and government research investments by the U.S. and China.

Annie Searle is a senior lecturer at The Information School at the University of Washington, where she teaches courses on risk management, cybersecurity, and information management.

Privacy Impacts of IT Log Data, presented by Lee Imrey (Splunk)

Friday, November 15 from 10 – 11:30 a.m.
Lee spoke to us about the evolution of privacy in today’s increasingly data-centric and data-inundated world, exploring how attitudes toward privacy are changing and how laws are developing that balance our need for privacy with businesses’, governments’, and other groups’ hunger for information. Participants discussed the options and implications of monitoring IT log data.

Personal Privacy and Digital Wellness Sessions

(Prior sessions held January & October 2018, October 2019)
The resources shared at prior “Personal Privacy and Digital Wellness” sessions (updated as of 10-1-2019) included both UW resources (in addition to the UW Privacy Office website) and resources external to the UW. Many of these same resources were shared during a combined presentation with the Office of the Chief Information Security Officer to members of the UW Retirement Association in a session entitled “Security Tools and Tactics to Support Digital Wellness,” held on Friday, October 19, 2018, from 10 a.m. – 12:30 p.m.

EU General Data Protection Regulation Information Sessions

(Three sessions held in 2018)
Presented for those who collect, use, or manage information about individuals who live in the European Union. New personal data protections implemented May 25, 2018, under the European Union’s General Data Protection Regulation (EU GDPR) impact multiple areas of UW operations involving data on students, employees, alumni, human subjects, and others who live in the EU.

UW resources were presented for the following European Union’s General Data Protection Regulation requirements:

  • Registering personal data use
  • Providing notice to individuals about the collection and use of personal data
  • Obtaining consent from individuals in certain circumstances
  • Tracking or monitoring individuals’ website activity
  • Reporting incidents or data breaches

Whitepapers and Studies

University of Washington Privacy Policy Benchmarking Study (June 2020)

Policy Benchmarking Study [PDF]

Introduction

This study presents a high-level summary of the emerging local, national, and global privacy legal landscape, followed by results of a recent Privacy Policy benchmarking initiative undertaken by the UW Privacy Office to identify privacy policy best practices and trends within institutions of higher education.

A well-written privacy policy must support and uphold the many privacy obligations of an institution of higher education:

“Colleges and universities have multiple privacy obligations: they must promote an ethical and respectful community and workplace, where academic and intellectual freedom thrives; they must balance security needs with civil and individual liberties, opportunities for using big data analytics, and new technologies, all of which directly affect individuals; they must be good stewards of the troves of personal information they hold, some of it highly sensitive; and finally, they also must comply with numerous and sometimes overlapping or inconsistent privacy laws.” [i]

Emerging Privacy Trends in the State of Washington

During the 2020 legislative session, the UW Privacy Office evaluated 14 privacy-related bills introduced or re-introduced in the Washington State legislature, including one which would have enacted a new comprehensive privacy law called the Washington Privacy Act (SSB6281). A similar version of the bill, though highly favored, failed to pass in 2019. While the 2020 Act passed in some form in both the House and Senate, lawmakers were unable to reconcile differences around enforcement authority. Had it passed, the Washington Privacy Act was expected to be viewed as significant privacy legislation, incorporating protections beyond those in the California Consumer Privacy Act (CCPA – one of the most comprehensive consumer privacy laws in the United States). The Washington Privacy Act (taken from Senate Bill Report 2SSB 6281) would have:

  • Provided Washington residents with the consumer personal data rights of access, correction, deletion, data portability, and opt-out of the processing of personal data for specified purposes.
  • Specified the thresholds a business must satisfy for the requirements set forth in this act to apply.
  • Identified certain controller responsibilities such as transparency, purpose specification, and data minimization. Required controllers to conduct data protection assessments under certain conditions.
  • Provided a regulatory framework for the commercial use of facial recognition services such as testing, training, and disclosure requirements. 

Emerging Privacy Trends in the United States

[Figure: IAPP map of U.S. state privacy laws, April 2020]

“State-level momentum for comprehensive privacy bills is at an all-time high. After the California Consumer Privacy Act passed in 2018, multiple states proposed similar legislation to protect consumers in their states. The IAPP Westin Research Center compiled the below list of proposed and enacted comprehensive privacy bills from across the country to aid our members’ efforts to stay abreast of the changing state-privacy landscape.” [ii]

See Appendix A for details about the International Association of Privacy Professionals (IAPP) research effort, which continues to evaluate state legislation against an informative set of consumer rights offered and organizational obligations imposed by such legislation.

Historically at the federal level, the United States has addressed privacy through laws outlining privacy rights and obligations within “silos” of personal information (HIPAA, FERPA, COPPA, PCI-DSS, GLBA, etc.). Recently, multiple comprehensive privacy acts have been proposed at the federal level (the Consumer Online Privacy Rights Act, the Consumer Data Privacy Act, and the bipartisan discussion draft Consumer Data Privacy and Security Act), though proposed language and provisions have differed widely, and progress on addressing these differences has been slow. In June 2020, the Brookings Institution released a report titled “Bridging the gaps: A path forward to federal privacy legislation,” with detailed recommendations about possible strategies to bridge policy gaps and enable federal privacy legislation to proceed. While the COVID-19 pandemic has further raised awareness of the many holes in U.S. privacy law, it has also consumed congressional attention, and the net result may be yet another delay in enacting comprehensive federal privacy legislation, leaving all to cope with the challenges of complying with different privacy provisions in all 50 states. The following quotes effectively capture the challenges of continuing on this path:

“Individual state regulations are no substitute for federal laws, and the inconsistencies from state to state contribute to the “wild west” state of affairs, with different sheriffs in different states drawing different lines in different sands.” [iii]

“Among organizations, even those that would prefer this space to remain self-regulated, there is an overwhelming preference for one federal set of rules, rather than 50 different laws from 50 different states.” [iv]

Emerging Privacy Trends Globally

Efforts to protect citizens’ personal information are expected to accelerate throughout the world as privacy expectations and information technologies continue to evolve. In the 2019 Gartner report entitled “The State of Privacy and Personal Data Protection,” one of the strategic assumptions documented captured this trend:

“Strategic Assumption: By 2022, half of the planet’s population will have its personal information covered under local privacy regulations in line with the General Data Protection Regulation (GDPR), up from one-tenth today.” [v]

That assumption is on its way to becoming a reality, as the Gartner report highlighted the emergence of the following new or revised privacy laws and regulations in just 12 months (from May 2018-May 2019):

Global Privacy Laws enacted: [vi]

  • 2018: European Union General Data Protection Regulation (EU GDPR)
  • 2018: China’s National Standard on Personal Information Protection
  • 2018: California Consumer Privacy Act (AB 375)
  • 2018: Brazilian General Data Protection Law (LGPD)
  • 2019: Japan/EU Adequacy Agreement
  • 2019: Thailand Personal Data Protection Act 

Gartner isn’t the only one making such assumptions. CPO Magazine also noted that “Laws throughout the world will continue to be updated and implemented, based in part to seek adequacy from the EU. These include Australia, which will look at potentially updating its privacy law, and the Office of the Privacy Commissioner of Canada, which will continue pushing for changes to its privacy law. In addition, India is set to pass its Personal Data Protection law, and other countries will pass or at least consider GDPR-influenced bills on data protection.” [vii]

“In the 12 months from May 2018 to May 2019, privacy regulation has experienced change in all major hubs of data creation, from the U.S. to China and from Europe to Latin America. Many have called it a renaissance, but the fact is that nothing like this has ever happened before.” [viii]

Privacy Policy Benchmark Project

The UW Privacy Office conducted a high-level benchmarking project in May-June 2020, intended to assess the current status of privacy policies among other institutions of higher education within the United States.

Initial research was conducted using the following investigative approaches:

  1. Locate, review and search within 31+ individual institutional/system websites
  2. Review Educause resources compiled within the “Privacy” community group
  3. Review International Association of Privacy Professionals resources compiled at IAPP
  4. Review Gartner resources compiled at Gartner
  5. Review Educational Advisory Board IT Forum Research Access compiled at Educational Advisory Board (EAB) IT Forum
  6. Review and incorporate elements from select, relevant publications

A list of the institutions of higher education explicitly included in this Benchmark effort may be found in Appendix B. When reviewing privacy policies, primary focus was placed on reviewing internally-directed privacy policies rather than reviewing externally-facing privacy notices or statements:

“External privacy notices and internal privacy policies are two sides of the same coin. One is the commitment to users as to how the organization will handle their personal information and the other provides detailed guidance to employees and partners to deliver on that promise.” [ix]

Finally, it is important to note that we intentionally omitted from our research content related to incident notification policies or policy elements, handled within UW’s Administrative Policy Statement APS 2.5, Information Security and Privacy: Incident Reporting and Management.

Comparative Assessment of Privacy-related Policies

Only one of the higher education institutions reviewed appears to have a comprehensive privacy policy at the institutional level (Penn State). A second institution (University of California, San Diego) has established “Guiding Principles for Personal Data” under its Executive Vice Chancellor – Academic Affairs that “impose” privacy expectations without taking the form of a formal policy. The remaining colleges and universities address institutional privacy issues across multiple policies, either at the institutional level or in a subset of organizational policies (e.g., an Information Technology Policy). At some institutions, privacy policies are aligned with stakeholder groups: Student Privacy, Employee Privacy, Donor Privacy, etc. Others address privacy by academic or administrative function (Finance, Research, Educational Records), by organizational process (Privacy in Admissions, Privacy in Healthcare), or (most commonly) by law (FERPA, HIPAA, PCI DSS, EU GDPR, etc.).

The University of California (UC) System has published a set of explicit privacy principles for its system: [x]
Autonomy Privacy Principles:

  • Free inquiry
  • Respect for individual privacy
  • Surveillance (“committed to balancing the need for the safety of individuals and property with the individuals’ reasonable expectation of privacy in a particular location.”)

Information Privacy Principles:

  • Privacy by Design
  • Transparency and notice
  • Choice
  • Information review and correction
  • Information protection
  • Accountability

In UC San Diego’s case, the institution has extended its “Guiding Principles for Personal Data” to include Data Protection Principles: [xi]

“Personal data must be consistently protected throughout its lifecycle commensurate with its level of sensitivity and criticality to campus operations, regardless of where it resides, type of media, or what purpose it serves. Data collection, retention, use, and sharing practices should be transparent and provide essential protections for the privacy of individuals. When collecting, accessing, using, or disclosing personal data, we commit to the following data protection principles:

  • Transparency and individual rights;
  • Purpose specification and use limitation;
  • Data minimization;
  • Access control;
  • Security;
  • Data quality, accuracy and integrity;
  • Due Diligence”

While addressing privacy through multiple policies appears to be the prevailing norm among colleges and universities, there is growing support across professional associations for a single, overarching institutional privacy policy. The Educause “Privacy” community group has had two extensive exchanges about the preference for moving toward a single privacy policy, with the benefits well articulated in this entry:

Having consistent policies across an organization makes it much easier to implement, modify, manage and approve the policies. It also makes it easier to train users, track their acknowledgment of the policies, and get through a regulator’s investigation. [xii]

These vast collections of individual policies are clearly necessary, but the key to leadership on ethics and digital ethics is an overarching institutional policy or statement that connects them all. [xiii]

Foundational Elements of a Privacy Policy in Higher Education

Regardless of how privacy-related policies have evolved or how they are structured in higher education, the following is a comprehensive summary of the privacy policy elements typically contained or addressed:

  • Purpose of Policy
  • Definitions
  • Scope
  • Roles
  • Responsibilities by Role
  • Policy Content:
    • Privacy Principles or a link to explicit privacy principles
      • Purposeful handling of personal information
      • Transparency
      • Notice
      • Consent and opt-out
      • Preference Management
      • Data minimization
      • Least privilege access
      • Protection/security
    • Data Classifications/Categories of Information
    • Data retention policies
    • Explicit subject rights (deletion, correction, etc.)
    • Acceptable/Unacceptable tools, practices, and methods (for example, to enable discovery of structured and unstructured personal data, in order to provide the capacity to index and locate personal data, or to identify conditions that justify video surveillance).
    • Risk Assessments and tracking capabilities (including requirements for Privacy Impact Assessments and for Records of Processing Activities when personal information is handled)
    • Link to Privacy Statement/Website Terms & Conditions
    • Other emerging policy/guidance areas (provided below)

Emerging Elements of Privacy Policy

The following content captures high-level emerging privacy policy trends, including the level of specificity found within policies. These trends have been organized into two categories: privacy rights of individuals, and business expectations and/or obligations to protect privacy. The categories are consistent with the IAPP’s research paradigm for analyzing state privacy legislation.

Privacy rights of individuals:

  • Differentiating personal data from non-personal data, expanding the definitions of personal data to include biometrics, behaviors, attributes, identification numbers of all types, etc.
  • Evolving definitions of sensitive data types, and differential rights/requirements for handling sensitive data, including behavioral data (learning analytics, performance, etc.)
  • Privacy notices and options to consent at various levels (by population, by process, etc.)
  • Explicit declarations about commitments to privacy (“We do not sell, trade, or rent your personal information to others,” or “no expectation of privacy for employees” on internal systems.)

Business obligations to protect privacy:

  • Requiring tools that process personal data in encrypted form to minimize risk, or requiring data anonymization/pseudonymization
  • Requirements for maintenance of data inventories
  • Requirements for conducting privacy assessments
  • Requirements against proliferating personal information
  • Expanded privacy roles (every member of the institution) and more explicit expectations by role
  • Institution-wide requirements for annual privacy or confidentiality attestations
  • Policies expanded to cover mobile devices, Internet of Things, wearables, etc.
  • Explicit lists of approved technologies
  • Explicit video surveillance guidance
  • Explicit electronic security and access systems guidance
  • Explicit guidance about use of data analytics and business intelligence (including artificial intelligence, machine learning, data mining, etc.)
  • Guidelines about use of facial recognition (banned in some institutions)
  • Guidelines/restrictions regarding use of biometrics
  • Explicit cookie policies (including opt out, commitments not to use cookies)
  • Explicit online monitoring guidance and limitations
  • Inclusion of checklists to ensure privacy by design (example below): [xiv]
[Figure: Privacy checklist for data projects]

Common Challenges and Pitfalls in Privacy Policy

In reviewing privacy policies and policy elements across various institutions of higher education, the UW Privacy Office observed the following shortcomings:

  1. Age of Policy/Guidance: The time since policies were created or last reviewed averaged approximately three to five years. For some institutions, the privacy policy was last reviewed more than nine years ago. Privacy laws and technologies have shifted significantly in that time, and the privacy policies of institutions of higher education do not appear to be keeping up.
  2. Incomplete policy: Vague, lack of explicit elements.
  3. Policies related to privacy and data protection have been embedded across multiple policies, which may present a greater challenge in training workforce members and/or in holding them accountable.
  4. Inconsistencies across disparate policies, conflicting guidance.
  5. Gaps in the holistic approach to data privacy; vast areas that are not yet addressed remain invisible to campus.
  6. Policy management across disparate policies is time-consuming and can be more expensive (time, lawsuits, fines, sanctions, etc.).

Another daunting reality is the simple fact that at about the time we have fully wrapped our minds around the current set of worries, pitfalls, outrages, and solutions, there will be a new set of digital ethics quandaries before us. [xv]

Conclusion

As global privacy laws shift and expand, institutions of higher education will increasingly be challenged to keep privacy policies and practices consistent with evolving compliance requirements.

Further, and perhaps even more challenging, will be the institutional need to offer clear, coherent, comprehensive privacy policies (and supporting resources) that successfully address emerging technological advancements as well as evolving citizen expectations around privacy and data ethics. Our stakeholders will look to us to be exemplary data stewards as we continue to pursue the teaching, research, and service missions of our respective academic institutions.

Endnotes

[i] EDUCAUSE resource: The Higher Education CPO Primer, Part 1: A Welcome Kit for Chief Privacy Officers in Higher Education, page 4. August, 2016. From the Higher Education Information Security Council (HEISC) in partnership with EDUCAUSE https://library.educause.edu/-/media/files/library/2016/8/cpoprimerpart1.pdf

[ii] International Association of Privacy Professionals (IAPP) “US State Comprehensive Privacy Law Comparison,” Last updated: 4/16/2020. https://iapp.org/resources/article/state-comparison-table/

[iii] John O’Brien, President and CEO of EDUCAUSE, “Digital Ethics in Higher Education 2020” EDUCAUSE Review, 2020 Issue #2, Volume 55, Number 2, May 18, 2020. https://er.educause.edu/toc/educause-review-print-edition-volume-55-number-2-2020-issue-2, page 33

[iv] Nader Henein, Bart Willemsen, “The State of Privacy and Personal Data Protection, 2019-2020,” Published 15 April 2019, ID G00376084. https://www.gartner.com/en/documents/3906874, page 4 of 16

[v] Nader Henein, Bart Willemsen, “The State of Privacy and Personal Data Protection, 2019-2020,” Published 15 April 2019, ID G00376084. https://www.gartner.com/en/documents/3906874, page 2 of 16

[vi] Nader Henein, Bart Willemsen, “The State of Privacy and Personal Data Protection, 2019-2020,” Published 15 April 2019, ID G00376084. https://www.gartner.com/en/documents/3906874, page 2 of 16

[vii] Anne Kimbol, “Emerging Trends: What to Expect From Privacy Laws in 2020,” CPO Magazine, January 29, 2020. https://www.cpomagazine.com/data-protection/emerging-trends-what-to-expect-from-privacy-laws-in-2020/

[viii] Nader Henein, Bart Willemsen, “The State of Privacy and Personal Data Protection, 2019-2020,” Published 15 April 2019, ID G00376084. https://www.gartner.com/en/documents/3906874, page 3 of 16

[ix] Nader Henein, Bart Willemsen, “The State of Privacy and Personal Data Protection, 2019-2020,” Published 15 April 2019, ID G00376084. (Gartner log-in required) https://www.gartner.com/en/documents/3906874, page 8 of 16

[x] University of California “UC Statement of Privacy Values,” pages 2-4. https://www.ucop.edu/ethics-compliance-audit-services/_files/compliance/uc-privacy-principles.pdf

[xi] University of California San Diego Executive Vice Chancellor – Academic Affairs website, “Guiding Principles for Personal Data,” https://evc.ucsd.edu/units/privacy/guiding-principles-personal-data.html

[xii] Listserv respondent, from the Archives of the EDUCAUSE Privacy Community Group listserv (630 subscribers), email extracted in July 2019. http://listserv.educause.edu/scripts/wa.exe?A0=PRIVACY

[xiii] John O’Brien, President and CEO of EDUCAUSE, “Digital Ethics in Higher Education 2020” EDUCAUSE Review, 2020 Issue #2, Volume 55, Number 2, May 18, 2020. https://er.educause.edu/toc/educause-review-print-edition-volume-55-number-2-2020-issue-2, page 38

[xiv] John O’Brien, President and CEO of EDUCAUSE, “Digital Ethics in Higher Education 2020” EDUCAUSE Review, 2020 Issue #2, Volume 55, Number 2, May 18, 2020. https://er.educause.edu/toc/educause-review-print-edition-volume-55-number-2-2020-issue-2, page 39

[xv] John O’Brien, President and CEO of EDUCAUSE, “Digital Ethics in Higher Education 2020” EDUCAUSE Review, 2020 Issue #2, Volume 55, Number 2, May 18, 2020. https://er.educause.edu/toc/educause-review-print-edition-volume-55-number-2-2020-issue-2, page 42

Additional References

  1. Cameron Kerry, “Keeping the fires burning for federal privacy legislation,” IAPP Privacy Perspectives, June 3, 2020. https://iapp.org/news/a/keeping-the-fires-burning-for-federal-privacy-legislation/
  2. Ron DeJesus, Founder and CEO DeJesus Consulting, “How to operationalize privacy by design,” IAPP The Privacy Advisor, May 27, 2020. https://iapp.org/news/a/how-to-operationalize-privacy-by-design/
  3. Lisa Ho, Campus Privacy Office at University of California, Berkeley, “Naked in the Garden: Privacy and the Next Generation Digital Learning Environment,” EDUCAUSE Review, July 31, 2017. https://er.educause.edu/articles/2017/7/naked-in-the-garden-privacy-and-the-next-generation-digital-learning-environment
  4. Nick Jones, David Cearley, “Gartner Top 10 Strategic Technology Trends for 2020: Transparency and Traceability,” March 10, 2020, ID G00450644. https://www.gartner.com/en/documents/3981951
  5. Bart Lazar, “Five Key Provisions a Federal Privacy Law Should Include,” CPO Magazine, June 1, 2020. https://www.cpomagazine.com/data-protection/five-key-provisions-a-federal-privacy-law-should-include/
  6. Emily Leach, Kevin Donahue, “Embedding data ethics into your ‘culture of privacy’,” IAPP The Privacy Advisor, May 27, 2020. https://iapp.org/news/a/embedding-data-ethics-into-your-culture-of-privacy/
  7. Bernard Woo, Bart Willemsen, “Gartner for IT Leaders: Toolkit Privacy Policy,” September 6, 2019, ID G00432942. https://www.gartner.com/en/documents/3957023

Appendix A

IAPP US State Comprehensive Privacy Law Comparison

[Table: IAPP comparison of state privacy legislation and the rights/obligations each provides, as of April 2020]

(List below copied from IAPP website.)

The 16 common privacy provisions include the following:

  • The right of access to personal information collected or shared – The right for a consumer to access from a business/data controller the information or categories of information collected about a consumer, the information or categories of information shared with third parties, or the specific third parties or categories of third parties to which the information was shared; or, some combination of similar information.
  • The right to rectification — The right for a consumer to request that incorrect or outdated personal information be corrected but not deleted.
  • The right to deletion — The right for a consumer to request deletion of personal information about the consumer under certain conditions.
  • The right to restriction of processing — The right for a consumer to restrict a business’s ability to process personal information about the consumer.
  • The right to data portability — The right for a consumer to request personal information about the consumer be disclosed in a common file format.
  • The right to opt out of the sale of personal information — The right for a consumer to opt out of the sale of personal information about the consumer to third parties.
  • The right against solely automated decision making — A prohibition against a business making decisions about a consumer based solely on an automated process without human input.
  • A consumer private right of action — The right for a consumer to seek civil damages from a business for violations of a statute.
  • A strict opt-in for the sale of personal information of a consumer less than a certain age — A restriction placed on a business to treat consumers under a certain age with an opt-in default for the sale of their personal information.
  • Notice/transparency requirements — An obligation placed on a business to provide notice to consumers about certain data practices, privacy operations, and/or privacy programs.
  • Data breach notification — An obligation placed on a business to notify consumers and/or enforcement authorities about a privacy or security breach.
  • Mandated risk assessment — An obligation placed on a business to conduct formal risk assessments of privacy and/or security projects or procedures.
  • A prohibition on discrimination against a consumer for exercising a right — A prohibition against a business treating a consumer who exercises a consumer right differently than a consumer who does not exercise a right.
  • A purpose limitation — An EU General Data Protection Regulation–style restrictive structure that prohibits the collection of personal information except for a specific purpose.
  • A processing limitation — A GDPR-style restrictive structure that prohibits the processing of personal information except for a specific purpose.
  • Fiduciary duty — An obligation imposed on a business/controller to exercise the duties of care, loyalty, and confidentiality (or similar) and act in the best interest of the consumer.

Appendix B

Institutions of Higher Education included in the benchmarking research:

  • Carnegie-Mellon University
  • Case Western Reserve
  • Cornell
  • Duke University
  • Georgia Tech
  • New Mexico State University
  • Notre Dame
  • Ohio State University
  • Penn State University
  • Purdue
  • Temple University
  • Texas A & M
  • University of California – System Level
  • University of California – Davis
  • University of California – Berkeley
  • University of California – Los Angeles
  • University of California – San Diego
  • University of California – Santa Cruz
  • University of Colorado
  • University of Connecticut
  • University of Florida
  • University of Illinois – Urbana-Champaign
  • University of Kentucky
  • University of Manitoba
  • University of Miami
  • University of Michigan – Ann Arbor
  • University of New Mexico
  • University of North Carolina – Chapel Hill
  • University of Pennsylvania
  • University of Texas – system level
  • Wayne State University

University of Washington Whitepaper: Data Anonymization and De-Identification: Challenges and Options (August 2019)

Data Anonymization Whitepaper [PDF]

Executive Summary

This whitepaper is intended to create a cohesive understanding of data anonymization and de-identification concepts, describe the risks and challenges associated with processing personal data that has been anonymized or de-identified, as well as briefly outline options for managing privacy risks related to re-identification. In sharing this paper among members of the University of Washington (UW) community, the UW Privacy Office aspires to raise awareness, accountability, and stewardship practices among those responsible for processing personal data at the UW.

For decades, those using data sets containing personal information have implemented a variety of techniques intended to “anonymize” and/or de-identify such data, in an effort to protect privacy and prevent future re-identification of the individuals. The combination of emerging technological capabilities and evolving cultural practices is causing many to conclude that data anonymization/de-identification is no longer possible and that related risks of re-identification of individuals will continue to grow. The associated privacy risks of re-identification are defined in various ways and often contemplate an organization’s values and privacy principles, trusted relationships with individuals, compliance obligations, reputation, and financial well-being. Privacy risks include the possibility of objective or subjective harms to individuals, including loss of liberty or opportunity, economic loss, social detriment, (unconscious or conscious) behavioral changes, and/or psychological dangers.1 When personal data is processed “unexpectedly,” it may result in one or more objective or subjective harms to individuals or in other corresponding violations of privacy.

As a University, we must think strategically about the paradox, and the challenges and possibilities associated with the important question of “How might the UW continue to protect the privacy of personal data and appropriately process such data to create innovative research, technologies, learning methods, products, and solutions?”

Background

The University of Washington (UW) collects, creates, and processes many types of personal data. In general, personal data is any information that identifies or can identify an individual, either on its own or in conjunction with other information. The processing of personal data can take various forms, such as collecting, recording, using, sharing, adapting, altering, or storing structured data, unstructured data, or metadata. As stewards of personal data, all members of the UW community who process personal data have a role in helping protect the privacy of personal information.

In order to manage privacy risks associated with the processing of personal data, users of personal data (e.g., analysts, researchers, data scientists, fiscal or computing specialists) use a variety of techniques to manipulate the data elements, including redaction, pseudonymization, de-identification, and/or anonymization. These techniques differentially modify or impact the personal data elements included, which can be classified by the level of “identifiability” implicit in each along an identifiability continuum:3

  • Direct identifiers serve to “uniquely” identify an individual, and include data elements such as Social Security Numbers, Tax ID numbers, Passport numbers, full names or addresses.
  • Indirect identifiers, while not unique to an individual, can be combined with other indirect identifiers to identify an individual among a set of individuals. Indirect identifiers include items such as zip code, birth date, IP address, etc.
  • Other personal data elements may be associated with multiple individuals, such as level of education, area of study, or communication preferences, and within a single data set a combination of such elements often does not allow the identification of a single individual.
  • When data have been appropriately manipulated, combined or aggregated (perhaps in census data or survey results) they typically can no longer be linked to any individual, and are considered anonymized.
  • Finally, some data elements (such as weather) are simply not related to individuals, and would not be considered personal information.
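
To make the continuum above concrete, the following minimal sketch (illustrative only; the record and field names are hypothetical and not drawn from any UW system) tags the fields of a sample record by identifiability level.

# Illustrative sketch: tag fields of a hypothetical record by the
# identifiability levels described above.
IDENTIFIABILITY = {
    "ssn":           "direct",          # uniquely identifies an individual
    "full_name":     "direct",
    "zip_code":      "indirect",        # identifying only in combination
    "birth_date":    "indirect",
    "ip_address":    "indirect",
    "education":     "other personal",  # shared by many individuals
    "temperature_f": "non-personal",    # not related to an individual
}

record = {
    "ssn": "539-12-3456", "full_name": "Jane Doe", "zip_code": "63101",
    "birth_date": "1970-08-15", "ip_address": "192.0.2.10",
    "education": "BA", "temperature_f": 72,
}

for field, value in record.items():
    print(f"{field:14} {IDENTIFIABILITY[field]:15} {value}")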

Techniques

Data sets containing either direct or indirect identifiers are generally perceived to be more useful for research or analytics, and typically present greater risks to individual privacy. Historically, in order to reduce such privacy risks, the following techniques, described in a simplified manner, have been used:3

1. Deletion, redaction or obfuscation:

Direct identifiers are covered, eliminated, removed or hidden. These techniques are difficult to accomplish well, particularly on unstructured data, and use of unsophisticated techniques may enable easy re-identification.

Example: Jane Doe – DOB 8/15/1970 – St. Louis -> XXXXXXXX – DOB 8/15/1970 – St. Louis
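
As a purely illustrative sketch (the field names and data are hypothetical, and this is not a UW-endorsed method), the following masks the direct identifier while leaving the indirect identifiers untouched, which is why simple redaction alone may still permit re-identification.

# Minimal redaction sketch: mask only the direct identifier (the name).
# Indirect identifiers (birth date, city) remain, so re-identification
# risk is reduced but not eliminated.
def redact_name(record: dict) -> dict:
    redacted = dict(record)
    redacted["name"] = "X" * len(record["name"])
    return redacted

record = {"name": "Jane Doe", "dob": "8/15/1970", "city": "St. Louis"}
print(redact_name(record))
# {'name': 'XXXXXXXX', 'dob': '8/15/1970', 'city': 'St. Louis'}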

2. Pseudonymization:

Information from which direct identifiers have been eliminated, transformed or replaced by pseudonyms, but indirect identifiers remain intact. Re-identification may occur where there is failure to secure the pseudonymization method or key used, and/or when reverse engineering is successful.

Example: Jane Doe – DOB 8/15/1970 – St. Louis -> ID:TRXD 8/15/1970 St. Louis
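
One common pseudonymization approach is a keyed hash (HMAC) over the direct identifier; the sketch below is a simplified, hypothetical example, not a recommended implementation. Anyone who holds the key, or who can reverse-engineer the method, can re-link the pseudonym to the person, which is why securing the key is essential.

# Pseudonymization sketch: replace the direct identifier with a keyed
# pseudonym. Indirect identifiers (DOB, city) are left intact, so the
# record is pseudonymized, not anonymized.
import hashlib
import hmac

SECRET_KEY = b"example-key-keep-out-of-source-control"  # hypothetical key

def pseudonymize(record: dict) -> dict:
    digest = hmac.new(SECRET_KEY, record["name"].encode(), hashlib.sha256)
    out = dict(record)
    out["name"] = "ID:" + digest.hexdigest()[:8].upper()
    return out

record = {"name": "Jane Doe", "dob": "8/15/1970", "city": "St. Louis"}
print(pseudonymize(record))
# e.g. {'name': 'ID:3F0A...', 'dob': '8/15/1970', 'city': 'St. Louis'}
# (the pseudonym value depends on the key)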

3. De-identification:

Direct and known indirect identifiers (perhaps contextually identified by a particular law or regulation, e.g., HIPAA) have been removed or mathematically manipulated to break the linkage to identities.

Example: Jane Doe – DOB 8/15/1970 – St. Louis -> Female 1970 Missouri
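
A minimal generalization sketch, again with hypothetical fields: the direct identifier is dropped and the known indirect identifiers are coarsened (exact birth date to birth year, city to state), breaking the obvious linkage to an identity.

# De-identification sketch: drop the direct identifier and generalize
# known indirect identifiers (DOB -> birth year, city -> state).
CITY_TO_STATE = {"St. Louis": "Missouri"}  # hypothetical lookup table

def deidentify(record: dict) -> dict:
    return {
        "sex": record["sex"],
        "birth_year": record["dob"].split("/")[-1],
        "state": CITY_TO_STATE.get(record["city"], "Unknown"),
    }

record = {"name": "Jane Doe", "sex": "Female",
          "dob": "8/15/1970", "city": "St. Louis"}
print(deidentify(record))
# {'sex': 'Female', 'birth_year': '1970', 'state': 'Missouri'}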

4. Anonymization:

Direct and indirect identifiers are removed or manipulated together with mathematical and technical guarantees, often through aggregation, in order to prevent re-identification. Anonymization is intended to be irreversible.

Example: Jane Doe – DOB 8/15/1970 – St. Louis -> Female Adult Missouri
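
One simple approach to anonymization is aggregation with small-group suppression, sketched below with hypothetical data and a hypothetical threshold; by itself this is not a guarantee against re-identification, but it illustrates reporting on groups rather than individuals.

# Anonymization-by-aggregation sketch: report only group counts and
# suppress groups smaller than a minimum size, so no output row maps
# back to a single person.
from collections import Counter

MIN_GROUP_SIZE = 5  # hypothetical suppression threshold

def aggregate(records):
    counts = Counter((r["sex"], r["age_band"], r["state"]) for r in records)
    return {group: n for group, n in counts.items() if n >= MIN_GROUP_SIZE}

records = [{"sex": "Female", "age_band": "Adult", "state": "Missouri"}] * 6
records += [{"sex": "Male", "age_band": "Adult", "state": "Missouri"}] * 2
print(aggregate(records))
# {('Female', 'Adult', 'Missouri'): 6}  -- the group of 2 is suppressed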

Note that encryption is sometimes inaccurately thought of as an obfuscation or de-identification technique. It is not such a technique; rather, it is a security measure intended to protect personal data that may contain any combination of identifiable data elements.

Re-identification Motivations, Methods, and Myths

Motivations for re-identification of individual data subjects can vary widely:

  • Threat actors may want to re-identify in order to conduct identity theft and fraud or social engineering of individuals.
  • Statisticians, data analysts, data scientists or other users of personal data may want to re-identify when challenged to prove data cannot (or that it can) be re-identified.
  • Scientific researchers may attempt to re-identify in order to further test related hypotheses.
  • Capitalistic individuals and organizations increasingly want to re-identify in order to profile individuals, monetize personal data, or use personal data in ways that may not be expected, anticipated, or desired by the individuals. This is exemplified by data brokers, marketing and advertising organizations, social media firms, and so many other types of organizations today.

Among the prevalent methods used when attempting re-identification of anonymized or de-identified data:

  • “Reverse” redaction (as seen in the movie “Hidden Figures,” where a manually redacted document was exposed to an intense light source, or as in the technological glitch in the redaction of the Manafort legal documents, where cutting and pasting redacted text into a new document rendered the text visible once again);
  • “Reverse” pseudonymization – uncovering the methods, accessing the key, or reverse-engineering the pseudonymization techniques implemented;
  • Increasingly, combining or linking data with other data sets available either publicly or for purchase. As eloquently phrased by Boris Lubarsky, “The proliferation of publicly available information online, combined with increasingly powerful computer hardware, has made it possible to re-identify ‘anonymized’ data.”4
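
To illustrate the linkage method described in the last bullet, here is a minimal, entirely hypothetical sketch: a “de-identified” record that retains indirect identifiers is joined against a public list on those same fields, re-identifying the individual.

# Linkage re-identification sketch (hypothetical data): joining a
# "de-identified" record with a public data set on shared indirect
# identifiers (zip code, birth date, sex) can reveal the identity.
deidentified = [
    {"zip": "63101", "dob": "1970-08-15", "sex": "F", "diagnosis": "asthma"},
]
public_list = [
    {"name": "Jane Doe", "zip": "63101", "dob": "1970-08-15", "sex": "F"},
    {"name": "John Roe", "zip": "98105", "dob": "1985-02-03", "sex": "M"},
]

def link(deid_records, public_records):
    keys = ("zip", "dob", "sex")
    matches = []
    for d in deid_records:
        for p in public_records:
            if all(d[k] == p[k] for k in keys):
                matches.append({**d, "name": p["name"]})
    return matches

print(link(deidentified, public_list))
# [{'zip': '63101', 'dob': '1970-08-15', 'sex': 'F',
#   'diagnosis': 'asthma', 'name': 'Jane Doe'}]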

Finally, there are many myths about de-identification methods that simply must be refuted:5

  • Myth 1: Only highly knowledgeable data scientists can re-identify individuals within anonymized or de-identified data sets. Readers of this paper are encouraged to review the resources linked below for many recent examples where students and/or junior analysts successfully re-identified individuals within data sets.
  • Myth 2: De-identified data can be used for any purpose. To limit resulting privacy risks to organizations and individuals, stewards of personal data should limit personal data use to the original purpose for data collection/creation.
  • Myth 3: Once data is de-identified, it can be given to any recipient. Corresponding privacy risk of re-identification will likely increase with each subsequent release of de-identified data.

Basic Privacy by Design Steps to Help Protect Personal Data

Privacy by design, in general, is the protection of personal data by embedding privacy practices into University operations, business processes, information systems, and technologies, including: at the earliest design stage when initially determined that data processing will involve personal data; during data processing; and at the conclusion of the information lifecycle when personal data is no longer needed for the purpose it was collected or created by the University. The following privacy by design steps may help address the data anonymization and de-identification challenges and options.

  1. Clearly articulate the purpose for processing any personal data.
  2. Only collect what is required for the specific purpose.
  3. Plan how to protect personal data and identities before you collect or create data.
  4. Protect data as required under all applicable laws (review the UW Privacy Laws webpage). Note that personal data protection requirements often differ substantially, at times in conflict, under the large number of state, federal, international, and federation-level privacy laws.
  5. Anticipate privacy risks, and identify possible consequences of related harms should the data be compromised, in order to ensure the benefits of personal data collection outweigh possible costs.
  6. Provide notice (or seek consent) about data collection and purpose.
  7. Be intentional about the technique used to redact or obfuscate, de-identify, pseudonymize, or anonymize personal data to ensure compliance with relevant laws or regulations.
  8. Control and monitor access to details of de-identification process and/or keys.
  9. If you need to de-identify or anonymize data elements, engage with colleagues to discuss options and issues, and to identify reliable sources and strategies for de-identification.
  10. If sharing de-identified data, specify in agreements with recipients that the data will not be re-identified.
  11. Be mindful of sustaining data integrity, and include practices to periodically refresh personal data.
  12. Manage privacy across the entire data lifecycle from collection/creation to data destruction consistent with records retention schedules, once data purpose has been achieved.
  13. Acknowledge (and wherever possible, offset) increasing likelihood of re-identification privacy risks for the UW and individuals.
  14. Be prepared to respond to data subjects’ requests to exercise their various privacy rights under evolving laws.

Future Anonymization/De-identification Challenges

The many challenges associated with use of anonymized or de-identified personal data are expected to grow as a result of a combination of shifting factors:

  • New privacy legislation has been and is expected to be introduced at all levels of privacy law;
  • Existing privacy laws are expected to be reviewed, revised, updated or pre-empted;
  • De-identification and anonymization techniques, as well as re-identification techniques, will continue to be revised, enhanced, and invented; and
  • Technological improvements in computing capacity will continue.

…(T)he gathering evidence shows that all of the (“de-identifying”) methods are inadequate, said Dr. de Montjoye. “We need to move beyond de-identification,” he said. “Anonymity is not a property of a data set, but is a property of how you use it.”6

Adopting broad “Privacy by Design” practices throughout UW helps ensure continued stewardship and protection of the vast personal data under UW’s care.

Citations

1 R. Jason Cronk, Strategic Privacy by Design, International Association of Privacy Professionals (IAPP) 2018, https://iapp.org/ (search for print or electronic copy of this book), pages 154-155. The privacy harms content within Cronk’s book is based upon The Boundaries of Privacy Harm, Ryan Calo, Indiana Law Journal, Vol. 86, No. 3, 2011, written July 16, 2010​ https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1641487

2 Boris Lubarsky, Re-Identification of “Anonymized” Data, Georgetown Law Technology Review, https://georgetownlawtechreview.org/re-identification-of-anonymized-data/GLTR-04-2017/, pages 203-204

3 Barbara Sondag, Elimu Jajunju, and Elena Elkina, Privacy.Security.Risk.2017 conference presentation entitled: Global Technological and Legal Effects of De-Identification and Anonymization, (no longer available online)​, slide 2.

4 Boris Lubarsky, Re-Identification of “Anonymized” Data, Georgetown Law Technology Review, https://georgetownlawtechreview.org/re-identification-of-anonymized-data/GLTR-04-2017/, page 203

5 Barbara Sondag, Elimu Jajunju, and Elena Elkina, Privacy.Security.Risk.2017 conference presentation entitled: Global Technological and Legal Effects of De-Identification and Anonymization, (no longer available online)​, slide 5.

6 Gina Kolata, “Your Data Were ‘Anonymized’? These Scientists Can Still Identify You”, New York Times, July 23, 2019. https://www.nytimes.com/2019/07/23/health/data-privacy-protection.html

Additional Resources

Many additional resources were reviewed to inform overall content. The following were of particular help:
Ryan Calo, “The Boundaries of Privacy Harm”, Indiana Law Journal, Vol. 86, No. 3, 2011, written July 16, 2010​, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1641487. ​

Mark Elliot, Elaine Mackey, Kieron O’Hara and Caroline Tudor, The Anonymisation Decision-Making Framework, United Kingdom Anonymisation Network (UKAN), University of Manchester, Oxford Road Manchester, M139PL, 2016, https://ukanon.net/ukan-resources/ukan-decision-making-framework/.

Kelsey Finch, A Visual Guide to Practical Data De-Identification, Produced by the Future of Privacy Forum (FPF), https://fpf.org/blog/a-visual-guide-to-practical-data-de-identification/.

Personal Data Protection Commission, Singapore, “Guide to Basic Data Anonymisation Techniques”, Published 25 January 2018, https://www.pdpc.gov.sg/-/media/Files/PDPC/PDF-Files/Other-Guides/Guide-to-Anonymisation_v1-(250118).pdf.

Jules Polonetsky, Omer Tene, Kelsey Finch, “Shades of Gray: Seeing the Full Spectrum of Practical Data De-identification”, Vol. 56, Number 3 Santa Clara Law Review, Article 3 (6-17-2016). Available at: https://digitalcommons.law.scu.edu/lawreview/vol56/iss3/3/.

Balaji Raghunathan, The Complete Book of Data Anonymization: From Planning to Implementation, Boca Raton: CRC Press, Taylor & Francis Group, 2013

Luc Rocher, Julien M. Hendrickx & Yves-Alexandre de Montjoye, “Estimating the success of re-identifications in incomplete datasets using generative models”, Nature Communications 10, Article number: 3029 (23 July 2019). Available at Nature Communications: https://www.nature.com/articles/s41467-019-10933-3

Ira Rubinstein, Woodrow Hartzog, “Anonymization and Risk”, 91 Washington Law Review 703 (2016); NYU School of Law, Public Law Research Paper No. 15-36, (August 17, 2015). Available at SSRN: https://ssrn.com/abstract=2646185

Alexandra Wood, Micah Altman, Aaron Bembenek, Mark Bun, Marco Gaboardi, James Honaker, Kobbi Nissim, David R. O’Brien, Thomas Steinke & Salil Vadhan (Harvard University Privacy Tools Project), “Differential Privacy: A Primer for a Non-technical Audience”, Vanderbilt Journal of Entertainment & Technology Law (JETLaw), Vol. 21, Issue 1, pages 209 – 276, http://www.jetlaw.org/journal-archives/volume-21/volume-21-issue-1/differential-privacy-a-primer-for-a-non-technical-audience/.

Opt In Versus Opt Out

Background

In today’s information age, it can be hard to know whether signing up for a product, service, catalog, newsletter, or discount will result in your information being shared, sold, or otherwise used without your knowledge. The concept behind informing consumers how their information is being used is known as “transparency.” Mobile app providers, websites, financial institutions, and other holders of personal information may be subject to laws regarding the information they collect and how it is used, but often the law has not kept pace with technology.

The issue of control over one’s data often comes down to whether individuals know about, and can act on, the choice to opt in or opt out of certain data collection practices. What does it mean to “opt out” of certain data practices? Alternatively, what does it mean to “opt in” to such practices? The real-life stories below provide examples of these two concepts.

Opt Out

With an “opt out” approach, the consumer must actively take steps to remove themselves from participation in data tracking or from the sharing of their information. Examples of “opt out” scenarios include apps automatically using your profile or tracking your behaviors, interests, likes, or location, and companies sharing or even selling personal information about individuals to subsidiaries. The recipient is often termed a “partner providing services of possible interest to you.”

Businesses may or may not be required to disclose that tracking technologies or data sharing are in use. Additionally, it is sometimes cumbersome for consumers to remove themselves from lists they did not sign up for or to deactivate technologies they did not actively “turn on” in the first place.

Opt In

The benefits of collecting information about consumers of a product or business cannot be ignored; however, there are ways of collecting this information that afford consumers more control over their personal information. A hallmark of the “opt in” strategy is transparency between the information gatherer and the consumer. The individual is fully informed of how, where, and with whom personal information is shared. In some instances, individuals are given a choice about the level of tracking or data sharing related to the product or service they use.