The Evolution of Digital Identity

Kayode Ezike
Aug 2, 2019
Me reviewing various forms of digital and analog identity: Driver’s license | Passport | Digital MEng diploma issued by MIT via Blockcerts | SolidVC: Verifiable Credentials framework developed for MEng thesis

My foray into the world of identity occurred on a rainy Valentine’s evening of ’96. This is the day I was born to two Nigerian Igbo parents in the Bronx. At the time, “my foray” was really that of my parents, who acted in my best interest as my trusted stewards. As they laid eyes on me for the first time, their central nervous systems must have fired a starting pistol for an evening-long race among competing thoughts of gratitude, joy, and worry. The likely underdog for the event was the question of birth registration, and yet it probably placed highly at the finish line that night, as this is the question that informs so much of our practical functionality, livelihood, and identity in the United States.

From this moment, all sorts of claims were being made about me: “I am a patient at this hospital”, “I was born in this state”, “I am a citizen of this country.” Anyone who was present in the room at the time of my birth could confidently assert that these claims, which serve as the basis for so many other future privileges and services, are true. Nevertheless, modern civilization has recognized a need and developed an accompanying set of mechanisms for formally and reliably recording these statements about me. Thus began my journey through the domains of identity.

Later on, my parents would visit the Social Security Administration on my behalf to register a Social Security Number, a highly linked, government-issued identifier that once would have only determined my pension payout, but now grants me access to anything from rental services to financial credit. And as I grew older, I would begin registering components of my identity as needed: a passport to certify my nationality during international travel; debit and credit cards to build financial stability, purchase goods and services, and access credit; and a driver’s license to get around by car. Soon, even companies would vie for a piece of my identity as I connected with friends on social media and purchased goods on e-commerce services.

My experience with identity is hardly a unique one in the Western world. In order to operate as functional members of society, we are all often expected to register components of our identity with authoritative identity providers for the sake of accessing future services from these same providers or other third parties. However, it is worth noting that the currently accepted processes by which identity is registered, asserted, and verified were not destined to become the status quo.

The following report is an abridged historical account of identity management systems. I acknowledge that the account is, by no means, a comprehensive one and is rather intended to highlight some of the major developments in the industry for the purpose of understanding the state of the art and equipping the reader with the tools necessary to shape the future of the identity management space.

Analog Identity

For as long as people have inhabited Earth, there has always been an implicit need for people and institutions to know the entities with whom and which they are interacting. The nature of such a relationship is typically informed by previous interactions and, in conjunction with current interactions, informs the future of the relationship. With that framework in mind, the most important aspect of identity management is the ability to record and retrieve information about the dynamic relationships between a subject and the entities with whom and which they have interacted in the past (including themselves).

Throughout history, people have developed a colorful assortment of models for capturing and processing this information. In this section, I will explore the history of analog identity management systems, using Europe as the anchoring case study.* I decided to focus on Europe because its identity management practices directly influenced many other identity management systems around the world.

(*Note: I would be remiss if I neglected to mention that the following case study was inspired by a distilled account of Mawaki Chango’s brilliant thesis, Becoming Artifacts: Medieval Seals, Passports and the Future of Digital Identity, as provided in Domains of Identity by Kaliya “Identity Woman” Young. More on Young's paper later!)

Feudalistic Identity

In medieval Europe, feudalism was the dominant social system in place. In this system, servants (vassals) agreed to provide labor for landowners (lords) in exchange for inhabitance on the land (fief). Privileges and responsibilities of vassals were specified by the lords and were based on role and social relationships within the system. At the time, the Roman Catholic Church served as a surrogate government, establishing a system of laws and recording basic information about the populace, such as parish membership and duty fulfillment.

Coat of Arms of Anne, the Princess Royal (Source: Sodacan)

It was in this context that one of the earliest forms of identity documentation arose in the form of seals, which were used by monarchs to assert authority in different domains as well as by commoners to authenticate themselves to other civilians and institutions. Additionally, a seal served as an official assertion that the individual would stand by the set of claims in the document bearing it. The reader will appreciate this concept more when we discuss Verifiable Credentials and Self-Sovereign Identity in a later section of the report.

Corporate Identity

Soon, seals took on a greater role as corporations entered the scene. Now, an entity representing a collective of individuals could define its own rules and adopt the status of a legal person. Because a corporation was composed of individuals with a set of special privileges, those individuals would need to register with the corporation in order to take actions on its behalf in the future. This registration process produced an “identity primitive” document bearing the company seal.

First Model of the Modern Passport

In 1414, King Henry V introduced the first example of the passport in the Common Era for the sake of identifying his subjects as they traveled outside of the country. Couriers would carry passports with them as they traveled, receiving stamps at regular border checks en route to their final destination.

Left: King Henry V of England (Source: Unknown) | Right: British Passport (Source: Mike Rohsopht)

Origins of Birth Registration

With the rise of the modern state, governing bodies felt the urge to collect general information about the populace in their jurisdiction, which required impersonal procedures. Whereas subjective assertions of social relationships were enough in the feudal system, they were insufficient for the modern state, which desired to know objective facts such as place of birth. In 1538, Thomas Cromwell created a nationwide system for the registration of all births, deaths, and marriages within the Church of England. Cromwell’s intention was for this system to serve as a legal mechanism for resolving questions regarding age, lineage, inheritance, and citizenry.

State of the Art

Today, many of the analog primitives of identity management persist in different forms of expression around the world. Passports and birth certificates remain staples in civil society in addition to other forms of achievement credentials, such as driver’s licenses and academic diplomas, which also descend from the seal family of medieval times. Additionally, the introduction of Social Security by President Franklin D. Roosevelt as a means of tracking monetary contributions made towards personal social benefit accounts brought with it a litany of privacy and security issues that have reared their ugly heads in the Digital Era.

Digital Identity

With the emergence of the Web, many services migrated from the physical domain into the digital domain. This development necessitated processes for digital identity management. Today, online identity is far from perfect and many wish that they could return to its early days to influence an alternate course of development. Nevertheless, it is important to understand the historical context within which the components of digital identity emerged. In this section, I will shed some light on this history and share some of the common practices in digital identity management today.

Emergence of the World Wide Web

In 1989, Sir Tim Berners-Lee leveraged old and new concepts to develop a generational technology: the World Wide Web (a.k.a., the Web). With this technology, users could publish and retrieve data hosted remotely on heterogeneous machines in a human-readable format across the Internet without understanding the intricate workings of each machine. The Web would prove to be a valuable utility for people around the world. However, it came with its own issues, including cyberbullying, denial of service (DoS) attacks, and various forms of personal data breaches. One concern that came to light with the emergence of online service providers is the matter of digital identity. People want to know with whom and what they are connecting online. This service was not provided directly by the Internet, so service providers took matters into their own hands, resulting in the ad hoc digital identity ecosystem we have today.

Sir Tim Berners-Lee, Inventor of the World Wide Web (Source: CERN)

Authentication Factors

If a Web service is the online equivalent of a party, then authentication factors are the various acceptable forms of evidence presented to the bouncer for entry into the party venue: a physical invitation delivered to the invitee’s home address, a virtual invitation delivered to the invitee’s e-mail address, a government ID card that includes a photo and matching name of the invitee from the invitation, etc. Depending on whom you ask, there are three to five standard authentication factors that are widely used in practice:

  • Knowledge Factor: This authentication factor tests for what information the subject knows. Common examples of factors that fall under this category include passwords and PINs.
  • Possession Factor: If you have ever purchased an item with a credit card or used the randomly generated access code on a key fob to access admin privileges to a computer system, you are familiar with the possession factor of authentication. This factor tests for what the subject owns.
  • Inherence Factor: Who are you, and what inherent physical qualities do you embody that make you unique? These are the questions that the inherence factor aims to address. Inherence factors typically come in the form of biometrics, such as fingerprints and various forms of eye scans (e.g., iris or retina).
  • Location Factor: It’s difficult to travel quickly between two distant locations and it’s impossible to be in multiple places at once. Web services deploy location factors to exploit these simple, but powerful, constraints in order to detect suspicious parallel or successive login activity and restrict access when necessary. This authentication factor is typically verified via Internet Protocol (IP) addresses.
  • Behavior Factor: This authentication factor concerns itself with the unique behavioral aspects of the subject’s interaction with the application. Example behavioral pattern types captured in this factor include those of keyboard typing, mouse clicking, and mouse dragging. Arguably the most sophisticated and underutilized factor of authentication, the behavior factor is a powerful technique that acknowledges the following approximation: each human being is capable of a unique behavioral action set, which possesses an entropy that distinguishes it from the behavioral action set of any other human being. In other words, although we can expect human beings to exhibit slight divergences between the previously registered behavioral action and the currently exhibited behavioral action, we can also expect each divergence to be consistent enough in magnitude and direction across all other previous interactions, so as to distinguish one user from another.
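
As a rough illustration of the location factor described above, the following sketch flags two successive logins whose implied travel speed is implausible. The function names, the 900 km/h threshold, and the use of latitude/longitude pairs are all assumptions made for this example; a real service would first have to geolocate IP addresses, which is itself an approximate process.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 900.0  # assumption: roughly airliner cruising speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_suspicious(prev_login, next_login):
    """Each login is a (latitude, longitude, unix_seconds) tuple."""
    distance = haversine_km(prev_login[0], prev_login[1],
                            next_login[0], next_login[1])
    hours = abs(next_login[2] - prev_login[2]) / 3600
    if hours == 0:
        # Simultaneous logins are suspicious only if they come from different places
        return distance > 0
    return distance / hours > MAX_PLAUSIBLE_SPEED_KMH
```

For instance, a login from New York followed one hour later by a login from London would be flagged, while the same pair ten hours apart would not.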

Single-Factor Authentication

The average online user has accustomed themselves to a particular workflow for registering with Web apps and services that create, display, or update personal datasets about themselves. For illustrative purposes, let’s imagine a fictional online service called Generic Social Web Service (GenSWS). Before I even reveal anything about the functionality of GenSWS, the reader can probably already imagine what the registration process would likely look like:

  • Provision of a series of personally identifiable information (PII) from the user, including contact information (name, e-mail, phone number)
  • Selection of username or provision of account information from Google or Facebook, which would have already performed the diligence of identifying and registering the user in advance
  • Configuration of password as authentication factor
  • [Optional] Configuration of security question in the event of compromise or loss of authentication factor
  • [Optional] Administration of a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA)
  • Asynchronous verification of user identity by means of contact information provided above

The GenSWS login process is probably as easily conceivable:

  • Provision of username or other registered identifier
  • Provision of password

The registration and login processes for GenSWS presented above roughly describe the processes by which the average online service registers and authenticates users on the Web, modulo slight variations here and there in PIIs of interest and overall process. Together, these processes represent the single-factor authentication (SFA) workflow, where the authentication factor of interest is the knowledge factor.
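
To make the knowledge-factor workflow concrete, here is a minimal sketch of what a service like the fictional GenSWS might do behind the scenes when registering a password and checking it at login. The in-memory store, the function names, and the iteration count are assumptions for illustration, not a description of any particular service.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: username -> (salt, password_hash)
_users = {}


def _hash_password(password, salt):
    # PBKDF2-HMAC-SHA256; the iteration count is an illustrative choice
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)


def register(username, password):
    """Store a salted hash of the password, never the password itself."""
    if username in _users:
        raise ValueError("username already taken")
    salt = os.urandom(16)
    _users[username] = (salt, _hash_password(password, salt))


def login(username, password):
    """Return True only if the supplied password matches the enrolled one."""
    record = _users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(stored_hash, _hash_password(password, salt))
```

The point of the salt and the slow hash is that a leaked user store does not directly reveal the knowledge factor itself.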

Multi-Factor Authentication

Whenever you are asked to provide more than one factor of authentication to register or authenticate yourself with a service, you are experiencing the multi-factor authentication (MFA) workflow. To understand this workflow, let’s revisit a modification of an earlier example of the possession factor: debit card purchase. When purchasing goods at the register with a debit card, the customer is typically solicited to enter a PIN after inserting the card into a card reader. This is a model example of MFA because it requires the subject to enroll and assert two factors of authentication: a possession factor in the form of the physical debit card and a knowledge factor in the form of the PIN. The typical MFA workflow involves two authentication factors, commonly known as two-factor authentication (2FA), but there are exceptional examples of high-security computer systems in practice that require three or more factors.
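
The sketch below shows one way a two-factor check could be assembled: a PIN as the knowledge factor and a time-based one-time code (TOTP, as standardized in RFC 6238) as the possession factor, similar in spirit to the key fob mentioned earlier. The function `verify_two_factors` and its parameters are hypothetical, and a real system would store a hash of the PIN rather than the PIN itself.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at_time=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238) using HMAC-SHA1."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_two_factors(entered_pin, entered_code, enrolled_pin,
                       device_secret, at_time=None):
    """Both factors must pass; inputs are assumed to be ASCII strings."""
    knowledge_ok = hmac.compare_digest(entered_pin, enrolled_pin)
    possession_ok = hmac.compare_digest(entered_code,
                                        totp(device_secret, at_time))
    return knowledge_ok and possession_ok
```

Only someone who both knows the PIN and holds the device containing `device_secret` can produce a passing pair of inputs.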

Two-Factor Authentication (Source: Duo Security)

Out-of-Band Authentication

In order to make the most out of multi-factor authentication and its potential for heightened security, services should make use of out-of-band (OOB) authentication. With this technique, users are solicited for each authentication factor from a different communication channel. An example of OOB authentication is a login process that requires the user to enter their username and password in an interface on their computer and subsequently enter a verification code that is sent to their phone.

The benefit of this authentication workflow is that it raises the barrier for wrongful access, since it requires the attacker to compromise two separate and independent communication channels to gain entry. While this technique is most often employed by financial institutions and other security-critical organizations, it is generally recommended for any service looking to bolster its MFA workflow.

Single Sign-On

With time, practitioners in the Web community began to realize that authentication practices are repetitive and that new, authentication-light players should foster a relationship of trust with existing, authentication-heavy players in order to recycle the authentication and authorization data for their own use. In other words, if Alice has registered an account with Domain A and Alice now wants to register with Domain B, if Domain B trusts that Domain A has properly identified Alice during registration and has optionally collected additional data about Alice, then that information should be captured and transferred to Domain B. At least, that is the belief of single sign-on (SSO) advocates.

With SSO, users register with one authentication service and use their authentication credentials (i.e., username and password) from that service to register with and log in to another service. Examples of specifications and services designed and implemented to support this workflow include Facebook Connect, Security Assertion Markup Language (SAML), Microsoft Account (formerly Microsoft Passport), Open Authorization (OAuth), and OpenID Connect (OIDC).

Privacy

With the existence of government and commercial surveillance and the emergence of the lucrative data broker and black market industries, the public cry for privacy has reached a fever pitch. The following are a series of practices and initiatives, old and new, that have been developed to protect personal identity-bearing data from unintended exposure and misuse and/or to secure communication of such sensitive data between privileged parties.

Cryptography

A central theme of privacy is protection of sensitive data from access by untrusted parties. In this light, if privacy is the big picture, cryptography is the paint brush. Cryptography is the study of techniques that are capable of facilitating secure and reliable communication of information in the midst of unauthorized adversaries. Many, if not most, privacy-centered systems, frameworks, algorithms, and applications leverage cryptographic techniques at one or more layers of abstraction. In fact, a number of these will come to light throughout the remainder of this section.

Transport Layer Security

People are not the only entities with digital identity. Oftentimes, it is necessary to capture information about relationships involving inanimate objects, including physical architecture, computational machinery, and even software. Websites are no exception to this rule. In fact, if you have ever visited a site beginning with https, you have unwittingly made use of a technology that authenticates websites: Transport Layer Security (TLS). The successor to Secure Sockets Layer (SSL), TLS is a cryptographic protocol designed primarily to provide communications security between Web browsers and Web servers over a computer network. The TLS protocol consists of two subprotocols: the handshake protocol and the record protocol.

The first of these protocols is responsible for authenticating the server to the client and establishing cryptographic parameters for future communications, such as the encryption algorithm and shared session key. The latter protocol is what actually enables secure communication with the session key established by the former. SSL, the predecessor of TLS, was originally developed by Netscape in the mid-1990s after they realized that such an authentication service is critical for e-commerce. In fact, we can thank TLS for any degree of confidence we have in the exclusive communication of our credit card number with authorized recipients on the Web.
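
For readers who want to see the handshake from the client's side, the following sketch uses Python's standard `ssl` module to connect to a server and report what was negotiated. The helper names `make_client_context` and `describe_tls_session` are my own, and the hostname passed in is whatever site the reader chooses to inspect.

```python
import socket
import ssl


def make_client_context():
    # The default context verifies the server's certificate chain and hostname
    return ssl.create_default_context()


def describe_tls_session(hostname, port=443, timeout=5.0):
    """Open a TLS connection and report what the handshake negotiated."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the handshake; it raises ssl.SSLError
        # if the server cannot be authenticated
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return {
                "protocol": tls_sock.version(),   # negotiated protocol version
                "cipher": tls_sock.cipher()[0],   # negotiated cipher suite name
                "subject": tls_sock.getpeercert().get("subject"),
            }
```

Calling `describe_tls_session("example.org")` requires network access; everything after the handshake (the record protocol) is handled transparently by the wrapped socket.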

Differential Privacy

Digital identity systems allow us to know the individuals with whom we are interacting online. Many privacy-enhancing technologies (PETs) achieve the opposite effect: they prevent us from knowing the individuals involved in a certain scenario. One example of this class of systems is differential privacy, a characteristic of dataset-consuming algorithms that limits the identifiability of individuals who submit their personal data as part of a statistical experiment. The goal of differentially private algorithms is to limit the potential for linkage attacks that result from stateful sequential queries against a statistical database by introducing slight perturbations in the user data collected.

Consider a statistical database containing information correlating various personal factors with neurodegenerative diseases. The experiment reveals that football players are predisposed to Chronic Traumatic Encephalopathy (CTE). Differentially private algorithms aim to enable discovery of such useful aggregate insights without exposing individual, personal data handles that could be linked with other datasets to discover potentially compromising information.

The intuition behind differential privacy is that if User A is not included in a statistical experiment, then there is 0% chance that they can be identified from analysis of dataset D_without that was gathered as part of the experiment. Therefore, if analysis of D_with, which includes User A, closely resembles the analysis of D_without, then there is only a small, bounded chance that User A can be identified from analysis of dataset D_with. For readers interested in learning about related privacy-preserving data analysis schemes, be sure to also explore k-anonymity and federated learning, respectively an older and a more recent approach that are often discussed alongside differential privacy.
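
One common way to achieve this resemblance is the Laplace mechanism, which adds calibrated random noise to a query result. The sketch below applies it to a simple counting query; the function names and the choice of epsilon are assumptions for illustration, and production systems also need to track the privacy budget spent across queries.

```python
import random


def laplace_noise(scale, rng=random):
    # The difference of two i.i.d. exponential samples is Laplace-distributed
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Counting query released under epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1 / epsilon, rng)
```

A smaller epsilon means more noise and stronger privacy; averaged over many individuals the aggregate trend is still visible, but the presence or absence of any single record is masked.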

Zero-Knowledge Proof

Sometimes, we want to prove ownership or knowledge of a hidden resource without revealing the resource. For illustrative purposes, consider the following party trick. Peter The Prover wants to prove the color of a randomly picked card from a standard 52-card deck to Valerie The Verifier. However, Peter would also like to keep the number and suit a secret throughout the course of the demo. In order to accomplish this goal, Peter, whose card happens to be black, draws and reveals all 26 red cards from the rest of the deck. Because there are only 26 red cards in a deck and none of them is the one that was initially pulled out, Peter has successfully proven to Valerie that the color of the card is black. Moreover, he has proven this statement without revealing extraneous information. This is an example of a zero-knowledge proof. There are three criteria for a zero-knowledge protocol:

  1. Completeness: If prover and verifier behave correctly according to the protocol specification, verification will proceed successfully.
  2. Soundness: If the prover’s statement is false and they do not have knowledge of the secret information, there is an extremely small probability that the prover can convince the verifier of the veracity of the statement.
  3. Zero-Knowledge: If the prover’s statement is true, the verifier learns nothing more than the fact that the statement is true.

Zero-Knowledge Protocol (Source: Introduction to Zero Knowledge Proof: The protocol of next generation Blockchain)

If we map the example to the rules, the completeness rule asserts that so long as Peter knows the true color of the card and Valerie understands the underlying constraints and assumptions of the protocol (e.g., there are 52 cards in a deck, half of which are black and the other half of which are red), the party trick will be a success every single time. The soundness criterion prevents Peter, with high probability, from being able to convince Valerie that the color of the card is black if it is actually red. Finally, the zero-knowledge constraint bars Peter from sharing any additional information (i.e., number or suit) with Valerie beyond the color of the card.
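
The card trick can be simulated in a few lines. This is only a toy model of the party trick, with hypothetical function names, and not a cryptographic zero-knowledge protocol; it is meant to show why an honest prover always succeeds and a dishonest one cannot.

```python
SUITS = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]


def color(card):
    return SUITS[card[1]]


def prove_color(secret_card, remaining_deck):
    """Prover: reveal every remaining card of the opposite color."""
    opposite = "red" if color(secret_card) == "black" else "black"
    return [card for card in remaining_deck if color(card) == opposite]


def verify_color(claimed_color, revealed):
    """Verifier: accept only if all 26 cards of the other color are shown."""
    other = "red" if claimed_color == "black" else "black"
    expected = {card for card in DECK if color(card) == other}
    return len(revealed) == 26 and set(revealed) == expected
```

If the hidden card were actually red, only 25 red cards would remain in the deck, so the verifier's check for 26 distinct red cards would fail; and in the honest case the verifier never sees the hidden card's rank or suit.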

Data Minimization

If you think of a data subject as an employer and a Web application as a job applicant, data minimization is the strategy employed by the former when negotiating an offer letter with the latter. Data minimization (sometimes known as selective attribute disclosure) is the act of restricting the disclosure of personal data strictly to that which enables the proper and honest functioning of an interested data consumer.

Take, for example, a driver’s license. We have been conditioned to present this document for many different purposes, including the obvious proof of authority to drive and the ancillary proof of age. However, many fail to realize that this credential document exposes more personal data than is necessary for the interested third party.

In the first case, the most relevant information is the image, expiry date, and revocation information. Meanwhile, in the second case, some will argue that the only relevant information is the birthday, but many experts in the digital identity community will make a stricter argument: the only necessary information for most age data consumers is a proof from a trusted document issuer (in this case, the DMV) that the subject is within a certain age range. This privacy constraint imposes a stricter set of requirements for data sharing that protects the data subject from unnecessary exposure. The reader will soon find that data minimization becomes increasingly relevant as more and more credentials migrate to the Web.
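
A minimal sketch of the stricter argument might look like the following: rather than handing over the license record, the issuer derives a single yes/no claim about age. The record, the field names, and the function names are invented for this example, and a real issuer would also sign the resulting claim so that the consumer can trust it.

```python
from datetime import date

# Hypothetical license record held by the issuer (e.g., the DMV)
LICENSE = {
    "name": "Alice Example",
    "address": "123 Main St",
    "date_of_birth": date(1996, 2, 14),
    "license_number": "X0000000",
}


def age_on(date_of_birth, today):
    """Age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month,
                                                date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def minimal_age_claim(record, minimum_age, today):
    """Disclose a single yes/no answer instead of the full record."""
    return {
        "minimum_age": minimum_age,
        "over_minimum_age": age_on(record["date_of_birth"], today) >= minimum_age,
    }
```

The data consumer learns whether the subject clears the age threshold, but not the birthday, name, address, or license number.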

Credentials on the Web

As more and more activities migrate to the Web, users are finding it necessary, or at least convenient, to migrate their credentials onto the Web. Over time, this paradigm has given rise to a number of technical requirements, including new schemes for identifying and locating resources on the Web; novel techniques for the secure storage, transmission, and verification of personal data on the Web; and various PETs that are suitable for the Web at the application and protocol layers. Additionally, such a behavioral shift has required adoption at a wide enough individual and institutional scale. The following is a set of technologies, protocols, and specifications that support credentials on the Web and an accompanying discussion of the state of adoption where necessary.

WebID

WebID is a protocol for uniquely identifying an agent, such as a person, company, or organization, with a Uniform Resource Identifier (URI). WebIDs are an important primitive in the context of Linked Data (or the Semantic Web), which is a set of principles for representing, publishing, and parsing otherwise informal relationships between resources on the Web in a machine-readable manner for the sake of meaningful data discovery. In this context, entities are typically identified with a WebID, which can sometimes be dereferenced to reveal information that has been published by or about the subject, including name, interests, professional affiliations, and social relationships.

WebID-TLS Protocol (Source: The WebID Protocol & Browsers by W3C WebID Incubator Group)

The WebID-TLS Protocol integrates open Web standards, such as TLS discussed earlier, in order to authenticate individuals accessing these personal, sometimes sensitive, profile documents. The main difference is that instead of the authenticating server relying entirely on a Certificate Authority (CA) to authenticate the connecting client, it fetches the public key directly from the client’s remote profile page and compares it with the one in the certificate presented by the browser. This authentication process is enough for the authenticating server to verify that the remote agent is indeed identified by the connecting WebID. Although WebIDs have been around since the turn of the century, they are still in regular use by the Linked Data community in the form of WebID-OIDC, a version of the WebID protocol that is based on OpenID Connect.

OpenID Connect

It is difficult to discuss Web-based credentials without paying homage to OpenID Connect (OIDC). Based on a standard authorization protocol called Open Authorization 2.0 (OAuth 2.0), it extends this protocol in order to enable Web applications to fetch identity information, retrieve authentication information, and enable SSO. OAuth alone is limited in its authentication functionality, as it simply grants an application access to protected resources on the end user’s behalf by means of an Access Token. With OpenID Connect, an application is additionally granted access to consented identity information in the form of an ID Token, whose contents depend on the scopes requested. This gives end users the ability to selectively disclose elements of their personal data to relying parties using the data minimization principle. OpenID Connect and its predecessor, OpenID, are among the earliest credential management tools on the Web and have been implemented at one point or another by many major organizations, including AOL, Google, eBay, and IBM.
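
To illustrate how scopes support data minimization, the sketch below filters a user record down to the claims covered by the scopes that the user consented to. The scope-to-claim mapping is an abbreviated version of the standard scopes in OpenID Connect Core; the function name and the user record are hypothetical, and a real provider would deliver the result inside a signed token.

```python
# Claims requested by each standard scope (abbreviated from OpenID Connect Core)
SCOPE_CLAIMS = {
    "profile": {"name", "given_name", "family_name", "birthdate"},
    "email": {"email", "email_verified"},
    "phone": {"phone_number", "phone_number_verified"},
    "address": {"address"},
}


def claims_for_scopes(user_record, granted_scopes):
    """Return only the claims that the granted scopes allow."""
    if "openid" not in granted_scopes:
        raise ValueError("an OpenID Connect request must include the 'openid' scope")
    allowed = {"sub"}  # the subject identifier is always included
    for scope in granted_scopes:
        allowed |= SCOPE_CLAIMS.get(scope, set())
    return {key: value for key, value in user_record.items() if key in allowed}
```

A relying party that was granted only `openid` and `email` would therefore see the subject identifier and e-mail address, but not the name or phone number.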

Open Badges

With the advent of Massive Open Online Courses (MOOCs), hosted by platforms such as edX, Udacity, and Coursera, and the proliferation of online education that followed, the digital identity community found it necessary to capture information about accomplishments on the Web. One of the earliest efforts to capture online achievements was the Open Badges specification.

The Open Badges specification standardizes the terms used in portable digital badge files representing online achievement (e.g., recipient, issuer, and description), so that various services consuming the digitally signed data can understand the accomplishments and integrity of the badge. Open Badges have been around for close to a decade and have been implemented by a variety of online education platforms, including Open edX and more recently Blockcerts, led by Principal Architect, Kim Hamilton Duffy.

Verifiable Credentials

Open Badges were a great first step at standardizing online credentials. However, many practitioners in the digital identity space are of the opinion that the specification is a bit complex, requiring implementers to abide by a restrictive set of rules for issuing credentials. With these concerns percolating in the earlier half of the decade, the Verifiable Claims Working Group emerged from the Credentials Community Group (CCG) within the World Wide Web Consortium (W3C) to begin the Verifiable Credentials (VC) specification.

Verifiable Credentials Roles & Workflow (Source: Verifiable Credentials Data Model 1.0)

According to Version 1.0 of the spec, the goal of VC is to provide a mechanism for expressing “credentials on the Web in a way that is cryptographically secure, privacy respecting, and machine-verifiable.” Verifiable Credentials answers a critical question on the Web: how do I reliably verify that an entity made a statement about another entity on the Web? In the specification, there are three entity types: Holders, Issuers, and Verifiers. The Holder owns the credential, the Issuer issues the credential to an authenticated Holder, and the Verifier receives the credential from an authenticated Holder in order to validate its integrity, typically for the sake of deciding whether to deliver a protected service. If we use the example of academic credentials in the form of a diploma and its role in the job market, the following mapping could be made: The Issuer might be MIT, the Holder might be a Course 6–3 (Computer Science) alum, and the Verifier might be Google as a potential employer.
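
To give a feel for the data model, here is a sketch of what a diploma credential might look like, along with a purely structural check. The property names follow the Verifiable Credentials Data Model 1.0, but the identifiers, the `DiplomaCredential` type, and the values are made up, and the check below does not validate any cryptographic proof.

```python
# Hypothetical diploma credential; identifiers and values are illustrative only
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "DiplomaCredential"],
    "issuer": "https://example.edu/issuers/1",
    "issuanceDate": "2019-06-07T00:00:00Z",
    "credentialSubject": {
        "id": "did:example:alum123",
        "degree": "Master of Engineering",
    },
    # A "proof" property carrying the Issuer's digital signature would go here
}

REQUIRED = ("@context", "type", "issuer", "issuanceDate", "credentialSubject")


def has_required_properties(vc):
    """Structural check only; a real Verifier must also validate the proof."""
    return (all(key in vc for key in REQUIRED)
            and "VerifiableCredential" in vc["type"]
            and vc["@context"][0] == "https://www.w3.org/2018/credentials/v1")
```

In the diploma example, the Issuer would produce and sign such a document, the Holder would store it, and the Verifier would check both its structure and its proof before trusting the claims about the subject.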

Verifiable Credentials is becoming a household name in the digital identity community, with tooling and applications provided by organizations like Digital Bazaar and Learning Machine, which recently received a grant from the Department of Homeland Security to align Blockcerts with Verifiable Credentials and Decentralized Identifiers. In fact, in a recent MIT-led initiative, nine leading institutions aim to change the landscape of online achievement by “building an infrastructure for academic credentials that can support education systems of the future”. Publicly available information seems to suggest that the Digital Credentials initiative will observe open Web standards such as Verifiable Credentials. Finally, I have recently taken a personal interest in Verifiable Credentials, implementing a proof of concept for my Master’s thesis using a personal data management tool called Solid.

Decentralized Identifiers

Identifiers are useful for labeling people, places, and things. We have seen examples of identifiers above, such as e-mail addresses and WebIDs. However, such identifiers are owned by third-party identity providers with varying degrees of availability. Additionally, they are reusable and multipurpose, which sound like desirable qualities but in reality may expose the user to undesirable privacy and security concerns, including unintended credential attribute disclosure caused largely by data correlatability across the Web.

Decentralized Identifiers (DIDs) are a new kind of identifier that addresses these issues, among others. Also in development within the W3C CCG, the DID specification provides a standard framework for managing identifiers that are under the control of the subject and are not necessarily dependent on a centralized identity provider. The DID spec defines a common interface for generating and processing these identifiers as well as loading and updating relevant metadata associated with them. Every DID has a standard format:

DID Format (Source: W3C Credentials Community Group DID Primer)

The following are the functions for each component of a DID:

  • Scheme: This component indicates to applications that the identifier represents a DID. For each DID, this is a fixed value of did.
  • Method: This component specifies how the DID is created, updated, and interpreted.
  • Method-Specific String Identifier: This component is a unique string within the namespace spanned by the method.
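The three components can be pulled apart with a few lines of Python. This is a minimal sketch rather than a full implementation of the DID syntax: the names `ParsedDID` and `parse_did` are hypothetical, and the sketch assumes that method names consist of lowercase letters and digits and that anything after the second colon belongs to the method-specific identifier.

```python
import re
from typing import NamedTuple


class ParsedDID(NamedTuple):
    scheme: str
    method: str
    method_specific_id: str


# Assumption: method names are lowercase letters and digits only.
_METHOD_RE = re.compile(r"^[a-z0-9]+$")


def parse_did(did: str) -> ParsedDID:
    """Split a DID string into scheme, method, and method-specific identifier."""
    # Split on the first two colons only; the identifier itself may contain colons.
    parts = did.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected did:<method>:<identifier>, got {did!r}")
    scheme, method, identifier = parts
    if scheme != "did":
        raise ValueError(f"scheme must be 'did', got {scheme!r}")
    if not _METHOD_RE.match(method):
        raise ValueError(f"invalid method name {method!r}")
    if not identifier:
        raise ValueError("method-specific identifier must not be empty")
    return ParsedDID(scheme, method, identifier)


parsed = parse_did("did:example:123456789abcdefghi")
print(parsed.method, parsed.method_specific_id)
```

Parsing only tells an application which method to hand the identifier to; how that identifier is then created, updated, and interpreted is up to the method itself.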

A number of DID methods have emerged of late, including Sovrin’s did:sov, Bitcoin’s did:btcr, and the ledger-agnostic did:peer for peer-to-peer interactions (for a complete list of DID methods registered with the W3C CCG, visit the DID Method Registry). There is still a lot of work underway in the DID space, including efforts to establish DID rubrics that enable common evaluation of DID methods as well as various interoperability efforts.

Self-Sovereign Identity

With the rampant manipulation of personal data by private corporations, government institutions, and persistent hackers, there is a growing desire among folks in the digital identity community and the general public at large to protect and control data access and expression. This principle, which is central to many of the previously discussed topics (see Zero-Knowledge Proof, Data Minimization, and Decentralized Identifiers), is generally known as self-sovereign identity (SSI). In SSI systems, users are at the center of data management, controlling composition of data, access rights of interested parties, and granularity of attribute disclosure for different contexts. Several platforms have flocked to the SSI scene, including, but not limited to, Evernym, uPort, and Solid.

Identity Management Challenges & Opportunities

The digital identity space harbors no shortage of challenges. Fortunately, to the extent that challenge is the mother of innovation, there is an abundance of exciting opportunities waiting to be seized by the vigilant and dedicated ilk. Below, I present some of the outstanding challenges in the digital identity space and the opportunities that they present.

Technical Debt

Digital identity is not perfect and the practitioners in the field are not entirely to blame. The truth is that digital identity came as an afterthought in the Internet developer community. The Internet was not designed with the goals of security and identity at the core. Rather, the chief concern according to Internet pioneer David Clark was to connect heterogeneous machines across a diverse set of subnetworks. At the time, concerns about the people or entities with which one is connecting on the Internet were minimal, if they existed at all, because this predated the emergence of e-commerce Web services that operate on sensitive financial user information. As a result, when it comes to identity management and user authentication, what we have in the Internet is what we have in the Web: a messy quilt of disjoint identity systems that capture, represent, and process user identity for different, often conflicting, use cases.

ARPANET MAP 1973 (Source: DARPA)

In the seminal Laws of Identity, Kim Cameron expresses content amidst the chaos of the digital identity ecosystem, prescribing a set of 7 rules dictating a successful identity metasystem with universal buy-in and staying power. Refer to Cameron’s paper for more details. Though it was written about 15 years ago, it is still a relevant resource for digital identity enthusiasts (plus, it’s an easy read!).

Domain Complexity

Digital identity is a behemoth that spans many domains. In Domains of Identity, Kaliya “Identity Woman” Young posits that there are 16 domains of identity. These include SSI, delegated identity, and various forms of identity within the realms of government, employment, commerce, and civil society. Young argues that because each domain comes with its own set of assumptions, this complexity of scope often complicates the discourse around identity, with rational parties reasoning at different planes of understanding according to their own underlying incentives.

In this article, I focused mainly on Me and My Identity, Commercial Registration, and Commercial Transactions in an attempt to simplify the discussion. However, there are so many other domains that I initially wanted to cover if space, time, and patience permitted, including Government Registration and Government Transactions (perhaps, next time!). Nevertheless, Young recommends establishing a keen mental model of these domains (especially the underdeveloped domains in her estimation: Civil Society Registration, Civil Society Transactions, Civil Society Surveillance, Commercial Registration, Commercial Transactions, and the Black Market), so that clarity and productivity will prevail in digital identity circles.

Interoperability

We have already seen examples of the tendency for conflict and divergence in standards and implementations. As a reminder, there is still a complicated relationship between Open Badges and Verifiable Credentials and there have been numerous efforts and online discussions to iron out the differences. Additionally, there is a desire to achieve multi-DID and multi-wallet interoperability. If these are issues of interest to you, there are various online communities tackling them, including, but not limited to, the W3C Credentials Community Group, the Decentralized Identity Foundation, and the Hyperledger Aries community. Get involved!

Adoption

Oftentimes, behavior and politics are tougher to influence than technology. For issues as diverse as vaccination and climate action, there exist known technical solutions that simply require a large enough scale of adoption to realize meaningful impact. Like many of these causes, the digital identity community has more than its fair share of adoption hurdles. The reality is that people are naturally lazy. We are comfortable enough with the status quo, content with “2FA-ing” into systems that own and monetize our data, albeit at our social and financial expense.

The problem exists across various role scopes too. Prior to adoption, institutions desire strong market signals, such as major philanthropic funding and developer support, while individuals rely on institutional adoption and social influence to change their behavior. Even with reference implementations in place, there is no foolproof approach for adoption, as we see with the Credential Handler API, which still exists only as a polyfill, and the deprecation by major browsers of the keygen HTML tag, which enabled cryptographic keypair generation in the browser. The adoption challenge poses an opportunity for improved educational channels and outreach initiatives for data subjects as well as legal policies and social contracts for the online community.

Conclusion

There is a rich history behind the digital identity ecosystem that was informed along the way by the various needs that emerged with the openness and complexities of the Web. Evidently, the industry is flush with issues ranging from interoperability to adoption. Thankfully, the digital identity community is also flush with brilliant and well-meaning folks with the skillset and vision to tackle these major issues. So long as these forces dominate the pure market forces, I believe that the community will settle on a set of solutions that is valuable enough for most, if not all, who stand to benefit.

