Abandoning a Digital Identity
Abstract: As more and more aspects of life have an Internet representation, digital identities are getting more and more complex and single aspects become interwoven with each other. While abandoning an online identity was a relatively trivial task in the nineties, it has become a rather complicated challenge affecting various aspects of everyday life. Some aspects are still easy to change while others turn out to be hardly feasible.
Most people are tied to numerous digital artifacts which in sum form what will be called a digital identity in the following.
The first part of this paper analyzes what makes up this identity. The second part then describes practical problems of changing the digital identity and provides a protocol for identity change attempts. The third section will explore what limits effective changes of the (digital) identity. Finally, section four provides a brief conclusion.
This paper will not discuss the usage of deceptive services such as TOR since their security implications are -- at the moment -- not completely known.
One method to divide the different aspects of a digital identity utilizes the classic ISO/OSI network model; this section will adhere to this separation.
On the physical layer, a connection to the Internet is realized using techniques based primarily on radio or (copper) cable, usually operated by a commercial ISP. In the case of a cable connection, the ISP must have valid location information to provide its service; it probably also has billing information including a personal identity.
In case of mobile internet access, things stay pretty much the same: The ISP knows who the consumer is (in terms of billing information unless prepaid services are used) as well as his current location. That is, the ISP is able to give a more or less precise estimation of the user's past and present location. Algorithms may be used to expand these by estimating upcoming locations.
On the data-link layer (MAC layer), networking devices have (in general) a unique hardware address which is visible to other communication peers within the local broadcast domain. Visibility of unique identifications is thereby limited to broadcast domains; for Ethernet devices this means limitation to the local network, for wireless devices to their radio range (note that encryption of WLAN traffic such as WPA does not apply to the address information). As long as no countermeasures are taken, this includes devices supplied by ISPs ("routers").
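To make the notion of a unique hardware address more concrete, the following minimal Python sketch inspects a 48-bit address. It relies on the IEEE 802 convention that the first three octets form the vendor prefix (OUI) and that bit 0x02 of the first octet marks a locally administered (that is, software-assigned) address. The function names are chosen for illustration only and the example addresses are made up.

```python
def parse_mac(mac: str) -> bytes:
    """Parse a MAC address such as '00:1a:2b:3c:4d:5e' into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


def is_locally_administered(mac: str) -> bool:
    """True if bit 0x02 of the first octet (locally administered) is set."""
    return bool(parse_mac(mac)[0] & 0x02)


def vendor_prefix(mac: str) -> str:
    """Return the first three octets (the OUI) in lowercase hex notation."""
    return ":".join(f"{octet:02x}" for octet in parse_mac(mac)[:3])
```

An address for which is_locally_administered() returns False carries a vendor-assigned prefix and is the kind of stable identifier discussed above; an address with the bit set has been assigned in software and may be one form of countermeasure.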
Whilst unique identification of hardware on the data-link layer is a less pressing topic regarding Ethernet devices, it is highly relevant concerning the privacy of mobile devices. Wireless LAN hardware addresses are present both at access points and clients (end user devices such as smart phones, notebooks and other entertainment devices).
Addresses of stationary wireless devices are used commercially to provide geolocation services: Some companies (e.g. Google) maintain databases correlating WLAN hardware addresses with their respective geographic coordinates. This service is conceptually based on the observation that access points rarely change their position (they are more likely to disappear as they are replaced when users buy a new device) and that they are rarely shut down. Thus, wireless LAN access points make almost perfect beacons for both in- and outdoor navigation. A side effect regarding privacy is the existence of (automatically updated) databases which can be queried to get a location fix for a given (set of) hardware address(es).
Addresses of mobile WLAN devices are used e.g. to trace consumers within shopping malls, as most mobile devices perform active probing for known networks -- an operation revealing the device's hardware address. It allows not only tracking customers once, but also identifying their habits and visit frequencies. The privacy implications are obvious.
On network layer, IP addresses appear as identifier of a user towards a site or a service. Formally, the ISP owns the IP address (that is, e.g. website visitors can easily be tied to their ISP unless they use services such as TOR). The address may change within the ISP's registered address range, but it will never exceed it. Moreover, IP based geolocation can be used to get a rough location fix for the user associated with a specific IP address.
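The relation between an address and the ISP's registered range can be illustrated with Python's standard ipaddress module. This is only a sketch: the provider name "ExampleNet" and its range are hypothetical (198.51.100.0/24 is a block reserved for documentation), whereas a real lookup would consult registry data.

```python
import ipaddress

# Hypothetical allocation table; a real one would come from registry data.
ISP_RANGES = {
    "ExampleNet": [ipaddress.ip_network("198.51.100.0/24")],
}


def isp_for(address: str):
    """Return the name of the ISP whose range contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, networks in ISP_RANGES.items():
        if any(ip in network for network in networks):
            return name
    return None
```

Even if the customer's address changes from 198.51.100.42 to 198.51.100.200, isp_for() yields the same provider, which is the point made above: the address may move within the range but does not leave it.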
The IP address is visible to every accessed service on the Internet, unless measures of precaution (such as TOR, VPNs) are taken. Some services may be able to tie the address to a natural person (especially if personal information has been provided for other reasons, such as shopping). Of course, an IP address does not necessarily represent one natural person; it can be used by a number of persons sharing an Internet uplink using NAT (typical scenarios cover room mates, coworkers, and the like). Nevertheless, the IP address is an important part of the digital identity due to its visibility.
On transport layer, identification of operating systems based on timings or initial (sequential) values for certain protocol header fields is known to be working, forming the realm for operating system identification using tools such as nmap. Note that this identification is -- in contrast to data-link layer -- not tied to a specific hardware.
Session, presentation, and application layers are summarized as "upper layers" since they are typically handled within the user application. In the following, relevant application categories are explored. Note that a full description of the privacy implications of online and offline applications would exceed the scope of this paper by far.
All kinds of web applications that need a login (Facebook, Google Docs, eBay, ...) build the biggest and most obvious part of the digital identity, including all entered information, even if it is hidden as "deleted". Moreover, there are some applications trying to force the user to reveal his or his contacts' natural identity.
Browsers (Mozilla Firefox, Microsoft Internet Explorer, Apple Safari, ...) can be identified uniquely even without the user's notice; each browser on each device has to be assumed individually identifiable regardless of any offered privacy enhancing program mode.
It also has to be kept in mind that the IP address is visible not only to the provider of a visited web site but also to servers providing embedded contents on a web site (imagery, scripts, ...). Some communication or collaboration platforms make it relatively easy to embed e.g. a picture in a post using an external source which then will be queried by all visitors of the affected page -- the hoster of the linked image will receive an HTTP request including each visitor's source IP address every time e.g. a forum thread is visited. This behavior is a basic feature of the world wide web but its security implications have to be kept in mind when it comes to hiding identity. A popular example is the embedding of plugins for social networks or site statistics (that is, Facebook, Google, ...).
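Which external servers learn a visitor's IP address through embedded contents can be estimated from the page source. The following sketch uses only the Python standard library; it considers img, script, and iframe elements (an assumption made here, other embedding mechanisms such as stylesheets exist) and all class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class EmbeddedHostCollector(HTMLParser):
    """Collect hosts of resources that a browser would fetch automatically."""

    EMBED_TAGS = {"img", "script", "iframe"}  # assumption: not exhaustive

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.EMBED_TAGS:
            return
        src = dict(attrs).get("src")
        if not src:
            return
        # Resolve relative and protocol-relative references against the page.
        host = urlparse(urljoin(self.page_url, src)).hostname
        if host and host != self.page_host:
            self.third_party_hosts.add(host)


def third_party_hosts(page_url, html):
    """Return the sorted external hosts contacted when the page is rendered."""
    collector = EmbeddedHostCollector(page_url)
    collector.feed(html)
    collector.close()
    return sorted(collector.third_party_hosts)
```

Ordinary links (a elements) are deliberately ignored, since they only cause a request once the user follows them.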
Instant messaging applications (Skype, ...) require the user to identify himself with a (mostly) unique combination of access name and password. Use of a pseudonym does not necessarily protect identity, as it does not prevent chat peers from aliasing pseudonymous usernames with real names; the pattern of numerous chat peers aliasing an account with the very same name is easily trackable.
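How easily such an aliasing pattern can be tracked is shown by the following sketch, which assumes a provider can see the aliases that different chat peers have assigned to one account. The function name and the threshold min_share are hypothetical choices for illustration.

```python
from collections import Counter


def likely_real_name(aliases, min_share=0.5):
    """Return the alias used by at least min_share of the peers, or None."""
    normalized = [alias.strip().lower() for alias in aliases if alias.strip()]
    if not normalized:
        return None
    name, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_share:
        return name
    return None
```

If three out of five peers store the account under the same name, the function returns that name; if every peer uses a different alias, it returns None.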
Notably, even if not directly correlated with network access, there are applications which (for some use cases) need information about the user's identity -- in terms of login or real name -- and which embed that information in files (Office Suites, ...).
This list can be extended easily and can be summarized as follows: nowadays, every application has its share in building the user's digital identity.
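As one example of such embedded information, the sketch below reads the author fields from an Office Open XML file (such as .docx), which is a ZIP archive carrying its core properties in docProps/core.xml. That this particular format is used is an assumption of the example; other office suites and formats store comparable data elsewhere. The function name is illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def embedded_author_fields(path_or_file):
    """Return the author-related core properties stored in the document."""
    with zipfile.ZipFile(path_or_file) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for label, query in (("creator", "dc:creator"),
                         ("lastModifiedBy", "cp:lastModifiedBy")):
        node = root.find(query, NS)
        if node is not None and node.text:
            found[label] = node.text
    return found
```

Calling embedded_author_fields() on a document before publishing it shows which names would travel along with the file.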
The features identified in the preceding section give a foundation to discuss means necessary to overcome a digital identity (and bootstrap a new one which can be connected to the former only with high effort). Steps necessary include (but are not limited to) the following: Offline preparation, online preparation, execution, and post-processing.
The first step -- which takes place a long time before any action is taken -- is assessing which aspects of the current digital identity need to be preserved and which do not. This assessment includes which communication channels are essential, which can be throttled for a while and which can be shut down. Results of this assessment should be subject to regular revision.
It also makes sense to prepare a set of screen/login names and passwords since when needed there will probably be no time to make up good ones. Especially creating passwords in a hurry usually leads to weak passwords.
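Preparing passwords ahead of time can be done with a cryptographically secure random source, for example Python's standard secrets module. The following is a minimal sketch; the alphabet, the default length of 20 characters, and the function names are assumptions made for illustration, and the screen names have to be supplied by the user.

```python
import secrets
import string

# Assumption: letters and digits only, to stay compatible with most services.
ALPHABET = string.ascii_letters + string.digits


def make_password(length: int = 20) -> str:
    """Generate a random password from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def make_credentials(screen_names, length: int = 20) -> dict:
    """Pair each prepared screen name with its own freshly generated password."""
    return {name: make_password(length) for name in screen_names}
```

The resulting mapping should of course be stored offline and out of reach of the current digital identity.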
The next step is psychologically challenging: The social environment (which may span a variety of communication channels including face time and telephone calls) has to be evaluated in order to determine which contacts are to be dropped on identity change. Most crucial is the estimation of whether the contact will be able to handle the knowledge of both accounts and the identity change with the necessary responsibility (especially considering that identity changes usually have a story which may also influence rapport) -- that is, kept contacts are not just expected to keep silent about it but also not to reveal it by accident to providers (e.g. by renaming screen names in communication programs which may synchronize such settings to their respective servers). Persons who are likely to disclose the identity change have to be dropped. Whilst this advice reads easily and is completely logical, it is hard to implement. Insecure contacts can be parents, siblings, or spouses. In such cases, a workaround can be a change of communication channels; it almost inevitably will cause further problems since (to a provider) it makes almost no difference whether account A has a contact continuing the old name -- or account B being renamed back to A.
The psychological challenge of this step originates from the harsh assessment of a person's own social environment and the need for the hard decision whom to trust and whom not to trust.
The other aspect that needs to be prepared is network access and the machinery used. A clear cut ideally needs completely new hardware [addresses] (this does not exclude this hardware being second-hand) and a different network identification in terms of public IP addresses. A practical solution might be the use of a wireless dial-up connection (UMTS or alike) operated by a different provider than the one used for the existing uplink, together with a second-hand notebook. It has to be realized that since both providers can trace the location of the connection, they and everyone having access to their databases are probably able to trace both connections to the very same person. To disguise this, the mobile internet access can be used from a different geographic location. As most persons are probably unwilling to stop using the Internet at their homes permanently, it has either to be accepted that the internet service provider will be able to link old and new identity, or an offline identity change has to be planned and prepared, too.
Without preparation, changing the online identity would lead to stopping the use of the old account set and creation of a new set of accounts. Such action would be quite noticeable, especially if the new account would re-join a noticeable share of the previous account's social networks; especially if this exchange happens in a very short amount of time.
Thus, online preparation is necessary.
In order to disguise the connection between both accounts, it is necessary to set up the new identity's accounts using an internet uplink which cannot be traced to the real identity or be connected to the "old" digital identity.
Soon after the offline preparation is complete, the creation of shadow accounts can be started (delayed, not all at once), so they cannot be traced back to the identity change based on a simple look at the "has joined our service at"-label. Ideally, the new account should not stay completely passive until needed. Instead, it should publish some contents from time to time without attracting much attention. Nevertheless, whether it is a good idea to publish contents to communities frequented by both the current and the future account may highly depend on how the social environment will perceive it in light of an identity change.
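The delayed creation of shadow accounts can be planned with a simple randomized schedule. The sketch below is one possible approach under stated assumptions: the gap bounds of 7 to 30 days, the function name, and the service names are hypothetical and not derived from any measurement.

```python
import random
from datetime import date, timedelta


def staggered_schedule(services, start, min_gap_days=7, max_gap_days=30, rng=None):
    """Assign each service a sign-up date, separated by random gaps in days."""
    rng = rng or random.Random()
    order = list(services)
    rng.shuffle(order)  # avoid creating accounts in a predictable order
    schedule = []
    day = start
    for service in order:
        day += timedelta(days=rng.randint(min_gap_days, max_gap_days))
        schedule.append((day, service))
    return schedule


# Usage: plan three hypothetical sign-ups starting from a given date.
plan = staggered_schedule(["forum", "mail", "messenger"], date(2024, 1, 1))
```

Spreading the dates this way keeps the join dates of the shadow accounts from clustering around a single day.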
It may make sense to keep an offline copy of everything from within a specific communication channel and not to make it unavailable (that is, set an unremindable password and destroy any physical reminder).
The more complete the identity change, the more devices need to be changed: If attackers must be supposed to snoop for wireless LAN hardware addresses, all devices with such interfaces have to be exchanged; if attackers are assumed to be just on the Internet, such measure is unnecessary since the hardware addresses are only visible in the local broadcast domain.
Once the time has come to change the digital identity, the change can be realized relatively quickly thanks to thorough preparation.
Since preparation has already taken place and new accounts are available, one might be tempted just to switch activities. But since every too-sudden activity change may ring bells, this impulse should be resisted. Instead, activity should be increased gradually, one social sphere after another.
It has to be stated that the challenging part is not starting to use the shadow account but communicating the change to the social network. There might be communities in which a connection between the old and the new account is irrelevant for taking part (say, an image board). But there are also communities where each account has a personality associated and one may want to transfer rapport earned with the old account to the new one and thus disclose the connection to others. At this point the offline preparation is helpful: It provides a list of whom to disclose the fact to -- and to whom not to disclose it.
Some weeks after the identity change, the pressure on the primary account (which had caused the need for identity change in the first place) may have gone. One may then raise activity back to a normal level.
There are some pieces of information which -- due to their nature -- stick to an individual and cannot be changed. This phenomenon applies mainly to data which identifies natural persons and which may not be changed solely by the affected persons. Such data includes but is not limited to name, birth date, picture, social security number, ...
Since these pieces of information identify a person uniquely while they can be changed legally only with support by proper authorities (especially name and birth date) or almost not at all (considering pictures), it is crucial not to disclose such information on the internet. Once connected to a (former) digital identity, such information should not be connected to a new identity -- in spite of all incentives to hand them over. Since states have their own interests, changing the "natural identity" without support (and thus notice) of proper authorities is mostly illegal.
Especially half-offline, half-online constellations (social network representation of offline social networks) have an unhealthy tendency to create sticky data in terms of photo series showing the members of a network in real-life situations and correlating them with e.g. location information.
Changing the digital identity is possible, although it demands a high amount of preparation and willingness to face probably personally challenging decisions. However, some personal details should never become tied to a digital identity since they complicate the process notably.
Additional obstacles can arise from bad habits in terms of personal information disclosure.
It can be discussed whether preparing online accounts for a smooth switch is a good idea or whether such accounts should be created following the identity change. The decision may depend highly on the surrounding social environment and how it perceives one of its acquaintances having prepared for such an occurrence perhaps years ago. It may raise suspicion, but it also may raise awareness if explained properly. The worst case is people feeling fooled if they felt they had something like a relationship to the prepared account without knowing it was just a shadow.