Report of Expert Witness Dan Farmer in ACLU v. Reno II
UNITED STATES DISTRICT COURT
FOR THE EASTERN DISTRICT OF PENNSYLVANIA
AMERICAN CIVIL LIBERTIES UNION, et al.,
v.
JANET RENO, in her official capacity as ATTORNEY GENERAL OF THE UNITED STATES,
|Civ. Act. No. 98-CV-5591|
Expert Report of Dan Farmer
General contents and layout of this report
- Dan Farmer's expert qualifications
- An overview of the WWW
- E-commerce and cryptography
- Managing and running WWW sites
- COPA and its effects
- Non-US issues
I am the head of computer security at EarthLink Networks, a large Internet Service Provider (ISP) with currently close to a million active customers. I have been retained by the plaintiffs in this case to provide expert testimony on the technological burdens of mandatory age verification on Internet content providers and users, and on the Internet as a whole. I am providing my services for free. I have not previously testified as an expert in any case.
In my opinion, the COPA's age verification requirements present two significant technological challenges for Internet content providers: 1) obtaining and verifying information from users before providing access to material covered by the law; 2) protecting the security of the collected information. While each sounds fairly simple, in actuality each requires a series of steps that impose severe burdens on content providers. These burdens would be insurmountable for many content providers. The COPA's requirements also impose severe burdens on Internet users that will either deter, or completely prevent, users from obtaining access to material covered by the law. In addition, the COPA's requirements will negatively impact the performance of the Internet.
As the security architect of EarthLink I am responsible for the design and implementation of our security policies, firewall, the internal security infrastructure, as well as the maintenance and continual monitoring of host and network activity on our large internal network and the customers that we communicate with. Any and all issues (including products, services, and connectivity) that deal with Internet security must be approved by me, including e-commerce and any other aspects involving authentication, validation, or online transactions. In addition, I am currently writing a book on computer security and spend a great deal of my time performing research on forensic computing and large, high-speed networks.
Prior to EarthLink I worked for several years at a pair of Fortune 500 companies - Sun Microsystems, the largest UNIX system manufacturer in the world, and Silicon Graphics, Inc., the graphics workstation company.
At both companies I was in charge of the technical aspects of computer and network security, as well as doing security and network research. I also redesigned and rearchitected Sun's firewall, and consulted internally and for our customers.
In 1990 I started working for CERT, the Computer Emergency Response Team. CERT was a DARPA-funded venture originally created in response to the Internet worm incident in 1988, and was designed to facilitate the flow of security information and to provide emergency assistance to any person or organization on the Internet. At CERT I coordinated support to CERT customers that were experiencing computer security emergencies and was in charge of disseminating vulnerability information to vendors and getting them to announce and fix their security problems.
Before I was at CERT I worked for Purdue University and some small software companies as a programmer and computer consultant. I've been computing professionally for 18 years.
I have done security consulting for banks, computer companies, and Internet organizations for the past half-dozen years, and often speak at computer conferences, academic and research facilities, and government institutions.
Additional Professional Information
Nine years ago I wrote COPS (the Computer Oracle and Password System), the first publicly available Internet security tool. COPS provides a variety of ways to test and report on the security of a UNIX system. Three years ago I co-authored SATAN (the Security Administrator's Tool for Analyzing Networks.) SATAN is a program that analyzes and reports on the security of a network. COPS and SATAN are the most popular and widely used security analysis tools ever written. Several companies took the ideas in these tools and created commercial products. Titan, another security tool I co-authored, was released in December 1998. Titan fixes a variety of security problems and can help administrators create firewalls and implement their organization's technical security policy.
I have published several papers on computer security and networking, most notably on security analysis, software tools, analysis of the Internet, and the current issues with unsolicited commercial email (UCE, or spam.)
Earlier this year I created and hosted the first Security Summit, a weekend retreat where fifty of the top security, cryptography, and network researchers gathered to discuss and attempt to solve some of the more pressing issues with security on the Internet.
I have been on the Internet for 18 years, and administer and control all aspects (WWW, Usenet news, email, etc.) of several domains.
I have been interviewed as an expert on every major television, radio, and print news outlet (NBC, CBS, ABC, CNN, PBS, Time, Newsweek, the Wall Street Journal, NYT, etc.), and was interviewed and profiled by Scientific American as one of the top experts in Internet security.
House Subcommittee Testimony
In February 1997, I was one of a panel of experts who presented a briefing on computer and network security to the U.S. House of Representatives Committee on Science, Subcommittee on Technology, addressing the need to protect the confidential nature of communications and to ensure that proprietary data remains uncompromised.
Publications
- The COPS Security Checker System - June 1990 USENIX Proceedings
- COPS - Fall 1990 Purdue Technical Report
- Improving the Security of Your Site by Breaking Into It - 1993 Internet white paper
- SATAN, an unusual application of Web technology - Nov 1995 NLUUG Proceedings
- Shall we dust Moscow? - 1996 Internet security paper
- From the Trenches: One ISP's Response to the Problem of Spam - April 1998 ;login: journal
- Titan - December 1998 LISA USENIX Proceedings
Summary of qualifications
COPA requires age validation or authorization, as well as the protection of data from unauthorized access - both directly related to computer security. In addition, it impacts many other areas that involve the operation of Internet systems and managing WWW sites. I am qualified to testify on these issues based on my many years of experience running and maintaining Internet systems and my knowledge and expertise on Internet security and protocols.
What is the World Wide Web (WWW)?
It's important to understand the basic technical concepts that allow the WWW to function if you're going to understand the problems that compliance with COPA will introduce. In its most basic form, using the WWW requires a modem, a computer, and some way of accessing the Internet - typically provided by an Internet Service Provider, or ISP. (A user might not see the ISP portion if they are part of an organization, such as a university or company, that provides access for them.) It works by having a user run a program called a browser (such as Netscape's Communicator or Microsoft's Internet Explorer) on their computer (which can be a PC, Mac, UNIX system, whatever) to contact another program, called a server, to request information from a WWW site. The server primarily talks to the browser via the hypertext markup language (HTML), and the browser translates the HTML into something (hopefully) pleasing to the eye on the screen.

Other than the fee that you pay for the telephone line and possibly the ISP, the Internet and the WWW are nearly 100% free of charge. In the last couple of years some commercial ventures have started charging users a fee, but that is still relatively rare. More and more commercial ventures are putting up freely available content or services (such as email, chat rooms, etc.) to attract potential clientele. They try to make money either by advertising or by the sale of products and services listed on their WWW pages.
This last point, the freely available content, is one of the major attractions of the WWW. People feel that they're getting something for nothing (always a popular thing!), and the ease of browsing the WWW is augmented by the way that hypertext links can shunt you from site to site (often without you even being aware of it.) Users also feel empowered by the anonymity the WWW provides - since WWW content providers know almost nothing about the individuals looking at their sites, that anonymity not only facilitates discussions about troublesome, very personal, or problematic topics (such as child abuse and incest support groups), but also allows users to purchase or investigate items and ideas without being hassled by salespeople or other undesirables.
Sites that require personal information for access are not only fairly rare, but rather unpopular - not only are users unwilling to give out information about themselves, but, perhaps more importantly, registration destroys the WWW's incredibly friendly user interface and usability. Since there is nearly always more than one WWW site offering similar content or services, users will simply go to another site that doesn't require registration or isn't as intrusive to get what they want.
On the content provider's side, it has gotten easier and easier to create a WWW site. ISPs often provide a one-stop service, where you can purchase fairly simple software to create content and store it on their site. While the more complex sites require significant resources, the WWW has, in many cases, become something of an equalizer - any individual or organization can talk, advertise, or solicit customers in a very inexpensive manner.
I'll discuss what needs to be done to create a WWW site in a later section, but here are some of the basic building blocks.
The http protocol
To many people, the WWW is the Internet, because they use their browsers for all, or nearly all, their interactions on the Internet. However, the Internet is far more complex than this. The WWW primarily (but by no means exclusively) utilizes the hypertext transfer (http) protocol to give users the ability to see WWW sites, play sounds, etc. A computer protocol is essentially a language or means of communication that computers use among themselves. While it is very difficult to tell exactly how much network traffic is made up of the http protocol, it is probably somewhere between two-thirds and three-quarters or more of the total US network traffic.
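To make this concrete, here is a small illustrative Python sketch of the plain text a browser and server pass back and forth over the http protocol (the host name and path are invented examples, not any real site):

```python
def build_get_request(host, path):
    """Build the plain-text HTTP/1.0 request a browser would send to a server."""
    return ("GET %s HTTP/1.0\r\n"
            "Host: %s\r\n"
            "\r\n") % (path, host)

def parse_status_line(line):
    """Split a server status line like 'HTTP/1.0 200 OK' into its parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

# The browser sends a request...
request = build_get_request("www.example.com", "/index.html")
# ...and the server answers with a status line, then headers, then HTML.
version, code, reason = parse_status_line("HTTP/1.0 200 OK")
```

The point is that the protocol is simply structured text: the browser asks for a document by name, and the server replies with a numeric status and the document itself.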
CGIs and other helper applications
WWW server programs (such as Apache) do little besides handle basic administrative commands and send HTML to the user's browser. And pure HTML is limited to providing fixed, or static, content - it can't provide very flexible or dynamic content. To do this - such as creating a site that changes every time it is entered, or processing information from a user - helper programs (often called "Common Gateway Interface," or "CGI," scripts or programs) are used.
Unfortunately, creating even a simple CGI script involves programming, which most people are woefully untrained and unskilled at. To compound the problem, network programming is among the most difficult of all programming tasks to do correctly, let alone securely. Due to the increasing popularity of the WWW, exploiting poorly written CGIs is fast becoming one of the most popular mechanisms for abusing and breaking into systems. Worst of all, there is no easy way, using current technology, to improve this situation. While off-the-shelf or commercial CGI scripts can be utilized, it is impossible for any programmer to anticipate all the needs of a customer's site, so custom CGIs must constantly be created. And even the commercial or publicly available CGI scripts are often full of security problems. It simply takes skilled and experienced programmers - a resource that is sorely lacking in the world - to fix this situation.
Many ISPs (including America Online) do not provide CGI capability to their content providers.
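As an illustration of the kind of helper program involved, here is a minimal Python sketch of what a CGI-style handler does with a user's input (the parameter names are invented for illustration; note that a real CGI must also carefully validate and escape the user's input, which is precisely where many of the security problems described above arise):

```python
def handle_request(environ):
    """A toy CGI handler: read the query string the server passes in,
    pick out a form field, and emit a dynamically generated page."""
    query = environ.get("QUERY_STRING", "")
    params = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    name = params.get("name", "visitor")
    # WARNING (illustrative): 'name' is used unescaped here; a careless
    # real-world CGI doing this is exactly the kind of hole attackers exploit.
    body = "<html><body>Hello, %s!</body></html>" % name
    return "Content-Type: text/html\r\n\r\n" + body

# The server would set QUERY_STRING from the URL, e.g. /cgi-bin/hello?name=Alice
output = handle_request({"QUERY_STRING": "name=Alice"})
```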
Caching - client, server, and gateway/proxy
Caching is a mechanism that computers use to increase their speed or to otherwise make transactions or processing more efficient, usually by the simple expedient of keeping frequently used data closer to you. Caches work by analyzing past behavior, or anticipating future behavior, of programs or users and then making it easier or faster for the user to access a file or perform an operation. Caching is a fundamental computer construct of great import - without it, computers and networks would run significantly more slowly. Everything uses caching, from CPUs to disk drives to networks.
The most common things that computers on the WWW cache are frequently accessed documents. It is much faster to read a file from your hard disk than to retrieve the same file across a modem. Most browsers open the same home page every time they are started; instead of going across the network and fetching the entire start page - with all its large graphic files - every time, your browser simply asks the WWW site whether the page has changed. If it hasn't, the browser retrieves the home page from your hard disk, saving you the time of going across the network.
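The "has the page changed?" check described above can be sketched from the server's side (modification times are simplified to plain numbers for illustration):

```python
def respond(page_modified_time, if_modified_since):
    """Server-side sketch of a conditional fetch: send the full page only if
    it has changed since the browser's cached copy was made; otherwise send
    a short 'not modified' reply and let the browser use its local copy."""
    if if_modified_since is not None and page_modified_time <= if_modified_since:
        return "HTTP/1.0 304 Not Modified\r\n\r\n"
    return "HTTP/1.0 200 OK\r\n\r\n<html>...the full page...</html>"

# The browser's copy is as fresh as the server's, so no page body is re-sent.
reply = respond(page_modified_time=100, if_modified_since=100)
```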
Many service providers and organizations do similar things on a larger scale. It is usually faster to get something that is physically closer on a network than something that is farther away. So if many people on the West Coast are trying to access the Starr report on President Clinton, which might sit on a busy server on the East Coast, it is advantageous for a California ISP to fetch the report across the network only once and keep a copy in California. From then on, any time a customer asks for the file, the server simply hands them the California copy. This can be a huge time- and resource-saving measure - the remote server is less busy, since it gives out the document fewer times; the Internet carries less traffic, which gives faster access for everyone else; and the user sees the page significantly faster.
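The ISP-level cache described above boils down to a simple idea - fetch once across the network, then serve the local copy many times - which can be sketched as follows (the URL is invented for illustration):

```python
fetch_count = {}  # how many times we actually went across the network, per URL

def fetch_from_origin(url):
    """Stand-in for an expensive cross-country network fetch."""
    fetch_count[url] = fetch_count.get(url, 0) + 1
    return "contents of " + url

cache = {}

def cached_fetch(url):
    """A proxy cache: go across the network only if we don't already
    have a local copy of the document."""
    if url not in cache:
        cache[url] = fetch_from_origin(url)
    return cache[url]

# A thousand customers ask for the same busy document...
for _ in range(1000):
    doc = cached_fetch("http://example.gov/starr-report.html")
# ...but the origin server was contacted exactly once.
```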
The Internet is constantly on, and is running perilously close to its full capacity - all the backbone and service provider networks are stretched to their maximum bandwidth capabilities, and it is only because they are constantly replacing and improving hardware and building more and more pathways between systems for communication that we can keep up at all. Caching is crucial and becoming more and more important to keep the Internet running as it does.
Search engines
Search engines, such as Altavista, Excite, Hotbot, etc., work by constantly running programs that seek out as many WWW sites as they can discover. These programs essentially act like a user, clicking on every link they see and saving the contents of all the pages they find. The pages are then sorted, sifted, and indexed into huge databases, which a WWW user can search by simply typing in various keywords. Search engines are somewhat analogous to the white or yellow pages of a telephone book.
Without search engines the usefulness of the WWW would be greatly diminished. Finding a site - let alone many sites - that has information on the musical group The Beatles, for instance, would be nearly impossible without using them.
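The heart of such a database - an index from words to the pages containing them - can be sketched in a few lines of Python (the page contents and URLs are invented examples):

```python
def build_index(pages):
    """Build an inverted index: each word maps to the set of page URLs
    whose text contains it - the core data structure of a search engine."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

# A tiny stand-in for the millions of pages a real crawler collects.
pages = {
    "http://example.com/beatles": "the beatles were a musical group",
    "http://example.com/stones":  "another musical group entirely",
}
index = build_index(pages)
hits = index.get("beatles", set())  # the pages matching the keyword "beatles"
```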
E-commerce & cryptography
The principles, theory, and practice of e-commerce and cryptography are fundamental to understanding the burdens of compliance with the COPA, so I'm going to cover the basics of this field. It should be noted that, compared to the overall volume of traffic on the Internet, e-commerce is nearly non-existent. There is a lot of money being made and exchanged on the WWW, but for every financial transaction there are millions of others that have nothing to do with it.
For the purposes of this report, I'll define e-commerce as the sale of goods or services on the Internet. It has a few weak points:
- It is rather painful to set up. Not only is it pricey, but it requires a significant amount of legwork - especially compared to the ease of setting up a normal WWW site. (I'll discuss this in more detail later.)
- The actual transmission of the order - e.g. the credit card numbers, goods or services requested, user identification, etc. This is what travels over the Internet, and is what most people focus upon when they think of e-commerce security.
- The two or more endpoints involved in the communication. Your own computer, the computer that you ordered from, and any other computers involved in the transaction (such as a Visa or Mastercard verification computer, etc.), are all important pieces in keeping your transaction and money safe.
The transmission or communication aspect of e-commerce relies heavily on cryptography to try to keep its users safe.
- Credit cards. While almost everyone in the US knows what a credit card is, most people don't know how credit cards operate on the Internet. Most of the time a potential buyer transmits the credit card number either to the server offering goods or services or to an ISP that runs a special server. This computer then talks across the Internet to the bank or card services organization, which either confirms or denies the transaction request by giving the vendor an authentication code. Ideally (in a security sense) this exchange will be both encrypted and authenticated using cryptography, which not only slows down the process but puts additional strain on the computers involved and adds to the cost of the transaction.
There are three main problems with this approach. First, if the credit card and user information is not encrypted, then miscreants can silently steal the information and abuse the credit card. Second, if the credit card is encrypted using a weak encryption method (such as the default on most WWW browsers) it can be stolen using a fast computer or set of computers in a couple of hours. Finally, if an organization keeps credit cards on-line and is broken into, vast numbers of credit cards can be stolen. News articles that detail tens of thousands of credit cards being stolen in this way are not uncommon in the media.
- Digital Certificates. A digital certificate is nothing more than an electronic certificate or token that has been digitally signed by a Certificate Authority (CA), verifying something about the possessor of the certificate. You can think of it as something like a notary public's seal on a document. Unfortunately there are no good standards as to what makes someone a CA (although some companies, such as Verisign, do have fairly well-defined and strict rules that must be complied with before they will allow you to be a CA with their product), nor are there any standards as to what information or proof someone must give to convince the CA that they are telling the truth. In addition, for a digital certificate to have any validity the CA must be able to revoke it to prevent abuse. For instance, if digital certificate technology were used to verify that a user is over 18 years of age, and the user later turned out to be 12 years old, the digital certificate would have to be cancelled, like a bad credit card. In computer terms this is a costly operation, for with each request to verify a digital certificate the server must talk across the Internet to the CA (or a system the CA trusts) and perform a mathematical operation. Any individual validation is not problematic, but as the number of requests grows, the number of systems involved and the amount of computer horsepower required grow as well.
Like encrypted files or communication, a digital certificate can be broken by attackers if it is not strong enough. However, there is no standard that requires any minimum strength.
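The issue, verify, and revoke cycle described above can be sketched as follows. (This is a deliberately simplified stand-in: real digital certificates are signed with public-key cryptography, not the keyed hash used here, and the names and claims are invented - but the life cycle it shows is the same.)

```python
import hashlib
import hmac

CA_SECRET = b"ca-private-key"   # stand-in for the CA's private signing key
revoked = set()                  # the CA's revocation list

def issue_certificate(claim):
    """The CA signs a claim and hands back a certificate (claim, signature)."""
    signature = hmac.new(CA_SECRET, claim.encode(), hashlib.sha256).hexdigest()
    return (claim, signature)

def verify_certificate(cert):
    """A server asks two questions: is the signature genuine, and has the
    certificate been revoked? (A real server must contact the CA for this.)"""
    claim, signature = cert
    expected = hmac.new(CA_SECRET, claim.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected) and cert not in revoked

cert = issue_certificate("holder is over 18")
ok_before = verify_certificate(cert)   # certificate is valid
revoked.add(cert)                      # the holder turns out to be 12 years old
ok_after = verify_certificate(cert)    # certificate no longer passes
```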
Generally speaking, cryptography is the science of using mathematical procedures to protect data (via encryption), verify identity (via authentication), and verify data (via digital signatures.) Every time you log onto a system and it asks for your user name and password to verify your identity, each time you make a secure transaction with your WWW browser, any time you use an ATM to get money, you're using cryptography.
No cryptographic method that is generally used in e-commerce or on the net is perfect (and the only perfect method, the one-time pad, is far too cumbersome to utilize). The effectiveness of various cryptographic methods can be roughly measured by how long it takes to break them. While some methods are thought to be essentially unbreakable in a practical sense, cryptography can also be used to hide information that could be deemed harmful to national security, so the US government has placed strong restrictions on the strength of cryptographic methods and protocols that may be used to communicate with the outside world via the Internet. This means that cryptography gives only a modest amount of protection in most cases.
Furthermore, due to its complexity, almost no useful cryptographic method has ever been proven to be truly safe. We put faith in such methods because of their complexity and because the really good ones haven't been broken yet - it is a cautious science. In addition, cryptography is slow, often difficult to work with, and usually requires additional effort from users. However, it is literally the only reasonably secure method we have to verify identity and hide information.
- SSL (Secure Sockets Layer.) SSL is the most popular way for browsers and WWW servers to communicate with each other in a more secure fashion (such as when transmitting passwords, credit cards, etc.) The WWW browser and server engage in a short cryptographic exchange of information to set up the communication, and from then on everything between the two is encrypted. One form of this encryption is very weak and can be easily broken by an attacker (this is the default in many cases), while the other is a reasonably strong form that would be very difficult to compromise (this version cannot be exported out of the US, however.) Although SSL can be an effective way to communicate information in a fairly secure manner, it is not widely used, simply because of the cost (in dollars and in time) to set up a secure server.
- Passwords. Most security systems use encryption to store passwords in a database in an area that is hopefully unreadable by unauthorized personnel. When a user types in his or her password, the system applies a (hopefully irreversible) mathematical function to it and compares the result against the stored value, allowing or denying access depending on whether they match. If a password file is stolen, even though it is encrypted, most of the passwords on the system are in great danger of being guessed or cracked by an attacker.
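The store-and-compare scheme just described can be sketched as follows (this uses a simple salted one-way hash purely for illustration; real systems differ in the exact function used, and the passwords here are invented):

```python
import hashlib
import os

def hash_password(password, salt=None):
    """Store only a salted one-way hash of the password - never the
    password itself - so the stored value cannot simply be read back."""
    if salt is None:
        salt = os.urandom(8).hex()   # random salt, so equal passwords differ
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    return salt, digest

def check_password(attempt, salt, stored_digest):
    """Re-hash the attempt the same way and compare; the original password
    is never recovered from the database."""
    return hashlib.sha256((salt + attempt).encode()).hexdigest() == stored_digest

salt, digest = hash_password("correct horse")
good = check_password("correct horse", salt, digest)
bad = check_password("wrong guess", salt, digest)
```

Note that this also shows why a stolen password file is still dangerous: an attacker who has the salts and digests can guess candidate passwords offline, hashing each guess until one matches.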
- Basic Authentication. Basic authentication is one of the simplest ways that the WWW allows authentication of a login and password. Typically, when a browser first attempts to access a WWW page that is password protected, the WWW server sends back a request asking for the user's user name and password. The user types them in, and when the browser sends them they are either encrypted via SSL or not, depending on whether the server is equipped to handle this (currently the vast majority of servers do not have this capability, and unfortunately almost no browser tells you whether the transmission is encrypted). If the user has a valid password, they are granted access to the page and will not be asked for another password as long as the password is valid on that site, or until some timeout value (set by the WWW server administrator) has expired.
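What actually travels over the wire in basic authentication can be sketched as follows; the key point is that, without SSL, the password is merely encoded, not encrypted, so anyone listening on the network can trivially recover it (the user name and password are invented):

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header a browser sends for basic
    authentication: 'user:password' run through base64 encoding."""
    token = base64.b64encode(("%s:%s" % (username, password)).encode()).decode()
    return "Authorization: Basic " + token

def decode_basic_auth(header):
    """What the server - or an eavesdropper - does to recover the
    credentials; base64 is an encoding, not encryption."""
    token = header.split(" ")[-1]
    username, password = base64.b64decode(token).decode().split(":", 1)
    return username, password

header = basic_auth_header("alice", "s3cret")
recovered = decode_basic_auth(header)   # the password comes right back out
```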
It should be noted that the WWW server has no idea of who is actually doing the surfing. If an adult is verified, leaves the computer, and a child sits down, then the server will merrily continue serving up potentially problematic files.
- Authentication vs. authorization. Authentication is a method used to verify identity, and is based on either what you know (like a password or passphrase ("the quick brown fox jumped over the?")), what you possess (a driver's license, hotel card key, etc.), or who you are (fingerprints, retinal scans, etc.) Authorization is furnishing grounds to an agent, empowering them to do something, such as writing a note to the principal to keep your child home when they are ill, or when you tell a mechanic to go ahead and fix your car.
Since COPA requires content providers to protect their information, I've included a short section on Internet security - how safe it is as well as some of the issues involved with keeping information from unauthorized access.
The Internet is not a very safe place for money or data to be. The sheer number of systems out there, compared to the number of electronic attackers, has been the primary reason that more systems have not suffered significant loss and/or damage.
In a survey of a couple of thousand key sites that I performed and published (Security Survey of Key Internet Hosts & Various Semi-Relevant Reflections) about two years ago, I found that not only do approximately two thirds of all major sites have significant security problems, but, perhaps counter-intuitively, that sites that are less important are nearly twice as secure:
Type of site | Total % Vulnerable
US federal sites