E*TRADE Secure Data Exchange:
Using an SSL-based Web Server and Browser to Securely Exchange Files
By Ross Oliver
Published in the December 1999 issue of ;login:
The e-commerce industry is awash in mergers, strategic alliances, partnerships, and outsourcing. This organizational networking creates an increasing demand for sharing data among organizations. Much of the data is sensitive or confidential. At the same time, traditional secure data paths of direct dial-up links, private networks, and dedicated communications lines are giving way to the Internet as the all-purpose information conduit. Since the Internet is a public data medium, the challenge is how to give members of the organization the capability to move data about quickly and easily, while still protecting data integrity and confidentiality.
This article describes a simple file exchange service called Secure Data Exchange (SDX) that I developed to fulfill this data transport need. My employer is E*TRADE Group (parent company of E*TRADE Securities), so speed and efficiency of online data flow and maintaining security and confidentiality are both critical factors.
For two-way file exchange, an FTP server is probably the most common service used. It was certainly the one requested the most by users. However, I was opposed to FTP because of its weaknesses, the most serious being that it transmits both passwords and file contents in cleartext over the network.
I also considered Secure Shell (ssh), but as with FileDrive, licensing and installation of client software would be required. In addition, ssh key management would require significant time and effort, and the user interface would not be much better than FTP's.
Another reason I wanted to stay away from commercial software packages
was to avoid going into the business of end-user support. This
service had to be as simple and fool-resistant as possible.
The Light Bulb

For about a year, I had been toying with the idea of using an SSL-based web server for secure file exchange. Most Internet users are familiar with downloading files from web sites: click on a link to a document, and the browser retrieves the file, then either displays it, invokes the proper external viewer, or offers to store the file to the local disk drive.
Uploading would be done using an HTML form with an input field of type FILE. Here is an HTML fragment showing how the File type is used:
<form ENCTYPE=multipart/form-data method=post action=receivefile.cgi>
<input type=file name=upfile>
<input type=submit value="Upload">
</form>
To upload a file, the user enters a file name in the text box of a form (newer browsers can also offer a file selection box), and when the form is submitted, the browser sends the contents of the file as part of the form data.
A web-based file exchange service would have several advantages over FTP: all traffic, including passwords, is encrypted by SSL; users already know how to operate a browser; and no additional client software needs to be licensed, installed, or supported.
I spent some time trying to write Perl code from scratch to parse the
multipart-MIME form data, but abandoned that approach when I discovered
the Perl module CGI.pm implemented file reception. In just a few
hours, I had a functioning web page consisting of a single input field and
a "Submit" button, along with a CGI script that would receive
the file and write it to disk. A new service was born.
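CGI.pm does the heavy lifting of parsing the multipart form data and handing the script the file contents. For readers curious what that reception step involves, here is a rough sketch of the same idea in Python (the article's actual scripts are Perl; the function and field handling here are illustrative, not the production code):

```python
import os
from email.parser import BytesParser
from email.policy import default

def save_uploaded_file(body: bytes, content_type: str, dest_dir: str) -> str:
    """Parse a multipart/form-data POST body and write the uploaded
    file into dest_dir, returning the name it was saved under."""
    # The email parser expects a header block, so prepend the Content-Type
    # header (which carries the multipart boundary) to the raw body.
    msg = BytesParser(policy=default).parsebytes(
        b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body)
    for part in msg.iter_parts():
        filename = part.get_filename()
        if filename:
            # Keep only the last path component the browser may have sent.
            filename = filename.replace("\\", "/").split("/")[-1]
            with open(os.path.join(dest_dir, filename), "wb") as f:
                f.write(part.get_payload(decode=True))
            return filename
    raise ValueError("no file field found in form data")
```

The multipart boundary comes from the request's Content-Type header; everything between boundary markers with a filename parameter is the uploaded file.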
User Interface Design

The next step after proof-of-concept was to flesh out the service and design the user interface and associated functions. The user interface is a single page with four stacked sections. The top section contains the E*TRADE logo, the SDX title, and the standard warning: "Unauthorized access is prohibited." The next section contains a directory name or title and a text description of this particular directory. The third section contains the file upload field (with a Browse button, if the browser supports it) and an "Upload" button. The bottom section contains the list of files available for download. The list includes the file names and their modification dates and times. If file deletion is permitted, a "DELETE" link beside each file name allows users to delete individual files.
My personal preference for web-page visual design is rather utilitarian: I like 'em fast and clean, without a lot of eye candy. Nevertheless, a batch of text floating on a sea of white background is a little sterile. To add some visual interest, I added a vertical color bar (adapted from the company intranet) to the left side of the page, and the company logo to the top section.
To keep the service as generic as possible and keep management to a
minimum, no navigation links are provided to move among directories.
Users must either bookmark or enter URLs manually.
Implementation

Since the Netscape Enterprise server was already used to run the main E*TRADE web site, and I had plenty of experience with it in my previous position as a UNIX systems administrator for E*TRADE, I chose this as the web server for SDX as well. The host machine is a Sun Ultra 2 running Solaris 2.6.
Each department or group that uses SDX is given one or more file exchange areas. Each area has a corresponding unique URL and maps to a directory on the host filesystem.
Within each of these directories are the scripts and configuration files that implement the service. To hide the inner workings of SDX, and prevent conflicts between configuration files and data files, the actual data files are stored in a subdirectory immediately below each file area directory.
Two Perl scripts implement the primary SDX functions: index.cgi generates all the HTML to display the page (there are no static HTML files) and receives file uploads. The script delete.cgi handles file deletions.
I named the main script index.cgi so it would be automatically invoked by Netscape when users entered the directory URL. This allows users to treat the URLs as directories or file folders, and prevents direct listings of directory contents.
Three zero-length files may also be present in each directory. They
serve as on-off flags for the upload, download, and delete functions. My
original plan was for these files to contain lists of LDAP groups that are
permitted to perform each operation. This has yet to be implemented.
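The flag check itself is trivial; as a sketch (Python here for illustration, and the flag-file names are assumptions, not the actual SDX file names):

```python
import os

def operation_enabled(area_dir: str, operation: str) -> bool:
    """Return True if the zero-length flag file for an operation
    ("upload", "download", or "delete") exists in the file-area
    directory. Creating or removing the empty file toggles the
    feature, with no configuration syntax to parse."""
    return os.path.isfile(os.path.join(area_dir, operation))
```

The appeal of this scheme is that enabling or disabling a function is a single `touch` or `rm`, and the same files could later hold LDAP group lists without changing the directory layout.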
Access Control

Access to SDX requires individual logins and passwords. To maintain accurate activity logs and accountability, I don't allow shared or generic accounts.
For user management, the Netscape LDAP service included with the Enterprise Server 3.5 is used. Originally, I used the Netscape Administrator interface to create user accounts, but this became too cumbersome when adding groups of more than a few users. So I developed scripts that use the Netscape LDAP interface utility "ldapmodify" to perform batch additions of users. I also created a page and CGI script that uses ldapmodify to give the users the ability to change their passwords.
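The batch-addition scripts work by feeding LDIF to ldapmodify. Here is a hedged Python sketch of the generation side (the DN layout, object class, and attribute names are assumptions for illustration, not the actual E*TRADE directory schema):

```python
def users_to_ldif(users, base_dn="o=Example"):
    """Turn (uid, common-name, password) tuples into LDIF entries
    that can be piped to `ldapmodify -a` for batch user creation."""
    entries = []
    for uid, cn, password in users:
        entries.append("\n".join([
            "dn: uid=%s,ou=People,%s" % (uid, base_dn),
            "objectclass: inetOrgPerson",
            "uid: %s" % uid,
            "cn: %s" % cn,
            "sn: %s" % cn.split()[-1],   # last word of the common name
            "userpassword: %s" % password,
        ]))
    return "\n\n".join(entries) + "\n"
```

Generating LDIF and piping it to the vendor's command-line tool avoids clicking through an administrative GUI once per user.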
Once users and groups are defined, access to the directories is
controlled by the Enterprise Server access control lists (ACLs).
Tools for Automating the File Transfer Process

Once I had the basic system up, the need quickly arose for a way to perform non-interactive file transfers, such as from scripts and cron jobs. For an earlier project involving benchmarking web-site performance, I had written a Perl script that used the SSLeay Perl module to perform HTTP retrievals of web pages. I was able to quickly adapt this script to automated downloading by adding a few lines to write the retrieved file to disk.
The upload script took more time, mainly in working out the nuances of the multipart-MIME format required for the form submission. Once again I found no examples on the web, so I was working from scratch. I also added a flag variable to the upload form so that when the server received a submission from a script, its reply was a small, easily parsed text message rather than a full-blown HTML page.
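Those nuances amount to hand-assembling the body a browser would send. A Python sketch of the idea (the original tools were Perl and C; the field names, boundary string, and the "scripted" flag name are illustrative assumptions):

```python
def build_upload_body(field, filename, data, boundary="sdx-boundary"):
    """Assemble the multipart/form-data body and matching Content-Type
    value a browser would produce for a file-upload form submission.
    The extra "scripted" field mirrors the flag that asks the server
    for a terse reply instead of a full HTML page."""
    crlf = b"\r\n"
    b = boundary.encode("ascii")
    parts = [
        b"--" + b,
        b'Content-Disposition: form-data; name="scripted"',
        b"",
        b"1",
        b"--" + b,
        ('Content-Disposition: form-data; name="%s"; filename="%s"'
         % (field, filename)).encode("ascii"),
        b"Content-Type: application/octet-stream",
        b"",
        data,
        b"--" + b + b"--",   # closing boundary ends the body
        b"",
    ]
    return crlf.join(parts), "multipart/form-data; boundary=" + boundary
```

The returned content-type string goes into the POST request's Content-Type header so the server can find the boundary.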
The Perl scripts worked well in our predominantly Solaris UNIX environment. But many potential users wanted to transfer files to and from Windows NT systems. Obtaining or building Perl, then adding the SSLeay module on Windows would be beyond the capabilities of most of my intended audience, so the Perl scripts would be of no use. I decided to build standalone Windows executable versions.
I had not done any C coding in Windows in quite some time, so as an interim step, I first built C versions of the programs on UNIX, using as a template the demo SSLeay client cli.cpp included in the SSLeay package. These standalone UNIX binaries would later prove useful on UNIX hosts where no Perl or SSLeay module was available.
Once the UNIX versions were working well, I began the port to Windows using Microsoft Visual C++ 6.0. The most time-consuming part of the port was getting the SSLeay environment built under Visual C++. This took several hours of searching the online C++ help files and much trial-and-error. Once this was done, however, porting the actual programs went fairly smoothly. Most changes had to do with differences in the TCP/IP socket library functions.
One final change made later was to restrict the utilities to using only
the DES encryption algorithm, to avoid the need for an RSA license.
Problems

A brief description of some of the problems I encountered:
Windows users do not hesitate to use spaces and all sorts of special characters in their file names. I had to take this into account when translating file names into URLs, encoding non-alphanumeric characters as hexadecimal escapes. Path names also had to be trimmed off, which requires two separate passes: one for UNIX paths, and another for Windows.
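As a sketch of that two-pass translation (Python for illustration; the exact set of characters left unescaped is an assumption):

```python
def url_safe_name(client_name):
    """Strip any path a browser may prepend to an uploaded file's
    name, then hex-encode anything that is not alphanumeric or one
    of '.', '_', '-' so the name can be embedded in a URL."""
    name = client_name.split("\\")[-1]   # pass 1: Windows separators
    name = name.split("/")[-1]           # pass 2: UNIX separators
    out = []
    for ch in name:
        if ch.isalnum() or ch in "._-":
            out.append(ch)
        else:
            out.append("%%%02X" % ord(ch))   # e.g. space -> %20
    return "".join(out)
```

Both passes are needed because the server cannot know in advance which operating system's path convention the browser used.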
The mime.types file included with the UNIX version of Netscape Enterprise server defines the file extensions .bat and .exe as the CGI file type. This meant that whenever someone tried to download a file with either of these extensions, the server tried to execute the file locally (on the UNIX host) rather than sending it as data.
A frequent occurrence is omitting the "s" from "https." This can happen if the user simply forgets the "s" when typing the URL, or relies on the "feature" of most browsers to assume "http" if only the domain name is entered. The usual result is a call to me complaining that the SDX server is down. A future enhancement would be to set up another web-server instance on port 80 that returns a redirect to port 443.
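Such a port-80 helper only needs to hand back a redirect. A minimal sketch of the response it would emit (the host name in the example is hypothetical; this is the proposed enhancement, not deployed code):

```python
def redirect_response(host, path):
    """Build the HTTP response a plain port-80 listener could return
    to bounce a user who typed http:// over to the SSL server, which
    listens on the standard https port 443."""
    location = "https://%s%s" % (host, path)
    return ("HTTP/1.0 302 Moved Temporarily\r\n"
            "Location: %s\r\n"
            "Content-Length: 0\r\n"
            "\r\n" % location)
```

Because 443 is the default https port, the Location header needs no explicit port number.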
I originally had the server on a high-numbered IP port, because port 443 was already in use on the host machine. This caused a problem for some outside users because their sites' firewalls allowed outbound connections only on ports 80 and 443. Using the Solaris virtual interface capability, I added another IP address and moved the SDX service to port 443. Once I had a dedicated IP address, I also assigned a dedicated DNS name.
I severely underestimated the amount of disk space that would be needed. Some users are transferring files several hundred megabytes in size. The original host for the service was a Sun Ultra 2 with two 2-gigabyte disk drives (which were also shared with several other services). I am in the process of moving the service to a dedicated Sun Enterprise 250 server with 100 gigabytes of disk space. So that users would not be surprised by running out of disk space during an upload, I also added a message to the upload portion of the form showing the amount of free disk space.
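Computing the figure for that free-space message is a single statvfs call (shown here as a Python sketch; the actual page is generated by the Perl index.cgi):

```python
import os

def free_space_mb(path):
    """Free disk space at path, in whole megabytes, suitable for
    display on the upload form so users know whether their file
    will fit. f_bavail counts the blocks available to unprivileged
    users, which is the figure that matters for uploads."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize // (1024 * 1024)
```
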
The CGI.pm Perl module uses a temp file to store the uploaded file as it is being read from the posted form. If the filesystem containing the temp file runs out of space, the module silently truncates the file. The default location for these temp files is /var/tmp, which on my original host was a rather small filesystem. To permit larger temp files, I created a temp directory on the main file storage filesystem. The following Perl statement tells the CGI.pm module which directory to use for temp files:
$TempFile::TMPDIRECTORY = '/opt/usr/tmp';
If the upload does not complete (the network connection is broken or user presses the Stop button), CGI.pm leaves the temp file behind. To keep the temp space from filling up because of this, I created a daily cron job to clean up any leftover files.
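The cleanup job just removes sufficiently old files from the temp directory. A Python equivalent of that daily cron task (the 24-hour age threshold is an assumption; the original was presumably a shell or Perl one-liner):

```python
import os
import time

def clean_leftover_tmp(tmp_dir, max_age_hours=24):
    """Remove temp files left behind by aborted uploads, as a daily
    cron job would. Returns the number of files removed; anything
    younger than the threshold is still a possible in-progress
    upload and is left alone."""
    cutoff = time.time() - max_age_hours * 3600
    removed = 0
    for name in os.listdir(tmp_dir):
        path = os.path.join(tmp_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed += 1
    return removed
```
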
I originally had placed the GIF graphics files used on all the pages in a dedicated "images" subdirectory of the document root. However, some versions of the Netscape browser prompted the user with two separate authentication prompts: one for the images directory, and one for the actual directory being accessed. Newer versions don't have this problem, but to solve it for the affected users, I created beneath each document directory an "images" symbolic link to the actual "images" directory.
Future directions

Improved Education
One of the biggest challenges is educating potential users about the availability and use of SDX. It is not enough to simply put some documentation on the Intranet and wait passively for requests to come in. I plan to use email messages, presentations, and maybe even this paper to raise awareness about the virtues of SDX.
Along with education about the generic service, I plan to produce
documentation geared toward application and web site developers about how
to use my techniques in their own web sites.
Improvements in Access Control
Having directory ACLs stored separately from user and group definitions
is less convenient than I would like. Ideally, a single interface
would manage all access control elements. My goal is to give the CGI
scripts direct access to the LDAP database, and store the directory ACLs
as LDAP entries.
One of the weaknesses of server ACLs is that the server doesn't
determine a user's permissions until he actually attempts an operation.
I would prefer to have that information in advance, so when the dynamic
page is generated, forbidden operations are not even displayed.
Delegated User and Access Management
To reduce my administrative workload, I would like to be able to
delegate "sub-administrators" who can add, change, and delete
users in their designated groups. This will become even more critical as
the number of users grows, and perhaps multiple servers at our multiple
data centers are established. I am currently reviewing a product for
this purpose called SiteMinder, by Netegrity.
Integration with Other Access Control Methods
The last thing any of us need is yet another login and password (YALAP?).
Someday I would like to be able to point the server at an authentication
server, and not have to manage user accounts at all.
Measuring Success

SDX has been in active service for over a year. About 10 different groups within E*TRADE use SDX on a regular basis. The most frequent users exchange spreadsheets, Microsoft Word files, and other documents. Other groups have successfully built automated processes to exchange files using the tools I provide. One group has adopted my techniques on their own web site for serving their clients.
Some developers still cling to FTP, especially if they already have
scripts built to use it. But acceptance is growing, especially with
the increasing reliability of the automated transfer scripts.
References

Tools and products mentioned in this article:
CGI.pm Perl module
Secure Shell (ssh)