Find_SSNs - Cornell enhanced version

Find_SSNs is a program, developed at Virginia Tech, to find credit card and Social Security numbers on a computer. It is written in Python and runs on Mac and Linux; it will also run on Windows, but I have not tested any of my changes on that platform yet.

This version has been changed and enhanced significantly from the VA Tech original. There are two main programs: Find_SSNs.pyw, which does the scanning, and resview.py, which views and acts on the results of the scan. See the file README.usage for more information.

This is development-quality software (though it does not crash nearly as often as Identity Finder on the Mac). I decided to get it into people's hands quickly, to meet the need for a sensitive data tool on PPC Macs and Linux boxes. It does run on Windows, but since we have Identity Finder for Windows, I have not put any effort into that platform. I may do so in the future if there is interest. I will continue to improve the software as I find time, and, of course, would be glad to have help. I would rather see Cornell spend money to keep people employed working on Open Source software than on purchasing software from outside.

The program is packaged as a ZIP file; simply download it, extract it, and follow the instructions in the file README.usage. Although the programs have graphical interfaces, you will need to start them from a terminal window. (For the Mac, clickable scripts are provided.)

As of version 0.9, the program only shows unformatted numbers (e.g. 123456789) if a keyword such as "SSN" or "Social Security" occurs in the file first. As of Version 0.10 there is a "friendly" GUI mode, invoked with the -f option, that requires fewer choices by the user. This mode is also available via clickable command scripts on MacOS (see the macos subdirectory). Version 0.12 adds decoding of .xls files, if the Python library xlrd is installed. (It is included in the Mac packages.)
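To make the keyword rule concrete, here is a minimal sketch of how such a check could work. It is not the code from Find_SSNs.pyw: the patterns, the function name find_candidates, and the reading of "first" as "earlier in the file" are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration; the real program's rules may differ.
FORMATTED_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
UNFORMATTED_SSN = re.compile(r"\b\d{9}\b")
KEYWORD = re.compile(r"SSN|Social\s+Security", re.IGNORECASE)


def find_candidates(text):
    """Return candidate SSN strings found in text, ordered by position."""
    # Formatted numbers (123-45-6789) are always reported.
    hits = [(m.start(), m.group()) for m in FORMATTED_SSN.finditer(text)]

    kw = KEYWORD.search(text)
    if kw:
        # Bare nine-digit runs count only if they follow the first keyword.
        for m in UNFORMATTED_SSN.finditer(text, kw.end()):
            hits.append((m.start(), m.group()))

    hits.sort()
    return [value for _, value in hits]
```

With this sketch, find_candidates("id 123456789") reports nothing, while find_candidates("SSN: 123456789") reports the number. The point of gating on a keyword is to cut down on false positives from arbitrary nine-digit strings such as part numbers or timestamps.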

Version 0.14 increases the largest scannable file size to 1 GB, adds checking for bank account numbers, and fixes a bug in the handling of mbox files; we recommend re-scanning the computers of any users who have a lot of data in this format (e.g. Eudora or Thunderbird users). Mac Mail does not use mbox files, and so was not affected.

find_ssns_cornell_0.14.zip

An installer package for Intel Macs running 10.5 or 10.6, including all the additional packages (pdftotext, ghostscript, antiword, xlrd):

find_ssns_cornell_0.14_mac.zip

Intel Macs running 10.4 can use the above package, but Python/wxPython 2.6 will also need to be installed.

An installer for PPC Macs, including the additional packages and Python/wxPython 2.6:

find_ssns_cornell_0.14_macppc.zip

The redactor component, which edits numbers out of files in place, is available as a separate script. It takes its input from a CSV file generated by Identity Finder.

redactatron_0.6.zip

This software runs on Python version 2, release 2.4 or newer. It has not been tested on Python version 3. The GUI mode of the scanner, the results viewer, and the redactor require that wxPython be installed. Python and wxPython are standard on MacOS Leopard and above, and are available on most Linux distributions. (On Red Hat and derivatives, install wxPython. On Ubuntu, it's python-wxtools.) MacOS Tiger and earlier come with a Python that is too old; you can install Python 2.6 and wxPython from the links below. On 64-bit versions of Snow Leopard, you need to use a 32-bit Python. You can do this by setting an environment variable:

export VERSIONER_PYTHON_PREFER_32_BIT=yes

Alternatively, use the python2.5 command; that version is always 32-bit.
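If you are not sure which Python you are getting, a short script along the following lines can report it. This is only a sketch and is not shipped with Find_SSNs; the function name describe_python is made up for this example. It uses only the standard struct and sys modules, and it tries to import wx, which is the module name wxPython installs under.

```python
import struct
import sys


def describe_python():
    """Summarize the running interpreter: version, pointer width, wxPython."""
    info = {
        "version": sys.version_info[:3],
        # Size of a C pointer in bits: 32 on a 32-bit build, 64 on a 64-bit one.
        "bits": struct.calcsize("P") * 8,
    }
    try:
        import wx  # wxPython's top-level module
        info["wxpython"] = True
    except ImportError:
        info["wxpython"] = False
    return info


if __name__ == "__main__":
    print(describe_python())
```

Run it with the same python command you plan to use for the scanner. On 64-bit Snow Leopard, "bits" should read 32 once the environment variable above is set.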

Find_SSNs also expects several additional programs to be installed. They are available as part of most Linux distributions; for convenience, I have included downloads of these programs for MacOS on this site, and they are in the Mac packages above:

Python 2.6 and wxPython for MacOS 10.4 and earlier:

Questions, comments, offers of help? Contact Steve Gaarder,