Tag Archives: reverse engineering

Job Opening: Software Forensic Engineer

Zeidman Consulting, a leading research and development company (and sister  company to SAFE Corporation), is looking to hire a full-time software forensic engineer. Acting as a high-tech sleuth, this person will analyze and reverse-engineer software using CodeSuite® and other state-of-the-art software tools, helping to resolve lawsuits involving hundreds of millions or billions of dollars. The employee will also work on one of several ongoing cutting edge research projects. These projects often lead to publication in academic journals, presentations at conferences, patents, and new product spinoffs. Past and ongoing projects include:

  • CodeMatch®, a program for comparing and measuring the similarity of different programs.
  • CodeGrid®, a computer grid-enabled version of CodeMatch®.
  • HTML Preprocessor™, a tool for breaking complex HTML pages into components consisting of text, pure HTML, JavaScript, images, etc.
  • RPG, a tool for automatically generating expert reports for copyright, trade secret, and patent litigation.

A successful candidate will need the following attributes:

  • At least a bachelor’s degree in computer science or equivalent. Advanced degree is preferred.
  • Excellent programming skills in one or more programming languages.
  • Ability to work independently on projects that are not well-defined.
  • Excellent verbal and writing skills for creating detailed specifications and reports.
  • Ability to work on multiple projects simultaneously and to switch projects suddenly as the need arises.
  • Enjoys working long hours on interesting projects, including weekends when projects hit critical periods.
  • Enjoys free time when projects are not in critical periods.

Zeidman Consulting pays above average salaries with profit-sharing and provides health insurance and paid time off for holidays, vacation, and illness. To apply, please email a resume to Info@ZeidmanConsulting.com.

Be a Pioneer in the Field of Software Forensics

I hope you’re all aware of my book The Software IP Detective’s Handbook: Measurement, Comparison, and Infringement Detection. It’s the first book on Software Forensics, a field that I pioneered at Software Analysis and Forensic Engineering and Zeidman Consulting. Whereas Digital Forensics deals with bits and files, without any detailed knowledge of the meaning of the data, Software Forensics deals with analysis of software using detailed knowledge of its syntax and functionality to perform analysis to find stolen code and stolen trade secrets. The algorithms described in the book have been used in many court cases. The book also describes algorithms for measuring software evolution, particularly as it relates to IP changes.

If you are a teacher, this is a great time to incorporate the materials in the book into your courses on software development, intellectual property law, business management, and computer science. There’s something for everyone in the various chapters of the book. Your students and you will be at the forefront of an important and very new field of study.

If you’re interested, please contact me.

HTML Preprocessor Released

S.A.F.E. recently released the HTML Preprocessor. The HTML Preprocessor is designed to transform web pages into files that are amenable to analysis by CodeSuite, DocMate, and other source code analysis tools. The HTML Preprocessor examines HTML files and other markup language files and extracts all embedded code into separate files. These files each contain only one kind of code that can be easily analyzed and compared using CodeSuite and DocMate. The code contained in these generated files are:

  • Scripts such as JavaScript and VBScript
  • Cascading style sheets (CSS)
  • Comment text containing HTML comments
  • Message text containing HTML user messages
  • HTML tags
  • Pure HTML
  • Pseudocode representation of the HTML

CodeSuite 4.4 and CodeSuite-LT 1.2 Released

S.A.F.E. recently released version 4.4 of CodeSuite and version 1.1 of CodeSuite-LT. The most important new feature of this version is that these programs now recognizes many different text encoding formats including ASCII, UTF-8, UTF-16, and UTF-32. Characters in alphabets other than the Latin alphabet used for English are now supported. For example, code with comments or strings in Japanese, Korean, Chinese, or Russian can be compared correctly.

The most significant change is to BitMatch. When examining binary object code to find text strings, you can now specify the encoding format of the file. If you’re not sure about the encoding, you can choose multiple formats.

As demand for our products increase outside the United States, we realized a need to support languages in those countries also.

The Software IP Detective’s Handbook

My book on software intellectual property, a labor of love (and hate) for the last two years, has just been published by Prentice-Hall. The book is intended for several different audiences including computer scientists, computer programmers, business managers, lawyers, engineering consultants, expert witnesses, and high-tech entrepreneurs. Some chapters give easy-to-understand explanations of intellectual property concepts including copyrights, patents, and trade secrets. Other chapters are highly mathematical treatments describing quantitative ways of comparing and measuring software and software IP. The first chapter of the book outlines which chapters are most important for the different audiences.

Overall the book covers the following topics:

  • Key concepts of software intellectual property
  • Comparing and correlating source code for signs of theft or infringement
  • Uncovering signs of copying in object code when source code is inaccessible
  • Tracking malware and third-party code in applications
  • Using software clean rooms to avoid IP infringement
  • Understanding IP issues associated with patents, open source, and DMCA

You can purchase your copy from Amazon.com here.

ADFSL 2011 Conference on Digital Forensics, Security and Law

Last year my consulting company presented a paper entitled Measuring Whitespace Patterns As An Indication of Plagiarism that examined and tested the concept that patterns of whitespace in two source code files can be used to determine whether one program was copied from the other. The conference was an enjoyable three days in St. Paul, Minnesota. We even got a tour of the Forensic Science Laboratory of the Bureau of Criminal Apprehension where we learned the real forensic science used to catch criminals (the CSI TV shows are a “little bit” exaggerated, but the reality is just as interesting).

This year the conference will be at Longwood University in Richmond, Virginia from May 25 through 27. I’m serving on the conference committee. We’re looking for paper, presentation, and panel submissions in the following areas:


1. Digital Forensics Curriculum
2. Cyber Law Curriculum
3. Information Assurance Curriculum
4. Accounting Digital Forensics Curriculum

Teaching Methods

5. Digital Forensics Teaching Methods
6. Cyber Law Teaching Methods
7. Information Assurance Teaching Methods
8. Accounting Digital Forensics Teaching Methods


9. Digital Forensics Case Studies
10. Cyber Law Case Studies
11. Information Assurance Case Studies
12. Accounting Digital Forensics Case Studies

Information Technology

13. Digital Forensics And Information Technology
14. Cyber Law And Information Technology
15. Information Assurance And Information Technology
16. Accounting Digital Forensics Information Technology

Networks And The Internet

17. Digital Forensics And The Internet
18. Cyber Law And The Internet
19. Information Assurance And Internet
20. Digital Forensics Accounting And The Internet

Anti-Forensics And Counter Anti-Forensics

21. Steganography
22. Stylometrics And Author Attribution
23. Anonymity And Proxies
24. Encryption And Decryption

International Issues

25. International Issues In Digital Forensics
26. International Issues In Cyber Law
27. International Issues In Information Assurance
28. International Issues In Accounting Digital Forensics


29. Theory Development In Digital Forensics
30. Theory Development In Information Assurance
31. Methodologies For Digital Forensic Research
32. Analysis Techniques For Digital Forensic And Information Assurance Research

Digital Rights Management (DRM)

33. DRM Issues In Digital Forensics
34. DRM Issues In Information Technology
35. DRM Issues In Information Assurance
36. DRM Issues In Cyber Law

Privacy Issues

37. Privacy Issues In Digital Forensics
38. Privacy Issues In Information Assurance
39. Privacy Issues In Cyber Law
40. Privacy Issues In Digital Rights Management

Software Forensics

41. Software Piracy Investigation
42. Software Quality Forensics

Other Topics

43. Cyber Culture And Cyber Terrorism

The deadline for submissions is February 19. The website for the conference is at http://www.digitalforensics-conference.org where you’ll find more information about the conference, the venue, and submission guidelines.

SAFE introduces CodeSuite-LT

CodeSuite-LT® is a less expensive, limited version of the full CodeSuite tool. Each tool in the suite produces a readable report that can be used to find copying. CodeSuite-LT includes BitMatch, CodeCross, CodeDiff, CodeMatch, FileCount, and FileIsolate. It also includes the ability to filter results using SourceDetective. CodeSuite-LT does not produce a database and does not allow post-process filtering of results. Instead, it generates an easy-to-read report that can be used to pinpoint copying.

Which is Right For You?

Which product is right for you, CodeSuite or CodeSuite-LT? Click here for a table that compares the features of both programs so you can choose the right solution.

DUPE: Depository of Universal Plagiarism Examples

In 2003 I created the CodeMatch program that very quickly became a de facto standard in software IP litigation. I created a test bench of purposely plagiarized code that could be used to independently and objectively compare the results produced by different plagiarism detection programs. Some in the academic community claimed that my tests were biased toward the algorithms used by CodeMatch, which explained why CodeMatch fared so well compared to the other programs. However, these same critics, despite my requests, never produced their own set of standard tests.

Although I believe that the standard tests I have used are not biased, it occurred to me that there could be a better way to eliminate even unintentional bias. The solution would be to take the source code for certain open source programs and announce a new open source project that would involve purposely plagiarizing the code. Programmers from around the world would be invited, perhaps in a competition, to change the source code while retaining the functionality. The original programs and the plagiarized versions submitted from others would be stored in a database known as the Depository of Universal Plagiarism Examples or DUPE. Plagiarism detection programs would then be run on DUPE and comparisons of the results could be made to determine which programs best detected copying. Also, important statistics about plagiarized code could be determined, as well as patterns identified in order to improve the plagiarism detection programs.

SAFE Corporation has begun looking into creating this database. However, we realize that we would like to work with partners in academia and industry. We believe that there are several key issues that need to be resolved in creating DUPE. These are:

  1. Choosing appropriate open source projects.
  2. Creating a minimum definition of software plagiarism.
  3. Creating the database.
  4. Determining policies including who can access it, how it will be used, and who will maintain it.
  5. Determining how to run the tests, how to generate the results, and how to distribute the results.

Please contact me if you’re interested in working on this important and groundbreaking project.

Interesting software IP cases of 2009

Here is my list of the most interesting software IP cases of 2009,
in chronological order:

SAFE Corporation is looking for great ideas

There are a lot of unanswered questions about source code, and we want to work with you to figure them out. We realize that currently accepted algorithms for analyzing, comparing, and measuring source code leave a lot to be desired in many cases. Also, there are a lot of techniques that have never been studied on large bodies of modern code. For example, measurement techniques developed in the 1970s were probably tested on assembly languages and older programming languages like BASIC, FORTRAN, and COBOL. Do they still hold on modern object oriented languages like Java and C#?

If you have a research idea relating to code analysis, and you can use the SAFE tools, let us know. Email Larry Melling, VP of Sales and Marketing with your ideas. If they pass our review process you’ll get free licenses to our tools, free support, and help getting your results published. This could be the beginning of a beautiful friendship.