Tag Archives: copyright infringement

The Software IP Detective’s Handbook

My book on software intellectual property, a labor of love (and hate) for the last two years, has just been published by Prentice-Hall. The book is intended for several different audiences including computer scientists, computer programmers, business managers, lawyers, engineering consultants, expert witnesses, and high-tech entrepreneurs. Some chapters give easy-to-understand explanations of intellectual property concepts including copyrights, patents, and trade secrets. Other chapters are highly mathematical treatments describing quantitative ways of comparing and measuring software and software IP. The first chapter of the book outlines which chapters are most important for the different audiences.

Overall the book covers the following topics:

  • Key concepts of software intellectual property
  • Comparing and correlating source code for signs of theft or infringement
  • Uncovering signs of copying in object code when source code is inaccessible
  • Tracking malware and third-party code in applications
  • Using software clean rooms to avoid IP infringement
  • Understanding IP issues associated with patents, open source, and DMCA

You can purchase your copy from Amazon.com here.

DocMatch detects plagiarism

S.A.F.E. has recently announced the release of DocMatch, a new tool for comparing all kinds of documents to find plagiarism. Our unique, patented technology has proved very useful for finding copied computer code in court. We decided to apply our technology to general documents like articles, papers, and novels. There have been a few cases where we built custom applications to compare written engineering specifications. The results were very useful. In one case, finding copied but modified software specifications gave clues that showed how one company copied another’s software.

DocMatch can be licensed as the full version or the LT version. The full version is the professional tool. It creates a database containing matching elements between two sets of documents. The full version can automatically search the Internet for all references to commonly used words and filter them from the database. Also, sophisticated statistics can be extracted from the database. The full version costs $150 for a one-year license. The LT version produces an easy-to-read HTML report showing words, sentences, and paragraphs that are identical or similar in every pair of documents. The LT version costs $30 for a one-year license. Register to download your copy here.
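For readers curious how this kind of comparison can work in principle, here is a minimal sketch. It is not DocMatch's patented method: the function names, the Jaccard word-overlap score, and the 0.6 threshold are all assumptions chosen for illustration. It splits two documents into sentences, drops a caller-supplied set of commonly used words, and reports sentence pairs whose remaining words overlap heavily.

```python
import re
from itertools import product

def sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(sentence, common_words):
    """Lowercased words of a sentence, minus the commonly used ones."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return {w for w in words if w not in common_words}

def similar_sentences(doc_a, doc_b, common_words, threshold=0.6):
    """Return (sentence_a, sentence_b, score) for pairs at or above threshold.

    The score is the Jaccard overlap of the two sentences' content words.
    This is a hypothetical metric, not the one DocMatch uses.
    """
    matches = []
    for sa, sb in product(sentences(doc_a), sentences(doc_b)):
        wa = content_words(sa, common_words)
        wb = content_words(sb, common_words)
        if not wa or not wb:
            continue  # nothing left to compare after filtering
        score = len(wa & wb) / len(wa | wb)
        if score >= threshold:
            matches.append((sa, sb, score))
    return matches
```

Filtering the common words first matters: without it, two unrelated sentences can look similar simply because both are full of words like "the" and "each".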

ADFSL 2011 Conference on Digital Forensics, Security and Law

Last year my consulting company presented a paper entitled Measuring Whitespace Patterns As An Indication of Plagiarism that examined and tested the concept that patterns of whitespace in two source code files can be used to determine whether one program was copied from the other. The conference was an enjoyable three days in St. Paul, Minnesota. We even got a tour of the Forensic Science Laboratory of the Bureau of Criminal Apprehension where we learned the real forensic science used to catch criminals (the CSI TV shows are a “little bit” exaggerated, but the reality is just as interesting).

This year the conference will be at Longwood University in Richmond, Virginia from May 25 through 27. I’m serving on the conference committee. We’re looking for paper, presentation, and panel submissions in the following areas:

Curriculum

1. Digital Forensics Curriculum
2. Cyber Law Curriculum
3. Information Assurance Curriculum
4. Accounting Digital Forensics Curriculum

Teaching Methods

5. Digital Forensics Teaching Methods
6. Cyber Law Teaching Methods
7. Information Assurance Teaching Methods
8. Accounting Digital Forensics Teaching Methods

Cases

9. Digital Forensics Case Studies
10. Cyber Law Case Studies
11. Information Assurance Case Studies
12. Accounting Digital Forensics Case Studies

Information Technology

13. Digital Forensics And Information Technology
14. Cyber Law And Information Technology
15. Information Assurance And Information Technology
16. Accounting Digital Forensics And Information Technology

Networks And The Internet

17. Digital Forensics And The Internet
18. Cyber Law And The Internet
19. Information Assurance And The Internet
20. Accounting Digital Forensics And The Internet

Anti-Forensics And Counter Anti-Forensics

21. Steganography
22. Stylometrics And Author Attribution
23. Anonymity And Proxies
24. Encryption And Decryption

International Issues

25. International Issues In Digital Forensics
26. International Issues In Cyber Law
27. International Issues In Information Assurance
28. International Issues In Accounting Digital Forensics

Theory

29. Theory Development In Digital Forensics
30. Theory Development In Information Assurance
31. Methodologies For Digital Forensic Research
32. Analysis Techniques For Digital Forensic And Information Assurance Research

Digital Rights Management (DRM)

33. DRM Issues In Digital Forensics
34. DRM Issues In Information Technology
35. DRM Issues In Information Assurance
36. DRM Issues In Cyber Law

Privacy Issues

37. Privacy Issues In Digital Forensics
38. Privacy Issues In Information Assurance
39. Privacy Issues In Cyber Law
40. Privacy Issues In Digital Rights Management

Software Forensics

41. Software Piracy Investigation
42. Software Quality Forensics

Other Topics

43. Cyber Culture And Cyber Terrorism

The deadline for submissions is February 19. The website for the conference is at http://www.digitalforensics-conference.org where you’ll find more information about the conference, the venue, and submission guidelines.

Zynga and CrowdStar, copying or coincidence?

Software Analysis & Forensic Engineering Corporation today released a case study of Online IP Screening between Zynga’s FarmVille game and CrowdStar’s Happy Aquarium game. The study shows some interesting correlation between the source code for the two games. SAFE Corporation is officially announcing its SAFE Online IP Screening service that is targeted at social games and other online applications. The screening service is a subscription service to regularly examine online applications for signs of copying. In this first case study, we already found surprising results. Even after the normal process of eliminating correlation due to third-party code, commonly used identifier names, automatically generated code, common algorithms, and common authors, correlation remained. Was this intentional? Illegal? Acceptable? Coincidence? Decide for yourself: see summaries of this and other case studies here and register to download the full case studies here.
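The filtering idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the screening service's actual process: it looks at identifier names alone, and the function names and the caller-supplied lists of common names and third-party sources are hypothetical. It collects the identifiers two programs share, then removes those explained by commonly used names or by third-party code that both programs include.

```python
import re

# Simplified identifier pattern; real languages have more complex rules.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def identifiers(source):
    """Return the set of identifier-like tokens in a source string."""
    return set(IDENTIFIER.findall(source))

def residual_matches(source_a, source_b, common_names, third_party_sources=()):
    """Identifiers shared by both programs that the filters do not explain.

    common_names: names so widely used that sharing them means nothing.
    third_party_sources: code both programs may legitimately include.
    """
    shared = identifiers(source_a) & identifiers(source_b)
    excluded = set(common_names)
    for source in third_party_sources:
        excluded |= identifiers(source)
    return shared - excluded
```

Whatever survives the filters is not proof of copying. It is the short list that deserves a closer look by a person.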

One unique feature of online applications is that often the full source code is downloaded to the user’s machine. This makes it easier for your competitors to copy your code. It also makes it easier for us to detect that copying. Learn more about SAFE Online IP Screening here or email us for details about how we can protect you from unauthorized copying and dissemination of your code.

SAFE introduces CodeSuite-LT

CodeSuite-LT® is a less expensive, limited version of the full CodeSuite tool. Each tool in the suite produces a readable report that can be used to find copying. CodeSuite-LT includes BitMatch, CodeCross, CodeDiff, CodeMatch, FileCount, and FileIsolate. It also includes the ability to filter results using SourceDetective. CodeSuite-LT does not produce a database and does not allow post-process filtering of results. Instead, it generates an easy-to-read report that can be used to pinpoint copying.

Which is Right For You?

Which product is right for you, CodeSuite or CodeSuite-LT? Click here for a table that compares the features of both programs so you can choose the right solution.

The age of copyright trolls?

Robert Zelnick, an attorney at McDermott Will & Emery, recently wrote an interesting article on Righthaven LLC, a company that buys up copyrights and then licenses them to, or threatens legal action against, organizations and individuals that post them on the web. This article about the new “copyright troll” is interesting and illuminating. There are, however, a few oversimplifications and at least one point overlooked. First, “don’t copy” is just too simple a solution. As an expert witness in copyright litigation, I know that things can look the same without being copied. Also, there are the fair use exceptions that leave lots of wiggle room. So even if someone doesn’t copy at all, there’s a chance of being hit with a lawsuit because two texts are surprisingly similar. And not copying at all means society will lose important works of commentary, satire, and news.

Second, Zelnick doesn’t foresee the possible ultimate business model of Righthaven. While I don’t agree or disagree with Righthaven’s motives, I believe I see where they’re going. Jerome Lemelson was perhaps the first patent troll, but definitely the first to reach $1 billion in personal fortune from his effort. My understanding is that he started by bringing actions against small companies that could not easily defend themselves and Japanese companies that didn’t understand U.S. patent law. These companies saw his royalty fees as small compared to the costs of hiring lawyers to study and defend the patent infringement suits he brought. After amassing a huge war chest, Lemelson went after bigger and bigger companies and sought bigger and bigger payments. The more capital he had, the easier it was to win these battles.

While Righthaven will probably never collect the multimillion-dollar awards that Lemelson did, consider that nearly everyone in the world writes. There are thousands of novelists, thousands of journalists, thousands of researchers, and millions of bloggers. And copyright also applies to artists, filmmakers, and computer programmers. Righthaven, and companies like it, can potentially collect more than Lemelson even hoped for, and at less expense.

I believe that Righthaven and its business model should not be underestimated. The solution to protecting yourself is more complex than simply not copying. The exciting part is that this new business model will create new areas of legal effort and will require the best technology to allow the protection of both copyrights and free speech.

SAFE Corporation announces CodeScreener online software plagiarism detection

CodeScreener: Online Plagiarism Detection for Software

SAFE Corporation has developed an online plagiarism detection service for software. The CodeScreener™ service is built on SAFE Corporation’s court-tested CodeSuite® forensic software and patented source code correlation technology. CodeScreener is designed to streamline the plagiarism detection process, giving you a thorough analysis of each file and a consistent set of correlation metrics. It’s online, it’s interactive, and it’s much less expensive than standalone CodeSuite. Contact our Sales Department to get a free evaluation license.

The DMCA exemptions

The Digital Millennium Copyright Act has been praised by some, vilified by others. Many don’t know that the DMCA specifically allows copying of protected works by researchers, libraries, nonprofits, and academic institutions. Also, the Librarian of Congress is required to issue exemptions from the prohibition against circumvention of access-control technology when such technology prevents people from making non-infringing uses of copyrighted works. The current exemptions, issued just last week, are described below. Note that all of these allowable uses assume that the person copying the work has purchased the work or has otherwise rightfully obtained it.

  1. To copy short portions of movie DVDs for the purpose of criticism or comment, specifically:
    • Educational uses
    • Documentary filmmaking
    • Noncommercial videos
  2. To enable computer programs that allow cell phones to run software applications written for other cell phones (known as “jailbreaking” or “rooting”).
  3. To enable computer programs that allow used cell phones to connect to a phone network as long as it is authorized by the operator of the network.
  4. To run video games on personal computers for the purpose of testing for, investigating, or correcting security flaws or vulnerabilities.
  5. To bypass broken or obsolete dongles that prevent a program from running.
  6. To enable an ebook’s read-aloud function or screen readers that convert the text into a specialized format.

The Report Generator (RPG)

The Report Generator (“RPG”) is a new program from SAFE that automatically generates draft expert reports and declarations for litigation. Reports have several generic sections such as an expert’s experience and descriptions of the technologies involved in the examination, which can be shared amongst reports. By automating the compilation of the generic information into a formatted and structured draft report, the expert can focus on performing the analysis and writing the case-specific arguments.

When using the RPG, an expert selects the type of case, type of report, types of technologies involved, types of tools used, and expert background profiles from a GUI. Then a Microsoft Word draft report is generated that includes all of the selected generic information intermixed with blank sections where case-specific information should be filled in manually.

Currently, many experts either dig through their prior works to find specific descriptions or write them from scratch each time. Maintaining a library of generic report elements is a challenge, especially when multiple experts are involved. RPG acts as a version control system between multiple experts who can upload and download detailed descriptions of experts, technologies, and tools from a central server. The reports are generated according to specific formats, so an entire team of experts can easily produce reports that are consistently formatted with the most up-to-date descriptions.

RPG also keeps synced descriptions of CodeSuite, so it can include the most up-to-date descriptions and pricing of the tools without having to search the S.A.F.E. website or CodeSuite help files.

If you’re interested in trying out RPG, contact our Sales Department.

Can whitespace patterns provide clues to plagiarism?

Over the years I’ve run into expert witnesses and attorneys who have told me about software copyright infringement cases where the only clues that copying occurred were patterns of spaces and tabs (“whitespace”). The idea is that if a truly ambitious thief wanted to cover his tracks, he would modify the stolen code so much that there was no longer a visible trace of copying. However, the clever software sleuth could find patterns of whitespace that the thief had missed; although virtually nothing remained, the invisible tabs and spaces could produce a conviction.

This always sounded intriguing, but I wondered whether anyone had ever tested this theory. We could find no articles or papers on the subject, except for one inconclusive paper, and I dreaded to think that some programmer was convicted based on an untested theory. I decided to have my consulting company, Zeidman Consulting, do some carefully controlled research. If the results turned out well, SAFE Corporation would add whitespace pattern algorithms to CodeSuite to further enhance its ability to detect copying.

Our results were published in a paper entitled Measuring Whitespace Patterns as an Indication of Plagiarism that was recently presented at the ADFSL Conference on Digital Forensics, Security and Law. Our results are summarized in the final paragraph:

This whitespace pattern matching method can be used to focus a search for evidence of similarity or copying, but this method cannot stand by itself.

What we discovered is that even very different files often have similar whitespace patterns. At Zeidman Consulting we’ve used whitespace patterns to confirm copying that was already detected through the use of CodeMatch to find correlated programming elements. In those cases, the whitespace patterns offered further confidence in our findings and in some cases showed which program had been developed first. For a copy of the paper, email us at info@SAFE-corp.biz.

Our next research project is to look at sequences of whitespace within files. Maybe there we’ll find some clues to copying. But for now our results show that whitespace patterns without any other evidence should not be used to determine that copying occurred.
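To make the idea of a whitespace pattern concrete, here is a minimal sketch. It is not the algorithm from the paper or anything in CodeSuite: the per-line signature, the function names, and the use of Python's difflib to score the match are all assumptions made for illustration. Each line of a file is reduced to its runs of spaces and tabs, and the two resulting sequences are compared.

```python
import re
from difflib import SequenceMatcher

def whitespace_pattern(source):
    """Reduce each line to its runs of spaces ('S') and tabs ('T').

    For example, "\tif (x)  {" becomes "T|S|SS". All other characters
    are ignored, so renaming identifiers does not change the pattern.
    """
    pattern = []
    for line in source.splitlines():
        runs = re.findall(r"[ \t]+", line)
        pattern.append("|".join(r.replace(" ", "S").replace("\t", "T") for r in runs))
    return pattern

def whitespace_similarity(source_a, source_b):
    """Score from 0.0 to 1.0 for how closely the line patterns align."""
    matcher = SequenceMatcher(
        None, whitespace_pattern(source_a), whitespace_pattern(source_b), autojunk=False
    )
    return matcher.ratio()
```

A score like this can only point to where to look. Files written under the same style guide or run through the same code formatter will score high with no copying at all, which fits our finding that whitespace patterns cannot stand by themselves.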