Tag Archives: copyright infringement

SAFE Corporation Awarded Seventh Patent for CodeSuite® Software Forensics Tool

SourceDetective searches the Internet to defend against charges of copying

CUPERTINO, CA (June 3, 2015) – Software Analysis & Forensic Engineering Corporation, the leading provider of forensic tools for software copyright and trade secret analysis, recently earned a seventh patent covering its CodeSuite® tool for comparing software code to help detect copyright infringement.

US patent 9,043,375, “Searching the Internet for Common Elements in a Document in Order to Detect Plagiarism,” covers CodeSuite’s innovative SourceDetective® functionality. While other “software plagiarism detection” tools perform a comparison of code and provide an indication of copying, SourceDetective takes the analysis further by searching the Internet to determine whether code in two different programs – including open source code – might be third-party code.

CodeSuite is the only commercially successful tool for comparing computer source code and object code to find infringement. It has been used successfully in more than 70 intellectual property litigations worldwide, and is recognized by the United States Patent and Trademark Office (USPTO) as a unique invention.

“I developed CodeSuite to overcome the inaccuracies common to other tools that can result in false positives and false negatives,” says Bob Zeidman, president of SAFE Corporation and inventor of CodeSuite. “Too much is at stake for the people and companies involved in IP litigation to rely on false results of any kind.”

CodeSuite 4.7 is available now and can be purchased on a term license or project basis. Project pricing is based on the size of code analyzed and the specific function used for the analysis. More information, and free trial licenses, can be requested by contacting sales@SAFE-corp.biz.

SAFE Corporation Awarded Patent Number Six for its CodeSuite Software Forensics Tool

The CodeCross function of CodeSuite compares functional source code to commented-out source code

CUPERTINO, CA (February 9, 2015) – Software Analysis & Forensic Engineering Corporation, the leading provider of forensic tools for software copyright and trade secret analysis, had its sixth patent allowed covering its CodeSuite® tool for comparing software code to help detect copyright infringement.

This latest patent is entitled “Detecting Plagiarism in Computer Source Code” and covers the CodeCross functionality that compares functional code to non-functional code. CodeSuite is the only commercially successful tool for comparing computer source code and object code to find infringement that has been accepted by the courts. It has been used successfully in over 70 intellectual property litigations worldwide. CodeSuite has been recognized by the USPTO as a unique invention. Our customers agree.

“Other programs that compare software don’t provide any understanding about the comparison or the results,” according to Gary Stringham of Gary Stringham & Associates, who has used CodeSuite in his expert witness cases. “Things match or they don’t. Only CodeSuite allows me to delve into the reasons for the matches, search the Internet for comparable third-party code, and then systematically filter out false positives. This means I can focus on possible infringement very quickly. Or, if nothing is left after filtering, I have a very strong argument against infringement.”

“CodeSuite has survived every challenge in court that it’s ever faced,” says Bob Zeidman, president of SAFE Corporation and inventor of CodeSuite. “Judges and juries like the quantitative, objective measurements produced by CodeSuite when they’re produced by a qualified expert trained in the tool. We provide online certification courses that give lawyers confidence that the expert knows how to use the tool and produce rock solid results that will stand up to scrutiny in court.”

CodeSuite 4.7 is available now and can be purchased on a term license or project basis. Project pricing is based on the size of code analyzed and the specific function used for the analysis. Pricing varies from $10 per megabyte for CodeCross® to $400 per megabyte for CodeMatch®. A six-month unlimited use license for CodeSuite is $50,000. A limited feature version of the program, CodeSuite-LT, is available for a six-month unlimited license for $3,000. Free trial licenses can be requested by contacting sales@SAFE-corp.biz.

Was the Microsoft Empire Built on Stolen Goods?

The history of the computer industry is filled with fascinating tales of sudden riches and lost opportunities. Take that of Ronald Wayne, who cofounded Apple Computer with Steve Wozniak and Steve Jobs but sold his shares for just US $2,300. Or that of John Atanasoff, who proudly showed his digital computer design to John Mauchly, who later codesigned the ENIAC, typically recognized as the first electronic computer, without giving credit to Atanasoff. Perhaps the most famous story of missed fame and fortune is that of Gary Kildall. A pioneer in computer operating systems, Kildall started the company Digital Research and wrote Control Program for Microcomputers (CP/M), the operating system used on many of the early hobbyist personal computers, such as the MITS Altair 8800, the IMSAI 8080, and the Osborne 1, before IBM introduced its own PC. Kildall could have been the king of personal computer software, but instead that title went to his small-time rival Bill Gates. For years, rumors have circulated that the code for the original DOS operating system sold by Microsoft was actually copied from the CP/M operating system developed by Digital Research.

A couple of years ago we took it upon ourselves to search out the original code and use CodeSuite to determine the truth once and for all. Our research was summarized in a popular (and not-so-popular) article in IEEE Spectrum entitled Did Bill Gates Steal the Heart of DOS? If you haven’t read it, you should. It’s a fun read, but it only summarizes our exhaustive results using our tools and procedures for finding copied code. The article generated a lot of controversy and we always intended to publish the full technical details of our analysis, but it’s surprising how many people don’t like our conclusion and wouldn’t publish my paper. But now the full academic paper entitled A Code Correlation Comparison of the DOS and CP/M Operating Systems is available online in the Journal of Software Engineering and Applications. If you want to know the truth, it’s in the article; if you want to know the details, they are in the paper.

S.A.F.E. Releases CodeSuite 4.7

Software Analysis and Forensic Engineering has just released a new version of CodeSuite that has some really great new features.


PID spreadsheets

What’s a PID? It’s a partial identifier. Or more specifically, a partially matching identifier. That’s where two identifiers in code almost match. So for example, the identifiers identifier1 and confident_boy share the partial identifier (or “PID”) ident. CodeMatch has always been able to correlate PIDs and use them in calculating the identifier correlation score as a component of the entire correlation score between two source code files. But there can be so many PIDs that users got bleary-eyed trying to view them all and find suspicious ones in a CodeMatch HTML report. So we came up with a solution. You can now export the PIDs from a CodeSuite database into a spreadsheet. You can see not only the PIDs, but the original identifiers that share the PIDs. Now you can sort and select, cut and paste, and generally look for clues to copying in a simple spreadsheet.
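To make the idea of a partial identifier concrete, here is a minimal Python sketch that finds the longest run of characters two identifiers have in common. It is only an illustration: the function name `shared_partial_identifier` and the `min_len` cutoff are assumptions made for this example, and the sketch does not claim to reproduce how CodeMatch actually finds or scores PIDs.

```python
def shared_partial_identifier(a: str, b: str, min_len: int = 3) -> str:
    """Return the longest substring found in both identifiers.

    Hypothetical helper for illustration only. Returns "" when the
    longest shared run is shorter than min_len characters.
    """
    best = ""
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > len(best):
                    best = a[i - curr[j]:i]
        prev = curr
    return best if len(best) >= min_len else ""


print(shared_partial_identifier("identifier1", "confident_boy"))  # ident
```

Running this over every pair of identifiers in two files and writing the results to a CSV file would give a rough, home-made analogue of a PID spreadsheet.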



Part of our process for finding copying has been to first find all the source code files in a directory of files so that you know what to examine. However, there are lots of source code files, and some can be missed. Some programming languages are a bit uncommon and you may not recognize the source code files. Well, we found a solution to that too. The new FileIdentify function of CodeSuite allows you to point at a folder and generate a spreadsheet containing all of the file extensions in that folder and all subfolders. If CodeSuite recognizes the (potential) programming language, it will put that information in the spreadsheet too.
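The kind of survey described above can be approximated in a few lines of Python. This is a rough sketch, not the FileIdentify implementation: the `KNOWN_LANGUAGES` table, the function names, and the CSV column names are all assumptions chosen for illustration, and a real tool would recognize far more languages.

```python
import csv
import os
from collections import Counter

# Hypothetical, deliberately tiny table mapping extensions to likely languages.
KNOWN_LANGUAGES = {
    ".c": "C",
    ".h": "C/C++ header",
    ".java": "Java",
    ".py": "Python",
    ".cob": "COBOL",
}


def survey_extensions(root: str) -> list[tuple[str, int, str]]:
    """Count file extensions under root, including all subfolders."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            counts[ext or "(none)"] += 1
    # One row per extension: extension, number of files, possible language.
    return [(ext, n, KNOWN_LANGUAGES.get(ext, "")) for ext, n in sorted(counts.items())]


def write_report(root: str, out_csv: str) -> None:
    """Write the survey as a CSV file that any spreadsheet program can open."""
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["extension", "files", "possible language"])
        writer.writerows(survey_extensions(root))

# Example (paths are placeholders):
#   write_report("path/to/code", "extensions.csv")
```

Extensions with an empty language column are the ones worth a second look, since they may belong to an uncommon programming language.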



From the beginning of CodeSuite, when there was only CodeMatch, the database has always been a fully documented text file that anyone can view. This allows our customers to make their own tools to extract data and statistics from a CodeSuite comparison, and some customers have created some very interesting utilities. Our database format was simple, but grew more complex over the years. Now we have a function in CodeSuite that converts any CodeSuite database into XML so that you can use off-the-shelf tools to examine it, translate it, or write utilities to extract data and statistics.

Job Opening: Software Forensic Engineer

Zeidman Consulting, a leading research and development company (and sister company to SAFE Corporation), is looking to hire a full-time software forensic engineer. Acting as a high-tech sleuth, this person will analyze and reverse-engineer software using CodeSuite® and other state-of-the-art software tools, helping to resolve lawsuits involving hundreds of millions or billions of dollars. The employee will also work on one of several ongoing cutting edge research projects. These projects often lead to publication in academic journals, presentations at conferences, patents, and new product spinoffs. Past and ongoing projects include:

  • CodeMatch®, a program for comparing and measuring the similarity of different programs.
  • CodeGrid®, a computer grid-enabled version of CodeMatch®.
  • HTML Preprocessor™, a tool for breaking complex HTML pages into components consisting of text, pure HTML, JavaScript, images, etc.
  • RPG, a tool for automatically generating expert reports for copyright, trade secret, and patent litigation.

A successful candidate will need the following attributes:

  • At least a bachelor’s degree in computer science or equivalent. Advanced degree is preferred.
  • Excellent programming skills in one or more programming languages.
  • Ability to work independently on projects that are not well-defined.
  • Excellent verbal and writing skills for creating detailed specifications and reports.
  • Ability to work on multiple projects simultaneously and to switch projects suddenly as the need arises.
  • Enjoys working long hours on interesting projects, including weekends when projects hit critical periods.
  • Enjoys free time when projects are not in critical periods.

Zeidman Consulting pays above average salaries with profit-sharing and provides health insurance and paid time off for holidays, vacation, and illness. To apply, please email a resume to Info@ZeidmanConsulting.com.

Be a Pioneer in the Field of Software Forensics

I hope you’re all aware of my book The Software IP Detective’s Handbook: Measurement, Comparison, and Infringement Detection. It’s the first book on Software Forensics, a field that I pioneered at Software Analysis and Forensic Engineering and Zeidman Consulting. Whereas Digital Forensics deals with bits and files, without any detailed knowledge of the meaning of the data, Software Forensics deals with analysis of software using detailed knowledge of its syntax and functionality to perform analysis to find stolen code and stolen trade secrets. The algorithms described in the book have been used in many court cases. The book also describes algorithms for measuring software evolution, particularly as it relates to IP changes.

If you are a teacher, this is a great time to incorporate the materials in the book into your courses on software development, intellectual property law, business management, and computer science. There’s something for everyone in the various chapters of the book. Your students and you will be at the forefront of an important and very new field of study.

If you’re interested, please contact me.

HTML Preprocessor Released

S.A.F.E. recently released the HTML Preprocessor. The HTML Preprocessor is designed to transform web pages into files that are amenable to analysis by CodeSuite, DocMate, and other source code analysis tools. The HTML Preprocessor examines HTML files and other markup language files and extracts all embedded code into separate files. These files each contain only one kind of code that can be easily analyzed and compared using CodeSuite and DocMate. The kinds of code contained in these generated files are:

  • Scripts such as JavaScript and VBScript
  • Cascading style sheets (CSS)
  • Comment text containing HTML comments
  • Message text containing HTML user messages
  • HTML tags
  • Pure HTML
  • Pseudocode representation of the HTML
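The general idea of splitting a web page into separate streams can be sketched with Python's standard `html.parser` module. This is an illustration only and says nothing about how the HTML Preprocessor itself is built: the class name `EmbeddedCodeSplitter` and the bucket names are assumptions, and the sketch covers only scripts, style sheets, comments, visible text, and tag names.

```python
from html.parser import HTMLParser


class EmbeddedCodeSplitter(HTMLParser):
    """Sort the pieces of an HTML page into separate buckets (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.parts = {"script": [], "css": [], "comment": [], "text": [], "tags": []}
        self._current = None  # "script" or "css" while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.parts["tags"].append(tag)
        if tag == "script":
            self._current = "script"
        elif tag == "style":
            self._current = "css"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._current = None

    def handle_comment(self, data):
        self.parts["comment"].append(data.strip())

    def handle_data(self, data):
        # Text inside <script>/<style> goes to that bucket, everything else is page text.
        if data.strip():
            self.parts[self._current or "text"].append(data.strip())


page = (
    "<html><head><style>p {color: red;}</style></head>"
    "<body><!-- greeting --><p>Hello</p>"
    "<script>alert('hi');</script></body></html>"
)
splitter = EmbeddedCodeSplitter()
splitter.feed(page)
splitter.close()
for kind, pieces in splitter.parts.items():
    print(kind, pieces)
```

Each bucket could then be written to its own file so that a comparison tool sees only one kind of content at a time.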

CodeSuite 4.4 and CodeSuite-LT 1.2 Released

S.A.F.E. recently released version 4.4 of CodeSuite and version 1.1 of CodeSuite-LT. The most important new feature of this version is that these programs now recognize many different text encoding formats including ASCII, UTF-8, UTF-16, and UTF-32. Characters in alphabets other than the Latin alphabet used for English are now supported. For example, code with comments or strings in Japanese, Korean, Chinese, or Russian can be compared correctly.

The most significant change is to BitMatch. When examining binary object code to find text strings, you can now specify the encoding format of the file. If you’re not sure about the encoding, you can choose multiple formats.
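To see why the encoding matters when looking for text in a binary file, consider this small Python sketch. It is not BitMatch's algorithm: the function name `find_strings`, the `min_chars` threshold, and the restriction to printable Latin-range characters in just two encodings are simplifying assumptions made for the example.

```python
import re


def find_strings(blob: bytes, encoding: str = "ascii", min_chars: int = 4) -> list[str]:
    """Find runs of printable characters in binary data for one encoding.

    Illustrative only: handles printable ASCII-range characters stored
    either as single bytes ("ascii") or as UTF-16 little-endian pairs.
    """
    if encoding == "ascii":
        pattern = rb"[\x20-\x7e]{%d,}" % min_chars
    elif encoding == "utf-16-le":
        # Each character is a printable byte followed by a zero byte.
        pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars
    else:
        raise ValueError(f"unsupported encoding in this sketch: {encoding}")
    return [m.group().decode(encoding) for m in re.finditer(pattern, blob)]


blob = b"\x00\x01hello\x02" + "world".encode("utf-16-le") + b"\xff\xfe"
print(find_strings(blob, "ascii"))      # ['hello']
print(find_strings(blob, "utf-16-le"))  # ['world']
```

The same bytes yield different strings depending on the encoding assumed, which is why searching with several encodings is useful when the format of the file is unknown.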

As demand for our products increases outside the United States, we realized a need to support languages in those countries also.

Will Congress Break the Internet? A look at SOPA and PIPA.

There has been a lot of writing, and action, by people for and against the two bills being considered by Congress for protecting intellectual property owners from having their rights infringed online. The PROTECT-IP Act (PIPA) is the version of the bill being considered by the Senate. The Stop Online Piracy Act (SOPA) is its counterpart being considered by the House of Representatives. The law firm of LaRiviere, Grubman & Payne, LLP does a good job of summarizing the two laws here. The two bills are different and, if passed, will have to be rolled into a single bill, but their essence is to enable U.S. law enforcement or a private party to shut down websites that are “dedicated to infringing activities.” Such a website is defined in the bills as one whose primary purpose is infringement. The accuser must show that the website has “no significant use” other than engaging in, facilitating, or enabling any of the following:

  1. Copyright infringement; or
  2. Infringement or violation of any of the protections contained in the DMCA (Digital Millennium Copyright Act) including its anti-circumvention provisions; or
  3. The sale or promotion of counterfeit goods.

The shutdown of the website is effected by disabling DNS translation. When a user types in a URL such as www.ZeidmanConsulting.com, the network devices that implement the Domain Name System (DNS) throughout the Internet, called “DNS servers,” translate the characters into an Internet Protocol (IP) address consisting of numbers such as

Recently the web domain registrar GoDaddy announced that it supported the bills. Shortly thereafter, angry Internet users at blog site reddit called for a boycott of GoDaddy and, not surprisingly, GoDaddy competitors immediately jumped in by offering users discounts to jump ship. To date, over 40 Internet companies have come out against the bills (see here)*. The House issued a paper listing over 140 companies that have come out in favor of the bills (see here). GoDaddy gave in to the pressure and reversed its position on the bills.

Renowned attorney Mark Lemley and colleagues David S. Levine and David G. Post wrote a recent article for the Stanford Law Review entitled Don’t Break the Internet. You can tell from the title where they stand, but I’d like to address each of their main points.

The Bills Will Not Harm Internet Infrastructure

These authors claim that “the bills represent an unprecedented, legally sanctioned assault on the Internet’s critical technical infrastructure.” The authors go on to say that implementing such filtering “threatens the fundamental principle of interconnectivity” and “will also have potentially catastrophic consequences.” I’ll give them the benefit of the doubt that they’re not trying to simply use exaggerated scare tactics, but rather they just don’t understand the technical issues.

Every time you register a new domain, the DNS servers throughout the Internet are updated with the translation. This is part of the normal course of events. Every time a domain name expires, the DNS servers are again updated to remove the translation. According to a report by VeriSign, there were 4.9 million new domain name registrations in the third quarter of 2011. That’s about 37 DNS changes per minute on average, not counting changes due to expired domains. From a technical point of view, the bills do nothing different from what happens many times each day on the Internet, and they pose no technical challenges or risks whatsoever.
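The per-minute figure can be checked with a few lines of Python. The only inputs are the 4.9 million registrations cited above and the length of the third quarter (July through September); the variable names are just for this example.

```python
registrations = 4_900_000          # new registrations in Q3 2011, as cited above
days_in_q3 = 31 + 31 + 30          # July, August, September
minutes_in_q3 = days_in_q3 * 24 * 60

per_minute = registrations / minutes_in_q3
print(f"{per_minute:.1f} registrations per minute")  # 37.0 registrations per minute
```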

The Bills Do Not Violate Basic Principles of Due Process

These authors go on to state that these acts “violate basic principles of due process… by depriving persons of property without a fair hearing and a reasonable opportunity to be heard.” I’ll assume that these attorneys have never watched the TV show Law and Order, or any other cop show, or taken part in a criminal investigation where a court orders a warrant, based on evidence, that otherwise violates a person’s constitutional rights because there is evidence of illegal activity. These bills, as with all similar bills, require a court to make a decision to take action or not. I’ll assume that the authors of the paper have also not spent much time in a courtroom, because as an expert witness I can tell you that no judge takes such a decision lightly and that there are high thresholds of proof. Without this kind of ability to shut down illegal activity, accused criminals would simply avoid showing up for court in order to evade punishment.

The Bills Do Not Violate Free Speech Rights

These authors claim that each bill is an “unconstitutional abridgement of the freedom of speech protected by the First Amendment.” I’ll assume that the law professors are a little rusty on constitutional law, particularly with respect to the First Amendment. Many types of speech are not protected such as hate speech, child pornography, and speech that infringes on copyrights.

The authors go on to claim that “[t]he Constitution requires a court ‘to make a final determination’ that the material in question is unlawful ‘after an adversary hearing before the material is completely removed from circulation.’” In other words, you cannot take down a website until you allow the accused to appear in court to defend himself. This quote is taken from the decision in the case of Center for Democracy & Technology v. Pappert. Again I’ll give the authors the benefit of the doubt that they were just too busy to actually read the court’s decision, but you can do so by clicking on the link. The full decision reads a “publication may not be taken out of circulation completely until there has been a determination of obscenity after an adversary hearing” (emphasis added). This case is about the conflict between free speech rights and an accusation of child pornography, not about free speech rights and copyrights. But a case about free speech and copyrights on the web already has a precedent. Years ago the Digital Millennium Copyright Act (DMCA) was similarly challenged in federal court and survived. The decision in U.S. v. Elcomsoft confirmed that restrictions in the DMCA were not a violation of due process and did not conflict with the First Amendment.

In fact, copyrights have been enforced in this country as long as the Constitution has been around, and longer than the Bill of Rights because their protection is given in Article I, section 8:

Congress shall have power… To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.

The formal codification of copyright law took the form of the Copyright Act of 1790, before the adoption of the Bill of Rights in 1791. So the First Amendment’s protection of speech and the Copyright Act’s provisions for injunctive relief, seizure, and forfeiture coexisted easily for over 200 years without conflict. Terry Hart explains the history of the relationship between copyright and free speech in his extensive article here and in several other well-researched articles on his Copyhype blog.

The Bills Would Not Turn the U.S. Into a Repressive Regime

The authors’ final point is made with this statement:

It would be not just ironic, but tragic, were the United States to join the ranks of… repressive and restrictive regimes, erecting our own “virtual walls” to prevent people from accessing portions of the world’s networks.

Repressive regimes are actually those that do not protect individual property rights, but rather allow the government to determine who owns what, or conversely allow property theft to go unpunished. Repressive regimes do not allow individuals to protect their own property but require the government to do so on their behalf. Repressive regimes do not have the court system and the legal system of the United States that impose strict procedures and requirements to be met. Repressive regimes do not have the checks and balances in their government systems to allow one organization, corporation, government branch, or individual to challenge any law and any action taken by any other organization, corporation, government branch, or individual. Repressive regimes concentrate power in a few elite, not in individuals. There is no realistic concern that this law will turn the U.S. into a repressive regime.

Copyright and Trademark Infringement on the Internet is a Very Real Problem

In their conclusion I find surprising agreement with the authors. They state:

Copyright and trademark infringement on the Internet is a very real problem, and reasonable proposals to augment the ample array of enforcement powers already at the disposal of IP rights holders and law enforcement officials may serve the public interest. But the power to break the Internet shouldn’t be among them.

They are absolutely correct. We must find reasonable ways to stop infringement of intellectual property on the Internet. Such a solution must be fair to the victim of the infringement. It must uphold the principles of the Constitution of the United States. And it must not break the Internet. SOPA and PIPA may not be perfect implementations of such protection, but they meet all of these requirements. There may be better strategies that can be reached through measured and thoughtful debate, but not through excessive hyperbole and fear.

*It doesn’t surprise me to see Scribd on this list. I play a regular game of whack-a-mole trying to remove illegal, free copies of my articles and books on this site that just pop up again within a few weeks after I send them a DMCA takedown notice.

Guidelines for lawyers dealing with experts

Most lawyers know the importance of treating experts with respect. Even if we turn out to be ignorant, arrogant, immature idiots, we hold the keys to presenting the facts and the analysis that will win your client’s case or at least put it in the best light possible given all of the facts. If we’re going to testify, you want us feeling good about it, about the client, about you, and about ourselves. Most attorneys know this but some, in the emotion of the “battle,” forget this. Here’s a checklist to serve as a reminder.

  • Have us give input into schedules. We know best how much work an analysis is going to take. And some of us have lives outside of work (not me, but I’ve heard that others do). Don’t give us a schedule without our input and expect us to meet it.
  • Don’t hire us just to keep us off the other side. I’ve had this happen. It’s flattering, but it’s also unethical. I need to make a living. Also I will never work for you again, and I will warn my colleagues about you.
  • Involve us with crafting the strategy. Don’t let us work in the dark and then complain, for example, that our invalidity argument hurts the non-infringement argument or vice-versa. And by the way, a great argument for one will always make the other much more difficult to show.
  • Involve us with claim construction. We have the appropriate experience to figure out a decent claim construction. Too often I’m called into a case where the claim construction makes little sense to me. I need to be educated about how the claims are construed and then I need to see if I can work with them. Sometimes adding or removing a word from the claim construction would make things significantly easier for me to understand and explain to the judge and jury.
  • Give us enough time to do our jobs. Maybe this is a pipe dream. Lately, cases have been more and more compressed and I’m brought in later, probably to save costs. But it hurts the case and stresses us out.
  • Don’t antagonize us. We’re the guys who are going to help your client by clarifying their position and explaining difficult concepts to the judge and/or jury. You don’t want us ticked off, even if we really are stupid jerks. You want us in a good frame of mind and happy about what we’re doing. At least until we’re done testifying.
  • Explain your positions to us patiently. If you can’t get us to understand it and adopt it, how can you get a judge or jury?
  • Don’t tell us we have to adopt your positions or we’ll lose the case. We’re independent and unbiased. The threat of losing the case is not a reason for us to support your position, and stating this can come back to haunt both of us eventually.
  • If things aren’t going well, meet face-to-face. It’s easier to communicate about difficult subjects. It’s easier to wave hands, draw diagrams, point to things. And both sides are more likely to see each other as humans, not as someone being difficult.
  • Don’t expect us to understand all the legal issues. I’ve met lawyers who didn’t understand all the legal issues. I actually do understand legal issues more than most experts because of my experience and my writing on the topic. Yet there are still gaps. And the lawyers can disagree. I’ve been in many long sessions where lawyers argued about legal issues.
  • Don’t believe you understand all the technical issues. Some of the lawyers I’ve met were once great engineers. Others have no engineering experience whatsoever. Some will take my word completely and others will fight me. I don’t mind reasoned debate—in fact I enjoy it. But remember that my understanding of the technical issues is ultimately what I will present in my reports and my testimony.
  • Be clear in your instructions. We know you’re in a hurry, but this is critical to getting good information. I’ve had cases where I got a quick call to do some analysis and then spent the weekend setting up equipment, getting results, and writing a report, only to find there had been a miscommunication about what was needed. Sure I get paid per hour, but I’d still like to know I’m doing something useful. I’m sure you and your client prefer that too.
  • Have us sit in on depositions. We can add a lot of knowledge and we can help craft the direction of the questioning. I was in one deposition where, searching the Internet, I found an expert’s presentation slides promoting a software method while she was testifying she would never ever use such an “unreliable” method. I’ve also had lawyers call me after a “very successful” deposition where they thought they’d uncovered some really useful facts but were asking questions about the wrong technology.
  • Don’t write the reports and expect us to just sign them. Our reputations and careers are on the line, not yours. Unfortunately, some experts do this and collect their checks. I won’t and neither will any expert worth his or her hourly rate.
  • Expect us to sleep some time. OK, the lawyers themselves get little sleep during a case. Me too. I just prefer that you act as though you care about my getting rest even though we both know I won’t. So don’t tell me to be available at midnight; ask me if I can please make myself available at midnight even though you know it’s a burden. It just sounds nicer.
  • Pay us on time or be honest about any problems. Sometimes clients run into financial trouble. I would rather work for a client who is honest about financial trouble than for one who constantly tells me “the check is in the mail.” Usually this is an issue with the client not the lawyer, but I’ve had lawyers misplace my final invoice, simply because they had moved on to other more pressing matters. My payment is a pressing matter, and a late or missing payment means I’m unlikely to be available the next time you need my expertise.
  • Don’t negotiate our fees after the case is over. This is just poor business practice and makes me not want to work with you again. The time for negotiation is before hiring me, not after I’ve put in time on the case.
  • Remember that our job is to be honest and unbiased. Expect us to point out the bad along with the good. If we find your client’s case doesn’t have merit, at least be happy we discovered that before the other party’s expert informed you at trial. You can settle early or limit the damages or just know that you did the right thing.