You have the source code from two different programs. You run them through CodeMatch and find high correlation numbers. Have you proven copying? Not yet. There are still a few steps to go through first. Finding a correlation between the source code files for two different programs doesn’t necessarily mean that illicit behavior occurred. At SAFE we’ve determined that there are exactly six reasons for correlation between two different programs. These reasons can be summarized as follows.
- Third-Party Source Code. Both programs use open source
code or purchased libraries.
- Code Generation Tools. Automatic code generation tools,
such as Microsoft Visual Basic or Adobe Dreamweaver, generate
software source code that looks very similar.
- Common Identifier Names. Certain identifier names are
commonly taught in schools or commonly used by programmers.
- Common Algorithms. There may be an easy or well-understood
way of writing a particular algorithm that most programmers use,
or one that was taught in school or in textbooks.
- Common Author. One programmer, or “author,”
will create two programs that have correlation simply because
that programmer tends to write code in a certain way.
- Copied Code. Code was copied from one program to another.
If the copying was not authorized by the original owner, then
it constitutes plagiarism.
It’s important to understand these reasons when using CodeMatch, especially in litigation. Before there can be proof of copyright infringement, the other five reasons for correlation need to be eliminated. CodeSuite offers sophisticated filtering functions that let you filter out aspects of the code that are correlated due to those five reasons. What’s left, after filtering, is correlation due to copying.
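The elimination step can be pictured as a simple filter over matched code elements. The following is only a minimal sketch of that idea, not CodeSuite's actual filtering functions or API: the `Match` record, the reason labels, and the `filter_matches` helper are all hypothetical names invented for illustration. It assumes the examiner has already assembled, for each non-copying reason, a set of elements attributed to that reason.

```python
from dataclasses import dataclass

# Hypothetical labels for the five non-copying reasons (not CodeSuite names).
NON_COPYING_REASONS = (
    "third_party",
    "code_generation",
    "common_identifier",
    "common_algorithm",
    "common_author",
)


@dataclass(frozen=True)
class Match:
    """One correlated element found in both programs (hypothetical record)."""
    element: str  # the matched identifier, statement, or comment
    file_a: str
    file_b: str


def filter_matches(matches, exclusions):
    """Split matches into those explained by a non-copying reason and the rest.

    `exclusions` maps a reason label to the set of elements the examiner
    has attributed to that reason. Anything not explained is kept as
    remaining correlation.
    """
    explained = {}
    remaining = []
    for m in matches:
        # Take the first reason whose exclusion set contains this element.
        reason = next(
            (r for r in NON_COPYING_REASONS if m.element in exclusions.get(r, set())),
            None,
        )
        if reason is None:
            remaining.append(m)
        else:
            explained.setdefault(reason, []).append(m)
    return explained, remaining


# Illustrative data only; these file names and identifiers are made up.
matches = [
    Match("i", "a/main.c", "b/main.c"),
    Match("png_read_row", "a/img.c", "b/img.c"),
    Match("calc_fee_xq7", "a/bill.c", "b/bill.c"),
]
exclusions = {
    "third_party": {"png_read_row"},
    "common_identifier": {"i"},
}

explained, remaining = filter_matches(matches, exclusions)
print([m.element for m in remaining])  # prints ['calc_fee_xq7']
```

In this toy run, the loop counter `i` and the library function name are set aside, and only the unusual identifier survives as correlation that the other five reasons do not account for.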
You can read more about this in the IP Today article entitled “What, Exactly, Is Software Plagiarism?”