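Software Analysis and Forensic Engineering has just released a new version of CodeSuite with three new features: exporting PIDs to a spreadsheet, a FileIdentify function for finding source code files, and conversion of CodeSuite databases to XML.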
What’s a PID? It’s a partial identifier, or more specifically, a partially matching identifier: two identifiers in code that almost match. For example, the identifiers identifier1 and confident_boy share the partial identifier (or “PID”) ident. CodeMatch has always been able to correlate PIDs and use them in calculating the identifier correlation score, which is one component of the overall correlation score between two source code files. But there can be so many PIDs that users got bleary-eyed trying to view them all and find the suspicious ones in a CodeMatch HTML report. So we came up with a solution. You can now export the PIDs from a CodeSuite database into a spreadsheet. You can see not only the PIDs, but also the original identifiers that share them. Now you can sort and select, cut and paste, and generally look for clues to copying in a simple spreadsheet.
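To make the idea of a partially matching identifier concrete, here is a minimal Python sketch. It is not CodeMatch's actual algorithm, which is not described here; it simply assumes that a PID can be approximated by the longest substring two identifiers have in common, ignoring case. The function names and the minimum length of 4 are hypothetical choices for illustration.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest substring shared by a and b, compared case-insensitively."""
    a_low, b_low = a.lower(), b.lower()
    best_len, best_end = 0, 0
    # prev[j] holds the length of the common suffix of a_low[:i-1] and b_low[:j].
    prev = [0] * (len(b_low) + 1)
    for i in range(1, len(a_low) + 1):
        curr = [0] * (len(b_low) + 1)
        for j in range(1, len(b_low) + 1):
            if a_low[i - 1] == b_low[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a_low[best_end - best_len:best_end]


def shared_pids(ids_a, ids_b, min_len=4):
    """Map each partial identifier to the identifier pairs that share it.

    min_len is an assumed threshold that filters out trivial overlaps;
    exact matches are skipped because they are not *partial* matches.
    """
    pids = {}
    for id_a in ids_a:
        for id_b in ids_b:
            if id_a.lower() == id_b.lower():
                continue
            pid = longest_common_substring(id_a, id_b)
            if len(pid) >= min_len:
                pids.setdefault(pid, []).append((id_a, id_b))
    return pids


if __name__ == "__main__":
    print(shared_pids(["identifier1", "count"], ["confident_boy", "count"]))
```

With the example identifiers from the text, the only pair that survives the length threshold is identifier1 and confident_boy, under the key ident. Each key with its list of pairs corresponds roughly to one spreadsheet row that could be sorted and filtered.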
Part of our process for finding copying has been to first find all the source code files in a directory so that you know what to examine. However, a directory can hold a great many files, and some source code files can be missed. Some programming languages are uncommon, and you may not recognize their source code files. Well, we found a solution to that too. The new FileIdentify function of CodeSuite allows you to point at a folder and generate a spreadsheet containing all of the file extensions in that folder and all of its subfolders. If CodeSuite recognizes the (potential) programming language, it will put that information in the spreadsheet too.
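The kind of inventory described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how FileIdentify is implemented: the function names, the CSV column names, and the tiny LANGUAGE_HINTS table are all assumptions, and a real tool would recognize far more languages.

```python
import csv
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately small mapping from extension to a possible language.
LANGUAGE_HINTS = {
    ".c": "C",
    ".h": "C/C++ header",
    ".java": "Java",
    ".pas": "Pascal",
    ".py": "Python",
}


def count_extensions(root):
    """Count files per lowercased extension under root, including subfolders."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def write_report(counts, out_path):
    """Write one CSV row per extension, with a language guess where one is known."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["extension", "file_count", "possible_language"])
        for ext, n in sorted(counts.items()):
            writer.writerow([ext, n, LANGUAGE_HINTS.get(ext, "")])
```

Calling count_extensions on a folder and passing the result to write_report produces a CSV file that opens in any spreadsheet program. Extensions with an empty language column are the ones worth a closer look, since they may belong to a language you did not expect.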
From the beginning of CodeSuite, when there was only CodeMatch, the database has always been a fully documented text file that anyone can view. This allows our customers to make their own tools to extract data and statistics from a CodeSuite comparison, and some customers have created very interesting utilities. Our database format started out simple but grew more complex over the years. Now we have a function in CodeSuite that converts any CodeSuite database into XML, so that you can use off-the-shelf tools to examine it and translate it, or write your own utilities to extract data and statistics.
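As a small example of what off-the-shelf tooling makes possible, the Python sketch below counts how often each element name appears in an XML file, using only the standard library. It assumes nothing about the element names CodeSuite actually emits, since the XML layout is not described here; the function name tag_counts is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(source):
    """Count occurrences of each element tag in an XML file or file-like object."""
    counts = Counter()
    # iterparse streams the document, so large exports need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # release the element's text and children once counted
    return counts
```

Running tag_counts on an exported database file and printing counts.most_common() gives a quick overview of the document's structure, which is a reasonable first step before writing a utility that extracts specific data or statistics.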