We’re often asked why we would do binary analysis on software that we already have the source code for, and Rob Graham had a great post on that very topic a few days ago over at Errata’s blog. As Graham says, the key difference between coders and hackers (or security researchers playing the part) is the concrete versus the abstract. Analyzing the binary itself gives us a much more complete understanding of what the program is actually doing, without assumptions about what it was meant to do getting in the way.

In looking at the binary, an auditor has to, on some level, forget what they know about what the program is supposed to do and focus on the specifics of the section they are analyzing. Each memory read and write has to be examined for what it is, not for what it is supposed to be. In all honesty this can be quite tedious, but it’s the only way to find a lot of vulnerabilities. That’s not to say we’d have fewer security problems if everyone did all their coding in assembly, but keeping an eye on what actually happens at the lowest level during the development process would help.
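To make the "what it is versus what it is supposed to be" point concrete, here is a minimal sketch in C. The names `BUF_SIZE`, `length_check_passes`, and `length_as_copy_sees_it` are hypothetical and are not taken from any real audit; they only illustrate the kind of gap between intent and behavior that reading the binary tends to expose.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 64 /* hypothetical buffer size for this sketch */

/* What the source appears to say: "reject anything larger than the buffer".
 * Because len is a signed int, the comparison is signed, so a negative
 * length passes the check. */
int length_check_passes(int len)
{
    return len <= BUF_SIZE;
}

/* What a later memcpy(dst, src, len) would actually be handed: memcpy takes
 * a size_t, so the signed value is converted to unsigned on the way in. */
size_t length_as_copy_sees_it(int len)
{
    return (size_t)len;
}
```

Under these assumptions, a length of -1 passes the check and then reaches the copy as `SIZE_MAX`. In the source the check reads as a sensible bounds test. In a disassembly the same code typically shows up as a signed conditional jump followed by a widening of the value before the call, which is exactly the kind of detail that stands out when each operation is examined for what it is.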

But there is also a place for source code analysis. When looking for certain types of problems, like errors in logic and implementation correctness, source analysis is very fruitful, and those problems can be found much more easily than by slogging through assembly. When auditing a section of code that uses complex mathematics, an auditor could work his way up from the additions and subtractions in the binary to understand the function and spot the problem, but it’s a lot more likely that he would notice a typo or an incorrect variable in the source, and he would probably spend a lot less time finding it. Doing this level of analysis also gives us insight into how vulnerabilities may have been created in the first place, which lets us recommend changes in coding practices and address “big picture” security issues to prevent more like them from occurring. Both binary and source analysis have their place in an audit, and combined they give a real understanding of a program’s security from top to bottom.
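As a small illustration of the "typo and an incorrect variable" case, consider the sketch below. The function names are hypothetical and the math is deliberately simple; real code under audit would usually be far more involved.

```c
/* Buggy version: the second term should use dy, but dx was typed twice.
 * Reading the source, the repeated variable is easy to spot. */
double dist_squared_buggy(double x1, double y1, double x2, double y2)
{
    double dx = x2 - x1;
    double dy = y2 - y1;
    (void)dy;                 /* dy is computed but never used: the bug */
    return dx * dx + dx * dx;
}

/* Corrected version: each term uses its own axis. */
double dist_squared(double x1, double y1, double x2, double y2)
{
    double dx = x2 - x1;
    double dy = y2 - y1;
    return dx * dx + dy * dy;
}
```

For the points (0, 0) and (3, 4) the corrected function returns 25, while the buggy one returns 18. In the binary the buggy version is just a multiply and an add like any other, and nothing marks it as wrong unless the auditor already knows what the function is supposed to compute. In the source, the mismatch between the variable names is visible at a glance.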