I don't think there is a tool, or ever will be, that can provide certainty. You do not state whether you are looking for accidental or intentionally malicious code. Accidental mistakes are easier to catch; intentional "back doors" are impossible, at the tool level, to distinguish from valid code.
I would start with (as you have already mentioned) static checks, and dynamic checks (valgrind).
Internal code review/audit will be required, and depending on importance and project budgets, external code review/audit by specialists would be money well spent. 150 kLOC is a fairly average project size, so an audit of it should be feasible.
Testing the interfaces (black box) is essential. Use a security consultant if you do not have experts in-house.
If you are looking for intentional code, that's a really big problem, as it's pretty easy to hide an intentional vulnerability (e.g. an intentional buffer overflow hidden behind a cast that suppresses the compiler warning), and the code base is too big to be certain of identifying every one. Assume you won't find them all, and deploy accordingly.
Answer from mattnz on Stack Exchange
Hello dear community, I am a newbie looking for resources on how to perform code reviews from a security perspective. Do you have any suggestions on how to practice it and what resources could be helpful? In general I have pretty good security knowledge, but I want to practice and improve this part!
There are periodic contests for obfuscated code and code that does something other than what it appears to do. Study the contest entries, and you will discover that this is not a trivial problem to solve.
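To make the "overflow hidden behind a cast" idea concrete, here is a rough Python simulation of it. This is not real C and not taken from any actual project: `BUF_SIZE` and both function names are made up for illustration, and `ctypes.c_uint8` merely stands in for a C cast such as `(unsigned char)len`. The point is only that the check looks reasonable while operating on a truncated value.

```python
import ctypes

BUF_SIZE = 16  # hypothetical fixed-size destination buffer


def length_check_passes(length: int) -> bool:
    """Mimic `if ((unsigned char)len <= BUF_SIZE)` from a C code base."""
    # c_uint8 does no overflow checking: only the low 8 bits survive.
    narrowed = ctypes.c_uint8(length).value
    return narrowed <= BUF_SIZE


def bytes_written_past_buffer(length: int) -> int:
    """Bytes a following memcpy(dest, src, len) would write past the buffer."""
    if not length_check_passes(length):
        return 0  # the copy is refused, nothing is written
    # The copy uses the full, un-narrowed length.
    return max(0, length - BUF_SIZE)


if __name__ == "__main__":
    for length in (10, 16, 17, 256, 260):
        print(length, length_check_passes(length), bytes_written_past_buffer(length))
```

A length of 17 is rejected as expected, but 256 or 260 slips through because the narrowed value is small. A reviewer skimming the check would likely read it as correct, which is why this kind of thing is hard to spot in a large code base.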
Welcome to Security.SE. This question doesn't really fit the StackExchange format, and will likely be closed. Questions like this typically end up just being a list of books, which could potentially get out of date very rapidly.
To answer your question, I would recommend a couple of things:
- Pick an area that you would like to study (e.g. web vulnerabilities vs. C source code).
- Find open source applications that fit your choice and have had security vulnerabilities discovered in the past. There are also plenty of purpose-made vulnerable applications, such as OWASP's WebGoat, that you could look at.
- Download an old version of the source code which still contains the vulnerability, and read through the code to see if you can understand how the vulnerability presents itself.
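One way to practice the last step is to diff the vulnerable version against the fixed one and look only at what changed. The sketch below uses Python's standard `difflib`; the two C snippets and the `changed_lines` helper are invented for illustration and are not from any real project.

```python
import difflib

# Hypothetical "before" and "after" versions of the same C function.
VULNERABLE = """\
void greet(const char *name) {
    char buf[32];
    strcpy(buf, name);
    puts(buf);
}
"""

FIXED = """\
void greet(const char *name) {
    char buf[32];
    snprintf(buf, sizeof buf, "%s", name);
    puts(buf);
}
"""


def changed_lines(old: str, new: str):
    """Return (removed, added) source lines between two versions."""
    removed, added = [], []
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:].strip())
        elif line.startswith("+ "):
            added.append(line[2:].strip())
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return removed, added


if __name__ == "__main__":
    removed, added = changed_lines(VULNERABLE, FIXED)
    print("removed:", removed)
    print("added:  ", added)
```

Try reading the old version first and guessing the bug before you look at the diff; the diff then tells you whether you were right.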
As far as books go, there are plenty available, but as @multithr3at3d mentioned, it's kind of a broad topic.
You can also try running some static analysis tools against some open source projects, and review the results to find other potential vulnerabilities.
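If you want a feel for what the simplest static checks do before trying a real tool, a toy version fits in a few lines. This is only a sketch: the list of risky calls is my own short selection, `scan` is a made-up helper, and a regex over source lines is far cruder than what real analyzers do (no parsing, no data flow, and it will happily flag calls inside comments).

```python
import re

# A small, assumed list of C functions that commonly deserve a second look.
RISKY_CALLS = ("gets", "strcpy", "strcat", "sprintf")
PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")

SAMPLE = """\
char buf[16];
fgets(buf, sizeof buf, stdin);
strcpy(buf, argv[1]);
sprintf(msg, "%s", buf);
"""


def scan(source: str):
    """Return (line number, function name) for each risky call found."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            findings.append((lineno, match.group(1)))
    return findings


if __name__ == "__main__":
    for lineno, name in scan(SAMPLE):
        print(f"line {lineno}: call to {name}() - check the buffer sizes")
```

Every hit is a place to start reading, not a confirmed vulnerability. The review work is deciding which findings are actually reachable with attacker-controlled input.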
See https://www.youtube.com/watch?v=ibF36Yyeehw (start watching at around 30:00) to see how Moxie Marlinspike does it. In this example, he shows how he found a serious bug in the SSL implementation used by Mozilla. By exploiting the bug, he was able to launch a deadly attack that allowed him to create a certificate on the fly that could be used to MITM any site, with the certificate being trusted by the browser.
On the lower layers, manually examining memory can be very revealing. You can certainly view memory with a tool like Visual Studio, and I would imagine that someone has even written a tool to crudely reconstruct an application based on the instructions it executes and the data structures it places into memory.
On the web, I have found many sequence-related exploits by simply reversing the order in which an operation occurs (for example, an online transaction). Because the server is stateful but the client is stateless, you can rapidly exploit a poorly-designed process by emulating a different sequence.
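A minimal sketch of that kind of sequence bug, assuming a made-up three-step order flow (created, paid, shipped). None of these names come from a real framework. The naive handler trusts the client to call the steps in order; the strict one checks the current state on the server before applying an action.

```python
class OutOfOrder(Exception):
    """Raised when an action is not allowed in the order's current state."""


# Hypothetical workflow: which actions are legal in which state.
ALLOWED = {"created": {"pay"}, "paid": {"ship"}, "shipped": set()}
NEXT_STATE = {"pay": "paid", "ship": "shipped"}


class Order:
    def __init__(self):
        self.state = "created"


def naive_apply(order: Order, action: str) -> None:
    # Vulnerable: assumes the client always sends "pay" before "ship".
    order.state = NEXT_STATE[action]


def strict_apply(order: Order, action: str) -> None:
    # Safer: the server validates the transition against its own state.
    if action not in ALLOWED[order.state]:
        raise OutOfOrder(f"{action!r} not allowed in state {order.state!r}")
    order.state = NEXT_STATE[action]


if __name__ == "__main__":
    order = Order()
    naive_apply(order, "ship")  # skips payment entirely
    print("naive handler:", order.state)

    order = Order()
    try:
        strict_apply(order, "ship")
    except OutOfOrder as exc:
        print("strict handler refused:", exc)
```

When testing a site, the equivalent is replaying the requests of a multi-step process in a different order and seeing whether the server objects.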
As to the speed of discovery: I think quantity often trumps brilliance. Put a piece of software, even a good one, in the hands of a million bored/curious/motivated people, and vulnerabilities are bound to be discovered. There is a tremendous rush to get products out the door.
There is no efficient way to do this, as firms spend a good deal of money to produce and maintain secure software. Ideally, their work in securing software does not start with looking for vulnerabilities in the finished product, so many vulns have already been eradicated by the time the software is out.
Back to your question: it will depend on what you have (working binaries, complete/partial source code, etc.). On the other hand, the goal is not to find ANY vulnerability but those that count (e.g., those that matter to the client of the audit or to the software owner). Right?
This will help you understand the inputs and functions you need to worry about. Once you have located these, you may already have a feeling for the software's quality: if it isn't very good, then fuzzing will probably find you some bugs. Otherwise, you need to start understanding these functions and how the input is used within the code, to work out whether the code can be subverted in any way.
Some experience will help you weigh how much effort to put into each task and when to push further. For example, if you see some bad practices being used, then delve deeper. If you see crypto being implemented from scratch, delve deeper. Etc.
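For a sense of what "fuzzing will find you some bugs" looks like in practice, here is a deliberately tiny random fuzzer. Everything in it is assumed for the sake of the example: `parse_record` is a made-up target with a planted bug (it trusts a length field), and real fuzzers such as coverage-guided ones are far smarter about choosing inputs.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: [length byte][payload][checksum byte]."""
    if not data:
        raise ValueError("empty input")  # expected, graceful rejection
    length = data[0]
    payload = data[1:1 + length]
    checksum = data[1 + length]  # planted bug: length field is never validated
    if sum(payload) % 256 != checksum:
        raise ValueError("bad checksum")  # expected, graceful rejection
    return payload


def fuzz(target, iterations: int = 1000, seed: int = 0):
    """Feed random byte strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # the parser rejected the input cleanly
        except Exception as exc:  # anything else counts as a finding
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    crashes = fuzz(parse_record)
    print(f"{len(crashes)} crashing inputs found")
    if crashes:
        print("first example:", crashes[0])
```

Each crashing input is a lead: the next step is reading the code path it exercises and deciding whether the crash is merely a bug or something an attacker could actually use.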