As mentioned above, IDA is a great disassembler, but do not expect good C source from the disassembled native object. Overall, the range of utilities for manipulating PE executables is quite limited compared with more universal and open executable formats like ELF. I would be more interested in the disassembled assembly, since even remotely acceptable C code will not be possible when a lot of "user" executables have obfuscated symbols. I haven't used a Windows environment in ages, but when I did, I used the Boomerang decompiler for disassembly work. It is open source and free: http://boomerang.sourceforge.net/
Answer from n00ax on Stack Exchange
There is the Hex-Rays Decompiler, which is a plugin for the Interactive Disassembler, IDA (hexrays.com). It decompiles machine code into pseudo-C code.
So you had a cow, but you inadvertently converted it to hamburger, and now you want your cow back.
Sorry, it just doesn't work that way.
Simply restore the source file from your backups.
Ah, you didn't have backups. Unfortunately, the universe doesn't give you a break for that.
You can decompile the binary. That won't give you your source code, but it'll give you some source code with the same behavior. You won't get the variable names unless it was a debug binary. You won't get the exact same logic unless you compiled without optimizations. Obviously, you won't get comments.
I've used Boomerang to decompile some programs, and the result was more readable than the machine code. I don't know if it's the best tool out there. Anyway, don't expect miracles.
Several tools are common in reverse engineering an executable.
- The command "file" which takes the file path as the first parameter so you can determine (in most cases) what type of executable you have.
- Disassemblers, which show EXACTLY what the executable does, but whose output is difficult to read for those who don't write assembly code for that specific architecture or have experience with disassembly.
- Decompilers like Boomerang, Hex-rays, and Snowman can provide some greater readability but they do not recover the actual variable names or syntax of the original program and they are not 100% reliable, especially in cases where the engineers that created the executable tested with these packages and tried to obfuscate the security further.
- Data flow diagrams or tables. I know of no free tool to do this automatically, but a Python or Bash script over the top of a text parser of the assembly output (which can be written in sed or Perl) can be helpful.
- Pencil and paper, believe it or not, for jotting flows and ideas.
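The scripting idea in the list above can be sketched roughly as follows. This is a minimal illustration, not a full data flow tool: it only builds a "who calls whom" table, and it assumes the disassembly text looks like GNU `objdump -d` output (function labels such as `0000000000001139 <main>:` and direct calls such as `call   1126 <helper>`). The function name `call_graph` and the sample listing are made up for the example.

```python
import re
from collections import defaultdict

# Assumed input format: GNU objdump -d style text.
# A function label looks like:  0000000000001139 <main>:
LABEL_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
# A direct call looks like:     call   1126 <helper>   (older versions print callq)
CALL_RE = re.compile(r"\bcallq?\s+[0-9a-f]+ <([^>+]+)")


def call_graph(listing: str) -> dict:
    """Map each function label to the sorted list of functions it calls directly.

    Indirect calls (for example through a register) have no <name> target
    and are simply not recorded by this sketch.
    """
    graph = defaultdict(set)
    current = None
    for line in listing.splitlines():
        label = LABEL_RE.match(line)
        if label:
            current = label.group(1)
            graph[current]  # register the function even if it calls nothing
            continue
        call = CALL_RE.search(line)
        if call and current is not None:
            graph[current].add(call.group(1))
    return {name: sorted(callees) for name, callees in graph.items()}


# Hypothetical listing, shortened by hand for illustration.
SAMPLE = """
0000000000001126 <helper>:
    1126:  55                    push   %rbp
    112a:  c3                    ret
0000000000001139 <main>:
    1139:  55                    push   %rbp
    113d:  e8 e4 ff ff ff        call   1126 <helper>
    1142:  e8 e9 fe ff ff        call   1030 <puts@plt>
    1147:  c3                    ret
"""

for function, callees in call_graph(SAMPLE).items():
    print(function, "->", ", ".join(callees) or "(no direct calls)")
```

A table like this is often enough to decide which functions are worth reading first; anything finer-grained (register or memory data flow) needs a real analysis framework or the pencil and paper mentioned above.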
In most cases I've seen, the code needed to be rewritten from scratch, maintained as an assembly language program, or reconstituted by re-applying change requests to an older version.
Duplicate of this question here.
Yes, it is possible, though when it comes to peeking at function bodies and the like, you might have a little less luck. Distributions like Kali Linux bundle decompilation and reverse-engineering tools, so maybe look into a VM of that. And of course, Windows has a lot of applications you can use to inspect the application code as well.
Look over the other question for specific app suggestions. :)
- Edit : You will most likely have lost all your logic and function bodies, but you might be able to recover the overall structure. It's your EXE so you might be more familiar with how it was all connected up.
You cannot get the original source code but you can decompile the binary into source code using tools given in this similar question: Is there a C++ decompiler?
The output source code will not look like the original as the compiler will have optimised the original source when generating the executable.
With a debugger you can step through the program assembly interactively.
With a disassembler, you can view the program assembly in more detail.
With a decompiler, you can turn a program back into partial source code, assuming you know what it was written in. You can find that out with free tools such as PEiD, or Detect-it-Easy (DIE) if you can't find PEiD anywhere; DIE currently has a strong developer community on GitHub. If the program is packed, you'll have to unpack it first.
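To give a rough idea of where identification tools like these start, here is a small sketch that reads the PE headers and reports the target machine and section names. It is not how PEiD or DIE actually work (they rely on signature databases); the function name `pe_summary` and the tiny machine table are assumptions made for the example. The one heuristic used, section names `UPX0`/`UPX1`, is the common fingerprint of a UPX-packed file.

```python
import struct

# Small, incomplete table of COFF machine values (illustrative subset).
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x86-64", 0xAA64: "ARM64"}


def pe_summary(data: bytes) -> dict:
    """Return machine type, section names, and a naive 'looks packed' hint."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF file header: 20 bytes right after the 4-byte signature.
    machine, section_count, _, _, _, optional_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # Section headers follow the optional header; each is 40 bytes,
    # and the first 8 bytes are the NUL-padded section name.
    table = pe_offset + 4 + 20 + optional_size
    names = []
    for index in range(section_count):
        raw = data[table + 40 * index: table + 40 * index + 8]
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "sections": names,
        # Heuristic only: UPX normally names its sections UPX0, UPX1, ...
        "maybe_upx_packed": any(name.startswith("UPX") for name in names),
    }
```

Typical use would be `pe_summary(open(path, "rb").read())`, where `path` is whatever executable you are examining. A clean result here proves nothing; other packers use different section names or none at all.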
Debuggers:
- OllyDbg, free, a fine 32-bit debugger, for which you can find numerous user-made plugins and scripts to make it all the more useful.
- WinDbg, free, a quite capable debugger by Microsoft. WinDbg is especially useful for looking at the Windows internals, since it knows more about the data structures than other debuggers.
- SoftICE, SICE to friends. Commercial, and development stopped in 2006. SoftICE is kind of a hardcore tool that runs beneath the operating system (and halts the whole system when invoked). SoftICE is still used by many professionals, although it might be hard to obtain and might not work with some hardware or software (namely, it will not work on Vista or with NVIDIA graphics cards).
Disassemblers:
- IDA Pro (commercial) - top of the line disassembler/debugger. Used by most professionals, like malware analysts etc. Costs quite a few bucks though (a free version exists, but it is quite limited).
- W32Dasm (free) - a bit dated but gets the job done. I believe W32Dasm is abandonware these days, and there are numerous user-created hacks to add some very useful functionality. You'll have to look around to find the best version.
Decompilers:
- Visual Basic: VB Decompiler, commercial, produces somewhat identifiable bytecode.
- Delphi: DeDe, free, produces good quality source code.
- C: HexRays, commercial, a plugin for IDA Pro by the same company. Produces great results but costs a big buck, and won't be sold to just anyone (or so I hear).
- .NET (C#): dotPeek, free, decompiles .NET 1.0-4.5 assemblies to C#. Supports .dll, .exe, .zip, .vsix, .nupkg, and .winmd files.
Some related tools that might come in handy in whatever it is you're doing are resource editors such as ResourceHacker (free) and a good hex editor such as Hex Workshop (commercial).
Additionally, if you are doing malware analysis (or use SICE), I wholeheartedly suggest running everything inside a virtual machine, namely VMware Workstation. In the case of SICE, it will protect your actual system from BSODs, and in the case of malware, it will protect your actual system from the target program. You can read about malware analysis with VMware here.
Personally, I roll with Olly, WinDbg & W32Dasm, and some smaller utility tools.
Also, remember that disassembling or even debugging other people's software is usually against the EULA at the very least :)
psoul's excellent post answers your question, so I won't replicate his good work, but I feel it'd help to explain why this is at once a perfectly valid but also terribly silly question. After all, this is a place to learn, right?
Modern computer programs are produced through a series of conversions, starting with the input of a human-readable body of text instructions (called "source code") and ending with a computer-readable body of instructions (called alternatively "binary" or "machine code").
The way that a computer runs a set of machine code instructions is ultimately very simple. Each action a processor can take (e.g., read from memory, add two values) is represented by a numeric code. If I told you that the number 1 meant scream and the number 2 meant giggle, and then held up cards with either 1 or 2 on them expecting you to scream or giggle accordingly, I would be using what is essentially the same system a computer uses to operate.
A binary file is just a set of those codes (usually called "op codes") and the information ("arguments") that the op codes act on.
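The card game above can be written down as a toy interpreter. This is only a sketch of the idea: the three op codes and their numbering are invented for the example and do not correspond to any real processor.

```python
# Hypothetical op codes for an imaginary one-register machine.
HALT = 0   # stop running
LOAD = 1   # put the argument into the accumulator
ADD = 2    # add the argument to the accumulator


def run(program: bytes) -> int:
    """Execute a stream of (op code, argument) pairs and return the accumulator."""
    accumulator = 0
    position = 0
    while position < len(program):
        opcode = program[position]
        if opcode == HALT:
            break
        argument = program[position + 1]  # the "information" the op code acts on
        if opcode == LOAD:
            accumulator = argument
        elif opcode == ADD:
            accumulator += argument
        else:
            raise ValueError(f"unknown op code {opcode}")
        position += 2  # every non-HALT instruction is two bytes here
    return accumulator


# The "binary": LOAD 40, ADD 2, HALT
print(run(bytes([1, 40, 2, 2, 0])))  # prints 42
```

A real CPU does the same thing in hardware, with far more op codes and variable-length encodings.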
Now, assembly language is a computer language where each command word in the language represents exactly one op-code on the processor. There is a direct 1:1 translation between an assembly language command and a processor op-code. This is why coding assembly for an x86 processor is different from coding assembly for an ARM processor.
Disassembly is simply this: a program reads through the binary (the machine code), replacing the op-codes with their equivalent assembly language commands, and outputs the result as a text file. It's important to understand this; if your computer can read the binary, then you can read the binary too, either manually with an op-code table in your hand (ick) or through a disassembler.
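As a hedged illustration of that lookup process, here is a table-driven disassembler for a tiny handful of 32-bit x86 op codes. It assumes 32-bit mode, knows only the encodings listed in the code, and falls back to printing raw bytes for everything else; a real disassembler handles prefixes, ModRM bytes, and hundreds more instructions. The function name `disassemble` is chosen for the example.

```python
import struct

# Register order used by the x86 "+r" op code encodings.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# One-byte instructions with no operands.
SIMPLE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}


def disassemble(code: bytes) -> list:
    """Translate a few single-byte x86 op codes (32-bit mode) into mnemonics."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        if op in SIMPLE:
            lines.append(SIMPLE[op])
            i += 1
        elif 0x50 <= op <= 0x57:                      # push r32
            lines.append(f"push {REGS32[op - 0x50]}")
            i += 1
        elif 0x58 <= op <= 0x5F:                      # pop r32
            lines.append(f"pop {REGS32[op - 0x58]}")
            i += 1
        elif 0xB8 <= op <= 0xBF and i + 5 <= len(code):   # mov r32, imm32
            (imm,) = struct.unpack_from("<I", code, i + 1)
            lines.append(f"mov {REGS32[op - 0xB8]}, {imm:#x}")
            i += 5
        else:                                         # not in our table
            lines.append(f"db {op:#04x}")
            i += 1
    return lines


# push ebp / mov eax, 0x2a / pop ebp / ret
for line in disassemble(bytes.fromhex("55b82a0000005dc3")):
    print(line)
```

The whole thing is a lookup plus a little operand decoding, which is the "op-code table in your hand" approach done by a program.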
Disassemblers have some new tricks and all, but it's important to understand that a disassembler is ultimately a search and replace mechanism. Which is why any EULA which forbids it is ultimately blowing hot air. You can't at once permit the computer reading the program data and also forbid the computer reading the program data.
(Don't get me wrong, there have been attempts to do so. They work as well as DRM on song files.)
However, there are caveats to the disassembly approach. Variable names are non-existent; such a thing doesn't exist to your CPU. Library calls are confusing as hell and often require disassembling further binaries. And assembly is hard as hell to read in the best of conditions.
Most professional programmers can't sit and read assembly language without getting a headache. For an amateur it's just not going to happen.
Anyway, this is a somewhat glossed-over explanation, but I hope it helps. Everyone can feel free to correct any misstatements on my part; it's been a while. ;)
Reko maintainer here. You could give Reko a try (https://github.com/uxmal/reko). Like the other decompilers you've tried, it won't generate immediately compilable code, for a vast number of reasons.
However, open source projects usually appreciate constructive user feedback. You could try running Reko (or any other decompiler) on your binary, and then looking at places where you think Reko could do better. Then you could file specific issues (here's a good example: https://github.com/uxmal/reko/issues/1129). This is more likely to result in improvements than the non-specific "I ran and the output is not what I want it to be."
Many years ago, I wrote a lengthy answer explaining why it is generally impossible for decompilers to produce C code that compiles for arbitrary input binaries. There is a lot of manual work in your future.