Decompilation is unlikely to be a practical solution, and it is even less likely that a tool for your specific compiler and instruction set combination even exists.
Disassembly however is straightforward, though whether you will be able to make sense of the resulting code is a different matter since no comments or symbols are preserved in the HEX file; if you have the original object code it may render the disassembly more readable. There are many PIC disassemblers available, just Google it; I can't direct you at any specific one because there are a number of PIC families with different instruction sets, and you did not specify.
A simple approach to disassembly is to load your HEX file into MPLAB and select View->Disassembly Listing, then right-click the window and select "Output to File". This output may need some massaging before it is suitable as input to an assembler.
Answer from Clifford on Stack Overflow
I know this is an old post, but I recently encountered a similar problem and didn't find a very complete answer online. I lost my MPLAB X IDE project due to a hard drive failure; luckily, I had already programmed a device with a working version of the code.
Recover the .hex
Follow the steps below to recover the .hex information from a programmed device:
Use MPLAB X IDE and your PIC programmer (I used PICkit3) to read the .hex file from the programmed device:
- Start a new project for your device.
- In "Project Properties" select your programmer.
- Right click on the project folder and select "Set as Main Project".
- Click on the arrow next to the "Read Device Memory Main Project" and select "Read Device Memory to File".
Disassemble the .hex
You can view the disassembly in MPLAB X IDE, but you cannot edit or save it (or at least I couldn't figure out how to) and it is very cryptic. I found the easiest, no-strings-attached disassembler to be gpdasm, which is packaged with gputils. To download and install it, visit the gputils page here:
https://gputils.sourceforge.io/
Now open a command prompt and navigate to the folder where your .hex file is located. Generate an assembly source file from the .hex with the following command:
gpdasm -p p16f84a -csno hexfile.hex > asmfile.dis
With the -c -s -n and -o options, this generates quite a good listing which is very near to being able to be assembled as is. Obviously the variable names and labels cannot be recovered, but at least subroutines are identified which makes things a lot easier. Hope this helps someone in the future.
I would look at the output of avr-objdump (gulp):
avr-objdump -j .sec1 -d -m avr5 foo.hex
You will have to change the word following "-m" to your architecture. Even if this works, it will give you assembly code, which might not look like anything you have ever written. The variable names will be different, and the handy Arduino functions will look like messy assembly junk. I hope there is a better way, sorry.
See also AVR GCC forum - Using avr-objdump to disassemble hex code.
This may not have been available at the time the other answers were given. It converts the .hex back to assembler. You might need to know the architecture of the original AVR it was intended for. Works well for me for code that I wrote and compiled. I tested with AVR-25 for Tiny 85. Hope it helps. Would be nice to have an offline version of the same thing! http://www.onlinedisassembler.com/ An alternative commercial option is IDA from https://www.hex-rays.com/
Short answer: You can't.
At least, don't expect readable, compilable C source code. There's discussion of why elsewhere on this site, so I won't get into details.
Also, note that there's no easy walk-through or how-to. You need to experiment, and you'll need some experience as well.
To get you started, you might:
- convert the .hex file into a raw binary file, for example using Hex2bin
- use that binary file with the retargetable Decompiler selecting "raw machine code" and probably "ARM+Thumb" as architecture
- If the results of the retargetable decompiler are unsatisfactory (it didn't work well for me last time I tried), you might want to try the Online Disassembler to get assembly code
- Of course, the ultimate tool is IDA, but the freeware version can't handle ARM, and the price is probably a bit steep for a hobby project.
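The first step in the list above, turning the .hex into a raw binary, can also be sketched in a few lines of Python. This is only a minimal sketch and not how Hex2bin itself works: the helper name `ihex_to_image` and the sample records are made up for illustration, it assumes the file uses only record types 00 (data), 01 (end of file) and 04 (extended linear address), it does not verify checksums, and it pads unused gaps with 0xFF.

```python
def ihex_to_image(lines, fill=0xFF):
    """Build a flat binary image from Intel HEX text lines (hypothetical helper)."""
    base = 0      # upper 16 address bits, set by type 04 records
    chunks = {}   # absolute address -> data bytes
    for line in lines:
        line = line.strip()
        if not line.startswith(":"):
            continue  # skip blank or foreign lines
        raw = bytes.fromhex(line[1:])
        count = raw[0]
        offset = int.from_bytes(raw[1:3], "big")
        rtype = raw[3]
        data = raw[4:4 + count]
        if rtype == 0x00:      # data record
            chunks[base + offset] = data
        elif rtype == 0x04:    # extended linear address record
            base = int.from_bytes(data, "big") << 16
        elif rtype == 0x01:    # end-of-file record
            break
    if not chunks:
        return 0, b""
    start = min(chunks)
    end = max(addr + len(d) for addr, d in chunks.items())
    image = bytearray([fill]) * (end - start)
    for addr, d in chunks.items():
        image[addr - start:addr - start + len(d)] = d
    return start, bytes(image)


# Made-up sample: two 4-byte data records with a 4-byte gap between them.
sample = [
    ":020000040000FA",
    ":0400000001020304F2",
    ":0400080005060708DA",
    ":00000001FF",
]
start, image = ihex_to_image(sample)
print(hex(start), image.hex())
```

To convert a real file, pass `open("firmware.hex")` (a placeholder name) as `lines` and write the returned bytes to a file opened in `"wb"` mode.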
radare2 supports these ihex files directly
Note: the radare2 command in my original answer uses the switch -b 32. SYS_V commented that it should be -b 16 and posted an answer with ample details on how to proceed from the end of my answer to a tangible result; see the thread "how to find usefull info from a bin file".
:\>ls -l
total 172
-rw-rw-rw- 1 Admin 0 172401 2016-01-01 00:44 SMOK_X_CUBE_II_firmware_v1.07.hex
:\>rahash2 -a md5 SMOK_X_CUBE_II_firmware_v1.07.hex
SMOK_X_CUBE_II_firmware_v1.07.hex:0x00000000-0x0002a170 md5: 351660a42b846d19e35f54f75530e2d9
:\>radare2 -A -a arm -b 32 ihex://SMOK_X_CUBE_II_firmware_v1.07.hex
Function too big at 0xa50e54
Function too big at 0xfe25a2ac
Function too big at 0x1648234
Function too big at 0x13ed738
[0x00000000]> s 0xc1
[0x000000c1]> pd 10
| 0x000000c1 4885460c mcrreq p5, 4, r8, r6, c8
| 0x000000c5 f070fc00 ldrshteq r7, [ip], 0
| 0x000000c9 480047e9 stmdb r7, {r3, r6} ^
| 0x000000cd 1b0000e8 stmda r0, {r0, r1, r3, r4}
| 0x000000d1 0e002004 strteq r0, [r0], -0xe
| 0x000000d5 48804704 strbeq r8, [r7], -0x48
| 0x000000d9 480047fe cdp2 p0, 4, c0, c7, c8, 2
| 0x000000dd e7fee7fe cdp2 p14, 0xe, c15, c7, c7, 7
| 0x000000e1 e7fee7fe cdp2 p14, 0xe, c15, c7, c7, 7
| 0x000000e5 e7fee75d stclpl p14, c15, [r7, 0x39c]!
[0x000000c1]>
if you were wondering how radare2 got the 4885460c at address 0xc1 then read further
intel seems to have published the specs of the ihex file format; i got hold of a pdf from microsym named intelhex.pdf (i don't know if a later version is available). this is my first brush with ihex, or arm for that matter
Hexadecimal Object File Format Specification Revision A January 6, 1988
based on the description in the file
it seems each line in the ihex file starts with a colon :
followed by ONE BYTE = record length
followed by TWO BYTES = offset to load
followed by ONE BYTE = Record Type
Last BYTE in the line = Checksum
each of the above are hexpairs ie BYTE E8 will be 0x45 0x38 in the file
3A 31 30 30 30 30 30 30 30 45 38 30 45 30 30 :10000000E80E00
the file consists of 3833 lines out of which 3830 lines have a record length of 0x10
wc SMOK_X_CUBE_II_firmware_v1.07.hex
3833 3833 172401 SMOK_X_CUBE_II_firmware_v1.07.hex
grep -ivn :10 SMOK_X_CUBE_II_firmware_v1.07.hex
1::020000040000FA
3832::04000005000000C136
3833::00000001FF
dissecting the first line
line1 data size = 0x02
load offset = 0x0000
record type is = 0x04 (extended linear address 32 bit format)
and it stays as is until another record 04 is encountered lets check if the file contains another record 04
:\>grep -in :......04 SMOK_X_CUBE_II_firmware_v1.07.hex
1::020000040000FA
only one line
the last line 3833 denotes end of file record type 0x01
the last but one line denotes start linear address record type 0x05 and EIP = 0xc1; the checksum matches: (0x100 - (0xc1 + 0x05 + 0x04)) == 0x36
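The record layout and the checksum arithmetic described above can be checked with a short Python sketch. The function name `parse_record` and the dictionary keys are made up for illustration; the three records are the non-data lines found by grep above. The checksum is taken as the two's complement of the low byte of the sum of all preceding bytes in the record.

```python
def parse_record(line):
    """Split one Intel HEX line into its fields (hypothetical helper)."""
    raw = bytes.fromhex(line.strip().lstrip(":"))
    count = raw[0]                            # ONE BYTE  = record length
    offset = int.from_bytes(raw[1:3], "big")  # TWO BYTES = load offset
    rtype = raw[3]                            # ONE BYTE  = record type
    data = raw[4:4 + count]
    checksum = raw[4 + count]                 # last byte in the line
    # Two's complement of the low byte of the sum of everything before it.
    expected = (0x100 - sum(raw[:4 + count])) & 0xFF
    return {
        "count": count,
        "offset": offset,
        "type": rtype,
        "data": data,
        "checksum": checksum,
        "checksum_ok": checksum == expected,
    }


for text in (":020000040000FA", ":04000005000000C136", ":00000001FF"):
    rec = parse_record(text)
    print(text, hex(rec["type"]), rec["data"].hex(), rec["checksum_ok"])
```

For the start linear address record the data field is 000000c1, which is where the 0xc1 entry point comes from.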
so that leaves 3830 lines as code data xx
based on the above details we can carve the bytes
delete the first, last, and last-but-one lines from the input file
sed substitute the first nine characters and last two character and rip off the line endings \r\n .
sed s/:........//g < in | sed s/..$//g | tr -d "\r\n" > out
lets check out if sed magic worked we should have 32 characters per line if it worked concatenated into one big string
:\>wc out
0 1 122560 out
:\>set /a 32*3830
122560
lets convert the hex pairs to binary
rax2 -s < out > bin
this doesn't work as it should deliver us half the input size but it is higher than that also the inversion rax2 -S < bin > is_original doesn't get us the original input back
to the developers of radare: if you read this, can you check whether rax2 -s works properly on windows when the input is a file? it seems to suffer from unix / windows line ending quirks (windows seems to convert the binary 0x0A to 0x0d 0x0a when using the redirection operator)
:\>rax2 -s < out > bin
:\>wc bin
616 4126 61896 bin
:\>set /a 61896 * 2
123792
lets cook a python unhexlify
:\>cat makebin.py
import binascii
fp = open("out","rb")
fo = open("bin","wb")
buff = fp.read()
fo.write(binascii.unhexlify(buff))
fp.close()
fo.close()
:\>python makebin.py
:\>wc *
616 4126 61280 bin
3830 3830 172350 in
6 14 139 makebin.py
0 1 122560 out
616 4126 61896 rax2bin
5068 12097 418225 total
:\>set /a 61280*2
122560
:\>
python seems to deliver us the correct bytes
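The numbers above are consistent with the line-ending explanation: wc reports 616 lines for bin, meaning the correct binary contains 616 bytes with value 0x0A, and 61280 + 616 is exactly the 61896 bytes that the rax2 redirection produced. A small sketch of that arithmetic follows; `text_mode_expand` is a made-up name that only imitates the suspected translation, it is not what rax2 or the Windows runtime actually executes.

```python
def text_mode_expand(data):
    """Imitate the suspected text-mode translation: each 0x0A becomes 0x0D 0x0A."""
    return data.replace(b"\n", b"\r\n")


# Tiny demonstration: the output grows by one byte per 0x0A in the input.
blob = bytes([0x48, 0x0A, 0x85, 0x0A, 0x46])
expanded = text_mode_expand(blob)
print(len(blob), len(expanded), blob.count(b"\n"))

# Figures taken from the wc output in this answer.
correct_size = 61280    # bytes written by the python unhexlify script
newline_bytes = 616     # "lines" wc counts in bin, i.e. 0x0A bytes
rax2_size = 61896       # bytes produced by rax2 -s with redirection
print(correct_size + newline_bytes == rax2_size)
```

Opening the output with `"wb"`, as makebin.py does, avoids the translation entirely.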
lets ask radare2 if the size 122560 is right
[0x000000c1]> if
file ihex://SMOK_X_CUBE_II_firmware_v1.07.hex
fd 2357404
size 0xef60
blksz 0x0
mode r--
block 0x100
format any
[0x000000c1]> !rax2 0xef60*2
122560
[0x000000c1]>
it seems to agree lets xxd and look if we fished the right bytes at right offset
:\>xxd -g 4 -l 32 -s 0xc1 bin
00000c1: 4885460c f070fc00 480047e9 1b0000e8 H.F..p..H.G.....
00000d1: 0e002004 48804704 480047fe e7fee7fe .. .H.G.H.G.....
it looks right. now we can try to start understanding the mnemonics (google doesn't seem to know much about mcrreq). i don't know arm, so good luck from here on