The problem isn't nearly as complex as you are making it out to be.
Here is a pretty good article about hex editing strings.
It sounds like you think that the strings are embedded in the machine instructions, but that's not true. They are easily visible just by scrolling to the read-only data section in any hex editor that supports Shift_JIS, such as MadEdit. Editing them directly is trivial, and it is how I was able to quickly edit the few battle strings that actually are translated in the most recent release. Where the problem becomes more complex is when you cannot fit the English string into the Japanese string's slot. The article I linked covers how to handle this. It involves doing some math and diving into the instructions to edit the pointers.
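To make the "does it fit in the slot" check concrete, here is a rough Python sketch. It assumes the slot size is simply the Shift_JIS byte length of the original Japanese string (any alignment padding after the terminator is ignored), and fit_into_slot is a name I made up for illustration.

```python
def fit_into_slot(jp: str, en: str, encoding: str = "shift_jis") -> bytes:
    """Encode `en` and NUL-pad it to the byte length of `jp`.

    Raises ValueError when the English text needs more bytes than the
    Japanese string occupied, which is the case where the pointer has
    to be redirected instead.
    """
    slot = len(jp.encode(encoding))   # bytes available in the original slot
    new = en.encode(encoding)
    if len(new) > slot:
        raise ValueError(f"needs {len(new)} bytes, slot has {slot}")
    return new + b"\x00" * (slot - len(new))
```

Anything that raises here is a string that cannot be edited in place and needs the pointer treatment instead.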
The real problem, though, lies in the fact that Hemo updates the game frequently. The game hasn't even been out for five months yet and we are on version 1.34. Directly editing the .exe won't work because we'd have to manually find and edit everything all over again after every update. I'm in the process of writing a set of applications and scripts that handle this problem.
The first step is finding and dumping all of the strings. The process I am using is as follows:
The .rdata section is where all of the read-only data (in other words, the strings) is stored. It is also the last section in the binary, so I just let the program loop until it reaches EOF. I have already written and run this. It produced a .csv with about 7.5k strings in it. The idea is that I can pass this .csv off to somebody who will handle the translation.
Here are some snippets of the output:
[Offset], JP String, EN String
...
...
...
[0x40aa12],"の封印を解いた・・・",""
[0x40aa2a],"を人形箱に入れた。",""
[0x40aa3e],"%sは 倒れた!",""
[0x40aa4e],"相手の %sは 倒れた!",""
[0x40aa66],"バリアオプションは 壊れた!",""
[0x40aa82],"相手の バリアオプションは 壊れた!",""
[0x40aaa6],"気象が %sになった!",""
[0x40aaba],"気象が 元に戻った!",""
[0x40aace],"地相が %sになった!",""
[0x40aae2],"地相が 元に戻った!",""
[0x40aaf6],"Lv.%2u",""
[0x40aafe],"%6s/%6s",""
[0x40ab06],"%d:%02d",""
[0x40ab0e],"Lv.",""
[0x40ab12],"EXP+",""
[0x40ab1a],"PP+",""
[0x40ab1e],"x",""
...
...
...
[0x410b38],"タ",""
[0x410b3c],"@チ",""
[0x410b40],"ネツヘフフフフフ・囮劔劔・333333・ヘフフフフフ・",""
[0x410b67],"タV@UUUUUU\@",""
[0x410b77],"€l@",""
[0x410b7d],"・)@ク・Qク・",""
[0x410b8f],"@@",""
[0x410b97],"`蓮",""
[0x410ba0],"Bタ",""
[0x410ba4],"0B",""
[0x410bb0],"F@",""
[0x410bb4],"狠",""
[0x410bbf],"@m@",""
[0x410bc4],"jテ",""
[0x410bcf],"@mタ",""
[0x410bd4],"沓ヘフL?",""
...
...
...
[0x423348],"Gソ",""
[0x42334b],"ヒソ",""
[0x42334e],"kJユ|dUェ ユUVテ2スソ!・QェQソU ソ {ソ9K ソ)s&ソ!{SソY{SェトA{RソY{Rソ・ソ!{ソ!Kソ!{]ソ・ソ!{\ソY{\ソ_ソ!{_ソq{ヌソe・ソ",""
[0x424206],"/QZソ ;ソ /Pユs覬ソ8/Sェ/ユSソ8/RZ*Rソ8/]ユs訃ソ8/\ェ/ユ\ソ8/_Z*_ソ・ソ #ソ・ソ{ソ #Wユ|ソh#ソ",""
[0x424259],"ソソ",""
[0x42425c],"ウソ0#ソ匂クソホ/
ソaロ!ソ9檸+ソ? ソ",""
As you can see, there is a chunk of valid data and then a big chunk of garbage. A good percentage of the 7.5k strings are garbage like this, but there really isn't any way to tell whether or not you're grabbing something that you're looking for. I've determined that the best way to go about it is to grab everything at once and let people edit whichever lines they want.
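For anyone curious what a dump like this might look like in code, here is a minimal Python sketch. It is not my actual tool, just an illustration: it assumes strings are NUL-terminated, keeps anything that decodes as Shift_JIS (garbage included), and reports offsets as plain indexes into the bytes it was given. The names extract_strings and dump_csv are made up for the example.

```python
import csv

def extract_strings(data: bytes, start: int = 0, encoding: str = "shift_jis"):
    """Yield (offset, text) for every NUL-terminated run from `start` to EOF."""
    pos = start
    while pos < len(data):
        end = data.find(b"\x00", pos)
        if end == -1:              # no terminator left: run to end of file
            end = len(data)
        raw = data[pos:end]
        if raw:                    # skip empty runs between consecutive NULs
            try:
                yield pos, raw.decode(encoding)
            except UnicodeDecodeError:
                pass               # not decodable as Shift_JIS, drop it
        pos = end + 1

def dump_csv(data: bytes, out, start: int = 0) -> None:
    """Write rows shaped like the dump above: [offset], JP string, empty EN."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    for offset, text in extract_strings(data, start):
        writer.writerow([f"[{offset:#x}]", text, ""])
```

Usage would be something like reading the .exe with open(path, "rb").read() and passing the file offset where .rdata begins as start.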
The current hangup is that the offsets are being reported incorrectly. I'm not sure why, and the margin of error isn't consistent over the course of the output. Once I figure it out, I'll be ready to share the .csv with people so they can begin editing. I have some ideas but haven't gotten around to working on it again between coursework, my day job, and being too tired during the few free hours I get.
The second step involves placing the new EN strings into a new section of the .exe and then correcting the pointers in the instructions. As I mentioned already, this involves doing a bit of math, which is trivial. The hard part will be searching for the correct pointer so I can edit it, but I imagine that won't be too difficult. Once I am confident that the reported offsets are the correct ones, I'll be able to move on to this step. There isn't any sense in performing a search when you find the right spot but report the wrong value.
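The pointer search could be sketched roughly like this in Python. This is only a sketch under assumptions I have not verified against the game: that the .exe is 32-bit and that the instructions reference each string by its absolute address stored as a 4-byte little-endian value. The function names are hypothetical.

```python
import struct

def find_pointer_refs(image: bytes, target: int) -> list:
    """Return every file position holding `target` as a 32-bit LE value."""
    needle = struct.pack("<I", target)
    hits = []
    pos = image.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = image.find(needle, pos + 1)
    return hits

def repoint(image: bytes, old: int, new: int) -> bytes:
    """Return a copy of `image` with each reference to `old` set to `new`."""
    patched = bytearray(image)
    for pos in find_pointer_refs(image, old):
        patched[pos:pos + 4] = struct.pack("<I", new)
    return bytes(patched)
```

A raw byte search like this can also hit data that merely happens to contain the same four bytes, so each match would still need a sanity check before patching.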
Now for the final step. A script will need to be written to copy over the translated elements from the old .csv to a new dump. This will likely be trivial.
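A rough Python sketch of that carry-over step, assuming rows are (offset, JP string, EN string) tuples as in the dump above and that the Japanese text itself is a good enough key (offsets will move between versions, so they can't be). merge_translations is a made-up name.

```python
def merge_translations(old_rows, new_rows):
    """Copy EN strings from an old dump into a new one, matching on JP text.

    If the same JP string appears more than once in the old dump with
    different translations, the last one wins.
    """
    translated = {jp: en for _, jp, en in old_rows if en}
    return [
        (offset, jp, en or translated.get(jp, ""))
        for offset, jp, en in new_rows
    ]
```

Rows that come out with an empty EN column are the new or changed strings that still need translating.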
Down the road this will provide much faster patch updates in response to Hemo's updates. We will be able to just create a new dump, bring our changes over, translate the new/changed strings if there are any, and then push it into the game.
This ended up being a pretty long post, but hopefully it answers your question. Even if it doesn't, it was useful for me to gather my thoughts on things.