Binary translator to binary code of x86 architecture

The translation comes from binary code that is created by my own processor, which is located in the "working cpu" repository.

Basic concept

By its principle, this is JIT (Just-in-Time), which means that translation and execution occur in almost the same process. How does it work? First, all binary code is translated into a buffer, then this buffer is executed.

How certain commands are deployed in x86_64 commands can be viewed in the ASM_LOG file, which is a file in the nasm language, when compiled, a binary code identical to the code generated by the translator will be obtained.

Calling native C functions

The implementation of the input, output and square root functions are implemented by calling the functions of the C language library. Therefore, when executing already x86_64 binary code, access to C libraries is still required.

Performance comparison

Testing methodology and typical mistakes

We will compare directly the execution of the binary code execution on my processor and on the x86_64 architecture processor. For successful testing and obtaining real results, several actions should be performed:

Isolate the parts of both programs that are responsible for execution from other parts so that slow functions (for example, printf) do not affect the execution time.
When testing any system, errors are possible, so it is necessary to perform more than one execution cycle, in our tests we will execute the code 1000 times.
It is necessary to completely disable the logs
Choose a program as the code of the tested program so that it uses approximately all the capabilities of processors equally. This will help to avoid distorting the performance of the program due to the specifics of the implementation of a particular process.
It is necessary to set the compiler optimization flags to the maximum possible value (-O3 in our case).
Exclude keyboard input so that the results do not affect the tester's printing speed.

Next, several stages of testing the program of my processor will be shown, which will show how much all these points are needed.

As a test program, it will use recursive factorial calculation for the number 5. Without reading from the keyboard and without displaying the result on the screen. Thus, we will use arithmetic operations, function calls, and jumps.

The performance evaluation will be performed by the kcachegrind program.

Test run without disabling logs (my cpu)

After any operation, my processor writes a dump to the log file, where it describes in detail the state of my stack, as well as a dump of the binary buffer indicating which instruction it is currently on. All this is done using the fprintf() function in the log file

As you can see, we got very strange results. 99% of the program execution time is spent writing to a log file. Naturally, this cannot even come close to giving an understanding of the true performance of the program.

Test run with disabled logs (my cpu)

After disabling the logs, we get a performance gain of 177 times. Now we can say that our result reflects the performance of the program quite well.

Test run without logs with -O3 flag (my cpu)

We got a 4x performance gain just because of the -O3 flag and a 699x increase compared to the version without -O3 with logs.

Only now our program is ready for comparison!

Now, using the example of a binary translator, we will make sure that isolating the part responsible for execution is simply necessary.

Test run of the translator without isolation of the execution process from the translation process

The process of execution of the translated code takes 1% of the entire execution time of the program, the situation is similar to the case of logs included. It is impossible to evaluate the performance of such a program

Test run of the translator with isolation of the execution process

The total execution time of the program was reduced by 246 times compared to the non-isolated version.

Only now our program is ready for comparison!

Final performance comparison

My cpu	x86_64
82080908	153050

So, counting factorial 5 on an x86_64 architecture processor works

536 times

faster than on my processor. This is not surprising, because one x86_64 command is revealed in a huge number of commands of my processor. Even the -O3 compilation flag is not able to significantly affect this situation.

Installation

git clone https://github.com/jirol9xa/binaryTranslator

making project:

make

The command line argument must specify the name of the binary code file. Running:

./trans <file_name>

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
.vscode		.vscode
include		include
pics		pics
src		src
ASM_LOGS		ASM_LOGS
Binary		Binary
Makefile		Makefile
fctrl		fctrl
quatro		quatro
quatro2		quatro2
readme.md		readme.md
trans		trans

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Binary translator to binary code of x86 architecture

Basic concept

Calling native C functions

Performance comparison

Testing methodology and typical mistakes

Test run without disabling logs (my cpu)

Test run with disabled logs (my cpu)

Test run without logs with -O3 flag (my cpu)

Test run of the translator without isolation of the execution process from the translation process

Test run of the translator with isolation of the execution process

Final performance comparison

Installation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Binary translator to binary code of x86 architecture

Basic concept

Calling native C functions

Performance comparison

Testing methodology and typical mistakes

Test run without disabling logs (my cpu)

Test run with disabled logs (my cpu)

Test run without logs with -O3 flag (my cpu)

Test run of the translator without isolation of the execution process from the translation process

Test run of the translator with isolation of the execution process

Final performance comparison

Installation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages