15-Manipulate and Translate Machine and Assembly Code
Translation
The assembler's job is to read these binary patterns and translate them into a different
representation, namely the machine code of the target processor. Translation comprises four
main steps:
1. Parsing / lexical analysis. This step reads the bytes and tries to group them into
lexical tokens. In the case of an assembler, it groups bytes and determines whether they
represent mnemonics, register names, numbers, labels, directives, or other syntactic
pieces such as commas and colons.
You can compare this to understanding an English sentence by first breaking it
up into words, numbers, and punctuation, as opposed to treating it as one long,
undifferentiated sequence of letters.
(That is, the code is converted into bytes, and those bytes are checked for things such as registers and mnemonics.)
2. Syntactic analysis. This step groups up the tokens and tries to make sense of them.
Not all sequences of tokens are valid in assembly language. MOV EAX, EBX is a
valid x86 assembly statement. EAX, MOV EBX is not, even though it has the same
tokens, just in a different order.
3. Semantic analysis. At this step, the assembler determines the meaning of what you
wrote. The assembler now determines that the statement MOV EAX, EBX means you
want the processor instruction that moves between two registers, and you want it to
move between registers EAX and EBX. If you ask for an impossible instruction, it's at
this point the assembler notices. For example, MOV EIP, EAX is not a legal
instruction on x86, even though it's syntactically correct.
When reading English, you do the same thing when trying to understand the meaning
of a sentence. "Turn the radio on" asks you to perform an action on the radio. "Boil
the radio purple," on the other hand, is syntactically correct, but I'd be at a loss as to
what it means.
[This step works out the meaning of the logic that was written, i.e. what it is actually doing, such as a MOV.]
4. Code generation. Once the assembler has determined what all of your statements
mean, it can generate the actual stream of machine code 1s and 0s for your program.
Most assembly statements have a direct 1:1 translation to machine code, so the process
is fairly straightforward. Some assembly statements, such as assembler directives,
guide the overall process. And other aspects of assembly code, such as resolving
labels, require additional steps.
To return to the comparison with English, this is where you'd actually act on what you just
read. Most English sentences are straightforward. Some need additional context to be
fully understood. For example, pronouns in English need antecedents, just like label
references in assembly need label definitions.
[Code generation takes place in this step.]
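
The four steps above can be sketched as a toy assembler in Python. This is only an illustration, not how any real assembler is implemented: it handles a single statement shape (MOV between two 32-bit registers) and leaves out labels, directives, numbers, and colons entirely. The function names tokenize, parse, analyze, generate, and assemble are made up for this sketch. The encoding used in step 4 is the x86 form MOV r/m32, r32 (opcode 0x89 followed by a ModRM byte).

```python
import re

# Register numbers used in the x86 ModRM byte (32-bit general-purpose registers).
GPR_NUMBER = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
              "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
# EIP is a real register name, but it cannot be a MOV operand.
REGISTER_NAMES = set(GPR_NUMBER) | {"EIP"}
MNEMONICS = {"MOV"}  # this toy assembler knows only one mnemonic

# Optional whitespace, then either a word or a comma.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(,))")


def tokenize(line):
    """Step 1: group characters into (kind, text) tokens."""
    line = line.strip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize {line[pos:]!r}")
        pos = match.end()
        word, comma = match.groups()
        if comma:
            tokens.append(("comma", ","))
        elif word.upper() in MNEMONICS:
            tokens.append(("mnemonic", word.upper()))
        elif word.upper() in REGISTER_NAMES:
            tokens.append(("register", word.upper()))
        else:
            raise SyntaxError(f"unknown word {word!r}")
    return tokens


def parse(tokens):
    """Step 2: accept only the order: mnemonic register , register."""
    kinds = [kind for kind, _ in tokens]
    if kinds != ["mnemonic", "register", "comma", "register"]:
        raise SyntaxError(f"invalid token order: {kinds}")
    return tokens[0][1], tokens[1][1], tokens[3][1]


def analyze(statement):
    """Step 3: check that the requested instruction actually exists."""
    mnemonic, dest, src = statement
    for reg in (dest, src):
        if reg not in GPR_NUMBER:
            raise ValueError(f"{mnemonic} cannot use {reg} as an operand")
    return statement


def generate(statement):
    """Step 4: emit the bytes for MOV r/m32, r32."""
    _, dest, src = statement
    # ModRM: mod=11 (register direct), reg=source, rm=destination.
    modrm = (0b11 << 6) | (GPR_NUMBER[src] << 3) | GPR_NUMBER[dest]
    return bytes([0x89, modrm])


def assemble(line):
    """Run all four steps on one line of source text."""
    return generate(analyze(parse(tokenize(line))))


print(assemble("MOV EAX, EBX").hex())  # prints 89d8
```

With this sketch, EAX, MOV EBX is rejected in parse (step 2) because the tokens are in the wrong order, while MOV EIP, EAX passes steps 1 and 2 and is rejected only in analyze (step 3), matching the distinction drawn in the list above.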