SC : EP02 : DISCOVERING AND EXPLOITING : SHELLCODE --- ASSEMBLY LANGUAGE INTRODUCTION
THOUGHT OF AUTHOR : Without an understanding of assembly language you can't move further, so I decided to first cover some basics of assembly language. If you already know how to program in assembly language, then please move on.
#START:EP02:
Assembly language Introduction: Assembly language is a low level programming language for a computer. Each computer has a microprocessor that manages the computer's arithmetical, logical and control activities. Each processor has its own set of instructions for various operations. These instructions are called 'machine language instructions'. Processors understand only strings of 1's and 0's, which are the machine language instructions themselves. Assembly language is therefore designed for a specific family of processors and represents their instructions in symbolic code, which is easier to understand.
The processor supports the following data sizes:
An 8-bit value is called a BYTE, which is 8 bits wide and composed of zeros (0) and ones (1)
A 16-bit value is called a WORD, which is two bytes wide
A 32-bit value is called a DWORD, which stands for double word and is 4 bytes wide
A 64-bit value is called a QWORD, which stands for quad word and is 8 bytes wide
An 80-bit value is called a TWORD, which stands for ten bytes and is 10 bytes wide
A 128-bit value is called a DQWORD, which stands for double quad word and is 16 bytes wide
Integer Constants: An integer constant contains an optional leading sign, one or more digits and an optional suffix character (called a radix) indicating the number base. The radix can also be written as a prefix after a leading zero, as in 0b, 0o and 0x.
Radix definitions :
b or y Binary representation
d or t Decimal representation
q or o Octal representation
h or x Hexadecimal representation
Example with radix:
10010111b ; Binary representation
0b10010111 ; Binary with prefix radix
210q ; Octal representation
0o210 ; Octal with prefix radix
200 ; Decimal representation by default
200d ; Decimal representation
0d7h ; Hexadecimal representation
0xd7 ; Hexadecimal with prefix radix
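To check what value each of the constants above stands for, here is a small Python sketch that interprets the suffix and prefix radix forms. The function name parse_int_constant is made up for this example, and the rule that the larger radix wins when both a prefix and a suffix seem to be present (as in 0d7h) is an assumption chosen so that the examples above parse the way their comments say. It is not the assembler's actual parser.

```python
# Radix letters from the table above, mapped to their number base.
RADIX_LETTERS = {"b": 2, "y": 2, "q": 8, "o": 8,
                 "d": 10, "t": 10, "h": 16, "x": 16}


def parse_int_constant(text):
    """Return the integer value of an assembler-style integer constant.

    Hypothetical helper for illustration only.
    """
    s = text.strip().lower()
    sign = 1
    if s and s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        s = s[1:]

    # Prefix form: a leading zero followed by a radix letter (0b, 0o, 0d, 0x, ...).
    prefix_radix = RADIX_LETTERS.get(s[1], 0) if len(s) > 2 and s[0] == "0" else 0
    # Suffix form: a trailing radix letter (b, q, d, h, ...).
    suffix_radix = RADIX_LETTERS.get(s[-1], 0) if len(s) > 1 else 0

    # Assumption: when both forms seem to match, the larger radix wins,
    # so 0d7h is read as hexadecimal 0d7 and not as decimal "7h".
    if prefix_radix > suffix_radix:
        radix, digits = prefix_radix, s[2:]
    elif suffix_radix > prefix_radix:
        radix, digits = suffix_radix, s[:-1]
    else:
        radix, digits = 10, s

    return sign * int(digits, radix)


if __name__ == "__main__":
    for constant in ["10010111b", "0b10010111", "210q", "0o210",
                     "200", "200d", "0d7h", "0xd7"]:
        print(constant, "=", parse_int_constant(constant))
```

Running it shows that both binary forms give 151, both octal forms give 136, and both hexadecimal forms give 215.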
Floating point constants : Floating point constants are expressed in the form :
[sign] [digits][period(decimal point)][digits(optional)][e(optional)][exponent(optional)]
Example of valid floating point number constants:
-0.3
2.8
32.e+76
28.88e+345
Character Constants: A character constant is one to eight bytes long and is enclosed in single or double quotation marks. Character constants with more than one byte are stored with the least significant byte at the lower memory address and the most significant byte at the higher address, so the characters appear in memory in the order they were written.
Example of character constants:
's' ; Single character constant
"e" ; Single character constant
'abcdefgh' ; Eight byte character constant
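The byte order described above can be checked with a short Python sketch. The helper name char_constant_value is invented for this example, and ASCII characters are assumed. It treats the first character as the least significant byte, which is what "least significant byte at the lower memory address" means for a little-endian value.

```python
def char_constant_value(text):
    """Return the numeric value of a 1 to 8 byte character constant.

    Hypothetical helper: the first character becomes the least
    significant byte, matching little-endian storage.
    """
    data = text.encode("ascii")
    if not 1 <= len(data) <= 8:
        raise ValueError("character constant must be 1 to 8 bytes long")
    return int.from_bytes(data, "little")


if __name__ == "__main__":
    value = char_constant_value("abcdefgh")
    print(hex(value))                   # 0x6867666564636261
    print(value.to_bytes(8, "little"))  # b'abcdefgh'
```

Note that 'a' (0x61) ends up in the lowest byte of the value, so when the value is written to memory the bytes read a, b, c, ... from the lowest address upward.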
String Constants: String constants are represented as a sequence of characters including spaces enclosed in single or double quotations.
Examples of string constants :
"assembly language" ; String constant
'a','s','s','e','m','b','l','y',' ','l','a','n','g','u','a','g','e' ; Equivalent string constant
Unicode strings are defined using the special operators '__utf16__' and '__utf32__'. These operators take UTF-8 formatted strings and convert them to UTF-16 and UTF-32 respectively.
Examples of unicode string constants:
__utf16__('assembly language') ; UTF-16 string constant
__utf32__('assembly language') ; UTF-32 string constant
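To get a feel for what these conversions produce, the following Python sketch encodes the same text as UTF-16 and UTF-32 and compares the sizes. The helper names to_utf16 and to_utf32 are made up here, and little-endian byte order is an assumption (it is the natural order on x86); check your assembler's manual for the exact behaviour.

```python
def to_utf16(text):
    """Encode text as UTF-16, little-endian, without a byte order mark (assumed order)."""
    return text.encode("utf-16-le")


def to_utf32(text):
    """Encode text as UTF-32, little-endian, without a byte order mark (assumed order)."""
    return text.encode("utf-32-le")


if __name__ == "__main__":
    text = "assembly language"
    print(len(text.encode("utf-8")))  # 17 bytes, one per character
    print(len(to_utf16(text)))        # 34 bytes, two per character
    print(len(to_utf32(text)))        # 68 bytes, four per character
    print(to_utf16(text)[:4].hex())   # 61007300 -> 'a', 's' as 16-bit units
```

For plain ASCII text like this, each character simply grows from one byte to two bytes (UTF-16) or four bytes (UTF-32), with the extra bytes set to zero.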
Escape sequences are recognized only in backquoted strings:
\' Single quote (')
\" Double quote (")
\` Backquote (`)
\\ Backslash (\)
\? Question mark (?)
\a BEL (ASCII 7)
\b BS (ASCII 8)
\t TAB (ASCII 9)
\n LF (ASCII 10)
\v VT (ASCII 11)
\f FF (ASCII 12)
\r CR (ASCII 13)
\e ESC (ASCII 27)
\377 Up to 3 octal digits – literal byte
\xFF Up to 2 hexadecimal digits – literal byte
\u1234 4 hexadecimal digits – Unicode character
\U12345678 8 hexadecimal digits – Unicode character
Reserved words : Reserved words are predefined words that have a special meaning in their correct context. They include:
*Register names such as EAX, EIP, and ST0
*Instruction mnemonics such as MOV, ADD, and XOR
*Pseudo-Instructions (attributes) such as BYTE, WORD, and DB
*Operators used in expressions such as '<<', '+', and '/'
*Predefined symbols such as '$', '$$', '%', '%%', '?', and '@'
Identifiers : Identifiers are programmer-chosen names for variables, constants, procedures and labels. Identifiers can contain letters, numbers, underscores, $, #, @, ~, . and ?. Periods and question marks have special meaning.
***Avoid using the @ symbol in your identifiers, as it is used by the assembler as a prefix for predefined symbols.
Directives : Directives are instructions to the assembler to define local segments, define variables, create procedures etc. They are also called pseudo-instructions. Directives do not execute at run-time and are not case-sensitive, i.e. .data, .DATA and .Data are the same.
Example : assLang DWORD 26 ; DWORD Directive
Labels : A label is an identifier that acts as a place-holder to mark a location in your code, or to identify a variable. When your code is assembled, each instruction field is assigned a numeric offset address. A label placed in front of an instruction field stands for that field's offset in the CODE section. Similarly, a label placed in front of a variable stands for the variable's offset address in the DATA section. Labels make it easy to identify variables. You can think of labels as pointers containing an address. Valid characters in labels are letters, numbers, _, $, #, @, ., and ?. The only characters which may be used as the first character of an identifier are letters, periods ('.'), underscores ('_'), and question marks ('?'). A colon can be placed at the end of a label to make it stand out better in your code.
Example data labels:
integer_number dd 12345 ; DWORD integer
string_example db 'Assembly is fun!' ; BYTE character string
A label can be prefixed with a '$' symbol to indicate that it is intended to be used as an identifier and not a reserved word. This also ensures that if you define a symbol called eax for example, you can refer to $eax in your assembly code to distinguish the symbol from the register.
Example of using the '$' symbol in your code:
$eax dw 5280 ; $ used to indicate eax is a variable and not a register
$loop: ; $ used to indicate loop is a label and not a loop instruction
$align: ; $ used to indicate align is a function and not an instruction
Mnemonics : A mnemonic instruction is a MASM keyword that identifies an operation to be carried out by the processor. Intel has published the IA-32 instruction set in its multi-volume Intel 64 and IA-32 Architectures Software Developer's Manuals. Examples of mnemonic instructions include MOV, ADD, JMP, CALL, etc. All mnemonics are covered in later chapters; don't worry about their meanings right now.
Operands : MASM defines four types of operands:
REGISTER: These kinds of operands refer directly to the contents of the processor's built-in registers. Examples include: EAX, DL, EBX, etc.
MEMORY: These operands refer to memory locations and the data contained therein. Usually the address of the data is given in the instruction as an offset from the beginning of a segment specified by one of the segment registers. Examples include: [myData], [myLabel], etc.
CONSTANT: These operands represent fixed data values (constants) that are listed in the instruction itself and not contained in the data segment of your program. Examples include: "This string field", 3.14, etc.
INFERRED: These operands are not explicitly shown in your code; they are implied. As an example, the increment instruction doesn't display a 1. The 1 is implied, so that when an increment operation is executed, the value in the register is automatically increased by 1.
Example of labels, mnemonics, operands, and comments:
LABEL   MNEMONIC   OPERANDS           COMMENT
L1:     xor        eax, ecx           ; EAX and ECX are register operands
        mov        eax, [myString]    ; Register and memory operands
        add        eax, 100           ; 100 is an immediate (constant) operand
        inc        eax                ; Increment EAX by 1 is inferred
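If it helps to see the four operand types in action, here is a toy Python model of the four instructions above. It is only a sketch: the function run_example, the dictionaries standing in for registers and memory, and the starting values are all invented for illustration, and real processor details such as flags are ignored. Registers are treated as 32-bit values that wrap around.

```python
MASK32 = 0xFFFFFFFF  # 32-bit registers wrap around at this value


def run_example(ecx, my_string_value):
    """Toy model of the four example instructions (hypothetical helper)."""
    regs = {"eax": 0, "ecx": ecx}
    memory = {"myString": my_string_value}  # the DWORD stored at label myString

    # xor eax, ecx      ; both operands are registers
    regs["eax"] = (regs["eax"] ^ regs["ecx"]) & MASK32
    # mov eax, [myString] ; register destination, memory source
    regs["eax"] = memory["myString"] & MASK32
    # add eax, 100      ; 100 is a constant written in the instruction
    regs["eax"] = (regs["eax"] + 100) & MASK32
    # inc eax           ; the 1 is inferred, it never appears in the code
    regs["eax"] = (regs["eax"] + 1) & MASK32
    return regs


if __name__ == "__main__":
    print(run_example(ecx=7, my_string_value=0x1000))  # eax ends up as 4197
```

Notice that the result of the xor is overwritten by the mov that follows it, because both instructions write to EAX.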
Comments: Assembly code comments begin with a semi-colon and can be placed at the end of an instruction field or on a line by themselves. Everything after the semi-colon is ignored by the assembler. Since comments take up no room in assembled code, it is good programming practice to use comments to explain your code.
#END:EP02:
****To be continued in the next episode. :)
Thanks for visiting us.
Find us on Facebook : https://www.facebook.com/Sourceinfo
Find us on Youtube : https://www.youtube.com/channel/UCx3DsLftaibO_ErCdTwGbCA
#UNITED_BY_ONE#
#DIVIDED_BY_ZERO#
#PEACE

