An Assembly Language Program which converts a string to its ASCII values and stores them in an array

By | May 28, 2014

Now we will write another Assembly Language program, which reads a string from the user and stores the ASCII value of each character in an array.

Let’s identify variables needed for this program.
The first variable is the input buffer that holds the string entered by the user. It is declared as P1 LABEL BYTE, M1 DB 0FFH, L1 DB ? and P11 DB 0FFH DUP ('$'), and the characters themselves are stored in P11. The second variable is ARRAY, into which the ASCII value of each character is copied. The third variable holds the message 'ENTER FIRST STRING :- $' that is printed for the user. So there are three variables in all: P11, ARRAY and MSG1.

First Line – DATA SEGMENT

DATA SEGMENT is the starting point of the Data Segment in a program. DATA is the name given to this segment and SEGMENT is the keyword for defining segments. This is where we declare our variables.

Next Line – MSG1 DB 10,13,'ENTER FIRST STRING :- $'
   
    ARRAY DB 50 DUP ('$')
   
    P1 LABEL BYTE
    M1 DB 0FFH
    L1 DB ?
    P11 DB 0FFH DUP ('$')

ARRAY DB 50 DUP ('$') declares an array of 50 bytes, each initialized with '$'. The '$' character is the end-of-string marker for the DOS print-string function (INT 21H with AH = 9), so it plays the role that the NUL character ('\0') plays in a C program; it is not a new-line character. Each character occupies one byte, hence we use DB (Define Byte). Here 50 is the size of the array, and DUP stands for Duplicate, i.e. the value in the brackets ('$') is repeated in every element of the array. Note that ARRAY holds only 50 bytes, so the program as written is safe only for strings of at most 50 characters.

P1 LABEL BYTE, M1 DB 0FFH, L1 DB ? and P11 DB 0FFH DUP ('$') together declare the input buffer for a string of variable length, in the layout that DOS function 0AH expects. P1 is a label of type BYTE that marks the start of the buffer; it reserves no storage of its own. M1 holds the maximum number of characters the buffer can take (0FFH = 255, counting the carriage return that ends the input). L1 receives the actual length of the string entered by the user. P11 is the name by which the program refers to the characters themselves. As we do not know the length of the string in advance, we reserve 255 bytes, again filled with '$' by 0FFH DUP ('$').

A program that reads more than one string declares further buffers in the same way, for example P2 LABEL BYTE, M2 DB 0FFH, L2 DB ?, P22 DB 0FFH DUP ('$'). This program needs only one.

MSG1 DB 10,13,'ENTER FIRST STRING :- $' declares a character array holding the prompt. The leading 10 and 13 are the line feed and carriage return codes, which move the cursor to the start of a new line, and the final '$' marks the end of the string. Further messages (MSG2, MSG3 and so on) would be declared in the same way, but this program uses only MSG1.

Next Line – DATA ENDS

DATA ENDS is the end point of the Data Segment in a program. Some assemblers accept just ENDS, but repeating the name given to the Data Segment makes it clear which segment is being closed.

Now, the selection of the data type: each character of the string is stored as one byte, so DB is sufficient for all our variables.

DATA SEGMENT
    MSG1 DB 10,13,'ENTER FIRST STRING :- $'
   
    ARRAY DB 50 DUP ('$')
   
    P1 LABEL BYTE
    M1 DB 0FFH
    L1 DB ?
    P11 DB 0FFH DUP ('$')
  
DATA ENDS

In Assembly programming, the size of every variable is defined in bytes:

DB – Define Byte  (Size – 1 Byte)

DW – Define Word  (Size – 2 Bytes)

DD – Define Double word  (Size – 4 Bytes)

DQ – Define Quad word  (Size – 8 Bytes)

DT – Define Ten Bytes  (Size – 10 Bytes)

The NUMBER SYSTEMS used in Assembly Programming are Decimal, Octal, Hexadecimal and Binary.

In a program we write numeric values for variables and instructions, so the assembler must know in which number system each value is written. A letter suffix is used for this: O or o stands for Octal, H or h for Hexadecimal, B or b for Binary, and D or d for Decimal. If you do not specify any letter, the number is taken to be Decimal (the default). A hexadecimal number that begins with a letter must be written with a leading 0, which is why the program writes 0FFH and 0AH.
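As a small illustration of the suffixes, the fragment below writes the value 255 in each number system, followed by the two control codes used in MSG1. The variable names (V_DEC, V_HEX and so on) are made up for this example, and the O suffix is assumed to be accepted by your assembler (MASM-style syntax).

```asm
DATA SEGMENT
    ; The same value, 255, written in each number system
    V_DEC DB 255            ; decimal, no suffix needed
    V_HEX DB 0FFH           ; hexadecimal, leading 0 because it starts with a letter
    V_BIN DB 11111111B      ; binary
    V_OCT DB 377O           ; octal

    ; The control codes that start MSG1
    V_LF  DB 10             ; line feed, the same value as 0AH
    V_CR  DB 13             ; carriage return, the same value as 0DH
DATA ENDS
```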

MACROS

Macros look like procedures, but they are not the same. A macro exists only until your code is assembled; during assembly every use of the macro is replaced with the real instructions of its body. If you declare a macro and never use it in your code, the assembler simply ignores it.

DISPLAY MACRO MSG
    MOV AH,9
    LEA DX,MSG
    INT 21H
ENDM   

DISPLAY :- is the Name (Identifier) of the Macro. MACRO is the Keyword used. MSG is the Argument passed.
    MOV AH,9       }
    LEA DX,MSG     }      :- the code inside the macro
    INT 21H        }
ENDM   :- is the end of the Macro.

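Code that is needed many times is written inside a macro to reduce the length of the source. The body used here calls function 9 of INT 21H, which prints the string whose address is in DX up to, but not including, the '$' character.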
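To make the replacement concrete, this is roughly what the assembler generates for the call DISPLAY MSG1 used later in the program. It is a sketch of the expansion, not output copied from an assembler listing.

```asm
; Source line:
;         DISPLAY MSG1
; Instructions the assembler puts in its place:
        MOV AH,9        ; DOS function 9: print a '$'-terminated string
        LEA DX,MSG1     ; the parameter MSG has been replaced by MSG1
        INT 21H         ; call DOS
```

Every further DISPLAY line in the source would produce another copy of these three instructions, each with its own argument.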

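DATA SEGMENT
    MSG1 DB 10,13,'ENTER FIRST STRING :- $'   ; prompt, ends with '$'
   
    ARRAY DB 50 DUP ('$')                     ; destination for the ASCII values
   
    P1 LABEL BYTE                             ; start of the input buffer
    M1 DB 0FFH                                ; maximum length accepted
    L1 DB ?                                   ; actual length, filled in by DOS
    P11 DB 0FFH DUP ('$')                     ; the characters entered
  
DATA ENDS
DISPLAY MACRO MSG
    MOV AH,9
    LEA DX,MSG
    INT 21H
ENDM   
CODE SEGMENT
    ASSUME CS:CODE,DS:DATA
START:
        MOV AX,DATA
        MOV DS,AX               ; DS now points to the data segment
               
        DISPLAY MSG1            ; print the prompt
       
        LEA DX,P1               ; DX = address of the input buffer
        MOV AH,0AH              ; DOS function 0AH: buffered input
        INT 21H           
                            
        LEA SI,P11              ; SI = source (characters entered)
        LEA DI,ARRAY            ; DI = destination
               
        MOV CL,L1               ; CX = number of characters entered
        MOV CH,0
       
COPY1:  MOV AL,[SI]             ; read one character (its ASCII value)
        MOV [DI],AL             ; store it in ARRAY
  
        INC DI
        INC SI
        LOOP COPY1              ; repeat until CX = 0
      
        MOV AH,4CH              ; DOS function 4CH: exit to DOS
        INT 21H
CODE ENDS
END START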

Explanation : 

In Assembly Language programming, a single program can be divided into four segments: 1. Data Segment, 2. Code Segment, 3. Stack Segment, and 4. Extra Segment. Of these, only the Code Segment is compulsory, and it is enough on its own if your program needs no variables. If you need variables, you will need two segments, i.e. the Code Segment and the Data Segment.

Next Line – CODE SEGMENT

CODE SEGMENT is the starting point of the Code Segment in a program. CODE is the name given to this segment and SEGMENT is the keyword for defining segments. This is where we write the instructions of the program.

Next Line –     ASSUME CS:CODE,DS:DATA

The processor has different segment registers for different purposes. ASSUME tells the assembler that the CS register will refer to the segment named CODE and the DS register to the segment named DATA (SS and ES are assumed in the same way as CS and DS when they are used). ASSUME does not load any register; it only tells the assembler which segment register to use when it addresses a variable.

Next Line – START:

START is the label that marks the starting point of the code written in the Code Segment. A colon (:) is used to define a label, as in C programming.

Next Line – MOV AX,DATA
MOV DS,AX

After the ASSUME, it is still compulsory to load the address of the Data Segment into the DS register. MOV is the instruction that copies the second operand into the first operand. We cannot move DATA directly into DS, because the 8086 has no MOV instruction that loads a segment register with a constant; hence we move DATA to AX and then AX to DS. AX is the accumulator, a general-purpose register that is convenient for this. This part is called INITIALIZATION OF DATA SEGMENT, and it is important because it makes the variables in the Data Segment accessible. The other segment registers do not need to be initialized in this program: CS is set by DOS when the program is loaded, and ES is not used, so for them the ASSUME is enough.

Next Line –  DISPLAY MSG1

 DISPLAY MSG1 calls the macro DISPLAY with the argument MSG1. This displays the string MSG1 on the screen.

Next Line – LEA DX,P1
        MOV AH,0AH    
        INT 21H

The above three lines read the string typed by the user into the variable-length input buffer that starts at the label P1; the characters themselves end up in P11.

Now, let's understand it line by line.

LEA DX,P1: LEA stands for LOAD EFFECTIVE ADDRESS, and it loads the effective address (offset) of the second operand into the first operand. The same thing can be written as MOV DX, OFFSET P1, where OFFSET gives the effective address and MOV copies the second operand into the first.
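The two forms can be put side by side. For a plain variable such as P1 they leave the same value in DX; the comments describe the general difference between the two instructions.

```asm
        LEA DX,P1           ; the address of P1 is computed when the instruction runs
        MOV DX,OFFSET P1    ; the assembler puts the offset of P1 in the instruction as a constant
```

Only one of the two lines is needed in a program. LEA becomes necessary when the address depends on a register, for example LEA DX,[SI+2], which OFFSET cannot express.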

MOV AH,0AH    
INT 21H

The above two lines read the string typed by the user into the buffer whose address is in DX.

Standard input and standard output services are reached through INT 21H, which is also called the DOS interrupt. The service performed depends on the value in the AH register. If the value is 0AH, DOS reads a line from the keyboard into the buffer whose address is in DX. The first byte of the buffer (M1) tells DOS the maximum number of characters to accept, DOS writes the number of characters actually typed into the second byte (L1), and the characters start at the third byte (P11), followed by the carriage return (0DH) produced by the Enter key.
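As an illustration, suppose the user types AB and presses Enter. After INT 21H returns, the buffer holds the same bytes as the definitions below would produce. This is a sketch of the memory contents based on the documented behaviour of function 0AH, not a replacement for the declarations in the program.

```asm
; Input buffer after the user has typed AB and pressed Enter
P1  LABEL BYTE
M1  DB 0FFH              ; maximum size, written by the program, unchanged
L1  DB 2                 ; written by DOS: 2 characters (Enter is not counted)
P11 DB 41H,42H           ; ASCII values of 'A' and 'B'
    DB 0DH               ; carriage return stored after the text
    DB 0FCH DUP ('$')    ; the remaining 252 bytes keep their initial '$'
```

With this input the copy loop runs with CX = 2 and places 41H and 42H in the first two bytes of ARRAY.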

Next Line – LEA SI,P11
        LEA DI,ARRAY             
        MOV CL,L1
        MOV CH,0

The above four lines load the address of P11 into the SI register and the address of ARRAY into the DI register. MOV CL,L1 copies L1 (the actual length of the string entered) into CL, and MOV CH,0 clears CH. CL and CH are the low and high halves of CX, so CX now holds the number of characters to copy.

Next Line – COPY1:  MOV AL,[SI]
        MOV [DI],AL

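COPY1: is a label; every name followed by a colon (:) is a label. MOV AL,[SI] moves the byte at the address held in SI into AL, and MOV [DI],AL moves AL to the address held in DI, which copies one character from [SI] to [DI]. A character is stored in memory as its ASCII value, so after the copy ARRAY holds the ASCII values of the string (for example 41H for 'A').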

Next Line –  INC DI
        INC SI

 INC DI increments the value in the DI register by one, so that it points to the next element of ARRAY. Similarly, INC SI moves SI to the next character of the string.

Next Line – LOOP COPY1

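This is the end of the loop. Assembly language has a LOOP instruction, which works with two helpers: a label and a counter. The loop starts at the label and ends with a LOOP instruction that names the same label. Each time LOOP is executed it decrements the CX register (CX is also called the COUNTER) and jumps back to the label if CX is not zero. Because CX was loaded with the length of the string, the loop body runs once per character. Note that the decrement happens before the test, so if CX is 0 on entry (the user pressed Enter without typing anything) the loop runs 65536 times instead of not at all.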
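The program as written trusts the user to type between 1 and 50 characters. An empty input leaves CX at 0, which makes LOOP run 65536 times, and a longer input writes past the end of ARRAY. The fragment below is one possible way to guard the copy loop. It is a sketch, not part of the original program: ARRAY_SIZE and DONE are names added for this example, and SI and DI are assumed to have been loaded with LEA SI,P11 and LEA DI,ARRAY as before.

```asm
ARRAY_SIZE EQU 50               ; must match ARRAY DB 50 DUP ('$')

        MOV CL,L1
        MOV CH,0                ; CX = number of characters entered
        JCXZ DONE               ; nothing typed: skip the loop entirely

        CMP CX,ARRAY_SIZE
        JBE COPY1               ; the string fits in ARRAY: copy all of it
        MOV CX,ARRAY_SIZE       ; too long: copy only the first 50 characters

COPY1:  MOV AL,[SI]
        MOV [DI],AL
        INC DI
        INC SI
        LOOP COPY1

DONE:   MOV AH,4CH
        INT 21H
```

Truncating to 50 characters is a design choice made for this sketch; lowering M1 to the size of ARRAY, so that DOS itself refuses longer input, would be another.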

Next Line –  MOV AH,4CH
      INT 21H

The above two lines exit to DOS, i.e. return to the operating system. As before, the service is selected by the value in AH when INT 21H is called. If the value is 4CH, the program terminates and control returns to DOS, which is the end of the program. The value in AL is passed back to DOS as the program's return code.
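The original program does not set AL before exiting, so the return code is whatever the copy loop left there (the last character copied). If a defined return code is wanted, AL can be loaded explicitly, as in this small variation:

```asm
        MOV AH,4CH      ; DOS function 4CH: terminate the program
        MOV AL,0        ; return code 0, passed back to DOS
        INT 21H
```

The two MOV instructions can also be combined into a single MOV AX,4C00H.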

Next Line – CODE ENDS

CODE ENDS is the end point of the Code Segment in a program. Some assemblers accept just ENDS, but repeating the name given to the Code Segment makes it clear which segment is being closed.

Last Line – END START

END START marks the end of the source file; the assembler ignores anything written after it. The label written after END (here START) tells the assembler where execution should begin when the program is loaded.

Note :- Assembly Language programs for DOS can be built in COM format or EXE format. We are learning the EXE format only, which is simpler than the COM format to understand and write. We can write the program in lower or upper case, but I prefer upper case.

Screen Shots :-

 Asm_program_String_Ascii_to_Array

Output After Execution :-

Asm_program_String_Ascii_to_Array_Out

Output Variable Values After Execution :-

Asm_program_String_Ascii_to_Array_Vble

Note :- To see the variables and their values, click the vars button in the emulator.
