An Assembly Language Program to search for a character in a given string and calculate the number of occurrences of the character in the given string

By | May 30, 2014

Now we will write another Assembly Language Program, this time to search for a character in a given string and count the number of occurrences of that character in the string.

Let’s identify variables needed for this program.
The first variable holds the string entered by the user. It is declared as an input buffer (P1 LABEL BYTE, M1 DB 0FFH, L1 DB ?, P11 DB 0FFH DUP ('$')), and the characters themselves are stored in P11. CHAR holds the character entered by the user, and COUNT is used for calculating the number of occurrences. The other variables hold the messages to be printed for the user: 'ENTER ANY STRING :- $', 'ENTER ANY CHARACTER :- $', a line break ' $', 'NO, CHARACTER FOUND IN THE GIVEN STRING $' and ' CHARACTER(S) FOUND IN THE GIVEN STRING $'. So in all there are eight variables: P11, CHAR, COUNT, MSG1, MSG2, MSG3, MSG4 and MSG5.

First Line – DATA SEGMENT

DATA SEGMENT is the starting point of the Data Segment in a program. DATA is the name given to this segment and SEGMENT is the keyword for defining segments. This is where we declare our variables.

Next Line – MSG1 DB 10,13,'ENTER ANY STRING :- $'
    MSG2 DB 10,13,'ENTER ANY CHARACTER :- $'
    MSG3 DB 10,13,' $'
    MSG4 DB 10,13,'NO, CHARACTER FOUND IN THE GIVEN STRING $'
    MSG5 DB ' CHARACTER(S) FOUND IN THE GIVEN STRING $'
    CHAR DB ?
    COUNT DB 0
    P1 LABEL BYTE
    M1 DB 0FFH
    L1 DB ?
    P11 DB 0FFH DUP ('$')

P1 LABEL BYTE, M1 DB 0FFH, L1 DB ? and P11 DB 0FFH DUP ('$') together declare the input buffer for a string of variable length (i.e. the user can enter a string of any length up to the maximum). P1 is a label of type BYTE that marks the start of the buffer; it names an address and reserves no storage of its own. M1 holds the maximum length the buffer can accept (0FFH, i.e. 255, which includes the final Enter key). L1 receives the actual length of the string entered by the user. P11 is the name used for the character array in the program. A character is one byte in size, hence we use DB (Define Byte), and as we don't know the length of the string in advance we reserve the maximum of 255 bytes. In 0FFH DUP ('$'), 0FFH is the size of the array and DUP stands for Duplicate, i.e. every element of the array is filled with the value present in the brackets ('$').

CHAR is initialized to ? (? means that no initial value is assigned) and COUNT to 0 (zero).

MSG1 DB 10,13,'ENTER ANY STRING :- $' declares a character array initialized with a line feed (10), a carriage return (13) and the text 'ENTER ANY STRING :- $'. The $ is not printed; it marks the end of the string for the DOS print-string function (INT 21H with AH = 9), much as the NULL character (\0) ends a string in a C program. MSG2, MSG3, MSG4 and MSG5 are declared in the same way. MSG5 has no leading 10,13 because it is printed on the same line as the count.

Next Line – DATA ENDS

DATA ENDS is the end point of the Data Segment in a program. ENDS is written together with the name given to the segment (here DATA), so that it is clear which segment is being closed.

Now, the selection of data type: every item in this program is a character or a small count, each of which fits in one byte, so DB is sufficient.

[codesyntax lang="asm"]

DATA SEGMENT
    MSG1 DB 10,13,'ENTER ANY STRING :- $'
    MSG2 DB 10,13,'ENTER ANY CHARACTER :- $'
    MSG3 DB 10,13,' $'
    MSG4 DB 10,13,'NO, CHARACTER FOUND IN THE GIVEN STRING $'
    MSG5 DB ' CHARACTER(S) FOUND IN THE GIVEN STRING $'
    CHAR DB ?
    COUNT DB 0
    P1 LABEL BYTE
    M1 DB 0FFH
    L1 DB ?
    P11 DB 0FFH DUP ('$')

DATA ENDS

[/codesyntax]

In Assembly programming, variables are defined by their size in bytes.

DB – Define Byte  (Size – 1 Byte)

DW – Define Word  (Size – 2 Bytes)

DD – Define Double word  (Size –  4 Bytes)

DQ – Define Quad word  (Size – 8 Bytes)

DT – Define Ten Bytes  (Size – 10 Bytes)

The number systems used in Assembly programming are decimal, octal, hexadecimal and binary.

In a program we enter values for variables and perform arithmetic operations such as addition, subtraction, multiplication and division, so the assembler must know in which number system each value is written. Hence a different letter is written after the number for each number system: O or o stands for octal, H or h for hexadecimal, B or b for binary and D or d for decimal. The default number system is decimal: if you do not specify any letter, the number is understood to be decimal.
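
As a small illustration of these suffixes, the declarations below store the same value, 65 (the ASCII code of 'A'), four times. This is only a sketch: the segment name NUMS and the variable names are made up for this example, and the O suffix for octal is assumed to be accepted by your assembler (MASM-style assemblers also accept Q).

[codesyntax lang="asm"]

NUMS SEGMENT
    N_DEC DB 65           ; decimal, no suffix needed (D is optional)
    N_HEX DB 41H          ; hexadecimal
    N_BIN DB 01000001B    ; binary
    N_OCT DB 101O         ; octal
NUMS ENDS

[/codesyntax]

A hexadecimal number that begins with a letter must be written with a leading zero, which is why the program uses 0FFH and 0AH rather than FFH and AH.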

MACROS

Macros look like procedures, but they are not. A macro exists only until your code is assembled; during assembly every use of the macro is replaced with the real instructions it contains. If you declare a macro and never use it in your code, the assembler will simply ignore it.

DISPLAY MACRO MSG
    MOV AH,9
    LEA DX,MSG
    INT 21H
ENDM   

DISPLAY :- is the Name (Identifier) of the Macro. MACRO is the Keyword Used. MSG is the Argument Passed.
    MOV AH,9             }
    LEA DX,MSG        }      :- /* code inside macro */
    INT 21H                  }
ENDM   :- is the end of Macro.

Code that is used many times in a program is written inside a macro to reduce the length of the source code.
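
To make the replacement concrete, here is roughly what the assembler generates for the call DISPLAY MSG1 used in this program: the argument MSG is substituted by MSG1 and the three instructions are placed inline. This is a sketch of the expansion, not output copied from an assembler listing.

[codesyntax lang="asm"]

        ; DISPLAY MSG1 expands to:
        MOV AH,9          ; DOS function 9: print a '$'-terminated string
        LEA DX,MSG1       ; DX = offset of MSG1 (DS must already point to DATA)
        INT 21H           ; call DOS

[/codesyntax]

Every call is expanded like this, so each use of DISPLAY adds another copy of these three instructions to the program. A procedure (PROC) would keep a single copy, at the cost of a CALL and a RET each time it is used.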

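[codesyntax lang="asm"]

DATA SEGMENT
    MSG1 DB 10,13,'ENTER ANY STRING :- $'
    MSG2 DB 10,13,'ENTER ANY CHARACTER :- $'
    MSG3 DB 10,13,' $'
    MSG4 DB 10,13,'NO, CHARACTER FOUND IN THE GIVEN STRING $'
    MSG5 DB ' CHARACTER(S) FOUND IN THE GIVEN STRING $'
    CHAR DB ?
    COUNT DB 0
    P1 LABEL BYTE               ; start of the input buffer for function 0AH
    M1 DB 0FFH                  ; maximum length accepted
    L1 DB ?                     ; actual length, filled in by DOS
    P11 DB 0FFH DUP ('$')       ; the characters entered
DATA ENDS
DISPLAY MACRO MSG
    MOV AH,9                    ; print '$'-terminated string at DS:DX
    LEA DX,MSG
    INT 21H
ENDM
CODE SEGMENT
    ASSUME CS:CODE,DS:DATA
START:
        MOV AX,DATA             ; initialize the Data Segment
        MOV DS,AX

        DISPLAY MSG1

        LEA DX,P1               ; read the string into the buffer at P1
        MOV AH,0AH
        INT 21H

        DISPLAY MSG2

        MOV AH,1                ; read one character with echo
        INT 21H
        MOV CHAR,AL

        DISPLAY MSG3

        LEA SI,P11              ; SI points to the first character

        MOV CL,L1               ; CX = length of the string
        MOV CH,0
        JCXZ NOTFOUND           ; empty string: nothing to search

CHECK:
        MOV AL,[SI]
        CMP CHAR,AL
        JNE SKIP
        INC COUNT               ; match found
SKIP:
        INC SI                  ; next character
        LOOP CHECK

        CMP COUNT,0
        JE NOTFOUND

        DISPLAY MSG3

        MOV DL,COUNT            ; print COUNT as one ASCII digit (0 to 9 only)
        ADD DL,30H
        MOV AH,2
        INT 21H

        DISPLAY MSG5
        JMP EXIT
NOTFOUND:
        DISPLAY MSG4

EXIT:   MOV AH,4CH              ; return to DOS
        INT 21H
CODE ENDS
END START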

[/codesyntax]

Explanation : 

In Assembly Language Programming, a single program is divided into four segments: 1. Data Segment, 2. Code Segment, 3. Stack Segment and 4. Extra Segment. Of these, only the Code Segment is compulsory; it is enough on its own if your program needs no variables. If you need variable(s) for your program you will need two segments, i.e. the Code Segment and the Data Segment.

Next Line –CODE SEGMENT

CODE SEGMENT is the starting point of the Code Segment in a program. CODE is the name given to this segment and SEGMENT is the keyword for defining segments. This is where we write the instructions of the program.

Next Line –     ASSUME CS:CODE,DS:DATA

The 8086 has different segment registers for different purposes, so we have to tell the assembler which segment each one refers to. ASSUME CS:CODE,DS:DATA tells the assembler to assume that the CS register points to the segment named CODE and the DS register points to the segment named DATA (SS and ES are used in the same way as CS and DS). ASSUME does not load any register by itself.

Next Line – START:

START is the label that marks the starting point of the code written in the Code Segment. The colon (:) is used to define a label, as in C programming.

Next Line – MOV AX,DATA
MOV DS,AX

Even after the ASSUME line, it is still compulsory to load the address of the Data Segment into the DS register. MOV copies the second operand into the first operand. We cannot move DATA directly into DS, because the 8086 has no MOV instruction that loads a segment register with an immediate value; hence we move DATA to AX and then from AX to DS. AX is the accumulator, one of the general purpose registers. This part is also called INITIALIZATION OF DATA SEGMENT, and it is important because it makes the variables in the Data Segment accessible. In this program no other segment register needs to be initialized by hand: CS is set by DOS when the program is loaded, so for it the ASSUME is enough.

Next Line –  DISPLAY MSG1

 DISPLAY MSG1 calls the macro DISPLAY with the argument MSG1. This displays the string MSG1 on screen.

Next Line – LEA DX,P1
        MOV AH,0AH    
        INT 21H

The above three lines read the string typed by the user into the variable length character array that starts at the label P1; the characters themselves are stored in the array P11.

Now, let's understand it line by line.

In LEA DX,P1, LEA stands for LOAD EFFECTIVE ADDRESS; it loads the effective address (offset) of the second operand into the first operand. The same line can be written interchangeably as MOV DX,OFFSET P1, where OFFSET gives the offset address of P1 and MOV moves the second operand into the first operand.

MOV AH,0AH    
INT 21H

The above two lines read the string typed by the user into the buffer whose address is in DX.

Standard input and standard output services are provided through INT 21H, which is also called the DOS interrupt. It selects the service from the value in the AH register. If the value is 0AH, DOS reads a line typed by the user (buffered input) into the buffer whose address is in DX. The first byte of the buffer (M1) tells DOS the maximum number of characters to accept, DOS stores the number of characters actually typed in the second byte (L1), and the characters themselves are stored from the third byte onwards (P11), followed by a carriage return.
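
The buffer that function 0AH expects can be pictured as follows, using the names from this program. The instructions after the comment are an optional sketch, not part of the original program, that echoes the entered string back with function 9; they assume that DS already points to DATA and that the input has just been read.

[codesyntax lang="asm"]

; Layout of the input buffer at P1:
;   M1   (1 byte)     maximum number of characters DOS may store, including Enter
;   L1   (1 byte)     filled in by DOS: characters actually typed, without Enter
;   P11  (255 bytes)  the characters, followed by a carriage return (13)

        MOV BL,L1          ; BX = number of characters typed
        MOV BH,0
        MOV P11[BX],'$'    ; replace the trailing carriage return with '$'
        LEA DX,P11
        MOV AH,9           ; print the '$'-terminated string
        INT 21H

[/codesyntax]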

Next Line –  DISPLAY MSG2

 DISPLAY MSG2 calls the macro DISPLAY with the argument MSG2. This displays the string MSG2 on screen.

Next Line – MOV AH,1
      INT 21H
      MOV CHAR,AL

The above three lines read a character from the console and save it, in its ASCII form, in the variable CHAR.

INT 21H works with the value in the AH register. If the value is 1 (or 1H), DOS reads a character from the console, echoes it on screen and returns it in the AL register.

MOV CHAR,AL moves the value in the AL register into the variable CHAR.

Next Line –  DISPLAY MSG3

 DISPLAY MSG3 calls the macro DISPLAY with the argument MSG3. This moves the cursor to a new line on screen.

Next Line –  LEA SI,P11

The above line loads the address (offset) of P11 into the SI register, so that SI points to the first character of the string.

Next Line –   MOV CL,L1
        MOV CH,0

The above two lines move L1 (i.e. the actual length of the string entered) into the CL register and assign zero to the CH register. CL and CH are the low and high bytes of CX, so after these two lines CX holds the length of the string.

Next Line – CHECK:

CHECK: is a label; every name ending in a colon (:) is a label.

Next Line –  MOV AL,[SI]
        CMP CHAR,AL
        JNE SKIP

MOV AL,[SI] moves the byte at the address held in the SI register into the AL register. CMP CHAR,AL compares AL (the current character of the string) with the character in the CHAR variable. JNE SKIP is a short Jump if Not Equal: if the current character is not equal to CHAR, control jumps to the label SKIP. The result of the comparison is not stored anywhere, but the flags are set according to the result.

Next Line –  INC COUNT

 INC COUNT increments the value in the COUNT variable by one. It is reached only when the characters matched.

Next Line – SKIP:

SKIP: is a label; every name ending in a colon (:) is a label.

Next Line –  INC SI

 INC SI increments the value in the SI register by one, so that SI points to the next character of the string.

Next Line – LOOP CHECK

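This is the end of the loop. Assembly language has a LOOP instruction, which works with two helpers: a label and a counter. The loop starts at the label and ends with a LOOP instruction that names the same label. The counter is the CX register (CX is also called the COUNT register): each time LOOP executes, it decrements CX by one and jumps back to the label if CX is not zero. Here CX holds the length of the string, so the body runs once per character. Note that if CX is already 0 when the loop is entered (an empty string), the decrement wraps around to 0FFFFH and the body runs 65536 times, so CX should be checked for zero before the loop, for example with JCXZ.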
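
In other words, LOOP CHECK behaves like the DEC and JNZ pair below, except that LOOP itself leaves the flags unchanged. The sketch also shows a guard for an empty string, placed before the loop. The label name DONE is made up for this example.

[codesyntax lang="asm"]

        JCXZ DONE          ; skip the loop entirely when CX = 0
CHECK:
        ; ... body of the loop ...
        DEC CX             ; CX = CX - 1
        JNZ CHECK          ; repeat while CX is not zero
DONE:

[/codesyntax]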

Next Line –  CMP COUNT,0
        JE NOTFOUND

CMP COUNT,0 compares the value in the COUNT variable with 0 (zero). JE NOTFOUND is a short Jump if Equal: if COUNT is equal to 0, control jumps to the label NOTFOUND. The result of the comparison is not stored anywhere, but the flags are set according to the result.

Next Line –  DISPLAY MSG3

 DISPLAY MSG3 calls the macro DISPLAY with the argument MSG3. This moves the cursor to a new line on screen.

  Next Line – MOV DL,COUNT
        ADD DL,30H
        MOV AH,2
        INT 21H

The above four lines write the value of the COUNT variable (i.e. the number of occurrences) on the console as a character.

INT 21H works with the value in the AH register. If the value is 2 (or 2H), DOS writes the character in the DL register on the console, hence the value to be printed is moved to DL. COUNT holds a number, not a character, so ADD DL,30H converts it to the ASCII code of the matching digit ('0' is 30H). This works only while COUNT is between 0 and 9.
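
Because a single ASCII digit is printed, the four lines above are correct only for counts from 0 to 9. One possible way to print counts up to 99, shown here as a sketch rather than as part of the original program, is to split the value into tens and units with AAM and print both digits.

[codesyntax lang="asm"]

        MOV AL,COUNT
        AAM                ; AH = AL / 10 (tens), AL = AL mod 10 (units)
        MOV BX,AX          ; keep both digits: BH = tens, BL = units

        MOV DL,BH
        ADD DL,30H         ; tens digit to ASCII
        MOV AH,2
        INT 21H

        MOV DL,BL
        ADD DL,30H         ; units digit to ASCII
        MOV AH,2
        INT 21H

[/codesyntax]

With this version a count below 10 is printed with a leading zero, for example 05.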

Next Line –  DISPLAY MSG5

 DISPLAY MSG5 calls the macro DISPLAY with the argument MSG5. This displays the string MSG5 on screen, right after the count.

JMP EXIT: JMP is an unconditional jump. This jumps to the label EXIT, so that the NOTFOUND message is skipped.

Next Line –  NOTFOUND:
        DISPLAY MSG4

 NOTFOUND: is a label; every name ending in a colon (:) is a label. DISPLAY MSG4 calls the macro DISPLAY with the argument MSG4. This displays the string MSG4 on screen.

Next Line –  EXIT:   MOV AH,4CH
        INT 21H

EXIT: is a label; every name ending in a colon (:) is a label. The above two lines are used to exit to DOS, i.e. to the operating system. INT 21H works with the value in the AH register; if the value is 4CH, the program terminates and control returns to DOS, which is the end of the program.

Next Line – CODE ENDS

CODE ENDS is the end point of the Code Segment in a program. ENDS is written together with the name given to the segment (here CODE), so that it is clear which segment is being closed.

Last Line – END START

END START marks the end of the source file; the assembler ignores anything written after it. The label START written after END tells the assembler that execution of the program begins at START.

Note :- Assembly language programs can be built in COM format or EXE format. We are learning the EXE format only, which is simpler than the COM format to understand and write. We can write the program in lower or upper case, but I prefer upper case.

Screen Shots :-

Asm_program_Search_Char_in_Str

 Output After Execution :-

Asm_program_Search_Char_in_Str_Output

Note :- To see the variables and their values, click the vars button in the emulator.
