Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / IoT / Arduino

Integer Square Root on 15 Assembler Lines

5.00/5 (6 votes)
5 Jun 2019CPOL1 min read 20.3K  
This code calculates square root of an integer on a few assembler code lines.

Introduction

You can use this code on small controllers used on embedded systems, that lacks many arithmetic operations, like Arduino. It calculates the square root stored in BX (31 bit) and puts the result on AL (7 bit) and the remainder on AH (7 bit). It is possible to use the 63/15 bit wide version using the intel extended registers (EBX for input and EAX for results). It is also possible to modify it for arbitrary precision. Only shifts and compare used:

ASM
asm( "mov ax, 0; stc" );            /* Initialize */
asm( "loop:"       );
asm(   "rcl bx, 1; rcl ah, 1"  );
asm(   "shl bx, 1; rcl ah, 1"  );
asm(   "shl al, 2"     );
asm(   "or  al, 1"     );
asm(   "cmp ah, al"    );
asm(   "jc skip"       );
asm(     "sub ah, al"  );
asm(     "or  al,  2"  );

asm( "skip:"           );
asm(   "shr al, 1"     );
asm(   "cmp bx,0x8000" );
asm(   "clc" );
asm(   "jnz  loop"     );

As you can see, it is really tiny and fast.

Background

While remembering the old long division method for square roots, I thought that the particularities of base 2 binary numeric system could be exploited to simplify this trick, related to "human" base 10 approach.

Using the Code

The easiest way to test this code is using GCC, as a wrapper:

C++
unsigned char sqrt( unsigned short w )
{ asm( "mov bx, %0 " : "=r" ( w ) );        /* c specifics */

// ------------------------------------> The real code  starts here

  asm( "mov ax, 0; stc" );            /* Initialize */
  asm( "loop:"       );
  asm(   "rcl bx, 1; rcl ah, 1"  );
  asm(   "shl bx, 1; rcl ah, 1"  );
  asm(   "shl al, 2"     );
  asm(   "or  al, 1"     );
  asm(   "cmp ah, al"    );
  asm(   "jc skip"       );
  asm(     "sub ah, al"  );
  asm(     "or  al,  2"  );

  asm( "skip:"           );
  asm(   "shr al, 1"     );
  asm(   "cmp bx,0x8000" );
  asm(   "clc" );
  asm(   "jnz  loop"     );

// ------------------------------------> The real code ends here 
}

int main()
{ unsigned short w= sqrt( 25*25+1 );

  return( w );
}

You can cut and paste to sqrt.c, add trace points as wished and build it with: gcc.exe -masm=intel sqrt.c

Another option is to use an IDE like codeblocks and trace the execution.

Points of Interest

This code only uses the source and result registers (BX and AX), so it doesn't "pollute" any other register. It is possible that a quick version exploits the register boundary, but it takes more registers to achieve this. The instruction set used is present in almost all microcontrollers, so porting it is easy.

As an observation, the square root or a binary number takes half of the bits of operand, thus, if operand is 15 bits long, the result can fit on 7 bits (this case), If 31 bits long, fits on 15 bits, and so.

History

  • 5th June, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)