Introduction
I recently had the time to devote to reading a new book. I wanted to learn about low-level assembly language and how it can be used to tune the performance of applications. I had heard that 'The Art of Assembly Language' is one of the best books on the topic. While looking at other books published by the same author, Randall Hyde, I found his newer series, 'Write Great Code'.
The second book in the series focuses on exactly what I was looking for: a brief introduction to assembler and how it can be used to tune the performance of code.
The book is geared toward the x86 and PowerPC instruction sets. My background is in Java and C# .NET. Both are high-level languages that rely on an optimizing JIT compiler working on either bytecode (Java) or IL (.NET). I already had a working knowledge of both languages and their intermediate forms. However, I had no real understanding of assembler (though I took assembler in college).
The book starts out by introducing the basic assembly code for both x86 and PowerPC. It is forthcoming in explaining the similarities and differences between the two. Most notable is that the x86 uses a small set of registers while the PowerPC has a much larger set. The assembler is covered at a very simple level and the book does not mention anything but the most basic instructions of each language. This is key in helping the reader pick up the basics quickly without needing domain knowledge of the languages and instruction sets. I would think it's much like the difference between skiing and snowboarding. Skiing is hard to learn at first and easier to master, and snowboarding is easy to learn and hard to master. Assembler is more like the latter if Randall's technique is used. The book mentions online references for the languages and instruction sets; however, I did not make use of them while reading. This really forced me to use my mind to interpret the code.
Each example starts out with an introduction to the problem being covered. Then an example in a high-level language is supplied, with a corresponding disassembly of the code with and without the compiler's optimizer applied. Several compilers are used to produce both x86 and PowerPC versions of the code. From this, I could quickly see that the GNU compiler for the PowerPC seemed to provide the best optimizations.
The goal of the book is to show how to write code at the high level while understanding what the compiler is doing to optimize it. This can be very useful; I certainly learned more than I expected about compiler optimizations. I also learned that the compiler only does what it was designed to do. If I write high-level code in a form the compiler doesn't expect, no optimizations will be applied. More importantly, the book uncovers many situations where the developer may inadvertently keep the compiler from making an optimization, either through lack of understanding or by introducing dependency problems.
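To make the dependency point concrete, here is a minimal sketch in C. It is my own illustration rather than an example taken from the book, and the function names are hypothetical. It shows pointer aliasing, one well-known way a developer can block an optimization: in the first function the compiler typically has to assume that writing to dst might change the value scale points at, so it cannot simply keep that value in a register for the whole loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds *scale to every element of dst.
 * dst and scale may point at the same memory, so the compiler
 * typically has to reload *scale after every store to dst[i]. */
void add_scale_aliased(int *dst, const int *scale, size_t n)
{
    assert(dst != NULL && scale != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] += *scale;
    }
}

/* Same loop, but the value is copied into a local variable first.
 * A local whose address is never taken cannot alias dst, so the
 * compiler is free to keep it in a register for the whole loop. */
void add_scale_local(int *dst, const int *scale, size_t n)
{
    assert(dst != NULL && scale != NULL);
    const int s = *scale;
    for (size_t i = 0; i < n; i++) {
        dst[i] += s;
    }
}
```

The two functions give the same result when scale points outside dst. They differ when scale points into dst, which is exactly the case the compiler has to guard against in the first version. Whether a particular compiler actually generates different code for the two is something I would check in its assembly output, in the spirit of the book.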
One reviewer of the book states that the section on Boolean logic is worth the cost of the book. I thought to myself, 'Well, kind of'. This underwhelmed feeling could be due to my having a degree in Electronics Engineering. I found the sections on memory allocation, iterators, and branch operators really interesting.
While reading the book, I came to the conclusion that the editor, who is said to have made the book easier to read, really didn't do a great job on the edit. I would have edited it differently, but I think the book still stays true to its intended purpose and audience.
Chapters and Contents
Some basic statistics:
- Cover price: $44.95 US / $58.95 CAN
- Total number of pages: 587 + online appendices
- Publisher: No Starch Press
Chapter 1: Thinking Low-Level, Writing High-Level
A brief 10-page introduction covering the outline of the book, how to read it, and misconceptions about compiler quality.
Chapter 2: Shouldn't You Learn Assembly Language?
A 6-page synopsis of why assembly language is hard to learn and understand, and how the book addresses the initial learning curve. It also covers high-level assemblers and HLA (the author's own High Level Assembly language), with a little background on the author's other book, 'The Art of Assembly Language', and other great books on assembler.
Chapter 3: 80x86 Assembly for the HLL Programmer
An introduction to x86 assembly language geared toward the high-level programmer. The basic 80x86 architecture is explained in 20 pages of detail, followed by a minimal instruction set listing. This chapter and the next are written almost identically to allow easy comparison between the two microprocessors covered in the book.
Chapter 4: PowerPC Assembly for the HLL Programmer
An introduction to the PowerPC family and its assembly language, geared toward the high-level programmer. There is not as much detail here as for the 80x86: only 10 pages cover the architecture and a minimal instruction set. After reading chapters 3 and 4, I felt the editor could have added some notes on the similarities between the two. The book leaves making that connection up to the reader; however, the remainder of the book does a great job of outlining the differences.
Chapter 5: Compiler Operation and Code Generation
This chapter is focused on introducing the compiler and covers basic compiler theory and the differences between languages, file types, and translation processes used by compilers. This information is a prerequisite to reading the rest of the book. It really does a good job of covering the whole process the compiler uses to produce an executable. The author uses 40 pages to cover this material in 10 sections.
Chapter 6: Tools for Analyzing Compiler Output
Gives the reader some knowledge of how to disassemble code and use other tools to produce an assembler listing from a high-level language. 50 pages are given to this topic. Most of the book focuses on disassembling C code built with various compilers and for various architectures.
Chapter 7: Constants and High-Level Languages
A 40 page introduction to constants and how they are defined in assembler.
Chapter 8: Variables in a High-Level Language
This chapter covers more than just variables. It really focuses on variables and their memory organization, lifetime, and memory consumption. 50 pages are given to this topic.
Chapter 9: Array Data Types
A 40 page overview of array types and how they are organized in memory. The chapter shows different methods of padding arrays in assembler and shows some mistakes made by programmers using arrays.
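As a rough sketch of what "organized in memory" means for a two-dimensional array, the C function below computes the byte offset of an element in row-major order, which is the layout C itself uses. This is my own illustration, not code from the book; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of element [row][col] from the
 * start of a row-major array with `cols` columns per row and
 * `elem_size` bytes per element. */
size_t row_major_offset(size_t row, size_t col,
                        size_t cols, size_t elem_size)
{
    assert(col < cols); /* a column index past the row end is a bug */
    return (row * cols + col) * elem_size;
}
```

For an array declared as int a[3][4], row_major_offset(2, 3, 4, sizeof(int)) should match the distance in bytes between &a[0][0] and &a[2][3]. This multiply and add is the kind of work the compiler has to emit for every indexed access.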
Chapter 10: String Data Types
The chapter covers basic strings and how they are represented in assembler. The author admits that a full working understanding of strings in assembler is out of scope for the book. 30 pages cover this topic.
Chapter 11: Pointer Data Types
The question of 'What is a pointer?' is covered in this chapter, along with the pitfalls of pointers and pointer arithmetic (a common source of problems when using pointers).
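One pointer arithmetic pitfall I can illustrate with a small C sketch (my own example, with a hypothetical function name, not code from the book): adding n to a pointer advances it by n elements, not n bytes. Code that also multiplies by the element size ends up stepping too far.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes lie between p and p + n?
 * The caller must make sure p + n stays inside the array
 * (or points one element past its end). */
ptrdiff_t bytes_advanced(const int32_t *p, size_t n)
{
    assert(p != NULL);
    const int32_t *q = p + n;                  /* advances n elements */
    return (const char *)q - (const char *)p;  /* measured in bytes   */
}
```

With a 4-byte int32_t, advancing by 3 elements moves the address by 12 bytes. Writing p + 3 * sizeof(int32_t) by mistake would advance 12 elements instead.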
Chapter 12: Record, Union, and Class Data Types
40 pages covering these high-level language constructs. The chapter does not really cover how they translate into assembler, but it does show some problems the compiler runs into when trying to optimize the assembler it produces.
Chapter 13: Arithmetic and Logical Expressions
The difference between stack-based, accumulator, and register machines is covered. The section on optimizing arithmetic expressions is interesting from a high-level point of view. 50 pages cover this topic.
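To show the difference between a stack-based and a register machine, here is a small C sketch of my own (not from the book; the instruction set and all names are hypothetical). The same expression, (a + b) * c, is evaluated once by a toy stack machine that pushes and pops operands, and once in a register style where a named temporary holds the intermediate result.

```c
#include <assert.h>

/* Toy instruction set for a hypothetical stack machine. */
enum op { OP_PUSH, OP_ADD, OP_MUL };

struct insn {
    enum op op;
    int operand; /* used only by OP_PUSH */
};

/* Runs `count` instructions and returns the single value left on
 * the stack. Operands are implicit: ADD and MUL pop two values and
 * push one result. */
int run_stack_machine(const struct insn *code, int count)
{
    int stack[16];
    int sp = 0; /* index of the next free slot */

    for (int i = 0; i < count; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = code[i].operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] + stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] * stack[sp - 1];
            sp--;
            break;
        }
    }
    assert(sp == 1);
    return stack[0];
}

/* Register style: operands are named explicitly and the running
 * result lives in one temporary, much like a machine register. */
int eval_register_style(int a, int b, int c)
{
    int r0 = a;
    r0 = r0 + b;
    r0 = r0 * c;
    return r0;
}
```

For a = 2, b = 3, c = 4, the stack program is PUSH 2, PUSH 3, ADD, PUSH 4, MUL, and both versions give 20. The stack version needs five instructions with implicit operands, while the register version needs fewer steps but has to name its operands.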
Chapter 14: Control Structures and Programmatic Decisions
Flow control statements are covered. I found myself not really understanding the assembler statements in this section while reading. After reading other chapters, I got a better understanding of what the different 'jump' statements do and came back to this chapter and re-read it. I thought this to be one of the most interesting chapters of the book. 40 pages.
Chapter 15: Iterative Control Structures
This chapter helped to fill in the gaps in the previous chapter. Also very interesting. 40 pages.
Chapter 16: Functions and Procedures
30 pages on functions and problems the compiler runs into when different value or reference types are passed.
Some topics seem to run through the whole course of the book, mostly compiler theory, performance, and dependency problems. Overall I would give the book 4 out of 5, and I recommend it to anyone who is interested in writing better code and understanding how compilers optimize high-level code.
History
- First draft - December 2009