papers
+HCU papers
courtesy of fravia's page of reverse engineering

   SLH



   FUTURE VISION

   

   The suppression and resurrection of assembler programming.

   

   Historical perspective

   ~~~~~~~~~~~~~~~~~~~~~~

   A long time ago, in a world far away, the designers of the millennium bug

   scribbled up flow charts in ancient ciphers like Cobol and Fortran, sent

   them to the girls in the punch card room who dutifully typed out the punch

   cards, and sent the result to an air conditioned boiler room downstairs

   where they were loaded into a card reading machine and run overnight.

   

   This was a cosy corporate world of high profits, high salaries and endless

   accolades for being on the leading edge of the computer age. The writing on

   the wall started to be read when the oil price rises of the seventies took

   their toll on the profitability of the corporate sector, and the cost of

   running and upgrading mainframes generated the need for a cheaper alternative.

   

   The birth of the PC concept was an act of economic necessity and it heralded

   a major turn in the computer world where ordinary people were within

   reasonable cost range of owning a functional computer. When Big Blue

   introduced the original IBM PC, it put a blazingly fast 8088 processor

   screaming along at 4.77 megahertz into the hands of ordinary people.

   

   The great success of the early PCs onwards came from empowering ordinary

   people, at a technical level, to do complex things in a way that they

   could understand.

   

   Early Programming

   ~~~~~~~~~~~~~~~~~

   The early tools available for development on PCs were very primitive by

   modern standards yet they were a great improvement over what was available

   earlier. To anyone who had seen the electronic hardware guys writing

   instructions for EPROMs in hex, the introduction of a development tool like

   DEBUG was a high tech innovation.

   

   PCs came with an ancient dialect of BASIC in ROM: if you switched the

   computer on without a floppy disk in the boot drive, it would start up in

   BASIC. This allowed ordinary users to dabble with simple programs that

   would do useful things without the need of a room full of punch card

   typists and an air conditioned boiler room downstairs with an array of

   operators feeding in the necessary bits to keep a mainframe going.

   

   The early forms of assembler were reasonably hard going, and software

   tended to take months of hard work using rather primitive tools, which

   gave birth to the need for a powerful low level language, usable on a PC,

   that would improve the output.

   

   C filled this gap, as it had the power to write at operating system level

   and, as the language improved, it gained the capacity to include assembler

   directly inline with the C code.

   

   If the runtime library functions could not do what you wanted, you simply

   added an asm block,

   

        asm

          {

            mov  ah, 02h             ; DOS function 02h: display character

            mov  dl, '*'             ; the character to write

            int  21h                 ; call DOS

          }

   

   and compiled it directly into your program.

   

   As the tools improved, driven by market demand, the idea of a common

   object file format emerged, which dramatically increased the power that

   programmers had available.

   

   Different languages had different strengths which could be exploited to

   deliver ever more powerful and useful software.

   

   C had the architecture to write anything up to an operating system.

   Pascal had developed into a language with a very rich function set that

   was used by many games developers.

   

   Basic had emerged from its spaghetti code origins into a compiler that

   had advanced capacity in the area of dynamic memory allocation and string

   handling.

   

   The great unifying factor in mixed language programming was the capacity

   to fix or extend each language by writing modules in assembler.

   

   Modern Programming

   ~~~~~~~~~~~~~~~~~~

   By the early nineties, modern assemblers came with integrated development

   environments, multi language support in calling conventions and powerful

   and extendable macro capacities which allowed high level simulations of

   functions without the overhead associated with high level languages.

   

   To put some grunt into a deliberately crippled language like Quick Basic,

   you wrote a simple assembler module like the following,

   

   ;--------------------------------------------------------------

   

               .Model  Medium, Basic

               .Code

   

       fWrite  Proc handle:WORD, Address:WORD, Length:WORD

   

            mov ah, 40h         ; DOS function 40h: write to handle

            mov bx, handle      ; file or device handle

            mov cx, Length      ; number of bytes to write

            mov dx, Address     ; address of the buffer

            int 21h             ; call DOS

   

               ret         ; Return to Basic

   

       fWrite  Endp

   

       End

   

   ;--------------------------------------------------------------

   

   Change the memory model to [ .Model  Small, C ] and you had a printf

   replacement with one tenth the overhead.

   

   Code as simple as this allowed you to write to files, the screen or a

   printer, just by passing the correct handle to the function.

   

   Simply by specifying the calling convention, the programmer could extend

   C, Pascal, Basic, Fortran or any other language they wrote in so that it

   would deliver the capacity that was wanted.

   

   This capability marked the heyday of flexible and powerful software

   development in the hands of non-corporate developers. The monster looming

   on the horizon came to fruition as a consequence of corporate greed on one

   hand and expediency on the other.

   

   The Decline

   ~~~~~~~~~~~

   Legal wrangling about the ownership of UNIX in the early nineties crippled

   its development for long enough to leave the door open for the early version

   of Windows to gain enough popularity to be further developed. With the

   advent of the upgraded version 3.1, DOS users had a protected-mode,

   graphics-mode add-on that offered extended functionality over the old DOS

   640k limit.

   

   The great divide started by stealth: development tools for version 3.1

   were thin on the ground for a long time, and the technical data necessary

   to write protected mode software was proprietary and very expensive.

   

   Even after parting with a reasonably large amount of hard currency, the

   version of C and the SDK that was supposed to be the be all and end all

   came with a development environment that crashed and crashed and crashed.

   The documentation could only be classed as poor and it dawned on most who

   bothered that the proprietor couldn't care less.

   

   The sales were there, and they no longer needed the little guys who had

   supported them on the way up.

   

   The Fall

   ~~~~~~~~

   Over the duration of 16 bit Windows, the little guys made a reasonable

   comeback and produced some often very good and reliable software, but the

   die had been cast. The reins of proprietary control drew tighter and

   tighter while the support for the expensive software became poorer and

   poorer.

   

   The problem for the corporate giants was that the world was finite and

   market saturation was looming over their heads in the very near future.

   Their solution was to gobble up the smaller operators to increase their

   market share and block out the little guys by controlling the access to

   the development tools.

   

   The Great Divide

   ~~~~~~~~~~~~~~~~

   Many would say: why would anyone bother to write in assembler when we have

   Objects, OLE, DDE, Wizards and Graphical User Interfaces? The answer is

   simple: EVERYTHING is written in assembler, and the things that pretend to

   be development software are only manipulating someone else's assembler.

   

   Market control of the present computer industry is based on the division of

   who produces useful and powerful software and who is left to play with the

   junk that is passed off on the market as development tools.

   

   Most programmers these days are just software consumers to the Corporate

   sector and are treated as such. As the development tools get buggier and

   their vendors spend more time writing their Licence Agreements than they

   appear to spend debugging their products, the output gets slower and more

   bloated and the throughput of finished software is repeatedly crippled by

   the defects in these "objects".

   

   A simple example of market control in development software occurs in the

   Visual Basic environment.

   

   Visual Basic has always had the capacity to pass pointers to its variables.

   This is done by passing the value by REFERENCE rather than by VALUE. The

   problem is that the VB developer does not have access to the pointer and

   has to depend on expensive aftermarket add-ons to do simple things.

   

   Visual Basic has been deliberately crippled for commercial reasons.

   

   This is something like downloading and running a function-crippled piece

   of shareware, except that you have already paid for it. There are times

   when listening to the hype about enterprise solutions is no more than a

   formula for an earache.

   

   Why would a language as powerful as C, or its successor C++, ever need to

   use a runtime DLL? The answer again is simple: programs that have a startup

   size of over 200k are not a threat to corporate software vendors who are

   in a position to produce C and assembler based software internally.

   

   The great divide is a THEM and US distinction between who has the power to

   produce useful things and who is left to play with the "cipher" that passes

   as programming languages.

   

   In an ideal world, a computer would be a device that knew what you thought

   and prepared information on the basis of what you needed. The problem is

   that the hardware is just not up to the task. It will be a long time into

   the future before processors do anything more than linear number crunching.

   

   The truth function calculus that processors use through the AND, OR, NOT

   instructions is a useful but limited logic. A young Austrian mathematician

   by the name of Kurt Gödel produced a proof in 1931 that axiomatic systems

   developed from the symbolic logic of Russell and Whitehead had boundaries

   in their capacity to deliver true statements.

   

   This became known as the incompleteness theorem, and it helps to put

   much of the hype about computers into perspective. The Mac user who asks

   the question "Why won't this computer do what I think?" reveals a

   disposition related to believing the hype rather than anything intrinsic

   about the Motorola 68000 series processors.

   

   Stripping away the hype surrounding processors and operating systems

   leaves the unsuspecting programmer barefoot, naked and at the mercy of

   large greedy corporations using their market muscle to extract more and

   more money by ruthlessly exploiting the need to produce software that is

   useful.

   

   Computer processors into the foreseeable future will continue to be no more

   than electric idiots that switch patterns of zeros and ones fast enough

   to be useful. The computer programmer who will survive into the future is

   the one who grasps this limitation and exploits it by learning the most

   powerful of all computer languages, the processor's NATIVE language.

   

   The Empire Fights Back

   ~~~~~~~~~~~~~~~~~~~~~~

   The Internet is the last great bastion of freedom of thought and this is

   where the first great battle has been won.

   

   The attempt to make the Internet into a corporate controlled desktop has

   been defeated for the moment. Choose your browser carefully or you may

   help shoot yourself in the foot by killing off the alternative.

   

   Control of knowledge is the last defence of the major software vendors and

   it is here that they are losing the battle. The Internet is so large and

   uncontrollable that the dispossessed who frequent its corridors have

   started to publish a vast array of information.

   

   Assembler is the "spanner in the works" of the greedy corporate sector.

   There are some excellent technical works that have been contributed by

   many different authors in the area of assembler. The very best in this

   field are those who have honed their skills by cracking games and other

   commercial software.

   

   It should be noted that the hacking and cracking activities of the fringe

   of computing are a different phenomenon to cracking games and commercial

   software protection schemes. The fringe play with fire when they attack

   security information and the like and complain when they get their fingers

   burnt. The attempt by the major software vendors to place the reverse

   engineering community in the same class as the fringe is deliberate

   disinformation.

   

   These authors are at the leading edge of software research and like most

   highly skilled people, their knowledge is given out freely and is not

   tainted by the pursuit of money. It comes as no surprise that the corporate

   sector is more interested in suppressing this knowledge than it is in

   suppressing the WAREZ sites that give away its software for free.

   

   The Comeback Trail

   ~~~~~~~~~~~~~~~~~~

   Start with the collection of essays by the +ORC. You will find an incisive

   mind that gives away this knowledge without cost. Start looking for some

   of the excellent tools that can be found on the Internet ranging from

   dis-assemblers to software in-circuit emulators (SoftIce).

   

   There are some brilliant essays written by _mammon on how to use SoftIce

   which are freely available.

   

   Dis-assemblers supply enormous quantities of code from which to start

   learning how to read and write assembler. The best starting point is the

   nearly unlimited supply of DOS COM files, and for good reason: they are

   simple in structure, being memory images, and are usually very small in

   size.

   

   The other factor is an eye to the future. COM files are an escapee from

   early eighties DOS programming where most PCs only had 64k of memory. This

   means that they are free of the later and far more complex segment

   arithmetic that DOS and 16 bit Windows EXE files are cursed with.

   

   The emerging generation of 32 bit files are called Portable Executables and

   they are written in what is called FLAT memory model where there is no 64k

   limit. COM files were restricted to 64k absolute but could directly read

   and write anything in their address space.

   

   A portable executable file has a very similar capacity except that in 32 bit

   it can theoretically read and write anything within a 4 gigabyte address

   space. In a very crude sense, PE files are 32 bit COM files but without

   some of the other limitations.

   

   A very good dis-assembler for COM files is SOURCER 7. Particularly in the

   early stages of exploring the structure of COM files, its capacity to

   add comments to the reconstructed source code makes the code much easier

   to read.

   

   To start making progress, you will need an assembler. Although they are

   getting a bit harder to find, you can still obtain either MASM or TASM and

   start writing your own COM files. The generic "Hello World" example comes

   with a lot less code than many would think.

   

   ;----------------------- Hello.ASM ----------------------------

   

   com_seg segment byte public            ; define the ONLY segment

   

           assume cs:com_seg, ds:com_seg  ; both code & data in same segment.

           org 100h                       ; go to start address in memory.

   

   start:

           mov ah, 40h                    ; the DOS function number.

           mov bx, 1                      ; the screen handle.

           mov cx, 13                     ; the length of the text, with CR/LF.

           mov dx, offset Greeting        ; the address of the text.

           int 21h                        ; get DOS to execute the function.

   

           mov ax, 4C00h                  ; the TERMINATE process function.

           int 21h                        ; call DOS again to EXIT.

   

   Greeting db "Hello World",13,10        ; specify the text as byte data.

   

   com_seg ends                           ; define the end of the segment.

   

           end start

   

   ;----------------------------------------------------------------

   

   This tiny program assembles at 31 bytes long and it makes the point that

   when you write something in assembler you only get what you write without

   a mountain of junk attached to it. Even in C, putting printf in a bare

   main function with the same text will compile at over 2k. The humorous

   part is that if you dump the executable, you find printf using DOS

   function 40h to output to the screen.

   

   Once you assemble a simple program of this type, immediately dis-assemble

   it and have a look at your program as it has been converted from binary back

   to code again. This will train your eye to see the relationship between

   your written code and the results of dis-assembly.

   

   This will help to develop the skill to dis-assemble programs and read them

   when you don't have the source code. Once you start on the mountain of DOS

   com files available, you will find that much of the code is very similar to

   what you have written yourself and you get to see an enormous quantity of

   well written code that you can learn from without having to pay one brass

   razoo for the privilege.

   

   Some people are slightly bemused by the +ORC's reference to Zen yet if it is

   understood in the sense that the human brain processes data at a rate that

   makes fast computer processors look like snails racing down a garden bed,

   you will start to get the idea of "feeling" the code rather than just

   munching through it like a computer does.

   

   As you read and write more code your brain will start "pattern matching"

   other bits of code that you have already digested and larger blocks of

   code will start to become very clear.

   

   Once you go past a particular threshold, the process of "data mapping" and

   "model fitting" starts to occur. This is where you know enough to project

   a model of what is happening and then test it to see if it works the way

   you have modelled it. The rest is just practice and a willingness to keep

   learning.

   

   Once you get the swing of manipulating data in assembler, you will start to

   comprehend the power and precision that it puts in your hands. Contrary to

   the "buzz word" area of software where logic is couched in "Boolean" terms,

   the foundation of logic is called "The law of excluded middle". In layman's

   terms, something either is or it ain't but it can't be both.

   

   George Boole and others like Augustus De Morgan developed parts of logic

   during the nineteenth century but it was not until Russell and Whitehead

   published "Principia Mathematica" shortly before the first world war that

   logic became a complete and proven system. Russell based much of this

   milestone in reasoning on a very important distinction, the difference

   between EXTENSIONAL and INTENSIONAL truth.

   

   Something that is spatio-temporally "extended" in the world is subject to

   the normal methods of inductive proof, where things that are "intensional"

   can be neither proven nor disproven.

   

   Logic had been held back for centuries by the assumption that it was a

   branch of metaphysics until Russell and Whitehead delivered the proof

   that logic is "hard wired" into the world.

   

   This is important to computing in very fundamental ways. The devices that

   programming is about controlling are very "hard wired" in the way that they

   work. Recognise the distinction between what the devices ARE as against

   what some would like them to BE, or worse, the bullshit that is peddled

   to an unsuspecting public about the "wonders" of computers, and you have

   made one of the great leaps forward.

   

   The devices are in fact very powerful at manipulating data at very high

   speed and can be made very useful to the most powerful of all processors,

   the conceptual apparatus of the brain using it.

   

   The only reason why this distinction has ever been inverted is through the

   greed and arrogance of corporate software vendors and their desire to

   extract yet another quick and dirty buck.

   

   In this sense, the large commercial vendors are in the same class as the

   proliferation of low class smut vendors clogging up the Internet, they lure

   with the promise of fulfilling the "lusts of the flesh" yet when they

   extract the money that they are after, they leave their victims both poorer

   and unfulfilled.

   

   Most ordinary computer users part with reasonably large sums of money when

   they buy a computer and load it with software yet the promise of fun,

   convenience and usefulness is more often than not followed by the bugs,

   crashes and defects in the software. A Faustian bargain where the hidden

   cost is not seen until the money is handed over.

   

   The EXIT clause for the programmers who are wise enough to see that their

   skills are being deliberately eroded for marketing reasons is the most

   powerful tool of all, the direct processor instructions that assembler

   puts in their hands.

   

   The time to make the move to learning assembler is not open ended. DOS is

   a passing world and without the background of starting in a simpler world

   that DOS has given us for so many years, the move to assembler will be

   much harder and less accessible. There are probably only a couple of

   years left.

   

   If you are not robust enough to use the +ORC's formula for martinis, pure

   malt has a glow to it that surpasses all understanding.

   

