compudanzas

uxn tutorial: day 1, the basics

hello! in this first section of the uxn tutorial we talk about the basics of the uxn computer called varvara, its programming paradigm in a language called uxntal, its architecture, and why you would want to learn to program it.

we also jump right into our first simple programs to demonstrate fundamental concepts that we will develop further in the following days.

why uxn?

or first of all... what is uxn?

The Uxn/Varvara ecosystem is a personal computing stack based on a small virtual machine that lies at the heart of our software, and that allows us to run the same application on a variety of systems.

100R — uxn

for further context, i invite you to read or watch "weathering software winter" from the 100R site, as well.

100R — Weathering Software Winter

uxn is the core of the varvara virtual computer. it is simple enough to be emulated by many old and new computing platforms, and to be followed by hand.

personally, i see in it the following features:

all these concepts sound great to me, and hopefully to you too!

however, i see in it some aspects that may make it seem not very approachable:

the idea of this tutorial is to explore these two aspects and reveal how they play along to give uxn its power with relatively little complexity.

postfix notation (and the stack)

the uxn core is inspired by forth-machines in that it uses the recombination of simple components to achieve appropriate solutions, and in that it is a stack-based machine.

this implies that it is primarily based on interactions with a "push down stack", where operations are indicated using what is called postfix notation.

Reverse Polish notation (RPN), also known as Polish postfix notation or simply postfix notation, is a mathematical notation in which operators follow their operands [...]

Reverse Polish notation - Wikipedia

postfix addition

in postfix notation, the addition of two numbers, 1 and 48, would be written in the following form:

1 48 +

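where, reading from left to right: the number 1 is pushed down onto the stack, then the number 48 is pushed down onto the stack, and finally + takes the top two elements from the stack, adds them, and pushes the result (49) back down onto the stack.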

the book Starting Forth has some great illustrations of this process of addition:

The Stack: Forth’s Workspace for Arithmetic

from infix to postfix

more complex expressions in infix notation, which require either parentheses or rules of operator precedence (and a more complex system for decoding them), can be simplified with postfix notation.

for example, the following infix expression:

(3 + 5)/2 + 48

can be written in postfix notation as:

3 5 + 2 / 48 +

we can also write it in many other ways, for example:

48 3 5 + 2 / +

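i invite you to make sure these expressions work and are equivalent! you just have to follow these rules, reading from left to right: if the item is a number, push it down onto the stack; if it is an operator, take the top two elements from the stack, apply the operation to them, and push the result back down onto the stack.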

note: in the case of the division, the operands follow the same left-to-right order. 3/2 would be written as:

3 2 /

you'll start seeing how the use of the stack can be very powerful: it can hold operands and/or intermediate results without us having to explicitly assign a place in memory for them (i.e. what we would do with "variables" in other programming languages).

we'll come back to postfix notation and the stack very soon!
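as a way of checking the expressions above, here is a minimal sketch of a postfix evaluator written in python (not in uxntal). the name eval_postfix is made up for this illustration, numbers are read as decimal, and the division is assumed to be an integer division since we are working with whole numbers.

```python
def eval_postfix(expression):
    """evaluate a space-separated postfix expression using a stack."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a // b,  # assumption: integer division
    }
    stack = []
    for token in expression.split():
        if token in operators:
            b = stack.pop()  # top element: second operand
            a = stack.pop()  # next element: first operand
            stack.append(operators[token](a, b))
        else:
            stack.append(int(token))  # numbers are pushed down onto the stack
    return stack.pop()


print(eval_postfix("1 48 +"))          # 49
print(eval_postfix("3 5 + 2 / 48 +"))  # 52
print(eval_postfix("48 3 5 + 2 / +"))  # 52
```

note how the two longer expressions give the same result even though the 48 waits in the stack for a different amount of time in each one.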

varvara computer architecture

one of the perks of programming a computer at a low level of abstraction, as we will be doing with uxn, is that we have to know and be aware of its internal workings.

for this section of the tutorial, i recommend that you follow along with the great illustrations and notes by Rostiger:

nchrs: uxn notes

8-bits and hexadecimal

binary words of 8-bits, also known as bytes, are the basic elements of data encoding and manipulation in uxn.

uxn can also handle binary words of 16-bits (2 bytes), also known as shorts, by concatenating two consecutive bytes. we'll talk more about this on the second day of the tutorial.

numbers in uxn are expressed using the hexadecimal system (base 16), where each digit (nibble) goes from 0 to 9 and then from 'a' to 'f' (in lower case).

a byte needs two hexadecimal digits (nibbles) to be expressed, and a short needs four.
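if you want to get a feel for these sizes, the following python sketch (python is used here only as a calculator, it is not part of the uxn toolchain) converts between hexadecimal and decimal, and builds a short out of two bytes. the variable names are made up, and placing the high byte first is how this sketch chooses to concatenate them.

```python
# a byte takes two hexadecimal digits, a short takes four
byte_value = 0x68
short_value = 0x0100

print(format(byte_value, "02x"))   # 68
print(format(short_value, "04x"))  # 0100
print(int("0100", 16))             # 256
print(int("ff", 16))               # 255, the largest value of a byte

# a short made of two consecutive bytes: high byte first, then low byte
high, low = 0x68, 0x18
short = (high << 8) | low
print(format(short, "04x"))        # 6818
```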

the uxn cpu

it is said that the uxn cpu is a beet, capable of performing 36 different instructions with three different mode flags.

each instruction along with its mode flags can be encoded in a single word of 8-bits.

all of these instructions operate with elements in the stack, either to get from it their operands and/or to push down onto it their results.

we'll be covering these instructions very calmly over this tutorial.

memory

memory in a uxn computer consists of four separate spaces: the main memory, the i/o memory, the working stack and the return stack.

each byte in the main memory has an address of 16-bits (2 bytes) in size, while each byte in the i/o memory has an address of 8-bits (1 byte) in size. both of them can be accessed randomly.

the first 256 bytes of the main memory constitute a section called the zero-page. this section can be addressed by 8-bits (1 byte), and it is meant for data storage during the runtime of the machine.

there are different instructions for interacting with each of these memory spaces.

the main memory stores the program to be executed, starting at the 257th byte (address 0100 in hexadecimal, or 256 in decimal). it can also store data.

the stacks cannot be accessed randomly; the uxn machine takes care of them.
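the following python sketch models these memory spaces to make the sizes concrete. it is only an illustration under some assumptions: the names are made up, the sizes of the main and i/o memories are derived from the address sizes mentioned above, and the working and return stacks are modeled as plain lists without a size limit.

```python
MAIN_SIZE = 2 ** 16      # 16-bit addresses: 65536 bytes
IO_SIZE = 2 ** 8         # 8-bit addresses: 256 bytes
PROGRAM_START = 0x0100   # the program starts right after the zero-page

main_memory = bytearray(MAIN_SIZE)
io_memory = bytearray(IO_SIZE)
working_stack = []       # assumption: modeled as a list, only instructions touch it
return_stack = []


def in_zero_page(address):
    """the zero-page is the first 256 bytes of the main memory."""
    return 0x0000 <= address <= 0x00ff


def load_rom(rom):
    """copy the bytes of a program into main memory, starting at 0100."""
    main_memory[PROGRAM_START:PROGRAM_START + len(rom)] = rom


load_rom(bytes([0x80, 0x68]))  # two example bytes
print(in_zero_page(0x0080), in_zero_page(PROGRAM_START))  # True False
print(format(main_memory[0x0100], "02x"))                 # 80
```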

instruction cycle

the uxn cpu reads one byte at a time from the main memory.

the program counter is a word of 16-bits that indicates the address of the byte to read next. its initial value is the address 0100 in hexadecimal.

once the cpu reads a byte, it decodes it as an instruction and performs it.

the instruction will normally imply a change in the stack(s), and sometimes it may imply a change of the normal flow of the program counter: instead of pointing to the next byte in memory, it can be made to point elsewhere, "jumping" from a place in memory to another.
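to make this cycle concrete, here is a minimal python sketch of it. it is not a real uxn emulator: it only knows three instructions, and it assumes the opcodes 80 for LIT (push the byte that follows), 18 for ADD and 00 for BRK (stop the program), taken from the uxntal reference. the function name run is made up.

```python
LIT, ADD, BRK = 0x80, 0x18, 0x00  # assumption: opcodes from the uxntal reference


def run(rom):
    memory = bytearray(0x10000)
    memory[0x0100:0x0100 + len(rom)] = rom
    stack = []
    pc = 0x0100                # program counter: initial value
    while True:
        opcode = memory[pc]    # read one byte from main memory
        pc += 1                # point to the next byte
        if opcode == BRK:      # the empty memory after the program is 00
            break
        elif opcode == LIT:
            stack.append(memory[pc])  # push the byte that follows
            pc += 1                   # and skip over it
        elif opcode == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & 0xff)  # keep the result within one byte
        else:
            raise ValueError(f"opcode {opcode:02x} is not handled in this sketch")
    return stack


# LIT 01 LIT 30 ADD: the "1 48 +" from before, in hexadecimal
result = run(bytes([0x80, 0x01, 0x80, 0x30, 0x18]))
print([format(value, "02x") for value in result])  # ['31']
```

note how LIT changes the program counter by itself: that is the simplest case of an instruction altering the normal flow of the program.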

usage, installation and toolchain

to run varvara, you have several options: running it from your web browser, downloading it as a pre-built application, or building it from source yourself.

online

you can experiment with all the materials in the tutorial using the learn-uxn site by metasyn:

learn-uxn by metasyn

for most of the exercises in day 1, you can alternatively use the uxntal playground, which is optimized for text-only work:

Uxntal Playground

desktop bundles, building from source and more

in order to run varvara locally and off the grid we need to get an appropriate emulator.

the 100R website allows you to download the emulators for major desktop systems; these come bundled with a selection of programs in the form of "roms":

100R — uxn

for further instructions on how to run varvara in this way, see uxn running.

uxntal and a very basic hello world

uxntal is the assembly language for uxn machines.

above, we were talking about the uxn cpu and the 36 instructions it knows how to perform, each of them encoded as a single 8-bit word (byte).

uxntal being an assembly language implies that there's a one-to-one mapping of a written instruction in the language to a corresponding 8-bit word that the cpu can interpret.

for example, the instruction ADD in uxntal is encoded as a single byte with the value 18 in hexadecimal (that's what's called its opcode), and corresponds to the following set of actions: take the top two elements from the stack, add them, and push down the result.

in forth-like systems we can see the following kind of notation to express the operands that an instruction takes from the stack, and the result(s) that it pushes down onto the stack:

ADD  ( a b  --  a+b )

this means that ADD takes first the top element 'b', then it takes the new top element 'a', and pushes back the result of adding a+b.

now that we are at it, there's a complementary instruction, SUB (opcode 19), that takes the top two elements from the stack, subtracts them, and pushes down the result:

SUB ( a b  --  a-b )

note that the order of the operands in the subtraction is similar to the order for the division as we discussed above when talking about postfix notation: it is as if we moved the operator from between operands, to the end after the second operand.
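the order of the operands can be checked with a small python sketch that uses a list as the stack. the functions below are named after the instructions only for illustration, and the sketch assumes that results wrap around so that they always fit in one byte.

```python
def ADD(stack):  # ( a b -- a+b )
    b = stack.pop()  # top element
    a = stack.pop()  # new top element
    stack.append((a + b) & 0xff)


def SUB(stack):  # ( a b -- a-b )
    b = stack.pop()
    a = stack.pop()
    stack.append((a - b) & 0xff)


stack = [0x30, 0x02]
ADD(stack)
print(format(stack[-1], "02x"))  # 32

stack = [0x30, 0x02]
SUB(stack)
print(format(stack[-1], "02x"))  # 2e

stack = [0x02, 0x30]             # same operands, opposite order
SUB(stack)
print(format(stack[-1], "02x"))  # d2 (wrapped around to fit in a byte)
```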

also, note that in uxntal, text in between parentheses is a comment, i.e. it is used for documentation purposes.

a first program

let's use learn-uxn to write our first program:

learn-uxn by metasyn

press the "new" button or delete the code that you find there, and then, write the following code:

( hello.tal )
|0100 LIT 68 LIT 18 DEO LIT 0a LIT 18 DEO

press the "assemble" button, and look at the box in the bottom-right corner!

scroll up within that box to see several messages, some of them tagged with [web], others with [asm] and lastly some tagged with [emu].

the [asm] message, corresponding to the output of the assembler that read the code and converted it into a rom, will look something like this:

Assembled output.rom in 10 bytes(0.02% used), 0 labels, 0 macros.

then, the first [emu] message should read something like this:

h

interesting, what is happening?

i invite you to try replacing the 68 in the code with, for example, 65. then, assemble and run the program again.

what was the difference in the output now?

one instruction at a time

we just ran the following program written in uxntal:

( hello.tal )
|0100 LIT 68 LIT 18 DEO LIT 0a LIT 18 DEO

now let's analyze it!

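the first line is a comment: comments are enclosed between parentheses, and there have to be spaces between the parentheses and the text inside them. as in other programming languages, comments are ignored by the assembler.

the second line has several things going on: |0100 tells the assembler that what follows is located at address 0100 of the main memory, the initial value of the program counter; LIT and DEO are instructions; and 68, 18 and 0a are numbers written in hexadecimal.

reading the program from left to right, we can see the following behavior: the first LIT pushes the number 68 down onto the stack, and the next LIT pushes the number 18. DEO then takes the top element (18) as a device address and the next element (68) as a value, and outputs that value to that device address, leaving the stack empty. the same sequence is then repeated with the value 0a.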

we're talking about a device address, 18, but what does it mean?

looking at the devices table from the varvara reference, we can see that the device with address 1 in the high nibble is the console (standard input and output), and that the column with address 8 in the low nibble corresponds to the "write" port.

XXIIVV — varvara console device

so, device address 18 corresponds to "console write", or standard output.

our program is sending the hexadecimal values 68 (character 'h') and 0a (newline) to standard output!
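as an illustration, this python sketch splits a device address into its two nibbles and imitates what DEO does with the console in our program. it is a simplification with made-up names: the only port it knows about is console write (device 1, port 8), and the console is modeled as a list of characters.

```python
def split_device_address(address):
    """return (device, port): the high nibble and the low nibble."""
    return address >> 4, address & 0x0f


def deo(stack, console):
    """imitate DEO: take an address and a value from the stack, and output the value."""
    address = stack.pop()  # top element: device address
    value = stack.pop()    # next element: value to output
    if split_device_address(address) == (1, 8):  # console write
        console.append(chr(value))


stack, console = [], []
for value in (0x68, 0x0a):
    stack.append(value)  # LIT 68, then LIT 0a
    stack.append(0x18)   # LIT 18
    deo(stack, console)  # DEO

print(split_device_address(0x18))  # (1, 8)
print(repr("".join(console)))      # 'h\n'
```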

you can see the hexadecimal values of the ascii characters in the following table:

ascii table

note that sending the newline might or might not be needed for the 'h' to appear in the console, depending on the emulator that is being used.

raw numbers

note that the raw numbers that we wrote, 0100, 18, 68 and 0a, are written in hexadecimal using either 4 digits corresponding to two bytes, or 2 digits corresponding to one byte.

in uxntal we can only write numbers that are 2 or 4 hexadecimal digits (nibbles) long. if, for example, we were only interested in writing a single hexadecimal digit, we would have to include a 0 at its left.

assembled rom

when we assembled our program, we saw that it was 10 bytes in size.

if we looked at the numerical contents of the rom, we would see something like this:

80 68 80 18 17 80 0a 80 18 17 

80 is the "opcode" corresponding to LIT, and 17 is the opcode corresponding to DEO. and there you can see our 68, 18 and 0a!

so, our assembled program matches one-to-one the instructions we just wrote!

actually, we could have written our program using these hexadecimal numbers, i.e. the machine code, and it would have worked the same way:

( hello.tal )
|0100 80 68 80 18 17 80 0a 80 18 17  ( LIT 68 LIT 18 DEO LIT 0a LIT 18 DEO )

maybe it's not the most practical way of programming, but indeed it's a fun and beautiful one :)
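the one-to-one correspondence can also be followed in the opposite direction. this python sketch is a tiny "disassembler" that turns the bytes of our rom back into text; it only knows the two opcodes we have used so far, and the function name is made up for the example.

```python
MNEMONICS = {0x80: "LIT", 0x17: "DEO"}  # only the two opcodes used so far


def disassemble(rom):
    words = []
    i = 0
    while i < len(rom):
        name = MNEMONICS.get(rom[i])
        if name is None:
            words.append(format(rom[i], "02x"))  # unknown byte: show it raw
        else:
            words.append(name)
            if name == "LIT":  # the byte after LIT is data, not an instruction
                i += 1
                words.append(format(rom[i], "02x"))
        i += 1
    return " ".join(words)


rom = bytes.fromhex("80 68 80 18 17 80 0a 80 18 17")
print(len(rom))          # 10
print(disassemble(rom))  # LIT 68 LIT 18 DEO LIT 0a LIT 18 DEO
```

the check for LIT matters: without it, the 18 that follows a LIT would be mistaken for an ADD instruction.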

you can find the opcodes of all 36 instructions in the uxntal reference:

XXIIVV - uxntal

hello program

we could expand our program to print more characters:

( hello.tal )
|0100 LIT 68 LIT 18 DEO ( h )
      LIT 65 LIT 18 DEO ( e )
      LIT 6c LIT 18 DEO ( l )
      LIT 6c LIT 18 DEO ( l )
      LIT 6f LIT 18 DEO ( o )
      LIT 0a LIT 18 DEO ( newline )

if we assemble and run it, we'll now have a 'hello' in our console, using 30 bytes of program :)

ok, so... do you like it? does it look straightforward? maybe unnecessarily complex?

we'll look now at some features of uxntal that make writing and reading code a more "comfy" experience.

runes, labels, macros

runes are special characters that indicate to the assembler some pre-processing to do when assembling our programs.

absolute pad rune

we already saw the first of them: | defines an "absolute pad", i.e. the address where the next written items will be located in the main memory.

if the address is 1-byte long, it is assumed to be either an address of the i/o memory space or the zero-page.

if the address is 2-bytes long, it is assumed to be an address for the main memory.

literal hex rune

let's talk about another one: #.

this character defines a "literal hex": it is basically a shorthand for the LIT instruction.

using this rune, we could re-write our first program as:

( hello.tal )
|0100 #68 #18 DEO #0a #18 DEO

the following would have the same behavior as the program above, but using two fewer bytes (on the next day of the tutorial we'll see why):

( hello.tal )
|0100 #6818 DEO #0a18 DEO

note that you can only use this rune to write the contents of either one or two bytes, i.e. two or four nibbles.

important: remember that this rune (and the others with the word "literal" in their names) is a shorthand for the LIT instruction. this implies that uxn will push these values down into the stack.

if we just want to have a specific number in the main memory, without pushing it into the stack, we would just write the number as is, "raw". this is the way we did it in our first programs above.
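to see what the assembler does with this rune, here is a python sketch of a very reduced assembler. it is not how the real assembler is written: it only understands LIT, DEO, raw numbers and the # rune, and it assumes that a literal of four digits is assembled with the opcode a0 (the version of LIT for shorts), which is a peek ahead at the next day of the tutorial.

```python
LIT, LIT2, DEO = 0x80, 0xa0, 0x17  # assumption: a0 is LIT for shorts


def assemble(source):
    rom = bytearray()
    for token in source.split():
        if token == "LIT":
            rom.append(LIT)
        elif token == "DEO":
            rom.append(DEO)
        elif token.startswith("#"):  # literal hex: add LIT, then the number
            digits = token[1:]
            if len(digits) not in (2, 4):
                raise ValueError("literal hex needs 2 or 4 digits: " + token)
            rom.append(LIT if len(digits) == 2 else LIT2)
            rom.extend(bytes.fromhex(digits))
        else:                        # raw number: goes into the rom as is
            rom.extend(bytes.fromhex(token))
    return bytes(rom)


a = assemble("LIT 68 LIT 18 DEO LIT 0a LIT 18 DEO")
b = assemble("#68 #18 DEO #0a #18 DEO")
c = assemble("#6818 DEO #0a18 DEO")
print(a == b, len(b), len(c))                # True 10 8
print(" ".join(format(x, "02x") for x in c)) # a0 68 18 17 a0 0a 18 17
```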

raw character rune

this is the raw character or string rune: "

the assembler reads the ascii character after the rune, and decodes its numerical value.

using this rune, our "hello program" would look like the following:

( hello.tal )
|0100 LIT "h #18 DEO
      LIT "e #18 DEO
      LIT "l #18 DEO
      LIT "l #18 DEO
      LIT "o #18 DEO
      #0a #18 DEO ( newline )

note the "raw" in the name of this rune indicates that it's not literal, i.e. that it doesn't add a LIT instruction by itself.

that's why we need to include a LIT instruction.
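the decoding done by this rune can be reproduced in python with the built-in ord function. this is just a sketch to compare values against the ascii table, and raw_char is a made-up name.

```python
def raw_char(token):
    """return the byte for a token like "h (the rune followed by one character)."""
    if len(token) != 2 or token[0] != '"':
        raise ValueError("expected the rune and a single character: " + token)
    return ord(token[1])


print(format(raw_char('"h'), "02x"))  # 68

hello = [format(raw_char('"' + c), "02x") for c in "hello"]
print(hello)                          # ['68', '65', '6c', '6c', '6f']

# LIT "h ends up as the same two bytes as LIT 68
print(bytes([0x80, raw_char('"h')]) == bytes([0x80, 0x68]))  # True
```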

runes for labels

even though right now we know that #18 corresponds to pushing the address (18) of the console write device port down onto the stack, for readability and future-proofing of our code it is a good practice to assign a set of labels that would correspond to that device and port.

the rune @ allows us to define labels, and the rune & allows us to define sub-labels.

for example, for the console device, the way you would see this written in uxntal programs for the varvara computer is the following:

|10 @Console &vector $2 &read $1 &pad $5 &write $1 &error $1

here, we can see an absolute pad to address 10 (the console device): the items that follow are assigned starting at that address. because the address consists of one byte only, once we use the DEO instruction, uxn understands that it refers to the i/o memory space.

then we see a label @Console: this label is assigned to address 10.

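next we have several sub-labels, indicated by the & rune, and relative pads, indicated by the $ rune. how do we read and interpret them? each sub-label names the current address, and each relative pad skips the given number of bytes before the next item. so &vector is at address 10 and is followed by 2 bytes, &read is at address 12, &pad is at address 13 and is followed by 5 bytes, &write is at address 18, and &error is at address 19.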

none of this would be translated to machine code, but aids us in writing uxntal code.
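the bookkeeping that the assembler does with these runes can be imitated with a short python sketch. it is an illustration only, with a made-up function name, and it handles just the four runes that appear in the console line: |, @, & and $.

```python
def resolve_labels(source):
    labels = {}
    address = 0
    parent = ""
    for token in source.split():
        rune, rest = token[0], token[1:]
        if rune == "|":    # absolute pad: move to this address
            address = int(rest, 16)
        elif rune == "@":  # label: name the current address
            parent = rest
            labels[parent] = address
        elif rune == "&":  # sub-label: name the current address under the label
            labels[parent + "/" + rest] = address
        elif rune == "$":  # relative pad: skip this many bytes
            address += int(rest, 16)
    return labels


line = "|10 @Console &vector $2 &read $1 &pad $5 &write $1 &error $1"
for name, address in resolve_labels(line).items():
    print(name, format(address, "02x"))
# Console 10
# Console/vector 10
# Console/read 12
# Console/pad 13
# Console/write 18
# Console/error 19
```

no bytes are produced in the process: the result is only a table of names and addresses, and Console/write ends up at the address 18 that we were writing by hand.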

the rune for referring to literal addresses in the zero page or i/o address space is . (dot), and a / (slash) allows us to refer to one of its sub-labels.

remember: as a "literal address" rune it will add a LIT instruction before the corresponding address :)

we could re-write our "hello program" as follows:

( hello.tal )

( devices )
|10 @Console &vector $2 &read $1 &pad $5 &write $1 &error $1

( main program )
|0100 LIT "h .Console/write DEO
      LIT "e .Console/write DEO
      LIT "l .Console/write DEO
      LIT "l .Console/write DEO
      LIT "o .Console/write DEO
      #0a .Console/write DEO ( newline )

now this starts to look more like the examples you might find online and/or in the uxn repo :)

macros

following the forth heritage, in uxntal we can define our own "words" as macros that allow us to group and reuse instructions.

during assembly, these macros are recursively replaced by the contents in their definitions.

for example, we can see that the following piece of code is repeated many times in our program:

.Console/write DEO

we could define a macro called EMIT that will take from the stack a byte corresponding to a character, and print it to standard output.

for this, we need the % rune, and curly brackets for the definition.

don't forget the spaces!

( macro: print a character to standard output )
%EMIT { .Console/write DEO } ( character -- )

in order to call a macro, we just write its name:

( print character h )
LIT "h EMIT

we can call macros inside macros, for example:

( print a newline )
%NL { #0a EMIT } ( -- )
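the recursive replacement of macros can be sketched in python as follows. this is a simplified model and not the code of any actual assembler: macros are stored in a dictionary, the program is a list of words, and the function name expand is made up.

```python
MACROS = {
    "EMIT": [".Console/write", "DEO"],
    "NL": ["#0a", "EMIT"],
}


def expand(tokens, macros=MACROS):
    """replace every macro name by its definition, recursively."""
    result = []
    for token in tokens:
        if token in macros:
            result.extend(expand(macros[token], macros))  # macros inside macros
        else:
            result.append(token)
    return result


print(" ".join(expand(["LIT", '"h', "EMIT", "NL"])))
# LIT "h .Console/write DEO #0a .Console/write DEO
```

a macro that used its own name in its definition would never finish expanding in this sketch.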

note that macros are a helpful way of grouping and reusing code, especially when beginning to learn uxntal. for more advanced uses, macros are replaced by other strategies.

for that reason, some uxntal assemblers, like the one in the Uxntal Playground, don't allow their use.

a more idiomatic hello world

using all these macros and runes, our program could end up looking like the following:

( hello.tal )
( devices )
|10 @Console &vector $2 &read $1 &pad $5 &write $1 &error $1

( macros )
( print a character to standard output )
%EMIT { .Console/write DEO } ( character -- )
( print a newline )
%NL { #0a EMIT } ( -- )

( main program )
|0100 LIT "h EMIT
      LIT "e EMIT
      LIT "l EMIT
      LIT "l EMIT
      LIT "o EMIT
      NL

it ends up being assembled into the same 30 bytes as the examples above, but it is hopefully more readable and maintainable.

we could "improve" this program by having a loop printing the characters, but we'll study that later on :)

exercises

EMIT reordering

in our previous program, the EMIT macro is called just after pushing a character down onto the stack.

how would you rewrite the program so that you push all the characters first, and then "EMIT" all of them with a sequence like this one?

EMIT EMIT EMIT EMIT EMIT

print a digit

if you look at the ascii table, you'll see that the hexadecimal ascii code 30 corresponds to the digit 0, 31 to the digit 1, and so on up to 39, which corresponds to the digit 9.

ascii table

define a PRINT-DIGIT macro that takes a number (from 00 to 09) from the stack, and prints its corresponding digit to standard output.

%PRINT-DIGIT {    } ( number -- )

remember that the number would have to be written as a complete byte in order to be valid uxntal. if you wanted to test this macro with e.g. number 2, you would have to write it as 02:

#02 PRINT-DIGIT

instructions of day 1

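these are the instructions we covered today: ADD ( a b -- a+b ), SUB ( a b -- a-b ), LIT, which pushes the byte that follows it down onto the stack, and DEO, which outputs a given value to a given device address.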

day 2

well done! hope you had a great start today!

in uxn tutorial day 2 we start exploring the visual aspects of the varvara computer: we talk about the fundamentals of the screen device so that we can start drawing on it!

however, i invite you to take a little break before continuing! :)

support

if you enjoyed this tutorial and found it helpful, consider sharing it and giving it your support :)
