Let's take a quick look at implementations for some common tasks in Hoon. I assume that you want to use library tools whenever possible, but you can delve into the source code itself if you want to know more.
Sorting
- Given a
list
of values, sort the values according to a given rule and produce alist
.
The standard ++sort
gate implements a quicksort. You need to provide a comparison function as a gate as well.
> =/ my-list `(list @ud)`~[10 9 8 1 2 3 7 6 4 5](sort my-list gth)~[10 9 8 7 6 5 4 3 2 1]> =/ my-list `(list @ud)`~[10 9 8 1 2 3 7 6 4 5](sort my-list lth)~[1 2 3 4 5 6 7 8 9 10]
If you are sorting something more complicated than atoms, you'll need a comparison gate that handles pairwise values and returns truthness.
E.g. given a data structure like [@ud @tas]
, a cell which we wish to sort on the tail, we could sort like this:
> =/ my-list `(list [@ud @tas])`~[[1 %a] [2 %c] [3 %b] [4 %d]](sort my-list |=([a=[@ud @tas] b=[@ud @tas]] (lth +.a +.b)))~[[1 %a] [3 %b] [2 %c] [4 %d]]
- Given a
set
of values, sort the values according to a given rule and produce alist
.
If something isn't a list
, the easiest way to sort it is to convert it to a list
first and then sort that.
> =/ my-set (silt ~[1 10 10 1 2 3 3 2 4 9 8 5 7 6 8 9])=/ my-list ~(tap in my-set)(sort my-list lth)~[1 2 3 4 5 6 7 8 9 10]
Bitwise Operations
Bitwise operations are necessary when you are closely packing data into binary formats or otherwise dealing with binary data. Basically there are three scenarios:
- Using packed binary data, e.g. bit packing or NaN-boxing↗. Urbit supports bitwise operations on arbitrarily-sized atoms.
- Working with MIME-type data. Urbit has standard library support for yielding and parsing these, but it's sometimes a bit confusing where they may be located.
- Directly processing particular kinds of data streams, like audio or video data. On Urbit, you'll be serving these or interfacing with an external service. (Remember, Urbit is more like a server than a GPU.)
Binary Operations
If you are working with packed binary data, you'll typically print the atom data with a @ux
unsigned hexadecimal aura.
We'll use 'Wild Hearts Can\'t Be Broken'
as our source atom. As @ux
the ASCII components are clearly visible.
> `@ux`'Wild Hearts Can\'t Be Broken'0x6e.656b.6f72.4220.6542.2074.276e.6143.2073.7472.6165.4820.646c.6957
Here are a few ways to slice and dice binary atom data.
++rip
disassembles an atom b
into 2^a
-sized chunks as a list
.
> =/ text 'Wild Hearts Can\'t Be Broken'(rip 3 text)~[87 105 108 100 32 72 101 97 114 116 115 32 67 97 110 39 116 32 66 101 32 66 114 111 107 101 110]> =/ text 'Wild Hearts Can\'t Be Broken'`(list @ux)`(rip 3 text)~[0x57 0x69 0x6c 0x64 0x20 0x48 0x65 0x61 0x72 0x74 0x73 0x20 0x43 0x61 0x6e 0x27 0x74 0x20 0x42 0x65 0x20 0x42 0x72 0x6f 0x6b 0x65 0x6e]> =/ text 'Wild Hearts Can\'t Be Broken'`(list @ux)`(rip 6 text)~[0x6165.4820.646c.6957 0x276e.6143.2073.7472 0x6f72.4220.6542.2074 0x6e.656b]
++rap
is the opposite operation, producing an atom from a list of atoms b
with a block size 2^a
.
> `(list @ux)`(rip 3 'Mars')~[0x4d 0x61 0x72 0x73]> `@t`(rap 3 ~[0x4d 0x61 0x72 0x73])'Mars'> `@ux`(rap 3 ~[0x4d 0x61 0x72 0x73])0x7372.614d> `@ux`(rap 6 ~[0x4d 0x61 0x72 0x73])0x73.0000.0000.0000.0072.0000.0000.0000.0061.0000.0000.0000.004d
++cut
slices 2^a
-sized chunks from b
to c
out of atom d
.
> =/ text 'Wild Hearts Can\'t Be Broken'`@ux`(cut 3 [0 4] text)0x646c.6957> =/ text 'Wild Hearts Can\'t Be Broken'`@t`(cut 3 [0 4] text)'Wild'
(There are many more of these types of operations available as well.)
Standard bitwise logical operations are available:
OR
is++con
(conjoint):> `@ub`(con 0b10.0001.0110 0b11.1101.1011)0b11.1101.1111AND
is++dis
(disjoint):> `@ub`(dis 0b10.0001.0110 0b11.1101.1011)0b10.0001.0010XOR
is++mix
:> `@ub`5340b10.0001.0110> `@ub`9870b11.1101.1011> `@ub`(mix 534 987)0b1.1100.1101NOT
is++not
; it requires a block size (powers of 2) because leading zeroes are always stripped in Hoon (and thusNOT
is implicitly based on block size):> `@ub`(not 2 1 0b1011)0b100> `@ub`(not 3 1 0b1011)0b1111.0100> `@ub`(not 4 1 0b1011)0b1111.1111.1111.0100
MIME Data Operations
MIME data types↗ allow HTTP communications to include rich content. The ++html
core in the standard library provides quite a few operations for encoding and decoding MIME-typed values.
Data transmitted as bytes (“octets”↗) are decoded using ++as-octs:mimes:html
. This is transparent for ASCII text data:
> (as-octs:mimes:html '<html><body>')[p=12 q=19.334.824.924.412.244.887.090.784.316]> `[@ @ux]`(as-octs:mimes:html '<html><body>')[12 0x3e79.646f.623c.3e6c.6d74.683c]> `[@ @t]`(as-octs:mimes:html '<html><body>')[12 '<html><body>']
Perhaps surprisingly, many text conversion operations live here. To parse a hexadecimal value as a string, for instance, use ++de:base16:mimes:html
:
> (de:base16:mimes:html 'deadbeef')[~ [p=4 q=3.735.928.559]]> `(unit [@ @ux])`(de:base16:mimes:html 'deadbeef')[~ [4 0xdead.beef]]
There are operations for JSON, Base58↗, and XML/HTML as well.
Urbit furthermore has a notion of jammed noun, which is essentially a serialization (marshaling, pickling) of a noun into an atom.
Primes and Factors
To calculate a prime number, the tried-and-true method is the Sieve of Eratosthenes, which is an elimination algorithm. In other words, we need to be able to calculate factors of a given positive integer. We're actually going to do an even simpler (and less efficient) method here, and leave the Sieve as an exercise to the reader.
This gate accepts a number and divides it by every number from half the number down to 2. If the remainder after division is zero, then it is a factor and we add it to the front of the list of factors.
|%++ factors|= n=@ud ^- (list @ud)=/ ctr (div n 2)=/ fxr *(list @ud)|- ^- (list @ud)?: =(1 ctr) fxr?: =(0 +:(dvr n ctr))$(ctr (sub ctr 1), fxr [ctr fxr])$(ctr (sub ctr 1))--
Now that we can find factors, it should be straightforward to find primes. In this case, we simply check each value up to n
and see if it has any factors (other than itself and 1, of course).
|%++ factors|= n=@ud ^- (list @ud)=/ ctr (div n 2)=/ fxr *(list @ud)|- ^- (list @ud)?: =(1 ctr) fxr?: =(0 (mod n ctr))$(ctr (sub ctr 1), fxr [ctr fxr])$(ctr (sub ctr 1))++ primes|= n=@ud ^- (list @ud)=/ ctr n=/ prm *(list @ud)|- ^- (list @ud)?: =(1 ctr) prm?: =(~ (factors ctr))$(ctr (sub ctr 1), prm [ctr prm])$(ctr (sub ctr 1))--
- How would you change this algorithm to the more efficient Sieve of Eratosthenes?
Pragmatic Input/Output
While Hoon has a sophisticated text parsing library, the primitives are rather low-level and we won't assume that you want to directly implement a parser using them in a rapid-fire competitive environment.
- Text Processing III - This module will cover text parsing.
- Parsing Text
Fortunately, there is a regular expression library you can incorporate into your program which will allow you to match and work with code.
Functional Operators
Hoon is a functional programming language, so naturally it supports the basic collective operations which functional programming languages expect.
Curry
Currying describes taking a function of multiple arguments and reducing it to a set of functions that each take only one argument. Binding, an allied process, is used to set the value of some of those arguments permanently. Hoon has a left-bind ++cury
and a right-bind ++curr
.
> =full |=([x=@ud a=@ud b=@ud c=@ud] (add (add (mul (mul x x) a) (mul x b)) c))> (full 5 4 3 2)117> =one (curr full [4 3 2])> (one 5)117
Map
The Map operation describes applying a function to each item of a set or iterable object, resulting in the same final number of items transformed. In Hoon terms, we would say slamming a gate on each member of a list
or set
. The standard library arms that accomplish this include ++turn
for a list
, ++run:in
for a set
, and ++run:by
for a map
.
> (turn `(list @ud)`~[1 2 3 4 5] one)
Reduce
The Reduce operation applies a function as a sequence of pairwise operations to each item, resulting in one summary value. The standard library arms that accomplish this are ++roll
and ++reel
for a list
, ++rep:in
for a set
, and ++rep:by
for a map
.
> =my-set (silt `(list @ud)`~[1 2 3 4 5])> (~(rep in my-set) mul)b=120
Filter
The Filter operation applies a true/false function to each member of a collection, resulting in some number of items equal to or fewer than the size of the original set. In Hoon, the library arms that carry this out include ++skim
, ++skid
, ++murn
for a list
, and ++rib:by
for a map
.
> `(list @ud)`(skim `(list @ud)`~[1 2 3 4 5] (curr gth 2))~[3 4 5]
An interesting feature of Hoon is that it really prefers functions-that-produce-functions, which feels very functional once you get used to the idiom. Here you can see that done with ++curr
.
- Functional Programming - This module will discuss some gates-that-work-on-gates and other assorted operators that are commonly recognized as functional programming tools.
Floating-Point Operations
@ud
unsigned decimal operations are, of course, postive-integer-only. For floating-point maths, you will need to work with @rs
for 32-bit single-precision IEEE 754 floating-point atoms. These are prefixed with a single .
which is not a decimal point.
> .100.100> .1e2.100> .1e-4.1e-4> .0.0001.1e-4
Equivalent mathematical operations for @rs
values are available in the ++rs
door:
> (add:rs .1 .2).3> (mul:rs .5 .0.6).0.3> (div:rs .10 .3).3.3333333
- Mathematics - This module introduces how non-
@ud
mathematics are instrumented in Hoon.
(I picked the above set of examples after perusing the excellent book Antti Laaksonen (2017) Guide to Competitive Programming: Learning and Improving Algorithms Through Contests↗.)
Debugging and Troubleshooting
Static typing with compile-time type checking turns out to be a secret strength of Hoon. Once you've satisfied the typechecker, your code is often surprisingly free of bugs (compared to e.g. Python).
There are three basic things that tend to go wrong:
- Syntax error, general (just typing things out wrong, for instance in a way Dojo would prevent)
- Syntax error mismatching rune daughters (due to
ace
/gap
or miscounting children) - Type issues (
need
/have
, notoriously)
This last case can be handled with a couple of expedients:
Frequent use of
^-
kethep/^+
ketlus to make sure that types match at various points in the code.This has two benefits: it verifies your expectations about what values are being passed around, and it means that mismatches raise an error more closely to the source of the error.
(Since Hoon checks type at build time, this does not incur a computational cost when the program is running.)
Use of assertions to enforce type constraints. Assertions are a form of
?
wut rune which check the structure of a value. Ultimately they all reduce back to?:
wutcol, but are very useful in this sugar form:?>
wutgar is a positive assertion, that a condition must be true.?<
wutgal is a negative assertion, that a condition must be false.?~
wutsig is a branch on null.
For instance, some operations require a
lest
, alist
guaranteed to be non-null (that is,^- (list) ~
is excluded).> =/ list=(list @) ~[1 2 3]i.list-find.i.listfind-forkdojo: hoon expression failed?~
wutsig solves the problem for this case:> =/ list=(list @) ~[1 2 3]?~ list !!i.list1In general, if you see an error like
find.fork
, it means that the type system is confused by your use of a too general of a type for a particular case. Use the assertion runes to correct its assumption.Testing Code - This module will discuss how we can have confidence that a program does what it claims to do, using unit testing and debugging strategies.