In day 6 of the 2019 Advent of Code edition, we are presented with a system of objects orbiting in space. All objects orbit the center of mass of the system (modeled as object **COM**), either directly or by orbiting a larger object.

```
COM)B # B orbits COM
B)C # C orbits B
C)D # D orbits C
...
```

These chains of orbits can be solved as a directed graph. It just so happens that the old Unix toolkit has tools to operate on directed graphs. They aren’t designed for the use that we will make of them, but that’s what makes this fun.

## A legacy utility

Directed graphs are useful for modelling systems of dependencies (e.g., packages, build artefacts), which explains the existence of `tsort`

.

Rooted in the way that early Unix linkers handled archive files, this primordial utility performs a topological sort of a directed graph. Its simple output format limits what we can do with it, but internally it builds a graph based on directed arcs (**A → B**, just like our orbits input). It checks for cycles in the graph, then lists all nodes in topological order.

In our scenario of objects orbiting in space, this topological order maps to “from innermost to outermost orbit”.

```
< input \ # Read our problem input
tr ')' ' ' | \ # Rewrite each orbit "A)B" as "A B"
tsort # Sort all space objects
```

```
COM
B
G
C
...
```

This output can provide insight when we are prototyping a solution to the orbits problem, but to fully solve it with `tsort`

alone is a bit of a stretch. We can’t query for specific nodes or relationships: the sequence of ordered nodes is all we get.

That said, I did explore a solution that involves looping over the sorted output with `awk`

to build a table associating each orbit between **A** and **B** with the distance to the center of mass (**COM**). It involves looping over all nodes for each node — and thus has *O(n²)* time complexity — and counting the number of orbits between a node and the center of mass. Snippet attached below for those feeling curious:

**Attachment:** Constructing a table of orbits

```
table=$(tr ')' $'\t' < input)
for o in $(tsort <<< "$table"); do
awk "\$1==\"$o\"" <<< "$table"
done | awk '
{
ns[$2] = ns[$1]+1;
print $0 "\t" ns[$2]
}'
COM B 1
B C 2
B G 2
G H 3
C D 3
D E 4
D I 4
I SAN 5
E F 5
E J 5
J K 6
K L 7
K YOU 7
```

Summing the values in the third column above will give us the result to question one.

However limited, our use of `tsort`

is semantically correct. So we move on to a worse approach.

## The wrong tool for the job

Celebrating its 45th anniversary this year, the `make`

tool has long been a staple of build automation in software projects. Given a description of how each artefact is built, it works by internally building a graph of dependencies in order to determine the right sequence of build steps. For example, if we describe that file `program`

depends on file `source.o`

, which itself depends on `source.c`

, then `make`

will retrace the necessary steps to build every intermediate asset. This description is provided in the form of a `Makefile`

.

What if we represent our orbital system as a Makefile?

```
# Convert input file into Makefile
awk -F ')' '{ printf "%s: %s\n\t# %s\n", $2, $1, $2 }
END { print "COM:" }'
```

```
B: COM
# B
C: B
# C
D: C
# D
...
COM:
```

Now, `make`

can answer the question of which objects a given object orbits, from outermost to innermost:

```
make D
# B
# C
# D
```

So wrong, yet so right.

We can package this into a shell function named `chain`

that takes a space object as argument and returns the chain of orbits for that object.

**Attachment:** Defining `chain`

```
chain() {
awk -F')' '
{ printf "%s: %s\n\t# %s\n", $2, $1, $2 }
END { print "COM:" }
' | make -f- "$1"
}
```

The answer to the first question is the sum of the lengths of all objects’ chains, which we can refactor as the length of the concatenation of all chains:

**Attachment:** Counting all orbital chains

```
# Exclude special objects COM, SAN and YOU from the count.
objs=$(cut -d')' -f2 "$input" | grep -Ev 'COM|SAN|YOU')
for o in $objs; do
chain "$o" < "$input"
done | wc -l
```

This works, but is immensely inefficient, since a new `make`

process is run for every object, and each process needs to recompute the graph. The solution for the sample input (*n = 13*) is returned immediately, but for the proper input (*n = 1714*) the answer takes several minutes to return.

I’ve looked at `make`

ʼs debug output to look for ways to compute these chains in a single run, but nothing stood out. If anyone as willing to waste their time as me finds a way, please let me know!

## A strangely apt comparison tool

The second part of the challenge deals with “orbital transfers”: we are asked to determine how many orbits one has to “hop” between two objects (**YOU** and **SAN**). This amounts to finding the path through the closest common ancestor node between **YOU** and **SAN**.

So let’s *compare* the two chains with our friend `comm`

:

```
chain YOU "$input" > chain-you
chain SAN "$input" > chain-san
comm -3 chain-you chain-san
```

And here is the same but taking advantage of Bash’s process substitution:

`comm -3 <(chain YOU < "$input") <(chain SAN < "$input")`

```
# E
# I
# J
# K
# SAN
# YOU
```

`comm`

compares two files line by line and produces three text columns as output: lines present only in file 1, lines present only in file 2, and lines present in both files. The flag `-3`

tells it to reject lines in the third category.

In our spatial problem, this means that `comm -3`

will reveal the orbital steps between **YOU**, **SAN**, and their common ancestor, while ignoring the common trunk. Looking at the output above, from **YOU** we have to hop inward to **K**, then to **J**, then to **E**, then to our common ancestor (**D**, omitted from the output); from there we hop outward to **I**, then to **SAN**.

To get the correct number of orbital “hops”, we need to exclude **YOU**ʼs and **SAN**ʼ s own direct orbits by subtracting two:

```
transfers() {
comm -3 <(chain YOU <"$1") <(chain SAN <"$1") \
| wc -l \
| dc - <(echo '2 - p') # Subtract YOU and SAN
}
```

That `dc`

trick is just for fun. `dc`

is a stack-based postfix calculator. There are much friendlier ways of obtaining the same result:

**Attachment:** Subtracting 2 from a line count

```
# With `wc -l` and Bash arithmetic
n=$(transfers input | wc -l)
echo $((n - 2))
# With AWK
transfers input | awk 'END { print NR - 2 }'
```

In the end, it’s clear that there are diminishing returns: asking `make`

to return a single chain is relatively elegant, but solving the first part of the challenge with duct tape around `make`

doesn’t scale at all. That said, for the second part, comparing just two chains with `comm`

felt wonderfully easy and semantic; clocking at ~1s, it is *acceptably slow*.

Usurping traditional tools was a lot of fun, and we are left with an extremely concise implementation. See the full source.