A Tour of Vada
Vada is an idiosyncratic language with diverse influences. The design patterns that inform its architecture can feel strange, especially for those who are only used to C-like languages. This tour covers some of the whats and whys of Vada without getting too deep into the implementation details, in order to (hopefully) de-strange it a bit.
Under Construction. Expect a Mess.
What Exactly is Vada?
The elevator pitch for Vada is that it’s a data language “designed to take the pain out of composing, validating, and transforming data artifacts.”
The concrete use case Vada focuses on, however, is wrangling the specific kinds of aggregate entity data artifacts you commonly see in game development.
This means Vada is built for folks who need to write and manage diverse data sets comprised of large table-like lists inside of deeply nested hierarchical structures, where records are spread across multiple files in multiple directories and upwardly aggregated00-1 to form output artifacts.
It also means that Vada is built for cross-disciplinary teams, where senior developers need to provide cleanly siloed schemas and tools to designers, artists, and translators in ways that minimize processorial footguns.
And perhaps most importantly, it means that Vada is built for environments where tools need to be hackable, extensible, and shippable.
At the output level, Vada handles like a database.
At the record level, Vada handles like a config language.
At the expression level, Vada handles like an unholy merger of Elixir and MATLAB.
At the parser level, Vada handles like an easily pluggable data evaluation toolkit.
Vada is a lot of things.
Core Concepts
Vada doesn’t try to be a programming language, which means it contains a mountain of weird and a small dash of the familiar.
Before we get too deep into the details, though, it’s important to establish the common terms you’ll see thrown around throughout this tour. Here’s our core vocab:
- A Vada project is called an Archive.
- Archives are comprised of Collections.
- A Collection can be as small as a definition in a file, or spread across multiple files in a directory.
- Collections are comprised of Sections.
- Sections can be comprised of Records, Schemas, Transforms, Accessors, and sub-Sections.
- A Record is a unit of data, comprised of a Container, one or more Primitives, and zero or more Representations.
- Containers determine how data is encapsulated.
- Primitives determine the data’s underlying ‘Type’.
- Representations determine how data is serialized.
- A Schema is an unreified Record that can take input, encapsulating a series of constraints into something like a Type constructor.
- A Transform is Vada’s version of a function.
- Sections function as both semantic groups and as capture groups. They do a lot of heavy lifting throughout Vada.
- Every Archive and Collection has a Manifest, a special Section that defines its imports and interfaces.
- Manifests can be defined as Sections, or with `__manifest.Vada` files (for directories).
- Manifests are directional, with separate interfaces for parent, sibling, and child collections.
- This directionality is what lets Vada mix implicit and explicit imports.
Value Reification
What is the difference between `a: 4` and `b: !int & >3 & <5`? The former explicitly assigns `4` to `a`, while the latter limits `b`’s acceptable values to integers greater than three and less than five. But are they equivalent?
While it can help to think of constraint expressions as miniature validator functions when you’re first starting out, the truth is that Vada considers constraints to be no different than any other value or expression. Every expression in Vada is, fundamentally, a description of a set and all set descriptions are values.
The difference between `4` and `!int & >3 & <5` isn’t that one is a value and the other is a constraint; they’re both expressions that describe sets, and they’re even descriptions of the same set. The difference is their concreteness. Both describe `4`, but where `a` is a concrete description of `4`, `b` is an abstract description of `4`.
If you wanted to compare `a` and `b`, you’d first have to extract a list of `b`’s possible values from its upper and lower bounds, and then collapse that list to a uniform in order to derive a value that has the same concreteness and container as `a`.

In other words, you’d have to reify `b` in order to compare it to `a`.
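To make that concrete, here’s a minimal sketch. The `#reify` transform below is a hypothetical stand-in for the reification step just described (the actual spelling lives in the main docs); everything else uses the constraint syntax already shown:

a: 4
b: !int & >3 & <5
// b's bounds admit exactly one integer, so reifying it collapses the set to a concrete 4
bConcrete: b ~> #reify    // hypothetical transform name
eq: a == bConcrete        // @true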
A core pillar of Vada’s design is the assertion that all valid expressions produce sets which are members of a partially ordered superset that forms the language’s type lattice.00-2
Value reification is the process of converting an upper (abstract) member of the lattice into an equivalent lower (concrete) member of the lattice by explicitly enumerating the abstract member’s inner members into an appropriate container that reflects its topology.
Reifying an abstract value is comparable to traversing the greater type lattice to find the abstract value’s concrete pair, affirming its position in a secondary lattice of concretely-representable types. Abstract values that can’t be reified (e.g. infinite expressions such as `>0` and `!=2`) have no concrete pairs, and are therefore excluded from the secondary lattice.
A consequence of Vada’s system of value reification is that a lot of what you’d typically think of as behavior in other languages is (unreified) data in Vada. This is especially true for Vada’s control flow, where things like switch statements are lists of expressions that can be passed around and lazily reified.
Semantic Interfaces
Another pillar of Vada’s design is the use of semantic interfaces (Sections) to create parallel access patterns.
someRecord: {
--- SectionA:
fieldA: 123
fieldB: "abc"
fieldC: (0.2, -0.05, 1.3)
--- SectionB:
fieldD: true
fieldE: "jimmy"
}
Sections group records together without nesting them inside a struct. In the record above, all of the fields can be accessed directly as members or through their Sections; `fieldA`, for instance, is accessible through both `someRecord.fieldA` and `someRecord::SectionA::fieldA`.

A Section is something like a cross between an interface and a soft namespace, augmented with a grab-bag of things like inherent topology, captures, and field constraints.
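For example, using the record above:

direct:    someRecord.fieldA            // 123
sectioned: someRecord::SectionA::fieldA // 123, the same field through its Section
flagged:   someRecord::SectionB::fieldD // true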
First-Class Spatial Data
If you’re a MATLAB fan, Vada’s syntax for multi-dimensional uniforms should feel fairly familiar, as it uses comma-separated rows and semi-colon-separated columns, extended with a sensible adjacent-circumfix syntax for stacking the outermost axis.00-3
Vada is zero-indexed, but its indexing syntax differs slightly from most other languages; rather than the typical `someVar[]` syntax, Vada uses `someVar.[]`, treating indices as members.
This syntax applies to uniforms, lists, and some fancy stuff like struct filtering and record destructuring expressions. The grammar’s a bit odd on the face of it, but it lets you do some cool stuff once you get used to it.
It’s also worth noting that Vada uses axis indexing, where the dimensionality of the index expression matches the dimensionality of the record and slices are rotated.
AKA, this:
someCube: ( 1.1, 2.1, 3.1;
4.1, 5.1, 6.1;
7.1, 8.1, 9.1 )
( 1.2, 2.2, 3.2;
4.2, 5.2, 6.2;
7.2, 8.2, 9.2 )
( 1.3, 2.3, 3.3;
4.3, 5.3, 6.3;
7.3, 8.3, 9.3 )
// Rotated from XZ to XY
xPlane: someCube.[0][ ][ ] // 1.1, 2.1, 3.1;
// 1.2, 2.2, 3.2;
// 1.3, 2.3, 3.3
// Rotated from YZ to XY
yPlane: someCube.[ ][0][ ] // 1.1, 4.1, 7.1;
// 1.2, 4.2, 7.2;
// 1.3, 4.3, 7.3
// Inherently XY oriented
zPlane: someCube.[ ][ ][0] // 1.1, 2.1, 3.1;
// 4.1, 5.1, 6.1;
// 7.1, 8.1, 9.1
Combined with prefix inversion `~someVec`, prefix transposition `%someVec`, the infix wedge product `someVec ^^ anotherVec`, the infix cross product `someVec ** anotherVec`, and piped swizzle-reshape `someVec->(1,0,2)`, Vada’s approach to multi-dimensional data removes mountains of boilerplate code for game development.
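Here’s a quick sketch of those operators in action. The results follow the equivalent transforms listed later in this tour (`#inverse`, `#transpose`, `#cross`); the swizzle result assumes the tuple lists member indices:

someVec:  (3.0, 1.0, 2.0)
inverted: ~someVec                     // (0.333, 1.0, 0.5), member-wise inverse
column:   %someVec                     // (3.0; 1.0; 2.0), transposed to a column
crossed:  someVec ** (4.0, -2.0, 6.0)  // (10.0, -10.0, -10.0)
swizzled: someVec->(1, 0, 2)           // (1.0, 3.0, 2.0), members re-ordered by index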
While 3d→2d projections are relatively easy to grok, higher-dimension rotations are… less so.
hCube:
{
[ (1.0, 2.0, 3.0)(4.0, 5.0, 6.0)(7.0, 8.0, 9.0) ]
[ (1.1, 2.1, 3.1)(4.1, 5.1, 6.1)(7.1, 8.1, 9.1) ]
[ (1.2, 2.2, 3.2)(4.2, 5.2, 6.2)(7.2, 8.2, 9.2) ] }
{
[ (1.3, 2.3, 3.3)(4.3, 5.3, 6.3)(7.3, 8.3, 9.3) ]
[ (1.4, 2.4, 3.4)(4.4, 5.4, 6.4)(7.4, 8.4, 9.4) ]
[ (1.5, 2.5, 3.5)(4.5, 5.5, 6.5)(7.5, 8.5, 9.5) ] }
{
[ (1.6, 2.6, 3.6)(4.6, 5.6, 6.6)(7.6, 8.6, 9.6) ]
[ (1.7, 2.7, 3.7)(4.7, 5.7, 6.7)(7.7, 8.7, 9.7) ]
[ (1.8, 2.8, 3.8)(4.8, 5.8, 6.8)(7.8, 8.8, 9.8) ] }
// Strictly speaking, all single-index slices of multi-dimensional uniforms
// can be seen as k-blades (pseudovectors) of the uniform's vector space.
xMat: hCube.[0][ ][ ][ ] // [ (1.0, 2.0, 3.0)(1.1, 2.1, 3.1)(1.2, 2.2, 3.2) ]
// [ (1.3, 2.3, 3.3)(1.4, 2.4, 3.4)(1.5, 2.5, 3.5) ]
// [ (1.6, 2.6, 3.6)(1.7, 2.7, 3.7)(1.8, 2.8, 3.8) ]
yMat: hCube.[ ][0][ ][ ] // [ (1.0, 1.1, 1.2)(4.0, 4.1, 4.2)(7.0, 7.1, 7.2) ]
// [ (1.3, 1.4, 1.5)(4.3, 4.4, 4.5)(7.3, 7.4, 7.5) ]
// [ (1.6, 1.7, 1.8)(4.6, 4.7, 4.8)(7.6, 7.7, 7.8) ]
zMat: hCube.[ ][ ][0][ ] // [ (1.0, 2.0, 3.0)(4.0, 5.0, 6.0)(7.0, 8.0, 9.0) ]
// [ (1.3, 2.3, 3.3)(4.3, 5.3, 6.3)(7.3, 8.3, 9.3) ]
// [ (1.6, 2.6, 3.6)(4.6, 5.6, 6.6)(7.6, 8.6, 9.6) ]
wMat: hCube.[ ][ ][ ][0] // [ (1.0, 2.0, 3.0)(4.0, 5.0, 6.0)(7.0, 8.0, 9.0) ]
// [ (1.1, 2.1, 3.1)(4.1, 5.1, 6.1)(7.1, 8.1, 9.1) ]
// [ (1.2, 2.2, 3.2)(4.2, 5.2, 6.2)(7.2, 8.2, 9.2) ]
// The .[] grammar makes it easy to chain index expressions,
// such that you can extract sub-slices without having to
// rely on a mess of range expressions
tVec: hCube.[ ][ ][ ][0]
.[ ][ ][0]
.[0][ ] // (1.0, 2.0, 3.0)
tVec: hCube.[0][ ][0][0]
TL;DR: All the data axes. Literally. All of them. Slice it all the ways.
Sensible Data Serialization
Where data is generally immutable during evaluation, the serialization process is more flexible, as the way you structure your data for evaluation is decoupled from how you structure it for export.
Data representations can be handled at the schema, section, and archive level, letting you build up your serialization model piece-by-piece, override it where necessary, and handle it as modularly (or non-modularly) as you want — all while maintaining topological guarantees.
Critically, Vada’s serialization process is designed to facilitate data flattening, turning hierarchical data and lists of structs into space-efficient and access-efficient binary archives utilizing structs of arrays and common data tables.
Where common numeric representations fall short, Vada lets you define custom bit-packing schemas that (optionally) export memory-safe unpacking instructions as a part of the binary archive’s header table. This enables rapid iteration in development environments by alleviating the need to regenerate the consuming applications’ unpacking code, and the selective inclusion of dynamic deserialization in production00-4.
And it can do JSON/YAML/CSV stuff. Because reasons.
TL;DR: Vada targets the gaming industry, where stuff gets encoded in all sorts of weird mixed-precision ways and folks end up engineering weird kinds of data polymorphism via bitvecs and integer packing so they can abuse manually aligned structs and not refactor their data pipelines.
Vada’s approach to packing (especially with the `!raw` primitive and its various non-decimal numeric literals) lets you indulge in all of that in a way that’s readable and maintainable.
Is it a whole pile of stuff that you probably shouldn’t do in the first place? Come on, man. Don’t call out Blizzard like that.
And because Vada knows how headache-inducing data packing can be, it gives you a ton of ways to handle it.
Directional Data Propagation
While Vada is heavily inspired by CUE, it handles scoping and propagation in a very different way.
Where CUE uses pure downwards propagation with siloed directories (in which there is no cross-branch propagation of data or constraints), Vada uses directional interfaces to allow cross-propagation from siblings without creating unresolvable circular dependencies.
The TL;DR on directional interfaces is that collections can differentiate what they expose for export for each direction they export it in (e.g. to-parent, to-child, to-sibling).
This lets users establish cross-dependencies while preventing circular dependency footguns by presenting separate silos of data and selectively flattening siblings through single-child upwards interfaces.
Spoilers: 99% of the time, these data propagation details will only matter if you’re deliberately doing weird stuff.
Downward interfaces propagate implicitly. Upwards and horizontal interfaces propagate explicitly. All interfaces let you keep things clean by re-arranging and filtering data.
It might sound complicated, but it’s simple in practice.
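As a purely illustrative sketch (the directional field names below are hypothetical; the real manifest grammar is covered in the main docs), a collection’s manifest might expose different slices in each direction:

--- @manifest {
    @toParent:  [$summarySchema, totals]   // flattened upwards into the parent
    @toSibling: [sharedConstants]          // explicit horizontal exports
    @toChild:   [$baseSchemas, defaults]   // propagated implicitly downwards
}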
Full Toolchain Incremental Compilation
Vada’s evaluator is written in a way that lets the language server and CLI share an incremental compute cache — every time semantic tokens are updated in your IDE, the intermediate data representation used to generate output files is updated as well.
And since Vada is deterministically evaluated, compute caches can be remotely calculated and shared.
This shared cache setup can be leveraged in a variety of CI/CD pipelines. It’s deliberately left as an open-ended opportunity for hackery.
- “Upwards propagating” meaning the opposite of CUE’s root-to-leaf downwards propagation model.↩
- Just… Don’t worry about it.↩
- Adjacent circumfixes are interpreted as the ‘next’ axis of the list or uniform, such that `(1)(2)(3)` is equivalent to `(1,2,3)` and `(1,2,3)(4,5,6)` is equivalent to `(1,2,3;4,5,6)`. This also works recursively for higher-dimension data; 3D data can be written as `(1,2;3,4)(5,6;7,8)`, 4D data can be written as `( (1,2;3,4)(5,6;7,8) )( (1,2;3,4)(5,6;7,8) )`, and so on.↩
- Yes, you can stranger danger yourself with this. Be smart about it.↩
Expressions
Vada’s expression syntax is fairly unique, as it’s built around a family of flexible pipe operators that you won’t see in most other languages. And since many of Vada’s other grammatical quirks are tied to these operators, understanding them is an important part of understanding Vada as a whole.
Expression Structure
Vada’s formal BNF expression grammar is long and tedious and no one actually knows how to read it.
That said, the average Vada user should know that its expression structure is engineered for clarity, even when that means diverging from the general C-like patterns most programmers are used to.
The primary way Vada does this is through clause patterns that arise from its low-precedence Outer Pipe operator and the equally low-precedence open transform grammar.
Combined with the Implicit (a context variable similar to `Self` in other languages, but for the clause-level expression value), they make it easy to perform imperative sequences of operations within a single expression, while also limiting the number of circumfixes needed to clarify precedence.
A secondary effect of all of this is that Vada is fairly easy to read at small font sizes. I’m not saying it’s a language made for old people, but, well… It’s a language made for old eyes.
someData: [...!num::dec]
sorted: someData ~> #filter: > 0
~> #iter: *2 ?> _ * 2 | _ + 4
~> #sort: >_
Operators
While you don’t need to understand all of the inner details of Vada’s expression parsing in order to get the most out of it, knowing the basics can help you understand the overall design.
Vada uses a modified Pratt parsing system for its entire expression system. Name-paths, function calls, lists — it’s Pratt parsed all the way down, which means we can express operator precedence and operator associativity through two tables of left-right binding power tuples.
High binding power operators are processed before low binding power operators. Operators with a `(low, high)` pattern are left-associative, while those with a `(high, low)` pattern are right-associative.
With that in mind, here are the main infix operators:
Op Group | Binding Power | Comments |
---|---|---|
`~>` | (0, 0) | Outer Pipe |
`, ;` | (1, 2) | Sequence Delineators |
`=` | (3, 4) | Alias Bind |
`\| \|:` | (5, 6) | Disjunction |
`&` | (7, 8) | Conjunction |
`> >= < <= == !=` | (9, 10) | Equality Operators |
`+ -` | (11, 12) | Standard Math |
`* / ^ %` | (13, 14) | Standard Math |
`** // ^^ %% !!` | (13, 14) | Spicy Math |
`.& .\| .^ .~ .< .>` | (15, 16) | Bitwise Ops |
… | … | Where the lower unary operators fit in power-wise |
`-> ?>` | (19, 20) | Inner Pipe and If Pipe |
`.. ..= =..` | (21, 22) | Range expressions are effectively operands |
… | … | Signs, Inverse, and Transpose prefixes |
`. ::` | (∞, ∞) | Accessors are always the tightest-binding operators |
The main prefix operators:

Op Group | Binding Power | Comments |
---|---|---|
`> >= < <=` | (0, 17) | Value Constraints |
`* // ^` | (0, 17) | Value Constraints |
`~= !=` | (0, 17) | Composition Constraints |
`.. ..=` | (0, 17) | Upper Range |
`...` | (0, 17) | Spread Op |
`+ - ~ %` | (0, 23) | Signs, Inverse, Transpose |
And the postfix operators:

Op Group | Binding Power | Comments |
---|---|---|
`.. =..` | (17, 0) | Lower Range |
The operators with binding powers lower than the standard arithmetic operators can be thought of as ‘clause delineating’ operators, while the ones with higher binding powers can be seen as ‘operand modifying’ operators. It’s also worth noting that the different binding powers of the conjunction and disjunction operators allow for disjunctions of conjunctions (of arithmetic expressions, etc.), and that the high binding power of the bitwise operators allows for easy ad-hoc flag math — there’s some nuance here that’s bloody painful to discuss without mentioning Pratt parsing.
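As a small illustration of why those relative powers matter, here’s a sketch using the constraint syntax covered below:

// & (7, 8) binds tighter than | (5, 6), so this is a disjunction of conjunctions...
validSize: !int & >0 & <10 | !int & *16
// ...and reads the same as the explicitly-grouped form:
validSize2: (!int & >0 & <10) | (!int & *16)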
Control Flow?
Similar to how Rust’s control flow statements are technically sub-expressions (with their own interior grammar), Vada provides control flow through compositions involving a general-purpose operator: the if-pipe `?>`.

The if-pipe pushes the left operand to the right operand’s Implicit, and then yields the right operand if the left is truthy. Combined with a disjunction, it forms a simple ternary expression: `someBool ?> val | fallback`

Where if-pipes show real utility, however, is when they’re used recursively. An if-pipe expression that yields a list or uniform of if-pipe expressions functions as a multi-match, while a disjunction of if-pipe expressions functions as an `if...else if...` sequence.

Combined with their utility for safe path navigation — `someVar ?> _::memberA ?> _` — and nullability handling, if-pipes are powerful tools.
$someEnum: !flag[Red; Green; Blue; Yellow]
someBool: @false
currState: (someBool == @true) ?> $someEnum::Green | $someEnum::Red
matchVar: 4 + currState ?> (
_::Red ?> 1;
_::Green ?> 2;
_::Blue ?> 3;
_::Yellow ?> 4;
@null ?> 0)
eq: matchVar == 6 // @true
If-pipes (and sinks) are the magic ingredients that enable the almost-Turing-complete type generation shenanigans that are referenced throughout these docs. They’re kinda nuts, but they’re also really useful when you want to do things like inline re/de-structuring.
What’s With The `a: >3 & *2` Stuff?
TL;DR: Vada lets you use a bunch of math operators as constraint operators, if you write them as prefixes instead of infixes.
In most languages, operators such as `>` and `<` are equality operators; they’re used to compare two operands and return a Boolean. In Vada, `>` and `<` (and many other operators) can also be used as constraint operators01-2, which only have one (right-hand) operand and return a limit.
While there’s a whole academic conversation that can be had about constraint conjunction and lattice evaluation, the main thing to extract from it is that constraint expressions are like little field-validators, and they can be used to build up types and restrictions on records and schemas without having to write a bunch of boilerplate code.
Is the syntax a little weird? Yes. Is it worth the weirdness? Also yes. Check out these (somewhat facile) examples:
Constraint Expression | Description | Equality Expression |
---|---|---|
`*N` | Limit to multiples of N | `Val % N == 0` |
`/N` | Limit to divisors of N | `N % Val == 0` |
`>N` | Limit to greater than N | `Val > N` |
`<N` | Limit to less than N | `Val < N` |
`M..N` | Limit to half-open bounds M and N | `Val >= M && Val < N` |
`A \| B \| C` | Limit to one of A, B, or C | `Val == A \|\| Val == B \|\| Val == C` |
`A & B \| A & C` | Limit to either A and B or A and C | `(Val == A && Val == B) \|\| (Val == A && Val == C)` |
Vada’s constraint syntax is only marginally more compact than traditional equality statements for simple expressions, but it’s significantly more concise for more complex expressions.
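For instance, a few of the constraints from the table above can be combined into a single field (a quick sketch):

// an integer that is either a multiple of 3 in [0, 100) or any negative even number
spawnWeight: !int & (*3 & 0..100 | <0 & *2)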
Sequences and Circumfixes
A sequence expression is composed of a series of comma- and/or semi-colon-delineated sub-expressions. Commas delineate horizontal sequences, while semi-colons delineate vertical sequences.
A circumfix expression is a normal expression wrapped in a pair of enclosing characters that indicate the container primitive it evaluates to. The three primary circumfixes are ()
, []
, and {}
, which signify uniforms, lists, and structs, respectively.
An un-enclosed sequence is evaluated as a list by default, and superfluous circumfixes are ignored when the container type of the inner expression is unambiguous, such that `{([[1,2,3]])}` would evaluate the same as `[1,2,3]`01-1, which means you can use all three primary circumfixes the way you’d use parentheses in other languages, so long as the innermost operands are properly enclosed.
A good way to think of this behavior is that circumfixes constrain, but they don’t coerce.
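A few small examples of those rules in practice:

bare: 1, 2, 3            // un-enclosed sequence, evaluates as the list [1, 2, 3]
grid: (1, 2; 3, 4)       // commas run horizontally, semi-colons vertically: a 2x2 uniform
same: {([[1, 2, 3]])}    // superfluous circumfixes are ignored, same as [1, 2, 3]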
Literals
Vada has a lot of literals, because there are a lot of ways to represent data.
Numeric Literals
Vada allows underscores in all numeric literals. They have no semantic value; however, they can be used as visual delineators to improve readability.
Name | Example | Comments |
---|---|---|
Integer Literal | `1` | |
Decimal Literal | `1.0` | |
Binary Literal | `0b1001_0010` | No inherent length limit. |
Hex Literal | `0xfa_20_22` | |
Because self-control is a foreign concept for me, Vada has a dynamic-base numeric literal. Here’s the regex for it:

([0][2-9]|[1-5][0-9]|6[0-4])a_([a-zA-Z0-9]+[a-zA-Z0-9_]*)(\.[a-zA-Z0-9]+[a-zA-Z0-9_]*)?

(Yes, that’s a non-integer arbitrary base literal.) The number in the first capture group is interpreted as the base for the number represented by the second and third capture groups, such that `02a_0101` is equivalent to `0b0101`, `10a_1000` is equivalent to `1000`, `36a_5f` is equivalent to `195`, and so on.
String Literals
Vada keeps strings simple by only using backticks for enclosing characters and by directly plagiarizing Rust’s raw string syntax.
Name | Example | Comments |
---|---|---|
Escaped String | `` `A String` `` | |
Raw String | `` r#`A String`# `` | |
Keywords
One of the first things folks notice about Vada is that it uses a ton of sigils (leading symbols on variable names), with all four primary sigils showing up in even simple examples:
someRecord: @Ext::std::unitTypes::$phys & {
name: "Jimbo"
classes:
[...{name: !str level: !int & >0}]
& {"Fighter", 4}, {"Mage", 2}
level: classes->#reduce(cls; @sum cls[1]) / classes->#count
}
Vada uses sigils to limit namespace pollution, optimize code-completion, and ensure expression clarity. Sigils also help users extrapolate from the code in front of them without having to rely on IDE features, and make it easier to search the documentation; `@if` and `if` are two very different things.
At first glance sigils can feel like clutter, and concepts like sigil hoisting — shifting a terminal sigil to the start of a section path to get some extra juice out of IDE auto-complete, e.g. `$::std::unitTypes::phys` — and sigil filtering — ending a section path with a bare sigil to import/access the matching subset, e.g. `@import: std::unitTypes::#` — are kinda funky.
Where the semantic value of sigils (and their related access patterns) really starts to shine, however, is when you’re working with large archives that have substantial dependency trees and complex schemas.
In those cases, the visual noise is outweighed by the benefits that come from being able to instantly know reference types without having to rely on IDE features. Sigils add clarity (and let us pull off some neat parsing tricks).
And when you get right down to it, the number of sigils that Vada uses is fairly modest, and what they signify is pretty, well, significant, too.
Sigil | Role | Example |
---|---|---|
`@` | Keywords and Control Flow | `@if @else @repr` |
`!` | Type Primitives and Containers | `!int !flags !list` |
`$` | User Types/Schemas | `$someSchema` |
`#` | Transforms | `#filter #map #reduce` |
If it has a leading `@` it’s always a noun. If it has a leading `!` it’s always a built-in type or container. If it has a leading `$` it’s always a user-created schema. If it has a leading `#` it’s always a verb. If it doesn’t have a sigil, it’s just a plain old reference.
Important Keywords
The `@` sigil group is fairly large, as it contains a ton of both topological and operational keywords that are frequently used. Here’s a brief list of the ones you’ll see most commonly:
Keyword | Context | Description |
---|---|---|
`@Collection` | File Topology | Mid-Level namespace |
`@Local` | Section | Private namespace equivalent |
`@Public` | Section | Public namespace equivalent |
`@repr` | Section | Output packing data |
`@Doc::[short, long]` | Universal | Docstrings |
`@True, @False` | Constants | Boolean Constants |
`@None, @Abstract, @Concrete` | Constants | Reification Constants |
The less common keywords are listed in the main docs, but their meanings tend to be easily guessable from the contexts they’re used in. They’re all noun-ish, they’re all consistent.
The other sigiled tokens are covered in the related sections below.
Pipes and Transforms
Pipes are arrow-shaped binary infix operators — `~>` and `->` — which pass the left operand into the expression namespace of the right operand by binding its value to the Implicit, a sort of contextual reference that can be accessed via the `@Implicit` keyword or the idiomatic `_`.

The Outer Pipe `~>` has the lowest binding power among the binary operators, binding whole expressions to whole expressions, while the Inner Pipe `->` has the second-highest binding power (the path operators `.` and `::` have the highest), binding immediate operands to immediate operands. Pipes are used to break long expressions into sequences, to pass data into transforms and lambdas, and as a general tool for structuring expressions04-1.
Transforms are Vada’s equivalent to functions. Transforms operate upon the Implicit and are parsed in one of two ways:
- As an unary prefix, when followed by a colon (low-bind) or a circumfix expression (high-bind).
- As an operand (unary transforms).

This means `#T(...)`, `#T[...]`, and `#T{...}` are all syntactically valid high-bind transforms.
In all circumfix expressions, superfluous circumfixes are ignored; `{([1,2,3])}` is treated the same as `[1,2,3]`04-2, which lets you pass full argument lists as variables without having to worry about whether the right circumfix is used.
Of the three constructions, the columnar list syntax — `#T[arg1; arg2]` — is the most common syntax you’ll see for multi-argument transforms.
Transforms have equal binding power to the Outer Pipe when they are used with a colon — e.g. `... ~> #someTransform: ... ~>` — allowing them to be used without parentheses but requiring an Outer Pipe to break the expression, and equal binding power to the Inner Pipe when used without a colon.
It might seem strange to have two pipe operators and two ways of calling functions, however this duality ultimately promotes clarity in complex expressions by letting users elide parentheses in a readable and predictable way. Because the binding powers in question bracket the binding powers of the more common operators, the impact they have on precedence is unambiguous and the net result is a cleaner expression syntax.
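Here’s a rough sketch of the two calling forms side by side, built from transforms that appear elsewhere in this tour:

scores: [3, -1, 4, 0, 9]
// colon (low-bind) form: each clause runs until the next Outer Pipe
ranked: scores ~> #filter: > 0
               ~> #sort: >_
// circumfix (high-bind) form: binds like the Inner Pipe, so it can sit mid-expression
avg: scores->#reduce(@sum) / scores->#count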
Common Transforms
Uniform Transforms | Description | Example |
---|---|---|
`#mag` | Returns the scalar magnitude of the LHS | `(3, 1, 2) ~> #mag // 3.741` |
`#normalize` | Returns unit norm of the LHS | `(3, 1, 2) ~> #normalize // (0.802, 0.267, 0.534)` |
`#inverse` | Returns the inverse of the LHS | `(3, 1, 2) ~> #inverse // (0.333, 1, 0.5)` |
`#transpose` | Returns the transpose of the LHS | `(3, 1, 2) ~> #transpose // (3; 1; 2)` |
`#sign` | Returns the signed unit of the LHS | `(-3, 1, -2) ~> #sign // (-1, +1, -1)` |
`#cross` | Returns the cross product of the LHS and RHS | `(3, 1, 2) ~> #cross (4, -2, 6) // (10, -10, -10)` |
`#dot` | Returns the inner product of the LHS and RHS | `(3, 1, 2) ~> #dot (4, -2, 6) // 22` |
`#wedgeReal` | Returns the exterior product (in ℝ) of the LHS and RHS04-3 | `(3, 1, 2) ~> #wedgeReal (4, -2, 6) // (-10, 10, 10)` |
`#wedgeArb` | Returns the exterior product of the LHS and RHS | `(3,1,2) ~> #wedgeArb[(1,0,0; 0,1,0; 0,0,1); (4,-2, 6)] // ...` |
`#reshape` | Returns a member-wise copy of the LHS using the RHS layout | `(1,2;3,4;5,6) ~> #reshape(3;2) // (1,2,3; 4,5,6)` |
`#iter` | It iterates, you goblin. | |
Don’t worry if that last one, `#wedgeArb`, looks confusing; the RHS columnar list is topologically equivalent to the struct `{ basis: (1,0,0; 0,1,0; 0,0,1) operand: (4,-2, 6) }`.
Struct Transforms | Description | Example |
---|---|---|
`#open` | Creates an open copy of the LHS struct | `SomeRec: {a:1 b:2}~>#open` |
`#fork` | Copies the LHS struct, with overrides from the RHS struct | `$SomeSch: $SchA~>#fork{a?: !string }` |
`#valueList` | Creates a list containing the LHS struct’s field values | `{a:1, b:2, c:"word"}~>#valueList // [1, 2, "word"]` |
`#nameList` | Creates a list containing the LHS struct’s field names | `{a:1, b:2, c:"word"}~>#nameList // ['a', 'b', 'c']` |
`#fork` and `#open` are incredibly powerful transforms for “resetting” structs and schemas passed down from parent modules. The downsides to abusing them are obvious. Discipline is suggested.
List Transforms | Description | Example |
---|---|---|
`#iter` | It iterates, you goblin. | |
`#filter` | Filter elements from LHS that don’t match RHS constraint | `[1,2,3,4,5,6] ~> #filter: _ % 2 == 0` |
`#reduce` | Sums LHS with RHS expression | `[100, 50, 25] ~> #reduce:` |
`#reshape` | Returns a copy of the LHS arranged with the RHS expr | `[1,2;3,4;5,6] ~> #reshape:` |
`#reduce` is roughly comparable to an ad-hoc `!sink`, in that it offers all of the same options in terms of striding the data and shaping the output.
Transforms, Side-Effects, and You
You can technically use @Local
and some schema polymorphism, combined with smartly designed sinks, to model side-effect-like behaviors. But doing that requires you to rebuild simple fields as sinks and feed them using factory schemas that produce intermediary schemas which function as algebraic data types.
You can even go so far as to expose a top-level state as a sink that’s propagated downwards via the Implicit and then use topological weighting to effectively scope write access. But at that point you’d have a distributed state tree so goddamned glorious that you could probably get grant funding for it.
And it technically wouldn’t produce side effects, just side-effect-like behavior, because you’re technically feeding that top sink with non-reified values and modeling behavior branches on partial collapse-reductions of the top sink’s contents via the n-ary sinks you recursively ascend through on eval.
So, yes, you can have side effects if you want to one-shot your way to a tenure track position at a university somewhere.
Field Expressions (Captures)
In unreified structs, field names can be defined through generative captures.
- In addition to unwrapping to the lowest circumfix, Vada also lowers anonymous structs to columnar lists, as they’re topologically comparable under the hood. Conversely, it doesn’t lower lists to uniforms, as doing that willy-nilly is computationally intensive at-scale.↩
- You can also use those operators in equality statements; Vada is smart enough to interpret them as infix operators when there’s a left operand present.↩
Data Types
Vada’s overall type system is composition-like, both in the sense that the conjunction of constraints is comparable to class composition and in the sense that any one ‘type’ can be broken down into a composite of primitives, containers, and constraint expressions (and @repr configs).
This means that Vada’s list of built-in types (primitives + containers) is short and simple, and that the consequential interactions of those types are a gigantic wibbly ball of set theory that’s hard to grok without hands-on experience.
Primitives
Primitives are relatively simple: they’re what you typically think of when discussing types. Strings, integers, decimal numbers, Booleans — Vada’s list of primitives is short, as most of the complexity comes from the layers built on top of them.
Primitive | Subtypes | Description |
---|---|---|
`!str` | `simple, slug, disp` | UTF-8, with lots of magic. |
`!num` | `int, dec` | Signs, precision, etc., are constraints |
`!flag` | `set, exclusive` | All the bitvec fun. With recursion. |
`!raw` | `char, byte, arb` | Arbitrarily sized raw bit sequences. |
`!bool` | | It’s a fucking bool. |
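A quick sketch of the primitives in use, written with the constraint syntax shown elsewhere in this tour:

name:   !str::slug & != ""
hp:     !num::int & 0..
status: !flag[Poisoned; Stunned; Burned]
mask:   !raw::byte
alive:  !bool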
String Primitives
Vada’s strings are complicated, but in a good way.
Numeric Primitives
Flag Primitives
The `!flag` primitive is a generalization of C-style enums that can be used in both bitwise and set operations, with additional enhancements for hierarchical sets.

Vada’s flags are specifically designed to fulfill the role that bitvec-like flag sets do in game engines.
$someFlagset: !flag <~ {
red,
// flags can be left open, allowing for downstream customization
green: {...},
blue: {
cyan,
turquoise,
royal
}
}
Raw Primitives
Raw primitives are primarily used for two things:
- Output encoding
- Hackery
Containers
Vada treats containers as multi-dimensional by default, wherein scalar containers are simply one-by-one matrices.
This makes Vada particularly handy for the sorts of linear algebra data mashing that is common in game development, but it means that container configuration can be a bit hard to grok without context.
Container | Description | Use Case |
---|---|---|
`!uniform` | Fixed Multi-Dimensional, Single Primitive | Scalars, Vectors, Matrices, Rotors, Etc. |
`!list` | Dynamic Multi-Dimensional, Multi-Primitive | Arrays, Tuples, Complex Numbers, Etc. |
`!struct` | Dynamic Hierarchical, Multi-Primitive, Named Members | Yes |
`!sink` | Idempotent Delayed-Eval Accumulators | Monads, Type Factories, Fuzzy State, Etc. |
In short, the `!uniform` container is for everything that is, well, uniform. Scalars are uniform because they’re scalar. The only difference between a scalar, a vector, and a matrix is the number of members to which the uniformity applies.

Meanwhile, the `!list` container is specifically built for, well, list iteration, and the `!struct` container is built for semantically structured and hierarchical data.
Uniform Containers
We explored some of the coolness of uniforms up in the Core Concepts section, but the focus there was primarily on uniform numeric literals rather than the uniform container itself.
The uniform container is Vada’s atomic container.
TODO: Expand
List Containers
A list is a struct with anonymous members. That’s the long and short of it.
Take a look at this struct:
someStruct: {
name: "jim"
age: 21
hobbies: [
{ name: "stamp collecting", experience: 4 },
{ name: "kyaking", experience: 2 },
...
]
stats: {
strength: 4
agility: 6
intelligence: 3
luck: 8
charm: 1
}
}
If we were to access it as a list with `::@values`, recursively flattening the member values into rows, we’d get:
someList: [
"jim";
21;
[ ["stamp collecting"; 4], ["kyaking"; 2], ... ];
[4; 6; 3; 8; 1]
]
And if we were to extract a schema from that list with `#asAbstract`, we’d get:
$someTopology: [
!string;
!num::int;
[ [!string; !num::int], ... ];
[!num::int, !num::int, !num::int, !num::int, !num::int]
]
Which we could then use as an anonymous constraint for other structs, whereby `otherStruct::@values & $someTopology` would constrain `otherStruct`’s members purely by position.

You can also do `someStruct::@fields & otherStruct::@fields` to perform a name constraint without creating a duplicate struct full of `someField: @any` constraints.
This topological equivalence is both an inherent side-effect of Vada’s design and an intentional feature of lists. Lists are iterable, filterable, bulk-constrainable jagged arrays. They lack the any-axis slicing of uniforms, and they aren’t as ergonomic as structs for topologically complex records, but the position they occupy as the middle-point between those two containers makes them incredibly useful.
Constraining Lists
The main bit of magic for constraining lists is the sequence prefix, `...`.

`...` functions similarly to JavaScript’s spread operator, flattening the prefixed operand into the outer container. Where it gets fancy is the ways in which you can do math with sequences:
// Concrete Spreads
simpleSpread: [ ...[1, 2, 3], ...[4, 5, 6] ] // [1, 2, 3, 4, 5, 6]
repeatedSpread: [ ...[1, 2, 3] * 2 ] // [1, 2, 3, 1, 2, 3]
interleavedSpread: [ ...[1, 2, 3] + ...[4, 5, 6] ] // [1, 4, 2, 5, 3, 6]
// Abstract Spreads
abstractRepeat: ...!num * 5 // [!num, !num, !num, !num, !num]
abstractRange: ...!num & 1..4 // [!num] | [!num, !num] | [!num, !num, !num]
// Fancy
CombinedSpreads: [...!num & 0..6, ...!string & 0..6] & 1..8 // 0-5 numbers, 0-5 strings, for a minimum of 1 and a maximum of 7 items total.
DisjunctSpreads: [ ...(!num & 0..) * 5 | ...(!num & ..0) & 1..6] // Either 5 numbers greater than zero or 1-5 numbers less than zero.
Vada goes hard on list constraints. This is just the tip of the iceberg.
Struct Containers
If you manage to create a substantial Vada archive that isn’t primarily composed of structs, you’re probably doing something weird.
What the hell is a Section?
A Section is a semantically distinct sub-group of a struct’s members. Sections are used to organize fields, apply multi-field constraints, create multi-field captures, and create semantic access patterns. Sections are, simply put, magical.
$characterInfo: {
// Fields in the @local inherent can't be externally referenced
--- @local {
isBoring: !bool & @default -> @true
}
// Simple sections don't need circumfixes. They simply enclose
// the fields that follow them, up until the next section declaration.
--- info
name: !string & != ""
age: !num & 0..
background: !string
// if you want to nest sections, however, you should use a circumfix
--- classInfo {
--- combat
classes: [...{name: !string & != "" level: !num::int & 0..21}]
powerLevel: !num::int
combatTitles: [...!string & != ""]
--- secondary
jobs: [...{name: !string & != "" level: !num::int & 0..51}]
jobTitles: [...!string & != ""]
}
// You can bulk-constrain sections.
// When you bulk-constrain, the right side of the outermost
// pipe expression is treated as the section body.
--- attributes: !num::int ~> [
strength,
endurance,
agility,
courage
]
// You can further constrain the section body to control
// the number of fields, create field-set disjunctions, etc.
--- skills: {
name: !string
level: !num::int
} ~> [...] & 0..11
// $["meta_\w*"] is a regex-based field name capture, which
// lets you do fancy organizational things with open structs.
--- metaData: $["meta_\w*"] & !string ~> {
...
}
// Anonymous sections can be used to create semi-open structs,
// applying constraints to any additional fields that are added.
--- !string ~> [...]
}
Open Structs vs Closed Structs
Magic.
What the Hell is a Sink?
A Sink is an algebraic container that encapsulates an accumulator (read: a write-only open list) which is transformed when the sink is evaluated (read: unwrapped).
The inputs into a sink’s accumulator and the output of a sink’s evaluation transform are constrained separately, allowing sinks to function as both enums (in the Rust-y sense) and as vaguely cursed not-monads.
Understanding all of the shenanigans you can pull with sinks takes time, but you can use them as algebraic data types and super-disjunctions without diving into the funky bits.
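As a rough sketch of a minimal single-accumulator sink (modeled on the multi-sink example in the next subsection, which is where the `@accumulator`/`@out` spelling comes from; routing the accumulator through the Implicit in `@out` is an assumption):

$tally: !sink <~ {
    @accumulator: [!num, ...]       // write-only open list of numbers
    @out: _ ~> #reduce(@sum)        // collapsed to a single sum when the sink is evaluated
}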
Multi-Sinks
Most of the sinks you’ll create will have multi-dimensional accumulators:
$someSink: !sink <~ {
@accumulator: {
fieldA: [!num, ...]
fieldB: [!num, ...]
fieldC: [!num, ...]
}
@out: fieldA->#sum + fieldB->#sum * fieldC->#mean
}
Structs
Structs are the bread and butter of most Vada archives.
someList: [ "jim"; 21; [["stamp collecting"; 4], ["kyaking"; 2]] ]
are topologically equivalent.
Schemas
Schemas are magical and sigil-worthy because you can pipe values into them (like transforms) and use the values passed via the Implicit to parameterize the constraints of the resultant record, allowing an expressive form of polymorphism that is constraint-friendly without requiring an exhaustive enumeration of sub-types.
Notably, schemas can be initialized by left-piping the initialization value, like so:
$someSchema<~{initArgs:...} & {fields:...}
Which can help maintain clarity in large expressions by keeping the sigiled schema name at the left of the sub-expression, emphasizing the fact that schemas are sink-like in how they consume their init values, and making it easier to auto-complete the topology of the init struct.

Schemas can, of course, be initialized via the typical right-piping as well; however, using a left-pipe is the idiomatic approach.
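As a minimal sketch of that parameterization (the `max` init field and its access through the Implicit are hypothetical; the real mechanics are covered in the Schemas docs):

// a schema whose upper bound comes from its init value
$clamped: !num::int & 0.._::max
strength: $clamped <~ {max: 21}   // accepts 0..20
luck:     $clamped <~ {max: 11}   // accepts 0..10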
Schemas + Sinks = Dynamic Type Generation
A sink schema that produces sink schemas, using recursion to incrementally refine intermediate schemas until a concrete record is produced (which can then still consume more inputs through its accumulator and optionally de-reify back to a schema on eval) can be pretty damn close to Turing complete.
Can a combination of sinks and schemas be used to create type generation tools that let you scaffold out large archives based on a sample dataset and a handful of smart declarations? Yes. Does the utility of the above make the combo any less cursed? No.
With great power comes great responsibility.
Unit vs Non-Unit Numeric Schemas
Numeric constraints don’t prevent you from doing math with differently-constrained operands; as long as the primitives are compatible, it works. And since Vada doesn’t have floats at the eval stage — you can output floats, but the actual math is deterministic — it means that you can do math with just about any combination of numerics you can think of.
Which is where Units come in.
All numerics are technically `[!num, !flagset]` tuples:
$metricLengthFlags: !flagset <~ [ mm, cm, dm, m, dam, km ]
$USCLengthFlags: !flagset <~ [ in, ft, yd, mi ]
!num::unitGroup::metric <~ {
// This pulls $metricLengthFlags into the local namespace for easy reference
@flags: $metricLengthFlags
// Sets the 1-scale unit
@baseUnit: m
// Determines the family's relation to the base unit.
// Intermediate relations are automatically calculated and cached.
@baseScale: (
mm ?> *0.001;
cm ?> *0.01;
dm ?> *0.1;
m ?> *1;
dam ?> *10;
km ?> *1000
)
// Conversions for other unit groups.
// These external equivalencies require explicit units.
@equiv:
!num::unitGroup::USC ?> (
in ?> 2.54::cm;
ft ?> 0.3048::m;
yd ?> 0.9144::m;
mi ?> 1.609344::km | 1609.344::m
)
}
!num::unitGroup::USC <~ {
@flags: $USCLengthFlags
@baseUnit: in
@baseScale: (
in ?> *1;
ft ?> *12;
yd ?> *3;
mi ?> *5280
)
@equiv:
!num::unitGroup::metric ?> (
mm ?> 0.039370::in;
cm ?> 0.39370::in;
dm ?> 3.9370::in;
m ?> 39.370::in | 3.2808::ft;
dam ?> 32.808::ft;
km ?> 0.62137::mi | 1093.6::yd | 3280.8::ft;
)
}
The Vada CLI
Vada’s CLI is designed to work with multiple named variants of the evaluator existing side-by-side, such that archives can specify both the version number and variant name of the evaluator they target. This setup makes it easy for studios to juggle project-specific extensions of the evaluator without having to mess with CLI aliases and/or PATH trickery.
Is this a mildly over-engineered way to implement a CLI? Yes. It is definitely an over-engineered CLI.
The long-term plan for the CLI is to make it even more over-engineered, until we end up with something approaching an actual database CLI, complete with robust CRUD functionality and migration tools.
Why? Because Vada’s core use case involves archives large enough and diverse enough that you kinda need SQL-like broad-scale interactability in order to get things done.
And since Vada’s parser uses concrete syntax trees, it’s relatively easy to interact with Vada archives in a programmatic fashion. So there’s a ton of head-room for CLI tooling.
Archive Configuration
Vada archives are configured through their root manifest. As such, using the CLI to interact with an archive’s config is really just using the CLI as an interface layer. So why do it? Because the CLI can intelligently search and navigate the config in order to help you do things.
External Archive Management
Vada doesn’t ship a full npm-like archive manager, because I don’t have the brain juice for that and it isn’t aligned with Vada’s primary use case.
Instead, Vada ships with tooling for using and maintaining local-network git-aware archive registries. Because, let’s face it, a package manager that can pull the right local branch from the studio’s intranet is more useful than stranger danger-ing yourself with mystery archives from the internet.
The CLI can be used to browse these registries, validate the availability of an archive’s external dependencies, and determine what would be tree shaken from those external archives (and, optionally, internal collections) to generate a standalone archive from the source archive’s manifest.
And it can validate environment variables and build variables that influence how external archive dependencies are handled. Because bloody hell, that’s the bare minimum that a CLI should do for you. The most basic pre-checking you can do. Please, people, have standards.
Output Artifact Management
In addition to running the actual output process, you can also use the CLI to preview, compare, and validate output artifacts.
Output Artifact Hashing
Vada is deterministically evaluated, but you can choose to include non-deterministic data (build time, builder signature, build variables, etc.) in the header of the output artifact. You can also choose to ignore or include those non-deterministic header entries when calculating the hash of an output artifact, which means you can have your cake (and eat it, too) in pipelines that hash-check in their action triggers.
Testing and Validating Archives
Vada’s CLI includes a handful of common archive queries as baked-in commands that can be used for integrity and assumption testing.
A Common Scenario: Managing Text Localizations
So you’re a studio. And you need a data pipeline that makes it easy for translators to contribute translations of entity names and UI text. That pipeline needs to support easy post-release updates (and maybe even OTA updates), because there’s no way you’ll be able to get all of the translation work done before you ship 1.0.
Here’s where Vada can factor in.
Leveraging the !str::disp Primitive
The `!str::disp` primitive is for Display Strings, and comes with some fancy features for templating, as well as the ability to both implicitly and explicitly link them to `@variants` tables.04-1
Using @variants
A `@variants` table can be defined directly in a containing Section, as a part of a collection’s manifest, or through the `@variants` longitudinal inherent collection.
Longitudinal inherents are, by far, the most powerful option for managing localizations, so hit the foldable if you’re not familiar with them.
A longitudinal inherent collection is a collection tree that exists alongside the archive’s actual contents, shadowing its structure and providing access to the matching inherent section that exists in each collection.
Here’s what that might look like:
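As a purely illustrative sketch (file names and extensions here are hypothetical, as is the idea that the inherent tree lives in its own directory), a `@variants` longitudinal inherent might shadow an archive like this:

archive/
├── __manifest.Vada
├── characters/
│   ├── __manifest.Vada
│   └── people.Vada
└── @variants/                 // longitudinal inherent collection shadowing the tree above
    └── characters/
        └── people.Vada        // @variants entries for characters/people.Vada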
Longitudinal inherents serve as a convenient way to separate concerns inside an archive, letting users work inside siloed slices of data in a way that plays nicely with IDEs and code generation tools. While the focus here is on the longitudinal use of `@variants` for localization purposes, the same approach can also be used for other inherents, such as `@repr` and `@manifest`.
A `@variants` table can be created for any record field, but Display Strings have the most support in terms of automated detection and code generation.
@module: PersonData
--- @Imports:
@std::util::$.[$utilSchema]
--- @Local:
$edType: "None" | "Highschool" | "Undergraduate" | "Graduate"
$desc_string: !str & != ""
--- @Public:
$personSchema: $utilSchema & {
@Doc::short: "A basic structure for person data"
--- Basic:
name: $desc_string
age: !int & > 0
address: @Local::$addrStruct | @None // Words!
--- Experience:
employer: (!str & != "") | @None
education: [($edType, $desc_string), ...]
certs: [!str & != "", ...] | @None
#score: certs->#reduce(@sum)
+ education->#reduce: + _ ?>
"None" ?> 0;
"Highschool" ?> 1;
"Undergraduate" ?> 4;
"Graduate" ?> 6
}
- Vada’s approach to Display Strings takes inspiration from Unreal Engine’s `FText` implementation, albeit with a notably different approach to building and interacting with the actual sets of translation data. Regarding Vada’s other string primitives, `!str::simple` is for strings that are just strings, while `!str::slug` is for strings meant to serve as internal, scoped, identifiers. And yes, `!str::slug` is inspired by Unreal’s `FName` type.↩