Building an inverted index on a large text collection with JSONiq
21 August 2019
We show how JSONiq can be used not only to manipulate JSON input, but also to build a standard inverted index on a text collection and to query it.
Rumble, an engine to run JSONiq on top of Spark
06 June 2019
This post introduces and motivates the Rumble engine, in particular how it addresses the limitations of DataFrames and Spark SQL when the dataset is heterogeneous and nested.
The design and implementation of a lock-free ring-buffer with contiguous reservations
Andrea Lattuada (@utaal) and James Munns (@bitshiftmask), 03 June 2019
This is the story of how James Munns and Andrea Lattuada designed and implemented (in two versions!) a high-perf lock-free ring-buffer for cross-thread communication. If any of those words look scary to you, don't fret: we'll explain everything from the basics.
Academics Should Build Their Own Computers to Advance Systems Research
13 May 2019
Mothy was invited to write a post
for the ACM SIGARCH blog, and decided to talk about building hardware
designed specifically for system software research (as opposed to
running commercial workloads). You can check it out on the ACM SIGARCH blog.
A fork() in the road
20 April 2019
Orran Krieger, and
have written a paper for
Hot Topics in Operating Systems next month about the Unix
fork() system call.
String interning and beyond, in differential dataflow
Frank McSherry (@frankmcsherry), 10 December 2018
Differential dataflow does a great number of interesting bits of data processing, but what about when you want to use complicated types, like strings? In this post we’ll check out how to use differential dataflow to intern strings, replacing them with integer identifiers that will allow the rest of our computation to execute more efficiently. From there, we’ll see how this generalizes to automatically assigning distinct record identifiers to collection elements, much like a database does!
Physical Addressing on Real Hardware in Isabelle/HOL
Lukas Humbel (home), 09 November 2018
Modern memory systems are much more complicated than the traditionally assumed
separation into virtual and physical address spaces. In this post we explain which
effects cannot be expressed by the basic model alone and are important for the
correct functioning of operating systems. We summarize our recent paper, in which
we present a theory for addressing in such modern memory subsystems and
formalize it in Isabelle/HOL.
A hammer you can only hold by the handle
Andrea Lattuada (@utaal), 05 November 2018
Today we’re looking at the Rust borrow checker from a different perspective. As you may know, the borrow checker is designed to safely handle memory allocation and ownership, preventing accesses to invalid memory and ensuring data-race freedom. This is a form of resource management: the borrow checker is tracking who’s in charge of a chunk of memory, and who is currently allowed to read or write to it. In this post, we’ll see how these facilities can be used to enforce higher-level API constraints in your libraries and software. Once you’re familiar with these techniques, we’ll cover how the same principles apply to advanced memory management and handling of other more abstract resources.