Weld is a runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries using a small common intermediate representation, similar to CUDA and OpenCL.
Modern analytics applications combine multiple functions from different libraries and frameworks to build complex workflows. Even though individual functions can achieve high performance in isolation, the performance of the combined workflow is often an order of magnitude below hardware limits due to extensive data movement across the functions. Weld’s take on solving this problem is to lazily build up a computation for the entire workflow, optimizing and evaluating it only when a result is needed.
Weld can increase the performance of existing data analytics frameworks with little integration effort. For example, for Spark, NumPy, and TensorFlow, porting over a few Weld operators can increase performance by up to 30x even on some simple workloads!
Prototype integrations of Weld with Spark (top left), NumPy (top right), and TensorFlow (bottom left) show up to 30x improvements over the native framework implementations, with no changes to users' application code. Cross library optimizations between Pandas and NumPy (bottom right) can improve performance by up to two orders of magnitude.
Weld is developed in the Stanford Infolab.