Stanford Infolab

Weld

Fast parallel code generation for data analytics frameworks. Developed at Stanford University.

What is Weld?

Weld is a compiler and runtime for improving the performance of data-intensive applications. It enables powerful compiler optimizations and automatic parallelization across functions by expressing the core computations in libraries using a small common intermediate representation and a lazy runtime API.
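To give a rough feel for why a lazy API helps, here is a minimal sketch in plain Python. It is not Weld's API or its IR: `LazyVector`, `scale`, and `shift` are hypothetical names invented for this illustration. Each "library function" only records its work, and a single fused pass runs when the result is requested, so no intermediate list is built between calls.

```python
class LazyVector:
    """Toy stand-in for a lazily evaluated vector (hypothetical, not Weld's API)."""

    def __init__(self, data, ops=()):
        self.data = data        # underlying input values
        self.ops = tuple(ops)   # pending elementwise functions, in call order

    def map(self, fn):
        # Record the operation instead of running it right away.
        return LazyVector(self.data, self.ops + (fn,))

    def evaluate(self):
        # Fuse all pending operations into one pass over the data,
        # so no intermediate list is materialized between calls.
        def fused(x):
            for fn in self.ops:
                x = fn(x)
            return x
        return [fused(x) for x in self.data]


# Two independent "library functions"; each only extends the pending computation.
def scale(v, k):
    return v.map(lambda x: x * k)


def shift(v, b):
    return v.map(lambda x: x + b)


result = shift(scale(LazyVector([1, 2, 3]), 10), 1).evaluate()
print(result)  # [11, 21, 31]
```

A real system in this style would compile the recorded computation to optimized parallel code rather than interpret it element by element. The sketch only shows the deferral and fusion idea.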

Weld performance

Weld can improve the performance of workflows such as SQL with Spark SQL, logistic regression with TensorFlow, and data cleaning in NumPy and Pandas.

Getting Started with Weld

The easiest way to use Weld is through one of our library integrations. Grizzly is a Weld-enabled version of Pandas, and WeldNumPy is a Weld-enabled version of NumPy. Get them both on PyPI:

# Install Grizzly
pip install pygrizzly

# Install WeldNumPy
pip install weldnumpy

You can also build the Weld compiler and runtime from source to use them from C, C++, Python, or Rust programs, or to embed them in your own projects.

Research With Weld

Weld started as a research project at Stanford University, and continues to drive new research projects both at Stanford and elsewhere! Below are a few research papers and technical reports on Weld:

Some other research papers in this space by our group at Stanford:

We are working on other follow-up projects related to Weld, and other groups are also using Weld to do exciting research in the data space! Find code and more details about these projects on our research page.

Support and Contact

For support, join our Spectrum channel or subscribe to the Google Group. You can contact the developers at [email protected].

Sponsors

Weld sponsors