# Compute-over-Data with content-addressed data

The term "compute-over-data" (CoD) generally refers to a computing paradigm in which data is processed near where it is stored. This concept is particularly relevant to big data and distributed computing, where transferring large volumes of data over a network can be slow and costly. Performing computations close to where the data is stored makes faster processing and lower network bandwidth requirements possible.
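To make the bandwidth argument concrete, here is a minimal sketch in Python. The `StorageNode` class, the `count_errors` job, and the sample records are all hypothetical and exist only for illustration; the sketch simply compares how many bytes would cross the network when the data is pulled to the compute versus when the job is sent to the data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StorageNode:
    """A hypothetical node that holds its records locally."""
    records: List[bytes]

    def run_local(self, job: Callable[[List[bytes]], int]) -> int:
        # Compute-over-data: the job travels to the node,
        # and only the (small) result leaves it.
        return job(self.records)


def count_errors(records: List[bytes]) -> int:
    """Example job: count the records that contain the marker b'ERROR'."""
    return sum(1 for record in records if b"ERROR" in record)


node = StorageNode(records=[b"INFO ok" * 100, b"ERROR disk" * 100, b"INFO ok" * 100])

# Option A: pull every record across the network, then compute.
bytes_moved_pull = sum(len(record) for record in node.records)
result_pull = count_errors(node.records)

# Option B: run the job where the data lives and ship back only the result.
result_local = node.run_local(count_errors)
bytes_moved_local = len(str(result_local).encode())

print(f"pull data: {bytes_moved_pull} bytes moved, result={result_pull}")
print(f"move job:  {bytes_moved_local} bytes moved, result={result_local}")
```

Both options produce the same answer; the difference is only in how much data has to move. The sketch ignores the size of the job itself, which in practice (for example, a container image or a Wasm module) also has to reach the node.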

IPFS users can perform CoD on IPFS data with the Bacalhau platform and the InterPlanetary Virtual Machine (IPVM) specification, both of which natively support content-addressed data.
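As a rough illustration of what "content-addressed" means, the following sketch stores data under an address derived from the data itself. It is a simplification under stated assumptions: the `ContentStore` class is hypothetical, and the address is a plain hex-encoded SHA-256 digest, whereas IPFS uses CIDs, which wrap the hash with additional encoding information.

```python
import hashlib


class ContentStore:
    """A hypothetical in-memory store keyed by the hash of each block."""

    def __init__(self) -> None:
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not from a location.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone can verify that the bytes match the address they asked for.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"hello ipfs")
same_addr = store.put(b"hello ipfs")    # identical content, identical address
other_addr = store.put(b"hello ipfs!")  # any change yields a new address

print(addr == same_addr)   # True
print(addr == other_addr)  # False
```

Because the address depends only on the bytes, a compute job can name its input by address and run against whichever copy is closest, verifying the content on retrieval.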

# Bacalhau

Bacalhau is a platform for fast, cost-efficient, secure, distributed computation. Bacalhau works by running jobs where the data is generated and stored, also referred to as Compute Over Data (or CoD). Using Bacalhau, you can streamline existing workflows without extensive refactoring by running arbitrary Docker containers and WebAssembly (Wasm) images as compute tasks. The name Bacalhau was coined from the Portuguese word for "salted cod fish".

# Features

Bacalhau can:

  • Simplify management of compute jobs by providing a unified platform for managing jobs across different regions, clouds, and edge devices.
  • Provide reliable, network-partition-resistant orchestration, so jobs complete even when the network is disrupted.
  • Provide a complete and permanent audit log, so you can be confident that jobs are being executed securely.
  • Run private workloads to reduce the chance of leaked data outside of your organization.
  • Reduce ingress and egress costs since jobs are processed closer to the source.
  • Run against data mounted anywhere on your machine.
  • Integrate with services running on nodes to run jobs, such as DuckDB.
  • Operate at scale over parallel jobs and batch process petabytes of data.
  • Auto-generate art using a Stable Diffusion AI model trained on the chosen artist’s original works.

# IPVM

The InterPlanetary Virtual Machine (IPVM) specification defines the easiest, fastest, most secure, and open way to run decentralized compute jobs on IPFS. One way to describe IPVM would be as "an open, decentralized, and local-first competitor to AWS Lambda".

IPVM uses WebAssembly (Wasm), content addressing, simple public key infrastructure (SPKI), and object capabilities to liberate computation from specific, prenegotiated services, such as large cloud computing providers. By default, execution scales flexibly from on-device execution all the way up to edge points-of-presence (PoPs) and data centers.
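One way to picture how content addressing decouples a job from any particular provider is the sketch below. It is not the IPVM task format: the descriptor fields `module` and `inputs`, the helper names, and the use of hex-encoded SHA-256 are assumptions made purely for illustration. The point is that when a task names its Wasm module and inputs by hash, any machine can derive the same task identifier without coordinating with a specific service.

```python
import hashlib
import json
from typing import List


def content_id(data: bytes) -> str:
    """Simplified content address: hex-encoded SHA-256 (not a real CID)."""
    return hashlib.sha256(data).hexdigest()


def task_id(module_id: str, input_ids: List[str]) -> str:
    """Identify a task by the hashes of its module and inputs.

    The descriptor layout is hypothetical and only serves the example.
    """
    descriptor = {"module": module_id, "inputs": list(input_ids)}
    # Canonical JSON so every machine serializes the descriptor identically.
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return content_id(canonical.encode())


module_bytes = b"\x00asm\x01\x00\x00\x00"  # header of an empty Wasm module
input_bytes = b"some input data"

# Two machines derive the task identifier independently.
id_on_laptop = task_id(content_id(module_bytes), [content_id(input_bytes)])
id_in_datacenter = task_id(content_id(module_bytes), [content_id(input_bytes)])

# Changing the input changes the task identifier.
id_other_input = task_id(content_id(module_bytes), [content_id(b"other data")])

print(id_on_laptop == id_in_datacenter)  # True
print(id_on_laptop == id_other_input)    # False
```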

The core, Rust-based implementation and runtime of IPVM is the Homestar project. IPVM supports interoperability with Bacalhau and Web3Storage.