How Fusion Works From the Inside: Bytecode Pushing

TL;DR

The remote node is running but has none of your code. Fusion ships it automatically:

Reads BEAM bytecode atoms to discover module dependencies
Recursively pushes them bottom-up to the remote node
Loads your code with :code.load_binary/3

No source code leaves your machine - only compiled bytecode. Part 3 of a 3-part series.

The Problem

Part 2 left us with a remote BEAM node connected to our local cluster. It’s running, but empty. It has Erlang/OTP and Elixir stdlib - nothing else.

You call:

Fusion.run(remote_node, MyApp.Worker, :process, [data])

MyApp.Worker doesn’t exist on the remote. Neither do its dependencies. Fusion needs to figure out what to ship and ship it - automatically.

The Public API

The entry point is thin. Fusion.run/5 delegates to TaskRunner:

defmodule Fusion do
  defdelegate run(node, module, function, args, opts \\ []), to: TaskRunner
  defdelegate run_fun(node, fun, opts \\ []), to: TaskRunner
end

TaskRunner.run/5 does two things: ensure the module is available on the remote, then call it.

def run(node, module, function, args, opts \\ []) do
  timeout = Keyword.get(opts, :timeout, 30_000)

  with :ok <- ensure_module_available(node, module) do
    call(node, module, function, args, timeout)
  end
end

All the interesting work happens in ensure_module_available/2.

Check Before Push

No point shipping code that’s already there. Fusion asks the remote node first, in a recursive helper that also tracks which modules it has already visited:

defp ensure_module_with_deps(node, module, visited) do
  if MapSet.member?(visited, module) do
    :ok
  else
    visited = MapSet.put(visited, module)

    case :erpc.call(node, :code, :is_loaded, [module]) do
      {:file, _} -> :ok
      false ->
        with :ok <- push_dependencies(node, module, visited),
             :ok <- push_module_bytecode(node, module) do
          :ok
        end
    end
  end
end

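:erpc (OTP 23+) is the modern replacement for :rpc. A failed call raises instead of coming back as a {:badrpc, reason} value that can be mistaken for a legitimate result, and :erpc.call/5 takes an explicit timeout.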

If the module is loaded, skip it. If not, push dependencies first, then the module itself. Bottom-up order matters - the module might reference its dependencies at load time, for example from an @on_load callback.
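To see what the check returns, the same call can be aimed at the local node. This is a minimal sketch rather than Fusion code: it assumes that :erpc.call/4 with node() as the target runs the call locally, and FusionDemo.NotLoaded is a made-up module name.

```elixir
# Ask a node whether a module is loaded, using the same call as the excerpt above.
# node() targets the local node, so no cluster is needed to try this.
loaded = :erpc.call(node(), :code, :is_loaded, [:lists])
missing = :erpc.call(node(), :code, :is_loaded, [FusionDemo.NotLoaded])

# A loaded module yields a {:file, location} tuple, an unknown one yields false.
IO.inspect(loaded, label: ":lists")
IO.inspect(missing, label: "FusionDemo.NotLoaded")
```

Note that :code.is_loaded/1 only reports modules that are already in memory; it does not try to load one from disk.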

Dependency Resolution

This is the cleverest part. How does Fusion know which modules MyApp.Worker depends on?

It reads the BEAM bytecode.

Every compiled BEAM file has an atoms chunk - a table of all atoms referenced in the module. This includes module names used in function calls, struct references, and protocol implementations.

def get_module_dependencies(module) do
  with {:ok, {_module, binary, _filename}} <- get_module_bytecode(module),
       {:ok, {_module, [{:atoms, atoms}]}} <- safe_beam_chunks(binary, :atoms) do
    atoms
    |> Enum.map(fn {_index, atom} -> atom end)
    |> Enum.filter(&elixir_module?/1)
    |> Enum.reject(&(&1 == module))
    |> Enum.reject(&stdlib_module?/1)
  else
    _ -> []
  end
end

Step by Step

Get the bytecode: :code.get_object_code(module) returns the compiled binary from the local BEAM.

Read the atoms chunk: :beam_lib.chunks(binary, [:atoms]) parses the BEAM file format and extracts the atoms table.

Filter for Elixir modules: Only atoms starting with "Elixir." can be Elixir modules. Erlang modules like :lists or :gen_server have no such prefix, and as part of OTP they’re already available on the remote.

defp elixir_module?(atom) do
  atom |> Atom.to_string() |> String.starts_with?("Elixir.")
end

Reject stdlib: Modules from the Elixir standard library are already on the remote. Fusion checks where the module’s .beam file lives - if the path doesn’t contain _build/, the module is treated as stdlib:

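defp stdlib_module?(module) do
  case :code.which(module) do
    # Preloaded into the runtime itself (e.g. :erlang)
    :preloaded -> true
    :cover_compiled -> true
    # Loaded from a .beam file: only paths under _build/ count as shippable
    path when is_list(path) -> not String.contains?(List.to_string(path), "_build/")
    # :non_existing and anything else: nothing to ship
    _ -> true
  end
end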

Modules in _build/ are your project code or your dependencies. Everything else ships with Erlang/Elixir.
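The four steps can be tried in IEx against a module every installation has. This is a minimal sketch that uses Enum as the subject; the variable names and FusionDemo.NoSuchModule are illustrative and not taken from Fusion.

```elixir
# Steps 1 and 2: fetch Enum's compiled binary and read its atoms chunk.
{Enum, binary, _path} = :code.get_object_code(Enum)
{:ok, {Enum, [{:atoms, atoms}]}} = :beam_lib.chunks(binary, [:atoms])

# Step 3: keep atoms that look like Elixir module names, minus Enum itself.
referenced =
  for {_index, atom} <- atoms,
      atom != Enum,
      String.starts_with?(Atom.to_string(atom), "Elixir."),
      do: atom

# Step 4: the kinds of answer :code.which/1 can give.
locations = %{
  erlang: :code.which(:erlang),
  enum: :code.which(Enum),
  unknown: :code.which(FusionDemo.NoSuchModule)
}

IO.inspect(Enum.take(referenced, 5), label: "referenced by Enum")
IO.inspect(locations, label: "locations")
```

On a typical install none of the names in referenced resolves to a path under _build/, so the stdlib filter would reject all of them and nothing would be shipped on Enum’s behalf.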

Recursive Push

Dependencies can have their own dependencies. Fusion walks the tree recursively:

defp push_dependencies(node, module, visited) do
  deps = get_module_dependencies(module)

  Enum.reduce_while(deps, :ok, fn dep, :ok ->
    case ensure_module_with_deps(node, dep, visited) do
      :ok -> {:cont, :ok}
      {:error, {:module_not_found, _}} -> {:cont, :ok}
      {:error, reason} -> {:halt, {:error, reason}}
    end
  end)
end

The visited MapSet prevents infinite loops from circular references. If a dependency can’t be found locally (maybe an OTP module that slipped through the filter), Fusion skips it instead of failing.
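The traversal order is easier to see without a remote node. The sketch below walks a hand-written dependency map (PushOrderSketch and the module names in deps are hypothetical) and records the order in which modules would be pushed. Unlike the excerpt above, it threads the visited set through sibling dependencies so each module appears exactly once; in the excerpt, the remote :code.is_loaded check is what keeps an already pushed sibling from being shipped again.

```elixir
defmodule PushOrderSketch do
  # deps is a plain map standing in for get_module_dependencies/1.
  def push_order(root, deps) do
    {_visited, pushed} = walk(root, deps, MapSet.new(), [])
    Enum.reverse(pushed)
  end

  defp walk(module, deps, visited, pushed) do
    if MapSet.member?(visited, module) do
      # Already seen on this walk: this is what breaks cycles.
      {visited, pushed}
    else
      visited = MapSet.put(visited, module)

      # Visit every dependency first, so it lands in the list before `module`.
      {visited, pushed} =
        deps
        |> Map.get(module, [])
        |> Enum.reduce({visited, pushed}, fn dep, {v, p} -> walk(dep, deps, v, p) end)

      {visited, [module | pushed]}
    end
  end
end

# Repo refers back to Worker, which makes the graph circular.
deps = %{
  Worker => [Parser, Repo],
  Parser => [Util],
  Repo => [Util, Worker],
  Util => []
}

order = PushOrderSketch.push_order(Worker, deps)
IO.inspect(order, label: "push order")
```

Util comes first and Worker last, and the circular reference from Repo back to Worker is skipped rather than followed.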

The Actual Push

Once dependencies are resolved, pushing a module comes down to two calls:

defp push_module_bytecode(node, module) do
  case get_module_bytecode(module) do
    {:ok, {^module, binary, filename}} ->
      case :erpc.call(node, :code, :load_binary, [module, filename, binary]) do
        {:module, ^module} -> :ok
        {:error, reason} -> {:error, {:load_failed, module, reason}}
      end
    {:error, reason} -> {:error, reason}
  end
end

:code.get_object_code(module) - Gets the compiled BEAM binary from the local node.

:erpc.call(node, :code, :load_binary, [module, filename, binary]) - Sends the binary to the remote node and loads it into the remote code server.

That’s it. The remote node now has the module in memory, ready to call.
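The loading half can be tried on a single node. This is a minimal sketch, assuming a throwaway module compiled in memory; FusionDemo.Hello and the file name are made up, and the call runs locally where Fusion would wrap it in :erpc.call/4.

```elixir
# Compile a throwaway module in memory; compile_string returns [{module, bytecode}].
source = """
defmodule FusionDemo.Hello do
  def hi, do: :hello
end
"""

[{mod, binary}] = Code.compile_string(source)

# Unload it again to imitate a node that has never seen the module.
:code.delete(mod)
:code.purge(mod)

# The same call Fusion makes on the remote, here run locally.
# The file name is only recorded as the module's origin; no file has to exist.
result = :code.load_binary(mod, ~c"fusion_demo_hello.beam", binary)

IO.inspect(result, label: "load_binary")
IO.inspect(mod.hi(), label: "call after load")
```

The bytecode binary is all the code server needs, which is why nothing has to be written to the remote machine’s disk.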

Anonymous Functions

Fusion.run_fun/3 handles anonymous functions (lambdas). An anonymous function is tied to the module that defines it. Fusion extracts that module and pushes it:

def run_fun(node, fun, opts \\ []) do
  timeout = Keyword.get(opts, :timeout, 30_000)
  info = Function.info(fun)
  module = Keyword.fetch!(info, :module)

  with :ok <- ensure_module_available(node, module) do
    call_fun(node, fun, timeout)
  end
end

Function.info/1 returns metadata about the function, including the module it was defined in. When you write:

Fusion.run_fun(remote_node, fn ->
  File.ls!("/etc") |> Enum.filter(&String.ends_with?(&1, ".conf"))
end)

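That lambda is compiled into the module that encloses it. Fusion pushes that module’s bytecode, and the remote node can execute the function. (A fun typed directly into IEx is an exception: it is evaluated by :erl_eval rather than compiled into a module of your own, and :erl_eval already ships with OTP.)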
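The module lookup is easy to check locally. This is a minimal sketch with a hypothetical FusionDemo.Lambdas module; it shows only the Function.info/1 step, not the push.

```elixir
defmodule FusionDemo.Lambdas do
  # A fun created inside a compiled module belongs to that module.
  def conf_filter do
    fn names -> Enum.filter(names, &String.ends_with?(&1, ".conf")) end
  end
end

fun = FusionDemo.Lambdas.conf_filter()
info = Function.info(fun)

# :module names the module whose bytecode has to exist wherever the fun runs.
IO.inspect(Keyword.fetch!(info, :module), label: "defining module")
IO.inspect(fun.(["nginx.conf", "hosts"]), label: "result")
```

The fun carries a reference to its defining module, not the code itself, which is why that module has to be loaded on the remote before the call.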

The Full Picture

Fusion is ~700 lines of Elixir with zero dependencies. Three tricks make it work:

SSH tunnels for distribution. Three tunnels (two reverse, one forward) make the local EPMD and distribution ports appear on the remote machine. The remote BEAM thinks it’s joining a local cluster.

EPMD port pinning. The remote node starts with ERL_EPMD_PORT pointing to the tunneled EPMD and a pinned distribution port. No random ports, no firewall surprises.

Bytecode shipping via :erpc. Read local BEAM files, parse the atoms chunk for dependencies, push everything bottom-up with :code.load_binary/3. No source code leaves your machine.

The result: you write Elixir locally, run it remotely, get results back. The remote server just needs SSH and Elixir.