Description
Presently, URL-safe base64 encoding is handled by the b64url NIF at src/b64url. However, support for RFC 4648-compliant URL-safe encoding was added to the Erlang stdlib's base64 module in Erlang/OTP 26.0. Additionally, encoding became up to 4 times faster in the same release, thanks to optimizations that take advantage of the JIT compiler.
Benchmarking base64 and b64url with benchee [1], using the following benchmark, shows that the built-in base64 is faster:
Mix.install([:benchee, {:b64url, github: "apache/couchdb", sparse: "src/b64url/"}])

defmodule B64Bench do
  def main do
    # Arguments: workers, min_size, max_size, duration (s), entries
    [workers, min_size, max_size, duration, entries] =
      Enum.map(System.argv(), &String.to_integer/1)

    # Generate `entries` random binaries of min_size + 1 to max_size bytes each
    bytes =
      1..entries
      |> Enum.to_list()
      |> Enum.map(fn _ ->
        :crypto.strong_rand_bytes(min_size + :rand.uniform(max_size - min_size))
      end)

    Benchee.run(
      %{
        "b64url" => fn input -> process(input, &:b64url.encode/1, &:b64url.decode/1) end,
        "base64 (standard) + re" => fn input ->
          process(
            input,
            fn url ->
              # Strip padding, then map "/" to "_" and "+" to "-"
              url = :erlang.iolist_to_binary(:re.replace(:base64.encode(url), "=+$", ""))
              url = :erlang.iolist_to_binary(:re.replace(url, "/", "_", [:global]))
              :erlang.iolist_to_binary(:re.replace(url, "\\+", "-", [:global]))
            end,
            fn url64 ->
              # Restore the standard alphabet and padding before decoding
              url64 = :erlang.iolist_to_binary(url64)
              url64 = :erlang.iolist_to_binary(:re.replace(url64, "-", "+", [:global]))
              url64 = :erlang.iolist_to_binary(:re.replace(url64, "_", "/", [:global]))

              padding =
                :erlang.list_to_binary(
                  :lists.duplicate(rem(4 - rem(:erlang.size(url64), 4), 4), 61)
                )

              :base64.decode(<<url64::binary, padding::binary>>)
            end
          )
        end,
        "base64 (urlsafe)" => fn input ->
          process(
            input,
            &:base64.encode(&1, %{mode: :urlsafe}),
            &:base64.decode(&1, %{mode: :urlsafe})
          )
        end
      },
      parallel: workers,
      time: duration,
      inputs: %{"generated" => bytes}
    )

    IO.inspect(:erlang.byte_size(Enum.join(bytes)), label: "Total size (B)")
  end

  # Round-trip every binary through the given encode/decode pair
  def process(bytes, encode, decode) do
    Enum.each(bytes, fn bin -> decode.(encode.(bin)) end)
  end
end

B64Bench.main()

$ elixir b64_bench.exs 4 10 100 60 100
Operating System: Linux
CPU Information: 12th Gen Intel(R) Core(TM) i7-1255U
Number of Available Cores: 12
Available memory: 15.31 GB
Elixir 1.18.4
Erlang 28.0.2
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 1 min
memory time: 0 ns
reduction time: 0 ns
parallel: 4
inputs: generated
Estimated total run time: 3 min 6 s
Excluding outliers: false
Benchmarking b64url with input generated ...
Benchmarking base64 (standard) + re with input generated ...
Benchmarking base64 (urlsafe) with input generated ...
Calculating statistics...
Formatting results...
##### With input generated #####
Name                             ips        average  deviation         median         99th %
base64 (urlsafe)              8.18 K      122.31 μs    ±44.53%      110.85 μs      296.40 μs
b64url                        6.18 K      161.88 μs    ±56.17%      139.65 μs      609.83 μs
base64 (standard) + re        0.74 K     1345.47 μs    ±32.12%     1076.60 μs     2221.58 μs

Comparison:
base64 (urlsafe)              8.18 K
b64url                        6.18 K - 1.32x slower +39.57 μs
base64 (standard) + re        0.74 K - 11.00x slower +1223.16 μs
Total size (B): 5491
Therefore I propose we drop b64url in favour of the stdlib functions. This has the following benefits:
- Less code to maintain
- (Marginally) better performance
- Enhanced safety (by way of eliminating a NIF)
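
As a rough illustration of what a replacement could look like, here is a minimal sketch of a thin wrapper over the stdlib calls. The module name B64UrlSketch is hypothetical and not part of CouchDB. The sketch assumes Erlang/OTP 26 or later, and it assumes that callers expect unpadded output (the regex variant in the benchmark strips trailing "=", which suggests this, but it should be verified against b64url itself). Note that :base64 pads by default, so the "base64 (urlsafe)" benchmark case above measures the padded form.

```elixir
# Sketch only: requires Erlang/OTP 26+ for the two-argument base64 functions.
defmodule B64UrlSketch do
  # Hypothetical wrapper module, shown for illustration.
  # mode: :urlsafe selects the RFC 4648 URL-safe alphabet ("-" and "_").
  # padding: false omits trailing "=" on encode and tolerates its absence
  # on decode. Whether unpadded output is required is an assumption here.
  @opts %{mode: :urlsafe, padding: false}

  # Encode a binary to unpadded URL-safe base64.
  def encode(bin) when is_binary(bin), do: :base64.encode(bin, @opts)

  # Decode URL-safe base64 back to the original binary.
  def decode(b64) when is_binary(b64), do: :base64.decode(b64, @opts)
end
```

For example, B64UrlSketch.encode(<<251, 255>>) yields "-_8", where standard base64 would give "+/8=", and B64UrlSketch.decode("-_8") returns <<251, 255>>.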
Footnotes
1. I found updating the existing benchmarks to compare three or more implementations tedious. The new benchmark's arguments are similar to those of the previous benchmark, except for an extra parameter, entries, which is the number of random binaries to encode. The ips results are directly proportional to the previous bps and can be converted to bps by multiplying by the total size.