Generic Programming
We have seen duck typing as a convenient abstraction tool for data types:

- As a user, we don't have to care about how a specific type, say an Array, is implemented. By being an array (i.e. a subtype of AbstractArray) it behaves like we expect, and we can just use it.
- As a developer, as long as we make our objects behave like, say, an AbstractArray, we can implement them in whatever way we deem appropriate and they will work with all kinds of algorithms (see the sketch below).
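As a quick illustration of the developer side (a minimal sketch; the type OneHotVector below is made up for this example), subtyping AbstractVector and implementing size and getindex is enough for the new type to work with generic algorithms such as sum or matrix-vector products:

struct OneHotVector <: AbstractVector{Int}
    idx::Int   # position of the single one
    len::Int   # total length
end

Base.size(v::OneHotVector) = (v.len,)
Base.getindex(v::OneHotVector, i::Int) = Int(i == v.idx)

v = OneHotVector(3, 5)
sum(v)        # 1 — the generic sum works because v behaves like an AbstractVector
v' * rand(5)  # the generic dot-product machinery picks out the third entry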
Building upon this principle, we also want to formulate our algorithms in an abstract way such that they work with all kinds of data types, irrespective of their precise implementation (or even meaning). This is generally known as generic programming.
From Wikipedia:
Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters.
Example 1: Summation
function sum_naive(x)
    s = 0.0              # accumulator hard-coded to Float64
    for xi in x
        s += xi
    end
    return s
end
sum_naive (generic function with 1 method)
function sum_generic(x)
    s = zero(eltype(x))  # accumulator matches the element type of x
    for xi in x
        s += xi
    end
    return s
end
sum_generic (generic function with 1 method)
using BenchmarkTools
x = rand(100_000);
@btime sum_naive($x);
@btime sum_generic($x);
133.679 μs (0 allocations: 0 bytes)
133.678 μs (0 allocations: 0 bytes)
x = rand(Int, 100_000);
@btime sum_naive($x);
@btime sum_generic($x);
133.686 μs (0 allocations: 0 bytes)
12.042 μs (0 allocations: 0 bytes)
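The difference for integer input comes from the accumulator: s = 0.0 fixes a Float64 accumulator, so every element is converted and summed with floating-point additions, whereas zero(eltype(x)) keeps the accumulation in the input's element type, which the compiler can optimize much more aggressively. A quick way to see the types involved (a small sketch; the variable name x_int is just for illustration):

x_int = rand(Int, 5)
zero(eltype(x_int))         # 0, an Int64 — sum_generic accumulates in Int64
0.0 + x_int[1]              # a Float64 — sum_naive converts every element like this
typeof(sum_naive(x_int))    # Float64
typeof(sum_generic(x_int))  # Int64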
Example 2: Vandermonde matrix
(modified from Steven’s Julia intro)
\begin{align}
V=\begin{bmatrix}
1 & \alpha_{1} & \alpha_{1}^{2} & \dots & \alpha_{1}^{n-1} \\
1 & \alpha_{2} & \alpha_{2}^{2} & \dots & \alpha_{2}^{n-1} \\
1 & \alpha_{3} & \alpha_{3}^{2} & \dots & \alpha_{3}^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \alpha_{m} & \alpha_{m}^{2} & \dots & \alpha_{m}^{n-1}
\end{bmatrix}
\end{align}
using PythonCall
np = pyimport("numpy")
Python module: <module 'numpy' from '/usr/local/lib/python3.9/dist-packages/numpy/__init__.py'>
np.vander(1:5, increasing=true)
Python ndarray:
array([[ 1, 1, 1, 1, 1],
[ 1, 2, 4, 8, 16],
[ 1, 3, 9, 27, 81],
[ 1, 4, 16, 64, 256],
[ 1, 5, 25, 125, 625]])
The source code for this function is here. It calls np.multiply.accumulate, which is implemented in C here. However, this code doesn't actually perform the computation; it essentially only does type checking and dispatch. The actual kernel that gets called is here. This isn't even C code but a template for C code, which is used to generate type-specific kernels. Overall, this setup only supports a limited set of types, like Float64, Float32, and so forth.
Here is our simple, generic Julia implementation:
function vander(x::AbstractVector{T}) where {T}
    m = length(x)
    V = Matrix{T}(undef, m, m)
    for j = 1:m
        V[j, 1] = one(x[j])            # first column: multiplicative identity
    end
    for i = 2:m
        for j = 1:m
            V[j, i] = x[j] * V[j, i-1] # each column is the previous one times x
        end
    end
    return V
end
vander (generic function with 1 method)
vander(1:5)
5×5 Matrix{Int64}:
1 1 1 1 1
1 2 4 8 16
1 3 9 27 81
1 4 16 64 256
1 5 25 125 625
A quick speed comparison
using BenchmarkTools, Plots
ns = exp10.(range(1, 4, length=30));
tnp = Float64[]
tjl = Float64[]
for n in ns
    x = 1:n |> collect
    push!(tnp, @belapsed np.vander($x) samples=3 evals=1)
    push!(tjl, @belapsed vander($x) samples=3 evals=1)
end
plot(ns, tnp./tjl, m=:circle, xscale=:log10, xlab="matrix size", ylab="NumPy time / Julia time", legend=false)
Note that the clean and concise Julia implementation beats NumPy's C implementation for small matrices and is on par with it for large matrix sizes.
At the same time, the Julia code is generic and works for arbitrary types!
vander(Int32[4, 8, 16, 32])
4×4 Matrix{Int32}:
1 4 16 64
1 8 64 512
1 16 256 4096
1 32 1024 32768
It even works for non-numerical types. The only requirements are that the type has a one (multiplicative identity element) and a multiplication operation defined.
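For instance, Strings qualify (a small illustration, assuming the vander defined above): one(String) is the empty string and * is string concatenation.

vander(["a", "bc", "def"])
# 3×3 Matrix{String}:
#  ""  "a"    "aa"
#  ""  "bc"   "bcbc"
#  ""  "def"  "defdef"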
New “features” emerging from generic programming
Symbolic computations
using Symbolics
@variables a b c d e
v = vander([a, b, c, d, e])
substitute(v, Dict(b => 2, d => 4))
Arbitrary precision computing
x = rand(BigFloat, 10)
10-element Vector{BigFloat}:
0.2161031980318831255921322433408436907781671454955668696579829700944428593281134
0.9207073642769274767111781566471123977336878227480311277064596264604037528701053
0.8495758285303895622128469631880481687885863958640820458291008812941111667820195
0.1409939714391858189570747335937928093843166313973919351135492100570271062807622
0.8182314309046965691127340692361068002967282049571550873208670576392475049435058
0.3736399997569659792614749994324503727598543381609754716088332423833328700714368
0.5089521677934883707406197967620895023439420711903230099404844752797502609071063
0.5399866349106401290729825983026868742643178995678155591627679788469629320506872
0.5471017233930594890325921322753917364622313134246785786505090827608953421940656
0.1643490361224571024576140391199536160440523611275100721801816121999483572338236
sum(x)
5.079641355159693623151249731898475968855884183933529757170736137016122152661652
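The generic vander from Example 2 works just as well with arbitrary-precision numbers (a small illustration):

vander(big.(1:3))
# 3×3 Matrix{BigInt}:
#  1  1  1
#  1  2  4
#  1  3  9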
Differential equation solving with uncertainty
Code:
using OrdinaryDiffEq, Measurements, Plots
# Half-life of Carbon-14 is 5730 years.
c = 5.730 ± 2
# Setup
u0 = 1.0 ± 0.1
tspan = (0.0, 1.0)
# Define the problem
radioactivedecay(u, p, t) = -c * u
# Pass to solver
prob = ODEProblem(radioactivedecay, u0, tspan)
sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8);
plot(sol.t, sol.u, ylabel="u(t)", xlabel="t", lw=2, legend=false)
Output:
Historical note: in some sense, Julia implemented this feature by itself. The authors of Measurements.jl and DifferentialEquations.jl never collaborated on this.
Core messages of this notebook
- Julia can be fast.
- A function is compiled when called for the first time with a given set of argument types.
- There are multiple compilation steps, all of which can be inspected through macros like @code_warntype (see the sketch below).
- Code specialization based on the types of all of the input arguments is important for speed.
- Calculations can be moved to compile time to make run time faster.
- In virtually all cases, explicit type annotations are irrelevant for performance.
- Type annotations in function signatures define a type filter / user interface.
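As a sketch of the inspection macros mentioned above (sum_generic is the function from Example 1; the macros come from InteractiveUtils, which is loaded automatically in the REPL):

using InteractiveUtils

x = rand(Int, 10)
@code_lowered sum_generic(x)    # lowered form, before type inference
@code_warntype sum_generic(x)   # inferred types; red Any/Union entries signal instability
@code_llvm sum_generic(x)       # optimized LLVM IR, specialized for Vector{Int64}
@code_native sum_generic(x)     # the machine code that actually runs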