Duck Typing, Interfaces, and Benchmarking
‘Interfaces’ and duck typing are a core part of Julia’s type system, and a lot of the language’s power and extensibility comes from its collection of informal interfaces. By extending a few specific methods to work for a custom type, objects of that type not only receive those functionalities, but can also be used in other methods that are written to generically build upon those behaviors.
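To make this concrete before the examples below, here is a minimal sketch using a made-up `Squares` type (invented for illustration, not part of this lesson's code): implementing the informal iteration interface, `Base.iterate` plus `Base.length`, makes generic functions such as `sum` and `collect` work with no extra code.

```julia
# `Squares` iterates over the first `count` square numbers.
struct Squares
    count::Int
end

# The iteration interface: return (item, next_state), or `nothing` when done.
function Base.iterate(S::Squares, state=1)
    state > S.count ? nothing : (state * state, state + 1)
end

# `length` lets functions like `collect` preallocate their result.
Base.length(S::Squares) = S.count

sum(Squares(4))      # 30  (1 + 4 + 9 + 16)
collect(Squares(3))  # [1, 4, 9]
```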
Example 1: UnitRange
x = 1:30
1:30
typeof(x)
UnitRange{Int64}
supertypes(UnitRange)
(UnitRange, AbstractUnitRange{T} where T<:Real, OrdinalRange{T, T} where T<:Real, AbstractRange{T} where T<:Real, AbstractVector{T} where T<:Real, Any)
typeof(x) <: AbstractArray
true
Because it is a subtype of AbstractArray, we can do array-like things with it (it should basically behave like an array!)
x[3]
3
size(x)
(30,)
eltype(x)
Int64
However, it’s not implemented like a regular Array at all. In fact, it’s just two numbers! We can see this by looking at its fields:
fieldnames(typeof(x))
(:start, :stop)
or just by inspecting the source code
@which UnitRange{Int64}(1, 30)
It is an immutable type which just holds the start and stop values. This means that indexing, A[i], is not just a look-up but a (small) function (try @which getindex(x, 4)).
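As a sketch of the idea (using a hypothetical, stripped-down `MyRange` type, not the real UnitRange implementation): indexing computes its answer from the two stored endpoints instead of reading an element from memory.

```julia
# A toy range: only two integers are ever stored.
struct MyRange
    start::Int
    stop::Int
end

Base.length(r::MyRange) = r.stop - r.start + 1

# Indexing is a tiny computation, not a memory look-up.
Base.getindex(r::MyRange, i::Int) = r.start + i - 1

r = MyRange(1, 30)
r[3]        # 3, computed as 1 + 3 - 1
length(r)   # 30
```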
What’s nice about this is that we can use it in calculations, and no array containing the numbers from 1 to 30 is ever created. Allocating memory is typically costly.
@time collect(1:10_000_000);
0.054209 seconds (2 allocations: 76.294 MiB, 28.63% gc time)
But creating an immutable type of two numbers is essentially free, no matter what those two numbers are:
@time 1:10_000_000;
0.000000 seconds
This is so far the same as in Python, but here is a key difference:
In [1]: a = range(10000)
In [2]: a
Out[2]: range(0, 10000)
In [3]: a * 2
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 a * 2
TypeError: unsupported operand type(s) for *: 'range' and 'int'
As Julia uses multiple dispatch, UnitRange has its own methods defined for arithmetic which are custom-made for that type. Not only does it behave the same way as an array under the same operations, it also does so with great performance:
# Double all elements in the range and print off the 10th element:
@time r = collect(1:10_000_000) * 2
@time r = (1:10_000_000) * 2
println(r[10])
0.096616 seconds (4 allocations: 152.588 MiB, 36.33% gc time)
0.000001 seconds
20
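A quick check (assuming standard Julia behavior) confirms what that range arithmetic produces: multiplying a range by a number dispatches to a range-specific method that returns another range with a scaled step, so no 10-element array is ever materialized.

```julia
r = (1:10) * 2
r isa AbstractRange  # true: the result is itself a range
r == 2:2:20          # true: only start, step, and stop are stored
```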
# Sum all of the elements in the range:
@time sum(collect(1:10_000_000))
@time sum(1:10_000_000)
0.138812 seconds (2 allocations: 76.294 MiB, 67.88% gc time)
0.000000 seconds
50000005000000
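The reason the range version is essentially free: Base ships a specialized `sum` method for ranges that evaluates the arithmetic-series formula n(n+1)/2 instead of looping over ten million elements. A quick check:

```julia
n = 10_000_000
# The closed-form formula matches the looped sum exactly.
sum(1:n) == n * (n + 1) ÷ 2  # true
```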
Parts of this do work in Python:
In [1]: %time sum(range(10_000_000))
CPU times: user 183 ms, sys: 169 µs, total: 183 ms
Wall time: 183 ms
using BenchmarkTools # This will be explained later
using PythonCall # As well as this
@btime @pyexec "sum(range(10_000_000))"
192.622 ms (0 allocations: 0 bytes)
This effectively does the same thing as the collect example, just a few times slower, and far slower than the example using a range.
@which sum(1:10_000_000)
Duck typing and interfaces let us define specialised code that is very performant, and interacts with the rest of the language as expected.
Example 2: DiagonalMatrix
Let’s create a simple custom DiagonalMatrix
type that can represent square diagonal matrices, i.e.
struct DiagonalMatrix{T} <: AbstractArray{T,2}
    diag::Vector{T}
end
In the spirit of duck typing, we integrate our DiagonalMatrix into Julia’s type hierarchy by making it a subtype (<:) of AbstractMatrix to indicate array-like behavior. (Note that this does not indicate inheritance of structure!) Of course, to actually make it behave like a matrix (a two-dimensional array) we must also implement (parts of) the AbstractArray interface.
DiagonalMatrix([1, 2, 3])
MethodError: no method matching size(::DiagonalMatrix{Int64})
Closest candidates are:
size(::AbstractArray{T, N}, ::Any) where {T, N} at abstractarray.jl:42
size(::Union{LinearAlgebra.Adjoint{T, var"#s886"}, LinearAlgebra.Transpose{T, var"#s886"}} where {T, var"#s886"<:(AbstractVector)}) at /usr/local/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:173
size(::Union{LinearAlgebra.Adjoint{T, var"#s886"}, LinearAlgebra.Transpose{T, var"#s886"}} where {T, var"#s886"<:(AbstractMatrix)}) at /usr/local/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:174
...
Stacktrace:
[1] length
@ ./abstractarray.jl:279 [inlined]
[2] isempty(a::DiagonalMatrix{Int64})
@ Base ./abstractarray.jl:1170
[3] show(io::IOContext{IOBuffer}, #unused#::MIME{Symbol("text/plain")}, X::DiagonalMatrix{Int64})
@ Base ./arrayshow.jl:364
[4] limitstringmime(mime::MIME{Symbol("text/plain")}, x::DiagonalMatrix{Int64})
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/inline.jl:43
[5] display_mimestring(m::MIME{Symbol("text/plain")}, x::DiagonalMatrix{Int64})
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/display.jl:71
[6] display_dict(x::DiagonalMatrix{Int64})
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/display.jl:102
[7] #invokelatest#2
@ ./essentials.jl:729 [inlined]
[8] invokelatest
@ ./essentials.jl:726 [inlined]
[9] execute_request(socket::ZMQ.Socket, msg::IJulia.Msg)
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/execute_request.jl:112
[10] #invokelatest#2
@ ./essentials.jl:729 [inlined]
[11] invokelatest
@ ./essentials.jl:726 [inlined]
[12] eventloop(socket::ZMQ.Socket)
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/eventloop.jl:8
[13] (::IJulia.var"#15#18")()
@ IJulia ./task.jl:484
Base.size(D::DiagonalMatrix) = (length(D.diag), length(D.diag))
DiagonalMatrix([1, 2, 3])
CanonicalIndexError: getindex not defined for DiagonalMatrix{Int64}
Stacktrace:
[1] error_if_canonical_getindex(::IndexCartesian, ::DiagonalMatrix{Int64}, ::Int64, ::Int64)
@ Base ./abstractarray.jl:1260
[2] getindex
@ ./abstractarray.jl:1240 [inlined]
[3] isassigned(::DiagonalMatrix{Int64}, ::Int64, ::Int64)
@ Base ./abstractarray.jl:565
[4] alignment(io::IOContext{IOBuffer}, X::AbstractVecOrMat, rows::Vector{Int64}, cols::Vector{Int64}, cols_if_complete::Int64, cols_otherwise::Int64, sep::Int64, ncols::Int64)
@ Base ./arrayshow.jl:68
[5] _print_matrix(io::IOContext{IOBuffer}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
@ Base ./arrayshow.jl:207
[6] print_matrix(io::IOContext{IOBuffer}, X::DiagonalMatrix{Int64}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64) (repeats 2 times)
@ Base ./arrayshow.jl:171
[7] print_array
@ ./arrayshow.jl:358 [inlined]
[8] show(io::IOContext{IOBuffer}, #unused#::MIME{Symbol("text/plain")}, X::DiagonalMatrix{Int64})
@ Base ./arrayshow.jl:399
[9] limitstringmime(mime::MIME{Symbol("text/plain")}, x::DiagonalMatrix{Int64})
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/inline.jl:43
[10] display_mimestring
@ /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/display.jl:71 [inlined]
[11] display_dict(x::DiagonalMatrix{Int64})
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/display.jl:102
[12] #invokelatest#2
@ ./essentials.jl:729 [inlined]
[13] invokelatest
@ ./essentials.jl:726 [inlined]
[14] execute_request(socket::ZMQ.Socket, msg::IJulia.Msg)
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/execute_request.jl:112
[15] #invokelatest#2
@ ./essentials.jl:729 [inlined]
[16] invokelatest
@ ./essentials.jl:726 [inlined]
[17] eventloop(socket::ZMQ.Socket)
@ IJulia /usr/local/julia/local/share/julia/packages/IJulia/6TIq1/src/eventloop.jl:8
[18] (::IJulia.var"#15#18")()
@ IJulia ./task.jl:484
function Base.getindex(D::DiagonalMatrix, i::Int, j::Int)
    if i == j
        return D.diag[i]
    else
        return zero(eltype(D))
    end
end
D = DiagonalMatrix([1, 2, 3])
3×3 DiagonalMatrix{Int64}:
1 0 0
0 2 0
0 0 3
Note how it’s automagically pretty printed (despite the fact that we never defined any special printing)!
D[2, 2]
2
D[1, 2]
0
size(D)
(3, 3)
D[3, 3] = 5
CanonicalIndexError: setindex! not defined for DiagonalMatrix{Int64}
Stacktrace:
[1] error_if_canonical_setindex(::IndexCartesian, ::DiagonalMatrix{Int64}, ::Int64, ::Int64)
@ Base ./abstractarray.jl:1354
[2] setindex!(::DiagonalMatrix{Int64}, ::Int64, ::Int64, ::Int64)
@ Base ./abstractarray.jl:1343
[3] top-level scope
@ In[25]:1
function Base.setindex!(D::DiagonalMatrix, v, i::Int, j::Int)
    if i == j
        D.diag[i] = v
    else
        throw(ArgumentError("cannot set off-diagonal entry ($i, $j)"))
    end
    return v
end
D[3, 3] = 5
5
D
3×3 DiagonalMatrix{Int64}:
1 0 0
0 2 0
0 0 5
D[3, 4] = 5
ArgumentError: cannot set off-diagonal entry (3, 4)
Stacktrace:
[1] setindex!(D::DiagonalMatrix{Int64}, v::Int64, i::Int64, j::Int64)
@ Main ./In[26]:5
[2] top-level scope
@ In[29]:1
But that’s not all. Because of duck typing, all kinds of different functions now “just work”.
eltype(D) # element data type
Int64
D + D # addition
3×3 Matrix{Int64}:
2 0 0
0 4 0
0 0 10
D * D # multiplication
3×3 Matrix{Int64}:
1 0 0
0 4 0
0 0 25
inv(D) # inversion
3×3 Matrix{Float64}:
1.0 0.0 0.0
0.0 0.5 0.0
0.0 0.0 0.2
sin.(D) # broadcasting
3×3 Matrix{Float64}:
0.841471 0.0 0.0
0.0 0.909297 0.0
0.0 0.0 -0.958924
using LinearAlgebra
eigen(D) # eigensolver
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
1.0
2.0
5.0
vectors:
3×3 Matrix{Float64}:
1.0 0.0 0.0
0.0 1.0 0.0
0.0 0.0 1.0
Of course, so far, these operations have suboptimal performance because they don’t utilize the special structure inherent to our DiagonalMatrix but fall back to generic implementations.
@which D + D
Base.:+(Da::DiagonalMatrix, Db::DiagonalMatrix) = DiagonalMatrix(Da.diag + Db.diag)
@which D + D
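To see the specialized method in action, here is a small check (repeating the definitions from above so the snippet stands alone): addition now stays within our compact representation and only touches the stored diagonal vector.

```julia
# Definitions repeated from earlier in this notebook:
struct DiagonalMatrix{T} <: AbstractArray{T,2}
    diag::Vector{T}
end
Base.size(D::DiagonalMatrix) = (length(D.diag), length(D.diag))
Base.getindex(D::DiagonalMatrix, i::Int, j::Int) =
    i == j ? D.diag[i] : zero(eltype(D))
Base.:+(Da::DiagonalMatrix, Db::DiagonalMatrix) = DiagonalMatrix(Da.diag + Db.diag)

D = DiagonalMatrix([1, 2, 3])
S = D + D
S isa DiagonalMatrix  # true: no dense Matrix is materialized anymore
S.diag                # [2, 4, 6]
```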
Important note: user-defined types are just as good as built-in types! There is nothing special about built-in types; in fact, they are implemented in essentially the same way. By contrast, diagonal arrays in something like NumPy are implemented in C code outside the language itself.
Side Note: Consistency of Interfaces
Another benefit of the abstract type/interface/multiple dispatch style is that interfaces are very consistent within Julia, across not just the base language but across packages as well.
An article called Python vs. Julia: It’s also about Consistency shows this point quite well.
Benchmarking with BenchmarkTools.jl
Benchmarking is difficult to do right for many reasons:
- computers are noisy machines
- global vs local scope
- the first function call is special in Julia (more later)
- …
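The global-vs-local-scope point can be made concrete with only Base tools (a sketch; exact timings will vary by machine): the same loop runs far faster inside a function, where the argument has a concrete type, than at top level over an untyped global.

```julia
data = rand(10^6)            # an untyped global

function sum_loop(v)         # `v` is concretely typed inside the function
    s = 0.0
    for x in v
        s += x
    end
    return s
end

sum_loop(data)               # warm up: the first call includes compilation
t_function = @elapsed sum_loop(data)

s = 0.0
t_global = @elapsed for x in data
    global s += x            # every iteration re-checks the global's type
end

t_global > t_function        # true, typically by orders of magnitude
```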
g(x) = x + 2 * x
g (generic function with 1 method)
x = rand(2, 2)
@time g.(x)
0.059905 seconds (218.68 k allocations: 11.185 MiB, 99.52% compilation time)
2×2 Matrix{Float64}:
2.17578 0.275548
2.57232 0.509042
function f()
    x = rand(2, 2)
    @time g.(x)
end
f (generic function with 1 method)
f()
0.000001 seconds (1 allocation: 96 bytes)
2×2 Matrix{Float64}:
2.84491 1.27985
2.89121 1.09645
Fortunately, there are tools that do this for us, and they also collect some statistics by running the benchmark multiple times.
General rule: don’t use @time but @btime from BenchmarkTools.jl, and interpolate ($) input arguments.
using BenchmarkTools
@btime g.($x)
44.073 ns (1 allocation: 96 bytes)
2×2 Matrix{Float64}:
2.17578 0.275548
2.57232 0.509042
@benchmark g.($x)
BenchmarkTools.Trial: 10000 samples with 990 evaluations.
Range (min … max): 44.502 ns … 1.551 μs ┊ GC (min … max): 0.00% … 96.36%
Time (median): 46.556 ns ┊ GC (median): 0.00%
Time (mean ± σ): 51.344 ns ± 54.011 ns ┊ GC (mean ± σ): 4.63% ± 4.29%
▃▇█▇▄▂▂▂▂▂▁▂▁▁▁ ▁▂▁ ▂
█████████████████▆▄▄▄▃▃▁▁▁▃▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▅█████▇▆▇██ █
44.5 ns Histogram: log(frequency) by time 92 ns <
Memory estimate: 96 bytes, allocs estimate: 1.
Custom types are just as good as built-in types
Let’s compare our custom DiagonalMatrix against the standard Diagonal type that ships in the LinearAlgebra standard library.
using LinearAlgebra
x = rand(100);
Djl = Diagonal(x)
D = DiagonalMatrix(x)
@btime $Djl + $Djl;
@btime $D + $D;
91.078 ns (1 allocation: 896 bytes)
90.776 ns (1 allocation: 896 bytes)
Example 3: Cooler Diagonal Matrix
What we made works quite well, but it could be even neater if we removed the Vector restriction and instead just used AbstractVectors. That way it could even work with ranges!
struct CoolerDiagonalMatrix{T} <: AbstractArray{T, 2}
    diag::AbstractVector{T}
end

# As before, size is needed to complete the AbstractArray interface.
Base.size(D::CoolerDiagonalMatrix) = (length(D.diag), length(D.diag))

function Base.getindex(D::CoolerDiagonalMatrix, i::Int, j::Int)
    if i == j
        return D.diag[i]
    else
        return zero(eltype(D))
    end
end
Base.:+(Da::CoolerDiagonalMatrix, Db::CoolerDiagonalMatrix) = CoolerDiagonalMatrix(Da.diag + Db.diag)
# Old version:
D = DiagonalMatrix(collect(1:10_000))
@btime $D + $D
# New:
D_cool = CoolerDiagonalMatrix(1:10_000)
@btime $D_cool + $D_cool;
5.581 μs (2 allocations: 78.17 KiB)
122.269 ns (3 allocations: 112 bytes)
Works! But… why are the allocations up? Let’s explore more:
D = DiagonalMatrix(collect(1:100))
@btime $D + $D
D_cool = CoolerDiagonalMatrix(collect(1:100))
@btime $D_cool + $D_cool;
86.342 ns (1 allocation: 896 bytes)
193.898 ns (2 allocations: 912 bytes)
Huh… the new type is slower even with the same arguments? Why?
We’ll explore this in the sessions tomorrow.
p.s. Numpy:
In [1]: import numpy as np
In [2]: d = np.diag(np.arange(1000))
In [3]: %timeit d + d
2.56 µs ± 35.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
This is still slower than any of the Julia versions above.
using PythonCall
@btime @pyexec """
import numpy as np
d = np.diag(np.arange(1_000))
d + d
"""
1.110 ms (0 allocations: 0 bytes)
Core messages of this notebook
Duck typing is about shared behavior instead of shared structure.
User defined types are as good as built-in types.
We can extend Base functions for our types to implement arithmetic and more.
Subtyping an existing interface can give lots of functionality for free.
Functions should almost always be benchmarked with BenchmarkTools.jl’s @btime and @benchmark instead of @time.