One of the main foci of development during Julia 1.6 has been to reduce latency, the delay between starting your session and getting useful work done. This is sometimes called “time to first plot,” although it applies to far more than just plotting. While a lot of work (and success) has gone into reducing latency in Julia 1.6, users and developers will naturally want to shrink it even more. This is the inaugural post in a short series devoted to the topic of what package developers can do to reduce latency for their users. This particular installment covers background material–some key underlying concepts and structures–that will hopefully be useful in later installments.
Sources of latency, and reducing it with precompile
Most of Julia’s latency is due to code loading and compilation. Julia’s dynamic nature also makes it vulnerable to invalidation and the subsequent need to recompile previously-compiled code; this topic has been covered in a previous blog post, and that material will not be rehashed here. In this series, it is assumed that invalidations are not a dominant source of latency. (You do not need to read the previous blog post to understand this one.)
In very rough terms, using SomePkg loads types and/or method definitions, after which calling SomePkg.f(args...) forces SomePkg.f to be compiled (if it hasn't been already) for the specific types in args.... The primary focus of this series is to explore the opportunity to reduce the cost of compilation. We'll focus on precompilation, the process behind the
julia> using SomePkg
[ Info: Precompiling SomePkg [12345678-abcd-9876-efab-1234abcd5e6f]
message you may have seen when loading a package, or the related Precompiling project... output that occurs after updating packages on Julia 1.6. During precompilation, Julia writes module, type, and method definitions in an efficient serialized form. Precompilation in its most basic form happens nearly automatically, but with a bit of manual intervention developers also have an opportunity to save additional information: partial results of compilation, specifically the results of the type-inference stage. Because type inference takes time, this can reduce the latency of the first use of methods in the package.
To motivate this series, let's start with a simple demonstration in which adding a single line to a package results in a five-fold decrease in latency. We'll start with a package that we can define in a few lines (thanks to Julia's metaprogramming capabilities) and that depends on very little external code, but which has been designed to have measurable latency. You can copy/paste the following into Julia's REPL (be aware that it creates a package directory DemoPkg inside your current directory):
julia> using Pkg; Pkg.generate("DemoPkg")
 Generating  project DemoPkg:
    DemoPkg/Project.toml
    DemoPkg/src/DemoPkg.jl
Dict{String, Base.UUID} with 1 entry:
  "DemoPkg" => UUID("4d70085e-4304-44c2-b3c3-070197146bfa")
julia> typedefs = join(["struct DemoType$i <: AbstractDemoType x::Int end; DemoType$i(d::AbstractDemoType) = DemoType$i(d.x)" for i = 0:1000], '\n');

julia> codeblock = join(["    d = DemoType$i(d)" for i = 1:1000], '\n');
julia> open("DemoPkg/src/DemoPkg.jl", "w") do io
           write(io, """
           module DemoPkg
           abstract type AbstractDemoType end
           $typedefs
           function f(x)
               d = DemoType0(x)
           $codeblock
               return d
           end
           end
           """)
       end
After executing this, you can open the DemoPkg.jl file to see what f actually looks like. If we load the package, the first call DemoPkg.f(5) takes some time:
julia> push!(LOAD_PATH, "DemoPkg/");

julia> using DemoPkg

julia> tstart = time(); DemoPkg.f(5); tend=time(); tend-tstart
0.28725290298461914
but the second one (in the same session) is much faster:
julia> tstart = time(); DemoPkg.f(5); tend=time(); tend-tstart
0.0007619857788085938
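As an aside, the same measurement can be written more compactly with Base's @elapsed macro (timings will of course vary by machine and session):

julia> @elapsed DemoPkg.f(5)   # in a fresh session, this includes compilation time

julia> @elapsed DemoPkg.f(5)   # later calls in the same session measure only execution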
The extra cost for the first invocation is the time spent compiling the method. We can save some of this time by precompiling it and saving the result to disk. All we need to do is add a single line to the module definition: either
- f(5), which executes f while the package is being precompiled (and remember, execution triggers compilation, the latter being our actual goal), or
- precompile(f, (Int,)), if we don't need the output of f(5) but only wish to trigger compilation of f for an Int argument.
Here we'll choose precompile:
julia> open("DemoPkg/src/DemoPkg.jl", "w") do io
           write(io, """
           module DemoPkg
           abstract type AbstractDemoType end
           $typedefs
           function f(x)
               d = DemoType0(x)
           $codeblock
               return d
           end
           precompile(f, (Int,))   # THE CRUCIAL ADDITION!
           end
           """)
       end
Now start a fresh session, load the package (you'll need that push!(LOAD_PATH, "DemoPkg/") again), and time it:
julia> tstart = time(); DemoPkg.f(5); tend=time(); tend-tstart
0.056242942810058594

julia> tstart = time(); DemoPkg.f(5); tend=time(); tend-tstart
0.0007371902465820312
It doesn't eliminate all the latency, but at roughly one-fifth of the original this is a major improvement in responsiveness. The fraction of compilation time saved by precompile depends on the balance between type inference and other aspects of code generation, which in turn depends strongly on the nature of the code: "type-heavy" code, such as this example, often seems to be dominated by inference, whereas "type-light" code (e.g., code that does a lot of numeric computation with just a few types and operations) tends to be dominated by other aspects of code generation.
While precompile can currently save only the time spent on type inference, in the long run it is hoped that Julia will also save the results from later stages of compilation. If that happens, precompile will have an even greater effect, and the savings will be less dependent on the balance between type inference and other forms of code generation.
How does this magic work? During package precompilation, Julia creates a *.ji file, typically stored in .julia/compiled/v1.x/, where 1.x is your version of Julia. Your *.ji file stores definitions of constants, types, and methods; this happens automatically while your package is being built. Optionally (if you've used a precompile directive, or executed methods while the package is being built), it may also include the results of type inference.
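You can verify this for DemoPkg with the MethodAnalysis package, which is introduced in more detail below; a quick hedged check (exact output may vary, and MethodAnalysis may need to be installed first): in a fresh session, the inferred instance of f is present before f has ever been called:

julia> push!(LOAD_PATH, "DemoPkg/"); using DemoPkg

julia> using MethodAnalysis

julia> methodinstance(DemoPkg.f, (Int,))   # already present, thanks to the precompile directive
MethodInstance for f(::Int64)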
Box 1
It might be natural to wonder, "How does precompile help? Doesn't it just shift the cost of compilation to the time when I load the package?" The answer is "no," because a *.ji file is not a recording of all the steps you take when you define the module: instead, it's a snapshot of the results of those steps. If you define a package
module PackageThatPrints
println("This prints only during precompilation")
function __init__()
    println("This prints every time the package is loaded")
end
end
you'll see that things that happen transiently do not "make it" into the precompile file: the first println displays only when you build the package, whereas the second one prints on every subsequent using PackageThatPrints, even when that doesn't require rebuilding the package.
To "make it" into the precompile file, statements have to be linked to constants, types, methods, and other durable code constructs. The __init__ function is special in that, if present, it is called automatically at the end of module loading.
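For example, here is a minimal sketch (with hypothetical names UsesInit and NCORES) of a common pattern: machine-dependent state is computed in __init__ rather than at the top level, so the value reflects the machine the package is loaded on, not the one it was precompiled on:

module UsesInit   # hypothetical package name

# The placeholder value below is what gets baked into the *.ji file...
const NCORES = Ref(0)

# ...and __init__ overwrites it on every `using UsesInit`.
function __init__()
    NCORES[] = Sys.CPU_THREADS
end

end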
A precompile directive runs during precompilation, but the only things relevant for the *.ji file are the results (the compiled code) that it produces. Compiled objects (specifically the MethodInstances described below) may be written to the *.ji file, and when you load the package those objects get loaded as well. Loading the results of type inference does take some time, but typically it's a fair bit quicker than computing inference results from scratch.
Now that we've introduced the promise of precompile, it's time to acknowledge that this topic is complex. How do you know how much of your latency is due to type inference? Moreover, even when type inference is the dominant source of latency, you can still find yourself in circumstances where it is difficult to eliminate most of its cost. In previous Julia versions, this led to more than a little frustration with precompile. One source of trouble was invalidation, which frequently "spoiled" precompilation on earlier Julia versions, but that has been greatly improved (mostly behind the scenes, i.e., without package developers needing to do anything) in Julia 1.6. With invalidations largely eliminated, the trickiest remaining aspect of precompilation is one of code ownership: where should the results of precompilation be stored? When a bit of code requires methods from one package or library and types from another, how do you (or how does Julia) decide where to store the compiled code?
In this blog post, we take a big step back and start peering under the hood. The goal is to understand why precompile sometimes has dramatic benefits, why it sometimes has nearly none at all, and, when it fails, how to rescue the situation. To do that, we'll have to understand some of the chains of dependencies that link various bits of Julia code together.
Type-inference, MethodInstances, and backedges
We'll introduce these concepts via a simple demo (users are encouraged to try this and follow along). First, let's open the Julia REPL and define the following methods:
double(x::Real) = 2x
calldouble(container) = double(container[1])
calldouble2(container) = calldouble(container)
calldouble2 calls calldouble, which calls double on the first element in container. Let's create a container object and run this code:
julia> c64 = [1.0]
1-element Vector{Float64}:
 1.0

julia> calldouble2(c64) # running it compiles the methods for these types
2.0
Now, let’s take a brief trip into some internals to understand what Julia’s compiler did when preparing to run that statement. It will be easiest to use the MethodAnalysis package:
julia> using MethodAnalysis

julia> mi = methodinstance(double, (Float64,))
MethodInstance for double(::Float64)
methodinstance is a lot like which, except it asks about type-inferred code. We asked methodinstance to find an instance of double that had been inferred for a single Float64 argument; the fact that it returned a MethodInstance, rather than nothing, indicates that this instance already existed: the method had already been inferred for this argument type because we ran calldouble2(c64), which indirectly called double(::Float64). If you currently try methodinstance(double, (Int,)), you should get nothing, because we've never called double with an Int argument.
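To watch a new instance appear, you can call double with an Int and query again (a quick sketch; if you try it, consider doing so in a throwaway session so the instance counts shown later in this post are unchanged):

julia> double(1)    # compiles double for an Int argument
2

julia> methodinstance(double, (Int,))
MethodInstance for double(::Int64)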
One of the crucial features of type-inference is that it keeps track of dependencies:
julia> using AbstractTrees

julia> print_tree(mi)
MethodInstance for double(::Float64)
└─ MethodInstance for calldouble(::Vector{Float64})
   └─ MethodInstance for calldouble2(::Vector{Float64})
This indicates that the result of type inference on calldouble2(::Vector{Float64}) depended on the result for calldouble(::Vector{Float64}), which in turn depended on double(::Float64). That should make sense: there is no way Julia can know what type calldouble2 returns unless it understands what its callees do. This is our first example of a chain of dependencies, which will be a crucial component of understanding how Julia decides where to stash the results of compilation. In encoding this dependency chain, the callee (e.g., double) stores a link to the caller (e.g., calldouble); as a consequence, these links are typically called backedges.
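If you want to peek at the raw data, the backedges are stored on the MethodInstance itself; a hedged sketch (backedges is an internal field, left undefined until a caller is recorded, and its exact contents can differ across Julia versions):

julia> midouble = methodinstance(double, (Float64,));

julia> isdefined(midouble, :backedges) ? midouble.backedges : "no backedges recorded"
1-element Vector{Any}:
 MethodInstance for calldouble(::Vector{Float64})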
Box 2
Backedges don't just apply to code you write yourself, and they can link code across modules. For example, to implement 2x, our double(::Float64) calls *(::Int, ::Float64):
julia> mi = methodinstance(*, (Int, Float64))
MethodInstance for *(::Int64, ::Float64)
We can see which Method this instance is from:
julia> mi.def
*(x::Number, y::Number) in Base at promotion.jl:322
This is defined in Julia's own Base module. If we've run calldouble2(c64), our own double is listed as one of its backedges:
julia> direct_backedges(mi)
5-element Vector{Core.MethodInstance}:
 MethodInstance for parse_inf(::Base.TOML.Parser, ::Int64)
 MethodInstance for init(::Int64, ::Float64)
 MethodInstance for show_progress(::IOContext{IOBuffer}, ::Pkg.MiniProgressBars.MiniProgressBar)
 MethodInstance for show_progress(::IO, ::Pkg.MiniProgressBars.MiniProgressBar)
 MethodInstance for double(::Float64)
direct_backedges, as its name implies, returns a list of the compiled direct callers. (all_backedges returns both direct and indirect callers.) The specific list you get here may depend on what other packages you've loaded, and
julia> print_tree(mi)
MethodInstance for *(::Int64, ::Float64)
├─ MethodInstance for parse_inf(::Parser, ::Int64)
│  └─ MethodInstance for parse_number_or_date_start(::Parser)
│     └─ MethodInstance for parse_value(::Parser)
│        ├─ MethodInstance for parse_entry(::Parser, ::Dict{String, Any})
│        │  ├─ MethodInstance for parse_inline_table(::Parser)
│        │  │  ⋮
│        │  │
│        │  └─ MethodInstance for parse_toplevel(::Parser)
│        │     ⋮
│        │
│        └─ MethodInstance for parse_array(::Parser)
│           └─ MethodInstance for parse_value(::Parser)
│              ⋮
│
├─ MethodInstance for init(::Int64, ::Float64)
│  └─ MethodInstance for __init__()
├─ MethodInstance for show_progress(::IOContext{IOBuffer}, ::MiniProgressBar)
│  └─ MethodInstance for (::var"#59#63"{Int64, Bool, MiniProgressBar, Bool, PackageSpec})(::IOContext{IOBuffer})
├─ MethodInstance for show_progress(::IO, ::MiniProgressBar)
└─ MethodInstance for double(::Float64)
   └─ MethodInstance for calldouble(::Vector{Float64})
      └─ MethodInstance for calldouble2(::Vector{Float64})
might be dramatically more complex if you’ve loaded and used large packages that do a lot of computation.
Box 3
Generally, the set of backedges is a graph, not a tree: in real code, it's possible for f to call itself (e.g., fibonacci(n) = fibonacci(n-1) + fibonacci(n-2)), or for f to call g which calls f. When following backedges, MethodAnalysis omits MethodInstances that have appeared previously, thus performing a "search" of the graph. The results of this search pattern can be visualized as a tree.
Type inference behaves similarly: it caches its results, and thus infers each MethodInstance only once. (One wrinkle is constant propagation, which can cause the same MethodInstance to be re-inferred for different constant values.) As a consequence, inference also performs a depth-first search of the call graph.
The creation of backedges is more subtle than it may seem at first glance. To start getting a hint of some of the complexities, first note that currently these are the only inferred instances of these methods:
julia> methodinstances(double)
1-element Vector{Core.MethodInstance}:
 MethodInstance for double(::Float64)

julia> methodinstances(calldouble)
1-element Vector{Core.MethodInstance}:
 MethodInstance for calldouble(::Vector{Float64})

julia> methodinstances(calldouble2)
1-element Vector{Core.MethodInstance}:
 MethodInstance for calldouble2(::Vector{Float64})
While methodinstance(f, typs) returns a specific MethodInstance, methodinstances(f) returns all inferred instances of f.
Let's see if we can get Julia to add some additional instances. We'll create a new container, but with a twist: this time it will have an abstract element type, so that Julia's type inference cannot accurately predict the type of the elements in the container. The element type will be AbstractFloat, an abstract type with several subtypes; every actual instance has to have a concrete type, and just to make sure it's a new type (triggering new compilation) we'll use Float32:
julia> cabs = AbstractFloat[1.0f0] # store a `Float32` inside a `Vector{AbstractFloat}`
1-element Vector{AbstractFloat}:
 1.0f0

julia> calldouble2(cabs) # compile for these new types
2.0f0
Now let’s look at the available instances:
julia> mis = methodinstances(double)
3-element Vector{Core.MethodInstance}:
 MethodInstance for double(::Float64)
 MethodInstance for double(::AbstractFloat)
 MethodInstance for double(::Float32)
We see that there are not two but three type-inferred instances of double: one for Float64, one for Float32, and one for AbstractFloat. Let's check the backedges of each:
julia> print_tree(mis[1])
MethodInstance for double(::Float64)
└─ MethodInstance for calldouble(::Vector{Float64})
   └─ MethodInstance for calldouble2(::Vector{Float64})

julia> print_tree(mis[2])
MethodInstance for double(::AbstractFloat)

julia> print_tree(mis[3])
MethodInstance for double(::Float32)
Why does the first have backedges to calldouble and then to calldouble2, but the second two do not? Moreover, why does every instance of calldouble have backedges to calldouble2,
julia> mis = methodinstances(calldouble)
2-element Vector{Core.MethodInstance}:
 MethodInstance for calldouble(::Vector{Float64})
 MethodInstance for calldouble(::Vector{AbstractFloat})

julia> print_tree(mis[1])
MethodInstance for calldouble(::Vector{Float64})
└─ MethodInstance for calldouble2(::Vector{Float64})

julia> print_tree(mis[2])
MethodInstance for calldouble(::Vector{AbstractFloat})
└─ MethodInstance for calldouble2(::Vector{AbstractFloat})
in seeming contradiction of the fact that some instances of double lack backedges to calldouble? The results here reflect the success or failure of concrete type inference. In contrast with Float64 and Float32, AbstractFloat is not a concrete type:
julia> isconcretetype(Float32)
true

julia> isconcretetype(AbstractFloat)
false
It may surprise some readers that Vector{AbstractFloat} is concrete:
julia> isconcretetype(Vector{Float32})
true

julia> isconcretetype(Vector{AbstractFloat})
true
The container is concrete (it has a fully specified storage scheme and layout in memory) even if the elements are not.
Exercise 1
Is AbstractVector{AbstractFloat} abstract or concrete? How about AbstractVector{Float32}? Check your answers using isconcretetype.
To look more deeply into the implications of concreteness for inference, a useful tool is @code_warntype. You can see the difference between c64 and cabs, especially if you run this in the REPL yourself, where you can see the red highlighting:
julia> @code_warntype calldouble2(c64)
Variables
  #self#::Core.Const(calldouble2)
  container::Vector{Float64}

Body::Float64
1 ─ %1 = Main.calldouble(container)::Float64
└──      return %1
julia> @code_warntype calldouble2(cabs)
Variables
  #self#::Core.Const(calldouble2)
  container::Vector{AbstractFloat}

Body::Any
1 ─ %1 = Main.calldouble(container)::Any
└──      return %1
Note that only the return type (::Float64 vs ::Any) differs between these; this is what accounts for the fact that calldouble has backedges to calldouble2 in both cases, because in both cases the specific caller/callee chain can be successfully inferred. The really big differences emerge one level lower:
julia> @code_warntype calldouble(c64)
Variables
  #self#::Core.Const(calldouble)
  container::Vector{Float64}

Body::Float64
1 ─ %1 = Base.getindex(container, 1)::Float64
│   %2 = Main.double(%1)::Float64
└──      return %2
julia> @code_warntype calldouble(cabs)
Variables
  #self#::Core.Const(calldouble)
  container::Vector{AbstractFloat}

Body::Any
1 ─ %1 = Base.getindex(container, 1)::AbstractFloat
│   %2 = Main.double(%1)::Any
└──      return %2
In the first case, getindex was guaranteed to return a Float64, but in the second case it's only known to return an AbstractFloat. Moreover, type inference cannot predict a concrete type for the return of double(::AbstractFloat), though it can for double(::Float64). Consequently the call with ::AbstractFloat is made via runtime dispatch: execution pauses, Julia asks for the concrete type of the object, and then it makes the appropriate call to double (in the case of cabs[1], to double(::Float32)).
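The same conclusion can be reached without reading lowered code: Base.return_types reports what inference concludes for a given argument-type signature. A hedged sketch (the exact inferred types may vary across Julia versions):

julia> Base.return_types(double, (Float64,))    # inference succeeds with a concrete result
1-element Vector{Any}:
 Float64

julia> Base.return_types(double, (AbstractFloat,))    # inference cannot narrow the result
1-element Vector{Any}:
 Any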
For completeness, what happens if we add another container with concrete eltype?
julia> c32 = [1.0f0]
1-element Vector{Float32}:
 1.0

julia> calldouble2(c32)
2.0f0

julia> mis = methodinstances(double)
3-element Vector{Core.MethodInstance}:
 MethodInstance for double(::Float64)
 MethodInstance for double(::AbstractFloat)
 MethodInstance for double(::Float32)

julia> print_tree(mis[1])
MethodInstance for double(::Float64)
└─ MethodInstance for calldouble(::Vector{Float64})
   └─ MethodInstance for calldouble2(::Vector{Float64})

julia> print_tree(mis[2])
MethodInstance for double(::AbstractFloat)

julia> print_tree(mis[3])
MethodInstance for double(::Float32)
└─ MethodInstance for calldouble(::Vector{Float32})
   └─ MethodInstance for calldouble2(::Vector{Float32})
So now both concretely-inferred versions of double link all the way back to calldouble2, but only when the element type of the container is also concrete. A single MethodInstance may be called by multiple MethodInstances, but most commonly a backedge is created only when the call can be inferred.
Exercise 2
Does Julia ever compile methods, and introduce backedges, for abstract types? Start a fresh session, and instead of using the definitions above, define double using @nospecialize:
double(@nospecialize(x::Real)) = 2x
Now compare what kind of backedges you get with c64 and cabs. It may be most informative to quit your session and start fresh between trying these two different container types. You'll see that Julia is quite the opportunist when it comes to specialization!
Precompilation and backedges
Let’s turn the example above into a package:
julia> using Pkg; Pkg.generate("BackedgeDemo")
 Generating  project BackedgeDemo:
    BackedgeDemo/Project.toml
    BackedgeDemo/src/BackedgeDemo.jl
Dict{String, Base.UUID} with 1 entry:
  "BackedgeDemo" => UUID("35dad884-25a6-48ad-b13b-11b63ee56c40")

julia> open("BackedgeDemo/src/BackedgeDemo.jl", "w") do io
           write(io, """
           module BackedgeDemo
           double(x::Real) = 2x
           calldouble(container) = double(container[1])
           calldouble2(container) = calldouble(container)
           precompile(calldouble2, (Vector{Float32},))
           precompile(calldouble2, (Vector{Float64},))
           precompile(calldouble2, (Vector{AbstractFloat},))
           end
           """)
       end
282
You can see we created a package and defined those three methods. Crucially, we've also added three precompile directives, all for the top-level calldouble2. We did not add any explicit precompile directives for its callees calldouble, double, or anything needed by double (like * to implement 2x).
Now let's load this package and see if we have any MethodInstances:
julia> push!(LOAD_PATH, "BackedgeDemo/")
4-element Vector{String}:
 "@"
 "@v#.#"
 "@stdlib"
 "BackedgeDemo/"

julia> using BackedgeDemo
[ Info: Precompiling BackedgeDemo [44c70eed-03a3-46c0-8383-afc033fb6a27]

julia> using MethodAnalysis

julia> methodinstances(BackedgeDemo.double)
3-element Vector{Core.MethodInstance}:
 MethodInstance for double(::Float32)
 MethodInstance for double(::Float64)
 MethodInstance for double(::AbstractFloat)
Hooray! Even though we haven't used this code in this session, the type-inferred MethodInstances are already there! (This is true only because of those precompile directives.) You can also verify that the same backedges get created as when we ran this code interactively above; one such check is sketched below. We have successfully saved the results of type inference.
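For example, a quick check of the backedge chain (this matches the interactive session above; AbstractTrees supplies print_tree):

julia> using AbstractTrees

julia> print_tree(methodinstance(BackedgeDemo.double, (Float64,)))
MethodInstance for double(::Float64)
└─ MethodInstance for calldouble(::Vector{Float64})
   └─ MethodInstance for calldouble2(::Vector{Float64})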
These MethodInstances got cached in BackedgeDemo.ji. It's worth noting that even though the precompile directive got issued from this package, MethodInstances for methods defined in other packages or libraries can be saved as well. For example, Julia does not come pre-built with the inferred code for Int * Float32: in a fresh session,
julia> using MethodAnalysis

julia> mi = methodinstance(*, (Int, Float32))
returns nothing (the MethodInstance doesn't exist), whereas if we've loaded BackedgeDemo then
julia> mi = methodinstance(*, (Int, Float32))
MethodInstance for *(::Int64, ::Float32)

julia> mi.def # what Method is this MethodInstance from?
*(x::Number, y::Number) in Base at promotion.jl:322
So even though the method is defined in Base, because BackedgeDemo needed this type-inferred code it got stashed in BackedgeDemo.ji.
This is fantastic, because it means the complete results of type inference can be saved, even when they cross boundaries between packages and libraries. Nevertheless, there are significant limitations to this ability to stash MethodInstances from other modules. Most crucially, *.ji files can only hold code they "own," either:
- for a method defined in the package, or
- through a chain of backedges to a method defined by the package.
The sketch after this list illustrates the second case.
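After loading BackedgeDemo, the cross-module instance of * is reachable through code the package owns; a hedged sketch (other callers from any other loaded packages may also appear in the tree):

julia> using MethodAnalysis, AbstractTrees

julia> mi = methodinstance(*, (Int, Float32));

julia> print_tree(mi)
MethodInstance for *(::Int64, ::Float32)
└─ MethodInstance for double(::Float32)
   └─ MethodInstance for calldouble(::Vector{Float32})
      └─ MethodInstance for calldouble2(::Vector{Float32})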
Exercise 3
To see this limitation in action, delete the precompile(calldouble2, (Vector{Float32},)) directive from BackedgeDemo.jl, so that it has only
precompile(calldouble2, (Vector{Float64},))
precompile(calldouble2, (Vector{AbstractFloat},))
but then add
precompile(*, (Int, Float32))
in an attempt to force inference of that method anyway.
Start a fresh session and load the package (it should precompile again), and check whether methodinstance(*, (Int, Float32)) returns a MethodInstance or nothing. Also run print_tree on each item in methodinstances(BackedgeDemo.double).
Where there is no "chain of ownership" to BackedgeDemo, Julia doesn't know where to stash the MethodInstances that get created by precompile; those MethodInstances get created, but they do not get incorporated into the *.ji file because there are no module-owned MethodInstances that they link back to. Consequently, we can't precompile methods defined in other modules in and of themselves; we can only do it if those methods are linked by backedges to this package.
In practice, this means that even when packages add precompile directives, if there are a lot of type-inference failures the results can be very incomplete and the resulting savings may be small.
Quiz
Add a new type to BackedgeDemo:
export SCDType
struct SCDType end
and a precompile directive for Base.push!:
precompile(push!, (Vector{SCDType}, SCDType))
Now load the package and check whether the corresponding MethodInstance exists. If not, can you think of a way to get that MethodInstance added to the *.ji file? The answer is at the bottom of this post.
Box 4
precompile can also be passed a complete Tuple-type: precompile(calldouble2, (Vector{AbstractFloat},)) can alternatively be written
precompile(Tuple{typeof(calldouble2), Vector{AbstractFloat}})
This form appears frequently when precompile directives are issued by code that inspects MethodInstances, because this signature is stored in the specTypes field of a MethodInstance:
julia> mi = methodinstance(BackedgeDemo.double, (AbstractFloat,))
MethodInstance for double(::AbstractFloat)

julia> mi.specTypes
Tuple{typeof(BackedgeDemo.double), AbstractFloat}
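This is why tools that emit precompile directives often just loop over existing MethodInstances and print their specTypes; a minimal sketch (the printed form of the types is approximate and may vary):

julia> for mi in methodinstances(BackedgeDemo.double)
           println("precompile(", mi.specTypes, ")")
       end
precompile(Tuple{typeof(BackedgeDemo.double), Float32})
precompile(Tuple{typeof(BackedgeDemo.double), Float64})
precompile(Tuple{typeof(BackedgeDemo.double), AbstractFloat})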
Box 5
One other topic we've not yet discussed is that when precompile fails, it does so "almost" silently:
julia> methods(double)
# 1 method for generic function "double":
[1] double(x::Real) in BackedgeDemo at /tmp/BackedgeDemo/src/BackedgeDemo.jl:3

julia> precompile(double, (String,))
false
Even though double can't be compiled for a String argument, the corresponding precompile doesn't error; it only returns false. If you want to monitor the utility of your precompile directives, it's sometimes useful to preface them with @assert: all's well if precompilation succeeds, but if changes to the package mean that a precompile directive has "gone bad," you get an error. Hopefully, such errors would be caught before shipping the package to users!
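A minimal sketch of how that might look inside the module definition:

# Errors during package precompilation if the directive can no longer compile:
@assert precompile(calldouble2, (Vector{Float64},))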
Summary
In this tutorial, we've learned about MethodInstances, backedges, inference, and precompilation. Some important take-home messages are:
- you can store the results of type inference with explicit precompile directives
- to be useful, precompile has to be able to establish a chain of ownership to some package
- chains of ownership are bigger and more complete when type inference succeeds
An important conclusion is that precompilation works better when type inference succeeds. For some packages, time invested in improving inferrability can make your precompile directives work better.
Looking ahead
Future installments will focus on describing some powerful new tools:
- tools to measure how inference is spending its time
- tools to help make decisions about (de)specialization
- tools to detect and fix inference failures
- tools to generate effective precompile directives
Stay tuned!
Answer to quiz
Directly precompiling push!(::Vector{SCDType}, ::SCDType) fails because, while your package "owns" SCDType, it does not own the method of push!.
However, if you add a method that calls push! and then precompile it,
dopush() = push!(SCDType[], SCDType())
precompile(dopush, ())
then the MethodInstance for push!(::Vector{SCDType}, ::SCDType) will be added to the package through the backedge to dopush (which you do own).
This was an artificial example, but in more typical cases this happens organically through the functionality of your package. But again, this works only for inferrable calls.
By Tim Holy
Source Julia Programming Language