
Using Macro to avoid typing in Julia

Tags:

julia

I have read that Julia has macros, but I am unsure whether the macros Julia provides are the kind I am thinking of.

I have the following expression:

Global.data[
  Dates.value(Dates.Year(dtCursor)) - 2000, 
  Dates.value(Dates.Month(dtCursor)), 
  Dates.value(Dates.Day(dtCursor)),
  Dates.value(Dates.Hour(dtCursor)) + 1, 
  Dates.value(Dates.Minute(dtCursor)) + 1, 
  1
]

And I repeat this a lot. I am wondering if I could have a macro that, given dtCursor as a parameter (it might be another variable in other cases), types all of that for me. I am therefore looking for the macro expansion functionality traditionally found in macro assemblers.

I definitively do not want to include this as a function as this code is executed tens of thousands of times and therefore I do not want to add the overhead of a function call.

I have tried:

macro readData(_dtCursor, value)
  return :(
    Global.data[
      Dates.value(Dates.Year(_dtCursor)) - 2000, 
      Dates.value(Dates.Month(_dtCursor)), 
      Dates.value(Dates.Day(_dtCursor)),
      Dates.value(Dates.Hour(_dtCursor)) + 1, 
      Dates.value(Dates.Minute(_dtCursor)) + 1, 
      value
    ]
  )
end

And later be invoked by:

println(@readData(dtCursor, 1))

Where dtCursor is a DateTime variable.

But I am getting:

ERROR: LoadError: UndefVarError: _dtCursor not defined

I have read https://docs.julialang.org/en/v1/manual/metaprogramming/index.html#man-macros-1 but a bit of help understanding what to do in this case would be really welcome.

M.E. asked Sep 15 '19 08:09


2 Answers

Use a function

I definitively do not want to include this as a function as this code is executed tens of thousands of times and therefore I do not want to add the overhead of a function call.

You are definitively wrong.
You might be right in some languages, but not in JuliaLang.
(I do think this is a very useful question, though, because it can show others not to do this 😀)

That function call is inlined away, and even if it weren't, we have other tools (@inline) that we would want to use before resorting to a macro.

Macros are for syntactic transformations.
If you are not doing a syntactic transformation, think again before using macros.

Steven G. Johnson made a good point about this in his JuliaCon keynote: "Functions are mostly good enough for Jeff Bezanson. Don't try to outsmart Jeff Bezanson"
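As a minimal sketch of the function-first approach (not the asker's actual code): demo_data and read_data are hypothetical names, and the small array only stands in for Global.data. The @inline annotation is just a hint to the compiler; a function this small is usually inlined without it.

```julia
using Dates

# Hypothetical stand-in for Global.data, indexed as
# (year - 2000, month, day, hour + 1, minute + 1, channel).
const demo_data = zeros(Int, 1, 12, 31, 24, 60, 1)

# @inline asks the compiler to inline this function at its call sites.
@inline function read_data(data, dt::DateTime, channel)
    data[
        Dates.value(Dates.Year(dt)) - 2000,
        Dates.value(Dates.Month(dt)),
        Dates.value(Dates.Day(dt)),
        Dates.value(Dates.Hour(dt)) + 1,
        Dates.value(Dates.Minute(dt)) + 1,
        channel,
    ]
end

# Store a marker value, then read it back through the helper.
demo_data[1, 9, 15, 9, 10, 1] = 42
dt = DateTime(2001, 9, 15, 8, 9)
println(read_data(demo_data, dt, 1))  # prints 42
```

Passing the array as an argument instead of reading a global also keeps the helper usable with other data sets.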

How to write it as a Macro and as a Function

The following answers your original question.

using Dates
using BenchmarkTools


macro readData(_dtCursor, value)
  return :(
    Global.data[
      Dates.value(Dates.Year($(esc(_dtCursor)))) - 2000, 
      Dates.value(Dates.Month($(esc(_dtCursor)))), 
      Dates.value(Dates.Day($(esc(_dtCursor)))),
      Dates.value(Dates.Hour($(esc(_dtCursor)))) + 1, 
      Dates.value(Dates.Minute($(esc(_dtCursor)))) + 1, 
      $value
    ]
  )
end


function readData(_dtCursor, value)
    Global.data[
      Dates.value(Dates.Year(_dtCursor)) - 2000, 
      Dates.value(Dates.Month(_dtCursor)), 
      Dates.value(Dates.Day(_dtCursor)),
      Dates.value(Dates.Hour(_dtCursor)) + 1, 
      Dates.value(Dates.Minute(_dtCursor)) + 1, 
      value
    ]
end

Benchmark it.

You say this is going to be run tens of thousands of times, so I will benchmark on 100_000 uses, just to be safe.


const Global = (; data=[join((y, m, d, h, M, s)," ") for y in 2000:2010, m in 1:3, d in 1:20, h in 1:10, M in 1:30, s in 1:30]);
size(Global.data)
length(Global.data)

const sample_dts = map(1:100_000) do _
   y, m, d, h, M, s = rand.(axes(Global.data))
   dt = DateTime(y+2000, m, d, h-1, M-1)
end;


func_demo() = [readData(dt, 3) for dt in sample_dts];
macro_demo() = [@readData(dt, 3) for dt in sample_dts];


@btime func_demo()
@btime macro_demo()

They benchmark as identical

julia> @btime macro_demo();
  5.409 ms (3 allocations: 781.34 KiB)

julia> @btime func_demo();
  5.393 ms (3 allocations: 781.34 KiB)

In fact, they specialize into (basically) the same code.

julia> @code_typed macro_demo()
CodeInfo(
1 ─ %1 = Main.sample_dts::Core.Compiler.Const(DateTime[2002-01-18T04:19:00, 2001-01-19T08:22:00, 2006-02-08T04:07:00, 2011-01-08T09:03:00, 2006-02-10T06:18:00, 2002-03-12T00:05:00, 2011-02-20T08:29:00, 2011-02-20T07:12:00, 2005-01-13T03:22:00, 2006-01-01T00:29:00  …  2005-03-10T04:29:00, 2002-03-12T09:11:00, 2002-03-11T00:28:00, 2007-02-12T02:26:00, 2003-02-15T07:29:00, 2009-01-01T02:02:00, 2009-01-03T02:11:00, 2001-02-16T03:16:00, 2004-01-17T05:12:00, 2010-02-02T05:10:00], false)
│   %2 = %new(Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##50#51"))}, getfield(Main, Symbol("##50#51"))(), %1)::Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##50#51"))}
│   %3 = invoke Base.collect(%2::Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##50#51"))})::Array{String,1}
└──      return %3
) => Array{String,1}


julia> @code_typed getfield(Main, Symbol("##50#51")).instance(1)  # check the internals
│   %1 = %1 = Main.Global::Core.Compiler.Const((#==GIANT Inlined Const ==#)
│   %2 = Base.getfield(%1, :data)::Array{String,6}
│   %3 = Base.sub_int(dt, 2000)::Int64
│   %4 = Base.add_int(dt, 1)::Int64
│   %5 = Base.add_int(dt, 1)::Int64
│   %6 = Base.arrayref(true, %2, %3, dt, dt, %4, %5, 3)::String
└──      return %6
) => String



julia> @code_typed func_demo()
CodeInfo(
1 ─ %1 = Main.sample_dts::Core.Compiler.Const(DateTime[2002-01-18T04:19:00, 2001-01-19T08:22:00, 2006-02-08T04:07:00, 2011-01-08T09:03:00, 2006-02-10T06:18:00, 2002-03-12T00:05:00, 2011-02-20T08:29:00, 2011-02-20T07:12:00, 2005-01-13T03:22:00, 2006-01-01T00:29:00  …  2005-03-10T04:29:00, 2002-03-12T09:11:00, 2002-03-11T00:28:00, 2007-02-12T02:26:00, 2003-02-15T07:29:00, 2009-01-01T02:02:00, 2009-01-03T02:11:00, 2001-02-16T03:16:00, 2004-01-17T05:12:00, 2010-02-02T05:10:00], false)
│   %2 = %new(Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##43#44"))}, getfield(Main, Symbol("##43#44"))(), %1)::Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##43#44"))}
│   %3 = invoke Base.collect(%2::Base.Generator{Array{DateTime,1},getfield(Main, Symbol("##43#44"))})::Array{String,1}
└──      return %3
) => Array{String,1}

julia> @code_typed getfield(Main, Symbol("##43#44")).instance(1)
CodeInfo(
1 ─ %1 = Main.Global::NamedTuple{(:data,),Tuple{Array{String,6}}}
│   %2 = Base.getfield(%1, :data)::Array{String,6}
│   %3 = Base.sub_int(dt, 2000)::Int64
│   %4 = Base.add_int(dt, 1)::Int64
│   %5 = Base.add_int(dt, 1)::Int64
│   %6 = Base.arrayref(true, %2, %3, dt, dt, %4, %5, 3)::String
└──      return %6
) => String

There is a very minor difference in the generator's function between the two: when inlining, the value became a Core.Compiler.Const in one case and a NamedTuple in the other. I think that difference goes away too once the code goes through LLVM. (Check @code_llvm if you're really interested, but we are already super deep into the weeds.)
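If you do want to look at the LLVM level, a comparison along these lines is one way to go about it. This is only a sketch on a toy operation: add_one_fn, @add_one, via_fn and via_macro are hypothetical names, not part of the code above, and the exact IR printed depends on your Julia version.

```julia
using InteractiveUtils  # provides @code_llvm outside the REPL

# The same tiny operation, once as a function and once as a macro.
add_one_fn(x) = x + 1

macro add_one(x)
    return :($(esc(x)) + 1)
end

via_fn(x) = add_one_fn(x)
via_macro(x) = @add_one(x)

# Print the LLVM IR generated for each wrapper so the two can be compared by eye.
@code_llvm via_fn(1)
@code_llvm via_macro(1)
```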

This is probably the wrong code to be optimizing in the first place.

Along with the guidance to benchmark any optimization you make, one should also profile the code to decide what is worth optimizing. A function that is only called tens of thousands of times and is not allocating giant arrays etc. is probably not worth worrying too much about, especially if you are just worried about function call overhead, which is only a handful of CPU cycles.
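For the profiling step, a minimal sketch using the Profile standard library might look like this. The work function is a hypothetical placeholder for whatever your program actually spends its time in, and the report you get depends on your machine.

```julia
using Profile

# Hypothetical workload standing in for the real program.
function work(n)
    total = 0.0
    for i in 1:n
        total += sqrt(i)
    end
    return total
end

work(10)                   # run once first so compilation is not profiled
Profile.clear()            # drop any samples collected earlier
@profile work(50_000_000)  # collect samples while the workload runs
Profile.print()            # show where the samples landed
```

Lines that barely show up in the report are not worth hand-optimizing.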

Lyndon White answered Sep 28 '22 18:09


You have to splice in the variables you pass as macro arguments:

julia> macro readData(dtCursor, value)
         return :(
           Global.data[
             Dates.value(Dates.Year($dtCursor)) - 2000, 
             Dates.value(Dates.Month($dtCursor)), 
             Dates.value(Dates.Day($dtCursor)),
             Dates.value(Dates.Hour($dtCursor)) + 1, 
             Dates.value(Dates.Minute($dtCursor)) + 1, 
             $value
           ]
         )
       end
@readData (macro with 1 method)

julia> @macroexpand @readData(dtCursor, 1)
:((Main.Global).data[(Main.Dates).value((Main.Dates).Year(Main.dtCursor)) - 2000, (Main.Dates).value((Main.Dates).Month(Main.dtCursor)), (Main.Dates).value((Main.Dates).Day(Main.dtCursor)), (Main.Dates).value((Main.Dates).Hour(Main.dtCursor)) + 1, (Main.Dates).value((Main.Dates).Minute(Main.dtCursor)) + 1, 1])

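Furthermore, Julia macros are hygienic, which means there will be no confusion between the name _dtCursor in the macro definition and the name dtCursor at the call site. One thing you might need to do, though, is escape the inputs with esc: in the expansion above, dtCursor has become Main.dtCursor, a global in the macro's module, so a local variable passed at the call site would not be found.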
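To illustrate the escaping point, here is a small sketch with a hypothetical, cut-down macro (@year_offset is not from the question) that is called on a local variable inside a function. With esc, the argument is resolved in the caller's scope:

```julia
using Dates

macro year_offset(dt)
    # esc(dt) leaves the argument untouched by hygiene, so it refers to
    # whatever `dt` names at the call site, including local variables.
    return :(Dates.value(Dates.Year($(esc(dt)))) - 2000)
end

function demo()
    local_cursor = DateTime(2019, 9, 15)
    return @year_offset(local_cursor)
end

println(demo())  # prints 19
```

Without the esc, the expansion would look for a global named local_cursor in the module where the macro was defined and fail with an UndefVarError.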

Also, this might be overkill. You should benchmark the macro version against the function version; there may be enough inlining happening that the macro doesn't actually matter.

phipsgabler answered Sep 28 '22 17:09