Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I optimise this further so that it runs faster?

As you can see in the following pprof output, I have these nested for loops which take most of the time of my program. Source is in golang but code is explained below:

  8.55mins    1.18hrs     20:   for k := range mapSource {
  4.41mins    1.20hrs     21:           if positions, found := mapTarget[k]; found {
         .          .     22:                   // save all matches
  1.05mins   1.05mins     23:                   for _, targetPos := range positions {
  2.25mins   2.33mins     24:                           for _, sourcePos := range mapSource[k] {
     1.28s     15.78s     25:                                   matches = append(matches, match{int32(targetPos), int32(sourcePos)})
         .          .     26:                           }
         .          .     27:                   }
         .          .     28:           }
         .          .     29:   }

At the moment the structures I'm using are 2 map[int32][]int32, targetMap and sourceMap.

These maps contain, for a given key, an array of ints. Now I want to find the keys that match in both maps, and save the combinations of the elements in the arrays.

So for example:

sourceMap[1] = [3,4]
sourceMap[5] = [9,10]

targetMap[1] = [1,2,3]
targetMap[2] = [2,3]
targetMap[3] = [1,2]

The only key in common is 1 and the result will be [(3,1), (3,2), (3,3), (4,1), (4,2), (4,3)]

Is there any possible way (a more appropriate data structure or whatever) that could improve the speed of my program?

In my case, maps can contain somewhere between 1000 and 150000 keys, while the arrays inside are usually pretty small.

EDIT: Concurrency is not an option as this is already being run several times in several threads at the same time.

like image 358
Ivan Avatar asked Oct 31 '17 17:10

Ivan


1 Answers

Can I optimise this further so that it runs faster?

Is there any possible way (a more appropriate data structure or whatever) that could improve the speed of my program?

Probably.


The XY problem is asking about your attempted solution rather than your actual problem. This leads to enormous amounts of wasted time and energy, both on the part of people asking for help, and on the part of those providing help.


We don't have even the most basic information about your problem, a description of the form, content, and frequency of your original input data, and your desired output. What original data should drive a benchmark?

I created some fictional original data, which produced some fictional output and results:

BenchmarkPeterSO-4   30    44089894 ns/op    5776666 B/op      31 allocs/op
BenchmarkIvan-4      10   152300554 ns/op   26023924 B/op    6022 allocs/op

It is possible that your algorithms are slow.

like image 167
peterSO Avatar answered Sep 20 '22 19:09

peterSO