Efficient way to multiply a large set of small numbers

Tags:

algorithm

multiplication

This question was asked in an interview.

You have an array of small integers and have to multiply all of them. You need not worry about overflow; you have ample support for that. What can you do to speed up the multiplication on your machine?

Would multiple additions be better in this case?

I suggested multiplying using a divide and conquer approach but the interviewer was not impressed. What could be the best possible solution for this?

asked Apr 15 '14 17:04

Sohaib

1 Answer

Here are some thoughts:

Divide-and-Conquer with Multithreading: Split the n input numbers into n / b blocks of size b and multiply the numbers within each block together. Then recursively multiply the n / b block products together. Each block is independent of the others, so if you have multiple cores and can run the blocks in parallel, you could save a lot of time overall.
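As a rough illustration of the block idea, here is a minimal Python sketch. The function name `block_product` and the parameters `block_size` and `map_fn` are made up for this example and are not from the question. It assumes the input is a list and relies on Python integers being arbitrary precision, which matches the "no overflow" assumption.

```python
from math import prod


def block_product(values, block_size=1024, map_fn=map):
    """Multiply a list of integers block by block.

    map_fn is a hypothetical hook: any map-like callable works, so a
    parallel map can be swapped in for the built-in sequential one.
    """
    if not values:
        return 1  # empty product
    # Cut the input into consecutive blocks of at most block_size items.
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]
    # Each block product is independent of the others.
    partial = list(map_fn(prod, blocks))
    # Combine the block products pairwise until one value remains.
    while len(partial) > 1:
        partial = [prod(partial[i:i + 2])
                   for i in range(0, len(partial), 2)]
    return partial[0]


print(block_product([1, 2, 3, 4, 5], block_size=2))
```

To try it in parallel, one option is to pass the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn`. Whether that actually pays off depends on the block size and on the cost of shipping blocks between processes, so it is worth measuring first.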

Word-Level Parallelism: Let's suppose that your numbers are all nonnegative and strictly less than some number U, which happens to be a power of two. Now, suppose that you want to multiply together a, b, c, and d. Start off by computing (4U²a + b) × (4U²c + d) = 16U⁴ac + 4U²ad + 4U²bc + bd. Now, notice that this expression mod U² is just bd, because every other term is a multiple of U². (Since bd < U², we don't need to worry about the mod U² step messing it up.) This means that if we compute this product and take it mod U², we get back bd. Since U² is a power of two, this can be done with a bitmask.

Next, notice that

4U²ad + 4U²bc + bd < 4U⁴ + 4U⁴ + U² ≤ 9U⁴ < 16U⁴

This means that if we divide the entire expression by 16U⁴ and round down, we will end up getting back just ac. This division can be done with a bitshift, since 16U⁴ is a power of two.

Consequently, with one multiplication, you can get back the values of both ac and bd by applying a subsequent bitshift and bitmask. Once you have ac and bd, you can directly multiply them together to get back the value of abcd. Assuming that bitmasks and bitshifts are faster than multiplies, this reduces the number of multiplications necessary by 33% (two instead of three here).
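The packing trick above can be sketched in Python as follows. The name `packed_products` and the parameter `bits` (with U = 2**bits) are assumptions made for this example. The sketch only demonstrates that the arithmetic works out; it says nothing about speed, since Python integers are not fixed-width machine words.

```python
def packed_products(a, b, c, d, bits):
    """Return (a*c, b*d) using a single multiplication.

    Assumes 0 <= a, b, c, d < U, where U = 2**bits.
    """
    shift_lo = 2 * bits + 2        # log2(4 * U**2)
    shift_hi = 4 * bits + 4        # log2(16 * U**4)
    mask = (1 << (2 * bits)) - 1   # U**2 - 1, keeps the low 2*bits bits

    # (4U^2 * a + b) * (4U^2 * c + d): the only multiplication.
    packed = ((a << shift_lo) | b) * ((c << shift_lo) | d)

    ac = packed >> shift_hi        # floor(packed / 16U^4)
    bd = packed & mask             # packed mod U^2
    return ac, bd


ac, bd = packed_products(3, 5, 7, 11, bits=4)
print(ac, bd, ac * bd)  # → 21 55 1155
```

On real hardware the packed product has to fit in one machine word. Each packed operand is below 4U³, so the product is below 16U⁶ and needs up to 6 × bits + 4 bits, which limits how small "small" has to be for a given word size.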

Hope this helps!

answered Nov 15 '22 07:11

templatetypedef
