Edit: Thanks for the previous answers, but I actually want to do this in CUDA, and apparently there is no fill function for CUDA. I have to fill the matrix once for each thread, so I want to make sure I'm using the fastest way possible. Is this for loop my best choice?
I want to set every element of a float matrix to the maximum possible float value. What is the correct way to do this?
float *matrix = new float[N*N];
for (int i = 0; i < N*N; i++) {
    matrix[i] = 999999;  // i already runs over all N*N elements
}
Thanks in advance.
The easiest approach in CUDA is to use thrust::fill. Thrust is included with CUDA 4.0 and later, or you can install it if you are using CUDA 3.2.
#include <limits>                  // std::numeric_limits
#include <thrust/fill.h>
#include <thrust/device_vector.h>
...
thrust::device_vector<float> v(N*N);
thrust::fill(v.begin(), v.end(), std::numeric_limits<float>::max()); // or 999999.f if you prefer
You could also write pure CUDA code something like this:
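#include <limits>  // std::numeric_limits

template <typename T>
__global__ void initMatrix(T *matrix, int width, int height, T val) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    // Grid-stride loop: each thread handles idx, idx + stride, idx + 2*stride, ...
    for (int i = idx; i < width * height; i += gridDim.x * blockDim.x) {
        matrix[i] = val;
    }
}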
int main(void) {
    float *matrix = 0;
    cudaMalloc((void**)&matrix, N*N * sizeof(float));   // cudaMalloc takes a void**
    int blockSize = 256;
    int numBlocks = (N*N + blockSize - 1) / blockSize;  // round up to cover all N*N elements
    initMatrix<<<numBlocks, blockSize>>>(matrix, N, N,
        std::numeric_limits<float>::max()); // or 999999.f if you prefer
    cudaFree(matrix);
    return 0;
}
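Because the kernel above uses a grid-stride loop, every element is written exactly once no matter how many blocks are launched; the block count only affects how much of the work runs in parallel. The following is a rough host-side sketch of that behaviour in plain C++ (not CUDA). The names `blocksFor` and `simulateGridStride` are hypothetical helpers made up for this illustration, and the simulation runs the "threads" sequentially on the CPU.

```cpp
#include <vector>

// Hypothetical helper: ceiling division, i.e. the smallest number of blocks
// of size blockSize that gives at least one thread per element.
int blocksFor(int n, int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical host-side simulation of the grid-stride loop in initMatrix.
// Returns, for each of the n elements, how many times it was written.
std::vector<int> simulateGridStride(int n, int numBlocks, int blockSize) {
    std::vector<int> writes(n, 0);
    const int stride = numBlocks * blockSize;  // gridDim.x * blockDim.x
    for (int block = 0; block < numBlocks; ++block) {
        for (int thread = 0; thread < blockSize; ++thread) {
            const int idx = block * blockSize + thread;
            for (int i = idx; i < n; i += stride) {
                ++writes[i];  // stands in for matrix[i] = val
            }
        }
    }
    return writes;
}
```

With 1000 elements and a block size of 256, `blocksFor` gives 4 blocks, but the simulation shows that launching only 2 blocks would still touch every element once; each thread would simply loop twice.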
For host (CPU) memory, use std::numeric_limits<float>::max() with std::fill:
#include <limits> //for std::numeric_limits<>
#include <algorithm> //for std::fill
std::fill(matrix, matrix + N*N, std::numeric_limits<float>::max());
Or use std::fill_n, which reads better:
std::fill_n(matrix, N*N, std::numeric_limits<float>::max());
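Putting the pieces together, a minimal host-side sketch might look like the following. It assumes a row-major n-by-n matrix stored in a `std::vector<float>` rather than a raw `new[]` array, and `makeMaxMatrix` is a hypothetical name chosen for this example.

```cpp
#include <algorithm>  // std::fill_n
#include <cstddef>    // std::size_t
#include <limits>     // std::numeric_limits
#include <vector>     // std::vector

// Hypothetical helper: builds an n-by-n row-major matrix with every element
// set to the largest finite float value.
std::vector<float> makeMaxMatrix(std::size_t n) {
    std::vector<float> matrix(n * n);
    std::fill_n(matrix.begin(), matrix.size(),
                std::numeric_limits<float>::max());
    return matrix;
}
```

Using a vector avoids a manual `delete[]`; the same `std::fill_n` call works on a raw pointer as well.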
See the online documentation for std::numeric_limits, std::fill, and std::fill_n.