The CUDA architecture has a very powerful feature for hiding its worst bottleneck: memory transfers. It lets the device run three different flows at the same time: host-to-device transfers, kernel computations, and device-to-host transfers. In this article I show how to speed up execution by up to 3.5x (for this example) without modifying a single line of the kernel code.


Read more: CUDA. Overlapping computation and transfers.


In this article I'll discuss the high performance that a graphics card or coprocessor gives us. Its peak performance is considerably higher than a CPU's for parallel problems, so we can solve the same problem using less energy. OpenCL C is a standard language for parallelizing algorithms on GPUs, but getting the best results can make the code more complex. To illustrate this, I'll solve a typical parallel programming example on both devices: the iterative calculation of PI. The starting point is a straightforward C++ implementation.

Read more: OpenCL benchmark. PI calculation.

What is OpenCL?

OpenCL stands for Open Computing Language, a standard for parallel programming of heterogeneous systems. It is supported by the main graphics card manufacturers, such as NVIDIA and AMD, so the code you develop is compatible across platforms, which is not the case with CUDA. OpenCL is not only an API; it also includes the OpenCL C language, which lets us perform several operations at the same time. For example, adding two vectors with four components each (x, y, z, w).


Read more: Installing OpenCL and running some examples in our NVIDIA device

What exactly is this new type of processor released by Intel? These are really coprocessing modules, specifically PCI-Express cards, designed to offload huge amounts of computation that would otherwise run on the processor to the coprocessor. They are ideally suited to machines with Intel Xeon processors, but that is not a requirement.



Read more: New Intel Xeon Phi for HPC