Overview Developed by Facebook, Hive is a data warehousing framework on top of MapReduce over HDFS, queried with the SQL-like HiveQL. It converts a SQL query into a series of jobs for execution on a Hadoop cluster. It organizes HDFS data into tables, attaching structure. Schema on read versus schema on write: Hive doesn’t verify the data when it is loaded, but rather when a query is issued. Full-table scans are the norm, and a table update is achieved by transforming the data into a new table. HDFS does not provide in-place file ......
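To make the schema-on-read idea concrete, here is a minimal Python sketch. It is not Hive code: `SCHEMA`, `read_rows` and the sample data are hypothetical, and turning unparseable fields into `None` is an assumption made for the illustration. The point is only that the raw file is stored unchecked and the structure is applied when a query scans it.

```python
import csv
import io

# Raw data is stored as-is; nothing is validated at load time.
raw_file = "1,alice,34\n2,bob,not-a-number\n3,carol,29\n"

# Hypothetical table schema: (column name, type) pairs applied at read time.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def read_rows(raw, schema):
    """Apply the schema while reading; fields that fail to parse become None."""
    for record in csv.reader(io.StringIO(raw)):
        row = {}
        for (column, cast), field in zip(schema, record):
            try:
                row[column] = cast(field)
            except ValueError:
                row[column] = None
        yield row


# A "query": full scan over the raw data, filter on age, project the name.
adults = [r["name"] for r in read_rows(raw_file, SCHEMA)
          if r["age"] is not None and r["age"] >= 30]
```

The malformed row is only noticed when the scan runs, which is the trade-off the teaser describes: loading is cheap, and every query pays for parsing.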
In the java.util.concurrent package (JDK 7). A framework for divide and conquer: it recursively divides a task into smaller subtasks until a threshold check indicates the subtask is small enough to execute serially. The optimal threshold depends on the specific computational steps and is obtained through profiling – heuristic: between 100 and 10000 basic computational steps per task. It abstracts multithreading and scales up automatically. It leverages work-stealing: each worker thread maintains a queue of tasks. If one worker thread’s queue is empty, it ......
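The threshold check described above can be sketched in a few lines. This is a Python illustration of the recursive splitting only, not the Java API: `THRESHOLD` and `divide_and_sum` are hypothetical names, the recursion runs in a single thread, and work-stealing is not modeled.

```python
THRESHOLD = 1000  # hypothetical cutoff; in practice it would be tuned by profiling


def divide_and_sum(data, lo, hi):
    """Sum data[lo:hi] by recursive splitting until the range is small enough."""
    if hi - lo <= THRESHOLD:
        # Small enough: do the work serially.
        total = 0
        for i in range(lo, hi):
            total += data[i]
        return total
    mid = (lo + hi) // 2
    # In a fork/join setting the left half would be forked to another worker
    # while the current worker computes the right half; then the two are joined.
    left = divide_and_sum(data, lo, mid)
    right = divide_and_sum(data, mid, hi)
    return left + right


values = list(range(10_000))
result = divide_and_sum(values, 0, len(values))
```

A lower threshold creates more, smaller subtasks (more scheduling overhead); a higher one leaves fewer tasks to balance across workers, which is why the cutoff is found by profiling.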
Overview OpenCL is a GPGPU API that abstracts over acceleration devices (be they CPU, GPU or FPGA) to provide data-parallel (as well as task-parallel) behavior. Heterogeneous portability is achieved by avoiding high-level abstractions and exposing the hardware in a context that explicitly defines its work scheduling capabilities. An OpenCL application consists of two parts: the host program that runs on the CPU - API functions to discover devices and their capabilities & create a context, ......
Overview A compute shader is a programmable shader stage that expands OpenGL beyond graphics programming. Like other programmable shaders, a compute shader is designed and implemented with GLSL. A compute shader provides a single-stage SIMD pipeline parallelized on the GPU, with memory sharing and thread synchronization features that allow more effective parallel programming methods. Create a Compute Shader Program: glCreateShader(GL_COMPUTE_S... - create a compute shader glShaderSource() ......
Where is WebCL? The Khronos WebCL working group is working on a JavaScript binding to the OpenCL standard so that HTML5-compliant browsers can host GPGPU web apps – e.g. for image processing or physics for WebGL games - http://www.khronos.org/webcl/ . While Nokia & Samsung have some prototype WebCL APIs, Intel has one-upped them with a higher level of abstraction: RiverTrail. Intro to RiverTrail Intel Labs JavaScript RiverTrail provides GPU-accelerated SIMD data-parallelism in web applications ......
HPC Job Types HPC has 3 types of jobs http://technet.microsoft.co... · Task Flow – vanilla sequence · Parametric Sweep – concurrently run multiple instances of the same program, each with a different work unit input · MPI – message passing between master & slave tasks But when you try to go outside the box – job tasks that spawn jobs, blocking the parent task – you run the risk of resource starvation, deadlocks, and recursive, non-converging or exponential blow-up. ......
Financial Apps feel the need for speed – this can come via parallelization, and via infrastructure - fast messaging and non-blocking distributed memory management. This blogpost gives an overview + examples of various technologies that can squeeze performance out of your trading apps and clock cycles out of your modeling apps. Low Latency via Infrastructure ZeroMQ · ZeroMQ is a messaging library - ‘messaging middleware’, ‘TCP on steroids’, ‘new layer on the networking stack’. Not a complete messaging ......
Daytona - Iterative MapReduce on Windows Azure Overview MapReduce is a framework for processing highly distributable problems across huge datasets using a large number of compute nodes. It is a generic mechanism that comprises 2 steps: Map step: The master node takes the input, partitions it up into smaller sub-problems, and distributes them to worker nodes. The worker node processes the smaller problem, and passes the answer back to its master node. Reduce step: The master node then collects the ......
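As a rough illustration of the two steps, here is a single-process word count in Python. It makes no claim about Daytona's API: `map_step`, `shuffle`, `reduce_step` and `map_reduce` are hypothetical names, and the intermediate grouping (`shuffle`) is an assumption added to connect the two steps.

```python
from collections import defaultdict


def map_step(chunk):
    """Worker: emit (word, 1) pairs for one partition of the input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key before reduction."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(groups):
    """Combine the values collected for each key into one answer."""
    return {key: sum(values) for key, values in groups.items()}


def map_reduce(chunks):
    emitted = []
    for chunk in chunks:  # each chunk could be handled by a separate worker node
        emitted.extend(map_step(chunk))
    return reduce_step(shuffle(emitted))


partitions = ["the quick brown fox", "the lazy dog", "The end"]
counts = map_reduce(partitions)
```

Because each call to `map_step` depends only on its own chunk, the loop in `map_reduce` is the part a cluster would distribute across nodes.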
Overview Windows HPC Server 2008 is infrastructure for high-end applications that require high performance computing clusters – i.e. for scaling out parallelizable workloads across many compute nodes in a grid. These compute nodes can be coordinated by a head node, which in turn can be proxied via a service broker node that exposes a SOA WCF interface for job scheduling. Additional functionality includes the ability to coordinate between job processes running on nodes via MPI (message passing interface). ......
I’m leveraging a ConcurrentPriorityQueue – from http://code.msdn.microsoft.... This class is basically a thread-safe IProducerConsumerCollection wrapper for a binary heap that prioritizes smaller values. You use it as you would a dictionary, where the priority is the key, except you can have duplicate keys (i.e. values with the same priority). I needed to demonstrate to a customer that it worked. I set up my queue and my priority enum values: var q = new ConcurrentPriorityQueue<... ......
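The behavior described (a lock-protected binary heap, smaller values first, duplicate priorities allowed) can be sketched in Python. This is not the C# class from the post: `SimplePriorityQueue`, `enqueue` and `try_dequeue` are hypothetical names, and the first-in-first-out ordering among equal priorities is an assumption of this sketch.

```python
import heapq
import itertools
import threading


class SimplePriorityQueue:
    """Thread-safe min-heap; smaller priority values are dequeued first."""

    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()
        # Tie-breaker: keeps insertion order among items with equal priority.
        self._counter = itertools.count()

    def enqueue(self, priority, value):
        with self._lock:
            heapq.heappush(self._heap, (priority, next(self._counter), value))

    def try_dequeue(self):
        """Return (True, priority, value), or (False, None, None) if empty."""
        with self._lock:
            if not self._heap:
                return (False, None, None)
            priority, _, value = heapq.heappop(self._heap)
            return (True, priority, value)


q = SimplePriorityQueue()
q.enqueue(2, "low")
q.enqueue(1, "high-a")
q.enqueue(1, "high-b")  # a duplicate priority is allowed
q.enqueue(0, "urgent")

drained = []
while True:
    ok, priority, value = q.try_dequeue()
    if not ok:
        break
    drained.append((priority, value))
```

Unlike a dictionary, pushing a second item with priority 1 does not overwrite the first; both stay in the heap and come out one after the other.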