Tuesday, 23 June 2015

Kalray - new product meander

Kalray's interesting chips are modelled on the kind of hierarchical architecture that has proved itself in 1M-core HPC configurations. Their solutions have 16 clusters, each of 16 tightly coupled VLIW processors. Those clusters rely on message passing. Their main differentiator is their energy claim of 20 pJ per instruction.

ILP -> 16 threads -> 16 units = 256 core hierarchy

[Image: the previous Kalray generation's interconnect]

They have just announced their V2 microprocessor and next-gen PCIe card, TurboCard3, with a dutiful improvement in single-precision processing from 1 TFLOPS to 3 TFLOPS claimed in the just-released PR. They are running a little behind: according to the 2013 presentations listed below, V2 was scheduled for 2014, with a 1024-core part due in 2015.

Here is some background on Kalray:

Promising: 100 GFLOPS/W with 1024 cores at 12 W, implying 1.2 TFLOPS == nice!

I'm rather fond of Kalray's type of approach, but it hasn't been a happy hunting ground for companies in this space. Maybe Kalray will be the one to break through.
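
That arithmetic is easy to sanity check. Here is a minimal Python sketch using only the figures quoted above (100 GFLOPS/W, 12 W, 1024 cores). The function names are my own, and it assumes the efficiency figure applies to the whole chip at that power draw, which the slides may or may not intend.

```python
# Back-of-the-envelope check of the quoted efficiency claim.
# Assumption: 100 GFLOPS/W holds for the whole chip at 12 W.

def peak_flops(gflops_per_watt, watts):
    """Total FLOPS implied by an efficiency figure and a power budget."""
    return gflops_per_watt * 1e9 * watts

def picojoules_per_flop(gflops_per_watt):
    """Energy per FLOP in pJ: 1 W = 1 J/s, so invert FLOPS per joule."""
    return 1e12 / (gflops_per_watt * 1e9)

total = peak_flops(100, 12)
print("TFLOPS:", total / 1e12)                 # about 1.2
print("GFLOPS per core:", total / 1024 / 1e9)  # about 1.17
print("pJ per FLOP:", picojoules_per_flop(100))  # about 10
```

So 100 GFLOPS/W works out to roughly 10 pJ per FLOP, which is in the same ballpark as the 20 pJ per instruction claim. A FLOP and an instruction are not the same unit, so treat that comparison as rough.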

Tilera "failed" in the sense that, after raising more than $100M in VC, it was acquired by EZchip for $50M. Its story is not over, but its grand vision has been scaled back. Early chips were somewhat starved of memory and their CPU nodes were viewed by many as a little weak. It remains an interesting proposition and here's to hoping EZchip makes it work more broadly.

Picochip similarly raised $110M in VC and was acquired by Mindspeed for $52M. Hardware is hard, as PixelFusion/ClearSpeed also demonstrated.

Adapteva's Epiphany III and IV chips and Parallella boards represent a cool architecture. The company has achieved extraordinary results given its limited funding and a successful Kickstarter. I'd love to see it take off. Limited funding seems to be its main problem, though, and it is unlikely to achieve its original hope of zillion-core chips. The lack of a push on the 64-core variety, apparently for want of funds, suggests a refocus: Epiphany IV, the 64-core chip, is EOL'd. Parallella is making good strides in the embedded / hobbyist space with an attractive SBC pairing a Zynq with the 16-core chip, but it is unlikely to succeed wildly simply due to its cost. It's hard to measure yourself against the success / volume of either of the two systems sitting on my desk here: a $35 Raspberry Pi 2 and a $4 STM32 Cortex Arduino-focused board.

Kalray's main competition is likely to be the tiled x86 and ARM processors from the likes of Intel (Xeon Phi), AMD, Cavium, et cetera.

However, I personally think the future may be RISC-V. I buy RISC-V's "the ISA is not so important, just pick one and make it open" argument. If I were to build my own SoC for HFT (anyone?) or IoT, RISC-V, I'd choose you. Sorry, Pikachu. There must be a lot of people thinking the same thing, as I'm a little slow.

Good luck, Kalray. The pJ per instruction argument is a good one. However, my new technology bet, if I were to bet on a company or start-up, would be on a RISC-V ISA based solution.


PS: Yes, I do think RISC-V has a very good shot at usurping x86[_64] and ARM...
