Hacker News
How much linear memory access is enough?
smj-edison
PhilipTrettner
My concrete tasks already reach peak performance below a 128 kB chunk size, and I couldn't find pure processing workloads that benefit significantly beyond 1 MB chunks. The code is linked in the post; it would be nice to see results on more systems.
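To make "chunk size" concrete, here is a minimal sketch of a reduction that walks a buffer one contiguous chunk at a time. It is not the linked benchmark code: `chunked_sum` and its `chunk_bytes` parameter are hypothetical names chosen for illustration, and interpreter overhead in Python would swamp any cache effect, so it only shows the access pattern, not the timing.

```python
from array import array


def chunked_sum(data: array, chunk_bytes: int) -> float:
    """Sum `data` by visiting one contiguous chunk at a time.

    `chunk_bytes` is a hypothetical knob for this sketch: the number of
    bytes read linearly before moving on to the next chunk.
    """
    items_per_chunk = max(1, chunk_bytes // data.itemsize)
    view = memoryview(data)
    total = 0.0
    for start in range(0, len(view), items_per_chunk):
        # Each slice is a zero-copy window over one contiguous chunk.
        total += sum(view[start:start + items_per_chunk])
    return total


data = array("d", range(100_000))       # 100,000 doubles, about 800 kB
small = chunked_sum(data, 4 * 1024)     # 4 kB chunks
large = chunked_sum(data, 128 * 1024)   # 128 kB chunks
```

Both calls compute the same sum; only the amount of memory touched linearly per step changes, which is the variable the chunk-size discussion is about.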
01HNNWZ0MV43FF
Does this mean that if I'm doing very light processing (sums), I should try moving to a structure-of-arrays layout to take advantage of the cache? And that if I'm doing something very expensive, I can leave it as an array-of-structures, since the computation will dominate the memory access in an Amdahl's Law analysis?
This data should tell me something about organizing my data and accessing it, right?
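One rough way to put numbers on the Amdahl's Law framing above is the sketch below. The fractions and the 4x memory speedup are made-up assumptions for illustration, not measurements from the post, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(memory_fraction: float, memory_speedup: float) -> float:
    """Overall speedup when only the memory-bound part of the runtime gets faster.

    memory_fraction: share of total runtime spent waiting on memory (0..1).
    memory_speedup:  factor by which a better layout speeds up that share.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    if memory_speedup <= 0.0:
        raise ValueError("memory_speedup must be positive")
    return 1.0 / ((1.0 - memory_fraction) + memory_fraction / memory_speedup)


# Assumed numbers: a layout change makes memory access 4x faster.
light = amdahl_speedup(0.90, 4.0)  # light processing (sums): about 3.08x overall
heavy = amdahl_speedup(0.05, 4.0)  # expensive per-item work: about 1.04x overall
```

Under these assumptions, the layout change pays off when memory access dominates and is close to irrelevant when computation does.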
_zoltan_
On GPU databases we sometimes go up to the GB range per "item of work" (input permitting), as it's very efficient.
I need to add it to my TODO list to have a look at your github code...
PhilipTrettner
Do have a look, I've tried to keep it small and readable. It's effectively ~250 LOC.
Also, this is CPU only. I'm not super sure what a good GPU version of my benchmark would be, though ... Maybe measuring a "map" more than a "reduction" like I do on the CPU? We should probably take a look at common chunking patterns there.