C# - Compute shaders: input a 3D array of floats


I'm writing a compute shader to be used in Unity 4. I'm attempting to get 3D noise.

The goal is to get a multidimensional float3 array into my compute shader from my C# code. Is this possible in a straightforward manner (using some kind of declaration), or can it only be achieved using Texture3D objects?

I currently have an implementation of simplex noise working on individual float3 points, outputting a single float from -1 to 1. I ported code I found online to a compute shader.

I would like to extend this to work on a 3D array of float3s (I suppose the closest comparison in C# would be Vector3[,,]), applying the noise operation to each float3 point in the array.

I've tried a few other things, but they feel bizarre and miss the point of using a parallel approach. The above is what I imagine it should look like.

I also managed to get Scrawk's implementation working as vertex shaders. Scrawk got a 3D float4 array into the shader using a Texture3D, but I wasn't able to extract the floats from the texture. Is that how compute shaders work as well, relying on textures? I have probably overlooked something concerning getting the values out of the texture. This seems to be how another user was getting data in, in a post with a similar question to mine, but that's not quite what I'm looking for.

I'm new to shaders in general, and I feel like I'm missing something pretty fundamental about compute shaders and how they work. The goal (as I'm sure you've guessed) is to move noise generation and mesh computation using marching cubes onto the GPU using compute shaders (or whatever shader is best suited to this kind of work).

Constraints: the free trial edition of Unity 4.

Here's a skeleton of the C# code I'm using:

    int volumeSize = 16;
    compute.SetInt("simplexSeed", 10);

    // This will be a float[,,] array with our density values.
    ComputeBuffer output = new ComputeBuffer(/* size goes here, no idea */, 16);
    compute.SetBuffer(compute.FindKernel("CSMain"), "Output", output);

    // Buffer filled with the float3[,,] equivalent, whatever that is in C#. Also, what is 'stride'?
    // I haven't found anything clear. I think it's the size of the basic datatype we're using in the buffer?
    ComputeBuffer voxelPositions = new ComputeBuffer(/* size goes here, no idea */, 16);
    compute.SetBuffer(compute.FindKernel("CSMain"), "VoxelPos", voxelPositions);

    compute.Dispatch(0, 16, 16, 16);
    float[,,] res = new float[volumeSize, volumeSize, volumeSize];

    output.GetData(res); // <=== populated with float density values

    MarchingCubes.DoStuff(res); // <=== the goal (obviously not implemented yet)

And here's the compute shader:

    #pragma kernel CSMain

    uniform int simplexSeed;
    RWStructuredBuffer<float3[,,]> VoxelPos;  // I know these won't work, but it's what I'm trying
    RWStructuredBuffer<float[,,]> Output;     // to get in there.

    float simplexNoise(float3 input)
    {
        /* ... a bunch of awesome stuff the pastebin guy did ... */

        return noise;
    }

    /** A bunch of other awesome stuff to support the simplexNoise function **/
    /* .... */

    /* Here's the entry point, with my (supposedly) supplied input kicking things off */
    [numthreads(16,16,16)] // <== Not sure if this thread count is correct?
    void CSMain (uint3 id : SV_DispatchThreadID)
    {
        Output[id.xyz] = simplexNoise(VoxelPos.xyz); // Where the action starts.
    }

Typically you use noise to generate a heightmap. Is that your intention here? It looks to me like you are generating a value for every point in the array.

I have an image in my head of you taking a chunk from a voxel engine (16 x 16 x 16 voxels) and generating noise values for all the points.

Whereas what I think you should be doing is making this a 2D problem. Some pseudo CPU code might look like this:

    for (x)
      for (z)
        fill all voxels below GenerateY(x, z)
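As a CPU reference for the loop above, here is a minimal C# sketch. `GenerateY` is a hypothetical stand-in for a real noise-based height function (here just a sine ripple), and the `float[,,]` layout and class name are assumptions made for illustration. It only shows that one height lookup per (x, z) column is enough to fill the whole column.

```csharp
using System;

public static class HeightmapFill
{
    // Hypothetical stand-in for a noise function: returns a height in [0, maxHeight].
    public static float GenerateY(int x, int z, float maxHeight)
    {
        float n = (float)(Math.Sin(x * 0.5) * Math.Cos(z * 0.5)); // in [-1, 1]
        return (n * 0.5f + 0.5f) * maxHeight;
    }

    // Fills voxels[x, y, z] with 1 below the column height and 0 at or above it.
    public static float[,,] Fill(int size)
    {
        var voxels = new float[size, size, size];
        for (int x = 0; x < size; x++)
        {
            for (int z = 0; z < size; z++)
            {
                // One height evaluation per column, not one per voxel.
                float height = GenerateY(x, z, size);
                for (int y = 0; y < size; y++)
                {
                    voxels[x, y, z] = y < height ? 1f : 0f;
                }
            }
        }
        return voxels;
    }
}
```

On the GPU, the two outer loops would become the thread grid and only the inner `y` loop would stay inside the kernel.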

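If my assumptions are correct, I'd say your shader setup might be wrong. For example:

This will try to run 16 x 16 x 16 = 4096 threads in one group, which is well above the limit of 1024 threads per group. You can dispatch a very large number of groups (Direct3D 11 allows up to 65535 per dimension), but each group can have no more than 1024 threads.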

    [numthreads(16,16,16)] // <== Not sure if this thread count is correct?
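The arithmetic behind that limit can be checked with a small C# sketch. The class and method names are made up for illustration; the constant 1024 is the per-group thread limit for shader model 5.0 compute shaders.

```csharp
using System;

public static class ThreadGroupLimits
{
    // Maximum threads in one group for shader model 5.0 compute shaders.
    public const int MaxThreadsPerGroup = 1024;

    // Total threads declared by [numthreads(x, y, z)].
    public static int ThreadsPerGroup(int x, int y, int z)
    {
        return x * y * z;
    }

    // True when the declared group size stays within the limit.
    public static bool FitsInGroup(int x, int y, int z)
    {
        return ThreadsPerGroup(x, y, z) <= MaxThreadsPerGroup;
    }
}
```

With these helpers, `FitsInGroup(16, 16, 16)` is false (4096 threads), while `FitsInGroup(16, 1, 16)` (256 threads) and `FitsInGroup(8, 8, 8)` (512 threads) are both true.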

What I think you need is something more like [numthreads(16,1,16)], which runs the noise function on a 16 x 16 grid of points; raise each point by noise x maxHeight to get the height you want.

Your dispatch call would then look something like this:

    compute.Dispatch(0, 1, 1, 1); // one group per axis; a count of 0 on any axis runs no threads

... which results in a single thread group producing height map values for 16 x 16 points. Once you get that far, you can scale it up.
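When scaling up, the group counts passed to Dispatch are usually the grid size divided by the group size, rounded up. A minimal C# sketch of that calculation follows; `DispatchMath` and `GroupsNeeded` are hypothetical names, not part of Unity.

```csharp
using System;

public static class DispatchMath
{
    // Number of thread groups needed along one axis, rounded up.
    // For any positive itemCount the result is at least 1.
    public static int GroupsNeeded(int itemCount, int threadsPerGroup)
    {
        if (itemCount <= 0 || threadsPerGroup <= 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        return (itemCount + threadsPerGroup - 1) / threadsPerGroup;
    }
}
```

For a 16 x 16 grid with 16 threads per axis this gives 1 group per axis. For a 128-voxel axis with 32 threads it gives 4. If the grid size is not a multiple of the group size, the kernel also needs a bounds check so the extra threads do not write outside the buffer.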

All this, combined with your mention of marching cubes, suggests you are doing the same thing I am: building a voxel engine on the GPU, where the raw voxel data is generated in GPU RAM and a mesh is then generated from it.

I have this part of the process cracked; the hard part is the next stage, generating a mesh / scene object from the resulting voxel array. Depending on your approach, you'll want to get comfortable with ray marching or append buffers (AppendStructuredBuffer) next.

Good luck!

Flat buffer usage:

Let's say I want an array of 128*128*128 voxels and a chunk is 32*32*32 voxels. Then I do this:

    // CPU code (C#)
    var size = 128 * 128 * 128;          // number of voxels in the buffer
    var stride = sizeof(float);          // bytes per element
    ComputeBuffer output = new ComputeBuffer(size, stride);
    computeShader.SetBuffer(0, "voxels", output);
    computeShader.Dispatch(0, 4, 4, 4);  // 4 x 4 x 4 chunks of 32 x 32 x 32 voxels

    // GPU code (HLSL)
    #pragma kernel Compute
    RWStructuredBuffer<float> voxels;

    static const uint chunkSize = 32;    // voxels per chunk edge
    static const uint volumeSize = 128;  // voxels per volume edge

    [numthreads(32,1,32)] // group is your chunk index, thread is your voxel column within the chunk
    void Compute (uint3 threadId : SV_GroupThreadID, uint3 groupId : SV_GroupID)
    {
        // threadId.y is always 0, so this is the bottom voxel of this thread's column
        uint3 threadIndex = groupId * chunkSize + threadId;
        // TODO: implement any marching cubes / dual contouring functions in here somewhere
        uint endY = threadIndex.y + chunkSize;

        float height = Noise(); // your own noise function, scaled to a height in voxels

        // each thread fills one column of its chunk
        for (uint y = threadIndex.y; y < endY; y++)
        {
            uint voxelPos = threadIndex.x + y * volumeSize + threadIndex.z * volumeSize * volumeSize;
            if (y < height)
            {
                voxels[voxelPos] = 1; // fill this voxel
            }
            else
            {
                voxels[voxelPos] = 0; // don't fill this voxel
            }
        }
    }

This should produce (although this is all from the RAM in my head, so it might not be spot on) a 128*128*128 voxel array in a buffer on the GPU that contains something "terrain like".
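To make the flat layout and the buffer sizing concrete, here is a small C# sketch. The class and its methods are hypothetical helpers; the only Unity detail assumed is that the `ComputeBuffer(count, stride)` constructor takes the number of elements and the size of one element in bytes.

```csharp
using System;

public static class FlatVoxelBuffer
{
    // Maps (x, y, z) to a flat index; x varies fastest, then y, then z.
    public static int Flatten(int x, int y, int z, int size)
    {
        return x + y * size + z * size * size;
    }

    // Inverse of Flatten: recovers (x, y, z) from a flat index.
    public static void Unflatten(int index, int size, out int x, out int y, out int z)
    {
        x = index % size;
        y = (index / size) % size;
        z = index / (size * size);
    }

    // Total buffer size: element count times bytes per element (the stride).
    public static long SizeInBytes(int count, int stride)
    {
        return (long)count * stride;
    }
}
```

For a 128*128*128 volume of floats, the count is 2,097,152 and the stride is `sizeof(float)` = 4, so the buffer takes 8,388,608 bytes. A buffer of float3 positions would use a stride of 12 (three floats) with the same count.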

I guess you can take it from there to what you need. You can probably drop the "if" in the compute shader if your noise function is passed the xyz values from threadIndex (the voxel position).

Let me know if you find a neat way of cracking this; it's something I'm still working on myself.

My code works (well, almost) something like this:

Component start: call compute to generate the voxel buffer, then call compute to generate the vertex buffer from the voxel buffer.

Draw (each frame): render the vertex buffer with a material.

