I'm using glDrawArraysInstanced to draw 10000 instances of a simple shape composed of 8 triangles.
After switching the graphics card used for rendering to my dedicated NVIDIA GTX 1060, I'm getting a lower framerate and some visible stuttering.
This is the code I'm using to measure the time taken per frame:
std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
float i = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() / 1000000.0f;
while (!glfwWindowShouldClose(window)) {
    end = std::chrono::steady_clock::now();
    i = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() / 1000000.0f;
    std::cout << i << "\n";
    begin = end; // Edit: reset the reference point each frame
    // Other code for draw calls and to set uniforms.
}
Is this the wrong way to measure time elapsed per frame? If not, why is there a drop in performance?
Here is a comparison of the output:
Edit:
The fragment shader simply sets a color for each fragment directly.
Vertex Shader Code:
#version 450 core

in vec3 vertex;
out vec3 outVertex;

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform float time;

const float vel = 1.0;
const float PHI = 1.61803398874989484820459;

float noise(in vec2 xy, in float seed) {
    return fract(tan(distance(xy * PHI, xy) * seed) * xy.x);
}

void main() {
    float y_coord = noise(vec2(-500 + gl_InstanceID / 100, -500 + gl_InstanceID % 100), 20) * 40 + vel * time;
    y_coord = mod(y_coord, 40) - 20;
    mat4 translationMatrix = mat4(vec4(1, 0, 0, 0),
                                  vec4(0, 1, 0, 0),
                                  vec4(0, 0, 1, 0),
                                  vec4(-50 + gl_InstanceID / 100, y_coord, -50 + gl_InstanceID % 100, 1));
    gl_Position = proj_matrix * mv_matrix * translationMatrix * vec4(vertex, 1);
    outVertex = vertex;
}
I'm selecting the dedicated card for rendering with this export:
extern "C" {
    __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}
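For completeness, here is a portable sketch of that export. The `EXPORT` macro is my addition so the snippet also compiles outside MSVC, and `AmdPowerXpressRequestHighPerformance` is the equivalent hint recognized by AMD switchable-graphics drivers:

```cpp
#include <cstdint>

#if defined(_WIN32)
  #define EXPORT __declspec(dllexport)
#else
  #define EXPORT  // no-op elsewhere so the snippet still compiles
#endif

extern "C" {
    // Ask the NVIDIA Optimus driver to run this process on the dedicated GPU.
    EXPORT uint32_t NvOptimusEnablement = 0x00000001;
    // Equivalent hint for AMD switchable-graphics (PowerXpress) drivers.
    EXPORT int AmdPowerXpressRequestHighPerformance = 1;
}
```

Both symbols must be exported from the executable itself (not a DLL) for the drivers to pick them up.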
The rendered output is the same for both cards and is shown here:
The desired outcome is an increased framerate when using the dedicated GPU, i.e. smaller time gaps between the rows in the attached comparison image. With the Intel integrated card, it takes <0.01 seconds to render one frame; with the dedicated GTX 1060, it takes ~0.2 seconds per frame.
I solved the issue by disabling NVIDIA PhysX GPU acceleration; for some reason it was slowing down rendering. I now get about ~280 FPS on the GPU, even when rendering ~100k instances.
Your output clearly shows the times monotonically increasing, rather than jittering around some mean value. The reason is that your code measures total elapsed time, not per-frame time. To measure per-frame time instead, you need a begin = end;
assignment at the end of your loop, so that the reference point for each frame is the end of the preceding frame, rather than the start time of the whole program.