Experimental Renderer

TheEqualizer · February 27

So, I am building a renderer from scratch, mostly for fun, and I will post my progress here.

Before I try anything fancy, I will be testing how efficient my current renderer is. So, I will compare my renderer against a vanilla D3D8 version of the client.

To stress the render system, the test scene will contain 180 characters and 60 Metin stones, and the resolution will be set to 1600x900 for both clients. I will compare render times; lower is better.

Results:

Vanilla D3D8: 18 - 20 ms;

Experimental Renderer: 4 - 5 ms;

My renderer is roughly 4x faster. This is a good start.
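As a quick sanity check on these numbers, the sketch below turns the two frame-time ranges into a speedup range and an equivalent frame rate. The struct and function names are made up for illustration; only the millisecond values come from the measurements.

```cpp
#include <cassert>

// Frame-time range in milliseconds, as reported in the results above.
struct TimeRangeMs {
    double best;   // fastest observed frame
    double worst;  // slowest observed frame
};

// Conservative speedup: slowest new frame against fastest old frame.
double minSpeedup(TimeRangeMs oldRange, TimeRangeMs newRange) {
    assert(newRange.worst > 0.0);
    return oldRange.best / newRange.worst;
}

// Optimistic speedup: fastest new frame against slowest old frame.
double maxSpeedup(TimeRangeMs oldRange, TimeRangeMs newRange) {
    assert(newRange.best > 0.0);
    return oldRange.worst / newRange.best;
}

// Frames per second that correspond to a given frame time.
double fpsFromMs(double ms) {
    assert(ms > 0.0);
    return 1000.0 / ms;
}
```

With 18 - 20 ms against 4 - 5 ms, this gives a speedup between 3.6x and 5x, or about 50 - 56 FPS against 200 - 250 FPS.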

DemOnJR · February 27

Nice, waiting to see the future progress

Marcos17 · February 27

Good luck with your work, bro

WeedHex · February 27

How do you generate the fake characters? I doubt you connected 180 clients

You should make sure that your "clones" are as heavy as real players.

Very curious about it, keep it up

Bubixon · February 27

It looks like the textures were swapped. Why is the texture in the middle of the city different?

TheEqualizer · February 27

29 minutes ago, WeedHex said:

How do you generate the fake characters? I doubt you connected 180 clients

You should make sure that your "clones" are as heavy as real players.

Very curious about it, keep it up

Your question is interesting. When you have the source, you don't need players or servers to make things show up on screen. Both clients were changed to allow me to load maps, spawn entities, control the camera, etc., without needing a server or other players. All entities are as "heavy" as they normally would be if loaded/used in real gameplay.

11 minutes ago, Bubixon said:

It looks like the textures were swapped. Why is the texture in the middle of the city different?

Because each client is using slightly different files. The version I am using for my renderer is the one I use normally, but I downloaded another client just for comparisons. I didn't even realize some of the textures were different until I started testing. A couple of different textures have no effect on the results.

Denizeri24 · February 28

According to my analysis, two things are tanking FPS:

rendering meshes (DrawIndexedPrimitive, called from the CGrannyModelInstance::RenderMeshNodeListWithOneTexture function)

and

the Granny mesh deforming function (GrannyDeformVertices, called from the CGrannyMesh::DeformPNTVertices function).

I made the deforming function parallel, which fixed that part of the FPS drop, but mesh rendering is still tanking my client's FPS. I get 100 FPS with 180 player instances (thanks to sh*tty mesh rendering).

I am planning to upgrade mesh rendering with the "Hardware Instancing" method; maybe this will fix the FPS drop.

TheEqualizer · February 28

10 minutes ago, Denizeri24 said:

According to my analysis, two things are tanking FPS:

rendering meshes (DrawIndexedPrimitive, called from the CGrannyModelInstance::RenderMeshNodeListWithOneTexture function)

and

the Granny mesh deforming function (GrannyDeformVertices, called from the CGrannyMesh::DeformPNTVertices function).

I made the deforming function parallel, which fixed that part of the FPS drop, but mesh rendering is still tanking my client's FPS. I get 100 FPS with 180 player instances (thanks to sh*tty mesh rendering).

I am planning to upgrade mesh rendering with the "Hardware Instancing" method; maybe this will fix the FPS drop.

That is really good performance. What hardware did you use for your tests? Maybe you could share your scene? Are you sure that in your tests you didn't have other things active (like shadows)? 100 FPS is good performance; I don't think you need instancing, but implementing it is not a bad idea.

I am not using instancing.

Also, multithreading the mesh deformer would require a lot of syncing, since a vertex buffer lock is performed before deforming (and D3D9 functions cannot/shouldn't be called from multiple threads unless you create the device with the D3DCREATE_MULTITHREADED flag, which causes the runtime to perform the syncing for you).

The best way to render is to avoid frequent state changes and frequent locks. Shader Model 3.0 supports vertex texture fetch, and you can use it to prepare one big texture containing data for many meshes, to efficiently render them in batches.
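To make the "one big texture" idea concrete, here is a minimal CPU-side sketch of how bone matrices for many meshes could be packed into a single RGBA float texture, plus the texel address a vertex shader would compute before fetching. The layout (three texels per 3x4 matrix), the struct, and the function names are assumptions for illustration, not what any particular renderer does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One 3x4 bone matrix occupies three RGBA texels (4 floats each).
using Mat3x4 = std::array<float, 12>;

struct PackedBoneTexture {
    std::size_t width = 0;                // texels per row
    std::vector<float> texels;            // 4 floats per texel, row after row
    std::vector<std::size_t> firstTexel;  // per mesh: linear index of its first texel
};

// Appends the bones of every mesh back to back and records where each mesh starts.
PackedBoneTexture packBones(const std::vector<std::vector<Mat3x4>>& meshes,
                            std::size_t width) {
    assert(width > 0);
    PackedBoneTexture tex;
    tex.width = width;
    for (const auto& bones : meshes) {
        tex.firstTexel.push_back(tex.texels.size() / 4);
        for (const Mat3x4& m : bones)
            tex.texels.insert(tex.texels.end(), m.begin(), m.end());
    }
    // Pad with zeros so the buffer is a whole number of texture rows.
    const std::size_t rowFloats = width * 4;
    const std::size_t rem = tex.texels.size() % rowFloats;
    if (rem != 0) tex.texels.resize(tex.texels.size() + (rowFloats - rem), 0.0f);
    return tex;
}

struct TexelCoord { std::size_t x, y; };

// Address of one matrix row (0..2) of one bone of one mesh.
TexelCoord texelFor(const PackedBoneTexture& tex, std::size_t mesh,
                    std::size_t bone, std::size_t row) {
    assert(mesh < tex.firstTexel.size() && row < 3);
    const std::size_t linear = tex.firstTexel[mesh] + bone * 3 + row;
    return {linear % tex.width, linear / tex.width};
}
```

With this layout, each draw only needs the per-mesh start offset, so many meshes can share one texture binding instead of triggering a state change per mesh.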

Denizeri24 · February 28

55 minutes ago, TheEqualizer said:

That is really good performance. What hardware did you use for your tests? Maybe you could share your scene? Are you sure that in your tests you didn't have other things active (like shadows)? 100 FPS is good performance; I don't think you need instancing, but implementing it is not a bad idea.

I am not using instancing.

Also, multithreading the mesh deformer would require a lot of syncing, since a vertex buffer lock is performed before deforming (and D3D9 functions cannot/shouldn't be called from multiple threads unless you create the device with the D3DCREATE_MULTITHREADED flag, which causes the runtime to perform the syncing for you).

The best way to render is to avoid frequent state changes and frequent locks. Shader Model 3.0 supports vertex texture fetch, and you can use it to prepare one big texture containing data for many meshes, to efficiently render them in batches.

8700K w/ Arc A770, full distance shadows + texts on

TheEqualizer · March 1

The renderer now has a D3D11 backend.

My initial intention was to have only a D3D11 backend, but I ran into some problems with D3D11, so I decided to have a D3D9Ex backend while I worked on the D3D11 backend. This week I finally got the D3D11 backend working, so the D3D9Ex backend will be deprecated, and all development will move to the D3D11 backend.

Now with D3D11, mesh deformation is performed once, in a compute shader.
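For readers unfamiliar with what such a deformation pass computes, here is a CPU reference of linear blend skinning, written so that one call to skinVertex corresponds to one thread of a compute shader. This is a generic sketch under assumed data layouts (four bone influences per vertex, row-major 3x4 matrices); the actual shader and vertex format used by the renderer are not shown in this thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine bone transform; the last column is the translation.
using Mat3x4 = std::array<float, 12>;

struct SkinnedVertex {
    Vec3 position;
    std::array<int, 4> bone;      // bone indices
    std::array<float, 4> weight;  // blend weights, expected to sum to 1
};

Vec3 transformPoint(const Mat3x4& m, Vec3 p) {
    return {m[0] * p.x + m[1] * p.y + m[2] * p.z + m[3],
            m[4] * p.x + m[5] * p.y + m[6] * p.z + m[7],
            m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11]};
}

// Work of a single compute thread: blend the bone-transformed positions.
Vec3 skinVertex(const SkinnedVertex& v, const std::vector<Mat3x4>& bones) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 4; ++i) {
        if (v.weight[i] == 0.0f) continue;  // unused influence slot
        assert(v.bone[i] >= 0 &&
               static_cast<std::size_t>(v.bone[i]) < bones.size());
        const Vec3 p = transformPoint(bones[static_cast<std::size_t>(v.bone[i])],
                                      v.position);
        out.x += v.weight[i] * p.x;
        out.y += v.weight[i] * p.y;
        out.z += v.weight[i] * p.z;
    }
    return out;
}

// Equivalent of one dispatch: every vertex is skinned once into an output
// buffer that later passes can read without deforming again.
std::vector<Vec3> skinAll(const std::vector<SkinnedVertex>& verts,
                          const std::vector<Mat3x4>& bones) {
    std::vector<Vec3> out;
    out.reserve(verts.size());
    for (const SkinnedVertex& v : verts) out.push_back(skinVertex(v, bones));
    return out;
}
```

Writing the result to a buffer once is what lets several passes (for example a shadow pass and the main pass) reuse the same deformed vertices.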

TheEqualizer · March 7

Added anti-aliasing support.

"MSAA" is MSAAx4 (I chose this mode because all D3D11/D3D_FEATURE_LEVEL_11_0 GPUs are required to support it, so there is no need to check if it's supported).

FXAA does a decent job of removing aliasing but introduces some blurring.

MSAA+FXAA provides the best quality.

SMAA can be combined with MSAAx2 (a mode called "SMAA S2x" by the SMAA authors), but I chose not to implement this.

MSAA was used here only for comparison; only FXAA and SMAA will be supported. The reason is that MSAA will make supporting other features more difficult later.

DemOnJR · March 7

MSAA looks the best, I think

Denizeri24 · March 7

25 minutes ago, DemOnJR said:

MSAA looks the best, I think

I'm using 8x, and it drops performance, like ~30 fps.

TheEqualizer · March 8

8 hours ago, DemOnJR said:

MSAA looks the best, I think

When combined with FXAA I think it produces the best quality.

In the future I might add other forms of anti-aliasing.

TheEqualizer · March 8

9 hours ago, Denizeri24 said:

I'm using 8x, and it drops performance, like ~30 fps.

MSAAx8 seems excessive to me.

I tested MSAAx8 (in my 180-character test scene) at 2560x1440 resolution, and there was barely any performance difference relative to MSAAx4 or MSAA off. I think either you have a driver problem or Nvidia must have some special optimization for MSAA.

If you are using D3D9, then this could be a driver issue (since Intel does not have a good D3D9 driver, I think they use an emulation layer).

Also, I seem to remember Intel saying that Resizable BAR was necessary for good performance with Arc GPUs, so if you don't have that enabled, it could be the reason for the performance hit.

TheEqualizer · March 10

I decided to test the performance of the renderer when all the light slots are used. Right now the renderer supports a maximum of 17 lights (1 directional and 16 spot/point lights). In the test scene below, all 17 lights are active.

Performance was not affected. So, I will increase the maximum number of lights to 25 (1 directional, 24 spot/point lights). With some clever light management, it's possible to support many more than this, so I might revisit this later.
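One simple form of light management is to keep many lights in the scene but pick only the most relevant ones for the fixed number of slots. The sketch below keeps the nearest lights whose range reaches a point of interest; the struct, the function name, and the distance-based ranking are assumptions for illustration, not a description of what the renderer does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct PointLight {
    float x, y, z;
    float range;  // distance beyond which the light contributes nothing
};

float distanceSq(const PointLight& l, float px, float py, float pz) {
    const float dx = l.x - px, dy = l.y - py, dz = l.z - pz;
    return dx * dx + dy * dy + dz * dz;
}

// Returns the indices of up to maxLights lights that reach the point,
// ordered from nearest to farthest.
std::vector<std::size_t> selectLights(const std::vector<PointLight>& lights,
                                      float px, float py, float pz,
                                      std::size_t maxLights) {
    assert(maxLights > 0);
    std::vector<std::size_t> candidates;
    for (std::size_t i = 0; i < lights.size(); ++i) {
        // Skip lights whose range does not reach the point at all.
        if (distanceSq(lights[i], px, py, pz) <= lights[i].range * lights[i].range)
            candidates.push_back(i);
    }
    const std::size_t keep = std::min(maxLights, candidates.size());
    std::partial_sort(candidates.begin(),
                      candidates.begin() + static_cast<std::ptrdiff_t>(keep),
                      candidates.end(),
                      [&](std::size_t a, std::size_t b) {
                          return distanceSq(lights[a], px, py, pz) <
                                 distanceSq(lights[b], px, py, pz);
                      });
    candidates.resize(keep);
    return candidates;
}
```

Calling this per object or per screen region with maxLights set to the slot count (24 for spot/point lights here) would let a scene hold far more lights than the shader evaluates at once.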

TheEqualizer · March 17

Implemented soft shadows using variance shadow mapping. Shadows add quite a bit of cost when many dynamic objects are visible, so this is a good candidate for multithreading.

There is decent self-shadowing as well.
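For reference, the core of variance shadow mapping is a Chebyshev upper bound evaluated from two filtered depth moments. The sketch below is the textbook form of that test on the CPU; the names and the minVariance clamp value are illustrative assumptions, and the renderer's actual shader is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>

// Filtered shadow-map sample: E[depth] and E[depth^2] over the filter region.
struct Moments {
    float mean;
    float meanSq;
};

// Returns an upper bound on the lit fraction at receiverDepth (1 = fully lit).
float vsmVisibility(Moments m, float receiverDepth, float minVariance) {
    assert(minVariance > 0.0f);
    // A receiver at or in front of the average occluder depth is fully lit.
    if (receiverDepth <= m.mean) return 1.0f;
    // Variance = E[d^2] - E[d]^2, clamped to avoid numerical artifacts.
    const float variance = std::max(m.meanSq - m.mean * m.mean, minVariance);
    const float d = receiverDepth - m.mean;
    // One-tailed Chebyshev inequality: P(depth >= t) <= var / (var + (t - mean)^2).
    return variance / (variance + d * d);
}
```

Because the moments can be blurred like any other texture, the soft edge comes from ordinary filtering of the shadow map rather than from many depth comparisons per pixel.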

DemOnJR · March 18

9 hours ago, TheEqualizer said:

Implemented soft shadows using variance shadow mapping. Shadows add quite a bit of cost when many dynamic objects are visible, so this is a good candidate for multithreading.

There is decent self-shadowing as well.

Looks amazing

TheEqualizer · March 18

Multiple light shadows are supported, but I don't expect this to be used.

Helia01 · March 21

Please tell me, do you also change the processing of effects in your changes? As far as I know this is a big problem in the original game.

TheEqualizer · March 21

21 minutes ago, Helia01 said:

Please tell me, do you also change the processing of effects in your changes? As far as I know this is a big problem in the original game.

Yes. The renderer is responsible for everything; anything that doesn't go through the renderer is not rendered.

Effects/Particles are pre-processed before submission so they can be rendered as efficiently as possible. I have tested having many effects at once and the impact was minimal.

One of the problems with the way Ymir renders things is that it sends very little work to the GPU (per draw call). GPUs work better when a large amount of work is sent because it allows the GPU/driver to better hide the latencies involved.
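As a rough illustration of sending more work per draw call, the sketch below sorts draw requests by a state key and merges requests that share the same state into one batch. It only does the bookkeeping; DrawItem, Batch, and materialId are hypothetical names, and actually merging draws also requires the geometry to live in shared buffers (or to be instanced).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One small draw request as a game object might submit it.
struct DrawItem {
    std::uint32_t materialId;  // stands in for texture/shader/render state
    std::uint32_t indexCount;
};

// Several requests merged into one larger submission.
struct Batch {
    std::uint32_t materialId;
    std::uint32_t indexCount;  // total indices in the batch
    std::size_t itemCount;     // how many requests were merged
};

std::vector<Batch> buildBatches(std::vector<DrawItem> items) {
    // Sorting by state groups identical states next to each other.
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.materialId < b.materialId;
                     });
    std::vector<Batch> batches;
    for (const DrawItem& item : items) {
        assert(item.indexCount > 0);
        if (!batches.empty() && batches.back().materialId == item.materialId) {
            batches.back().indexCount += item.indexCount;
            batches.back().itemCount += 1;
        } else {
            batches.push_back({item.materialId, item.indexCount, 1});
        }
    }
    return batches;
}
```

The number of batches is the number of state changes and draw submissions left after merging, which is the quantity worth minimizing.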

Punszz · March 21

Can you send me a private message? I would like to talk with you, but I can't send messages right now.

TheEqualizer · March 22

9 hours ago, Punszz said:

Can you send me a private message? I would like to talk with you, but I can't send messages right now.

I cannot send messages.

Punszz · March 22

4 minutes ago, TheEqualizer said:

I cannot send messages.

Message me on Discord; my Discord username is the same as my m2dev username xd

TheEqualizer · April 20

Implemented screen space ambient occlusion. In the images below, the effect was exaggerated a little to make it more visible.
