Libretro – A crossplatform application API, powering the crossplatform gaming platform RetroArch.

I needed a break from paraLLEl RDP, and I wanted to give PSX a shot to have an excuse to write a higher level Vulkan renderer backend. The renderer backends in Beetle PSX are quite well abstracted away, so plugging in my own renderer was a trivial task. The original PlayStation is certainly a massively simpler architecture than N64.
After one evening of studying the Rustation renderer by simias and PSX GPU docs, I had a decent idea of how it worked. Many hardware features of the N64 are not present on the PSX:

- Perspective correctness (no W from GTE)
- Texture filtering
- Sub-pixel precision on vertices (wobbly polygons, wee)
- Mipmapping
- Programmable texture cache
- Depth buffering
- Complex combiners

My goal was to create a very accurate HW renderer which supports internal upscaling. Making anything at native res for PSX is a waste of time as software renderers are basically perfected at this point in Mednafen and more than fast enough due to the simplicity.

Another goal was to improve my experience with 2D heavy games like the Square RPGs which heavily mix 2D elements with 3D. I always had issues with upscaling plugins back in the day as I always had to accept blocky and ugly 2D in order to get crisp 3D. Simply sampling all textures with bilinear is one approach, but it falls completely flat on PSX. Content was not designed with this in mind at all, and you’ll quickly find that tons of artifacts are created when the bilinear filtering tries to filter outside its designated blocks in VRAM.

The final goal is to do all of this without ugly hacks, game specific workarounds or otherwise shitty code. It was excusable in a time where graphics APIs could not cleanly express what emulation authors wanted to express, but now we can.

Development of this renderer was a fairly smooth ride, mostly done in spare time over ~2 months.

Credits

This renderer would not exist without the excellent Mednafen emulator and Rustation GL renderer.

Tested hardware/drivers

- nVidia Linux/Windows 3.
- AMDGPU-PRO 16. Linux (works fully)
- Mesa Intel (Ivy Bridge half-way working, Broadwell+ fully working, you’ll want to build from Git to get some important bug fixes which were uncovered by this renderer :D)
- Mesa Radeon RADV (fully working, you’ll want to build from Git to get support for input attachments)

– But, but, I don’t have a Vulkan-capable GPU
Well, read on anyways, some of this work will benefit the GL renderer as well.

– But, but, you’re stupid, you should do this in GL 1.
No?

– Fine, but clearly this is just for shits and giggles.
Doing it for the lulz is always a valid reason.

Source

The source will be merged upstream to GitHub immediately.

PSX GPU overview

The PSX GPU is a very simple and dumb triangle rasterizer with some tricks.

VRAM

The PSX has a 1024×512 VRAM at 16bpp, giving us 1 MB of VRAM to work with. Interestingly enough, this VRAM is actually organized as a 2D grid, and not a flat array with width/height/stride. This certainly simplifies things a lot as we can now represent the VRAM as a texture instead of shuffling data in and out of SSBOs. Unlike N64, the CPU doesn’t have direct access to this VRAM (phew), so access is mediated by various commands.

Textures

The PSX can sample textures at 4-bit palettes, 8-bit palettes or straight ABGR1555. Texture coordinates are confined to a texture window, which is basically an elaborate way to implement texture repeats. Textures are sampled directly from VRAM, but there is a small texture cache. For purposes of emulation, this cache is ignored (except for one particular case which we’ll get to …). An annoying feature is that the color “0x0000” on PSX is always transparent, so all fragment shaders which sample textures might have to discard, another reason to be careful with bilinear.

Shading options

PSX just has 3 shading options, which makes our life very simple:

- Interpolate color from vertices
- Interpolate UV and sample nearest neighbor
- Sample texture multiplied by interpolated color (Gouraud shading)

It is practical to not use uber-shading approaches here.

Semi-transparency

PSX has a weird way of dealing with transparency. There is no real alpha channel to speak of, we only have one bit, so what PSX does is set a constant transparency formula (A + B, 0.5A + 0.5B, B - A, or 0.25A + B). If the high bit of a texture color is set, transparency is enabled; if not, the fragment is considered opaque. Semi-transparent color-only primitives are simply always transparent.

Mask-bit

Possibly the most difficult feature of the PSX GPU is the mask-bit. The alpha bit in VRAM doubles as a “read-only” bit: if mask bit testing is enabled and the bit is set, the pixel cannot be overwritten. This affects rendering primitives as well as copies from CPU and VRAM-to-VRAM blits. Especially mask-bit emulation + semi-transparency creates a really difficult blending scenario which I haven’t found a way to do correctly with fixed function (but that won’t stop us in Vulkan). Correctly emulating mask-bit lets us render Silent Hill correctly. The trees have transparent quads around them without it.

Intersecting VRAM blits

It is possible, and apparently well defined on PSX, to blit from one part of VRAM to another part where the rects intersect. Reading the Mednafen/Beetle software implementation, we need to kind of emulate the texture cache. Fortunately, this was very doable with compute shaders, although not very efficient.

Implementation details

Feature – Adaptive smoothing

As mentioned, I prefer smooth 2D with crisp-looking 3D. I devised a scheme to do this in post. The basic idea is to take our 4x or 8x scaled image and mip-map it down to 1x with a box filter. While mip-mapping, we analyze the variance within the 4×4 or 8×8 block and stick that in alpha. The assumption here is that if we have nearest-neighbor scaled 2D elements, they typically have a 1:1 pixel correspondence in native resolution, and hence the variance within the block will be 0. With 3D elements, there will be some kind of variance, either from values which were shaded slightly differently or, more dramatically, a geometry edge.

We now compute an R8_UNORM “bias-mask” texture at 1x scale, which is 0.0 for 3D elements, and 1.0 for 2D. To avoid sharp transitions in LOD, the bias-mask is then blurred slightly with a 3×3 gaussian kernel (might be a better non-linear filter here for all I know). On final scanout we simply sample the bias-mask, multiply that by log2(scale) and use the result as the LOD for textureLod() with trilinear sampling, and magically 2D elements look smooth without compromising the 3D sharpness. Sure, it’s not perfect, but I’m quite happy with the result.

Consider this scene from FF IX. While some will prefer this look (it’s toggleable), I’m not a big fan of blocky nearest-neighbor backgrounds together with high-res models.

With adaptive smoothing, we can smooth out the background and speech bubble back to native resolution where they belong. You may notice that the shadow under Vivi is sharp, because the shadow which modulates the background is not 1:1. This is certainly the downside of doing it in post, but it’s hard to notice unless you’re really looking.

The bias mask texture looks like this after the blur:

Potential further ideas here would be to use the bias-mask as a lerp between xBR-style upscalers if we wanted to actually make the GPU not fall asleep.

There is nothing inherently Vulkan specific about this method, so it will possibly arrive in the GL backend at some point as well. It can probably be used with N64. Obviously, for 24-bit display modes (FMVs), the output is always in native resolution.

GPU dump player

Just like the N64 RDP, having an offline dump player for debugging, playback and analysis is invaluable, so the first thing I did was to create a basic dump format which captures PSX GPU commands and plays them back.
This is also nice for benchmarking as any half-capable GPU will be bottlenecked on CPU.

PGXP support

Supporting PGXP for sub-pixel precision and perspective correctness was trivial as all the work happens outside the renderer abstraction to begin with. I just had to pass down W to the vertex shader.

Mask bit emulation

Mask bit emulation without transparency is quite trivial. When rendering, we just use fixed function blending, src = INV_DST_ALPHA, dst = DST_ALPHA. With semi-transparency things get weird. To solve this, I made use of Vulkan’s subpass self-dependency feature, which allows us to read the current pixel of the framebuffer and thereby enables programmable blending.
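To make the per-pixel logic concrete, here is a rough CPU-side model in Python of what such a programmable blend has to compute. It is a sketch under stated assumptions, not the actual shader: the names `BlendMode` and `write_pixel` are made up for illustration, A is read as the incoming fragment and B as the existing VRAM pixel, the alpha/mask bit is taken to be bit 15 of an ABGR1555 word, and the rounding of the 0.5 and 0.25 terms is a guess.

```python
from enum import Enum


class BlendMode(Enum):
    """The four constant semi-transparency formulas (A = incoming, B = VRAM)."""
    ADD = "A + B"
    AVERAGE = "0.5A + 0.5B"
    SUBTRACT = "B - A"
    ADD_QUARTER = "0.25A + B"


def unpack(pixel):
    """Split an ABGR1555 word into (r, g, b, a); bit 15 is the alpha/mask bit."""
    return pixel & 31, (pixel >> 5) & 31, (pixel >> 10) & 31, (pixel >> 15) & 1


def pack(r, g, b, a):
    """Assemble an ABGR1555 word from 5-bit channels and a 1-bit alpha."""
    return r | (g << 5) | (b << 10) | (a << 15)


def blend_channel(a, b, mode):
    """Apply one formula to a 5-bit channel and clamp the result to [0, 31]."""
    if mode is BlendMode.ADD:
        value = a + b
    elif mode is BlendMode.AVERAGE:
        value = (a + b) // 2  # rounding behaviour is an assumption
    elif mode is BlendMode.SUBTRACT:
        value = b - a
    else:  # BlendMode.ADD_QUARTER
        value = a // 4 + b  # rounding behaviour is an assumption
    return max(0, min(31, value))


def write_pixel(dst, src, mode=None, mask_test=False):
    """Return the new VRAM value when fragment `src` lands on pixel `dst`.

    mode=None means an opaque draw. For color-only semi-transparent
    primitives the caller is assumed to set the high bit of `src`.
    """
    # Mask test: a destination pixel with its alpha bit set is read-only.
    if mask_test and (dst >> 15) & 1:
        return dst
    sr, sg, sb, sa = unpack(src)
    # Blend only when the fragment's high bit asks for transparency.
    if mode is not None and sa:
        dr, dg, db, _ = unpack(dst)
        sr = blend_channel(sr, dr, mode)
        sg = blend_channel(sg, dg, mode)
        sb = blend_channel(sb, db, mode)
    return pack(sr, sg, sb, sa)
```

With `mode=None` and `mask_test=True` the function just picks either `dst` or `src` based on the destination alpha bit, which is exactly what the INV_DST_ALPHA / DST_ALPHA fixed function setup expresses. Once a blend formula is involved, the destination pixel is needed both for the test and for the arithmetic, and that is where fixed function blending runs out.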