Tech/LxEngine/Tutorials/Tutorial 4

Overview
Tutorial 4 is similar to Tutorial 3 in that it acts as a simple viewer for various geometry and materials; however, in this tutorial, the application is built primarily using a combination of script files, configuration files, JSON data, and a native C++ plug-in within the generic lxengineapp.exe host application.

This tutorial covers a lot of ground (actually a bit too much) and introduces various aspects of LxEngine:


 * Creating point lists, line sets, and triangle meshes via Javascript scripts
 * Toon shader
 * Blur effect (multi-pass rendering)
 * Smooth camera changes (timed events)
 * High-level profiling (using LxEngine counters)
 * Custom JSON-to-native object conversion (lxvar _convert functions)

Source Code
The source for this tutorial is highly commented. If you prefer, reading the source code directly may be instructive:


 * samples/tutorials/tutorial4
   * data
   * src

LxEngineApp.exe
The first major difference in Tutorial 4 is how it is run.

All previous tutorials produced an executable to be run directly. In this case, we produce a DLL (or shared library) and a data directory that is then run using lxengineapp.exe. The generic lxengineapp.exe acts as a host application that sets up LxEngine for its common usage and loads configuration files, scripts, and plug-ins to turn the generic engine into a custom application.

Tutorial 4 is run by executing lxengineapp.exe and providing the name of a data file directory. lxengineapp.exe automatically searches the "apps" directory for a subdirectory matching the given name.

Manifest.lx
LxEngine relies on conventions when it makes sense to do so.

The first specific example encountered in Tutorial 4 is that LxEngineApp.exe looks for a file named manifest.lx in the data directory. This is a standard JSON file that describes some common start-up configuration options for the application. This lets the engine know how to bootstrap the application. The manifest file simply provides an easier short-hand alternative to coding these options and directives directly in native C++.

The content in this case is quite simple:

The configuration basically tells the application to load the plug-in named "tutorial_04.dll" and start execution with the script main.js.

main.js
This application's custom execution begins with the script file main.js. Another convention of LxEngine is that it assumes the main script file has a function named  defined within it.

This should look fairly familiar if you've worked through the previous tutorials. The obvious difference is that the code is now written in Javascript rather than C++. In general, the Javascript API mirrors the C++ API except where Javascript affords a more convenient alternative (that convenience is one major point of using a scripting language in the first place!). In the interest of brevity, we'll assume the above all makes reasonable sense to someone familiar with the basics of LxEngine's API.

The Javascript context has some pre-defined variables such as  for the free functions in the C++ lx0 namespace as well as the   variable to reference the lx0::Engine singleton. Another convenience is that any method taking an  argument can take a Javascript object directly (e.g.  ).

Javascript API differences
The C++ API defines a method. The Javascript API does not correspond exactly to the C++ API. Instead, for convenience, it allows a "generic" Javascript object to be passed into the function and will construct a well-formed native UIBinding from that.

Renderer Plug-in
Admittedly, while this tutorial does introduce far more use of scripting, the most interesting part of this application still takes place in the native code.

The main.js file created a view using the "Renderer" View::Component and it is this "Renderer" component that handles the majority of the application behavior.

initializePlugin
The component identified by the "Renderer" name is defined in tutorial_04.dll. As a reminder, this plug-in was loaded as part of manifest.lx in the plug-ins section of the JSON file. All plug-ins define an exported function named initializePlugin. LxEngine will call this upon loading the plug-in. This is where the plug-in has an opportunity to register named custom components with the engine. In this particular case, our tutorial_04.dll will register the "Renderer" component as a View::Component.

Multi-threaded Geometry Loading & Tasks
Tutorial 4 loads the Blender models in background threads. The general pattern to handle this is as follows:


 * Generate stub geometry to place in the geometry list
 * Load the blender file in a worker thread using only data local to that thread
 * "Pass" the data to the main thread using a main thread task

Geometry Loading Outer Function
The primary method for adding geometry based on a  XML element does two things: (1) it creates the stub entry in the geometry list, and (2) forks off the real work to a sub-method to handle either Blender files or geometry coming from a script.

The creation of stub geometry allows us to get the main application window (i.e. the View) up and running as soon as possible. The geometry will "pop in" as soon as it's ready (the pop effect would be bad in some applications, but it's the chosen behavior here).

acquireGeometry
The LxEngine rasterizer can use "named resources" to describe materials and geometry. This allows data to be shared by a string name rather than an explicit shared C++ variable (which can be quite convenient at times). In this case, the named resource is called "basic2d/Empty" - which represents the empty stub geometry we want.

On the first call, the rasterizer will search the cache and not find a matching GeometryPtr for "basic2d/Empty". Therefore, it will assume this is an on-disk resource and search for it under the media/geometry directory. It will look for a file named Empty.json under the basic2d sub-directory and attempt to convert that JSON geometry description into an in-memory primitive_buffer object.
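The look-up described above follows a common "cache by name, load on miss" pattern. The following is a minimal sketch of that pattern only, not the LxEngine rasterizer: the `NamedGeometryCache` class, the `Geometry` struct, and the injected loader callback are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the engine's geometry type.
struct Geometry {
    std::string sourcePath;
};
using GeometryPtr = std::shared_ptr<Geometry>;

// Illustrative named-resource cache; not the actual LxEngine rasterizer.
class NamedGeometryCache {
public:
    using Loader = std::function<GeometryPtr(const std::string& path)>;

    NamedGeometryCache(std::string rootDir, Loader loader)
        : mRoot(std::move(rootDir)), mLoader(std::move(loader)) {
        assert(mLoader != nullptr);
    }

    GeometryPtr acquire(const std::string& name) {
        auto it = mCache.find(name);
        if (it != mCache.end())
            return it->second;  // cache hit: no disk access needed

        // Cache miss: treat the name as a path relative to the root directory.
        GeometryPtr geometry = mLoader(mRoot + "/" + name + ".json");
        mCache.emplace(name, geometry);
        return geometry;
    }

private:
    std::string mRoot;
    Loader mLoader;
    std::map<std::string, GeometryPtr> mCache;
};
```

With a root of "media/geometry", acquiring "basic2d/Empty" twice would invoke the loader once (for media/geometry/basic2d/Empty.json) and return the same pointer both times.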

Since this is an "empty" object, the JSON description in Empty.json is trivial:

A slightly more interesting example might be basic2d/FullScreenQuad which is loaded internally when the engine needs to blit a full screen quad:

The JSON description essentially mirrors the glgeom::primitive_buffer data structure. For simple geometry, this direct JSON description of a primitive buffer may be preferable to loading geometry from a Blender file or other more complex format.

Coordinating Worker Threads & the Main Thread
The Blender geometry loading is one of the more interesting aspects of Tutorial 4 as it introduces multi-threading.

The method takes advantage of C++11's lambda functions to coordinate a worker thread task and main thread task that will properly transfer data gathered from the worker thread into the main thread without explicit use of any locking primitives. (Note: the LxEngine event queue is thread-safe; thus there is implicit use of locking primitives in order to accomplish this.)

First, we create a  to encapsulate loading the Blender file. This function is going to be run in a worker thread; therefore, we are careful to make sure it deals only with locally defined data (i.e. data that other threads will not be touching) or functions that are known to be thread-safe (e.g. ).  The lambda function itself more or less simply calls   to do the real work of loading the Blender model.

The function is then forked off to the worker thread via. The worker threads each own a task queue and wait for tasks to be added.

Second, we then create a second  to encapsulate moving that loaded data into the   queue. The key point here is that the  function object does the portion of the work that needs to be run in the main thread, and thus is enqueued in the main thread queue via. This queue is processed by the main thread on every cycle of the main event loop.

The transfer between the two queues ensures a coordinated, sequential order of events that is thread-safe.
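The two-stage handoff can be sketched with standard C++ only. This is an illustration of the pattern under stated assumptions, not LxEngine code: `TaskQueue`, `loadModel`, and `loadInBackground` are hypothetical names, the "model" is just a vector of floats, and a raw `std::thread` stands in for the engine's worker thread queue.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal thread-safe task queue (hypothetical; stands in for the engine's queues).
class TaskQueue {
public:
    void enqueue(std::function<void()> task) {
        assert(task != nullptr);
        std::lock_guard<std::mutex> lock(mMutex);
        mTasks.push(std::move(task));
    }

    // Runs every queued task on the calling thread; returns how many ran.
    int runPending() {
        int count = 0;
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(mMutex);
                if (mTasks.empty())
                    break;
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();  // run outside the lock
            ++count;
        }
        return count;
    }

private:
    std::mutex mMutex;
    std::queue<std::function<void()>> mTasks;
};

// Stand-in for the expensive file load; it touches only its own arguments.
std::vector<float> loadModel(const std::string& filename) {
    return std::vector<float>(filename.size(), 1.0f);
}

// The worker loads into thread-local data, then hands the result to the
// main thread as a task. The caller must keep sceneGeometry and mainQueue
// alive until that task has run.
std::thread loadInBackground(const std::string& filename,
                             std::vector<float>& sceneGeometry,
                             TaskQueue& mainQueue) {
    return std::thread([filename, &sceneGeometry, &mainQueue]() {
        std::vector<float> local = loadModel(filename);  // worker thread only
        mainQueue.enqueue([&sceneGeometry, local]() {
            sceneGeometry = local;                       // main thread only
        });
    });
}
```

In this sketch the shared scene data is only ever written from whichever thread calls `runPending()`, so the only lock in play is the one inside the queue itself.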

Loading Geometry from a Script
Tutorial 4 allows geometry to be loaded either via a Blender file or a script.

One current restriction in LxEngine is that the Javascript interface has to be called from the main thread (it's not thread-safe and will crash if called outside the thread that initialized it). This means that the script processing cannot be forked off to a worker thread as is done with the Blender geometry.

Now, as the script processing needs to be done in the main thread, we could process the script immediately as part of the initial XML document processing. However, we're going to choose to wrap it as a task for the main thread and defer the loading until the application has fully started. As the code comment below denotes, this is done primarily for demonstration purposes - there's not a lot gained from this approach for this particular case. In a more complex application deferring tasks like this could be useful for increasing perceived performance and interactivity.

Points
One of the pieces of geometry in the tutorial is a simple set of points representing the shell of a cube. This geometry is generated via Javascript.



The actual technique used for generating the point set involves a bit of work for such simple geometry, but it's a good introduction.

First, let's look at the  section of document.xml. As the XML might suggest, this will include a Javascript file called meshlib.js. All files included in the  section of the document will be added to the scripting context for the entire document: i.e. any functions or classes defined in such files are accessible to other scripts used by that same document.

We'll skip looking at the contents of meshlib.js for a moment and simply note that it defines several helper classes for generating geometry. In this case, we're most interested in the PointList class. Since the PointList class is simply a container for a list of points (surprise!), one of the only two methods we need to worry about is the  method. The other is, which we'll get to in a moment.

Creating the Cube
The algorithm for creating the shell of the cube in points is straightforward (if inefficient). It cycles through a grid of points in the cube space and rejects anything other than the outermost layer.

The interesting parts to note are:
 * The script is run with the expectation of a return value; therefore, we wrap the code in an anonymous function call
 * The return value is not a PointList, but rather the PointList conversion to a "primitive buffer"
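The grid-and-reject algorithm is simple enough to sketch directly. The tutorial's version is written in Javascript; the version below is a C++ rendering of the same idea under assumed details: the grid resolution parameter `n`, the [-1, 1] extent, and the `Point` and `cubeShellPoints` names are all illustrative choices rather than values taken from the tutorial source.

```cpp
#include <cassert>
#include <vector>

struct Point {
    float x, y, z;
};

// Walks an n x n x n grid spanning [-1, 1] on each axis and keeps only the
// points lying on the outermost layer of the grid.
std::vector<Point> cubeShellPoints(int n) {
    assert(n >= 2);
    const float step = 2.0f / static_cast<float>(n - 1);
    auto onShell = [n](int i) { return i == 0 || i == n - 1; };

    std::vector<Point> points;
    for (int z = 0; z < n; ++z) {
        for (int y = 0; y < n; ++y) {
            for (int x = 0; x < n; ++x) {
                // Reject interior points: none of the coordinates is extreme.
                if (!onShell(x) && !onShell(y) && !onShell(z))
                    continue;
                points.push_back(Point{-1.0f + x * step,
                                       -1.0f + y * step,
                                       -1.0f + z * step});
            }
        }
    }
    return points;
}
```

The inefficiency is visible in the loop structure: all n^3 grid points are visited even though only n^3 - (n-2)^3 of them are kept.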

Primitive Buffer
The glgeom::primitive_buffer is a generic wrapper on what largely amounts to a CPU-side vertex array object. It stores arrays of data such as positions, normals, colors, and UVs. It can store either an indexed primitive or a flat, unindexed array of data. It also stores a type (which corresponds to the OpenGL primitive types). Also, because it is a simple structure of basic data types and arrays of arrays, it can be easily expressed in JSON form.

Because of this last property the glgeom::primitive_buffer is the preferred intermediate format for transferring geometric data into the engine. Using a generic format such as this prevents the core engine from having to natively understand numerous potential mesh formats (this is another instance of LxEngine choosing flexibility over raw performance).

With this in mind, let's look at the PointList class definition...

Lines
The next script generated primitive is a set of lines.



The basic pattern is just like the point set example. The lower-level details differ since the algorithm here uses a more complicated approach to generating the data. We're going to skim over the details to keep the tutorial from straying too far off course, but suffice to say this example suggests more complex ways scripts could be used to generate geometry.

The short summary of how the lines example works is this:
 * The  function creates a   object representing a cube
 * The TriMesh class is a wrapper of a "polygonal soup" mesh (no connectivity information)
 * A HalfEdgeMesh is then created from the TriMesh, thus encapsulating that same object in a class that now does have face/edge/vertex connectivity information
 * It then iterates the edges* of the HalfEdgeMesh and adds a line segment for each edge
 * The HalfEdgeMesh is then converted to primitive buffer format for consumption by LxEngine

(*Note: in half-edge data structures, each logical edge is composed of a pair of half-edges which is why we mark the "opposite" half-edge as visited.)
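The "visit each logical edge once" step can be sketched as follows. This is a C++ illustration of the idea only (the tutorial implements it in Javascript), and the flat `HalfEdge` record with integer indices is an assumed layout, not the meshlib.js data structure.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed minimal half-edge record: start/end vertex indices plus the index
// of the paired half-edge running in the opposite direction (-1 if none).
struct HalfEdge {
    int from;
    int to;
    int opposite;
};

// Emits one line segment per logical edge by marking both halves as visited.
std::vector<std::pair<int, int>> uniqueEdges(const std::vector<HalfEdge>& halfEdges) {
    std::vector<bool> visited(halfEdges.size(), false);
    std::vector<std::pair<int, int>> segments;

    for (std::size_t i = 0; i < halfEdges.size(); ++i) {
        if (visited[i])
            continue;
        visited[i] = true;

        const int opposite = halfEdges[i].opposite;
        if (opposite >= 0) {
            assert(static_cast<std::size_t>(opposite) < halfEdges.size());
            visited[opposite] = true;  // skip the twin when we reach it later
        }
        segments.emplace_back(halfEdges[i].from, halfEdges[i].to);
    }
    return segments;
}
```

Without the "opposite" marking, every edge of a closed mesh would be emitted twice, once per direction.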

Triangles
The third type of script generated geometry is a cube smoothed into a sphere using Catmull-Clark subdivision. The geometry is generated using a data structure and a standard implementation of Catmull-Clark subdivision. The data structure and the subdivision algorithm all are implemented in Javascript.

A cube with 5 levels of subdivision with the Catmull-Clark algorithm. Displayed in wireframe.

The basic pattern of the code for this should be familiar based on the last two examples. The Javascript creates a custom mesh object and eventually converts it to a generic primitive buffer for consumption by the engine.

The details of this geometry generation are not covered here. Catmull-Clark subdivision is fairly well documented elsewhere (better than it could be here) and is not essential to the core of this tutorial. The core point here is that procedural generation of geometry via Javascript is in no way limited to simple, basic shapes.

lxvar::convert
As a final sub-section, let's look at the C++ code for converting these JSON primitive buffers to native code:

The "source" variable contains the script string, and running it returns an lxvar. As a reminder, the lxvar class is a native JSON-like data structure. We then use the convert method to convert the lxvar to a glgeom::primitive_buffer.

There's a little bit of C++ trickery occurring here. The convert method uses a sort of "overload by return value" mechanism to allow the single convert method to be used for any kind of conversion. The basic concept is described here. In short, it uses a templatized implicit conversion operator to then call a regular overloaded conversion function to do the work. This allows a simple syntax to be used to convert types, but also allows custom conversions to be registered without modifying the base lxvar data type.

Any function of the type  that is placed in the namespace   will be searched when looking to match a desired conversion.

LxEngine internally defines several common, built-in conversions in lxvar_convert.cpp. Note again that these _convert implementations are "add-ins" in the sense that they require no modification of the lxvar class.
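The "overload by return value" idea can be shown with a toy type. The sketch below is not the lxvar implementation: `demo::Var`, its `Converter` proxy, and the string-based `_convert` overloads are hypothetical, chosen only to show how a templated conversion operator can dispatch to ordinary overloaded functions that live outside the class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace demo {

// Toy stand-in for a JSON-like variant; it only holds a string.
struct Var {
    std::string text;

    // Proxy returned by convert(). The target type is picked by whatever
    // the caller assigns the result to.
    struct Converter {
        const Var& source;

        template <typename T>
        operator T() const {
            T value{};
            _convert(source, value);  // resolved among the _convert overloads
            return value;
        }
    };

    Converter convert() const { return Converter{*this}; }
};

// Conversions are free functions: adding one never touches Var itself.
inline void _convert(const Var& v, int& out) { out = std::stoi(v.text); }
inline void _convert(const Var& v, std::string& out) { out = v.text; }

// A "custom" type with its own conversion, added after the fact.
struct Point {
    int x = 0;
    int y = 0;
};

inline void _convert(const Var& v, Point& out) {
    const std::size_t comma = v.text.find(',');
    assert(comma != std::string::npos);
    out.x = std::stoi(v.text.substr(0, comma));
    out.y = std::stoi(v.text.substr(comma + 1));
}

}  // namespace demo
```

A caller would then write `int n = v.convert();` or `demo::Point p = v.convert();` and the same `convert()` call yields a different result type each time. One caveat of this sketch is that it relies on copy-initialization (`T x = v.convert();`); other initialization forms can be ambiguous for class types with many constructors.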

Materials
Tutorial 4 supports various new materials via a simple shader loading system. Specifically, it introduces several variations on a Toon shader, a simple Terrain shader, as well as JSON-based shader graph materials from the previous tutorial.



acquireMaterial
The code for loading a material is similar to the acquireGeometry function described previously. The rasterizer is queried for a named resource, and if it does not have that named resource cached, it will attempt to load the resource off of disk.

In this case, we pull the name of the resource from the XML file in the "src" and "instance" attributes:

MaterialClass and Material
LxEngine makes a distinction between a "material class" (i.e. MaterialClass) and a "material instance" (i.e. Material, shortened for brevity).


 * A MaterialClass effectively is a complete shader program without any specification of the parameters to that shader.
 * A Material is effectively a pointer to a MaterialClass along with a set of parameters to use with that shader

Therefore, only Materials can be assigned to Instances in LxEngine since a MaterialClass cannot be activated without setting the parameters. This is similar to the fact that a function cannot be called without specific values to pass as parameters.

However, for convenience, a MaterialClass does store a set of default parameters. Therefore, a default Material instance can be created from a MaterialClass. As a shorthand, when a MaterialClass is specified to acquire a Material, this will automatically invoke the default Material for that Class.
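The class/instance split can be modeled with two small structs. This is a simplified sketch rather than the engine's types: parameters are reduced to string key/value pairs, and `makeMaterial` is a hypothetical helper showing how a default Material falls out of a MaterialClass and how a parameter set could override individual defaults.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplification: real shader parameters are typed; strings keep the sketch short.
using Parameters = std::map<std::string, std::string>;

// A shader program plus default parameter values, but nothing bound yet.
struct MaterialClass {
    std::string shaderDirectory;
    Parameters defaults;
};

// A reference to a class plus the concrete parameter values to use with it.
struct Material {
    std::shared_ptr<const MaterialClass> materialClass;
    Parameters parameters;
};

// Hypothetical helper: start from the class defaults, then apply overrides.
// With no overrides this yields the "default Material" for the class.
Material makeMaterial(std::shared_ptr<const MaterialClass> cls,
                      const Parameters& overrides = Parameters()) {
    assert(cls != nullptr);
    Material material;
    material.parameters = cls->defaults;
    for (const auto& entry : overrides)
        material.parameters[entry.first] = entry.second;
    material.materialClass = std::move(cls);
    return material;
}
```

Many Materials can share one MaterialClass this way, which mirrors the function analogy: one function definition, many calls with different arguments.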

Toon Shader
(More information on the toon shader is available on the Material System blog post on the athile blog.)



The toon shader is stored in a directory under the media directory in ToonSimple. A material directory should contain:


 * A vertex shader named shader.vert
 * A fragment shader named shader.frag
 * Optional: a geometry shader named shader.geom
 * Optional: a JSON description of the default parameters named material.json
 * Optional: a set of JSON files named instance-*.json describing alternate parameter sets

The toon shader used in the tutorial is adapted from the GLSL toon shading article on lighthouse3d. The variation is that this shader uses a 1D texture map look-up rather than an if-else structure to choose the color.

We'll assume the reader is familiar with GLSL and not provide much explanation here. The lighthouse3d tutorial will likely help if more information is needed about the shader.

The vertex shader is simple:

Likewise, the fragment shader is simple:

There is no geometry shader file, so LxEngine will create a shader program without a geometry shader.

The default parameters are then specified in the material.json file:

When the MaterialPtr is created via the acquireMaterial call, the engine (1) creates the material class and then (2) creates the material along with code to set the various parameters. The parameter setting is smart enough to realize that the parameter named "unifTexture0" is a sampler1D and that the string value it has been assigned is an image filename. Therefore, it will automatically load the texture map into the texture cache on creating the material, and activate and assign that OpenGL texture map when activating the material itself. This happens "automatically", making it very easy to set up new shaders and shader variations.

Likewise, the vertex shader has several named uniforms - these however use "standard" names that LxEngine will recognize without any need to specify them in the material.json file. If LxEngine sees that the shader contains a uniform named "unifProjMatrix", it will automatically generate the code to set that uniform to the current projection matrix in the rasterizer when activating the material.

A partial list of standard uniforms includes:
 * mat4 unifProjMatrix - projection matrix
 * mat4 unifModelMatrix - model matrix
 * mat4 unifViewMatrix - view matrix
 * mat3 unifNormalMatrix - normal matrix
 * vec3 unifBBoxMin - minimum coordinate of the current object bounding box
 * vec3 unifBBoxMax - maximum coordinate of the current object bounding box

(Note: the list of standard uniforms should be documented, but for now it's a work-in-progress engine and checking the code is the easiest way to get the definitive list of supported standard uniforms.)

Toon Shader Instances
The ToonSimple directory also contains several "instance-*.json" files. These essentially are JSON parameter sets describing alternate instances of the same material class. Any parameters specified here will override ones provided in the default material.json parameters.



For example, instance-Forest.json:

Terrain Shader
The terrain shader used in Tutorial 4 provides another example - along with the Toon shader - of a custom shader loaded from file.



The terrain shader mixes samples from several different textures and colors based on the vertex position relative to the object bounds. The shader does not introduce any new concepts as compared to the toon shader.

One particular point of interest is the use of the  and  uniform variables. These variables are both "standard" LxEngine uniforms that the engine will set to the object bounds when activating the shader. This can be useful for mapping shaders onto the entire span of an object and/or keeping the shader scale relative to the object rather than the world coordinate system.

The terrain shader code listing is located in TerrainSimple.

Shader Builder Materials
An alternative to describing a shader via vertex, geometry and fragment shaders is to provide a "shader.graph" file representing a shader graph described in JSON form.



This relies on a work-in-progress, unstable feature of LxEngine, so it is not well documented. The source for the shader graph used above is located in BumpSample1.

Grayscale


Note that the material is "acquired" every frame. Resources created by an acquire method are always cached by the given look-up name; therefore, the first acquisition results in loading and creating the material. All subsequent acquisitions simply return a pointer to the material now already in the cache. The cache look-up every frame is slightly inefficient (as opposed to storing a member variable for the material), but overall is such a minor inefficiency that the simplicity of reacquiring every frame is preferred.

Color Inversion
This is another simple example, very similar to the grayscale effect.



Rendering the Scene
The general rasterization strategy has not changed since Tutorial 3. We simply have set up a more complex  to pass into the rasterizer functions.

Camera Change Animations
The objective here is to define some commands where a keypress zooms the camera in or out smoothly over a short period of time. This is not a "single-frame" event where the zoom level changes if and only if the key is currently pressed during that frame; rather, the keypress invokes an event that occurs over a series of frames.

The usual key binding to event sequence is used and then we add handlers for "zoom_in" and "zoom_out" events.
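The per-frame work of such a multi-frame event amounts to interpolating a value over elapsed time. The sketch below shows one possible way to do that and is not the tutorial's handler: the `ZoomAnimation` struct, the smoothstep easing curve, and the idea of animating a single float are assumptions made for illustration.

```cpp
#include <cassert>

// State of one in-flight zoom: interpolates a single value over a duration.
struct ZoomAnimation {
    float startValue;
    float endValue;
    float durationMs;
    float elapsedMs;
};

ZoomAnimation startZoom(float from, float to, float durationMs) {
    assert(durationMs > 0.0f);
    return ZoomAnimation{from, to, durationMs, 0.0f};
}

// Advances the animation by one frame and writes the current value.
// Returns true while the animation still has frames left to run.
bool stepZoom(ZoomAnimation& anim, float frameMs, float& value) {
    anim.elapsedMs += frameMs;
    float t = anim.elapsedMs / anim.durationMs;
    if (t > 1.0f)
        t = 1.0f;

    // Smoothstep easing: starts and ends gently instead of moving linearly.
    const float eased = t * t * (3.0f - 2.0f * t);
    value = anim.startValue + (anim.endValue - anim.startValue) * eased;
    return t < 1.0f;
}
```

A "zoom_in" handler built along these lines would call `startZoom` once on the keypress and then `stepZoom` on every frame until it returns false.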

Canceling an Event
When an event is registered along with a handle, it is stored in the Engine queue as a. The use of this standard class is intentional so that the semantics of use (i.e. what's happening) should be easy to understand by the client code.

Upon registration, the engine returns a shared_ptr to the client and internally a weak_ptr is stored in the queue. Therefore, ownership of the event is given to the client and the engine internally maintains a passive reference to the event. With this in mind, for the client to cancel an event, it simply must release all references to the event.

In this case, the event handle is always stored in. To cancel the event, that  simply needs to be reset:

The reference count of the object will drop to zero, the weak_ptr stored internally will expire, and the engine will no longer attempt to process the event.
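The ownership arrangement can be reproduced with the standard smart pointers alone. The following is a minimal sketch, not the LxEngine event queue: `EventQueue`, `registerEvent`, and `process` are hypothetical names, and an event is reduced to a `std::function<void()>`.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

using Event = std::function<void()>;

// Illustrative queue: the client owns each event, the queue only observes it.
class EventQueue {
public:
    std::shared_ptr<Event> registerEvent(Event fn) {
        assert(fn != nullptr);
        std::shared_ptr<Event> handle = std::make_shared<Event>(std::move(fn));
        mEvents.push_back(handle);  // stored as a weak_ptr: no ownership taken
        return handle;
    }

    // Runs every event that is still alive; returns how many ran.
    int process() {
        // Iterate over a snapshot so a callback may safely register new events.
        const std::vector<std::weak_ptr<Event>> snapshot = mEvents;
        int ran = 0;
        for (const auto& weak : snapshot) {
            if (std::shared_ptr<Event> event = weak.lock()) {
                (*event)();
                ++ran;
            }
        }
        // Forget events whose owners have released them.
        mEvents.erase(std::remove_if(mEvents.begin(), mEvents.end(),
                                     [](const std::weak_ptr<Event>& weak) {
                                         return weak.expired();
                                     }),
                      mEvents.end());
        return ran;
    }

private:
    std::vector<std::weak_ptr<Event>> mEvents;
};
```

Calling `reset()` on the returned handle (assuming no other copies exist) is all that is needed to cancel: the next `process()` call finds the weak_ptr expired and skips it.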

Profiling
LxEngine contains a high-level, internal profiling system. Emphasis is encouraged on "high-level", as this is a simple, easy-to-use profiling system that is intended to gather coarse-grain data on the application. It is not intended to compete with or replace low-level profiling tools and counters such as are available via Visual Studio, PerfHUD, or gDebugger. The goal is to provide a high-level view of where time is being spent in an application.

As such, it is worth noting that (1) the LxEngine counters do add small, but appreciable overhead to a function, and (2) the profilers are always on in both debug and release builds of LxEngine. The intention is that the timers be used at a high enough level that both points (1) and (2) amount to negligible overall overhead.

On the plus-side, the profiling structures in LxEngine are simple and work correctly with recursion and multi-threading.

Local Profiling Data Structure
An easy pattern to use for the profiling code is to create a local structure for the profiling counters for a particular piece of code, file, or subsystem.

The structure likely should be global if you are concerned about global function call counts and times. If, however, you want to create per-object counters the profiler counters could be member variables (though this is not necessarily recommended, as the LxEngine profiling system is designed for a relatively small number of counters and has not been tested in cases of tens of thousands of counters, which might occur with per object counters).

Wrapping the profile counters in a structure allows for several key components to be handled:
 * The counters can be set initially to zero in the constructor
 * An initialize method can name each counter via calls to
 * The initialize method can also create correlations between counters via calls to

Initialization
The profile counters are tracked via integer variables, and the one key requirement is that these integer names be registered (i.e. the counters initialized) before they are used.

Usage
The current usage for an LxEngine profile counter is to record function run time (both inclusive and exclusive) and call counts on a per-thread basis. They can be safely used in recursive functions as well.

This is all handled by creating a scoped variable and passing in the relevant counter as an argument.
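To make the inclusive/exclusive distinction concrete, here is a minimal sketch of a scoped timer. It is not the LxEngine profiler: `Counter` and `ScopedTimer` are hypothetical names, counters are plain structs rather than registered integers, and a `Counter` in this sketch is assumed to be used from one thread only.

```cpp
#include <cassert>
#include <chrono>

// Accumulated results for one profiled region.
struct Counter {
    long long calls = 0;
    long long inclusiveNs = 0;  // time in the region, including nested regions
    long long exclusiveNs = 0;  // time in the region, excluding nested regions
    int depth = 0;              // active scopes on this counter (recursion guard)
};

// Starts timing on construction and records the result on destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(Counter& counter)
        : mCounter(counter), mParent(sCurrent), mStart(Clock::now()) {
        ++mCounter.calls;
        ++mCounter.depth;
        sCurrent = this;
    }

    ~ScopedTimer() {
        assert(mCounter.depth > 0);
        const long long elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(
                Clock::now() - mStart).count();

        mCounter.exclusiveNs += elapsed - mChildNs;
        // Only the outermost scope adds inclusive time, so recursive calls
        // on the same counter are not double-counted.
        if (--mCounter.depth == 0)
            mCounter.inclusiveNs += elapsed;

        if (mParent)
            mParent->mChildNs += elapsed;  // parent subtracts this later
        sCurrent = mParent;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    using Clock = std::chrono::steady_clock;

    Counter& mCounter;
    ScopedTimer* mParent;
    Clock::time_point mStart;
    long long mChildNs = 0;

    // Innermost active timer on the current thread.
    static thread_local ScopedTimer* sCurrent;
};

thread_local ScopedTimer* ScopedTimer::sCurrent = nullptr;
```

Usage is a single line at the top of the function or block being measured, for example `ScopedTimer timer(renderCounter);`, which keeps the instrumentation cheap to add and hard to unbalance.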

Results
At the end of an LxEngine application run, the profiling results are logged to a file in the working directory of the executable. The log is a straightforward text file.

Here's a partial view of a profile log from a run of Tutorial 4:

Profile Data
Thread 4364 ---
WorkerThread lifetime         ::      1 calls    279703ms inc   277711ms ex 279703.928 avg
WorkerThread tasks            ::      3 calls      1992ms inc     1992ms ex    664.111 avg
BlendReader open              ::      2 calls       985ms inc        0ms ex    492.715 avg
BlendReader readHeader        ::      2 calls         0ms inc        0ms ex      0.075 avg
BlendReader readBlocks        ::      2 calls       954ms inc      954ms ex    477.119 avg
BlendReader indexBlocks       ::      2 calls        30ms inc       30ms ex     15.442 avg
WorkerThread tasks            >       0.712% of WorkerThread lifetime
Thread 5344 ---
WorkerThread lifetime         ::      1 calls    279702ms inc   275654ms ex 279702.184 avg
WorkerThread tasks            ::      4 calls      4047ms inc     4047ms ex   1011.927 avg
...
WorkerThread tasks            >       1.447% of WorkerThread lifetime
Thread 6384 ---
WorkerThread lifetime         ::      1 calls    279701ms inc   276008ms ex 279701.264 avg
WorkerThread tasks            ::      4 calls      3692ms inc     3692ms ex    923.082 avg
...
WorkerThread tasks            >       1.320% of WorkerThread lifetime
Thread 6576 ---
Engine run                    ::      1 calls    279714ms inc       16ms ex 279714.158 avg
Engine runUpdate              ::  16286 calls      3687ms inc      159ms ex      0.226 avg
Engine runLoop                ::  16286 calls    279697ms inc     5997ms ex     17.174 avg
Document updateRun            ::  48858 calls      3527ms inc     3527ms ex      0.072 avg
Renderer render               ::  16292 calls    266741ms inc     1027ms ex     16.373 avg
Renderer update               ::  16286 calls       644ms inc      644ms ex      0.040 avg
CanvasGL impRedraw            ::  16292 calls    268360ms inc   268360ms ex     16.472 avg
Rasterizer acquireMaterial    ::     73 calls       180ms inc      178ms ex      2.473 avg
Rasterizer rasterizeList      ::  16292 calls     15854ms inc      621ms ex      0.973 avg
Rasterizer rasterizeItem      ::  16292 calls     13723ms inc     7972ms ex      0.842 avg
Rasterizer beginFrame         ::  16292 calls    249610ms inc   249610ms ex     15.321 avg
Rasterizer endFrame           ::  16292 calls       249ms inc        7ms ex      0.015 avg
Renderer render               >      95.368% of Engine runLoop
Renderer update               >       0.231% of Engine runLoop
Rasterizer rasterizeList      >       5.668% of Engine run
Rasterizer beginFrame         >    1574.409% of Rasterizer rasterizeList
Rasterizer rasterizeItem      >      86.562% of Rasterizer rasterizeList
Thread 7104 ---
WorkerThread lifetime         ::      1 calls    279703ms inc   278326ms ex 279703.863 avg
WorkerThread tasks            ::      3 calls      1377ms inc     1377ms ex    459.220 avg
...
WorkerThread tasks            >       0.493% of WorkerThread lifetime

For demonstration purposes, let's note a couple pieces of information the above profile gives us.

Worker Thread Utilization
First, let's look at the "WorkerThread lifetime" to "WorkerThread tasks" correlation. The "lifetime" counter spans the creation time of the worker thread to its termination time. The "tasks" counter covers the subsection where the thread is actively processing a task (rather than blocked waiting for one to be put in the queue).

The correlations here tell us that between 0.493% and 1.447% of each worker thread's lifetime is spent actually processing tasks. Over 98% of the time in this application run, the worker threads were inactive. At least for this run, Tutorial 4 was not a heavily multithreaded application. Of course, this should not be surprising, as the only time the worker threads are used is at initialization to load the geometry - so the longer the application runs in total, the smaller the fraction of time spent in those worker threads.

Cost of glClear
Second, let's look at the rasterization process. The "Rasterizer beginFrame" counter is taking 15.3 ms on average, whereas the entire "Rasterizer rasterizeList" call takes, by comparison, a mere 0.973 ms on average. Why is beginFrame taking so long? Well, the answer - it turns out - is the glClear call. That's a slow, but mandatory call. The real conclusion from this data is not that beginFrame is too slow, but rather that we could potentially render much more per frame without the scene rasterization necessarily becoming the relative bottleneck in the app.