Jump to content
Double Fine Action Forums
Sign in to follow this  
DF Du Bois

Save system notes

Recommended Posts

edit: had to attach the .lua file as .txt to appease the forum gods.

edit 2: the attachment isn't showing up, so use this link.

Problem Statement

At the beginning of Amnesia Fortnight, Brandon and I spent a bit of time talking about some of the technical features we wanted in the game. I want to talk about the save system a bit, because not only did it pose more challenges than either of us expected, the short timeframe (and the tricky problem) required some interesting implementation decisions. I've attached a copy of the source code here.

The original goal was to allow the use of a "memory diff" tool -- this is a real world tool that detects changes to a memory range over time. Ancient games didn't have a lot of memory to work with so their memory layouts tended to be quite flat and simple. However, the memory layout of a modern, dynamic language like lua is quite complicated. Values in tables shift around all the time as the table resizes itself or experiences collisions; and nil values are not even explicitly represented.

We toyed around with some ideas for mitigating this (ie, storing all interesting values in the array-portion of a single, global lua table) but it enforces a very artificial programming style; worse, it means that the tool would only be useful for a limited set of the game's data. This violated a main design principle: the tools should not be pretend tools, and they should not operate on a pretend game. They should operate on the game itself.

We reluctantly put that idea aside, and instead focused on creating faithful checkpoints of game state that could cope with all the shenanigans that the proposed toolset would allow. The two main possibilities were:

1. To save, write out the raw process memory. To load, figure out how to trick the OS into giving you a new process whose memory you can overwrite. Recreate the state held by the OS; stitch up the references from the app to the OS resources (file system, gpu, etc). Also stitch up the references from the OS to the app.

2. To save, write out the lua object graph. To load, reconstruct the object graph. Recreate the state held by the engine (Moai); stich up the references from lua to the Moai resources (MOAIProp, MOAITexture, etc). Also stitch up the references from Moai to the app. Also preserve little bits of global state, like the timestep.

The first is crazy, resource-intensive (150MB save files?), and risky. The second is less crazy, less resource-intensive, and less risky. That was a no-brainer decision.

For "write out the lua object graph", after some initial flailing around I realized there was no way around using big guns, so we integrated the pluto library by Ben Sunshine-Hill. It works very well and as advertised, although our use case required a couple easy generalizations.

See http://www.slideshare.net/hughreynolds/lua-and-fable-jonathan-shaw-lionhead for similar use in games, although they perhaps sanely restrict its traversal somewhat.

See https://github.com/hoelzro/pluto/blob/master/README for API details.

Although SaveLoad.lua is a large file (by lua standards) it is on the surface fairly simple. lib.save() saves off the _G (globals) and package.loaded (loaded lua files) arrays, and a couple more minor things. lib.load() reconstructs them and puts them back where they belong.

There are, of course, complications.

_G

The global table (also known as _G, package.loaded._G, _G._G, etc) can't easily be replaced; it's referenced by almost everyone's fenv, might be referenced from packages that we decide (for speed and save-game-size reasons) not to persist, and so on. So instead we shallow-copy it on save(), and reload it on load(). Similarly with package.loaded.

The design of Moai

One of the strengths of Moai is its tight integration between lua and C++. Almost all of our game objects are really C++ objects with little bits extending into the lua world, like the tips of icebergs. The "surface area" between the lua universe (which we can save in a mostly-automated fashion) and the C++ universe (for which we need custom-written code) is therefore large. You can see this in the large amount of mostly-boring code at the bottom of SaveLoad.lua.

Finding all Moai objects

Not every important game object is referenced from lua. For example, we create static Box2D objects when the room is loaded but don't keep references to them. The objects must somehow be saved and reloaded, though! The key here is that Moai itself holds references, to keep them from getting garbage collected. Even better, they're stored in the ref table (see my earlier post re: the structure of a Moai object), which is pretty easily accessible from lua.

So it turns out that all we need to do is save off all the reftables we find, hoping (correctly) that they'll contain references to all the invisible-but-important Moai objects we need. For example, the Box2D world's reftable contains refs to all the fixtures and bodies created in that world. Another example is the MOAIAction tree, which contains all the "threads" (coroutines) that we spawn off. Thanks to pluto, these also get persisted -- along with their stacks, local variables, the function they're executing, its upvalues, the instruction pointer, etc. Magical!

This is subtle, but we don't even need to reconstruct the reftable. As long as objects get into the save, they'll be recreated on load. I just trust that the mere act of creating these objects is enough to get them properly linked into any appropriate runtime data structures.

Reconstructing Moai objects

In order to properly reconstruct an object, we need to be able to query it for all its relevant information -- eg, position and rotation for a Box2D object, sprite sheet for a prop, etc. Moai has a vast number of set() methods on its objects, but is missing a lot of get() methods.

To get around this without having to constantly write a bunch of boilerplate C++ all the time, we took advantage of the "closure" and "upvalue" features of lua. The utility function _add_moai_getter() reaches into a Moai class, wraps the set() method so it also saves off a "shadow" copy of the thing being set, and inserts a new get() method which fetches this "shadow" state. This probably saved hours of manual, error-prone, and bo-ring C++ work; it's an excellent example of the power offered by dynamic languages like lua.

Additionally, our work dissecting Moai objects came in handy for saving and restoring values set on Moai objects by lua code (_extract_user_fields).

Share this post


Link to post
Share on other sites

Circular references

This is the big one! And it's the trickiest to write about without delving into a lot of jargon. But it's almost winter break! So my apologies, I'm just going to lay it out.

pluto does a great job handling circularities in lua object graphs. It dives in depth-first, keeping track of objects it's seen before. Internally, it keeps a "ref table", whose keys are "things that have been persisted" and whose values are small integer values used as backreferences. (If you realize that NaN is one of two values that are forbidden by lua from being table keys, you'll also realize where the NaN bug came from)

At unpersist time, pluto does the opposite. Upon creation of each new object, it saves off a reference to the object in an integer-keyed table. When it sees a backreference, it looks up the ref and uses that value.

This algorithm properly handles general cyclic structures; but it's important to note that for it to work, the sequence of events needs to be

1. Create new object

2. Record object

3. Initialize new object (ie, fill in table data)

If you flip steps 2 and 3, an object that refers to itself (directly or indirectly) will instead get a nil during the unpersist, because the "Record object" step comes too late.

This is fine for lua, but what happens for userdatas (ie C++ objects)? The out-of-the-box (and very clever!) pluto scheme is to call a __persist metamethod on the userdata. This should return a thunk (function of no arguments) which is persisted; on unpersist, the thunk is executed and is supposed to create and initialize a userdata, which is then recorded.

You probably see the problem already: userdata reconstruction performs steps 1 and 3 at the same time; then it performs step 2.

Laid out like this, the solution is pretty obvious: allow __persist to return 2 functions: a "create" thunk, and a unary "initialize" function. That's exactly what we did, and it was both elegant and provably correct.

Circular references, pt 2

It wasn't good enough! Some objects absolutely need other objects in order to be created. For example, the only way to create a (useful) Box2D fixture is to ask a Box2D body to create it for you. And the only way to create a (useful) Box2D body is to ask a Box2D world to create it for you.

The problem is that the algorithm is only provably correct if the construct thunks have no dependencies of their own. If a constructor relies on other objects you have to redo the entire analysis. With this simple 2-stage algorithm, it turns out that if object C's constructor relies on object P, it recursively relies on everying that object P's constructor and object P's initializer rely upon. This drastically expands the possibilities for broken circular refs. For example:

Construct player's Box2D Body

..Construct room's Box2D world

..Initialize room's Box2D world

....(via cxx_refs) Construct all the bodies and fixtures in the world

......Backref to player's Box2D Body

........Boom

As a side note, this sort of analysis is why it's useful to have a solid math background. You may not necessarily ever need the real analysis or abstract algebra or whatever; but it's good training for being able to think abstractly about knotty problems.

The solution, really quickly: make sure that certain objects get persisted (and therefore unpersisted) in a "correct" order (where "correct" is determined by the previous analysis; and thankfully there is at least one correct ordering). In this case, there is a very subtle hack in Room.lua that ensures that the Box2D world always gets persisted first; then there is some code in the Box2D world __persist that ensures that bodies are always created before fixtures.

OK holy moly that was a long post! But it's perfect timing -- looks like all my Cave builds are done uploading. That means it's tiem for some Xmas vacation. See you next year, and I hope this was half as interesting to read as it was for me to write.

Share this post


Link to post
Share on other sites

As was said in the other thread a gamestate tree diff tool could be interesting. It might be able to bring back some of gameplay you were originally looking at which doesn't work well dynamic languages + garbage collection.

Could introduce it in multiple stages, since generally if you have a full path like rWorld.rInventory[2].foo then its pretty obvious what it is. Could combine it with a comment from Tim during the final day that it might be interesting to have fields labelled with unknown hex names. So the first version of viewing and comparing gamestate you would only see the backref numbers (from looking over some DEBUG_PERSIST traces I'd say that sure helps confuse reading what's there), and have to make guesses which you can try out, similar to the memory poke idea. Then later on you might get an upgraded tool that decodes those so you can tell what's what more easily.

Seems like this sort of thing might have been a bit easier in a pure dynamic language, rather then having alot of userdata objects. I haven't done much along these lines recently but I think for example Python has the pickle class, which handles a bunch of circular reference stuff out of the box. Then I know the .NET seralization pattern allows special constructors for initializing from serialized state rather than going through the normal constructor. However I'm sure sticking with the MOAI engine probably makes alot of sense for you, so probably not really relevant. And sounds like you've got good solutions to many of the issues already, just with a few workarounds (which will probably try to breed, but hey noone said this had to be easy).

Might need to spend a while looking over how you did this if I do more fiddling with the World Tree over my break, since the general lack of getters on userdata objects makes it hard to make things nice.

Looking forward to Cave too :)

Share this post


Link to post
Share on other sites
As was said in the other thread a gamestate tree diff tool could be interesting. It might be able to bring back some of gameplay you were originally looking at which doesn't work well dynamic languages + garbage collection.

We actually gave this some thought, but I don't think it's a well-posed general problem. There is not enough context in a "path" to infer identity between objects. Of course, that just rules it out as an implementation strategy for hypergeneral systems like checkpointing. By limiting the problem domain one could write a more useful tool that uses heuristics to guess at identity; there might be interesting puzzles there.

Share this post


Link to post
Share on other sites

Heh, it almost sounds like building a VM and writing a good portion of the game in that is the 'easy' solution. This is either a very good day or a very bad day depending on what problems you like solving. xD

Share this post


Link to post
Share on other sites

This is beautiful. It brings a tear to my eye. One of my favorite Christmas presents this year.

I'm going to have to look through this and check it out.

Share this post


Link to post
Share on other sites

Thanks for all the prototype details.

Can you please supply a link to the SaveLoad.lua source file?

The forum shows no attachments at all.

Thanks

Share this post


Link to post
Share on other sites

Can you please supply a link to the SaveLoad.lua source file?

The forum shows no attachments at all.

Well, well. It looks like the forum software is hiding the attachment link because only image attachments are allowed. Well, this direct link works for me.

Share this post


Link to post
Share on other sites
Sign in to follow this  

×
×
  • Create New...