Consistency in Game Development and Life

One of the biggest problems I’ve had in solo development is just staying consistent. Just sitting down and doing the work on any sort of regular basis. Inspiration is something that comes and goes, and while ideally you’ll always be inspired and striking while the iron is hot, realistically that’s just not what happens. Inspiration ebbs and flows in anything creative, but most noticeably so in anything long-form, which game development certainly is. It can take weeks of work for even the smallest projects, and far longer when you’re working solo.

That’s not to say this is a problem unique to game development. I don’t think there’s a person alive who hasn’t struggled deeply at some point in their life to do something as simple as cleaning their room. Being, and remaining, consistent is something I have always struggled with and will continue to struggle with. And that’s okay. But there are things I can do to help fix it.

Notably, just breaking things down into smaller chunks has done wonders. That’s not exactly Earth-shattering advice, I know. I’ve heard it recommended time and time again. But there’s a reason people keep saying it… it works. I implement this in game design starting at Day 0. Before I write a single line of code, or put a single pixel on a canvas, or write a single note of a piece of music, I draw up a timeline of things that need to be done and estimates of how long they’ll realistically take. You can take as long as you need on this. Having a clear and concise plan-of-attack that breaks everything down into small, achievable goals is what I consider to be one of the most important things you can do in pre-production. Think of this timeline as living beside your Game Design Document (GDD). The GDD is what you want to make, and the timeline is how you’ll make it.

It can be as simple as a Word document listing things to do and when you hope to have them done by.

(Image: a super basic example timeline.)

I typically like to have one larger timeline for the big pieces of the puzzle, and then a separate one for smaller ‘sub-goals’ inside each of the bigger ones. Take the world map as an example. I decided very early on that the playable game space would consist of a 2000 pixel by 2000 pixel square. One of the first things I did was to block off where the different parts of the world would be. A town here, a field here, etc. But when the time came to actually go in and add various details, and really draw up the game space beyond the placeholder colored squares, the task seemed almost too monumental to even begin. So I broke up the world-space. Instead of doing multiple passes on a 2000×2000 pixel canvas, I would work on a single 500×500 pixel grid square before moving on to the next. This took a task that at first seemed so daunting I wanted to put it off, and turned it into something more than manageable, with very clear daily goals.

Now, 16 grid squares is still a lot of work. But what’s important is to try to forget about the bigger picture and just focus on exactly what I’m supposed to do right now, which is a lot harder if you don’t split the task up. Today I’ll draw grids A-1 and A-2. Tomorrow I’ll draw A-3. The actual workload hasn’t really changed, but by cutting it up into bite-sized pieces, you get a daily sense of accomplishment and, most importantly, you maintain your forward momentum. When you focus too much on the big-picture stuff, it’s easy to get rapidly overwhelmed. And I think that overwhelming feeling is really the root of inconsistent work.
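The chunking above fits in a few lines of code. Here’s a rough Python sketch; the 2000-pixel canvas, 500-pixel cells, and A-1 style labels come from the post, while the function itself is just an illustration:

```python
# Split a square canvas into smaller work cells, labelled
# A-1, A-2, ... across each row, top to bottom.
def grid_cells(canvas=2000, cell=500):
    cells = []
    per_side = canvas // cell  # 2000 / 500 = 4 cells per row and column
    for row in range(per_side):
        for col in range(per_side):
            label = f"{chr(ord('A') + row)}-{col + 1}"
            # (left, top, right, bottom) pixel bounds of this cell
            bounds = (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)
            cells.append((label, bounds))
    return cells

# A handful of bite-sized daily goals instead of one giant canvas
for label, bounds in grid_cells():
    print(label, bounds)
```

Each day’s goal is then just one labelled entry from the list, with exact pixel bounds to work inside.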

You want to maintain a string of little victories along the way until one day you look up and realize all those little victories have cascaded into one really big one.

‘Animalese’ Style Text-to-Speech

I’ve been thinking a lot about text-to-speech synthesizers lately. For as long as text-based narrative has existed in games, developers have tried new ways to simulate the human voice. Doing so gives the player a sense that they’re actually listening to a person talking to them instead of just reading a wall of text. Early games utilized what an excellent Polygon video refers to as ‘beep-speech’, wherein the developer plays a long series of ‘beeps’ and ‘boops’ to simulate the cadence of a human voice. Over time this has evolved and become more complex. The Sims is a great example of using real human voices to create a fake, gibberish language, which they call Simlish. Using this, they’re able to avoid the annoying repetition of canned voice lines that would come from actually recording every line of dialog in the game, not to mention the sheer cost of hiring voice actors to read novels’ worth of dialog.

One of my favorite uses of this method is in Animal Crossing. They use what they call Animalese, which consists of a collection of recordings of real human voices pronouncing every letter of the alphabet. This audio is then re-timed, pitch-shifted, and played in real time as the text scrolls across the screen. This gives you the flexibility to alter the pitch and timing to create potentially hundreds of unique character voices from only a small set of audio files.

Writing something like this is also stupid easy.

// Only trigger a sound when the cursor crosses into a new character
if (floor(char_current) != char_last)
{
	char_last = floor(char_current);
	
	// Look up the sound asset for this letter, e.g. "sounda" for 'a'
	var letterSound = asset_get_index("sound" + string_lower(string_copy(_str, char_last, 1)));
	
	// asset_get_index() returns -1 if no matching asset exists
	if (letterSound > -1)
	{
		// Base pitch of 2.2, nudged up or down slightly for variety
		audio_sound_pitch(letterSound, 2.2 + random_range(-0.1, 0.1));
		play_sound(letterSound, 0.5, 1, false); // play_sound() is a project helper, not a GameMaker built-in
	}
}

char_current is the position of the “cursor” as it passes through the text letter by letter. Because it can move at a speed of less than 1, I’m flooring it and checking it against char_last to ensure each sound is only played once per letter. What we’re doing here is simple: for each letter of the alphabet, I’ve recorded a separate sound file. As the dialog object types the text out, we check each character to see if a corresponding sound exists. If it does, we play that sound at a base pitch of 2.2, which is then modulated up or down by a value between -0.1 and 0.1. The additional pitch modulation just helps add a bit of variety to the sound. Both the base pitch and the random pitch modulation can be changed to create unique voices.
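If it helps to see the whole idea in one place, here is a minimal Python sketch of the same technique. The "sound" + letter naming, the 2.2 base pitch, and the ±0.1 modulation mirror the GML above; returning a list of (sound, pitch) events instead of actually playing audio is just for illustration:

```python
import random

BASE_PITCH = 2.2
PITCH_JITTER = 0.1
ALPHABET = set("abcdefghijklmnopqrstuvwxyz")

def animalese_events(text):
    """Return one (sound_name, pitch) pair per letter as the text is 'typed'."""
    events = []
    for ch in text.lower():
        # Only letters have recorded samples; spaces and punctuation stay silent
        if ch in ALPHABET:
            pitch = BASE_PITCH + random.uniform(-PITCH_JITTER, PITCH_JITTER)
            events.append(("sound" + ch, pitch))
    return events

for sound, pitch in animalese_events("Hi!"):
    print(sound, round(pitch, 2))
```

A real dialog box would fire these events one at a time as each character appears, rather than all at once.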

I also wrote a very simple parser to read dialog straight from .txt files.

function dialog_parse(dialogFile)
{
	var file, outArray, i;

	// file_read() is a project helper that returns the file as an array of lines
	file = file_read(dialogFile);
	outArray[0, 0] = "";
	
	for (i = 0; i < array_length(file); i++)
	{
		// Each line looks like "[0.05]Dialog text" -- the text starts at character 7
		outArray[i, 0] = string_copy(file[i], 7, string_length(file[i]) - 6);
		// The read-speed is the four characters between the brackets
		outArray[i, 1] = real(string_copy(file[i], 2, 4));
	}
	
	return outArray;
}

Each parsed dialog file is stored in a 2-dimensional array which contains both the text and the speed at which to read it for every line of dialog in a conversation. The dialog files themselves are formatted like this, for example:

[0.05]Hello there!
[0.03]Nice weather today, huh?

Where the read-speed is listed in brackets before the line itself. And that’s really all there is to it.
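The same parsing logic is easy to express in other languages too. Here’s a rough Python equivalent; the bracketed read-speed format comes from the post, while this dialog_parse function and its file handling are just a sketch:

```python
def dialog_parse(path):
    """Return a list of (text, speed) pairs, one per line of dialog."""
    lines = []
    with open(path, encoding="utf-8") as f:
        for raw in f:
            raw = raw.rstrip("\n")
            # "[0.05]Hello there!" -> speed between the brackets, text after them
            close = raw.index("]")
            speed = float(raw[1:close])
            text = raw[close + 1:]
            lines.append((text, speed))
    return lines
```

One small design difference: finding the closing bracket with index("]") instead of assuming a fixed four-character speed makes the format slightly more forgiving, so "[0.1]" works as well as "[0.05]".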

Thanks for reading!