Yield Thought

it's not as hard as you think
formerly coderoom.wordpress.com

I recently read “The race for an artificial general intelligence: implications for public policy” at work. I don’t want to pick on this paper in particular, but there’s only so many times I can read sentences such as:

“the problem with a race for an AGI is that it may result in a poor-quality AGI that does not take the welfare of humanity into consideration”

before I can’t take it any more. This is just the paper that tipped me over the edge.

AGIs are already among us.

I promise I haven’t gone crazy after discovering one data preprocessing bug too many! I’m going to lay out some simple assumptions and show that this follows from them directly. By the end of this post you may even find you agree!

What will access to human-level AI be like?

This is a good starting point, because human-level intelligence clearly isn’t enough to recursively design smarter AIs or we’d already have done so. This lets us step away from the AI singularity dogma for a moment and think about how we would use this AGI in practice.

Let’s assume an AGI runs at real-time human-level intelligence on something like a small Google TPU v3 pod, which costs $32 / hour right now.

You can spin up a lot of these, but you can’t run an infinite number of them. For around $6b you could deploy a similar number of human-level intelligences as the CPU design industry and accomplish 3 years’ work in 1 year assuming AI doesn’t need to sleep. It might take 10 times that to train them to the same level as their human counterparts but we’ll assume someone else has done that and we can duplicate their checkpoint for free.
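
As a sanity check on these figures, here is a back-of-envelope calculation using the $32/hour price from above. The industry headcount and the 3x multiplier for round-the-clock work are illustrative assumptions, not facts from the post:

```python
# Back-of-envelope: how much human-equivalent labour does $6b buy at
# $32/hour of AGI compute? All figures are illustrative assumptions.
TPU_COST_PER_HOUR = 32            # USD, small TPU v3 pod (price from the post)
BUDGET = 6e9                      # USD
HOURS_PER_YEAR = 24 * 365         # AGIs don't sleep

total_hours = BUDGET / TPU_COST_PER_HOUR
concurrent_agis = total_hours / HOURS_PER_YEAR   # spend the budget over one year

# A human works roughly 8 hours/day, so each 24/7 AGI does ~3x a human-year
human_year_equivalents = concurrent_agis * 3

print(f"{concurrent_agis:,.0f} concurrent AGIs running for a year")
print(f"~{human_year_equivalents:,.0f} human-years of work")
```

Roughly twenty thousand tireless intelligences for a year, which is in the same ballpark as a mid-sized engineering industry.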

What did we just do here, apart from put CPU verification engineers out of work?

AGI let us spend capital ($6b) to achieve imprecisely-specified goals (improved CPU design) over time (1 year). In this brave new AI-enabled future anybody with access to capital and sufficient time can get human-level intelligences to work on their goals for them!

This would be revolutionary if it wasn’t already true. This has been true since societies agreed on the use of currency - you can pay someone money to work towards your goals and then they do that instead of e.g. growing crops to feed their family, because they can buy those instead. Human-level intelligence has already been commoditized - we call it the labour market.

Human-level AGI would allow companies to arbitrage compute against human labour, which would be massively disruptive to the labour force and as such society as a whole, but only in the same way that outsourcing and globalization already were (i.e. massively).

Anyone with access to capital can start a company, hire someone as CEO and tell them to spend that money as necessary to achieve their goals. If the CEO is a human-level AGI then they’re cheaper, because you only have to pay the TPU hours. On the other hand, they can’t work for stock or options! Either way, the opportunity to you as a capital owner is basically the same. Money, time and goals in, results out.

The whole versus the sum of its parts

Perhaps you believe that hundreds or thousands of human-level AIs working together, day and night, will accomplish things that greatly outstrip what any single human intelligence could achieve. That the effective combined intelligence of this entity will be far beyond that of any single individual?

I agree! That’s why humans work together all the time. No single human could achieve spaceflight, launch communications satellites, lay intercontinental cables across the ocean floor, design and build silicon fabs, CPUs, a mobile communications network, an iPhone and the internet and do so cheaply enough that they can afford to use it to send a video of Fluffy falling off the sofa to a group of strangers.

Companies - today mostly formed as corporations - are already a form of augmented super-human intelligence that work towards the goals specified by their owners.

We might end up with a “poor-quality AGI that does not take the welfare of humanity into consideration”

Yes, well. I think I could make the argument that we have literally billions of “poor-quality” general intelligences that do not take the welfare of humanity into consideration! They are not the biggest problem, though. The problem is that the goal-solving superintelligences of our time - particularly corporations - are generally aligned to the goals of their owners rather than to the welfare of humanity.

Those owners are, in turn, only human - so this should not come as a surprise. We are already suffering the effects of the “alignment problem”. People as individuals tend to put their own desires and families ahead of those of humanity as a whole. Some of those people have access to sufficient capital to direct huge expenditures of intelligence and labour towards their own desires and families and not towards the good of humanity as a whole.

And they do.

There is ample evidence throughout history both distant and recent that just because the individual parts are humans does not mean that an organization as a whole will show attributes such as compassion or conscience.

They do not.

AGIs are already changing the world

The promise of AGI is that you can specify a goal and provide resources and have those resources consumed to achieve that goal. This is already possible simply by employing another human intelligence. Corporations - which have legal status in many ways equivalent to a “person” - are a very successful way to commoditize this today. The legal person of a corporation can exhibit super-human intelligence and is at best aligned with its owner’s goals but not those of humanity as a whole. This is even enshrined in the principle of fiduciary responsibility to shareholders!

In every way that matters a corporation is already an artificial general intelligence. From the perspective of an owner of capital they solve the same problems - in particular, my problems and not everybody else’s.

This doesn’t let us off the hook

I wouldn’t argue that introducing a competing labour force won’t be massively disruptive. Or that, if attempted, it shouldn’t be managed by the only organizations that ostensibly represent the interests of large sections of humanity - their elected governments. I just can’t bear any more intellectual hand-wringing over the “oh but what if the AI doesn’t have humanity’s best interests at heart?” line of reasoning behind some interpretations of the “alignment problem”.

None of us have humanity’s best interests at heart. And we’re destroying ourselves over it. That’s the problem we need to solve - and time is running out.

I find it easy to agree with the many smart people and game-theoretic arguments that say it is essential for governments to regulate and tax AI as a means to ensure that it does not act against our interests.

I just feel that regulating, taxing and aligning corporations to humanity’s interests would be a better place to start.

Stefano J Attardi’s excellent blog post on using a neural network to split trendlines in his mechanical watch tracking app attracted many comments on Hacker News suggesting a simpler algorithm might be more appropriate:

If I understand correctly, the reason a CNN is used here is because we want to find the splits that a human would visually agree “looks” the best? So rather than a regression it’s more like the “line simplification” problem in graphics.

Just thought this solution seems a little overkill. Surely you can pick some error metric over the splits to optimize instead?

Stefano understood intuitively the problem he wanted to solve but couldn’t write down explicit rules for doing so. He’s trying to split trend lines and in the four cases shown below the two highlighted in red boxes are considered incorrect splits:

[image: four example trendline splits, with the two incorrect splits highlighted in red boxes]

He tried a number of classic segmented and piecewise regression algorithms first without finding a reliable solution.

To me, this looks like a hard problem to solve analytically and is a great example of an unorthodox but entirely practical application of neural networks.

But the question goes deeper than this one case. It is: if you can train a neural network to solve a problem, should you?

What is overkill, anyway?

Neural networks are easier than ever to train and deploy. Stefano’s post highlights some of the remaining gritty detail, but we’re rapidly moving towards a world in which deploying and executing a neural network is as simple and normal as linking with a library and calling a function in it.

High-performance neural network processors will soon be shipping in every new phone and laptop on the planet. We are designing machine learning directly into CPU instruction set architectures. This trend is not going away - it’s only just getting started.

Stefano’s use of neural networks here, whilst currently unconventional, is absolutely a sign of things to come. If you already have the data, the benefits of a neural network are many:

  • Improves over time without further developer effort. This fits the maxim of shipping something now and improving it rapidly. To get a better network you often just need more or better data. If you include some mechanism to collect data and feedback from users, this can even be automated.
  • Fewer critical bugs. A neural network will sometimes predict the wrong answer, but it will never introduce a segmentation fault, security vulnerability or a memory leak.
  • Predictable and scalable performance. A neural network takes the same time and memory to execute regardless of the input data, and by reducing or increasing the number of channels you can trivially trade performance against accuracy to match your needs.
  • Faster execution and lower power consumption. This is currently questionable, but will change dramatically when every chip has a machine learning processor executing multiple teraflops per watt embedded in it.
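
The channel-width dial in the "predictable and scalable performance" point is easy to quantify. A hypothetical sketch (the layer sizes are made up for illustration) of how a convolutional layer's cost scales with channel count:

```python
# Sketch: the compute and parameter cost of a conv layer scales with the
# square of channel width, giving a simple dial to trade accuracy against
# speed. Layer dimensions here are illustrative, not from any real model.
def conv_cost(c_in, c_out, k=3, h=64, w=64):
    params = c_in * c_out * k * k    # weights, ignoring bias
    flops = 2 * params * h * w       # one multiply-accumulate per weight per pixel
    return params, flops

for width in (16, 32, 64):
    params, flops = conv_cost(width, width)
    print(f"{width:3d} channels: {params:>7,} params, {flops / 1e9:.2f} GFLOPs")
```

Halving the channel count quarters both the parameters and the compute, which is the kind of smooth, predictable trade-off hand-written algorithms rarely offer.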

The time is coming when hand-coding an algorithm that a neural network could have learned directly from the data will be seen as overkill - or at best, delightfully retro.

To paraphrase the tongue-in-cheek Seventy Maxims of Effective Mercenaries: one person’s overkill is another person’s “time to reload”.

I've just found your blog, and it's inspirational to see someone who works so freely and happily. I'm currently a CS student, and I'd like to know what it is that you did to succeed, other than passing all the exams.

I got very lucky - my final year project supervisor started a company and asked me to join him. Fifteen years later we were acquired by Arm! Taking the advice of people who got lucky is like taking the advice of lottery winners. “Everyone said I was crazy but I kept on believing!”

In terms of the blog: the best writing advice I ever got was to spend ten minutes every day writing non-stop. It doesn’t matter if you write “I don’t know what to write I don’t know what to write” over and over, just get into the habit. I did not follow this advice, which is probably why there are so few blog posts. YMMV!

My advice for a CS student today: make sure you are training neural networks (Keras is the only framework I recommend). A lot of interesting problems will be solved with AI soon.

Do you still work from a mobile device after your new laptop has arrived? I think about going iPad for coding on the go, Mac Mini at the desk. Can't afford to waste money on a laptop which I only use 5--10% of the time, but don't know if my work will be doable.

These days I work on a company MacBook Pro - I don’t have the flexibility to do work on a Linode outside the company firewall now I’m part of a megacorp! I’m also playing around with mobile prototype boards and various Raspberry Pi versions, so the mobile iPad life doesn’t fit my day-to-day work very well.

Hey I wanted to give this a whirl myself and wondered what you did security-wise? I was using plain ssh, but my sysops friend recommended setting up a vpn so I can firewall every other port.

I use UFW to firewall all the ports on my Linode and redirect SSH to a specific non-standard port. I also configured a service (I forget which) to blacklist repeated login / port-scan attempts. Hasn’t been a problem so far!

Recently at work I trained a neural network on a supercomputer that took just 3.9 minutes to learn to beat Atari Pong from pixels.

Several people have asked for a step-by-step tutorial on this and one of those is on the way. But before that I wanted to write something else: I wanted to write about everything that didn’t work out along the way.

Most of the posts and papers I read about deep learning make their author look like an inspired genius, knowing exactly how to interpret the results and move forward to their inevitable success. I can’t rule out that everyone else in this field actually is an inspired genius! But my experience was anything but smooth sailing, and I’d love to share what trying to achieve even a result as modest as this one was actually like.

The step-by-step tutorial will follow when I get back from holiday and can tidy up the source and stick it on Github for your pleasure.

Part One: Optimism and Repositories

It begins as all good things do: with a day full of optimism. The sun is bright, the sea breeze ruffles my hair playfully and the world is full of sweetness and so forth. I’ve just read DeepMind’s superb A3C paper and am full of optimism that I can take their work (which produces better reinforcement learning results by using multiple concurrent workers) and run it at supercomputer scales.

A quick search shows a satisfying range of projects that have implemented this work in various deep learning frameworks - from Torch to TensorFlow to Keras.

My plan: download one, run it using multiple threads on one machine, try it on some special machines with hundreds of cores, then parallelize it to use multiple machines. Simple!

Part Two: I Don’t Know Why It Doesn’t Work

A monk once told me that each of us is broken in their own special way - we are all beautiful yet flawed and finding others who accept us as we are is the greatest joy in this life. Well, GitHub projects are just like that.

Every single one I tried was broken in its own special way.

That’s not entirely fair. They were perhaps fine, but for my purposes unpredictably and frequently frustratingly unsuitable. Also sometimes just broken. Perhaps some examples will show you what I mean!

My favourite implementation was undoubtedly Kaixhin’s based on Torch. This one actually reimplements specific papers with hyperparameters provided by the authors! That level of attention to detail is both impressive and necessary, as we shall see later.

Getting this and an optimized Torch up and running was blessedly straightforward. When it came to running it on more cores I went to one of our Xeon Phi KNL machines with over 200 cores. Surely this would be perfect, I thought!

Single thread performance was abysmal, but after installing Intel’s optimized Torch and Numpy distributions I figured that was as good as I would get and started trying to scale up. This worked well up to a point. That point was when storing the arrays pushed Lua above 1GB memory.

Apparently on 64-bit machines Lua has a 1 GB memory limit. I’m not sure why anyone thinks this is an acceptable state of affairs, but the workarounds did not seem like a fruitful avenue to pursue versus trying another implementation.

I found a TensorFlow implementation that already allowed you to run multiple distributed TensorFlow instances! Has someone solved this already, I thought? Oh, sweet summer child, how little I knew of the joys that awaited me.

The existing multi-GPU implementation blocked an entire GPU for the master instance apparently unnecessarily (I was able to eliminate this by making CUDA devices invisible to the master). TensorFlow itself would try to use all the cores simultaneously, competing with any other instances on the same physical node. Limiting inter- and intra-op parallelism seemed to have no effect on this. Incidentally, profiling showed TensorFlow spending a huge amount of time copying, serializing and deserializing data for transfer. This didn’t seem like a great start either.

I found another A3C implementation that ran in Keras/Theano which didn’t have these issues and left it running at the default settings for 24 hours.

It didn’t learn a damn thing.

This is the most perplexing part of training neural networks - there are so few tools to gain insight into why a network fails to converge on even a poor solution. It could be any of:

  • Hyperparameters need to be more carefully tuned (algorithms can be rather sensitive even within similar domains).
  • Initialization is incorrect and weights are dropping to zero (vanishing gradient problem) or becoming unstable (exploding gradient problem) - these at least you can check by producing images of the weights and staring hard at them, like astrologers seeking meaning in the stars.
  • The input is not preprocessed, normalized or augmented enough. Or it’s too much of one of those things.
  • The problem you’re trying to solve simply isn’t amenable to training by gradient descent.
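
Staring at weight images works, but a cheap numerical check catches the vanishing/exploding cases too. A minimal sketch (the function name and tolerances are my own, not from any framework):

```python
import numpy as np

# Sketch: per-layer weight statistics that flag vanishing or exploding
# weights numerically. weight_health and its tolerances are hypothetical
# illustrations, not part of any real library.
def weight_health(layers, dead_tol=1e-6, explode_tol=1e3):
    report = {}
    for name, w in layers.items():
        w = np.asarray(w, dtype=float)
        stats = {
            "mean_abs": float(np.abs(w).mean()),
            "max_abs": float(np.abs(w).max()),
            "frac_dead": float((np.abs(w) < dead_tol).mean()),
        }
        if stats["mean_abs"] < dead_tol:
            stats["warning"] = "vanishing?"
        elif stats["max_abs"] > explode_tol:
            stats["warning"] = "exploding?"
        report[name] = stats
    return report

# Example: one plausibly healthy layer, one collapsed to zero
layers = {"fc1": np.random.randn(128, 64) * 0.05, "fc2": np.zeros((64, 10))}
for name, stats in weight_health(layers).items():
    print(name, stats)
```

Run something like this every few thousand steps and you at least know which of the failure modes above to investigate first.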

Honestly, at the moment, training a neural network to do something even slightly novel feels rather like feeding a stack of punch cards to a mechanical behemoth from the twentieth century and waiting several hours to see whether or not it goes boink.

Part Three: Rules of Thumb

At times I felt like a very poor reinforcement learning algorithm randomly casting about in the hope of getting some kind of reward at all and paying no attention to the gradient. After realizing the irony of this position I became a little more systematic. If you face the same situation, these rules of thumb might help you too:

  1. Set some expectations for what success or failure will look like that you can test rapidly. If the papers show some learning after 100k steps then run to 100k steps and check your network has made progress. The shorter this cycle the better, for obvious reasons. Remember: staring at a stream of fluctuating error rates and willing them to decrease is the first step towards madness…
  2. Every failure is an opportunity to learn. Ask why these hyperparameters or this network architecture or dataset did not show convergence. How could you disprove that theory? This can be slow, painstaking work sometimes, but I learned a lot. It really, really helps to do this on a network that you already know can work at least once. Play with all the settings and find the points at which it does not and see what those failure modes look like.
  3. Start with a very simple, direct model and get it to show some level of learning, however slight. Then build up gradually from there. This is the single most important piece of advice I ever received.
  4. Be prepared to revisit papers and lecture notes as frequently as you need to to make sure you have a decent mathematical intuition of what is happening in your network and why it might (or might not) converge. It’s not black magic and understanding the principles can make a big difference to your approach.
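
Rule 1 can be made mechanical. A minimal sketch of a fail-fast training wrapper, assuming you have some `step_fn`/`eval_fn` of your own (the names and thresholds here are placeholders, not from any real framework):

```python
# Sketch of rule 1: decide up front what progress should look like and
# abort early if training hasn't reached it. step_fn, eval_fn, check_at
# and min_score are hypothetical placeholders - take the real values
# from the paper you're reproducing.
def train_with_checkpoint(step_fn, eval_fn, check_at=100_000, min_score=-20.5):
    """step_fn() performs one training step; eval_fn() returns the current score."""
    for _ in range(check_at):
        step_fn()
    score = eval_fn()
    if score < min_score:
        raise RuntimeError(
            f"No learning by step {check_at}: score {score} < {min_score}. "
            "Revisit hyperparameters before burning more compute."
        )
    return score
```

The point is to turn "stare at the error rate and hope" into an automatic pass/fail decision you can run overnight without going mad.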

Part Four: Back to Basics

Following this advice led me to Karpathy’s wonderful 130-line python+numpy policy gradient example. This was the first time I ran a piece of code that actually showed learning right off the bat across a range of systems. You can follow what happened next in my more detailed blog posts if you haven’t already.

The TL;DR is that I added MPI parallelization and scaled it up on a local machine, then in the cloud, then on a supercomputer. At times it looked like it wouldn’t scale well but this was often because of incidental details I was able to overcome rather than hitting fundamental scaling limits.
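
The MPI plumbing aside, the heart of that parallelization is just averaging per-worker gradients before every update - which is what an allreduce does. A single-process numpy simulation of the idea (this is my own illustration, not the actual code from those posts):

```python
import numpy as np

# Sketch (not the post's actual code): the core of data-parallel policy
# gradients is averaging gradient estimates across workers, which is what
# MPI's allreduce-then-divide accomplishes. Here W workers are simulated
# in one process with a stand-in gradient function.
rng = np.random.default_rng(0)

def worker_gradient(w):
    # Stand-in for one worker's policy-gradient estimate from its own episodes
    return rng.normal(size=w.shape)

weights = np.zeros(10)
n_workers = 8
grads = [worker_gradient(weights) for _ in range(n_workers)]
avg_grad = np.mean(grads, axis=0)   # the allreduce-then-divide step
weights = weights + 0.01 * avg_grad # every worker applies the same update
```

Because every worker ends each round with identical weights, the workers never drift apart and no parameter server is needed.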

Part Five: I Don’t Know Why It Does Work

In the blog I refer briefly to parallel policy gradients working so well because it reduces the variance of the score function estimate. This is worked into the text in an offhand, casual manner designed to make me look like an inspired genius.

What actually happened when the first parallel implementations started converging within hours instead of days is that I was somewhat shocked. My previous experience with using data-level parallelism (in which you split the learning batch across multiple machines and train them all concurrently) had taught me that you quickly reach the point of diminishing returns by adding more parallel learners.

The problem I’d seen in supervised learning was that by doing so you’re increasing the batch size, and extremely large batches don’t converge as quickly as small ones. The literature suggests you can compensate by increasing the learning rate to a certain extent, but I didn’t know of anyone using batch sizes larger than a couple of thousand items.

In this case the effective batch size was already thousands of frame/action pairs on a single process. I didn’t expect to get a lot of mileage out of increasing that by several orders of magnitude and when I did I rather wondered why.

It was only after revisiting the policy gradient theorem that it became clear - each reward received is taken as an unbiased random sample of the expected score function. As long as this sampling is unbiased, the policy gradient method will eventually converge, but such random sampling is extremely noisy and has a very high variance. Most of the more elaborate policy learning methods attempt to minimize this in a variety of ways but simply taking lots and lots of samples is a very scalable and simple way to directly reduce the variance of the estimate too.
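
The variance argument is easy to demonstrate numerically: averaging N unbiased samples shrinks the standard deviation of the estimate by a factor of sqrt(N). A toy sketch (the "true value" and noise level are arbitrary stand-ins for episode returns):

```python
import numpy as np

# Sketch: each episode's return is an unbiased but very noisy sample of the
# score function; averaging n samples cuts the estimate's standard deviation
# by sqrt(n). The values below are arbitrary illustrations.
rng = np.random.default_rng(42)
true_value, noise = 1.0, 10.0

def estimate(n_samples):
    samples = true_value + noise * rng.normal(size=n_samples)
    return samples.mean()

for n in (1, 100, 10_000):
    ests = np.array([estimate(n) for _ in range(1000)])
    print(f"n={n:>6}: std of estimate = {ests.std():.3f}")  # ~ noise / sqrt(n)
```

Scaling the per-update sample count by several orders of magnitude is crude, but it attacks exactly the quantity that makes vanilla policy gradients slow.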

In fact by running several thousand games per batch the model only requires 70 weight updates to go from completely random behaviour to beating Atari Pong from pixels. I confess to a certain curiosity as to how low this can go. Pong is not a deep or complex game. Learning a winning strategy in a single weight update would be kinda neat!

Part Six: Finally Something I Can Do

Having seen that the approach was working, actually optimizing this and running it at extreme scale was the only straightforward part of this entire process. This is something I know how to do, and something for which there are very good tools to measure and improve parallel performance. I rather suspect that deep learning at scale will need a similarly large investment in tools for model correctness and debugging before it becomes widely accessible.

Part Seven: Is It Just Me?

So that’s the background to the story - dozens of dead ends, desperate and frustrated rereading of source code and papers to discover sigmoid activation functions paired with mean squared error costs, single frames passed as input instead of a sequence or difference frame, and all manner of other sins before finally managing to get something working.

I’d love to hear from others who have tried and failed or succeeded. Did your story mirror mine with similar amounts of desperation, persistence, surprise and luck? Have you found a sound method for exploring new use cases that I should be using and sharing?

Deep learning is the wild west right now - stories of exciting progress all around but very little hard and fast support to help you on your way. This short yet honest making-of is my attempt to change that.

Happy training!

Man, this is a good cup of tea. Feels good, letting the warmth soak through my hands. You know the best thing about tea? When fresh it’s too hot to drink. You’re forced to be still for a moment, enjoy the anticipation.

Here, pour yourself a cup and pull up a chair - this is the perfect moment to let our minds go wandering.

Have you had enough about AlphaGo and Lee Sedol? Yeah, it’s been everywhere. I like writing AIs for games as much as the next guy - well, probably more. That stuff is like crack! Used to play Go, too. Mankind against his own creation! What a story!

No I don’t worry about the Rise of the Machines too much. I think the future will be more interesting than that - it puts me in mind of the whole oil thing.

Yeah, oil. Listen - mmmaaah, that first sip is the best, don’t you think? Anyway, where was I? Oh yes: oil is like artificial muscle.

On Artificial Intelligence and Oil

Thousands of years ago all we had was muscle power. So when someone started to become wealthy and powerful they could only get more work done by getting other people (or animals) to do it for them. You want to build a granary? Better be able to pay at least enough food for the workers who will build it for you. Oh, and for the hired muscle to assert that it’s yours. And so this concentration of wealth you created, earned or stole is parceled out to some extent amongst those around you.

Want to rule a kingdom? Build a cathedral? Live like a prince? You’re going to need to pay for a lot of people, because people are the fundamental unit of work. Yes, you’re right - I think Adam Smith did write something like that. Even slaves were paid with food and basic shelter.

So when the industrial revolution comes around things start getting crazy, because we have oil! Artificial muscle power! Suddenly we can do more work with fewer people. A lot more!

We need not dilute our wealth amongst so many other people when employing it to the same effect. And the longer the revolution goes on, the more pronounced the effect becomes! A couple of hundred years later and a roomful of people can create a billion dollar business from almost nothing - isn’t that amazing?

Wow, empty already. I always reach the bottom of the cup sooner than I expect. Oh yes thank-you, I will have another. Milk, no sugar please. Oh go on then, just one.

As Gods amongst Men

Mmmh, thanks. Anyway, I think we’re standing at the cusp of another industrial revolution. We don’t need to wait for the mythical strong AI and the singularity or whatever. Even weak AI is just the perfect complement to oil! Oil allows the wealthy to employ artificial muscle. AI allows them to employ artificial intelligence to control that artificial muscle!

See, when AI can direct machines to extract oil, produce goods, repair and replace other machines - and you must agree this is ultimately within our grasp, right? When we do this, for the first time in history we can employ our wealth without giving a single cent to another human being! Isn’t that a crazy thought? But it’s the only logical consequence!

It won’t happen overnight, of course. The replacement will be gradual. An ever-shrinking elite of the wealthy and powerful and a circle of others around them fulfilling our wishes. But ultimately we won’t even need other humans to protect our wealth. Mere humans won’t pose a threat to our autonomous guards and drones!

We lucky few who have the most resources when the revolution comes will become like gods amongst men - capable of anything, the world bending itself to our every whim. The rest of the human race will struggle and squabble over whatever resources we leave out of pity or compassion, or that are simply not worth claiming. To them our capricious whims and wars will be as incomprehensible as they are dangerous, and their opinions won’t matter a damn.

We won’t need to overthrow governments, they’ll collapse by themselves as the tax revenue from the increasingly impoverished and unemployed masses disappears, our wealth safely beyond their desperate grasp.

We can reestablish the feudal system if we like, allowing the lesser peoples to pay rent on our properties whilst working the land in occupations that we find… suitable. We’ll never need them of course, but it’s so fashionable to have an estate with a few surrounding villages, don’t you think darling? The admiration of the people! I have mine stand aside and cheer whenever I pass, it always brightens one up after a good day’s hunt.

But Seriously

Oh, was that really the last of the tea? Back to the grindstone, I guess. Places to go, things to do.

What? No, no, don’t worry. I wasn’t really serious about all that. There’s no chance at all that you or I will be one of the wealthy elite! No, our only hope will be to join the hacker insurrection…

Anyway, I really do have to get back to work - these deep learning models aren’t going to train themselves, you know!

This little Nosgoth strategy and tactics guide offers a way to improve at the game quickly enough to enjoy the surprising depth of play at mid and high levels.

Update: Nosgoth is now in Open Beta - if you’re not playing yet then you should! You can start with a free booster using my friend referral link - see you in the game!

As a new player Nosgoth is pretty punishing. I still remember the confusion and slaughter of my very first game and perhaps you still remember yours.

Rather than hand out specific yet somehow unhelpful tips, such as “learn to time the charge duration on your warbow so the shot is ready when you need it”, I’ve chosen to write about a method for improving efficiently. It will help your first 100-200 hours of play up to level 40; after that you will understand more than enough to choose your own onward path.

The guide is divided into three sections:

  1. Basic advice and skills - aiming, listening, dying
  2. Playing to learn - set round goals, flow of play, learn from the masters
  3. Mid-level play - class counters, yomi, teamwork

Basic Advice and Skills

This guide assumes you already know the basics of play - what zoom does, how to dodge and so on. There are plenty of other guides that cover those kinds of basics. Instead I want to call out three things about Nosgoth that reward closer attention.

Aiming

Depending on your choice of class Nosgoth can be enjoyed without perfect twitch skills, but being able to aim accurately under duress will always make a big difference. If you play a lot of FPS shooters feel free to skip this short introduction.

If you don’t play a lot of FPS shooters and don’t know what inches/360 means then you need to know only one thing:

Your mouse sensitivity is too high.

Let’s fix that now before you get used to it. Edit My Games\Nosgoth\config\BCMPUserProfile.ini and make these changes:

  1. Change bMouseSmoothing=True to bMouseSmoothing=False
  2. Change LookSensitivity=whatever to LookSensitivity=1
  3. Change MouseSensitivity=25 to MouseSensitivity=15

Make changes #1 and #2 in BCMPUserProfile.ini too for good measure. I actually play with MouseSensitivity=9, but 15 is a good starting value if you’re used to oversensitive mouse movement. Once you’re used to it, try reducing it by 2 or so each time and notice how much easier aiming becomes!

You can read up what all these do on this forum post, but these values are fine to start playing with. Also turn off “improve pointer accuracy” in Windows’ mouse settings while you’re at it and put the sensitivity slider onto 6/10. Direct movement is easier to learn.

On the subject of aiming: many people find it more accurate to hold the mouse with their whole hand and move the arm/wrist to aim. Try this too instead of pushing the mouse around with your fingers!

Much of early Nosgoth play as a human is long periods of tension followed by extremely intense periods of action. This is a difficult environment to get used to aiming in.

Instead, spend a couple of minutes per session in the tutorial. Run through to the first health station then turn and find a marker on the wall behind the flags. Hover your crosshair over a flag, then rapidly move it to the marker, shoot it and move it back. Repeat this with many different flags, ranges, zoomed and unzoomed, moving and stationary until your 120 ammo runs out, then exit the tutorial and start a match.

After a few sessions you should find it quite easy to keep a vampire in your crosshairs during short range and melee combat. If the action seems too confusing, check your frame rate by going to the video settings during a match. Anything below 30fps will be a problem; turn every graphics option to low or off, reduce your resolution and if you are playing on a laptop check the right graphics card is used and your CPU’s turbo boost is disabled. While you’re changing graphics settings you might as well increase your gamma, too.

As a vampire, rapid aiming is less important, but do use dodge-melee-dodge exclusively for movement on the ground in your first few matches: it is faster and less predictable than running and is worth making second nature. Also force yourself to use only charged attacks for a couple of rounds to get a feeling for how long they take to reach full power and how far and fast you move. You can actually turn corners while diving forwards like this!

Listening

The soundscape of Nosgoth is more revealing and important than in most other games. Wear headphones and turn the volume up enough to hear footsteps clearly. Often an audio cue is the only thing that separates life from death, so learn to recognize the distinctive sounds each vampire makes before they attack.

When you unexpectedly die, take a moment to recall the sounds you heard leading up to that moment.

Dying

Every time you die, which will be a lot, a timeout of ten seconds or so gives you a natural moment to pause for reflection. Cultivate the habit of using this time well. Often, particularly at the start, you’ll feel you were killed unfairly, with no chance to do anything. Maybe you were rammed and pounded by two Tyrants and unable to move or react at all until pasted across the floor. Instead of raging at “lame tactics”, reflect on the life choices that led you to this moment.

There are no lame tactics in Nosgoth. Everything can be countered. Nosgoth is a high-skill game. A good player will reliably destroy a worse player. Random chance and luck play a much smaller part than it sometimes feels at first.

Often the lesson to learn from a death is Do Not Let This Happen. By the time the first Tyrant hit you it was all over, but the audio cue for a stampeding Tyrant is very distinctive and gives plenty of opportunity to dodge out of the way. Were you standing in a particularly charge-worthy place, too close to your teammates or trapped in a tight space? Were you even aware that two vampires were playing Tyrant?

Improving at Nosgoth efficiently means asking these questions every time and trying different behaviour as a result.

The hardest thing to do is not to blame your team. You will play most of your low and mid level games in public servers with random groups of inexperienced players. You cannot expect your team to always have your back, but you can learn to make it easier for them by standing where two or three of them can clearly see you, for example.

Always remember you are not playing to win yet, you are playing to learn. Dying is an important part of that. Quietly give thanks to your opponent for showing you a way to improve your play.

Playing to Learn

The greatest barrier to improving early on in Nosgoth is trying to win. Whether you want the highest individual score or for your team to win every match, these goals are largely out of your control in a randomly-filled public server.

We are not playing to win, yet. First, we are playing to learn.

Set Round Goals

Spend the time waiting for a match to start contemplating what you want to learn and improve in each round of this match. Pick a goal that is as much in your control as possible. For example, perhaps you are learning to play as a Reaver. Equip Savage Pounce and make your goal to perform four Savage Pounces and escapes in a row without dying.

This has two benefits. Firstly, you can be entirely satisfied and happy at the end of a losing round in which two teammates rage-quit, because you had the opportunity to work on more stealthy and surprising approaches. Secondly, it forces you to develop a whole range of skills and ways of understanding the game and level environment that you otherwise wouldn’t have.

In our previous example, to maximize pounces before death you will quickly learn to sneak carefully around the map, to select targets who, momentarily, are not being watched by their teammates, to cover your entry with a distracting smoke bomb or choking haze, and to find just the right psychological moment in your team’s attack, when the humans are fighting every man for himself. That teaches you more than continually trying to get in fast and LMB-spam inexperienced Alchemists ever would.

When the round ends consider your personal goal score, not the round score. Did you achieve your goal? Did your approach work? Noscam is a great way to review rounds that just didn’t work out. Often although you felt “unlucky” during the game a careful review from your opponent’s perspective will highlight how their style of play made things particularly difficult for you. Again, our opponents are teaching us. What friends we have in them!

The Flow of Play

If your match goal is WHAT you want to achieve, the flow of play is HOW you want to achieve it. This is best explained by example. Let’s take a flow of play for learning how to play as a Sentinel:

Goal: Kidnap, finish and drink from four humans in a row without dying.

Flow of play:

  1. Locate the humans
  2. Watch them, choose a target and a time to attack and an ideal place to drop them
  3. Dive in, pick up and get out with minimal damage
  4. Land and finish your target
  5. Safely drink from them, if possible, then repeat

You wouldn’t play a competitive match with a script like this, but breaking down the messy complexity of a round into distinct tasks and phases is an incredibly powerful way to learn faster.

By playing with this plan in your mind, you will begin the game fully focused on the best way to find the humans without getting sniped out of the sky. This leads to checking which classes they have, learning the common spots for humans to gather and the best approaches and observation areas. There’s no excuse for taking damage in this phase and you will quickly learn to accomplish it without being seen or shot at all.

Here’s another example that I used when learning how to snipe as the Scout:

Goal: Snipe four vampires with the warbow without dying.

Flow of play:

  1. Choose a good sniping spot close to my team’s location and defend it with a trap
  2. Watch for opportunistic targets and take early shots
  3. Move to a more hidden spot further from my team where I can cover them just before the vampires arrive, perhaps using Camouflage
  4. Snipe vampires in the melee, disrupting Tyrant jumps, Reaver pounces and Sentinel dives
  5. Either finish the fight quickly with LMB spam or move under cover to another vantage spot before too many vampires decide to come looking for me
  6. Heal safely and repeat

This flow taught me so much about the strengths and limitations of Camouflage vs just standing in a bush or dark doorway, about timing my draw, about vampire psychology and the right moment to choose a new spot (protip: in an extended fight, roughly 15 seconds after my first kill a vengeful respawned vampire will come to the spot I killed them from. A different flow includes “Find a spot overlooking the first one and kill them again for bonus points”).

You won’t use the same flow forever, only until you feel you have mastered it. New ideas and adjustments will come all the time and it’s good to try these out! Just remember to separate how well you played from how well-suited your flow was to this match when looking at the results. Understanding this difference is the first step towards mid-level strategic play.

Learn from the Masters

Watching high-level teams battling it out with each other in tournaments is fascinating, but most of what is happening and why will be opaque to you during your early levels.

A more efficient way to learn from experienced players is by watching their twitch streams and reading their guides. I learned everything I know about dealing death from above from Warmonic’s excellent guide and videos and the guides from high-ranked EU team Dead Sun.

When an experienced player talks about their understanding of the game, pay attention not just to what they say but to the way they are thinking about the game. A great example is Saturnity’s reply to a question about “stupid” stunlock:

Your stunlocking point directly ties in with spacing and dodging. High level players can roll out of the way of every vampire and potentially every human CC in the game.

Sent pickup gives audio cues you can use to dodge without even looking at the guy. Good tyrants do a wiggle charge, but you can predict how he’s going to try to swerve into you and get out of the way.

Think yomi. Reavers are the hardest to dodge but you have multiple ways to stop a pounce. You can shoot them down for fall damage, CC them out of the air, or just roll.

You can out-think humans when it comes to dodging their CCs. Did you just charge a melee attack through a human? Did they just roll backwards in a fight? Are they waiting for you to use a gap closer? In any of those situations, they’ll immediately want to hit you with a CC.

Either space+aim your attack so you slide past them or roll through them afterwards. If the human will predict that and aim the CC behind them, you can counter their counter by sneaking in a melee attack before dodging since they’ll wait a sec for the dodge. More yomi.

Tyrant grab can be interrupted by your teammates, use good positioning and spacing and don’t let them grab you at bad times. Don’t stand near a corner if you predict a tyrant will attack from there. He’ll grab you. Get away from jump landings or cancel the jump by airshotting with a CC move. He could grab you if he lands on you and your team is busy.

One of the top players, chriZor, has uploaded videos of the competitive ESL matches with German commentary. If you can understand the language, watch them and listen to his advice! If you can’t, watch them anyway and see how top players use your favourite class effectively.

Attempting to emulate a master has been a tried and tested method of learning for millennia, and with videos and Noscam we have more opportunities for this than ever. Use them!

Mid-level play

Class Counters

Players in the level 10-30 range often get the idea that class stacking (e.g. 4 Tyrants or 4 Scouts) is overpowered and unfair.

This is not true.

Competitive teams play with a mixture of classes because it is more effective. 4-stacks are rather rare in competitive play - so far in 9 ESL tournaments I’ve only seen three matches in which one team used a 4x class stack. The second half of this match is an example of a 4x Reaver team being taken apart by 1x Scout, 2x Hunter, 1x Prophet.

So why does class stacking in Nosgoth feel overpowered to relatively new players? There are two reasons.

Firstly, stacking classes is the simplest kind of team-level tactic to implement in a pub. “Hey let’s all go Tyrant and keep charging them lol” is all it takes. The hard counter to that would be something like a team of Alchemists, but low level players don’t necessarily have all the classes available to them or aren’t willing to change.

Secondly, matchmaking sometimes puts a much better team against a much worse one. In this case the better team will often just mess around. A fun way to mess around is, for example, to stack Sentinels and try to catch each other’s Kidnap drops. Such a team would get slaughtered by decent Scouts or Hunters, but they don’t care because they’ve quickly realised they’re all better than you anyway. They’d beat you even more soundly if they were playing properly.

So when you see a stacked team, embrace the challenge. Switch to a counter class and encourage your team mates to do so too. Learn the strengths and weaknesses of your opponents.

Here’s a rough-and-ready guide to class-based counters:

4x Hunter: mixture of classes - e.g. 1x Tyrant for CC, 2x Reaver, 1x Deceiver or Scout. Infect on the Deceiver can be fun because new players will tend to bunch far too close together as humans.

4x Prophet: also a mixture of classes; I’d field 1x or 2x Sentinels as the Prophet lacks any good way to get them out of the air (but beware of their accurate and powerful heavy pistols).

4x Scout: 4x Deceiver. Scouts hate deceivers. It’s easy to get in close to them without being hit by a charged shot but watch the ground for traps and be ready to disengage if they all throw their volleys down.

4x Alchemist: 4x Sentinel. Sentinel is such a hard counter to Alchemist that it isn’t even funny. Enjoy soaring through the skies with no danger of being shot at all. Hover above and throw air strikes to your heart’s content - take your time and try to stick them to a player, it’s hilarious.

4x Sentinel: Scouts and Hunters. It’s so easy to knock a careless Sentinel out of the sky with a warbow and on the ground they’re basically dead. Hunters can also practise their bola aiming.

4x Deceiver: a mixture of Hunters, Prophets and Alchemists. If the Deceivers are using illusions fire explosive shot to dispel them immediately. If they’re using shroud, the Prophet’s banish/hex shot will light them up like a Christmas tree for your team to focus fire.

4x Tyrant: at least 2x Alchemists and whatever else you fancy. If the Tyrants are using Leap, 1x or 2x Scouts with warbow will change their minds very quickly. It’s so easy to shoot them out of the sky, land them in a volley and then stun them with knives to keep them in it.

4x Reaver: a mixture of classes. I’d want to include at least 1x Scout with volley to create a protected area for the rest of the team to stand in and fight.

All of which leads us nicely on to the next section: yomi.

Yomi

David Sirlin has written such a good explanation of yomi that I won’t repeat it here. Go and read it now, then come back.

As you hit the 20s you will have a good feeling for your favourite classes and flows of play. You’ll often hit the top score on your team even if that isn’t what you were trying to do. You will find, now, that you CAN play to win, using the skills and techniques you have learned so far.

But sometimes it just won’t work. Sometimes there’s that one deceiver who always gets behind you as a scout. That one scout who always snipes you as a sentinel. That one alchemist whose firewall is always exactly blocking your approach. That one prophet who hits you with such inhuman accuracy you can’t believe they didn’t know you were coming.

Yomi level 0 is playing your favourite class in your favourite way. It’s what, given random opponents, pushes you to the top of the scoreboard when you play to win. Mid- and high-level players will recognize your effectiveness and will switch to a different class, loadout or positioning deliberately to counteract you. This is yomi level 1. To continue being effective you need to watch for this and to do it yourself.

Regularly checking the opponents’ classes is one way to spot a yomi change coming. As soon as you spot it, you can react. Switch from a sentinel to a deceiver and stab those newly-spawned scouts in the back while they vainly scan the skies. Switch from a scout to an alchemist and tear that team of lumbering tyrants apart. Yomi level 2, bitches!

The best yomi moves come not from pure class changes, which are coarse and immediately visible to all attentive players, but from loadout and tactics changes. How will you play differently as a Scout, knowing that Deceiver is coming for YOU? Stand in front of a teammate? Always place a trap behind you in the bushes? Move across your team every 15 seconds? The choices here are both endless and extremely entertaining. Let your creativity soar!

As you reach higher levels you’ll start to feel yomi in individual fights as well. As a Sentinel you’ve just dropped a Hunter, landed behind him and hit him with Puncture. He 100% wants to dodge-roll away from you and then bola you in the face. So you want to dodge roll through him (vampires roll further than humans) so you’re still behind him when he stands up. But if he’s expecting that, he’ll roll through you instead and now you’re out of auto-attack range and have a bola around your waist while your snack disappears around the corner to rejoin his teammates.

Yomi. Find the most effective technique and exploit it until your opponents’ reactions to it become predictable. Then exploit that instead.

Teamwork

By this point most of the people you are playing against have a similar understanding of the flow and intricacies of a game of Nosgoth; the ever-shifting balance between the teams based on their respective health, distance, clip ammo, cooldowns and attention.

Now it is possible to play as a team. Basic teamwork such as “cover each other”, “drag a spare corpse to keep it fresh for a teammate” and “attack together” will already be well-ingrained. To go further you have to communicate. Is it better for the humans to stay in this camp or to keep moving to new health packs? Should the vampires rush the humans now while they are on the move or regroup and AoE them at the next health point? Do we need to change our classes and loadouts to counteract this team combination?

Some basic advice about team-level tactics:

  1. Humans: Positioning - stand far enough away from each other that a vampire who has killed one of you has a long and painful walk through a hail of bolts and bullets before they get into range of the next one, but not so far apart that you can be cut off from each other by scenery or smoke grenades. Watch chriZor’s ESL match videos to see good positioning at work.

  2. Humans: Defend your teammates - scouts should throw down a volley as soon as the vampires engage so the rest of their team can use it as a powerful place to stand and pick off the enemy. Hunters should use explosive shot to clear out deceiver illusions and blast reavers off teammates. Blinding shot and sunlight vial can be life savers for an outnumbered teammate and give you all an edge.

  3. Vampires: Coordinate your attack - plan in advance the order you will engage in. For example, have your sentinel throw an air strike to force their Scout to dodge and lose his drawn shot and your Reavers to lay down smoke bombs. Once the smoke hits have your Tyrant charge into at least 2 of the enemy and ground slam them to keep them stunned while the Reavers close the distance to melee range and your Sentinel swoops in to pick off the Scout before he can throw his volley down.

  4. Vampires: Focus fire - attack 2v1 or 3v1 to quickly down a human and then move onto the next one. The only exception here is the Sentinel, who will take one human out of the fight and then come back with e.g. a second air strike to help finish off the remaining humans.

Learning the depth of team-based tactics and strategies will take you way beyond the level of this guide and into clans and competitive play. You have already learned more than this little guide can teach you. Enjoy the rest of your journey towards mastery of Nosgoth!

CecilSunkure’s “How To Improve At Starcraft II 1v1 Efficiently” made a big impression on me when I first encountered it. It’s much better written and more complete than this and I heartily recommend reading it. The title pays homage to that great work; thank-you, Randy.

Also thanks to RazielWarmonic, whose Sentinel guide inspired me to write this in the first place, and to SilentVirtue for taking the time to teach me about positioning and whose competitive Sentinel play continues to provide inspiration.

Nosgoth is now in Open Beta - if you’re not playing yet then you should! You can start with a free booster using my friend referral link - see you in the game!

Writing AI to play games is a special kind of crack for me. Seeing my AI face off against other people - or better yet, AIs written by other people! It’s so… I don’t know how to describe it. I cannot resist its siren call.

I once lost several months of my life writing python-based AI for an open source game called The Battle for Wesnoth. What started as a fun way to learn python quickly turned into a several month-long obsession of unhealthy proportions. Towards the end of it I had an AI with an ego-inflating 98% win rate versus the default AI across all in-game factions and a selection of the smaller maps.

Then, on the very edge of victory, it all went horribly wrong.

Pride comes before a fall

My downfall begins on a hot summer’s night in Munich. I can’t sleep and I’m thinking about Wesnoth. I long to capture the last 2% and have an AI to share that wins every match, but I’ve exhausted all of the small tweaks and improvements I can think of.

Tonight I decide the time has come for a significant rewrite. It will put the AI on a much more strategically sound basis and practically guarantee total domination! I refill my glass, sit down and get started.

It takes several evenings to finish tidying the changes up, but finally the moment for a full run-through comes. I hit the button and send my newly revamped AI into a gauntlet of battles against its soon-to-be-vanquished opponents. But something is wrong. It’s losing a match! Now another! And another! After thirty minutes or so the final figure comes back: 72%. SEVENTY. TWO. PERCENT.

The road from 72% to 98% took me almost a month of occasional evenings the first time around. What do I do now? Revert the changes and try something else? Or keep working on them, possibly throwing my time away for nothing?

Questions whirl around in my sleep-deprived mind, taunting with their unanswerability. Are these changes fundamentally worse than the previous strategy? Or are subtle bugs causing my AI to make stupid mistakes? If I spend two more weeks fine tuning this version, will it ever overtake my previous plateau or am I wasting my time?

In one especially lucid moment I feel a sudden connection to the hopeless fate of the hill-climbing algorithm, never aware whether it is stuck in a local maximum or not, always caught between tenacious hope and the terrible, crushing despair of futility.

Charlie Brown says that the tears of adversity water the soul. I learned that a single number just isn’t enough feedback to meaningfully understand the performance of a complex system. I also developed a deep sympathy for anyone who spends their life A/B testing website changes.

A few weeks later the Wesnoth team discontinue the python AI interface and I am rehabilitated back into normal life, but the experience makes a lasting impression on me. How can we write better AI? How can we test and understand the behaviour of code that interacts with an overwhelmingly complex, stochastic environment in unpredictable ways?

Back on the wagon

The seasons change and years pass peacefully by. Purely by chance I stumble upon the Wesnoth website again and notice they have quite a decent lua-based AI subsystem in place now.

Unable to resist, I clone and compile the latest version and start learning how to use the new lua interfaces. It’s not long before I have a skeletal AI - in every sense of the word - managing a 30% win rate with lots of low-hanging improvements still unexplored. Yet I am uneasy. I can feel the same problem out there, an undefeated nemesis lying in wait.

I live by the sea now, so one evening I take my notepad and pen down to the beach, write down the problem as clearly as I can and stare helplessly at the shimmering waves, waiting for inspiration.

Despite the vicious sea wind’s tenacious attempts to strip the flesh from my bones I manage to sketch out an idea for a workflow that feels like it has potential. As it turns out - to my own surprise and joy - it’s absolutely brilliant. Sometimes the Feynman algorithm is the only one you need.

Machine learning, or is it machine teaching?

The problem I wrote down is that I’m drowning in an ocean of raw data without an easy way to understand the stories within it. Each automated match consists of hundreds of moves from a wide range of different units and to get any meaningful comparison I really need to run dozens or even hundreds of matches per change. Abstracting all of that as a single percentage throws too much information away but watching fifty replays each time I tweak a number is never going to happen.

At the beach I realized I need a machine to watch them for me.

This is how I built one.

Step one: visualize a single battle

Realizing I needed something between clicking through an entire replay and a single win/loss figure, I started looking through battle logs, picking out statistics that seemed strategically relevant.

In Wesnoth there are 5 key metrics that together give you a pretty good feeling for which way the battle is going. I logged just these over time for each battle and added some unicode sparklines for readability:

yt_simple vs ai_default_rca  |  winner: 2 
units: ▄▄▄▄▄▄▄▄▃▃▃▂▂▂ 49 53 49 45 49 49 49 46 42 38 30 21 19 18 
cost : ▃▄▄▄▄▄▄▃▃▃▂▂▂▂ 40 48 45 43 44 44 44 41 38 31 27 21 22 18 
gold : ▄▁▄▄▄▄▄▅▅▃▃▄▃▃ 49  0 57 54 52 46 54 60 69 40 38 53 41 40 
vills: ▁▁▄▄▄▅▅▅▅▄▄▃▃▃  0  0 55 52 52 63 61 59 61 55 49 30 33 33 
inc  : ▃▂▄▄▄▅▅▅▅▄▄▃▃▃ 33 24 54 52 52 61 60 58 60 55 49 33 34 34 
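A row of those sparklines takes only a few lines of Python. This is a sketch rather than the original script: it assumes a fixed 0-100 scale, while the real reports may normalize each row.

```python
# Render a list of values as a one-line unicode sparkline,
# mapping each value to one of 8 block heights.
def sparkline(values, lo=0.0, hi=100.0):
    bars = "▁▂▃▄▅▆▇█"
    return "".join(
        bars[min(len(bars) - 1, int((v - lo) / (hi - lo) * len(bars)))]
        for v in values
    )

# the 'units' row from the report above
print(sparkline([49, 53, 49, 45, 49, 49, 49, 46, 42, 38, 30, 21, 19, 18]))
```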

Even so, reading through fifty of these reports at a time isn’t very illuminating, even when they are sorted. Sorting is my favourite cheap visualization trick but here I needed something more.

Step two: the unreasonable effectiveness of normalized compression distances

Over the past year I’ve been fascinated by the usefulness of the normalized compression distance (NCD) as a similarity measure. Basically it states that the degree of similarity between two objects can be approximated by the degree to which you can better compress them by concatenating them into one object rather than compressing them individually.
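In code the measure is tiny. Here is a minimal sketch using zlib as the stand-in compressor (any real compressor works); the battle summaries are made-up examples of the report text it would be fed:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for very similar
    inputs, approaching 1 for unrelated ones."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# two near-identical battle summaries vs. an unrelated one
a = b"units: 49 53 49 45 49 49 49 46 42 38 30 21 19 18"
b = b"units: 49 53 50 44 49 48 49 46 41 38 31 21 19 18"
c = b"gold : 12 90  3  3 77  0  0  0  5 61  2  8  8  1"
```

Concatenating the two similar summaries compresses much better than concatenating the dissimilar pair, so `ncd(a, b)` comes out smaller than `ncd(a, c)` — which is all the clustering step needs.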

You can use this to detect the similarity between music, images, DNA and all sorts of arbitrary things with surprisingly good performance. It’s one of the coolest and most implausibly effective hacks I’ve come across and I thoroughly recommend reading Vitanyi, Cilibrasi and Cohen’s papers on the subject.

NCD was perfect for this problem. I built up a similarity matrix by compressing my visual output summaries then used that to cluster them into an unrooted binary tree. A small python script injects some extra details to dot’s SVG output and suddenly I am looking at this:

The nice thing about having these as SVG files is that they are mildly interactive - moving the mouse over any particular node shows the run summary in the bottom-left, as you can see for run 11 in the image above. The NCD clustering does a great job here – the clusters really do seem intuitively similar and semantically meaningful, even after deeper investigation.
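The matrix-building step can be sketched as follows. This is a toy version with made-up run summaries and only a nearest-pair lookup; the real pipeline compressed the full visual reports and rendered an unrooted binary tree with dot:

```python
import zlib
from itertools import combinations

def ncd(x: bytes, y: bytes) -> float:
    # normalized compression distance, zlib as the compressor
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    return (len(zlib.compress(x + y)) - min(cx, cy)) / max(cx, cy)

# stand-in summaries: two similar wins and one very different loss
runs = {
    "run01 (win)":  b"units 49 53 49 45 49 49 46 42 38 30",
    "run02 (win)":  b"units 49 52 49 45 48 49 46 41 38 31",
    "run03 (loss)": b"units 49 30 18  9  4  2  1  0  0  0",
}

# pairwise NCD similarity matrix over all runs
dist = {pair: ncd(runs[pair[0]], runs[pair[1]])
        for pair in combinations(runs, 2)}

# the closest pair would end up adjacent in the clustered tree
closest = min(dist, key=dist.get)
```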

Of particular interest are the battles that were lost but are clustered close to several wins. These scream “everything was fine until something unexpected happened” - a bug or unforeseen situation the AI has blindly wandered into again. Watching the replay for one of these battles almost always shows a (sometimes rarely-occurring) bug at work. Just fixing these improved my AI massively - machine-assisted debugging at its finest!

Step three: profit! My new machine-learning boosted workflow

When I sit down to improve my AI it now looks like this:

  1. Code up whichever improvement or brilliant/stupid idea I’ve been dying to try out
  2. Hit a key to run 50 battles in parallel using 8 cores, cluster the results and produce a prettified SVG file
  3. Get a feel for the overall shape of the results. Are there lots of quick wins, or are matches more often drawn out? Are there interesting clusters of wins and losses?
  4. Tap on a few nodes to understand what is represented then pick one to look at in more detail, e.g. a loss surrounded by wins, or an example from the center of an interesting cluster of losses.
  5. Type the number of that node into my terminal to launch Wesnoth and view the replay of the run it represents. I can click forwards and backwards through the match, watching for mistakes both obvious and subtle in my AI’s behaviour. Sometimes I look at a couple of runs for comparison, such as a similar win/loss pair or two similar lost runs.
  6. Sometimes I can see what the AI did wrong but don’t understand why. In this case I can pull up the AI debug log for that numbered run, jump to the turn I’m looking at in the replay viewer and read in depressing detail the reason my thief decided to go toe-to-toe with a mounted Lancer in waist-deep water instead of e.g. nipping onto that nice safe village and stabbing him in the back for 2x damage.
  7. Come up with a new idea to improve the existing code and repeat from step 1.
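Step 2 is the only part of the loop that needs infrastructure. A sketch of the parallel runner follows; `make_cmd` is a placeholder (the real command launches a headless Wesnoth match and is elided here), so the stand-in just echoes a fake result line:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def make_cmd(run_id: int) -> list[str]:
    # PLACEHOLDER: the real command would invoke wesnoth headlessly
    # and write this run's log and replay under its id
    return [sys.executable, "-c", f"print('run {run_id:02d} winner: 2')"]

def run_one(run_id: int) -> str:
    # run one battle and return its one-line summary
    result = subprocess.run(make_cmd(run_id), capture_output=True, text=True)
    return result.stdout.strip()

# 50 battles, 8 at a time ("8 cores"); the summaries then
# feed the sparkline reports and NCD clustering
with ThreadPoolExecutor(max_workers=8) as pool:
    summaries = list(pool.map(run_one, range(50)))
```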

Working like this is so beautiful because there’s a smooth multi-level mapping from the complex emergent behaviour in the game all the way back to the actual lines of code that evoked it. Moving between levels of detail is so frictionless that it’s joyful in and of itself! It’s also extremely effective.

Some Promising Results

Within a week my lua AI has reached a 98% win rate and I haven’t even brought out my best tricks yet.

Every time a change decreases the win rate I’m able to quickly find the group of replays demonstrating the negative impact and follow through from there to the debug logs to the code, where I can fix it. It’s a whole new world compared to the way I worked last time, with just a single figure telling me “better” or “worse”.

I would be surprised if writing AI is the only application for a workflow like this.

A/B testing might benefit from a similar setup – take the logged or recorded traces of each visitor’s interactions with the site and cluster them so that when a test decreases signups you can look at the clustering and see which behavioural subgroup is failing to sign up, then go and watch a few sample recordings to understand why they might have changed their behaviour.

I don’t write a website analytics package any more, but if you do and you’d like to experiment with this I’d be interested in working with you. This stuff is a lot of fun!

There must be a lot of other situations that would benefit, too. Do you know of any others? Let me know, I love learning new things!

Update: I’ve added a few more links and technical details to the Hacker News discussion.

I kept wanting to check this on my iPad, so I went ahead and hacked together a cron job and a static Google AppEngine site for it. It shows the daily probability that any given team will win the World Cup, visualized over time.

Enjoy!