Quick start: profiling node.js on linux

Inspired by Brendan Gregg’s article on flame graphs, I recently revisited node profiling on my native linux machine. For a little bit of back story, basic support in 0.10.x wasn’t pretty:

Profile captured on Node.js 0.10.33
--prof capture from 0.10.33

Node 0.11 has made marked improvement, however. JavaScript symbols are now exported to the profiler, enabling systems that don’t support DTrace (i.e., linux) to still obtain somewhat meaningful v8 profiles.

For the quick start, we’ll profile a script with several expensive IO and JSON operations and review the results using Chrome’s built-in Trace Event Profiling Tool:

// profileme.js
var fs = require('fs');

var RUNS = 1000;
var filename = './_profileme_data',

var i, j, sampleData;

function profileMeOnRead (err, result) {
  if (err) throw err;
  return JSON.parse(result);
}

function profileMeOnWrite (err, result) {
  if (err) throw err;
  fs.readFile(filename, profileMeOnRead);
}

sampleData = {};

// Generate sample data
for (i = 0; i < 26; i++) {
  sampleData[String.fromCharCode('a'.charCodeAt(0) + i)] = i;
}


for (j = 0; j < RUNS; j++) {
  fs.writeFile(filename, JSON.stringify(sampleData), profileMeOnWrite);
}

The script runs fairly quickly for a thousand passes through a small dataset.

$ node -v
v0.11.13

$ time node --prof profileme.js
real    0m0.265s
user    0m0.168s
sys     0m0.159s

But the fun part is including the --prof flag. Besides the output of the script, we also get a v8.log file dumped into the current directory.

Since v8 is v8, the Chrome developers have helpfully provided a visualizer for profile dumps. Visit chrome://tracing/ in Chrome (or Chromium), click “load”, and select the v8.log file. The results of the profile will be rendered as a timeline; selecting different calls on the timeline will load samples of the call stack into a flamegraph for closer inspection.

Profile captured on Node.js 0.11.13
--prof capture from 0.11.13

Leaving the finer details of the graph to another of Brendan’s excellent articles, the two parameters represented in the graph are stack depth (extending radially outwards) and average call time (the width about the axis of each call’s “slice”). Taller methods are deeper; methods that were called more frequently (or took longer to run) are larger segments.

But there we are! We can now generate CPU profiles for help in reviewing and tuning application performance and use Chrome’s built-in tools to access them.

Featured