I've been very excited about Node.js for a while. I finally decided to knuckle down and write a test project to learn about generators in the latest Harmony build of Node.
Here is my very simple test project:
https://github.com/kirkouimet/project-node
To run my test project, you can easily pull the files from GitHub and then run it with:
node --harmony App.js
Here's my problem: I can't seem to get Node's asynchronous fs.readdir method to run inline with generators. Other projects out there, such as Galaxy and suspend, seem to be able to do it.
Here is the block of code I need to fix. I want to be able to instantiate an object of type FileSystem and call the .list() method on it:
https://github.com/kirkouimet/project-node/blob/4c77294f42da9e078775bb84c763d4c60f21e1cc/FileSystem.js#L7-L11
FileSystem = Class.extend({
    construct: function() {
        this.currentDirectory = null;
    },
    list: function*(path) {
        var list = yield NodeFileSystem.readdir(path);
        return list;
    }
});
Do I need to do something ahead of time to convert Node's fs.readdir into a generator?
One important note: I am parsing all class functions as they are created. This lets me handle generator functions differently than normal functions:
https://github.com/kirkouimet/project-node/blob/4c77294f42da9e078775bb84c763d4c60f21e1cc/Class.js#L31-L51
I've been really stumped with this project. Would love any assistance!
Here is what I am trying to accomplish:
I've tried to implement your example function and I am running into some trouble.
list: function*(path) {
    var list = null;
    var whatDoesCoReturn = co(function*() {
        list = yield readdir(path);
        console.log(list); // This shows an array of files (good!)
        return list; // Just my guess that co should get this back; it doesn't
    })();
    console.log(whatDoesCoReturn); // This returns undefined (sad times)
    // I need to use `list` right here
    return list; // This returns as null
}
First and foremost, it is important to have a good model in your head of exactly what a generator is. A generator function is a function that returns a generator object, and that generator object will step through the `yield` statements within the generator function as you call `.next()` on it.
Given that description, you should notice that asynchronous behavior is not mentioned. Any action on a generator on its own is synchronous. You can run to the first `yield` immediately, then do a `setTimeout`, and then call `.next()` to go to the next `yield`, but it is the `setTimeout` that causes the asynchronous behavior, not the generator itself.
So let's cast this in the light of `fs.readdir`. `fs.readdir` is an async function, and using it in a generator on its own will have no effect. Let's look at your example:
function* read(path) {
    return yield fs.readdir(path);
}

var gen = read(path);
// gen is now a generator object.
var first = gen.next();
// This runs fs.readdir(path) and yields its return value, so
// first is { value: undefined, done: false }, since fs.readdir returns nothing.
var final = gen.next();
// final is { value: undefined, done: true }, because the generator returns the
// result of 'yield', and that is the value passed into .next(),
// and you are not passing anything to it.
Hopefully this makes it clearer that you are still calling `readdir` synchronously, and since you are not passing it any callback, it will probably just throw an error.
Generally this is accomplished by having the generator yield a special object that represents the result of `readdir` before the value has actually been calculated. As an (unrealistic) example, yielding a function is a simple way to yield something that represents the value.
function* read(path) {
    return yield function(callback) {
        fs.readdir(path, callback);
    };
}

var gen = read(path);
// gen is now a generator object.
var first = gen.next();
// first.value is the yielded function: function(callback){ ... }
// Trigger the callback to calculate the value here.
first.value(function(err, dir) {
    var dirData = gen.next(dir);
    // dirData.value will just be 'dir' since we are directly returning the yielded value.
    // Do whatever.
});
Really, you would want this type of logic to continue calling the generator until all of the `yield` calls are done, rather than hard-coding each call. The main thing to notice with this, though, is that now the generator itself looks synchronous, and everything outside the `read` function is completely generic.
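As a rough sketch of that idea (illustrative only; it assumes every yielded value is a function taking a node-style callback, like the one above):
// Drives a generator whose yields are all functions taking a node-style callback.
function run(generatorFunction, done) {
    var gen = generatorFunction();
    function step(err, value) {
        if (err) return done(err);            // error handling kept simple for the sketch
        var next = gen.next(value);
        if (next.done) return done(null, next.value);
        next.value(step);                     // next.value is a function(callback)
    }
    step(null, undefined);
}

// Usage with the read() generator from above ('path' assumed to be in scope):
run(function* () {
    return yield function(callback) {
        fs.readdir(path, callback);
    };
}, function(err, dir) {
    // dir is the directory listing, or err is set.
});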
You need some kind of generator wrapper function that handles this yielded-value protocol, and `suspend`, which you mentioned, does exactly this. Another example is `co`.
The standard approach for "return something representing the value" is to return a promise or a thunk, since returning a bare function like I did is kind of ugly. With the `thunkify` and `co` libraries, you would do the above without the example function:
var thunkify = require('thunkify');
var co = require('co');
var fs = require('fs');

var readdir = thunkify(fs.readdir);

co(function* () {
    // `readdir` will call the node function, and return a thunk representing the
    // directory listing, which is then `yield`ed to `co`, which will wait for the data
    // to be ready, and then it will resume the generator, passing the value
    // as the result of the `yield`.
    var dirData = yield readdir(path);
    // Do whatever.
})(function(err, result) {
    // This callback is called once the synchronous-looking generator has returned
    // or thrown an exception.
});
Your update still has some confusion. If you want your `list` function to be a generator, then you will need to use `co` outside of `list`, wherever you are calling it. Everything inside of `co` should be generator-based and everything outside `co` should be callback-based. `co` does not make `list` automatically asynchronous; `co` is used to translate generator-based async flow control into callback-based flow control, e.g.:
list: function(path, callback) {
    co(function* () {
        var list = yield readdir(path);
        // Use `list` right here.
        return list;
    })(function(err, result) {
        // err here would be set if your 'readdir' call had an error.
        // result is the return value from 'co', so it would be 'list'.
        callback(err, result);
    });
}
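A hypothetical call site would then stay purely callback-based (assuming your Class.extend produces a constructor you can instantiate):
var fileSystem = new FileSystem();
fileSystem.list('/some/path', function(err, list) {
    if (err) {
        console.error(err);
        return;
    }
    console.log(list); // array of file names
});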
@loganfsmyth already provides a great answer to your question. The goal of my answer is to help you understand how JavaScript generators actually work, as this is a very important step to using them correctly.
Generators implement a state machine, a concept which is nothing new by itself. What's new is that generators allow you to use familiar JavaScript language constructs (e.g., `for`, `if`, `try/catch`) to implement a state machine without giving up the linear code flow.
The original goal for generators is to generate a sequence of data, which has nothing to do with asynchrony. Example:
// with generator
function* sequence() {
    var i = 0;
    while (i < 10)
        yield ++i * 2;
}

for (var j of sequence())
    console.log(j);

// without generator
function bulkySequence() {
    var i = 0;
    var nextStep = function() {
        if (i >= 10)
            return { value: undefined, done: true };
        return { value: ++i * 2, done: false };
    };
    return { next: nextStep };
}

for (var j of bulkySequence())
    console.log(j);
The second part (`bulkySequence`) shows how to implement the same state machine in the traditional way, without generators. In this case, we are no longer able to use a `while` loop to generate values, and the continuation happens via the `nextStep` callback. This code is bulky and unreadable.
Let's introduce asynchrony. In this case, the continuation to the next step of the state machine will be driven not by the `for of` loop, but by some external event. I'll use a timer interval as the source of the event, but it might just as well be a Node.js operation completion callback or a promise resolution callback.
The idea is to show how this works without using any external libraries (like Q, Bluebird, co, etc.). Nothing stops the generator from driving itself to the next step, and that's what the following code does. Once all steps of the asynchronous logic have completed (the 10 timer ticks), `doneCallback` will be invoked. Note that I don't return any meaningful data with `yield` here; I merely use it to suspend and resume the execution:
function workAsync(doneCallback) {
    var worker = (function* () {
        // the timer callback drives the generator to the next step
        var interval = setInterval(function() {
            worker.next();
        }, 500);
        try {
            var tick = 0;
            while (tick < 10) {
                // resume upon the next tick
                yield null;
                console.log("tick: " + tick++);
            }
            doneCallback(null, null);
        }
        catch (ex) {
            doneCallback(ex, null);
        }
        finally {
            clearInterval(interval);
        }
    })();

    // initial step
    worker.next();
}

workAsync(function(err, result) {
    console.log("Done, any error: " + err);
});
Finally, let's create a sequence of events:
function workAsync(doneCallback) {
    var worker = (function* () {
        // each timer callback drives the generator to the next step
        setTimeout(function() {
            worker.next();
        }, 1000);
        yield null;
        console.log("timer1 fired.");

        setTimeout(function() {
            worker.next();
        }, 2000);
        yield null;
        console.log("timer2 fired.");

        setTimeout(function() {
            worker.next();
        }, 3000);
        yield null;
        console.log("timer3 fired.");

        doneCallback(null, null);
    })();

    // initial step
    worker.next();
}

workAsync(function(err, result) {
    console.log("Done, any error: " + err);
});
Once you understand this concept, you can move on to using promises as wrappers for generators, which takes it to the next powerful level.
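For example, here is a minimal sketch of that next step (assuming a Promise implementation is available; error handling is omitted for brevity), where the generator yields promises and a small runner drives it:
function runWithPromises(generatorFunction) {
    var gen = generatorFunction();
    function step(value) {
        var next = gen.next(value);
        if (next.done) return Promise.resolve(next.value);
        // each yielded value is expected to be a promise
        return Promise.resolve(next.value).then(step);
    }
    return step(undefined);
}

// Usage sketch: yield promises instead of thunks or timers.
runWithPromises(function* () {
    var a = yield Promise.resolve(1);
    var b = yield Promise.resolve(2);
    return a + b;
}).then(function(sum) {
    console.log(sum); // 3
});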