Node.js: fs.readFile/writeFile 和 fs.createReadStream/writeStream 区别

1. 先说说各自的用法:

How do I read files in node.js?

fs = require('fs');
fs.readFile(file, [encoding], [callback]);

// file = (string) filepath of the file to read

encoding is an optional parameter that specifies the type of encoding to read the file. Possible encodings are ‘ascii’, ‘utf8’, and ‘base64’. If no encoding is provided, the default is utf8.

callback is a function to call when the file has been read and the contents are ready – it is passed two arguments, error and data. If there is no error, error will be null and data will contain the file contents; otherwise err contains the error message.

So if we wanted to read /etc/hosts and print it to stdout (just like UNIX cat):

fs = require('fs')
fs.readFile('/etc/hosts', 'utf8', function (err,data) {
  if (err) {
    return console.log(err);
  }
  console.log(data);
});

The contents of /etc/hosts should now be visible to you, provided you have permission to read the file in the first place.

Let’s now take a look at an example of what happens when you try to read an invalid file – the easiest example is one that doesn’t exist.

fs = require('fs');
fs.readFile('/doesnt/exist', 'utf8', function (err,data) {
  if (err) {
    return console.log(err);
  }
  console.log(data);
});

This is the output:

{ stack: [Getter/Setter],
  arguments: undefined,
  type: undefined,
  message: 'ENOENT, No such file or directory \'/doesnt/exist\'',
  errno: 2,
  code: 'ENOENT',
  path: '/doesnt/exist' }

How to use fs.createReadStream?

var http = require('http');
var fs = require('fs');

http.createServer(function(req, res) {
  // The filename is simple the local directory and tacks on the requested url
  var filename = __dirname+req.url;

  // This line opens the file as a readable stream
  var readStream = fs.createReadStream(filename);

  // This will wait until we know the readable stream is actually valid before piping
  readStream.on('open', function () {
    // This just pipes the read stream to the response object (which goes to the client)
    readStream.pipe(res);
  });

  // This catches any errors that happen while creating the readable stream (usually invalid names)
  readStream.on('error', function(err) {
    res.end(err);
  });
}).listen(8080);

2. 区别:

createReadStream是给你一个ReadableStream,你可以听它的’data’,一点一点儿处理文件,用过的部分会被GC(垃圾回收),所以占内存少。 readFile是把整个文件全部读到内存里。

nodejs的fs模块并没有提供一个copy的方法,但我们可以很容易的实现一个,比如:

var source = fs.readFileSync('/path/to/source', {encoding: 'utf8'});
fs.writeFileSync('/path/to/dest', source);

这种方式是把文件内容全部读入内存,然后再写入文件,对于小型的文本文件,这没有多大问题,比如grunt-file-copy就是这样实现的。但是对于体积较大的二进制文件,比如音频、视频文件,动辄几个GB大小,如果使用这种方法,很容易使内存“爆仓”。理想的方法应该是读一部分,写一部分,不管文件有多大,只要时间允许,总会处理完成,这里就需要用到流的概念

Stream在nodejs中是EventEmitter的实现,并且有多种实现形式,例如:

  • http responses request
  • fs read write streams
  • zlib streams
  • tcp sockets
  • child process stdout and stderr

上面的文件复制可以简单实现一下:

var fs = require('fs');
var readStream = fs.createReadStream('/path/to/source');
var writeStream = fs.createWriteStream('/path/to/dest');

readStream.on('data', function(chunk) { // 当有数据流出时,写入数据
    writeStream.write(chunk);
});

readStream.on('end', function() { // 当没有数据时,关闭数据流
    writeStream.end();
});

上面的写法有一些问题,如果写入的速度跟不上读取的速度,有可能导致数据丢失。正常的情况应该是,写完一段,再读取下一段,如果没有写完的话,就让读取流先暂停,等写完再继续,于是代码可以修改为:

var fs = require('fs');
var readStream = fs.createReadStream('/path/to/source');
var writeStream = fs.createWriteStream('/path/to/dest');

readStream.on('data', function(chunk) { // 当有数据流出时,写入数据
    if (writeStream.write(chunk) === false) { // 如果没有写完,暂停读取流
        readStream.pause();
    }
});

writeStream.on('drain', function() { // 写完后,继续读取
    readStream.resume();
});

readStream.on('end', function() { // 当没有数据时,关闭数据流
    writeStream.end();
});

或者使用更直接的pipe

// pipe自动调用了data,end等事件
fs.createReadStream('/path/to/source').pipe(fs.createWriteStream('/path/to/dest'));

下面是一个更加完整的复制文件的过程

var fs = require('fs'),
    path = require('path'),
    out = process.stdout;

var filePath = '/Users/chen/Movies/Game.of.Thrones.S04E07.1080p.HDTV.x264-BATV.mkv';

var readStream = fs.createReadStream(filePath);
var writeStream = fs.createWriteStream('file.mkv');

var stat = fs.statSync(filePath);

var totalSize = stat.size;
var passedLength = 0;
var lastSize = 0;
var startTime = Date.now();

readStream.on('data', function(chunk) {

    passedLength += chunk.length;

    if (writeStream.write(chunk) === false) {
        readStream.pause();
    }
});

readStream.on('end', function() {
    writeStream.end();
});

writeStream.on('drain', function() {
    readStream.resume();
});

setTimeout(function show() {
    var percent = Math.ceil((passedLength / totalSize) * 100);
    var size = Math.ceil(passedLength / 1000000);
    var diff = size - lastSize;
    lastSize = size;
    out.clearLine();
    out.cursorTo(0);
    out.write('已完成' + size + 'MB, ' + percent + '%, 速度:' + diff * 2 + 'MB/s');
    if (passedLength < totalSize) {
        setTimeout(show, 500);
    } else {
        var endTime = Date.now();
        console.log();
        console.log('共用时:' + (endTime - startTime) / 1000 + '秒。');
    }
}, 500);

可以把上面的代码保存为copy.js试验一下

我们添加了一个递归的setTimeout(或者直接使用setInterval)来做一个旁观者,每500ms观察一次完成进度,并把已完成的大小、百分比和复制速度一并写到控制台上,当复制完成时,计算总的耗费时间。

结合nodejs的readlineprocess.argv等模块,我们可以添加覆盖提示、强制覆盖、动态指定文件路径等完整的复制方法,有兴趣的可以实现一下,实现完成,可以

ln -s /path/to/copy.js /usr/local/bin/mycopy

这样就可以使用自己写的mycopy命令替代系统的cp命令。

更多fs说明,可以查看API: http://nodeapi.ucdok.com/#/api/fs.html

 

本文:Node.js: fs.readFile/writeFile 和 fs.createReadStream/writeStream 区别

Leave a Reply