Hi! You can now extract the text content of doc/docx files without installing external dependencies.

All you need is the Node.js library any-text.

Currently, it also supports a number of other file extensions, such as PDF, XLSX, XLS, and CSV.
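If you want to filter files before handing them to the library, a small extension check can help. The sketch below is only an illustration: `looksSupported` and `KNOWN_EXTENSIONS` are hypothetical names, not part of any-text, and the list contains only the extensions mentioned in this post, so the library's real list may be longer.

```javascript
const path = require('path');

// Assumption: only the extensions named in this post.
// The library itself may support more than these.
const KNOWN_EXTENSIONS = new Set(['.doc', '.docx', '.pdf', '.xlsx', '.xls', '.csv']);

// Hypothetical helper (not part of any-text): returns true when the
// file's extension is one of the extensions listed above.
function looksSupported(filePath) {
  return KNOWN_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}

console.log(looksSupported('report.DOCX')); // true
console.log(looksSupported('photo.png'));   // false
```

The check is case-insensitive and looks only at the file name, never at the file contents.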

Usage is very simple:

- Install the library as a dependency (or as a dev dependency, as shown here)

```
npm i -D any-text
```

- Make use of the `getText` method to read the text content

```
const reader = require('any-text');

// getText returns a promise that resolves with the file's text content
reader.getText('path-to-file').then(function (data) {
  console.log(data);
});
```

- You can also use the `async/await` notation

```
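const reader = require('any-text');

// With require(), `await` must be used inside an async function
async function main() {
  const text = await reader.getText('path-to-file');
  console.log(text);
}

main();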
```
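When you read several files at once, one unreadable file should not abort the whole batch. Here is a minimal sketch of one way to handle that, assuming only what is shown above: a `reader` object whose `getText(path)` returns a promise. The `readAll` helper is a hypothetical name, not part of any-text.

```javascript
// Hypothetical helper (not part of any-text): reads several files and
// collects failures instead of stopping at the first one.
// `reader` is any object with a getText(path) method that returns a promise.
async function readAll(reader, filePaths) {
  // The async arrow turns synchronous throws into rejections as well
  const settled = await Promise.allSettled(
    filePaths.map(async (filePath) => reader.getText(filePath))
  );

  // Pair each outcome with the path it belongs to
  return settled.map((result, i) =>
    result.status === 'fulfilled'
      ? { file: filePaths[i], ok: true, text: result.value }
      : { file: filePaths[i], ok: false, error: String(result.reason) }
  );
}
```

With the library installed, you would call it as `readAll(require('any-text'), ['a.docx', 'b.pdf'])` and check the `ok` flag of each entry. Passing the reader in as an argument also makes the helper easy to test with a stub.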

### Sample Test

```
const reader = require('any-text');

const chai = require('chai');
const expect = chai.expect;

describe('file reader checks', () => {
  it('check docx file content', async () => {
    // Read the sample file and assert on its extracted text
    expect(
      await reader.getText(`${process.cwd()}/test/files/dummy.docx`)
    ).to.contain('Lorem ipsum');
  });
});
```

I hope it will help!

--

Tech Lead Manager @Postman 🚀 | Space Movie Lover 🪐 | Coder 👨‍💻 | Traveller ⛰️

Abhinaba Ghosh
