This is an accompanying exercise to From Promises to Futures in Javascript. It is a CLI application running on Node.js that parses large amounts of data by constructing a stateless pipeline with Futures.
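For readers new to the idea, a Future differs from a Promise mainly in being lazy: it describes a computation, and nothing runs until it is forked. The following is a minimal sketch of that idea only, not the implementation or library used in this repository. The `Future` class, the `Computation` type, and the `map`, `chain`, and `fork` method names are assumptions chosen for illustration.

```typescript
// Minimal, illustrative Future: a lazy description of an async computation.
// All names here are hypothetical and not taken from this repository.
type Computation<E, A> = (
  reject: (error: E) => void,
  resolve: (value: A) => void
) => void;

class Future<E, A> {
  private readonly computation: Computation<E, A>;

  constructor(computation: Computation<E, A>) {
    this.computation = computation;
  }

  // Transform the eventual value without running anything yet.
  map<B>(f: (value: A) => B): Future<E, B> {
    return new Future<E, B>((reject, resolve) =>
      this.computation(reject, (value) => resolve(f(value)))
    );
  }

  // Sequence another Future that depends on the eventual value.
  chain<B>(f: (value: A) => Future<E, B>): Future<E, B> {
    return new Future<E, B>((reject, resolve) =>
      this.computation(reject, (value) => f(value).fork(reject, resolve))
    );
  }

  // Run the computation. Unlike a Promise, nothing happens before this call.
  fork(reject: (error: E) => void, resolve: (value: A) => void): void {
    this.computation(reject, resolve);
  }
}

// Records side effects so laziness can be observed.
const log: string[] = [];

// Building the pipeline does not execute it.
const pipeline = new Future<string, string>((_reject, resolve) => {
  log.push("started");
  resolve("42");
})
  .map((text) => Number(text))
  .chain(
    (n) =>
      new Future<string, number>((reject, resolve) => {
        if (Number.isNaN(n)) {
          reject("not a number");
        } else {
          resolve(n + 1);
        }
      })
  );
```

Calling `pipeline.fork(onError, onValue)` is what finally starts the work. Because each step only wraps the previous one, the pipeline itself holds no state between runs.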
- Written in TypeScript and runs on Node.js
- DEMO
- Tested on Node.js v8.11.1
- Clone the repository
- Install the dependencies with:
yarn
- Build the source with the TypeScript compiler:
yarn build
Search works in two different modes.
Interactive mode guides you through a search step by step. It only considers files in the repo's data directory.
To run:
yarn start -i
Streaming mode will consume a stream from standard input.
Example:
cat data/organizations.json | yarn start --streaming -f domain_names -t kage
A regular expression is accepted as the search term in both streaming and interactive modes.
Example:
→ yarn start -i
✔ Which of the following files do you want to search in? › tickets.json
✔ Which field do you want to search in? › status
✔ What's your search term? … (pending|hold) # <----------
> Searching tickets.json for status: (pending|hold)
---
_id: 436bf9b0...
...
✔ Search completed with 42 results
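As a rough illustration of how a regex search term could be applied to a top-level field, the sketch below compiles the term with `RegExp` and tests it against the field's value. The `matchesField` helper and the `JsonRecord` type are hypothetical. How the repository actually matches values (anchoring, case sensitivity, handling of array fields) is not stated here, so those choices are assumptions, and the sample records are made up.

```typescript
// Hypothetical record shape: any JSON object with string keys.
type JsonRecord = Record<string, unknown>;

// Returns true when the given top-level field matches the search term.
// Assumption: the term is used as an unanchored, case-sensitive regex,
// and array fields match if any element matches.
function matchesField(record: JsonRecord, field: string, term: string): boolean {
  if (!(field in record)) return false;
  // No "g" flag, so repeated calls to test() do not carry state.
  const pattern = new RegExp(term);
  const value = record[field];
  const candidates: unknown[] = Array.isArray(value) ? value : [value];
  return candidates.some((candidate) => pattern.test(String(candidate)));
}

// Made-up sample records, for illustration only.
const sampleTickets: JsonRecord[] = [
  { _id: "a", status: "pending" },
  { _id: "b", status: "open" },
  { _id: "c", status: "hold" },
];

const hits = sampleTickets.filter((ticket) =>
  matchesField(ticket, "status", "(pending|hold)")
);
```

Note that `new RegExp(term)` throws a `SyntaxError` for an invalid pattern, so a real CLI would likely want to validate the term before starting the search.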
Tests are written with Jest.
yarn test
- Using file streaming, I've tried to account for both reading large JSON files and returning result sets that may exceed the device's memory.
- I've made extensive use of Futures to conduct async tasks and control the execution flow.
- State is managed via a single reducer.
- TypeScript is used as a type system and for general goodness™.
- Result parsing (pretty printing and searching within the JS object) is currently blocking. A collection of large JSON objects (roughly several hundred complex fields per object) will slow down the search.
- Data directory (data) is not configurable. This is a low-hanging point of improvement.
- Currently, searchable fields are only the top-level fields. This can be extended to include sub fields.
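One way the sub-field extension could look is a dot-separated path that is resolved one segment at a time. This is only a sketch of a possible approach: the `getByPath` helper and the dot-path syntax are assumptions, not part of the current CLI, and the sample object is made up.

```typescript
// Hypothetical helper: resolve a dot-separated path such as "details.owner.city".
// Returns undefined as soon as a segment is missing or a non-object is reached.
function getByPath(record: unknown, path: string): unknown {
  let current: unknown = record;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    // Only follow own properties, so inherited ones like "toString" are ignored.
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Made-up sample object, for illustration only.
const sampleOrganization = {
  name: "Acme",
  details: { tags: ["a", "b"], owner: { city: "Oslo" } },
};

const city = getByPath(sampleOrganization, "details.owner.city");
const missing = getByPath(sampleOrganization, "details.missing.city");
```

A limitation of this syntax is that keys which themselves contain a dot cannot be addressed. Array elements can be reached by index (for example `details.tags.0`), since array indices are own properties.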