How to walk a tree

Tree traversal

Tree traversal is a common task when working with syntax trees. The term tree here means a node and all its descendants (all the nodes inside it). Traversal means stopping at every node to do something. So, tree traversal means doing something for every node in a tree.

Tree traversal is often also called “walking a tree” or “visiting a tree”.
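To make that definition concrete, here is a minimal, dependency-free sketch of a walk over a small hand-written tree. The `walk` helper and the tiny tree are assumptions made up for illustration; they are not part of unified or its utilities.

```javascript
// A tiny hand-written tree: every node has a `type`, and parent nodes
// have `children`. (Illustrative data, not parsed from a real file.)
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      children: [{type: 'text', value: 'Hello'}]
    },
    {type: 'comment', value: ' Bye '}
  ]
}

// Hypothetical helper: call `visitor` for `node`, then for every descendant.
function walk(node, visitor) {
  visitor(node)

  // Nodes without children (such as text and comments) end the recursion.
  for (const child of node.children || []) {
    walk(child, visitor)
  }
}

const types = []

walk(tree, function (node) {
  types.push(node.type)
})

console.log(types)
```

Every node is passed to the visitor exactly once, which is all that “doing something for every node” means.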

To learn more, continue reading, but when working with unist (unified’s trees) you probably need a ready-made utility such as unist-util-visit.

Set up

Glad you’re still here! Let’s say we have the following fragment of HTML, in a file example.html:

<p>
  <!-- A comment. -->
  Some <strong>strong importance</strong>, <em>emphasis</em>, and a dash of
  <code>code</code>.
</p>

You could parse that with the following code (using unified and rehype-parse):

import fs from 'node:fs/promises'
import rehypeParse from 'rehype-parse'
import {unified} from 'unified'

const document = await fs.readFile('example.html')

const tree = unified().use(rehypeParse, {fragment: true}).parse(document)

console.log(tree)

Which would yield (ignoring positional info for brevity):

{
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        { type: 'text', value: '\n  ' },
        { type: 'comment', value: ' A comment. ' },
        { type: 'text', value: '\n  Some ' },
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [ { type: 'text', value: 'strong importance' } ]
        },
        { type: 'text', value: ', ' },
        {
          type: 'element',
          tagName: 'em',
          properties: {},
          children: [ { type: 'text', value: 'emphasis' } ]
        },
        { type: 'text', value: ', and a dash of\n  ' },
        {
          type: 'element',
          tagName: 'code',
          properties: {},
          children: [ { type: 'text', value: 'code' } ]
        },
        { type: 'text', value: '.\n' }
      ]
    },
    { type: 'text', value: '\n' }
  ],
  data: { quirksMode: false }
}

Now that we are all set up, we can traverse the tree.

Traverse the tree

A useful utility for that is unist-util-visit, and it works like so:

import {visit} from 'unist-util-visit'

// …

visit(tree, function (node) {
  console.log(node.type)
})

root
element
text
comment
text
element
text
text
element
text
text
element
text
text
text

We traversed the entire tree, and for each node, we printed its type.
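The order of that output is consistent with a depth-first walk in which a parent is visited before its children. The following dependency-free sketch reproduces the order on the example tree and indents each type by its depth. The `walk` helper and its `depth` parameter are hypothetical names for illustration, not the API of unist-util-visit.

```javascript
// The example tree from the set up section, written out by hand.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {type: 'text', value: '\n  '},
        {type: 'comment', value: ' A comment. '},
        {type: 'text', value: '\n  Some '},
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [{type: 'text', value: 'strong importance'}]
        },
        {type: 'text', value: ', '},
        {
          type: 'element',
          tagName: 'em',
          properties: {},
          children: [{type: 'text', value: 'emphasis'}]
        },
        {type: 'text', value: ', and a dash of\n  '},
        {
          type: 'element',
          tagName: 'code',
          properties: {},
          children: [{type: 'text', value: 'code'}]
        },
        {type: 'text', value: '.\n'}
      ]
    },
    {type: 'text', value: '\n'}
  ]
}

// Hypothetical helper: visit `node` first, then its children from first to
// last, passing along how deep in the tree the current node is.
function walk(node, visitor, depth = 0) {
  visitor(node, depth)

  for (const child of node.children || []) {
    walk(child, visitor, depth + 1)
  }
}

const lines = []

walk(tree, function (node, depth) {
  // Two spaces of indent per level, so the nesting is visible.
  lines.push('  '.repeat(depth) + node.type)
})

console.log(lines.join('\n'))
```

Ignoring the indentation, the types come out in the same order as the list above: each element is followed directly by the nodes inside it.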

Visiting a certain kind of node

To “visit” only a certain type of node, pass a test to unist-util-visit like so:

visit(tree, 'element', function (node) {
  console.log(node.tagName)
})

p
strong
em
code

You can do this yourself as well. The above works the same as:

visit(tree, function (node) {
  if (node.type === 'element') {
    console.log(node.tagName)
  }
})

But the test passed to visit can be more advanced. For example, pass an array of types to visit different kinds of nodes:

visit(tree, ['comment', 'text'], function (node) {
  console.log([node.value])
})

TypeScript reports an error on node.value here: Property 'value' does not exist on type 'Root | Doctype | ElementContent'. (2339) At runtime, the code still prints:

[ '\n  ' ]
[ ' A comment. ' ]
[ '\n  Some ' ]
[ 'strong importance' ]
[ ', ' ]
[ 'emphasis' ]
[ ', and a dash of\n  ' ]
[ 'code' ]
[ '.\n' ]
[ '\n' ]

Sadly, TypeScript isn’t great with arrays and discriminated unions. When you want to do more complex tests with TypeScript, it’s recommended to omit the test and use explicit if statements:

visit(tree, function (node) {
  if (node.type === 'comment' || node.type === 'text') {
    console.log([node.value])
  }
})

Code that is more explicit and understandable by TypeScript is often also easier for humans to understand.
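To make the idea of a test more concrete, the following is a dependency-free sketch of how a string, an array of strings, or a function could each act as a test while walking a tree. The helpers `toCheck` and `walk` and the small tree are hypothetical, written only for illustration; this is not the implementation used by unist-util-visit.

```javascript
// A small hand-written tree (illustrative data).
const tree = {
  type: 'root',
  children: [
    {type: 'comment', value: ' A comment. '},
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {type: 'text', value: 'Some '},
        {
          type: 'element',
          tagName: 'em',
          properties: {},
          children: [{type: 'text', value: 'emphasis'}]
        }
      ]
    }
  ]
}

// Hypothetical helper: turn a test into a function that says whether a
// node matches.
function toCheck(test) {
  if (typeof test === 'function') return test
  if (Array.isArray(test)) return (node) => test.includes(node.type)
  if (typeof test === 'string') return (node) => node.type === test
  // No test given: every node matches.
  return () => true
}

// Hypothetical helper: walk every node, but only call `visitor` for nodes
// that pass the test. Children are walked even if the parent did not match.
function walk(node, test, visitor) {
  const check = toCheck(test)

  one(node)

  function one(current) {
    if (check(current)) visitor(current)
    for (const child of current.children || []) one(child)
  }
}

const tagNames = []
walk(tree, 'element', (node) => tagNames.push(node.tagName))

const values = []
walk(tree, ['comment', 'text'], (node) => values.push(node.value))

const emphasis = []
walk(
  tree,
  (node) => node.type === 'element' && node.tagName === 'em',
  (node) => emphasis.push(node.tagName)
)

console.log(tagNames, values, emphasis)
```

In this sketch, each kind of test is reduced to the same explicit check on `node.type` that the `if` statements above spell out by hand.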

Read more about unist-util-visit in its readme.