Nowadays, there is at least one better tool, called slimit:
SlimIt is a JavaScript minifier written in Python. It compiles JavaScript into more compact code so that it downloads and runs faster.
SlimIt also provides a library that includes a JavaScript parser, lexer, pretty printer and a tree visitor.
Demo:
Imagine we have the following JavaScript code:
$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: '[email protected]',
        phone: '9999999999',
        name: 'XYZ'
    }
});
Now we need to get the email, phone and name values from the data object.
The idea is to instantiate a slimit parser, visit all nodes, keep only the assignments and collect them into a dictionary:
from slimit import ast
from slimit.parser import Parser
from slimit.visitors import nodevisitor

data = """
$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: '[email protected]',
        phone: '9999999999',
        name: 'XYZ'
    }
});
"""

# Parse the JavaScript source into a syntax tree.
parser = Parser()
tree = parser.parse(data)

# Collect every key/value assignment found anywhere in the tree.
# Nodes without a `value` attribute (e.g. a nested object) map to ''.
fields = {getattr(node.left, 'value', ''): getattr(node.right, 'value', '')
          for node in nodevisitor.visit(tree)
          if isinstance(node, ast.Assign)}

print(fields)
It prints:
{'name': "'XYZ'",
'url': "'http://www.example.com'",
'type': '"POST"',
'phone': "'9999999999'",
'data': '',
'email': "'[email protected]'"}
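Note that the string values keep their JavaScript quotes, and that 'data' maps to an empty string because its right-hand side is a nested object rather than a plain value. If you want the bare strings, one way is to post-process the dictionary. This is only a minimal sketch: strip_js_quotes is a hypothetical helper name, not part of slimit, and the fields dictionary is a hand-copied subset of the output above, so the snippet runs without slimit installed.

```python
def strip_js_quotes(value):
    """Remove one pair of matching surrounding quotes, if present."""
    # Hypothetical helper for illustration; not part of slimit.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


# Subset of the dictionary printed above, copied by hand.
fields = {
    'name': "'XYZ'",
    'url': "'http://www.example.com'",
    'type': '"POST"',
    'phone': "'9999999999'",
    'data': '',
}

# Values that are not quoted (such as the empty 'data' entry) pass through unchanged.
cleaned = {key: strip_js_quotes(value) for key, value in fields.items()}
print(cleaned)
```

It does not unescape backslash sequences inside the strings; for anything beyond simple literals you would need a proper string-literal decoder.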
ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.
The ANTLR site provides many grammars, including one for JavaScript.
As it happens, there is a Python API available - so you can call the lexer (recognizer) generated from the grammar directly from Python (good luck).
» npm install dt-python-parser
The TypeScript parser doesn't directly produce a tree like that, but you can still use its object model to do all sorts of things. We use it in some tools to do syntax transforms for testing purposes, for example. Here's a snippet that you can use to print the syntax tree:
import * as ts from "typescript";

// The enum needs a name to be valid TypeScript.
const code = "enum E { x = 1 }";
const sc = ts.createSourceFile("x.ts", code, ts.ScriptTarget.Latest, true);

let indent = 0;
// Print the kind of each node, indented by its depth in the tree.
function print(node: ts.Node) {
    console.log(" ".repeat(indent) + ts.SyntaxKind[node.kind]);
    indent++;
    ts.forEachChild(node, print);
    indent--;
}

print(sc);
This question came up before back in September.
There isn't currently anything that will do this for you: there is no magic getSyntaxTree method to call.
The TypeScript compiler is open-source, though, and written entirely in TypeScript, so you can scan it to find out whether there is something you can use or add a hook to.
The upside is that you have a big opportunity to release your work as an open-source project. Judging by the up-votes on the two questions, there is some demand for this.
Alternatively, grab the syntax tree from the compiled JavaScript (which is the code that will actually execute at runtime) using Esprima or SpiderMonkey.