diff --git a/jedi/parsing.py b/jedi/parsing.py
index ed5c74bd..a38b8746 100644
--- a/jedi/parsing.py
+++ b/jedi/parsing.py
@@ -1,24 +1,13 @@
 """
-This module is split in two parts:
+The ``Parser`` tries to convert the available Python code into an easy to read
+format, something like an abstract syntax tree. The classes that represent
+this tree are sitting in the :mod:`parsing_representation` module.
-
-- the Python parser itself
-- classes for the parser to save the information
-
-The ``Parser`` tries to represent the available Python code in an easy to read
-format. The Python module ``tokenize`` is a very important part in the
-``Parser``, because it splits the code into different words (tokens).
-Sometimes it looks a bit messy. Sorry for that! You might ask now: "Why didn't
-you use the ``ast`` module for this? Well, ``ast`` does a very good job
-understanding proper Python code, but fails to work as soon as there's a single
-line of broken code.
-
-The classes are not very hard to understand. They are being named like you
-would call them: ``Import``, ``Class``, etc.
-
-A very central class is ``Scope``. It is not used directly by the parser, but
-inherited. It's used by ``Function``, ``Class``, ``Flow``, etc. A ``Scope`` may
-have ``subscopes``, ``imports`` and ``statements``. The entire parser is based
-on scopes, because they also stand for indentation.
+
+The Python module ``tokenize`` is a very important part of the ``Parser``,
+because it splits the code into different words (tokens). Sometimes it looks a
+bit messy. Sorry for that! You might ask now: "Why didn't you use the ``ast``
+module for this?" Well, ``ast`` does a very good job understanding proper
+Python code, but fails to work as soon as there's a single line of broken code.
 
 There's one important optimization that needs to be known: Statements are not
 being parsed completely. ``Statement`` is just a representation of the tokens
@@ -26,14 +15,6 @@ within the statement.
 This lowers memory usage and cpu time and reduces the
 complexity of the ``Parser`` (there's another parser sitting inside
 ``Statement``, which produces ``Array`` and ``Call``).
-Another strange thing about the parser is that ``Array`` is two dimensional.
-This has been caused by the fact that each array element can be defined by
-operations: ``[1, 2+3]``. So I chose to use a second dimension for ``2+3``. In
-the future it might be useful to use Statements there, too. This is also how
-``Param`` works. Every single ``Param`` is a ``Statement``.
-
-
-.. todo:: remove docstr params from Scope.__init__()
 """
 
 from _compatibility import next, StringIO, unicode
diff --git a/jedi/parsing_representation.py b/jedi/parsing_representation.py
index 3d3e1e4f..b6a4646a 100644
--- a/jedi/parsing_representation.py
+++ b/jedi/parsing_representation.py
@@ -1,3 +1,25 @@
+"""
+If you know what an abstract syntax tree (ast) is, you'll see that this module
+is pretty much that. The classes represent syntax elements such as ``Import``
+and ``Function``.
+
+A very central class is ``Scope``. It is not used directly by the parser, but
+inherited. It's used by ``Function``, ``Class``, ``Flow``, etc. A ``Scope`` may
+have ``subscopes``, ``imports`` and ``statements``. The entire parser is based
+on scopes, because they also stand for indentation.
+
+One strange thing about the parser is that ``Array`` is two dimensional. This
+is because each array element can be defined by operations: ``[1, 2+33]``.
+So I chose to use a second dimension for ``2+33``, where each element would
+lie in the array like this: ``[2, '+', 33]``. In the future it might be
+useful to use Statements there, too (and remove those crappy two dimensional
+arrays). This is also how ``Param`` works. Every single ``Param`` is a
+``Statement``.
+
+
+.. todo:: remove docstr params from Scope.__init__()
+"""
+
 import os
 import re
 import tokenize