307 Commits

Author SHA1 Message Date
Dave Halter ef90bba3b3 Flake8 improvements 2020-12-11 02:04:23 +01:00
Dave Halter a9d0cc1179 Release parso 0.8.1 2020-12-10 16:09:05 +01:00
Dave Halter f45ffa1948 Merge pull request #162 from pybpc/walrus-set-and-index
Allow unparenthesized walrus in set literals, set comprehensions and indexes
2020-12-10 15:27:56 +01:00
gousaiyang b287476366 Allow unparenthesized walrus in set literals, set comprehensions and indexes 2020-11-27 14:46:54 -08:00
Tim Hatch d39aadc4cc Support named unicode characters in f-strings (#160)
* Support named unicode characters in f-strings

Fixes #154

The previous behavior misinterpreted the curly braces as enclosing an
expression.  This change does some cursory validation so we can still
get parse errors in the most egregious cases, but does not validate that
the names are actually valid, only that they are name-shaped and have a
chance of being valid.

The character names appear to obey a few rules:
* Case insensitive
* Name characters are `[A-Z0-9 \-]`
* Whitespace before or after is not allowed
* Whitespace in the middle may only be a single space between words
* Dashes may occur at the start or middle of a word

```py
f"\N{A B}"           # might be legal
f"\N{a b}"           # equivalent to above
f"\N{A     B}"       # no way
f"\N{    A B     }"  # no way
f"""\N{A
B}"""                # no way
```

For confirming this regex matches all (current) unicode character names:

```py
import re
import sys
import unicodedata

R = re.compile(r"[A-Za-z0-9\-]+(?: [A-Za-z0-9\-]+)*")

for i in range(sys.maxunicode + 1):
    try:
        name = unicodedata.name(chr(i))
    except ValueError:
        # Some small values like 0 and 1 have no name, /shrug
        continue
    m = R.fullmatch(name)
    if m is None:
        print("FAIL", repr(name))
```

* Improve tests for named unicode escapes
2020-11-22 15:37:04 +03:00
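The "name-shaped" rules listed in the commit above can be checked directly against its sample names. This is a minimal sketch that reuses the regex from the commit message; the helper `is_name_shaped` is a hypothetical name for illustration, not a parso function, and it only tests the shape of a name, not whether the character exists.

```python
import re

# Same pattern as in the commit message: words made of letters, digits and
# dashes, separated by exactly one space, with no surrounding whitespace.
NAME_SHAPED = re.compile(r"[A-Za-z0-9\-]+(?: [A-Za-z0-9\-]+)*")


def is_name_shaped(name: str) -> bool:
    """Return True if `name` could plausibly be a unicode character name."""
    return NAME_SHAPED.fullmatch(name) is not None


# The contents of the \N{...} braces from the examples in the commit message.
samples = ["A B", "a b", "A     B", "    A B     ", "A\nB"]
for sample in samples:
    print(repr(sample), is_name_shaped(sample))
```

Only the first two samples are accepted; the remaining three fail because of repeated, leading, trailing or non-space whitespace.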
Saiyang Gou b08b61b578 Allow some unparenthesized syntactic structures in f-string expression part (#159)
Resolves #157, #158
2020-11-19 16:32:59 +03:00
Saiyang Gou 034a9e8944 Properly check for invalid conversion character with f-string debugging syntax (#156) 2020-11-18 12:56:04 +03:00
Dave Halter 634df56d90 Merge pull request #152 from isidentical/issue-151
Retrieve all kinds of assignment targets from with test
2020-09-29 00:14:19 +02:00
Batuhan Taskaya 52cfa5a8ac satisfy flake8 2020-09-24 10:48:22 +03:00
Batuhan Taskaya 606c528803 Retrieve all kinds of assignment targets from with test 2020-09-24 10:42:56 +03:00
Dave Halter 6ae0efa415 Prepare the 0.8.0 release 2020-08-05 00:51:16 +02:00
Dave Halter 1714c1d0de Make namedexpr_test a proper NamedExpr class 2020-08-05 00:26:16 +02:00
Dave Halter 6405a1227f Merge pull request #146 from davidhalter/python3
Dropping Python <3.6
2020-07-26 13:30:32 +02:00
Dave Halter cb7d15b332 Prepare the CHANGELOG 2020-07-26 13:19:41 +02:00
Dave Halter 0dec1a4003 Another review suggestion 2020-07-26 13:16:41 +02:00
Dave Halter 020d7a9acb Use a better intersphinx mapping 2020-07-26 13:16:41 +02:00
Dave Halter f79432ecab Remove support for multiple languages, which was never used 2020-07-26 13:16:41 +02:00
Dave Halter b0de7e363a Don't use print if not necessary 2020-07-26 13:16:41 +02:00
Dave Halter f6859538b0 Simplify a cache path
Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
2020-07-26 13:03:58 +02:00
Dave Halter ea6b01b968 Use pathlib.Path instead of strings 2020-07-26 01:19:41 +02:00
Dave Halter 97c10facf7 Remove super arguments 2020-07-25 23:54:21 +02:00
Dave Halter dcc756a373 Remove object inheritance 2020-07-25 18:20:56 +02:00
Dave Halter 3c3c0b54dc Fix some remaining flake8 issues 2020-07-25 18:17:11 +02:00
Dave Halter 70ec8eecd1 Fix some last mypy issues 2020-07-25 18:16:01 +02:00
Dave Halter d3c274afa0 Fix an issue with encoding detection 2020-07-25 18:11:49 +02:00
Dave Halter 6a4bb35d80 Removed stubs from generator and grammar_parser and put the annotations into the corresponding files 2020-07-25 18:09:12 +02:00
Dave Halter ce0fac7630 Move grammar stubs into grammar.py 2020-07-25 17:48:01 +02:00
Dave Halter 395af26fa8 Removed parso/__init__.pyi
It is intentional that the parse function is not typed. It's just a
helper function. For a typed version of it, please have a look at
Grammar.parse.
2020-07-25 16:22:35 +02:00
Dave Halter e65fb2464e Remove utils.pyi in favor of inline stubs 2020-07-25 16:20:26 +02:00
Dave Halter 4d86f0fdcc Remove the pgen2 stub, didn't really contain anything 2020-07-25 16:12:41 +02:00
Dave Halter b816c00e77 Remove the token stub in favor of inline annotations 2020-07-25 16:12:04 +02:00
Dave Halter 2197e4c9e8 Fix a linter issue 2020-07-25 15:59:33 +02:00
Dave Halter b176ed6eee Run mypy and flake8 in CI 2020-07-25 15:54:55 +02:00
Dave Halter 67904f4d24 Make mypy happy 2020-07-25 15:43:28 +02:00
Dave Halter 8a34245239 Get rid of mypy issues with tokenize.py 2020-07-25 15:34:29 +02:00
Dave Halter a474895764 Start working with mypy 2020-07-25 15:05:42 +02:00
Dave Halter 34152d29b2 Initializing a Grammar now uses keyword only arguments 2020-07-25 14:33:42 +02:00
Dave Halter d9f60b3473 Remove compatibility in parser for Python 2 2020-07-25 02:38:46 +02:00
Dave Halter 75b467e681 Some more small Python 3 changes 2020-07-25 02:33:24 +02:00
Dave Halter 02eb9b9507 Use keyword only arguments in grammar.py 2020-07-25 02:21:51 +02:00
Dave Halter f17f94e120 Get rid of the old star checking logic 2020-07-25 02:16:02 +02:00
Dave Halter 902885656d Remove some Python 3.6 references 2020-07-25 02:10:10 +02:00
Dave Halter 4f9f193747 Remove some Python 3.5/3.4 references 2020-07-25 02:04:58 +02:00
Dave Halter 86d53add2d Remove sys.version_info usages that are no longer necessary 2020-07-25 01:53:51 +02:00
Dave Halter 22fb62336e Remove failing examples that are just Python 2 examples 2020-07-25 01:49:44 +02:00
Dave Halter 6eb6ac0bb2 Ignore Python 2 specific code in tests 2020-07-25 01:41:33 +02:00
Dave Halter 7c68ba4c45 Remove absolute import future import checking 2020-07-25 01:33:11 +02:00
Dave Halter d7ab138864 Remove more python 2 specific code 2020-07-25 01:31:19 +02:00
Dave Halter 4c09583072 Remove Python 2 stuff from errors.py 2020-07-25 01:25:38 +02:00
Dave Halter 19f4550ced Use enum instead of our own logic 2020-07-24 17:39:49 +02:00
Dave Halter a0662b3b3b flake8 changes 2020-07-24 16:11:06 +02:00
Dave Halter 2962517be0 Get rid of the xfails 2020-07-24 15:43:41 +02:00
Dave Halter 62b4589293 Remove tokenizer support for Python 2 2020-07-24 15:39:18 +02:00
Dave Halter 93e74efc01 Some whitespace changes 2020-07-24 14:50:01 +02:00
Dave Halter b5e2e67a4d Remove support for parsing Python 2 2020-07-24 14:48:02 +02:00
Dave Halter 5ac4bac368 Pin the precise 3.8 version 2020-07-24 02:29:18 +02:00
Dave Halter 5dd4301235 Remove tox 2020-07-24 02:25:11 +02:00
Dave Halter 1a99fdd333 Don't run older Python versions on travis 2020-07-24 02:15:44 +02:00
Dave Halter 9c5fb1ac94 Fix the tokenizer 2020-07-24 02:14:52 +02:00
Dave Halter 7780cc1c1b Get rid of some Python 2 idiosyncrasies 2020-07-24 02:09:04 +02:00
Dave Halter 561f434f39 Use yield from where possible 2020-07-24 02:01:48 +02:00
Dave Halter a1829ecc7f Remove the unicode compatibility function 2020-07-24 01:51:36 +02:00
Dave Halter 21f782dc34 Fix the tests 2020-07-24 01:45:31 +02:00
Dave Halter 164489cf97 Remove the u function and u literals 2020-07-24 01:39:03 +02:00
Dave Halter 020b2861df Remove some weird code 2020-07-24 01:33:34 +02:00
Dave Halter 44c0395113 Remove use_metaclass, it's no longer used 2020-07-24 01:31:52 +02:00
Dave Halter a2fc850dc9 Remove scandir compatibility 2020-07-24 01:28:40 +02:00
Dave Halter be5429c02c Remove utf8_repr from _compatibility.py 2020-07-24 01:26:36 +02:00
Dave Halter 736f616787 Remove FileNotFoundError and PermissionError from _compatibility.py 2020-07-24 01:24:59 +02:00
Dave Halter b601ade90b Drop Python 2.7, 3.4 and 3.5 2020-07-24 01:21:44 +02:00
Dave Halter 3b263f0a0d Fix a failing test 2020-07-24 01:01:23 +02:00
Dave Halter f52103f236 Prepare 0.7.1 release 2020-07-24 00:54:07 +02:00
Dave Halter c53321a440 Comprehensions are not valid as class params, fixes #122 2020-07-24 00:32:24 +02:00
Dave Halter d8a70abf19 Merge pull request #145 from PeterJCLaw/expose-type-stubs
Let consumers know that we have type annotations
2020-07-21 23:42:01 +02:00
Peter Law c19d7c4e6d Let consumers know that we have type annotations
As well as the type stubs, this includes both the py.typed flag
file (for tools) and the classifier (for people).
2020-07-21 22:33:39 +01:00
Batuhan Taskaya d42c0f1b3b Merge pull request #143 from Carreau/parse-alpha
Parse alpha, beta and rc versions strings.
2020-07-01 11:14:40 +03:00
Matthias Bussonnier 40e78ff7e0 Parse alpha, beta and rc versions strings.
fixes #142
2020-06-30 13:28:09 -07:00
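As an illustration of what accepting alpha, beta and rc version strings involves, here is a minimal sketch. The names `PythonVersion` and `split_python_version` and the exact regex are assumptions made for this example; they are not taken from parso's source.

```python
import re
from typing import NamedTuple


class PythonVersion(NamedTuple):
    major: int
    minor: int


# major.minor, an optional micro part, and an optional a/b/rc pre-release tag.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.\d+)?(?:(?:a|b|rc)\d+)?")


def split_python_version(text: str) -> PythonVersion:
    """Extract (major, minor) from strings such as '3.9.0a1' or '3.10'."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError("Not a recognized Python version: %r" % text)
    return PythonVersion(int(match.group(1)), int(match.group(2)))


print(split_python_version("3.9.0a1"))
print(split_python_version("3.8.2rc1"))
print(split_python_version("3.10"))
```

Matching the minor version with `\d+` rather than a single digit also keeps two-digit minors such as 3.10 intact.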
Batuhan Taskaya c88a2675b0 Merge pull request #140 from Kazy/fix-139-async-for-newline
Fix #139: newlines in async for comprehension
2020-06-29 20:01:53 +03:00
Jocelyn Boullier 88874a5a9f Fix #139: newlines in async for comprehension 2020-06-29 18:40:55 +02:00
Dave Halter 1e4076f9d9 Merge pull request #141 from isidentical/f-string-errors
Handle 3.9>= f-string errors
2020-06-29 00:03:57 +02:00
Batuhan Taskaya 73796f309d Just raise the f-string error, pass the other 2020-06-28 19:53:57 +03:00
Batuhan Taskaya 1cacdf366e Raise custom errors after break tokens 2020-06-28 19:48:11 +03:00
Batuhan Taskaya d352bede13 Cover errors that raised by ErrorFinder 2020-06-28 19:37:22 +03:00
Batuhan Taskaya 572be783f3 Cover invalid syntaxes 2020-06-28 18:41:18 +03:00
Batuhan Taskaya 31171d7ae6 Handle 3.9>= f-string errors 2020-06-28 18:04:42 +03:00
Dave Halter 7e0586b0b9 Add a PyPI downloads badge 2020-06-27 15:18:27 +02:00
Dave Halter cc347b1d3b Merge pull request #137 from isidentical/cannot-delete-starred
Update starred deletion messages for 3.9+
2020-06-22 00:15:01 +02:00
Batuhan Taskaya 841a5d96b3 Update starred deletion messages for 3.9+ 2020-06-21 19:47:18 +03:00
Dave Halter d68b4e0cab Use Python 3 in deployment script 2020-06-20 01:21:35 +02:00
Dave Halter d55b4f08dc Merge pull request #136 from davidhalter/permission_errors
Ignore permission errors when saving to cache
2020-06-19 20:27:59 +02:00
Dave Halter 58790c119e Fix issues of #136 2020-06-19 20:20:00 +02:00
Dave Halter 3923ecf12f Ignore permission errors when saving to cache
This might happen when a user doesn't have full access to his home directory.
Fixes davidhalter/jedi#1615
2020-06-19 12:06:46 +02:00
Dave Halter bd33e4ef7e Merge pull request #135 from isidentical/starred-expr
Improve handling of starred expression on different contexts
2020-06-05 12:58:14 +02:00
Batuhan Taskaya 891bfdaa04 Test only python3+ 2020-06-04 22:09:04 +03:00
Batuhan Taskaya 5e1828b3f0 Check full error message 2020-06-04 22:02:12 +03:00
Batuhan Taskaya 6daf91880b Add a special case against augassign 2020-06-04 21:47:28 +03:00
Batuhan Taskaya 44cf64a5f7 Improve handling of starred expression on different contexts (load/store) 2020-06-04 21:35:48 +03:00
Batuhan Taskaya fe24f0dc1b Implement garbage collections for inactive cache files (#121)
Cache files that weren't accessed in the last 30 days will be automatically
garbage collected. The collection is triggered when `save_module` is called
and is guarded by a lock system so that it runs at most once per day.
2020-06-02 12:36:05 +03:00
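The policy described in the commit above (drop files unused for 30 days, run at most once per day) could be sketched roughly as follows. This is not parso's implementation: the function names, the lock file name and the use of access times and a lock file's modification time are all assumptions chosen for illustration.

```python
import os
import time
from pathlib import Path

MAX_INACTIVE_SECONDS = 30 * 24 * 60 * 60  # 30 days
GC_INTERVAL_SECONDS = 24 * 60 * 60        # at most one collection per day
LOCK_NAME = "gc.lock"                     # hypothetical lock file name


def collect_inactive_cache_files(cache_dir: Path, now: float) -> list:
    """Delete files whose last access is older than the inactivity limit."""
    removed = []
    for path in cache_dir.iterdir():
        if not path.is_file() or path.name == LOCK_NAME:
            continue
        if now - path.stat().st_atime > MAX_INACTIVE_SECONDS:
            path.unlink()
            removed.append(path)
    return removed


def maybe_collect(cache_dir: Path, now: float = None) -> list:
    """Run the collection unless the lock file shows a run in the last day."""
    now = time.time() if now is None else now
    lock = cache_dir / LOCK_NAME
    try:
        last_run = lock.stat().st_mtime
    except FileNotFoundError:
        last_run = None
    if last_run is not None and now - last_run < GC_INTERVAL_SECONDS:
        return []
    lock.touch()
    os.utime(lock, (now, now))  # record the time of this run
    return collect_inactive_cache_files(cache_dir, now)
```

A caller would invoke `maybe_collect(cache_dir)` from whatever code path saves a cache entry, so that the cleanup cost is paid at most once per day.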
Dave Halter 450e9d0a19 Merge pull request #130 from yuan-xy/patch-1
fix dump_nfa
2020-05-30 12:11:08 +02:00
yuan 93b5e6dffc Fix a one-word typo 2020-05-29 10:30:08 +03:00
yuan 4403b5cac5 Update generator.py 2020-05-29 08:56:38 +08:00
Batuhan Taskaya 6f29c551fd Adjust invalid aug assign target for 3.9+ 2020-05-27 00:55:31 +02:00
Dave Halter d6b1d19d87 Merge pull request #129 from isidentical/extended-rhs-for-annassign
Extend annotated assignment rule's RHS
2020-05-26 00:13:46 +02:00
Batuhan Taskaya e0dc415bbc Extend annotated assignment rule's RHS 2020-05-26 01:10:04 +03:00
Batuhan Taskaya 4c2c0ad077 Add python3.10 grammar (#125) 2020-05-26 00:58:09 +03:00
Batuhan Taskaya 5daa8b1db6 Merge pull request #124 from isidentical/nightly-builds 2020-05-25 00:18:29 +03:00
Batuhan Taskaya c05e14c24e Test parso on nightly builds 2020-05-25 00:11:46 +03:00
Dave Halter 846513584e Merge pull request #119 from isidentical/check-all-args
Check all arguments for unparenthesized generator expressions
2020-05-23 23:18:00 +02:00
Batuhan Taskaya 6b0e01c220 Revert trailing comma for 3.6< 2020-05-23 21:17:08 +03:00
Batuhan Taskaya 92396a9a16 allow trailing comma <3.6, test both positive/negative cases 2020-05-23 17:45:20 +03:00
Batuhan Taskaya fe54800cdd Check all arguments for unparenthesized generator expressions
Previously only the first argument in the argument list was checked for
unparenthesized generator expressions; now all arguments are checked.
2020-05-23 16:57:34 +03:00
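For reference, the rule the checker mirrors can be observed with CPython's own compiler. The sketch below uses only the built-in `compile`; the helper name `raises_syntax_error` is made up for this example.

```python
def raises_syntax_error(source: str) -> bool:
    """Return True if CPython rejects `source` with a SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return True
    return False


cases = [
    "f(x for x in y)",        # sole argument: parentheses may be omitted
    "f((x for x in y), 1)",   # parenthesized: fine in any position
    "f(x for x in y, 1)",     # unparenthesized first argument: error
    "f(1, x for x in y)",     # unparenthesized later argument: error
]
for case in cases:
    print(case, "->", "error" if raises_syntax_error(case) else "ok")
```

The last case is the kind the commit targets: the generator expression is not the first argument, so a check limited to the first argument would miss it.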
Dave Halter 6ecd975516 Merge pull request #117 from isidentical/repeated-kwarg-39
Show which keyword argument is repeated on 3.9+
2020-05-23 15:15:14 +02:00
Batuhan Taskaya 27a7c16803 assert full message 2020-05-23 15:51:00 +03:00
Batuhan Taskaya a06521d912 Don't give syntax errors for parenthesised kwargs <3.8 2020-05-23 14:43:43 +02:00
Batuhan Taskaya 216a77dce5 Show which keyword argument is repeated on 3.9+ 2020-05-23 14:06:24 +03:00
Dave Halter 8bb211fafb Merge pull request #116 from isidentical/forbidden-name
Raise violation on starred expressions where the child is a boolean/none
2020-05-23 11:51:08 +02:00
Batuhan Taskaya 342e308f57 Move checking to the _CheckAssignmentRule 2020-05-23 01:18:23 +03:00
Batuhan Taskaya 8f46481aaf Raise violation on starred expressions where the child is a boolean/none 2020-05-23 01:09:38 +03:00
Dave Halter 00621977b7 Merge pull request #115 from isidentical/finally-in-continue
Support finally in continue on 3.8+
2020-05-22 23:44:26 +02:00
Batuhan Taskaya 077e34be84 Support finally in continue on 3.8+
Thanks to [bpo-32489](https://bugs.python.org/issue32489) and, sadly, the
rejection of my [PEP 601](https://www.python.org/dev/peps/pep-0601/),
`continue` inside a `finally` block is supported in 3.8+. I checked the blame
and it looks like there was already a commit for the same subject, but it only
changed the test and did not actually change the checker (dfe7fba08e)
2020-05-22 18:47:46 +03:00
Dave Halter a3f851d8f6 Merge pull request #114 from isidentical/future-annotations
Add support for 'from __future__ import annotations'
2020-05-22 16:18:53 +02:00
Batuhan Taskaya 261132e74c Add support for 'from __future__ import annotations'
PEP 563 brought a new `__future__` import for postponing the evaluation
of annotations, introduced in 3.7. This patch adds support for that
future feature and removes 'all_feature_names' from the list, since it
is not valid syntax (`from __future__ import all_feature_names`). It
also fixes a bug related to the usage of `ALLOWED_FUTURES` (global,
version-independent flags) instead of `allowed_futures` (an extended
version of the previous flag that has some version-specific flags;
the bug probably went unnoticed)
2020-05-22 17:14:33 +03:00
Batuhan Taskaya 345374d040 Allow 'any' expression on decorators, PEP 614 2020-05-22 10:17:17 +02:00
Batuhan Taskaya f8709852e3 Adapt Python3.9 errors on multiple star target
In Python 3.9, the message "two starred expressions in ..." changed
to "multiple starred expressions in ...", with python/cpython#19168
2020-05-21 20:46:41 +02:00
Batuhan Taskaya 2dcc0d3770 Quick fix about invalid version test 2020-05-21 20:45:10 +02:00
Batuhan Taskaya 34b8b7dd79 Correctly parse 2-digit minor versions (py3.10) 2020-05-21 16:21:22 +02:00
WinChua caadf3bf4c Improve hint message when the Python version is unsupported
Currently, when the Python version used is not supported, it raises "Python version None is currently not supported."
2020-05-17 16:52:40 +02:00
Dave Halter 1b4c75608a Fix a python_bytes_to_unicode issue, fixes #107 2020-05-14 23:34:14 +02:00
Dave Halter 15403fd998 Use a Windows cache folder change from Jedi
See also 1115cbd94dcae6fb7b215c51f0407333c92c956e in Jedi and the PR in davidhalter/jedi#1575
2020-05-10 11:50:00 +02:00
Dave Halter b9725364ab Add a lot of comment to the diff parser 2020-04-13 11:46:36 +02:00
Dave Halter 66ecc264f9 Write 0.7.0 release notes 2020-04-13 11:15:05 +02:00
Dave Halter 63b73a05e6 Diff parser: Take care of one line function error recovery with decorator 2020-04-13 11:07:37 +02:00
Dave Halter baec4ac58f Diff parser: Take care of one line function error recovery 2020-04-12 02:47:46 +02:00
Dave Halter b5f58ac33c Ignore some slow files for the fuzzer 2020-04-12 01:14:24 +02:00
Dave Halter 83cb71f7a1 The fuzzer now tries to reuse previous modifications as well sometimes 2020-04-11 23:29:00 +02:00
Dave Halter 30a2b2f40d Fix an error case with prefixes 2020-04-11 22:51:17 +02:00
Dave Halter d81e393c0c Fix indentation issues with backslashes and def error recovery 2020-04-10 21:48:28 +02:00
Dave Halter 7822f8be84 Python 2 compatibility 2020-04-09 22:47:50 +02:00
Dave Halter 93788a3e09 Add a test for the diff parser that xfails 2020-04-09 00:03:39 +02:00
Dave Halter 085f666ca1 Add more tokens that can break parens to tokenizer 2020-04-08 23:24:30 +02:00
Dave Halter 9e546e42de Diff parser: Fix another byte order mark issue 2020-04-07 22:58:47 +02:00
Dave Halter 7b14a86e0a Fix tokenizer error tokens 2020-04-07 09:55:28 +02:00
Dave Halter f45941226f Diff parser: Fix other BOM issues 2020-04-07 01:06:03 +02:00
Dave Halter e04552b14a Fix tests for Python 2 2020-04-06 23:52:29 +02:00
Dave Halter cd9c213a62 Fix fstring issues when error leaves are involved 2020-04-06 23:34:27 +02:00
Dave Halter 561e81df00 Replace non utf8 errors properly in diff fuzzer 2020-04-06 02:04:48 +02:00
Dave Halter 556ce86cde Tokenizer: It should not be possible to break out of backslashes on the next line, even if it was an error 2020-04-06 01:25:06 +02:00
Dave Halter b12dd498bb Diff parser: Fix BOM with indentation issues 2020-04-05 20:47:49 +02:00
Dave Halter db10b4fa72 Diff parser: Need to care for error dedents in some open parentheses/always break contexts 2020-04-05 14:39:56 +02:00
Dave Halter ed38518052 Diff parser: Make sure that nested suites get properly copied 2020-04-05 02:48:41 +02:00
Dave Halter ebc69545c7 Fix error recovery for multi line strings at the end of the file 2020-04-05 00:13:55 +02:00
Dave Halter 67ebb6acac async is actually a token that cannot appear in brackets 2020-04-04 23:14:10 +02:00
Dave Halter bcf76949b6 Diff parser: Remove error statements before caring about nested functions 2020-04-04 22:43:33 +02:00
Dave Halter 6c7b397cc7 Diff parser: Check indentation for copies correctly 2020-04-04 20:36:19 +02:00
Dave Halter 1927ba7254 Start using the parser count/copy count again 2020-04-04 17:49:35 +02:00
Dave Halter a6c33411d4 Remove all the error dedent/indent additions in the diff parser
The parser should just reparse stuff that is strangely indented
2020-04-04 16:15:17 +02:00
Dave Halter f8dce76ef7 Make sure to only copy nodes that have the same indentation in diff parser 2020-04-04 16:07:54 +02:00
Dave Halter 3242e36859 Python 2 compatibility 2020-04-04 15:45:03 +02:00
Dave Halter 734a4b0e67 Remove support for specialized treatment of form feeds
This is a very intentional change. Previously form feeds were handled very
poorly and sometimes were not counted as indentation. This obviously makes
sense. But at the same time indentation is very tricky to deal with (both for
editors and parso).

Especially in the diff parser this led to a lot of very weird issues. The
decision probably makes sense since:

1. Almost nobody uses form feeds in the first place.
2. People that use form feeds like Barry Warsaw often put a newline after them.
   (e.g. Python's email.__init__)
3. If you write an editor you want to be able to identify a unicode character
   with a clear line/column. This would not be the case if form feeds were just
   ignored when counting.

Form feeds will still work in Jedi, will not cause parse errors and in general
you should be fine using them. It might just cause Jedi to count them as
indentation **if** you use it like '\f  foo()'. This is however confusing for
most editors anyway. It leads to a weird display e.g. in VIM, even if it's
perfectly valid code in Python.

Since parso is a code analysis parser and not the language's parser I think it's
fine to ignore this edge case.
2020-04-04 15:38:10 +02:00
Dave Halter 1047204654 Small tokenizer refactoring 2020-04-04 13:13:00 +02:00
Dave Halter ae6af7849e Diff parser: All indent checks should use _get_indent 2020-04-04 13:08:47 +02:00
Dave Halter e1632cdadc Fix some issues with async funcs 2020-04-04 04:01:15 +02:00
Dave Halter 7f0dd35c37 Remove the piece of shit _get_insertion_node function 2020-04-04 03:51:28 +02:00
Dave Halter ad88783ac9 Remove get_first_indentation 2020-04-03 16:47:00 +02:00
Dave Halter 8550a52e48 Remove indents from _NodesTreeNode 2020-04-03 16:26:01 +02:00
Dave Halter c88a736e35 Fix indent issues 2020-04-03 16:24:26 +02:00
Dave Halter a07146f8a5 Deal with indents in diff parser more explicitly 2020-04-03 12:41:28 +02:00
Dave Halter 0c0aa31a91 Don't use max as a variable 2020-04-03 03:35:21 +02:00
Dave Halter 77327a4cea Make node insertion a bit easier 2020-04-03 03:28:14 +02:00
Dave Halter 8bbd304eb9 Define token types a bit different in diff parser 2020-04-03 01:05:11 +02:00
Dave Halter 62fd03edda Pass tokens in diff tokenizer 2020-04-03 01:01:37 +02:00
Dave Halter 12063d42fc When debugging print 2020-04-03 00:56:59 +02:00
Dave Halter c86af743df Initialize start pos properly in diff parser 2020-04-03 00:54:13 +02:00
Dave Halter fb2ea551d5 Move the tokenizer/diff parser closer together 2020-04-03 00:18:35 +02:00
Dave Halter ce170e8aae WIP: Try to use the tokenizer in a more native way 2020-04-02 02:00:35 +02:00
Dave Halter d674bc9895 Fix a backslash issue 2020-03-29 23:59:53 +02:00
Dave Halter 0d9886c22a Diff parser: Rewrite tokenizer modifications a bit 2020-03-29 22:41:59 +02:00
Dave Halter 9f8a68677d Tokenizer: It's now clearer when an error dedent appears 2020-03-29 13:50:36 +02:00
Dave Halter a950b82066 Fix tokenizer for random invalid unicode points 2020-03-28 21:02:04 +01:00
Dave Halter 38b7763e9a Use _assert_nodes_are_equal in the fuzzer 2020-03-28 14:51:27 +01:00
Dave Halter cf880f43d4 Tokenizer: Add error dedents only if parens are not open 2020-03-28 14:41:10 +01:00
Dave Halter 8e49d8ab5f Fix tokenizer fstring end positions 2020-03-28 11:22:32 +01:00
Dave Halter 77b3ad5843 Small flake8 refactoring 2020-03-28 10:41:00 +01:00
Dave Halter 29e3545241 Fix adding error indents/dedents only at the right places 2020-03-27 17:05:05 +01:00
Dave Halter 3d95b65b21 Fix an issue with unfinished f string literals 2020-03-27 11:17:31 +01:00
Dave Halter b86ea25435 Add a bit to the CHANGELOG 2020-03-24 22:38:18 +01:00
Dave Halter 4c42a82ebc Allow multiple newlines in a suite, this makes the diff parser easier 2020-03-24 22:35:21 +01:00
Dave Halter 43651ef219 Diff parser: Make sure dedent start pos are matching 2020-03-24 22:27:04 +01:00
Dave Halter 419d9e3174 Diff parser: Fix a few more indentation issues 2020-03-24 22:03:29 +01:00
Dave Halter 2bef3cf6ff Fix an issue where indents were repeated unnecessarily 2020-03-24 00:24:53 +01:00
Dave Halter 8e95820d78 Don't show logs in pytest, because they already appear by default 2020-03-23 23:53:23 +01:00
Dave Halter c18c89eb6b Diff parser: Correctly add indent issues 2020-03-23 00:16:47 +01:00
Dave Halter afc556d809 Diff parser: Prepare for indent error leaf insertion 2020-03-22 22:57:58 +01:00
Dave Halter cdb791fbdb Diff parser: Add error dedents if necessary, see also davidhalter/jedi#1499 2020-03-22 21:37:25 +01:00
Dave Halter 93f1cdebbc Try to make parsed trees more similar for incomplete dedents, see also davidhalter/jedi#1499 2020-03-22 21:15:22 +01:00
Dave Halter d3ceafee01 Specify in tests how another dedent issue is recovered from 2020-03-22 19:34:12 +01:00
Dave Halter 237dc9e135 Diff parser: Make sure to pop nodes directly after error nodes, see also davidhalter/jedi#1499 2020-03-22 14:49:22 +01:00
Dave Halter bd37353042 Move a bit of code 2020-03-22 13:46:13 +01:00
Dave Halter 51a044cc70 Fix diff parser: Invalid dedents meant that sometimes the wrong parents were chosen, fixes davidhalter/jedi#1499 2020-03-22 12:41:19 +01:00
Dave Halter 2cd0d6c9fc Fix: Dedent omission was wrong, see davidhalter/jedi#1499 2020-03-22 12:41:19 +01:00
Daniel Hahler 287a86c242 ci: Travis: use Python 3.8.2
Ref: https://github.com/davidhalter/parso/issues/103
2020-02-28 00:51:06 +01:00
Dave Halter 0234a70e95 Python 3.8.2 was released and an error message changed, fixes #103 2020-02-28 00:31:58 +01:00
Dave Halter 7ba49a9695 Prepare the 0.6.2 release 2020-02-27 02:10:06 +01:00
Dave Halter 53da7e8e6b Fix get_next_sibling on module, fixes #102 2020-02-21 18:31:13 +01:00
Dave Halter 6dd29c8efb Fix ExprStmt.get_rhs for annotations 2020-02-21 18:31:13 +01:00
Dave Halter e4a9cfed86 Give parso refactoring tools 2020-02-21 18:31:13 +01:00
Joe Antonakakis a7f4499644 Add venv to .gitignore (#101) 2020-02-14 14:28:07 +01:00
Dave Halter 4306e8b34b Change the release date for 0.6.1 2020-02-03 21:46:25 +01:00
Dave Halter 2ce3898690 Prepare the next release 0.6.1 2020-02-03 18:40:05 +01:00
Dave Halter 16f257356e Make end_pos public for syntax issues 2020-02-03 18:36:47 +01:00
Dave Halter c864ca60d1 Bump version to 0.6.0 2020-01-26 20:01:38 +01:00
Dave Halter a47b5433d4 Make sure iter_funcdefs includes async functions with decorators, fixes #98 2020-01-26 20:00:56 +01:00
Dave Halter 6982cf8321 Add a bit to the changelog 2020-01-26 19:47:46 +01:00
Dave Halter 844ca3d35a del_stmt is now considered a name definition 2020-01-26 19:42:12 +01:00
Dave Halter 9abe5d1e55 Forgot to increase the pickle version 2020-01-20 01:28:06 +01:00
Jarry Shaw 84874aace3 Revision on fstring issues (#100)
* f-string expression part cannot include a backslash
 * failing example `f"{'\n'}"` for tests
2020-01-09 21:49:34 +01:00
Jarry Shaw 55531ab65b Revision on assignment errors (#97)
* Revision on assignment expression errors

 * added rule for __debug__ (should be a keyword)
 * reviewed error messages
 * added new failing samples

* Adjustment upon Dave's review

 * rewind several changes in assignment errors
 * patched `is_definition` for assignment expressions
 * patched Python 2 inconsistent error messages in `test_python_errors.py`
2020-01-08 23:07:37 +01:00
Dave Halter 31c059fc30 Add a Changelog note about dropping 2.6/3.3 2020-01-06 00:05:11 +01:00
Dave Halter cfef1d74e7 Fix a Python 2.7 issue 2020-01-06 00:02:26 +01:00
Dave Halter 9ee7409d8a Get rid of Python 3.3 artifacts 2020-01-05 23:59:38 +01:00
Dave Halter 4090c80401 Remove Python 2.6 grammar 2020-01-05 23:55:03 +01:00
Dave Halter 95f353a15f Merge branch 'rm-2.6' of https://github.com/hugovk/parso 2020-01-05 23:50:20 +01:00
Dave Halter 2b0b093276 Make sure to limit the amount of cached files parso stores, fixes davidhalter/jedi#1340 2020-01-05 23:44:51 +01:00
Tim Gates 29b57d93bd Fix simple typo: utitilies -> utilities
Closes #94
2019-12-17 10:00:28 +01:00
Dave Halter fb010f2b5d Add a release date to the Changelog 2019-12-15 01:00:38 +01:00
Dave Halter 5e12ea5e04 Prepare the next release v0.5.2 2019-12-15 00:55:19 +01:00
Dave Halter ceb1ee81fa Merge pull request #93 from yangyangxcf/fstring_tokenize
fixed #86 and #87
2019-12-15 00:47:32 +01:00
Dave Halter bc94293794 Add information about named expressions (#90) to the Changelog 2019-12-15 00:29:41 +01:00
Dave Halter 1122822b7d Use a lower pytest version so python3.4 is able to pass 2019-12-15 00:13:48 +01:00
Dave Halter 09abe42cce Use Python 3.8 on travis for testing 2019-12-15 00:12:36 +01:00
Dave Halter 38cdcceba5 Whitespace changes 2019-12-15 00:06:37 +01:00
Dave Halter 753e1999fe Fix: Add more cases for named expression errors, see #89, #90 2019-12-15 00:04:38 +01:00
Dave Halter 3c475b1e63 Add Python 3.8 to tested environments for tox 2019-12-14 23:59:16 +01:00
Dave Halter 5f04dad9ab Fix: Catch some additional cases named expr errors, see #89, #90 2019-12-14 23:31:43 +01:00
Dave Halter dbba1959f7 Make sure that function executions are errors as well, see #90 2019-12-14 23:23:00 +01:00
Dave Halter 5fda85275b Some minor refactorings for #90
- search_ancestor is now used instead of using node = node.parent
- Some lines were too long
2019-12-14 23:12:16 +01:00
Dave Halter 32584ac731 Merge https://github.com/JarryShaw/parso into master 2019-12-14 22:21:22 +01:00
Jarry Shaw 89c4d959e9 * moved all namedexpr_test related rules to _NamedExprRule
* added valid examples
2019-12-14 09:37:16 +01:00
Jarry Shaw 776e151370 Revised implementation
* search ancestors of namedexpr_test directly for comprehensions
 * added test samples for invalid namedexpr_test syntax
2019-12-13 11:55:53 +08:00
yangyang 53a6d0c17a spelling 2019-12-06 15:24:33 +08:00
yangyang b90e5cd758 fixed #86 and #87 2019-12-05 19:22:58 +08:00
Robin Fourcade e496b07b63 Fix trailing comma error 2019-12-04 22:59:24 +01:00
Jarry Shaw 76fe4792e7 Deal with nested comprehension
e.g. `[i for i, j in range(5) for k in range (10) if True or (i := 1)]`
2019-12-01 16:23:18 +08:00
Jarry Shaw 8cae7ed526 Fixing davidhalter/parso#89
[all changes are in parso/python/errors.py]

* utility function (`_get_namedexpr`) extracting all assignment expression (`namedexpr_test`) nodes
* add `is_namedexpr` parameter to `_CheckAssignmentRule._check_assignment` and special error message for assignment expression related assignment issues (*cannot use named assignment with xxx*)
* add assignment expression check to `_CompForRule` (*assignment expression cannot be used in a comprehension iterable expression*)
* add `_NamedExprRule` for special assignment expression checks
  - *cannot use named assignment with lambda*
  - *cannot use named assignment with subscript*
  - *cannot use named assignment with attribute*
  - and fallback general checks in `_CheckAssignmentRule._check_assignment`
* add `_ComprehensionRule` for special checks on assignment expression in a comprehension
  - *assignment expression within a comprehension cannot be used in a class body*
  - *assignment expression cannot rebind comprehension iteration variable 'xxx'*
2019-12-01 15:43:17 +08:00
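The error classes listed in the commit above correspond to code that CPython 3.8+ rejects. The sketch below feeds one sample per rule to the built-in `compile`; the helper name and the sample snippets are chosen for this example, and the exact error messages are not checked because they differ between Python versions.

```python
def raises_syntax_error(source: str) -> bool:
    """Return True if CPython rejects `source` with a SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return True
    return False


INVALID = [
    "(lambda: x := 1)",                            # target is a lambda
    "(a[0] := 1)",                                 # target is a subscript
    "(a.b := 1)",                                  # target is an attribute
    "[x for x in (y := range(3))]",                # in the iterable expression
    "class C:\n    [y := 1 for x in range(3)]\n",  # comprehension in class body
    "[i := 0 for i in range(3)]",                  # rebinds iteration variable
    "[i for i, j in range(5) for k in range(10) if True or (i := 1)]",
]
VALID = [
    "(y := 1)",
    "[y := x for x in range(3)]",
]

for source in INVALID + VALID:
    status = "error" if raises_syntax_error(source) else "ok"
    print(repr(source), "->", status)
```

Several of these are only detected after parsing, during symbol table analysis, which is why the sketch uses `compile` rather than `ast.parse`.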
Ian Tabolt ee2995c110 Remove debug print statement 2019-09-28 11:01:52 +02:00
Naglis 76aaa2ddba Fix typo (#84) 2019-09-15 19:53:30 +02:00
Dave Halter 3ecd4dddb4 Fix is_definition test 2019-09-05 23:28:46 +02:00
Dave Halter 8f83e9b3c5 Add include_setitem for get_defined_names, is_definition and get_definition 2019-09-04 09:52:55 +02:00
Dave Halter e8653a49ff Make is_definition work on setitem modifications, see #66 2019-09-04 09:52:55 +02:00
Hugo d3383b6c41 Fix string/tuple concatenation 2019-08-08 16:49:42 +03:00
Hugo 9da4df20d1 Add python_requires to help pip 2019-08-08 14:57:13 +03:00
Hugo 0341f69691 Drop support for EOL Python 3.3 2019-08-08 14:57:13 +03:00
Hugo f6bdba65c0 Drop support for EOL Python 2.6 2019-08-08 14:56:27 +03:00
Thomas A Caswell 3bb46563d4 ENH: update grammar for py39 (#78)
* ENH: update grammar for py39

Grammar is copied from cpython commit
b4e68960b90627422325fdb75f463df1e4153c6e

There appear to be 3 new tokens in the grammar (ASYNC, AWAIT, and
TYPE_COMMENT)

* MNT: revert back to py38 grammar as py39 grammar pt1: comments

Looks like upstream has added some comments, remove them

* MNT: remove TYPE_COMMENT added upstream

* MNT: add string / fstring related changes from parso 38 grammar

* MNT: remove changes to support upstream grammar file
2019-07-21 23:45:51 +02:00
Dave Halter e723b3e74b Refactor the ambiguity tests a bit, see #70 2019-07-13 20:15:56 +02:00
Benjamin Woodruff 0032bae041 Make pgen2's grammar ambiguity detection handle more cases
Under the old implementation,

```
outer: A [inner] B C
inner: B C [inner]
```

wouldn't get detected as the ambiguous grammar that it is, whereas

```
outer: A rest
rest: [inner] B C
inner: B C [inner]
```

would.

This would manifest itself as non-determinism in the DFA state
generation. See the discussion on #62 for a full explanation.

This modifies the ambiguity detection to work on a broader class of
issues, so it should now hopefully detect all cases where the given
grammar is ambiguous.

At some point, we could extend this logic to allow developers to
optionally set precedence of grammar productions, which could resolve
ambiguities, but that's not a strict requirement for parsing python.
2019-07-13 20:04:32 +02:00
Dave Halter c0ace63a69 For Python 2.7 and 3.4 pytest 5 doesn't work anymore 2019-07-13 15:46:58 +02:00
Dave Halter 399e8e5043 Prepare the 0.5.1 release 2019-07-13 15:39:44 +02:00
Dave Halter 0a5b5f3346 Fix name tokenizing for Python 2 2019-07-13 15:34:23 +02:00
Dave Halter 2b8544021f Fix positioning for names that are interleaved with error tokens 2019-07-13 12:34:49 +02:00
Dave Halter 99dd4a84d4 Merge branch 'master' of github.com:davidhalter/parso 2019-07-12 21:35:06 +02:00
Dave Halter 9501b0bde0 Fixed name tokenizing issues for Tamil characters, fixes davidhalter/jedi#1368 2019-07-12 21:31:49 +02:00
Benjamin Woodruff ad57a51800 Fix line continuation characters inside f-strings
Line continuation characters are valid inside of strings, but weren't
handled correctly in certain cases with f-strings, due to some small
tokenizer bugs.

This pull request addresses those issues and adds tests to validate
the new logic.
2019-07-12 21:20:00 +02:00
Dave Halter 19de3eb5ca Document that the cache uses pickle files 2019-07-10 00:17:28 -07:00
Dave Halter 7441e6b1d2 Fix changelog dates, fixes #77 2019-06-28 02:00:35 -07:00
Dave Halter df3c494e02 Try to use collections.abc.Mapping instead of collections.Mapping
The latter is deprecated and will be removed in Python 3.9, fixes #76
2019-06-21 10:17:18 +02:00
Dave Halter 59df3fab43 Some small changes to the changelog 2019-06-20 21:15:53 +02:00
Dave Halter 803cb5f25f Make parso work at least somewhat with an older Jedi version 2019-06-20 20:33:14 +02:00
Dave Halter 3fa8630ba9 Use an immutable map for used names, so that it can be used for hashing 2019-06-18 09:12:33 +02:00
Dave Halter 1ca5ae4008 Bump the version number to the next release: 0.5.0 2019-06-13 17:26:08 +02:00
Dave Halter c3c16169b5 Ignore positional only arguments slash when listing params 2019-06-09 22:55:37 +02:00
Dave Halter ecbe2b9926 Add positional only arguments to grammar 2019-06-09 21:15:03 +02:00
Dave Halter 1929c144dc Increase the _PICKLE_VERSION to avoid issues with the latest breaking change 2019-06-09 18:11:21 +02:00
Dave Halter b5d50392a4 comp_for is now called sync_comp_for for all Python versions to be compatible with the Python 3.8 Grammar 2019-06-09 18:00:32 +02:00
Dave Halter a7aa23a7f0 Parse named expressions 2019-06-02 23:34:37 +02:00
Dave Halter 5430415d44 Change a test, because it doesn't really matter
The test had changed behavior for Python 3.8, a syntax error of:

SyntaxError: unexpected EOF while parsing

instead of

SyntaxError: invalid syntax
2019-06-02 22:54:45 +02:00
Dave Halter 6cdd47fe2b f-string syntax in Python 3.8 was enhanced
See e.g. https://twitter.com/raymondh/status/1135253771846471680
2019-06-02 22:48:47 +02:00
Dave Halter 917b4421f3 Fix fstring format spec parsing, fixes #74 2019-06-02 15:18:42 +02:00
Dave Halter 4f5fdd5a70 Add release notes for the next release 0.4.1 2019-06-02 11:28:00 +02:00
prim 93ddf5322a parse long number notation (#72)
* parse long number notation

* parse long number notation
2019-06-02 11:14:15 +02:00
Dave Halter a9b61149eb Fix get_decorators for async functions 2019-05-27 01:08:42 +02:00
Dave Halter de416b082e Make it clear that get_last_modified should not raise an exception, but return None, if it cannot look up a file 2019-05-22 00:16:26 +02:00
Carl Meyer 4b440159b1 Fix __init__.pyi re-exports. 2019-05-10 09:12:32 +02:00
Carl Meyer 6f2d2362c9 Add type stubs. 2019-05-10 09:12:32 +02:00
Dave Halter 8a06f0da05 0.4.0 release notes 2019-04-05 18:57:21 +02:00
Dave Halter bd95989c2e Change the default tox environments to test
These versions will be tested before deploying
2019-04-05 18:55:23 +02:00
Miro Hrončok 57e91262cd Add Python 3.8 to tox.ini
Otherwise we get:

    Matching undeclared envs is deprecated.
    Be sure all the envs that Tox should run are declared in the tox config.
2019-04-05 18:43:43 +02:00
Miro Hrončok 476383cca9 Test on Python 3.8 2019-04-05 18:43:43 +02:00
Dave Halter b2ab64d8f9 Fix Python 3.8 error issues 2019-04-05 18:30:48 +02:00
Dave Halter 18cbeb1a3d Fix an issue, because sync_comp_for exists now 2019-04-05 16:27:17 +02:00
Dave Halter a5686d6cda PEP 8 2019-04-05 16:25:45 +02:00
Dave Halter dfe7fba08e continue in finally is no longer an error 2019-04-05 16:17:30 +02:00
Dave Halter 6db7f40942 Python 2 compatibility 2019-04-03 01:24:06 +02:00
Dave Halter d5eb96309c Increase the pickle version. With all the changes lately, it's better this way 2019-04-03 01:07:25 +02:00
Dave Halter 4c65368056 Some minor changes to file_io 2019-03-27 01:02:27 +01:00
Dave Halter 3e2956264c Add FileIO to make it possible to cache e.g. files from zip files 2019-03-25 00:48:59 +01:00
Dave Halter e77a67cd36 PEP 8 2019-03-22 20:17:59 +01:00
Daniel Hahler c4d6de2aab tests: add coverage tox factor, use it on Travis 2019-03-22 11:01:22 +01:00
Daniel Hahler 7770e73609 ci: Travis: use dist=xenial 2019-03-22 11:01:22 +01:00
Dave Halter acccb4f28d 0.3.4 release 2019-02-13 00:19:07 +01:00
Dave Halter 3f6fc8a5ad Fix an f-string tokenizer issue 2019-02-13 00:17:37 +01:00
Dave Halter f1ee7614c9 Release of 0.3.3 2019-02-06 09:55:18 +01:00
Dave Halter 58850f8bfa Rename a test 2019-02-06 09:51:46 +01:00
Dave Halter d38a60278e Remove some unused code 2019-02-06 09:50:27 +01:00
Dave Halter 6c65aea47d Fix working with async functions in the diff parser, fixes #56 2019-02-06 09:31:46 +01:00
Dave Halter 0d37ff865c Fix bytes/fstring mixing when using iter_errors, fixes #57. 2019-02-06 01:28:47 +01:00
Dave Halter 076e296497 Improve a docstring, fixes #55. 2019-01-26 21:34:56 +01:00
72 changed files with 3818 additions and 2231 deletions
+1
@@ -1,4 +1,5 @@
 [run]
+source = parso
 [report]
 # Regexes for lines to exclude from consideration
+1 -1
@@ -1,7 +1,6 @@
 *~
 *.sw?
 *.pyc
-.tox
 .coveralls.yml
 .coverage
 /build/
@@ -11,3 +10,4 @@ parso.egg-info/
 /.cache/
 /.pytest_cache
 test/fuzz-redo.pickle
+/venv/
+24 -18
@@ -1,25 +1,31 @@
+dist: xenial
 language: python
-sudo: false
 python:
-  - 2.6
-  - 2.7
-  - 3.4
-  - 3.5
   - 3.6
-  - pypy
+  - 3.7
+  - 3.8.2
+  - nightly
 matrix:
-  include:
-    - { python: "3.7", dist: xenial, sudo: true }
-    - python: 3.5
-      env: TOXENV=cov
   allow_failures:
-    - env: TOXENV=cov
+    - python: nightly
+  include:
+    - python: 3.8
+      install:
+        - 'pip install .[qa]'
+      script:
+        # Ignore F401, which are unused imports. flake8 is a primitive tool and is sometimes wrong.
+        - 'flake8 --extend-ignore F401 parso test/*.py setup.py scripts/'
+        - mypy parso
+    - python: 3.8.2
+      script:
+        - 'pip install coverage'
+        - 'coverage run -m pytest'
+        - 'coverage report'
+      after_script:
+        - |
+          pip install --quiet coveralls
+          coveralls
 install:
-  - pip install --quiet tox-travis
+  - pip install .[testing]
 script:
-  - tox
+  - pytest
-after_script:
-  - if [ $TOXENV == "cov" ]; then
-      pip install --quiet coveralls;
-      coveralls;
-    fi
+3
@@ -49,6 +49,9 @@ Mathias Rav (@Mortal) <rav@cs.au.dk>
 Daniel Fiterman (@dfit99) <fitermandaniel2@gmail.com>
 Simon Ruggier (@sruggier)
 Élie Gouzien (@ElieGouzien)
+Tim Gates (@timgates42) <tim.gates@iress.com>
+Batuhan Taskaya (@isidentical) <isidentical@gmail.com>
+Jocelyn Boullier (@Kazy) <jocelyn@boullier.bzh>
 Note: (@user) means a github user name.
+94 -1
@@ -3,7 +3,100 @@
 Changelog
 ---------
-0.3.2 (2018-01-24)
+Unreleased
+++++++++++
+0.8.1 (2020-12-10)
+++++++++++++++++++
+- Various small bugfixes
+0.8.0 (2020-08-05)
+++++++++++++++++++
+- Dropped Support for Python 2.7, 3.4, 3.5
+- It's possible to use ``pathlib.Path`` objects now in the API
+- The stubs are gone, we are now using annotations
+- ``namedexpr_test`` nodes are now a proper class called ``NamedExpr``
+- A lot of smaller refactorings
+0.7.1 (2020-07-24)
+++++++++++++++++++
+- Fixed a couple of smaller bugs (mostly syntax error detection in
+  ``Grammar.iter_errors``)
+This is going to be the last release that supports Python 2.7, 3.4 and 3.5.
+0.7.0 (2020-04-13)
+++++++++++++++++++
+- Fix a lot of annoying bugs in the diff parser. The fuzzer did not find
+  issues anymore even after running it for more than 24 hours (500k tests).
+- Small grammar change: suites can now contain newlines even after a newline.
+  This should really not matter if you don't use error recovery. It allows for
+  nicer error recovery.
+0.6.2 (2020-02-27)
+++++++++++++++++++
+- Bugfixes
+- Add Grammar.refactor (might still be subject to change until 0.7.0)
+0.6.1 (2020-02-03)
+++++++++++++++++++
+- Add ``parso.normalizer.Issue.end_pos`` to make it possible to know where an
+  issue ends
+0.6.0 (2020-01-26)
+++++++++++++++++++
+- Dropped Python 2.6/Python 3.3 support
+- del_stmt names are now considered as a definition
+  (for ``name.is_definition()``)
+- Bugfixes
+0.5.2 (2019-12-15)
+++++++++++++++++++
+- Add include_setitem to get_definition/is_definition and get_defined_names (#66)
+- Fix named expression error listing (#89, #90)
+- Fix some f-string tokenizer issues (#93)
+0.5.1 (2019-07-13)
+++++++++++++++++++
+- Fix: Some unicode identifiers were not correctly tokenized
+- Fix: Line continuations in f-strings are now working
+0.5.0 (2019-06-20)
+++++++++++++++++++
+- **Breaking Change** comp_for is now called sync_comp_for for all Python
+  versions to be compatible with the Python 3.8 Grammar
+- Added .pyi stubs for a lot of the parso API
+- Small FileIO changes
+0.4.0 (2019-04-05)
+++++++++++++++++++
+- Python 3.8 support
+- FileIO support, it's now possible to use abstract file IO, support is alpha
+0.3.4 (2019-02-13)
++++++++++++++++++++
+- Fix an f-string tokenizer error
+0.3.3 (2019-02-06)
++++++++++++++++++++
+- Fix async errors in the diff parser
+- A fix in iter_errors
+- This is a very small bugfix release
+0.3.2 (2019-01-24)
 +++++++++++++++++++
 - 20+ bugfixes in the diff parser and 3 in the tokenizer
-1
@@ -5,7 +5,6 @@ include AUTHORS.txt
 include .coveragerc
 include conftest.py
 include pytest.ini
-include tox.ini
 include parso/python/grammar*.txt
 recursive-include test *
 recursive-include docs *
+5 -1
@@ -11,6 +11,10 @@ parso - A Python Parser
     :target: https://coveralls.io/github/davidhalter/parso?branch=master
     :alt: Coverage Status
+.. image:: https://pepy.tech/badge/parso
+    :target: https://pepy.tech/project/parso
+    :alt: PyPI Downloads
 .. image:: https://raw.githubusercontent.com/davidhalter/parso/master/docs/_static/logo_characters.png
 Parso is a Python parser that supports error recovery and round-trip parsing
@@ -27,7 +31,7 @@ A simple example:
 .. code-block:: python
     >>> import parso
-    >>> module = parso.parse('hello + 1', version="3.6")
+    >>> module = parso.parse('hello + 1', version="3.9")
     >>> expr = module.children[0]
     >>> expr
     PythonNode(arith_expr, [<Name: hello@1,0>, <Operator: +>, <Number: 1>])
+18 -28
@@ -2,8 +2,8 @@ import re
 import tempfile
 import shutil
 import logging
-import sys
 import os
+from pathlib import Path
 import pytest
@@ -13,8 +13,7 @@ from parso.utils import parse_version_string
 collect_ignore = ["setup.py"]
-VERSIONS_2 = '2.6', '2.7'
-VERSIONS_3 = '3.3', '3.4', '3.5', '3.6', '3.7'
+_SUPPORTED_VERSIONS = '3.6', '3.7', '3.8', '3.9', '3.10'
 @pytest.fixture(scope='session')
@@ -30,7 +29,7 @@ def clean_parso_cache():
     """
     old = cache._default_cache_path
     tmp = tempfile.mkdtemp(prefix='parso-test-')
-    cache._default_cache_path = tmp
+    cache._default_cache_path = Path(tmp)
     yield
     cache._default_cache_path = old
     shutil.rmtree(tmp)
@@ -52,16 +51,13 @@ def pytest_generate_tests(metafunc):
             ids=[c.name for c in cases]
         )
     elif 'each_version' in metafunc.fixturenames:
-        metafunc.parametrize('each_version', VERSIONS_2 + VERSIONS_3)
-    elif 'each_py2_version' in metafunc.fixturenames:
-        metafunc.parametrize('each_py2_version', VERSIONS_2)
-    elif 'each_py3_version' in metafunc.fixturenames:
-        metafunc.parametrize('each_py3_version', VERSIONS_3)
-    elif 'version_ge_py36' in metafunc.fixturenames:
-        metafunc.parametrize('version_ge_py36', ['3.6', '3.7'])
+        metafunc.parametrize('each_version', _SUPPORTED_VERSIONS)
+    elif 'version_ge_py38' in metafunc.fixturenames:
+        ge38 = set(_SUPPORTED_VERSIONS) - {'3.6', '3.7'}
+        metafunc.parametrize('version_ge_py38', sorted(ge38))
-class NormalizerIssueCase(object):
+class NormalizerIssueCase:
     """
     Static Analysis cases lie in the static_analysis folder.
     The tests also start with `#!`, like the goto_definition tests.
@@ -85,15 +81,15 @@ def pytest_configure(config):
     root = logging.getLogger()
     root.setLevel(logging.DEBUG)
-    ch = logging.StreamHandler(sys.stdout)
-    ch.setLevel(logging.DEBUG)
+    #ch = logging.StreamHandler(sys.stdout)
+    #ch.setLevel(logging.DEBUG)
     #formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
     #ch.setFormatter(formatter)
-    root.addHandler(ch)
+    #root.addHandler(ch)
-class Checker():
+class Checker:
     def __init__(self, version, is_passing):
         self.version = version
         self._is_passing = is_passing
@@ -135,23 +131,17 @@ def works_not_in_py(each_version):
 @pytest.fixture
-def works_in_py2(each_version):
-    return Checker(each_version, each_version.startswith('2'))
+def works_in_py(each_version):
+    return Checker(each_version, True)
 @pytest.fixture
-def works_ge_py27(each_version):
+def works_ge_py38(each_version):
     version_info = parse_version_string(each_version)
-    return Checker(each_version, version_info >= (2, 7))
+    return Checker(each_version, version_info >= (3, 8))
 @pytest.fixture
-def works_ge_py3(each_version):
+def works_ge_py39(each_version):
     version_info = parse_version_string(each_version)
-    return Checker(each_version, version_info >= (3, 0))
+    return Checker(each_version, version_info >= (3, 9))
-@pytest.fixture
-def works_ge_py35(each_version):
-    version_info = parse_version_string(each_version)
-    return Checker(each_version, version_info >= (3, 5))
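The version fixtures above compare a parsed version against a tuple such as `(3, 8)` instead of comparing strings. A minimal sketch of why that matters once `'3.10'` is in the supported list: `to_version_tuple` and `versions_at_least` are hypothetical helpers written for illustration, not parso's `parse_version_string`.

```python
from typing import List, Tuple

SUPPORTED_VERSIONS = ('3.6', '3.7', '3.8', '3.9', '3.10')


def to_version_tuple(version: str) -> Tuple[int, int]:
    """Hypothetical helper: turn 'major.minor' into a tuple of ints."""
    major, minor = version.split('.')
    return int(major), int(minor)


def versions_at_least(minimum: Tuple[int, int]) -> List[str]:
    """Return the supported versions that are >= minimum, oldest first."""
    return sorted(
        (v for v in SUPPORTED_VERSIONS if to_version_tuple(v) >= minimum),
        key=to_version_tuple,
    )


if __name__ == '__main__':
    print(versions_at_least((3, 8)))
    # Plain string sorting is lexicographic, so '3.10' sorts before '3.8'.
    print(sorted(['3.8', '3.9', '3.10']))
```

Comparing integer tuples keeps `3.10` after `3.9`. A plain `sorted()` over version strings, as used for the parametrize ids above, orders them lexicographically; that only affects the order in which the cases run, not which cases run.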
+3 -3
@@ -23,10 +23,10 @@ cd $PROJECT_NAME
 git checkout $BRANCH
 # Test first.
-tox
+pytest
 # Create tag
-tag=v$(python -c "import $PROJECT_NAME; print($PROJECT_NAME.__version__)")
+tag=v$(python3 -c "import $PROJECT_NAME; print($PROJECT_NAME.__version__)")
 master_ref=$(git show-ref -s heads/$BRANCH)
 tag_ref=$(git show-ref -s $tag || true)
@@ -43,7 +43,7 @@ fi
 # Package and upload to PyPI
 #rm -rf dist/ - Not needed anymore, because the folder is never reused.
 echo `pwd`
-python setup.py sdist bdist_wheel
+python3 setup.py sdist bdist_wheel
 # Maybe do a pip install twine before.
 twine upload dist/*
+9 -9
@@ -43,8 +43,8 @@ source_encoding = 'utf-8'
 master_doc = 'index'
 # General information about the project.
-project = u'parso'
-copyright = u'parso contributors'
+project = 'parso'
+copyright = 'parso contributors'
 import parso
 from parso.utils import version_info
@@ -200,8 +200,8 @@ latex_elements = {
 # Grouping the document tree into LaTeX files. List of tuples
 # (source start file, target name, title, author, documentclass [howto/manual]).
 latex_documents = [
-    ('index', 'parso.tex', u'parso documentation',
-     u'parso contributors', 'manual'),
+    ('index', 'parso.tex', 'parso documentation',
+     'parso contributors', 'manual'),
 ]
 # The name of an image file (relative to this directory) to place at the top of
@@ -230,8 +230,8 @@ latex_documents = [
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
 man_pages = [
-    ('index', 'parso', u'parso Documentation',
-     [u'parso contributors'], 1)
+    ('index', 'parso', 'parso Documentation',
+     ['parso contributors'], 1)
 ]
 # If true, show URL addresses after external links.
@@ -244,8 +244,8 @@ man_pages = [
 # (source start file, target name, title, author,
 # dir menu entry, description, category)
 texinfo_documents = [
-    ('index', 'parso', u'parso documentation',
-     u'parso contributors', 'parso', 'Awesome Python autocompletion library.',
+    ('index', 'parso', 'parso documentation',
+     'parso contributors', 'parso', 'Awesome Python autocompletion library.',
      'Miscellaneous'),
 ]
@@ -273,7 +273,7 @@ autodoc_default_flags = []
 # -- Options for intersphinx module --------------------------------------------
 intersphinx_mapping = {
-    'http://docs.python.org/': ('https://docs.python.org/3.6', None),
+    'http://docs.python.org/': ('https://docs.python.org/3', None),
 }
+6 -6
@@ -21,18 +21,18 @@ The deprecation process is as follows:
 Testing
 -------
-The test suite depends on ``tox`` and ``pytest``::
+The test suite depends on ``pytest``::
-    pip install tox pytest
+    pip install pytest
-To run the tests for all supported Python versions::
+To run the tests use the following::
-    tox
+    pytest
-If you want to test only a specific Python version (e.g. Python 2.7), it's as
+If you want to test only a specific Python version (e.g. Python 3.9), it's as
 easy as::
-    tox -e py27
+    python3.9 -m pytest
 Tests are also run automatically on `Travis CI
 <https://travis-ci.org/davidhalter/parso/>`_.
+2 -2
@@ -13,7 +13,7 @@ Parso consists of a small API to parse Python and analyse the syntax tree.
 A simple example:
 >>> import parso
->>> module = parso.parse('hello + 1', version="3.6")
+>>> module = parso.parse('hello + 1', version="3.9")
 >>> expr = module.children[0]
 >>> expr
 PythonNode(arith_expr, [<Name: hello@1,0>, <Operator: +>, <Number: 1>])
@@ -43,7 +43,7 @@ from parso.grammar import Grammar, load_grammar
 from parso.utils import split_lines, python_bytes_to_unicode
-__version__ = '0.3.2'
+__version__ = '0.8.1'
 def parse(code=None, **kwargs):
-100
@@ -1,103 +1,3 @@
"""
To ensure compatibility from Python ``2.6`` - ``3.3``, a module has been
created. Clearly there is huge need to use conforming syntax.
"""
import sys
import platform
# Cannot use sys.version.major and minor names, because in Python 2.6 it's not
# a namedtuple.
py_version = int(str(sys.version_info[0]) + str(sys.version_info[1]))
# unicode function
try:
unicode = unicode
except NameError:
unicode = str
is_pypy = platform.python_implementation() == 'PyPy'
def use_metaclass(meta, *bases):
""" Create a class with a metaclass. """
if not bases:
bases = (object,)
return meta("HackClass", bases, {})
try:
encoding = sys.stdout.encoding
if encoding is None:
encoding = 'utf-8'
except AttributeError:
encoding = 'ascii'
def u(string):
"""Cast to unicode DAMMIT!
Written because Python2 repr always implicitly casts to a string, so we
have to cast back to a unicode (and we know that we always deal with valid
unicode, because we check that in the beginning).
"""
if py_version >= 30:
return str(string)
if not isinstance(string, unicode):
return unicode(str(string), 'UTF-8')
return string
try:
FileNotFoundError = FileNotFoundError
except NameError:
FileNotFoundError = IOError
def utf8_repr(func):
"""
``__repr__`` methods in Python 2 don't allow unicode objects to be
returned. Therefore cast them to utf-8 bytes in this decorator.
"""
def wrapper(self):
result = func(self)
if isinstance(result, unicode):
return result.encode('utf-8')
else:
return result
if py_version >= 30:
return func
else:
return wrapper
try:
from functools import total_ordering
except ImportError:
# Python 2.6
def total_ordering(cls):
"""Class decorator that fills in missing ordering methods"""
convert = {
'__lt__': [('__gt__', lambda self, other: not (self < other or self == other)),
('__le__', lambda self, other: self < other or self == other),
('__ge__', lambda self, other: not self < other)],
'__le__': [('__ge__', lambda self, other: not self <= other or self == other),
('__lt__', lambda self, other: self <= other and not self == other),
('__gt__', lambda self, other: not self <= other)],
'__gt__': [('__lt__', lambda self, other: not (self > other or self == other)),
('__ge__', lambda self, other: self > other or self == other),
('__le__', lambda self, other: not self > other)],
'__ge__': [('__le__', lambda self, other: (not self >= other) or self == other),
('__gt__', lambda self, other: self >= other and not self == other),
('__lt__', lambda self, other: not self >= other)]
}
roots = set(dir(cls)) & set(convert)
if not roots:
raise ValueError('must define at least one ordering operation: < > <= >=')
root = max(roots) # prefer __lt__ to __le__ to __gt__ to __ge__
for opname, opfunc in convert[root]:
if opname not in roots:
opfunc.__name__ = opname
opfunc.__doc__ = getattr(int, opname).__doc__
setattr(cls, opname, opfunc)
return cls
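The removed compatibility module carried a backport of `total_ordering` for Python 2.6. On the Python versions that remain supported, the standard-library decorator can be used directly. A minimal sketch, assuming a hypothetical `Version` class that is not part of parso:

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical example class; only __eq__ and __lt__ are hand-written."""

    def __init__(self, major: int, minor: int):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self._key())
```

`functools.total_ordering` derives `__le__`, `__gt__` and `__ge__` from the two methods defined here, which is the job the hand-rolled fallback above used to do.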
+155 -42
@@ -5,20 +5,38 @@ import hashlib
import gc import gc
import shutil import shutil
import platform import platform
import errno
import logging import logging
import warnings
try: import pickle
import cPickle as pickle from pathlib import Path
except: from typing import Dict, Any
import pickle
from parso._compatibility import FileNotFoundError
LOG = logging.getLogger(__name__) LOG = logging.getLogger(__name__)
_CACHED_FILE_MINIMUM_SURVIVAL = 60 * 10 # 10 minutes
"""
Cached files should survive at least a few minutes.
"""
_PICKLE_VERSION = 30 _CACHED_FILE_MAXIMUM_SURVIVAL = 60 * 60 * 24 * 30
"""
Maximum time for a cached file to survive if it is not
accessed within.
"""
_CACHED_SIZE_TRIGGER = 600
"""
This setting limits the amount of cached files. It's basically a way to start
garbage collection.
The reasoning for this limit being as big as it is, is the following:
Numpy, Pandas, Matplotlib and Tensorflow together use about 500 files. This
makes Jedi use ~500mb of memory. Since we might want a bit more than those few
libraries, we just increase it a bit.
"""
_PICKLE_VERSION = 33
""" """
Version number (integer) for file system cache. Version number (integer) for file system cache.
@@ -40,19 +58,20 @@ _VERSION_TAG = '%s-%s%s-%s' % (
""" """
Short name for distinguish Python implementations and versions. Short name for distinguish Python implementations and versions.
It's like `sys.implementation.cache_tag` but for Python < 3.3 It's a bit similar to `sys.implementation.cache_tag`.
we generate something similar. See: See: http://docs.python.org/3/library/sys.html#sys.implementation
http://docs.python.org/3/library/sys.html#sys.implementation
""" """
def _get_default_cache_path(): def _get_default_cache_path():
if platform.system().lower() == 'windows': if platform.system().lower() == 'windows':
dir_ = os.path.join(os.getenv('LOCALAPPDATA') or '~', 'Parso', 'Parso') dir_ = Path(os.getenv('LOCALAPPDATA') or '~', 'Parso', 'Parso')
elif platform.system().lower() == 'darwin': elif platform.system().lower() == 'darwin':
dir_ = os.path.join('~', 'Library', 'Caches', 'Parso') dir_ = Path('~', 'Library', 'Caches', 'Parso')
else: else:
dir_ = os.path.join(os.getenv('XDG_CACHE_HOME') or '~/.cache', 'parso') dir_ = Path(os.getenv('XDG_CACHE_HOME') or '~/.cache', 'parso')
return os.path.expanduser(dir_) return dir_.expanduser()
_default_cache_path = _get_default_cache_path() _default_cache_path = _get_default_cache_path()
""" """
@@ -64,48 +83,61 @@ On Linux, if environment variable ``$XDG_CACHE_HOME`` is set,
``$XDG_CACHE_HOME/parso`` is used instead of the default one. ``$XDG_CACHE_HOME/parso`` is used instead of the default one.
""" """
parser_cache = {} _CACHE_CLEAR_THRESHOLD = 60 * 60 * 24
class _NodeCacheItem(object): def _get_cache_clear_lock_path(cache_path=None):
"""
The path where the cache lock is stored.
Cache lock will prevent continuous cache clearing and only allow garbage
collection once a day (can be configured in _CACHE_CLEAR_THRESHOLD).
"""
cache_path = cache_path or _default_cache_path
return cache_path.joinpath("PARSO-CACHE-LOCK")
parser_cache: Dict[str, Any] = {}
class _NodeCacheItem:
def __init__(self, node, lines, change_time=None): def __init__(self, node, lines, change_time=None):
self.node = node self.node = node
self.lines = lines self.lines = lines
if change_time is None: if change_time is None:
change_time = time.time() change_time = time.time()
self.change_time = change_time self.change_time = change_time
self.last_used = change_time
def load_module(hashed_grammar, path, cache_path=None): def load_module(hashed_grammar, file_io, cache_path=None):
""" """
Returns a module or None, if it fails. Returns a module or None, if it fails.
""" """
try: p_time = file_io.get_last_modified()
p_time = os.path.getmtime(path) if p_time is None:
except FileNotFoundError:
return None return None
try: try:
module_cache_item = parser_cache[hashed_grammar][path] module_cache_item = parser_cache[hashed_grammar][file_io.path]
if p_time <= module_cache_item.change_time: if p_time <= module_cache_item.change_time:
module_cache_item.last_used = time.time()
return module_cache_item.node return module_cache_item.node
except KeyError: except KeyError:
return _load_from_file_system(hashed_grammar, path, p_time, cache_path=cache_path) return _load_from_file_system(
hashed_grammar,
file_io.path,
p_time,
cache_path=cache_path
)
def _load_from_file_system(hashed_grammar, path, p_time, cache_path=None): def _load_from_file_system(hashed_grammar, path, p_time, cache_path=None):
cache_path = _get_hashed_path(hashed_grammar, path, cache_path=cache_path) cache_path = _get_hashed_path(hashed_grammar, path, cache_path=cache_path)
try: try:
try: if p_time > os.path.getmtime(cache_path):
if p_time > os.path.getmtime(cache_path): # Cache is outdated
# Cache is outdated return None
return None
except OSError as e:
if e.errno == errno.ENOENT:
# In Python 2 instead of an IOError here we get an OSError.
raise FileNotFoundError
else:
raise
with open(cache_path, 'rb') as f: with open(cache_path, 'rb') as f:
gc.disable() gc.disable()
@@ -116,22 +148,50 @@ def _load_from_file_system(hashed_grammar, path, p_time, cache_path=None):
except FileNotFoundError: except FileNotFoundError:
return None return None
else: else:
parser_cache.setdefault(hashed_grammar, {})[path] = module_cache_item _set_cache_item(hashed_grammar, path, module_cache_item)
LOG.debug('pickle loaded: %s', path) LOG.debug('pickle loaded: %s', path)
return module_cache_item.node return module_cache_item.node
def save_module(hashed_grammar, path, module, lines, pickling=True, cache_path=None): def _set_cache_item(hashed_grammar, path, module_cache_item):
if sum(len(v) for v in parser_cache.values()) >= _CACHED_SIZE_TRIGGER:
# Garbage collection of old cache files.
# We are basically throwing everything away that hasn't been accessed
# in 10 minutes.
cutoff_time = time.time() - _CACHED_FILE_MINIMUM_SURVIVAL
for key, path_to_item_map in parser_cache.items():
parser_cache[key] = {
path: node_item
for path, node_item in path_to_item_map.items()
if node_item.last_used > cutoff_time
}
parser_cache.setdefault(hashed_grammar, {})[path] = module_cache_item
def try_to_save_module(hashed_grammar, file_io, module, lines, pickling=True, cache_path=None):
path = file_io.path
try: try:
p_time = None if path is None else os.path.getmtime(path) p_time = None if path is None else file_io.get_last_modified()
except OSError: except OSError:
p_time = None p_time = None
pickling = False pickling = False
item = _NodeCacheItem(module, lines, p_time) item = _NodeCacheItem(module, lines, p_time)
parser_cache.setdefault(hashed_grammar, {})[path] = item _set_cache_item(hashed_grammar, path, item)
if pickling and path is not None: if pickling and path is not None:
_save_to_file_system(hashed_grammar, path, item, cache_path=cache_path) try:
_save_to_file_system(hashed_grammar, path, item, cache_path=cache_path)
except PermissionError:
# It's not really a big issue if the cache cannot be saved to the
# file system. It's still in RAM in that case. However we should
# still warn the user that this is happening.
warnings.warn(
'Tried to save a file to %s, but got permission denied.' % path,
Warning
)
else:
_remove_cache_and_update_lock(cache_path=cache_path)
def _save_to_file_system(hashed_grammar, path, item, cache_path=None): def _save_to_file_system(hashed_grammar, path, item, cache_path=None):
@@ -146,17 +206,70 @@ def clear_cache(cache_path=None):
parser_cache.clear() parser_cache.clear()
def clear_inactive_cache(
cache_path=None,
inactivity_threshold=_CACHED_FILE_MAXIMUM_SURVIVAL,
):
if cache_path is None:
cache_path = _default_cache_path
if not cache_path.exists():
return False
for dirname in os.listdir(cache_path):
version_path = cache_path.joinpath(dirname)
if not version_path.is_dir():
continue
for file in os.scandir(version_path):
if file.stat().st_atime + inactivity_threshold <= time.time():
try:
os.remove(file.path)
except OSError: # silently ignore all failures
continue
else:
return True
def _touch(path):
try:
os.utime(path, None)
except FileNotFoundError:
try:
file = open(path, 'a')
file.close()
except (OSError, IOError): # TODO Maybe log this?
return False
return True
def _remove_cache_and_update_lock(cache_path=None):
lock_path = _get_cache_clear_lock_path(cache_path=cache_path)
try:
clear_lock_time = os.path.getmtime(lock_path)
except FileNotFoundError:
clear_lock_time = None
if (
clear_lock_time is None # first time
or clear_lock_time + _CACHE_CLEAR_THRESHOLD <= time.time()
):
if not _touch(lock_path):
# First make sure that as few as possible other cleanup jobs also
# get started. There is still a race condition but it's probably
# not a big problem.
return False
clear_inactive_cache(cache_path=cache_path)
def _get_hashed_path(hashed_grammar, path, cache_path=None): def _get_hashed_path(hashed_grammar, path, cache_path=None):
directory = _get_cache_directory_path(cache_path=cache_path) directory = _get_cache_directory_path(cache_path=cache_path)
file_hash = hashlib.sha256(path.encode("utf-8")).hexdigest() file_hash = hashlib.sha256(str(path).encode("utf-8")).hexdigest()
return os.path.join(directory, '%s-%s.pkl' % (hashed_grammar, file_hash)) return os.path.join(directory, '%s-%s.pkl' % (hashed_grammar, file_hash))
def _get_cache_directory_path(cache_path=None): def _get_cache_directory_path(cache_path=None):
if cache_path is None: if cache_path is None:
cache_path = _default_cache_path cache_path = _default_cache_path
directory = os.path.join(cache_path, _VERSION_TAG) directory = cache_path.joinpath(_VERSION_TAG)
if not os.path.exists(directory): if not directory.exists():
os.makedirs(directory) os.makedirs(directory)
return directory return directory
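Review note on the cache cleanup added above: a minimal, standalone sketch of the same idea, namely a lock file whose mtime throttles how often cleanup runs, plus removal of files that have not been accessed recently. It does not import parso. The names `touch`, `clear_inactive`, `maybe_clear` and both threshold values are assumptions made for illustration; the real constants are not shown in this hunk.

```python
import os
import time
from pathlib import Path

# Hypothetical thresholds for this sketch only.
INACTIVITY_THRESHOLD = 60 * 60 * 24 * 30  # seconds a cache file may stay unused
CLEAR_THRESHOLD = 60 * 60 * 24            # minimum seconds between cleanups


def touch(path):
    """Update the mtime of path, creating the file if it is missing."""
    try:
        os.utime(path, None)
    except FileNotFoundError:
        try:
            open(path, 'a').close()
        except OSError:
            return False
    return True


def clear_inactive(cache_dir, threshold=INACTIVITY_THRESHOLD):
    """Delete files whose last access time is older than threshold."""
    removed = []
    now = time.time()
    for entry in os.scandir(cache_dir):
        if entry.is_file() and entry.stat().st_atime + threshold <= now:
            try:
                os.remove(entry.path)
            except OSError:
                continue  # a failed removal is not fatal for a cache
            removed.append(entry.name)
    return removed


def maybe_clear(cache_dir, lock_path):
    """Run the cleanup at most once per CLEAR_THRESHOLD seconds."""
    try:
        last = os.path.getmtime(lock_path)
    except FileNotFoundError:
        last = None  # first run, no lock file yet
    if last is None or last + CLEAR_THRESHOLD <= time.time():
        # Touch the lock first so concurrent processes are less likely
        # to start a second cleanup.
        if not touch(lock_path):
            return None
        return clear_inactive(cache_dir)
    return None
```

As in the diff, touching the lock before cleaning narrows the race window between processes but does not close it.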
+38
@@ -0,0 +1,38 @@
import os
from pathlib import Path
from typing import Union
class FileIO:
def __init__(self, path: Union[os.PathLike, str]):
if isinstance(path, str):
path = Path(path)
self.path = path
def read(self): # Returns bytes/str
# We would like to read unicode here, but we cannot, because we are not
# sure if it is a valid unicode file. Therefore just read whatever is
# here.
with open(self.path, 'rb') as f:
return f.read()
def get_last_modified(self):
"""
Returns float - timestamp or None, if path doesn't exist.
"""
try:
return os.path.getmtime(self.path)
except FileNotFoundError:
return None
def __repr__(self):
return '%s(%s)' % (self.__class__.__name__, self.path)
class KnownContentFileIO(FileIO):
def __init__(self, path, content):
super().__init__(path)
self._content = content
def read(self):
return self._content
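Review note on the new file: a hedged usage sketch of why the two classes are useful together. The classes below are a condensed copy of the ones in this diff so the example runs without parso; `load` is a hypothetical consumer added only for illustration.

```python
import os
from pathlib import Path


class FileIO:
    # Condensed copy of the class added in this diff.
    def __init__(self, path):
        self.path = Path(path) if isinstance(path, str) else path

    def read(self):
        # Bytes are returned because the encoding is not known yet.
        with open(self.path, 'rb') as f:
            return f.read()

    def get_last_modified(self):
        try:
            return os.path.getmtime(self.path)
        except FileNotFoundError:
            return None


class KnownContentFileIO(FileIO):
    # Same interface, but the content is supplied by the caller.
    def __init__(self, path, content):
        super().__init__(path)
        self._content = content

    def read(self):
        return self._content


def load(file_io):
    """Hypothetical consumer that only relies on the FileIO interface."""
    data = file_io.read()
    return data.decode('utf-8') if isinstance(data, bytes) else data
```

A caller that already holds the source text can pass `KnownContentFileIO(path, code)` and the consumer never touches the disk, which appears to be how `parse()` uses it when `code` is given.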
+92 -78
@@ -1,33 +1,42 @@
import hashlib import hashlib
import os import os
from typing import Generic, TypeVar, Union, Dict, Optional, Any
from pathlib import Path
from parso._compatibility import FileNotFoundError, is_pypy from parso._compatibility import is_pypy
from parso.pgen2 import generate_grammar from parso.pgen2 import generate_grammar
from parso.utils import split_lines, python_bytes_to_unicode, parse_version_string from parso.utils import split_lines, python_bytes_to_unicode, \
PythonVersionInfo, parse_version_string
from parso.python.diff import DiffParser from parso.python.diff import DiffParser
from parso.python.tokenize import tokenize_lines, tokenize from parso.python.tokenize import tokenize_lines, tokenize
from parso.python.token import PythonTokenTypes from parso.python.token import PythonTokenTypes
from parso.cache import parser_cache, load_module, save_module from parso.cache import parser_cache, load_module, try_to_save_module
from parso.parser import BaseParser from parso.parser import BaseParser
from parso.python.parser import Parser as PythonParser from parso.python.parser import Parser as PythonParser
from parso.python.errors import ErrorFinderConfig from parso.python.errors import ErrorFinderConfig
from parso.python import pep8 from parso.python import pep8
from parso.file_io import FileIO, KnownContentFileIO
from parso.normalizer import RefactoringNormalizer, NormalizerConfig
_loaded_grammars = {} _loaded_grammars: Dict[str, 'Grammar'] = {}
_NodeT = TypeVar("_NodeT")
class Grammar(object): class Grammar(Generic[_NodeT]):
""" """
:py:func:`parso.load_grammar` returns instances of this class. :py:func:`parso.load_grammar` returns instances of this class.
Creating custom non-Python grammars by calling this is not supported, yet. Creating custom non-Python grammars by calling this is not supported, yet.
"""
#:param text: A BNF representation of your grammar.
_error_normalizer_config = None
_token_namespace = None
_default_normalizer_config = pep8.PEP8NormalizerConfig()
def __init__(self, text, tokenizer, parser=BaseParser, diff_parser=None): :param text: A BNF representation of your grammar.
"""
_start_nonterminal: str
_error_normalizer_config: Optional[ErrorFinderConfig] = None
_token_namespace: Any = None
_default_normalizer_config: NormalizerConfig = pep8.PEP8NormalizerConfig()
def __init__(self, text: str, *, tokenizer, parser=BaseParser, diff_parser=None):
self._pgen_grammar = generate_grammar( self._pgen_grammar = generate_grammar(
text, text,
token_namespace=self._get_token_namespace() token_namespace=self._get_token_namespace()
@@ -37,7 +46,16 @@ class Grammar(object):
self._diff_parser = diff_parser self._diff_parser = diff_parser
self._hashed = hashlib.sha256(text.encode("utf-8")).hexdigest() self._hashed = hashlib.sha256(text.encode("utf-8")).hexdigest()
def parse(self, code=None, **kwargs): def parse(self,
code: Union[str, bytes] = None,
*,
error_recovery=True,
path: Union[os.PathLike, str] = None,
start_symbol: str = None,
cache=False,
diff_cache=False,
cache_path: Union[os.PathLike, str] = None,
file_io: FileIO = None) -> _NodeT:
""" """
If you want to parse a Python file you want to start here, most likely. If you want to parse a Python file you want to start here, most likely.
@@ -56,7 +74,8 @@ class Grammar(object):
:param str path: The path to the file you want to open. Only needed for caching. :param str path: The path to the file you want to open. Only needed for caching.
:param bool cache: Keeps a copy of the parser tree in RAM and on disk :param bool cache: Keeps a copy of the parser tree in RAM and on disk
if a path is given. Returns the cached trees if the corresponding if a path is given. Returns the cached trees if the corresponding
files on disk have not changed. files on disk have not changed. Note that this stores pickle files
on your file system (e.g. for Linux in ``~/.cache/parso/``).
:param bool diff_cache: Diffs the cached python module against the new :param bool diff_cache: Diffs the cached python module against the new
code and tries to parse only the parts that have changed. Returns code and tries to parse only the parts that have changed. Returns
the same (changed) module that is found in cache. Using this option the same (changed) module that is found in cache. Using this option
@@ -71,37 +90,33 @@ class Grammar(object):
:return: A subclass of :py:class:`parso.tree.NodeOrLeaf`. Typically a :return: A subclass of :py:class:`parso.tree.NodeOrLeaf`. Typically a
:py:class:`parso.python.tree.Module`. :py:class:`parso.python.tree.Module`.
""" """
if 'start_pos' in kwargs: if code is None and path is None and file_io is None:
raise TypeError("parse() got an unexpected keyword argument.")
return self._parse(code=code, **kwargs)
def _parse(self, code=None, error_recovery=True, path=None,
start_symbol=None, cache=False, diff_cache=False,
cache_path=None, start_pos=(1, 0)):
"""
Wanted python3.5 * operator and keyword only arguments. Therefore just
wrap it all.
start_pos here is just a parameter internally used. Might be public
sometime in the future.
"""
if code is None and path is None:
raise TypeError("Please provide either code or a path.") raise TypeError("Please provide either code or a path.")
if isinstance(path, str):
path = Path(path)
if isinstance(cache_path, str):
cache_path = Path(cache_path)
if start_symbol is None: if start_symbol is None:
start_symbol = self._start_nonterminal start_symbol = self._start_nonterminal
if error_recovery and start_symbol != 'file_input': if error_recovery and start_symbol != 'file_input':
raise NotImplementedError("This is currently not implemented.") raise NotImplementedError("This is currently not implemented.")
if cache and path is not None: if file_io is None:
module_node = load_module(self._hashed, path, cache_path=cache_path) if code is None:
file_io = FileIO(path) # type: ignore
else:
file_io = KnownContentFileIO(path, code)
if cache and file_io.path is not None:
module_node = load_module(self._hashed, file_io, cache_path=cache_path)
if module_node is not None: if module_node is not None:
return module_node return module_node # type: ignore
if code is None: if code is None:
with open(path, 'rb') as f: code = file_io.read()
code = f.read()
code = python_bytes_to_unicode(code) code = python_bytes_to_unicode(code)
lines = split_lines(code, keepends=True) lines = split_lines(code, keepends=True)
@@ -110,14 +125,14 @@ class Grammar(object):
raise TypeError("You have to define a diff parser to be able " raise TypeError("You have to define a diff parser to be able "
"to use this option.") "to use this option.")
try: try:
module_cache_item = parser_cache[self._hashed][path] module_cache_item = parser_cache[self._hashed][file_io.path]
except KeyError: except KeyError:
pass pass
else: else:
module_node = module_cache_item.node module_node = module_cache_item.node
old_lines = module_cache_item.lines old_lines = module_cache_item.lines
if old_lines == lines: if old_lines == lines:
return module_node return module_node # type: ignore
new_node = self._diff_parser( new_node = self._diff_parser(
self._pgen_grammar, self._tokenizer, module_node self._pgen_grammar, self._tokenizer, module_node
@@ -125,13 +140,13 @@ class Grammar(object):
old_lines=old_lines, old_lines=old_lines,
new_lines=lines new_lines=lines
) )
save_module(self._hashed, path, new_node, lines, try_to_save_module(self._hashed, file_io, new_node, lines,
# Never pickle in pypy, it's slow as hell. # Never pickle in pypy, it's slow as hell.
pickling=cache and not is_pypy, pickling=cache and not is_pypy,
cache_path=cache_path) cache_path=cache_path)
return new_node return new_node # type: ignore
tokens = self._tokenizer(lines, start_pos) tokens = self._tokenizer(lines)
p = self._parser( p = self._parser(
self._pgen_grammar, self._pgen_grammar,
@@ -141,11 +156,11 @@ class Grammar(object):
root_node = p.parse(tokens=tokens) root_node = p.parse(tokens=tokens)
if cache or diff_cache: if cache or diff_cache:
save_module(self._hashed, path, root_node, lines, try_to_save_module(self._hashed, file_io, root_node, lines,
# Never pickle in pypy, it's slow as hell. # Never pickle in pypy, it's slow as hell.
pickling=cache and not is_pypy, pickling=cache and not is_pypy,
cache_path=cache_path) cache_path=cache_path)
return root_node return root_node # type: ignore
def _get_token_namespace(self): def _get_token_namespace(self):
ns = self._token_namespace ns = self._token_namespace
@@ -164,6 +179,9 @@ class Grammar(object):
return self._get_normalizer_issues(node, self._error_normalizer_config) return self._get_normalizer_issues(node, self._error_normalizer_config)
def refactor(self, base_node, node_to_str_map):
return RefactoringNormalizer(node_to_str_map).walk(base_node)
def _get_normalizer(self, normalizer_config): def _get_normalizer(self, normalizer_config):
if normalizer_config is None: if normalizer_config is None:
normalizer_config = self._default_normalizer_config normalizer_config = self._default_normalizer_config
@@ -196,8 +214,8 @@ class PythonGrammar(Grammar):
_token_namespace = PythonTokenTypes _token_namespace = PythonTokenTypes
_start_nonterminal = 'file_input' _start_nonterminal = 'file_input'
def __init__(self, version_info, bnf_text): def __init__(self, version_info: PythonVersionInfo, bnf_text: str):
super(PythonGrammar, self).__init__( super().__init__(
bnf_text, bnf_text,
tokenizer=self._tokenize_lines, tokenizer=self._tokenize_lines,
parser=PythonParser, parser=PythonParser,
@@ -205,46 +223,42 @@ class PythonGrammar(Grammar):
) )
self.version_info = version_info self.version_info = version_info
def _tokenize_lines(self, lines, start_pos): def _tokenize_lines(self, lines, **kwargs):
return tokenize_lines(lines, self.version_info, start_pos=start_pos) return tokenize_lines(lines, version_info=self.version_info, **kwargs)
def _tokenize(self, code): def _tokenize(self, code):
# Used by Jedi. # Used by Jedi.
return tokenize(code, self.version_info) return tokenize(code, version_info=self.version_info)
def load_grammar(**kwargs): def load_grammar(*, version: str = None, path: str = None):
""" """
Loads a :py:class:`parso.Grammar`. The default version is the current Python Loads a :py:class:`parso.Grammar`. The default version is the current Python
version. version.
:param str version: A python version string, e.g. ``version='3.3'``. :param str version: A python version string, e.g. ``version='3.8'``.
:param str path: A path to a grammar file :param str path: A path to a grammar file
""" """
def load_grammar(language='python', version=None, path=None): version_info = parse_version_string(version)
if language == 'python':
version_info = parse_version_string(version)
file = path or os.path.join( file = path or os.path.join(
'python', 'python',
'grammar%s%s.txt' % (version_info.major, version_info.minor) 'grammar%s%s.txt' % (version_info.major, version_info.minor)
)
global _loaded_grammars
path = os.path.join(os.path.dirname(__file__), file)
try:
return _loaded_grammars[path]
except KeyError:
try:
with open(path) as f:
bnf_text = f.read()
grammar = PythonGrammar(version_info, bnf_text)
return _loaded_grammars.setdefault(path, grammar)
except FileNotFoundError:
message = "Python version %s.%s is currently not supported." % (
version_info.major, version_info.minor
) )
raise NotImplementedError(message)
global _loaded_grammars
path = os.path.join(os.path.dirname(__file__), file)
try:
return _loaded_grammars[path]
except KeyError:
try:
with open(path) as f:
bnf_text = f.read()
grammar = PythonGrammar(version_info, bnf_text)
return _loaded_grammars.setdefault(path, grammar)
except FileNotFoundError:
message = "Python version %s is currently not supported." % version
raise NotImplementedError(message)
else:
raise NotImplementedError("No support for language %s." % language)
return load_grammar(**kwargs)
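Review note on the signature changes in this file: the old code emulated keyword-only arguments with a `**kwargs` wrapper around a private helper, while the new code uses the bare `*` marker. The toy function below is only a sketch of that language feature; `parse` here is a stand-in and not parso's implementation.

```python
def parse(code=None, *, error_recovery=True, path=None, cache=False):
    """Toy stand-in: everything after ``*`` must be passed by keyword."""
    if code is None and path is None:
        raise TypeError("Please provide either code or a path.")
    return {
        'code': code,
        'error_recovery': error_recovery,
        'path': path,
        'cache': cache,
    }


def call_positionally():
    """Show that a second positional argument is rejected."""
    try:
        parse("x = 1", False)
    except TypeError:
        return "rejected"
    return "accepted"
```

With this form the interpreter rejects stray positional arguments itself, so the manual `'start_pos' in kwargs` check from the old wrapper is no longer needed.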
+33 -19
@@ -1,6 +1,5 @@
from contextlib import contextmanager from contextlib import contextmanager
from typing import Dict, List
from parso._compatibility import use_metaclass
class _NormalizerMeta(type): class _NormalizerMeta(type):
@@ -11,7 +10,10 @@ class _NormalizerMeta(type):
return new_cls return new_cls
class Normalizer(use_metaclass(_NormalizerMeta)): class Normalizer(metaclass=_NormalizerMeta):
_rule_type_instances: Dict[str, List[type]] = {}
_rule_value_instances: Dict[str, List[type]] = {}
def __init__(self, grammar, config): def __init__(self, grammar, config):
self.grammar = grammar self.grammar = grammar
self._config = config self._config = config
@@ -41,8 +43,8 @@ class Normalizer(use_metaclass(_NormalizerMeta)):
except AttributeError: except AttributeError:
return self.visit_leaf(node) return self.visit_leaf(node)
else: else:
with self.visit_node(node): with self.visit_node(node):
return ''.join(self.visit(child) for child in children) return ''.join(self.visit(child) for child in children)
@contextmanager @contextmanager
def visit_node(self, node): def visit_node(self, node):
@@ -74,7 +76,7 @@ class Normalizer(use_metaclass(_NormalizerMeta)):
return True return True
@classmethod @classmethod
def register_rule(cls, **kwargs): def register_rule(cls, *, value=None, values=(), type=None, types=()):
""" """
Use it as a class decorator:: Use it as a class decorator::
@@ -83,10 +85,6 @@ class Normalizer(use_metaclass(_NormalizerMeta)):
class MyRule(Rule): class MyRule(Rule):
error_code = 42 error_code = 42
""" """
return cls._register_rule(**kwargs)
@classmethod
def _register_rule(cls, value=None, values=(), type=None, types=()):
values = list(values) values = list(values)
types = list(types) types = list(types)
if value is not None: if value is not None:
@@ -107,7 +105,7 @@ class Normalizer(use_metaclass(_NormalizerMeta)):
return decorator return decorator
class NormalizerConfig(object): class NormalizerConfig:
normalizer_class = Normalizer normalizer_class = Normalizer
def create_normalizer(self, grammar): def create_normalizer(self, grammar):
@@ -117,9 +115,8 @@ class NormalizerConfig(object):
return self.normalizer_class(grammar, self) return self.normalizer_class(grammar, self)
class Issue(object): class Issue:
def __init__(self, node, code, message): def __init__(self, node, code, message):
self._node = node
self.code = code self.code = code
""" """
An integer code that stands for the type of error. An integer code that stands for the type of error.
@@ -133,6 +130,7 @@ class Issue(object):
The start position of the error as a tuple (line, column). As The start position of the error as a tuple (line, column). As
always in |parso| the first line is 1 and the first column 0. always in |parso| the first line is 1 and the first column 0.
""" """
self.end_pos = node.end_pos
def __eq__(self, other): def __eq__(self, other):
return self.start_pos == other.start_pos and self.code == other.code return self.start_pos == other.start_pos and self.code == other.code
@@ -147,10 +145,9 @@ class Issue(object):
return '<%s: %s>' % (self.__class__.__name__, self.code) return '<%s: %s>' % (self.__class__.__name__, self.code)
class Rule:
class Rule(object): code: int
code = None message: str
message = None
def __init__(self, normalizer): def __init__(self, normalizer):
self._normalizer = normalizer self._normalizer = normalizer
@@ -161,7 +158,7 @@ class Rule(object):
def get_node(self, node): def get_node(self, node):
return node return node
def _get_message(self, message): def _get_message(self, message, node):
if message is None: if message is None:
message = self.message message = self.message
if message is None: if message is None:
@@ -174,7 +171,7 @@ class Rule(object):
if code is None: if code is None:
raise ValueError("The error code on the class is not set.") raise ValueError("The error code on the class is not set.")
message = self._get_message(message) message = self._get_message(message, node)
self._normalizer.add_issue(node, code, message) self._normalizer.add_issue(node, code, message)
@@ -182,3 +179,20 @@ class Rule(object):
if self.is_issue(node): if self.is_issue(node):
issue_node = self.get_node(node) issue_node = self.get_node(node)
self.add_issue(issue_node) self.add_issue(issue_node)
class RefactoringNormalizer(Normalizer):
def __init__(self, node_to_str_map):
self._node_to_str_map = node_to_str_map
def visit(self, node):
try:
return self._node_to_str_map[node]
except KeyError:
return super().visit(node)
def visit_leaf(self, leaf):
try:
return self._node_to_str_map[leaf]
except KeyError:
return super().visit_leaf(leaf)
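Review note on `RefactoringNormalizer`: the class walks a tree and emits source text, substituting a replacement string for any node or leaf found in `node_to_str_map`. The sketch below reproduces that idea on a toy tree so it runs without parso. `Leaf`, `Node` and `render` are hypothetical names, and it assumes, as a simplification, that an unmapped leaf renders as its prefix followed by its value.

```python
class Leaf:
    def __init__(self, prefix, value):
        self.prefix = prefix  # whitespace/comments before the token
        self.value = value


class Node:
    def __init__(self, children):
        self.children = children


def render(node, node_to_str_map):
    """Return source for node, substituting any node found in the map."""
    try:
        # A mapped node is replaced wholesale, including its prefix.
        return node_to_str_map[node]
    except KeyError:
        pass
    if isinstance(node, Node):
        return ''.join(render(child, node_to_str_map) for child in node.children)
    return node.prefix + node.value


# Toy tree for the source text "x = 1".
name = Leaf('', 'x')
equals = Leaf(' ', '=')
number = Leaf(' ', '1')
statement = Node([name, equals, number])
```

Because a mapped node replaces its prefix as well, the replacement string has to carry its own leading whitespace, for example `{number: ' 2'}` to obtain `x = 2`.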
+8 -7
@@ -23,6 +23,8 @@ within the statement. This lowers memory usage and cpu time and reduces the
complexity of the ``Parser`` (there's another parser sitting inside complexity of the ``Parser`` (there's another parser sitting inside
``Statement``, which produces ``Array`` and ``Call``). ``Statement``, which produces ``Array`` and ``Call``).
""" """
from typing import Dict
from parso import tree from parso import tree
from parso.pgen2.generator import ReservedString from parso.pgen2.generator import ReservedString
@@ -71,7 +73,7 @@ class Stack(list):
return list(iterate()) return list(iterate())
class StackNode(object): class StackNode:
def __init__(self, dfa): def __init__(self, dfa):
self.dfa = dfa self.dfa = dfa
self.nodes = [] self.nodes = []
@@ -86,7 +88,7 @@ class StackNode(object):
def _token_to_transition(grammar, type_, value): def _token_to_transition(grammar, type_, value):
# Map from token to label # Map from token to label
if type_.contains_syntax: if type_.value.contains_syntax:
# Check for reserved words (keywords) # Check for reserved words (keywords)
try: try:
return grammar.reserved_syntax_strings[value] return grammar.reserved_syntax_strings[value]
@@ -96,7 +98,7 @@ def _token_to_transition(grammar, type_, value):
return type_ return type_
class BaseParser(object): class BaseParser:
"""Parser engine. """Parser engine.
A Parser instance contains state pertaining to the current token A Parser instance contains state pertaining to the current token
@@ -108,11 +110,10 @@ class BaseParser(object):
When a syntax error occurs, error_recovery() is called. When a syntax error occurs, error_recovery() is called.
""" """
node_map = {} node_map: Dict[str, type] = {}
default_node = tree.Node default_node = tree.Node
leaf_map = { leaf_map: Dict[str, type] = {}
}
default_leaf = tree.Leaf default_leaf = tree.Leaf
def __init__(self, pgen_grammar, start_nonterminal='file_input', error_recovery=False): def __init__(self, pgen_grammar, start_nonterminal='file_input', error_recovery=False):
@@ -134,7 +135,7 @@ class BaseParser(object):
# However, the error recovery might have added the token again, if # However, the error recovery might have added the token again, if
# the stack is empty, we're fine. # the stack is empty, we're fine.
raise InternalParseError( raise InternalParseError(
"incomplete input", token.type, token.value, token.start_pos "incomplete input", token.type, token.string, token.start_pos
) )
if len(self.stack) > 1: if len(self.stack) > 1:
+55 -31
@@ -27,11 +27,14 @@ because we made some optimizations.
""" """
from ast import literal_eval from ast import literal_eval
from typing import TypeVar, Generic, Mapping, Sequence, Set, Union
from parso.pgen2.grammar_parser import GrammarParser, NFAState from parso.pgen2.grammar_parser import GrammarParser, NFAState
_TokenTypeT = TypeVar("_TokenTypeT")
class Grammar(object):
class Grammar(Generic[_TokenTypeT]):
""" """
Once initialized, this class supplies the grammar tables for the Once initialized, this class supplies the grammar tables for the
parsing engine implemented by parse.py. The parsing engine parsing engine implemented by parse.py. The parsing engine
@@ -41,18 +44,21 @@ class Grammar(object):
dfas. dfas.
""" """
def __init__(self, start_nonterminal, rule_to_dfas, reserved_syntax_strings): def __init__(self,
self.nonterminal_to_dfas = rule_to_dfas # Dict[str, List[DFAState]] start_nonterminal: str,
rule_to_dfas: Mapping[str, Sequence['DFAState[_TokenTypeT]']],
reserved_syntax_strings: Mapping[str, 'ReservedString']):
self.nonterminal_to_dfas = rule_to_dfas
self.reserved_syntax_strings = reserved_syntax_strings self.reserved_syntax_strings = reserved_syntax_strings
self.start_nonterminal = start_nonterminal self.start_nonterminal = start_nonterminal
class DFAPlan(object): class DFAPlan:
""" """
Plans are used for the parser to create stack nodes and do the proper Plans are used for the parser to create stack nodes and do the proper
DFA state transitions. DFA state transitions.
""" """
def __init__(self, next_dfa, dfa_pushes=[]): def __init__(self, next_dfa: 'DFAState', dfa_pushes: Sequence['DFAState'] = []):
self.next_dfa = next_dfa self.next_dfa = next_dfa
self.dfa_pushes = dfa_pushes self.dfa_pushes = dfa_pushes
@@ -60,7 +66,7 @@ class DFAPlan(object):
return '%s(%s, %s)' % (self.__class__.__name__, self.next_dfa, self.dfa_pushes) return '%s(%s, %s)' % (self.__class__.__name__, self.next_dfa, self.dfa_pushes)
class DFAState(object): class DFAState(Generic[_TokenTypeT]):
""" """
The DFAState object is the core class for pretty much anything. DFAState The DFAState object is the core class for pretty much anything. DFAState
are the vertices of an ordered graph while arcs and transitions are the are the vertices of an ordered graph while arcs and transitions are the
@@ -70,20 +76,21 @@ class DFAState(object):
transitions are then calculated to connect the DFA state machines that have transitions are then calculated to connect the DFA state machines that have
different nonterminals. different nonterminals.
""" """
def __init__(self, from_rule, nfa_set, final): def __init__(self, from_rule: str, nfa_set: Set[NFAState], final: NFAState):
assert isinstance(nfa_set, set) assert isinstance(nfa_set, set)
assert isinstance(next(iter(nfa_set)), NFAState) assert isinstance(next(iter(nfa_set)), NFAState)
assert isinstance(final, NFAState) assert isinstance(final, NFAState)
self.from_rule = from_rule self.from_rule = from_rule
self.nfa_set = nfa_set self.nfa_set = nfa_set
self.arcs = {} # map from terminals/nonterminals to DFAState # map from terminals/nonterminals to DFAState
self.arcs: Mapping[str, DFAState] = {}
# In an intermediary step we set these nonterminal arcs (which has the # In an intermediary step we set these nonterminal arcs (which has the
# same structure as arcs). These don't contain terminals anymore. # same structure as arcs). These don't contain terminals anymore.
self.nonterminal_arcs = {} self.nonterminal_arcs: Mapping[str, DFAState] = {}
# Transitions are basically the only thing that the parser is using # Transitions are basically the only thing that the parser is using
# with is_final. Everything else is purely here to create a parser. # with is_final. Everything else is purely here to create a parser.
self.transitions = {} #: Dict[Union[TokenType, ReservedString], DFAPlan] self.transitions: Mapping[Union[_TokenTypeT, ReservedString], DFAPlan] = {}
self.is_final = final in nfa_set self.is_final = final in nfa_set
def add_arc(self, next_, label): def add_arc(self, next_, label):
@@ -111,22 +118,20 @@ class DFAState(object):
return False return False
return True return True
__hash__ = None # For Py3 compatibility.
def __repr__(self): def __repr__(self):
return '<%s: %s is_final=%s>' % ( return '<%s: %s is_final=%s>' % (
self.__class__.__name__, self.from_rule, self.is_final self.__class__.__name__, self.from_rule, self.is_final
) )
class ReservedString(object): class ReservedString:
""" """
Most grammars will have certain keywords and operators that are mentioned Most grammars will have certain keywords and operators that are mentioned
in the grammar as strings (e.g. "if") and not token types (e.g. NUMBER). in the grammar as strings (e.g. "if") and not token types (e.g. NUMBER).
This class basically is the former. This class basically is the former.
""" """
def __init__(self, value): def __init__(self, value: str):
self.value = value self.value = value
def __repr__(self): def __repr__(self):
@@ -149,7 +154,6 @@ def _simplify_dfas(dfas):
for j in range(i + 1, len(dfas)): for j in range(i + 1, len(dfas)):
state_j = dfas[j] state_j = dfas[j]
if state_i == state_j: if state_i == state_j:
#print " unify", i, j
del dfas[j] del dfas[j]
for state in dfas: for state in dfas:
state.unifystate(state_j, state_i) state.unifystate(state_j, state_i)
@@ -212,7 +216,8 @@ def _dump_nfa(start, finish):
todo = [start] todo = [start]
for i, state in enumerate(todo): for i, state in enumerate(todo):
print(" State", i, state is finish and "(final)" or "") print(" State", i, state is finish and "(final)" or "")
for label, next_ in state.arcs: for arc in state.arcs:
label, next_ = arc.nonterminal_or_string, arc.next
if next_ in todo: if next_ in todo:
j = todo.index(next_) j = todo.index(next_)
else: else:
@@ -232,7 +237,7 @@ def _dump_dfas(dfas):
print(" %s -> %d" % (nonterminal, dfas.index(next_))) print(" %s -> %d" % (nonterminal, dfas.index(next_)))
def generate_grammar(bnf_grammar, token_namespace): def generate_grammar(bnf_grammar: str, token_namespace) -> Grammar:
""" """
``bnf_grammar`` is a grammar in extended BNF (using * for repetition, + for ``bnf_grammar`` is a grammar in extended BNF (using * for repetition, + for
at-least-once repetition, [] for optional parts, | for alternatives and () at-least-once repetition, [] for optional parts, | for alternatives and ()
@@ -244,19 +249,19 @@ def generate_grammar(bnf_grammar, token_namespace):
rule_to_dfas = {} rule_to_dfas = {}
start_nonterminal = None start_nonterminal = None
for nfa_a, nfa_z in GrammarParser(bnf_grammar).parse(): for nfa_a, nfa_z in GrammarParser(bnf_grammar).parse():
#_dump_nfa(a, z) # _dump_nfa(nfa_a, nfa_z)
dfas = _make_dfas(nfa_a, nfa_z) dfas = _make_dfas(nfa_a, nfa_z)
#_dump_dfas(dfas) # _dump_dfas(dfas)
# oldlen = len(dfas) # oldlen = len(dfas)
_simplify_dfas(dfas) _simplify_dfas(dfas)
# newlen = len(dfas) # newlen = len(dfas)
rule_to_dfas[nfa_a.from_rule] = dfas rule_to_dfas[nfa_a.from_rule] = dfas
#print(nfa_a.from_rule, oldlen, newlen) # print(nfa_a.from_rule, oldlen, newlen)
if start_nonterminal is None: if start_nonterminal is None:
start_nonterminal = nfa_a.from_rule start_nonterminal = nfa_a.from_rule
reserved_strings = {} reserved_strings: Mapping[str, ReservedString] = {}
for nonterminal, dfas in rule_to_dfas.items(): for nonterminal, dfas in rule_to_dfas.items():
for dfa_state in dfas: for dfa_state in dfas:
for terminal_or_nonterminal, next_dfa in dfa_state.arcs.items(): for terminal_or_nonterminal, next_dfa in dfa_state.arcs.items():
@@ -271,7 +276,7 @@ def generate_grammar(bnf_grammar, token_namespace):
dfa_state.transitions[transition] = DFAPlan(next_dfa) dfa_state.transitions[transition] = DFAPlan(next_dfa)
_calculate_tree_traversal(rule_to_dfas) _calculate_tree_traversal(rule_to_dfas)
return Grammar(start_nonterminal, rule_to_dfas, reserved_strings) return Grammar(start_nonterminal, rule_to_dfas, reserved_strings) # type: ignore
def _make_transition(token_namespace, reserved_syntax_strings, label): def _make_transition(token_namespace, reserved_syntax_strings, label):
@@ -309,13 +314,39 @@ def _calculate_tree_traversal(nonterminal_to_dfas):
_calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal) _calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal)
# Now that we have calculated the first terminals, we are sure that # Now that we have calculated the first terminals, we are sure that
# there is no left recursion or ambiguities. # there is no left recursion.
for dfas in nonterminal_to_dfas.values(): for dfas in nonterminal_to_dfas.values():
for dfa_state in dfas: for dfa_state in dfas:
transitions = dfa_state.transitions
for nonterminal, next_dfa in dfa_state.nonterminal_arcs.items(): for nonterminal, next_dfa in dfa_state.nonterminal_arcs.items():
for transition, pushes in first_plans[nonterminal].items(): for transition, pushes in first_plans[nonterminal].items():
dfa_state.transitions[transition] = DFAPlan(next_dfa, pushes) if transition in transitions:
prev_plan = transitions[transition]
# Make sure these are sorted so that error messages are
# at least deterministic
choices = sorted([
(
prev_plan.dfa_pushes[0].from_rule
if prev_plan.dfa_pushes
else prev_plan.next_dfa.from_rule
),
(
pushes[0].from_rule
if pushes else next_dfa.from_rule
),
])
raise ValueError(
"Rule %s is ambiguous; given a %s token, we "
"can't determine if we should evaluate %s or %s."
% (
(
dfa_state.from_rule,
transition,
) + tuple(choices)
)
)
transitions[transition] = DFAPlan(next_dfa, pushes)
def _calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal): def _calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal):
@@ -345,13 +376,6 @@ def _calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal):
raise ValueError("left recursion for rule %r" % nonterminal) raise ValueError("left recursion for rule %r" % nonterminal)
for t, pushes in first_plans2.items(): for t, pushes in first_plans2.items():
check = new_first_plans.get(t)
if check is not None:
raise ValueError(
"Rule %s is ambiguous; %s is the"
" start of the rule %s as well as %s."
% (nonterminal, t, nonterminal2, check[-1].from_rule)
)
new_first_plans[t] = [next_] + pushes new_first_plans[t] = [next_] + pushes
first_plans[nonterminal] = new_first_plans first_plans[nonterminal] = new_first_plans
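Review note on the ambiguity check moved into `_calculate_tree_traversal`: the new code raises when two nonterminals reachable from the same DFA state can start with the same token. The sketch below shows that check on plain dictionaries. `add_first_plans` and its data layout are assumptions made for illustration and are much simpler than the `DFAState`/`DFAPlan` objects in the diff.

```python
def add_first_plans(rule_name, transitions, nonterminal_arcs, first_plans):
    """
    Merge the first tokens of each reachable nonterminal into transitions.

    transitions: dict mapping token -> nonterminal chosen for that token
    nonterminal_arcs: nonterminal names reachable from this state
    first_plans: dict mapping nonterminal -> set of tokens that can start it
    """
    for nonterminal in nonterminal_arcs:
        for token in sorted(first_plans[nonterminal]):
            if token in transitions:
                # Sort the candidates so the message is deterministic.
                choices = sorted([transitions[token], nonterminal])
                raise ValueError(
                    "Rule %s is ambiguous; given a %s token, we "
                    "can't determine if we should evaluate %s or %s."
                    % ((rule_name, token) + tuple(choices))
                )
            transitions[token] = nonterminal
    return transitions


# Hypothetical first sets: rules 'a' and 'c' both start with NAME.
FIRST_PLANS = {'a': {'NAME'}, 'b': {'NUMBER'}, 'c': {'NAME'}}
```

Checking at the point where transitions are assigned catches every collision on a state, whereas the removed check in `_calculate_first_plans` only compared the first sets gathered for a single rule.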
+28 -24
@@ -4,25 +4,49 @@
# Modifications: # Modifications:
# Copyright David Halter and Contributors # Copyright David Halter and Contributors
# Modifications are dual-licensed: MIT and PSF. # Modifications are dual-licensed: MIT and PSF.
from typing import Optional, Iterator, Tuple, List
from parso.python.tokenize import tokenize from parso.python.tokenize import tokenize
from parso.utils import parse_version_string from parso.utils import parse_version_string
from parso.python.token import PythonTokenTypes from parso.python.token import PythonTokenTypes
class GrammarParser(): class NFAArc:
def __init__(self, next_: 'NFAState', nonterminal_or_string: Optional[str]):
self.next: NFAState = next_
self.nonterminal_or_string: Optional[str] = nonterminal_or_string
def __repr__(self):
return '<%s: %s>' % (self.__class__.__name__, self.nonterminal_or_string)
class NFAState:
def __init__(self, from_rule: str):
self.from_rule: str = from_rule
self.arcs: List[NFAArc] = []
def add_arc(self, next_, nonterminal_or_string=None):
assert nonterminal_or_string is None or isinstance(nonterminal_or_string, str)
assert isinstance(next_, NFAState)
self.arcs.append(NFAArc(next_, nonterminal_or_string))
def __repr__(self):
return '<%s: from %s>' % (self.__class__.__name__, self.from_rule)
class GrammarParser:
""" """
The parser for Python grammar files. The parser for Python grammar files.
""" """
def __init__(self, bnf_grammar): def __init__(self, bnf_grammar: str):
self._bnf_grammar = bnf_grammar self._bnf_grammar = bnf_grammar
self.generator = tokenize( self.generator = tokenize(
bnf_grammar, bnf_grammar,
version_info=parse_version_string('3.6') version_info=parse_version_string('3.9')
) )
self._gettoken() # Initialize lookahead self._gettoken() # Initialize lookahead
def parse(self): def parse(self) -> Iterator[Tuple[NFAState, NFAState]]:
# grammar: (NEWLINE | rule)* ENDMARKER # grammar: (NEWLINE | rule)* ENDMARKER
while self.type != PythonTokenTypes.ENDMARKER: while self.type != PythonTokenTypes.ENDMARKER:
while self.type == PythonTokenTypes.NEWLINE: while self.type == PythonTokenTypes.NEWLINE:
@@ -134,23 +158,3 @@ class GrammarParser():
line = self._bnf_grammar.splitlines()[self.begin[0] - 1] line = self._bnf_grammar.splitlines()[self.begin[0] - 1]
raise SyntaxError(msg, ('<grammar>', self.begin[0], raise SyntaxError(msg, ('<grammar>', self.begin[0],
self.begin[1], line)) self.begin[1], line))
class NFAArc(object):
def __init__(self, next_, nonterminal_or_string):
self.next = next_
self.nonterminal_or_string = nonterminal_or_string
class NFAState(object):
def __init__(self, from_rule):
self.from_rule = from_rule
self.arcs = [] # List[nonterminal (str), NFAState]
def add_arc(self, next_, nonterminal_or_string=None):
assert nonterminal_or_string is None or isinstance(nonterminal_or_string, str)
assert isinstance(next_, NFAState)
self.arcs.append(NFAArc(next_, nonterminal_or_string))
def __repr__(self):
return '<%s: from %s>' % (self.__class__.__name__, self.from_rule)
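To make the two classes in this file concrete, here is a small sketch that copies them from the new side of the diff and wires up an NFA by hand. The rule `greeting: 'hello' [NAME]` and the arc labels are made up for illustration; in parso the states are built by `GrammarParser.parse()`, not manually.

```python
from typing import List, Optional


class NFAArc:
    def __init__(self, next_: 'NFAState', nonterminal_or_string: Optional[str]):
        self.next = next_
        self.nonterminal_or_string = nonterminal_or_string

    def __repr__(self):
        return '<%s: %s>' % (self.__class__.__name__, self.nonterminal_or_string)


class NFAState:
    def __init__(self, from_rule: str):
        self.from_rule = from_rule
        self.arcs: List[NFAArc] = []

    def add_arc(self, next_, nonterminal_or_string=None):
        assert nonterminal_or_string is None or isinstance(nonterminal_or_string, str)
        assert isinstance(next_, NFAState)
        self.arcs.append(NFAArc(next_, nonterminal_or_string))

    def __repr__(self):
        return '<%s: from %s>' % (self.__class__.__name__, self.from_rule)


# Hand-built NFA for the hypothetical rule  greeting: 'hello' [NAME]
start = NFAState('greeting')
middle = NFAState('greeting')
end = NFAState('greeting')
start.add_arc(middle, "'hello'")  # labelled arc: consumes the string
middle.add_arc(end, 'NAME')       # labelled arc: consumes a NAME token
middle.add_arc(end)               # unlabelled (epsilon) arc: NAME is optional

labels = [arc.nonterminal_or_string for arc in middle.arcs]
```

An arc whose label is `None` is an epsilon transition, which is why `nonterminal_or_string` is annotated as `Optional[str]`.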
+293 -123
@@ -1,9 +1,29 @@
""" """
Basically a contains parser that is faster, because it tries to parse only The diff parser is trying to be a faster version of the normal parser by trying
parts and if anything changes, it only reparses the changed parts. to reuse the nodes of a previous pass over the same file. This is also called
incremental parsing in parser literature. The difference is mostly that with
incremental parsing you get a range that needs to be reparsed. Here we
calculate that range ourselves by using difflib. After that it's essentially
incremental parsing.
It works with a simple diff in the beginning and will try to reuse old parser The biggest issue of this approach is that we reuse nodes in a mutable way. The
fragments. initial design and idea is quite problematic for this parser, but it is also
pretty fast. Measurements showed that just copying nodes in Python is simply
quite a bit slower (especially for big files >3 kLOC). Therefore we did not
want to get rid of the mutable nodes, since this is usually not an issue.
This is by far the hardest software I ever wrote, exactly because the initial
design is crappy. When you have to account for a lot of mutable state, it
creates a ton of issues that you would otherwise not have. This file took
probably 3-6 months to write, which is insane for a parser.
There is a fuzzer that helps test this whole thing. Please use it if you
make changes here. If you run the fuzzer like::
test/fuzz_diff_parser.py random -n 100000
you can be pretty sure that everything is still fine. I sometimes run the
fuzzer up to 24h to make sure everything is still ok.
""" """
import re import re
import difflib import difflib
@@ -13,7 +33,7 @@ import logging
from parso.utils import split_lines from parso.utils import split_lines
from parso.python.parser import Parser from parso.python.parser import Parser
from parso.python.tree import EndMarker from parso.python.tree import EndMarker
from parso.python.tokenize import PythonToken from parso.python.tokenize import PythonToken, BOM_UTF8_STRING
from parso.python.token import PythonTokenTypes from parso.python.token import PythonTokenTypes
LOG = logging.getLogger(__name__) LOG = logging.getLogger(__name__)
@@ -21,21 +41,37 @@ DEBUG_DIFF_PARSER = False
_INDENTATION_TOKENS = 'INDENT', 'ERROR_DEDENT', 'DEDENT' _INDENTATION_TOKENS = 'INDENT', 'ERROR_DEDENT', 'DEDENT'
NEWLINE = PythonTokenTypes.NEWLINE
DEDENT = PythonTokenTypes.DEDENT
NAME = PythonTokenTypes.NAME
ERROR_DEDENT = PythonTokenTypes.ERROR_DEDENT
ENDMARKER = PythonTokenTypes.ENDMARKER
def _is_indentation_error_leaf(node):
return node.type == 'error_leaf' and node.token_type in _INDENTATION_TOKENS
def _get_previous_leaf_if_indentation(leaf): def _get_previous_leaf_if_indentation(leaf):
while leaf and leaf.type == 'error_leaf' \ while leaf and _is_indentation_error_leaf(leaf):
and leaf.token_type in _INDENTATION_TOKENS:
leaf = leaf.get_previous_leaf() leaf = leaf.get_previous_leaf()
return leaf return leaf
def _get_next_leaf_if_indentation(leaf): def _get_next_leaf_if_indentation(leaf):
while leaf and leaf.type == 'error_leaf' \ while leaf and _is_indentation_error_leaf(leaf):
and leaf.token_type in _INDENTATION_TOKENS: leaf = leaf.get_next_leaf()
leaf = leaf.get_previous_leaf()
return leaf return leaf
def _get_suite_indentation(tree_node):
return _get_indentation(tree_node.children[1])
def _get_indentation(tree_node):
return tree_node.start_pos[1]
def _assert_valid_graph(node): def _assert_valid_graph(node):
""" """
Checks if the parent/children relationship is correct. Checks if the parent/children relationship is correct.
@@ -70,6 +106,10 @@ def _assert_valid_graph(node):
actual = line, len(splitted[-1]) actual = line, len(splitted[-1])
else: else:
actual = previous_start_pos[0], previous_start_pos[1] + len(content) actual = previous_start_pos[0], previous_start_pos[1] + len(content)
if content.startswith(BOM_UTF8_STRING) \
and node.get_start_pos_of_prefix() == (1, 0):
# Remove the byte order mark
actual = actual[0], actual[1] - 1
assert node.start_pos == actual, (node.start_pos, actual) assert node.start_pos == actual, (node.start_pos, actual)
else: else:
@@ -78,6 +118,26 @@ def _assert_valid_graph(node):
_assert_valid_graph(child) _assert_valid_graph(child)
def _assert_nodes_are_equal(node1, node2):
try:
children1 = node1.children
except AttributeError:
assert not hasattr(node2, 'children'), (node1, node2)
assert node1.value == node2.value, (node1, node2)
assert node1.type == node2.type, (node1, node2)
assert node1.prefix == node2.prefix, (node1, node2)
assert node1.start_pos == node2.start_pos, (node1, node2)
return
else:
try:
children2 = node2.children
except AttributeError:
assert False, (node1, node2)
for n1, n2 in zip(children1, children2):
_assert_nodes_are_equal(n1, n2)
assert len(children1) == len(children2), '\n' + repr(children1) + '\n' + repr(children2)
def _get_debug_error_message(module, old_lines, new_lines): def _get_debug_error_message(module, old_lines, new_lines):
current_lines = split_lines(module.get_code(), keepends=True) current_lines = split_lines(module.get_code(), keepends=True)
current_diff = difflib.unified_diff(new_lines, current_lines) current_diff = difflib.unified_diff(new_lines, current_lines)
@@ -95,6 +155,15 @@ def _get_last_line(node_or_leaf):
if _ends_with_newline(last_leaf): if _ends_with_newline(last_leaf):
return last_leaf.start_pos[0] return last_leaf.start_pos[0]
else: else:
n = last_leaf.get_next_leaf()
if n.type == 'endmarker' and '\n' in n.prefix:
# This is a very special case that has to do with error recovery in
# Parso. Sometimes there is no newline leaf at the end. The grammar
# requires one, but it is not actually needed before the endmarker:
# CPython just adds a newline to make the source code pass the
# parser. To account for that, Parso's error recovery allows
# small_stmt instead of simple_stmt.
return last_leaf.end_pos[0] + 1
return last_leaf.end_pos[0] return last_leaf.end_pos[0]
@@ -178,7 +247,7 @@ def _update_positions(nodes, line_offset, last_leaf):
_update_positions(children, line_offset, last_leaf) _update_positions(children, line_offset, last_leaf)
class DiffParser(object): class DiffParser:
""" """
An advanced form of parsing a file faster. Unfortunately comes with huge An advanced form of parsing a file faster. Unfortunately comes with huge
side effects. It changes the given module. side effects. It changes the given module.
@@ -233,7 +302,7 @@ class DiffParser(object):
if operation == 'equal': if operation == 'equal':
line_offset = j1 - i1 line_offset = j1 - i1
self._copy_from_old_parser(line_offset, i2, j2) self._copy_from_old_parser(line_offset, i1 + 1, i2, j2)
elif operation == 'replace': elif operation == 'replace':
self._parse(until_line=j2) self._parse(until_line=j2)
elif operation == 'insert': elif operation == 'insert':
@@ -249,8 +318,14 @@ class DiffParser(object):
# If there is reasonable suspicion that the diff parser is not # If there is reasonable suspicion that the diff parser is not
# behaving well, this should be enabled. # behaving well, this should be enabled.
try: try:
assert self._module.get_code() == ''.join(new_lines) code = ''.join(new_lines)
assert self._module.get_code() == code
_assert_valid_graph(self._module) _assert_valid_graph(self._module)
without_diff_parser_module = Parser(
self._pgen_grammar,
error_recovery=True
).parse(self._tokenizer(new_lines))
_assert_nodes_are_equal(self._module, without_diff_parser_module)
except AssertionError: except AssertionError:
print(_get_debug_error_message(self._module, old_lines, new_lines)) print(_get_debug_error_message(self._module, old_lines, new_lines))
raise raise
@@ -268,7 +343,7 @@ class DiffParser(object):
if self._module.get_code() != ''.join(lines_new): if self._module.get_code() != ''.join(lines_new):
LOG.warning('parser issue:\n%s\n%s', ''.join(old_lines), ''.join(lines_new)) LOG.warning('parser issue:\n%s\n%s', ''.join(old_lines), ''.join(lines_new))
def _copy_from_old_parser(self, line_offset, until_line_old, until_line_new): def _copy_from_old_parser(self, line_offset, start_line_old, until_line_old, until_line_new):
last_until_line = -1 last_until_line = -1
while until_line_new > self._nodes_tree.parsed_until_line: while until_line_new > self._nodes_tree.parsed_until_line:
parsed_until_line_old = self._nodes_tree.parsed_until_line - line_offset parsed_until_line_old = self._nodes_tree.parsed_until_line - line_offset
@@ -282,12 +357,18 @@ class DiffParser(object):
p_children = line_stmt.parent.children p_children = line_stmt.parent.children
index = p_children.index(line_stmt) index = p_children.index(line_stmt)
from_ = self._nodes_tree.parsed_until_line + 1 if start_line_old == 1 \
copied_nodes = self._nodes_tree.copy_nodes( and p_children[0].get_first_leaf().prefix.startswith(BOM_UTF8_STRING):
p_children[index:], # If there's a BOM in the beginning, just reparse. It's too
until_line_old, # complicated to account for it otherwise.
line_offset copied_nodes = []
) else:
from_ = self._nodes_tree.parsed_until_line + 1
copied_nodes = self._nodes_tree.copy_nodes(
p_children[index:],
until_line_old,
line_offset
)
# Match all the nodes that are in the wanted range. # Match all the nodes that are in the wanted range.
if copied_nodes: if copied_nodes:
self._copy_count += 1 self._copy_count += 1
@@ -333,7 +414,10 @@ class DiffParser(object):
node = self._try_parse_part(until_line) node = self._try_parse_part(until_line)
nodes = node.children nodes = node.children
self._nodes_tree.add_parsed_nodes(nodes) self._nodes_tree.add_parsed_nodes(nodes, self._keyword_token_indents)
if self._replace_tos_indent is not None:
self._nodes_tree.indents[-1] = self._replace_tos_indent
LOG.debug( LOG.debug(
'parse_part from %s to %s (to %s in part parser)', 'parse_part from %s to %s (to %s in part parser)',
nodes[0].get_start_pos_of_prefix()[0], nodes[0].get_start_pos_of_prefix()[0],
@@ -369,34 +453,39 @@ class DiffParser(object):
return self._active_parser.parse(tokens=tokens) return self._active_parser.parse(tokens=tokens)
def _diff_tokenize(self, lines, until_line, line_offset=0): def _diff_tokenize(self, lines, until_line, line_offset=0):
is_first_token = True was_newline = False
omitted_first_indent = False indents = self._nodes_tree.indents
indents = [] initial_indentation_count = len(indents)
tokens = self._tokenizer(lines, (1, 0))
stack = self._active_parser.stack
for typ, string, start_pos, prefix in tokens:
start_pos = start_pos[0] + line_offset, start_pos[1]
if typ == PythonTokenTypes.INDENT:
indents.append(start_pos[1])
if is_first_token:
omitted_first_indent = True
# We want to get rid of indents that are only here because
# we only parse part of the file. These indents would only
# get parsed as error leafs, which doesn't make any sense.
is_first_token = False
continue
is_first_token = False
# In case of omitted_first_indent, it might not be dedented fully. tokens = self._tokenizer(
# However this is a sign for us that a dedent happened. lines,
if typ == PythonTokenTypes.DEDENT \ start_pos=(line_offset + 1, 0),
or typ == PythonTokenTypes.ERROR_DEDENT \ indents=indents,
and omitted_first_indent and len(indents) == 1: is_first_token=line_offset == 0,
indents.pop() )
if omitted_first_indent and not indents: stack = self._active_parser.stack
self._replace_tos_indent = None
self._keyword_token_indents = {}
# print('start', line_offset + 1, indents)
for token in tokens:
# print(token, indents)
typ = token.type
if typ == DEDENT:
if len(indents) < initial_indentation_count:
# We are done here, only thing that can come now is an # We are done here, only thing that can come now is an
# endmarker or another dedented code block. # endmarker or another dedented code block.
typ, string, start_pos, prefix = next(tokens) while True:
typ, string, start_pos, prefix = token = next(tokens)
if typ in (DEDENT, ERROR_DEDENT):
if typ == ERROR_DEDENT:
# We want to force an error dedent in the next
# parser/pass. To make this possible we just
# increase the location by one.
self._replace_tos_indent = start_pos[1] + 1
pass
else:
break
if '\n' in prefix or '\r' in prefix: if '\n' in prefix or '\r' in prefix:
prefix = re.sub(r'[^\n\r]+\Z', '', prefix) prefix = re.sub(r'[^\n\r]+\Z', '', prefix)
else: else:
@@ -404,36 +493,38 @@ class DiffParser(object):
if start_pos[1] - len(prefix) == 0: if start_pos[1] - len(prefix) == 0:
prefix = '' prefix = ''
yield PythonToken( yield PythonToken(
PythonTokenTypes.ENDMARKER, '', ENDMARKER, '',
(start_pos[0] + line_offset, 0), start_pos,
prefix prefix
) )
break break
elif typ == PythonTokenTypes.NEWLINE and start_pos[0] >= until_line: elif typ == NEWLINE and token.start_pos[0] >= until_line:
yield PythonToken(typ, string, start_pos, prefix) was_newline = True
# Check if the parser is actually in a valid suite state. elif was_newline:
if _suite_or_file_input_is_valid(self._pgen_grammar, stack): was_newline = False
start_pos = start_pos[0] + 1, 0 if len(indents) == initial_indentation_count:
while len(indents) > int(omitted_first_indent): # Check if the parser is actually in a valid suite state.
indents.pop() if _suite_or_file_input_is_valid(self._pgen_grammar, stack):
yield PythonToken(PythonTokenTypes.DEDENT, '', start_pos, '') yield PythonToken(ENDMARKER, '', token.start_pos, '')
break
yield PythonToken(PythonTokenTypes.ENDMARKER, '', start_pos, '') if typ == NAME and token.string in ('class', 'def'):
break self._keyword_token_indents[token.start_pos] = list(indents)
else:
continue
yield PythonToken(typ, string, start_pos, prefix) yield token
class _NodesTreeNode(object): class _NodesTreeNode:
_ChildrenGroup = namedtuple('_ChildrenGroup', 'prefix children line_offset last_line_offset_leaf') _ChildrenGroup = namedtuple(
'_ChildrenGroup',
'prefix children line_offset last_line_offset_leaf')
def __init__(self, tree_node, parent=None): def __init__(self, tree_node, parent=None, indentation=0):
self.tree_node = tree_node self.tree_node = tree_node
self._children_groups = [] self._children_groups = []
self.parent = parent self.parent = parent
self._node_children = [] self._node_children = []
self.indentation = indentation
def finish(self): def finish(self):
children = [] children = []
@@ -461,10 +552,13 @@ class _NodesTreeNode(object):
def add_child_node(self, child_node): def add_child_node(self, child_node):
self._node_children.append(child_node) self._node_children.append(child_node)
def add_tree_nodes(self, prefix, children, line_offset=0, last_line_offset_leaf=None): def add_tree_nodes(self, prefix, children, line_offset=0,
last_line_offset_leaf=None):
if last_line_offset_leaf is None: if last_line_offset_leaf is None:
last_line_offset_leaf = children[-1].get_last_leaf() last_line_offset_leaf = children[-1].get_last_leaf()
group = self._ChildrenGroup(prefix, children, line_offset, last_line_offset_leaf) group = self._ChildrenGroup(
prefix, children, line_offset, last_line_offset_leaf
)
self._children_groups.append(group) self._children_groups.append(group)
def get_last_line(self, suffix): def get_last_line(self, suffix):
@@ -491,42 +585,30 @@ class _NodesTreeNode(object):
return max(line, self._node_children[-1].get_last_line(suffix)) return max(line, self._node_children[-1].get_last_line(suffix))
return line return line
def __repr__(self):
return '<%s: %s>' % (self.__class__.__name__, self.tree_node)
class _NodesTree(object):
class _NodesTree:
def __init__(self, module): def __init__(self, module):
self._base_node = _NodesTreeNode(module) self._base_node = _NodesTreeNode(module)
self._working_stack = [self._base_node] self._working_stack = [self._base_node]
self._module = module self._module = module
self._prefix_remainder = '' self._prefix_remainder = ''
self.prefix = '' self.prefix = ''
self.indents = [0]
@property @property
def parsed_until_line(self): def parsed_until_line(self):
return self._working_stack[-1].get_last_line(self.prefix) return self._working_stack[-1].get_last_line(self.prefix)
def _get_insertion_node(self, indentation_node): def _update_insertion_node(self, indentation):
indentation = indentation_node.start_pos[1] for node in reversed(list(self._working_stack)):
if node.indentation < indentation or node is self._working_stack[0]:
# find insertion node
while True:
node = self._working_stack[-1]
tree_node = node.tree_node
if tree_node.type == 'suite':
# A suite starts with NEWLINE, ...
node_indentation = tree_node.children[1].start_pos[1]
if indentation >= node_indentation: # Not a Dedent
# We might be at the most outer layer: modules. We
# don't want to depend on the first statement
# having the right indentation.
return node
elif tree_node.type == 'file_input':
return node return node
self._working_stack.pop() self._working_stack.pop()
def add_parsed_nodes(self, tree_nodes): def add_parsed_nodes(self, tree_nodes, keyword_token_indents):
old_prefix = self.prefix old_prefix = self.prefix
tree_nodes = self._remove_endmarker(tree_nodes) tree_nodes = self._remove_endmarker(tree_nodes)
if not tree_nodes: if not tree_nodes:
@@ -535,23 +617,27 @@ class _NodesTree(object):
assert tree_nodes[0].type != 'newline' assert tree_nodes[0].type != 'newline'
node = self._get_insertion_node(tree_nodes[0]) node = self._update_insertion_node(tree_nodes[0].start_pos[1])
assert node.tree_node.type in ('suite', 'file_input') assert node.tree_node.type in ('suite', 'file_input')
node.add_tree_nodes(old_prefix, tree_nodes) node.add_tree_nodes(old_prefix, tree_nodes)
# tos = Top of stack # tos = Top of stack
self._update_tos(tree_nodes[-1]) self._update_parsed_node_tos(tree_nodes[-1], keyword_token_indents)
def _update_tos(self, tree_node): def _update_parsed_node_tos(self, tree_node, keyword_token_indents):
if tree_node.type in ('suite', 'file_input'): if tree_node.type == 'suite':
new_tos = _NodesTreeNode(tree_node) def_leaf = tree_node.parent.children[0]
new_tos = _NodesTreeNode(
tree_node,
indentation=keyword_token_indents[def_leaf.start_pos][-1],
)
new_tos.add_tree_nodes('', list(tree_node.children)) new_tos.add_tree_nodes('', list(tree_node.children))
self._working_stack[-1].add_child_node(new_tos) self._working_stack[-1].add_child_node(new_tos)
self._working_stack.append(new_tos) self._working_stack.append(new_tos)
self._update_tos(tree_node.children[-1]) self._update_parsed_node_tos(tree_node.children[-1], keyword_token_indents)
elif _func_or_class_has_suite(tree_node): elif _func_or_class_has_suite(tree_node):
self._update_tos(tree_node.children[-1]) self._update_parsed_node_tos(tree_node.children[-1], keyword_token_indents)
def _remove_endmarker(self, tree_nodes): def _remove_endmarker(self, tree_nodes):
""" """
@@ -561,7 +647,8 @@ class _NodesTree(object):
is_endmarker = last_leaf.type == 'endmarker' is_endmarker = last_leaf.type == 'endmarker'
self._prefix_remainder = '' self._prefix_remainder = ''
if is_endmarker: if is_endmarker:
separation = max(last_leaf.prefix.rfind('\n'), last_leaf.prefix.rfind('\r')) prefix = last_leaf.prefix
separation = max(prefix.rfind('\n'), prefix.rfind('\r'))
if separation > -1: if separation > -1:
# Remove the whitespace part of the prefix after a newline. # Remove the whitespace part of the prefix after a newline.
# That is not relevant if parentheses were opened. Always parse # That is not relevant if parentheses were opened. Always parse
@@ -577,6 +664,26 @@ class _NodesTree(object):
tree_nodes = tree_nodes[:-1] tree_nodes = tree_nodes[:-1]
return tree_nodes return tree_nodes
def _get_matching_indent_nodes(self, tree_nodes, is_new_suite):
# There might be a random dedent where we have to stop copying.
# Invalid indents are ok, because the parser handled that
# properly before. An invalid dedent can happen, because a few
# lines above there was an invalid indent.
node_iterator = iter(tree_nodes)
if is_new_suite:
yield next(node_iterator)
first_node = next(node_iterator)
indent = _get_indentation(first_node)
if not is_new_suite and indent not in self.indents:
return
yield first_node
for n in node_iterator:
if _get_indentation(n) != indent:
return
yield n
def copy_nodes(self, tree_nodes, until_line, line_offset): def copy_nodes(self, tree_nodes, until_line, line_offset):
""" """
Copies tree nodes from the old parser tree. Copies tree nodes from the old parser tree.
@@ -588,19 +695,38 @@ class _NodesTree(object):
# issues. # issues.
return [] return []
self._get_insertion_node(tree_nodes[0]) indentation = _get_indentation(tree_nodes[0])
old_working_stack = list(self._working_stack)
old_prefix = self.prefix
old_indents = self.indents
self.indents = [i for i in self.indents if i <= indentation]
new_nodes, self._working_stack, self.prefix = self._copy_nodes( self._update_insertion_node(indentation)
new_nodes, self._working_stack, self.prefix, added_indents = self._copy_nodes(
list(self._working_stack), list(self._working_stack),
tree_nodes, tree_nodes,
until_line, until_line,
line_offset, line_offset,
self.prefix, self.prefix,
) )
if new_nodes:
self.indents += added_indents
else:
self._working_stack = old_working_stack
self.prefix = old_prefix
self.indents = old_indents
return new_nodes return new_nodes
def _copy_nodes(self, working_stack, nodes, until_line, line_offset, prefix=''): def _copy_nodes(self, working_stack, nodes, until_line, line_offset,
prefix='', is_nested=False):
new_nodes = [] new_nodes = []
added_indents = []
nodes = list(self._get_matching_indent_nodes(
nodes,
is_new_suite=is_nested,
))
new_prefix = '' new_prefix = ''
for node in nodes: for node in nodes:
@@ -620,26 +746,83 @@ class _NodesTree(object):
if _func_or_class_has_suite(node): if _func_or_class_has_suite(node):
new_nodes.append(node) new_nodes.append(node)
break break
try:
c = node.children
except AttributeError:
pass
else:
# This case basically appears with error recovery of one line
# suites like `def foo(): bar.-`. In this case we might not
# include a newline in the statement and we need to take care
# of that.
n = node
if n.type == 'decorated':
n = n.children[-1]
if n.type in ('async_funcdef', 'async_stmt'):
n = n.children[-1]
if n.type in ('classdef', 'funcdef'):
suite_node = n.children[-1]
else:
suite_node = c[-1]
if suite_node.type in ('error_leaf', 'error_node'):
break
new_nodes.append(node) new_nodes.append(node)
# Pop error nodes at the end from the list
if new_nodes:
while new_nodes:
last_node = new_nodes[-1]
if (last_node.type in ('error_leaf', 'error_node')
or _is_flow_node(new_nodes[-1])):
# Error leafs/nodes don't have a defined start/end. Error
# nodes might not end with a newline (e.g. if there's an
# open `(`). Therefore ignore all of them unless they are
# succeeded with valid parser state.
# If we copy flows at the end, they might be continued
# after the copy limit (in the new parser).
# In this while loop we try to remove until we find a newline.
new_prefix = ''
new_nodes.pop()
while new_nodes:
last_node = new_nodes[-1]
if last_node.get_last_leaf().type == 'newline':
break
new_nodes.pop()
continue
if len(new_nodes) > 1 and new_nodes[-2].type == 'error_node':
# The problem here is that Parso error recovery sometimes
# influences nodes before this node.
# Since the new last node is an error node this will get
# cleaned up in the next while iteration.
new_nodes.pop()
continue
break
if not new_nodes: if not new_nodes:
return [], working_stack, prefix return [], working_stack, prefix, added_indents
tos = working_stack[-1] tos = working_stack[-1]
last_node = new_nodes[-1] last_node = new_nodes[-1]
had_valid_suite_last = False had_valid_suite_last = False
# Pop incomplete suites from the list
if _func_or_class_has_suite(last_node): if _func_or_class_has_suite(last_node):
suite = last_node suite = last_node
while suite.type != 'suite': while suite.type != 'suite':
suite = suite.children[-1] suite = suite.children[-1]
suite_tos = _NodesTreeNode(suite) indent = _get_suite_indentation(suite)
added_indents.append(indent)
suite_tos = _NodesTreeNode(suite, indentation=_get_indentation(last_node))
# Don't need to pass line_offset here, it's already done by the # Don't need to pass line_offset here, it's already done by the
# parent. # parent.
suite_nodes, new_working_stack, new_prefix = self._copy_nodes( suite_nodes, new_working_stack, new_prefix, ai = self._copy_nodes(
working_stack + [suite_tos], suite.children, until_line, line_offset working_stack + [suite_tos], suite.children, until_line, line_offset,
is_nested=True,
) )
added_indents += ai
if len(suite_nodes) < 2: if len(suite_nodes) < 2:
# A suite only with newline is not valid. # A suite only with newline is not valid.
new_nodes.pop() new_nodes.pop()
@@ -650,25 +833,6 @@ class _NodesTree(object):
working_stack = new_working_stack working_stack = new_working_stack
had_valid_suite_last = True had_valid_suite_last = True
if new_nodes:
last_node = new_nodes[-1]
if (last_node.type in ('error_leaf', 'error_node') or
_is_flow_node(new_nodes[-1])):
# Error leafs/nodes don't have a defined start/end. Error
# nodes might not end with a newline (e.g. if there's an
# open `(`). Therefore ignore all of them unless they are
# succeeded with valid parser state.
# If we copy flows at the end, they might be continued
# after the copy limit (in the new parser).
# In this while loop we try to remove until we find a newline.
new_prefix = ''
new_nodes.pop()
while new_nodes:
last_node = new_nodes[-1]
if last_node.get_last_leaf().type == 'newline':
break
new_nodes.pop()
if new_nodes: if new_nodes:
if not _ends_with_newline(new_nodes[-1].get_last_leaf()) and not had_valid_suite_last: if not _ends_with_newline(new_nodes[-1].get_last_leaf()) and not had_valid_suite_last:
p = new_nodes[-1].get_next_leaf().prefix p = new_nodes[-1].get_next_leaf().prefix
@@ -682,15 +846,19 @@ class _NodesTree(object):
last = new_nodes[-1] last = new_nodes[-1]
if last.type == 'decorated': if last.type == 'decorated':
last = last.children[-1] last = last.children[-1]
if last.type in ('async_funcdef', 'async_stmt'):
last = last.children[-1]
last_line_offset_leaf = last.children[-2].get_last_leaf() last_line_offset_leaf = last.children[-2].get_last_leaf()
assert last_line_offset_leaf == ':' assert last_line_offset_leaf == ':'
else: else:
last_line_offset_leaf = new_nodes[-1].get_last_leaf() last_line_offset_leaf = new_nodes[-1].get_last_leaf()
tos.add_tree_nodes(prefix, new_nodes, line_offset, last_line_offset_leaf) tos.add_tree_nodes(
prefix, new_nodes, line_offset, last_line_offset_leaf,
)
prefix = new_prefix prefix = new_prefix
self._prefix_remainder = '' self._prefix_remainder = ''
return new_nodes, working_stack, prefix return new_nodes, working_stack, prefix, added_indents
def close(self): def close(self):
self._base_node.finish() self._base_node.finish()
@@ -706,6 +874,8 @@ class _NodesTree(object):
lines = split_lines(self.prefix) lines = split_lines(self.prefix)
assert len(lines) > 0 assert len(lines) > 0
if len(lines) == 1: if len(lines) == 1:
if lines[0].startswith(BOM_UTF8_STRING) and end_pos == [1, 0]:
end_pos[1] -= 1
end_pos[1] += len(lines[0]) end_pos[1] += len(lines[0])
else: else:
end_pos[0] += len(lines) - 1 end_pos[0] += len(lines) - 1
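The module docstring in this file says that the range to reparse is calculated with difflib. The sketch below shows that step on its own, using only the standard library. The `plan` list and its `'copy'`/`'parse'` tuples are hypothetical stand-ins for the calls to `_copy_from_old_parser` and `_parse`; treating `'insert'` like `'replace'` and skipping `'delete'` are assumptions, since the diff shows only the `'equal'` and `'replace'` branches.

```python
import difflib

old_lines = ['a = 1\n', 'b = 2\n', 'c = 3\n']
new_lines = ['a = 1\n', 'b = 20\n', 'x = 0\n', 'c = 3\n']

matcher = difflib.SequenceMatcher(None, old_lines, new_lines)

plan = []
for operation, i1, i2, j1, j2 in matcher.get_opcodes():
    if operation == 'equal':
        # Unchanged lines: old nodes can be reused, shifted by line_offset.
        line_offset = j1 - i1
        plan.append(('copy', line_offset, i1 + 1, i2, j2))
    elif operation in ('replace', 'insert'):
        # Changed or new lines: parse up to line j2 of the new file.
        plan.append(('parse', j2))
    # 'delete': nothing to parse, the old lines simply disappear.
```

For this input the middle line is replaced by two lines, so the trailing unchanged line is copied with a line offset of 1.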
+425 -179
@@ -6,7 +6,7 @@ from contextlib import contextmanager
from parso.normalizer import Normalizer, NormalizerConfig, Issue, Rule from parso.normalizer import Normalizer, NormalizerConfig, Issue, Rule
from parso.python.tree import search_ancestor from parso.python.tree import search_ancestor
from parso.parser import ParserSyntaxError from parso.python.tokenize import _get_token_collection
_BLOCK_STMTS = ('if_stmt', 'while_stmt', 'for_stmt', 'try_stmt', 'with_stmt') _BLOCK_STMTS = ('if_stmt', 'while_stmt', 'for_stmt', 'try_stmt', 'with_stmt')
_STAR_EXPR_PARENTS = ('testlist_star_expr', 'testlist_comp', 'exprlist') _STAR_EXPR_PARENTS = ('testlist_star_expr', 'testlist_comp', 'exprlist')
@@ -14,9 +14,84 @@ _STAR_EXPR_PARENTS = ('testlist_star_expr', 'testlist_comp', 'exprlist')
_MAX_BLOCK_SIZE = 20 _MAX_BLOCK_SIZE = 20
_MAX_INDENT_COUNT = 100 _MAX_INDENT_COUNT = 100
ALLOWED_FUTURES = ( ALLOWED_FUTURES = (
'all_feature_names', 'nested_scopes', 'generators', 'division', 'nested_scopes', 'generators', 'division', 'absolute_import',
'absolute_import', 'with_statement', 'print_function', 'unicode_literals', 'with_statement', 'print_function', 'unicode_literals', 'generator_stop',
) )
_COMP_FOR_TYPES = ('comp_for', 'sync_comp_for')
def _get_rhs_name(node, version):
type_ = node.type
if type_ == "lambdef":
return "lambda"
elif type_ == "atom":
comprehension = _get_comprehension_type(node)
first, second = node.children[:2]
if comprehension is not None:
return comprehension
elif second.type == "dictorsetmaker":
if version < (3, 8):
return "literal"
else:
if second.children[1] == ":" or second.children[0] == "**":
return "dict display"
else:
return "set display"
elif (
first == "("
and (second == ")"
or (len(node.children) == 3 and node.children[1].type == "testlist_comp"))
):
return "tuple"
elif first == "(":
return _get_rhs_name(_remove_parens(node), version=version)
elif first == "[":
return "list"
elif first == "{" and second == "}":
return "dict display"
elif first == "{" and len(node.children) > 2:
return "set display"
elif type_ == "keyword":
if "yield" in node.value:
return "yield expression"
if version < (3, 8):
return "keyword"
else:
return str(node.value)
elif type_ == "operator" and node.value == "...":
return "Ellipsis"
elif type_ == "comparison":
return "comparison"
elif type_ in ("string", "number", "strings"):
return "literal"
elif type_ == "yield_expr":
return "yield expression"
elif type_ == "test":
return "conditional expression"
elif type_ in ("atom_expr", "power"):
if node.children[0] == "await":
return "await expression"
elif node.children[-1].type == "trailer":
trailer = node.children[-1]
if trailer.children[0] == "(":
return "function call"
elif trailer.children[0] == "[":
return "subscript"
elif trailer.children[0] == ".":
return "attribute"
elif (
("expr" in type_ and "star_expr" not in type_) # is a substring
or "_test" in type_
or type_ in ("term", "factor")
):
return "operator"
elif type_ == "star_expr":
return "starred"
elif type_ == "testlist_star_expr":
return "tuple"
elif type_ == "fstring":
return "f-string expression"
return type_ # shouldn't reach here
def _iter_stmts(scope): def _iter_stmts(scope):
@@ -35,12 +110,12 @@ def _iter_stmts(scope):
def _get_comprehension_type(atom): def _get_comprehension_type(atom):
first, second = atom.children[:2] first, second = atom.children[:2]
if second.type == 'testlist_comp' and second.children[1].type == 'comp_for': if second.type == 'testlist_comp' and second.children[1].type in _COMP_FOR_TYPES:
if first == '[': if first == '[':
return 'list comprehension' return 'list comprehension'
else: else:
return 'generator expression' return 'generator expression'
elif second.type == 'dictorsetmaker' and second.children[-1].type == 'comp_for': elif second.type == 'dictorsetmaker' and second.children[-1].type in _COMP_FOR_TYPES:
if second.children[1] == ':': if second.children[1] == ':':
return 'dict comprehension' return 'dict comprehension'
else: else:
@@ -52,7 +127,7 @@ def _is_future_import(import_from):
# It looks like a __future__ import that is relative is still a future # It looks like a __future__ import that is relative is still a future
# import. That feels kind of odd, but whatever. # import. That feels kind of odd, but whatever.
# if import_from.level != 0: # if import_from.level != 0:
# return False # return False
from_names = import_from.get_from_names() from_names = import_from.get_from_names()
return [n.value for n in from_names] == ['__future__'] return [n.value for n in from_names] == ['__future__']
@@ -94,19 +169,29 @@ def _is_future_import_first(import_from):
def _iter_definition_exprs_from_lists(exprlist): def _iter_definition_exprs_from_lists(exprlist):
for child in exprlist.children[::2]: def check_expr(child):
if child.type == 'atom' and child.children[0] in ('(', '['): if child.type == 'atom':
testlist_comp = child.children[0] if child.children[0] == '(':
if testlist_comp.type == 'testlist_comp': testlist_comp = child.children[1]
for expr in _iter_definition_exprs_from_lists(testlist_comp): if testlist_comp.type == 'testlist_comp':
yield expr yield from _iter_definition_exprs_from_lists(testlist_comp)
continue return
else:
# It's a paren that doesn't do anything, like 1 + (1)
yield from check_expr(testlist_comp)
return
elif child.children[0] == '[': elif child.children[0] == '[':
yield testlist_comp yield testlist_comp
continue return
yield child yield child
if exprlist.type in _STAR_EXPR_PARENTS:
for child in exprlist.children[::2]:
yield from check_expr(child)
else:
yield from check_expr(exprlist)
def _get_expr_stmt_definition_exprs(expr_stmt): def _get_expr_stmt_definition_exprs(expr_stmt):
exprs = [] exprs = []
for list_ in expr_stmt.children[:-2:2]: for list_ in expr_stmt.children[:-2:2]:
@@ -119,12 +204,25 @@ def _get_expr_stmt_definition_exprs(expr_stmt):
def _get_for_stmt_definition_exprs(for_stmt): def _get_for_stmt_definition_exprs(for_stmt):
exprlist = for_stmt.children[1] exprlist = for_stmt.children[1]
if exprlist.type != 'exprlist':
return [exprlist]
return list(_iter_definition_exprs_from_lists(exprlist)) return list(_iter_definition_exprs_from_lists(exprlist))
class _Context(object): def _is_argument_comprehension(argument):
return argument.children[1].type in _COMP_FOR_TYPES
def _any_fstring_error(version, node):
if version < (3, 9) or node is None:
return False
if node.type == "error_node":
return any(child.type == "fstring_start" for child in node.children)
elif node.type == "fstring":
return True
else:
return search_ancestor(node, "fstring")
class _Context:
def __init__(self, node, add_syntax_error, parent_context=None): def __init__(self, node, add_syntax_error, parent_context=None):
self.node = node self.node = node
self.blocks = [] self.blocks = []
@@ -164,8 +262,7 @@ class _Context(object):
self._analyze_names(self._global_names, 'global') self._analyze_names(self._global_names, 'global')
self._analyze_names(self._nonlocal_names, 'nonlocal') self._analyze_names(self._nonlocal_names, 'nonlocal')
# Python2.6 doesn't have dict comprehensions. global_name_strs = {n.value: n for n in self._global_names}
global_name_strs = dict((n.value, n) for n in self._global_names)
for nonlocal_name in self._nonlocal_names: for nonlocal_name in self._nonlocal_names:
try: try:
global_name = global_name_strs[nonlocal_name.value] global_name = global_name_strs[nonlocal_name.value]
@@ -253,7 +350,7 @@ class ErrorFinder(Normalizer):
Searches for errors in the syntax tree. Searches for errors in the syntax tree.
""" """
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super(ErrorFinder, self).__init__(*args, **kwargs) super().__init__(*args, **kwargs)
self._error_dict = {} self._error_dict = {}
self.version = self.grammar.version_info self.version = self.grammar.version_info
@@ -273,12 +370,11 @@ class ErrorFinder(Normalizer):
def visit(self, node): def visit(self, node):
if node.type == 'error_node': if node.type == 'error_node':
with self.visit_node(node): with self.visit_node(node):
# Don't need to investigate the inners of an error node. We # Don't need to investigate the inners of an error node. We
# might find errors in there that should be ignored, because # might find errors in there that should be ignored, because
# the error node itself already shows that there's an issue. # the error node itself already shows that there's an issue.
return '' return ''
return super(ErrorFinder, self).visit(node) return super().visit(node)
@contextmanager @contextmanager
def visit_node(self, node): def visit_node(self, node):
@@ -323,6 +419,13 @@ class ErrorFinder(Normalizer):
match = re.match('\\w{,2}("{1,3}|\'{1,3})', leaf.value) match = re.match('\\w{,2}("{1,3}|\'{1,3})', leaf.value)
if match is None: if match is None:
message = 'invalid syntax' message = 'invalid syntax'
if (
self.version >= (3, 9)
and leaf.value in _get_token_collection(
self.version
).always_break_tokens
):
message = "f-string: " + message
else: else:
if len(match.group(1)) == 1: if len(match.group(1)) == 1:
message = 'EOL while scanning string literal' message = 'EOL while scanning string literal'
@@ -336,7 +439,7 @@ class ErrorFinder(Normalizer):
self.context = self.context.add_context(parent) self.context = self.context.add_context(parent)
# The rest is rule based. # The rest is rule based.
return super(ErrorFinder, self).visit_leaf(leaf) return super().visit_leaf(leaf)
def _add_indentation_error(self, spacing, message): def _add_indentation_error(self, spacing, message):
self.add_issue(spacing, 903, "IndentationError: " + message) self.add_issue(spacing, 903, "IndentationError: " + message)
@@ -361,8 +464,8 @@ class ErrorFinder(Normalizer):
class IndentationRule(Rule): class IndentationRule(Rule):
code = 903 code = 903
def _get_message(self, message): def _get_message(self, message, node):
message = super(IndentationRule, self)._get_message(message) message = super()._get_message(message, node)
return "IndentationError: " + message return "IndentationError: " + message
@@ -386,21 +489,34 @@ class ErrorFinderConfig(NormalizerConfig):
class SyntaxRule(Rule): class SyntaxRule(Rule):
code = 901 code = 901
def _get_message(self, message): def _get_message(self, message, node):
message = super(SyntaxRule, self)._get_message(message) message = super()._get_message(message, node)
if (
"f-string" not in message
and _any_fstring_error(self._normalizer.version, node)
):
message = "f-string: " + message
return "SyntaxError: " + message return "SyntaxError: " + message
@ErrorFinder.register_rule(type='error_node') @ErrorFinder.register_rule(type='error_node')
class _InvalidSyntaxRule(SyntaxRule): class _InvalidSyntaxRule(SyntaxRule):
message = "invalid syntax" message = "invalid syntax"
fstring_message = "f-string: invalid syntax"
def get_node(self, node): def get_node(self, node):
return node.get_next_leaf() return node.get_next_leaf()
def is_issue(self, node): def is_issue(self, node):
# Error leafs will be added later as an error. error = node.get_next_leaf().type != 'error_leaf'
return node.get_next_leaf().type != 'error_leaf' if (
error
and _any_fstring_error(self._normalizer.version, node)
):
self.add_issue(node, message=self.fstring_message)
else:
# Error leafs will be added later as an error.
return error
@ErrorFinder.register_rule(value='await') @ErrorFinder.register_rule(value='await')
@@ -439,7 +555,11 @@ class _ContinueChecks(SyntaxRule):
in_loop = True in_loop = True
if block.type == 'try_stmt': if block.type == 'try_stmt':
last_block = block.children[-3] last_block = block.children[-3]
if last_block == 'finally' and leaf.start_pos > last_block.start_pos: if (
last_block == "finally"
and leaf.start_pos > last_block.start_pos
and self._normalizer.version < (3, 8)
):
self.add_issue(leaf, message=self.message_in_finally) self.add_issue(leaf, message=self.message_in_finally)
return False # Error already added return False # Error already added
if not in_loop: if not in_loop:
@@ -455,26 +575,19 @@ class _YieldFromCheck(SyntaxRule):
def is_issue(self, leaf): def is_issue(self, leaf):
return leaf.parent.type == 'yield_arg' \ return leaf.parent.type == 'yield_arg' \
and self._normalizer.context.is_async_funcdef() and self._normalizer.context.is_async_funcdef()
@ErrorFinder.register_rule(type='name') @ErrorFinder.register_rule(type='name')
class _NameChecks(SyntaxRule): class _NameChecks(SyntaxRule):
message = 'cannot assign to __debug__' message = 'cannot assign to __debug__'
message_keyword = 'assignment to keyword'
message_none = 'cannot assign to None' message_none = 'cannot assign to None'
def is_issue(self, leaf): def is_issue(self, leaf):
self._normalizer.context.add_name(leaf) self._normalizer.context.add_name(leaf)
if leaf.value == '__debug__' and leaf.is_definition(): if leaf.value == '__debug__' and leaf.is_definition():
if self._normalizer.version < (3, 0): return True
return True
else:
self.add_issue(leaf, message=self.message_keyword)
if leaf.value == 'None' and self._normalizer.version < (3, 0) \
and leaf.is_definition():
self.add_issue(leaf, message=self.message_none)
@ErrorFinder.register_rule(type='string') @ErrorFinder.register_rule(type='string')
@@ -482,38 +595,32 @@ class _StringChecks(SyntaxRule):
message = "bytes can only contain ASCII literal characters." message = "bytes can only contain ASCII literal characters."
def is_issue(self, leaf): def is_issue(self, leaf):
string_prefix = leaf.string_prefix.lower() string_prefix = leaf.string_prefix.lower()
if 'b' in string_prefix \ if 'b' in string_prefix \
and self._normalizer.version >= (3, 0) \ and any(c for c in leaf.value if ord(c) > 127):
and any(c for c in leaf.value if ord(c) > 127): # b'ä'
# b'ä' return True
return True
if 'r' not in string_prefix: if 'r' not in string_prefix:
# Raw strings don't need to be checked if they have proper # Raw strings don't need to be checked if they have proper
# escaping. # escaping.
is_bytes = self._normalizer.version < (3, 0)
if 'b' in string_prefix:
is_bytes = True
if 'u' in string_prefix:
is_bytes = False
payload = leaf._get_payload() payload = leaf._get_payload()
if is_bytes: if 'b' in string_prefix:
payload = payload.encode('utf-8') payload = payload.encode('utf-8')
func = codecs.escape_decode func = codecs.escape_decode
else: else:
func = codecs.unicode_escape_decode func = codecs.unicode_escape_decode
try: try:
with warnings.catch_warnings(): with warnings.catch_warnings():
# The warnings from parsing strings are not relevant. # The warnings from parsing strings are not relevant.
warnings.filterwarnings('ignore') warnings.filterwarnings('ignore')
func(payload) func(payload)
except UnicodeDecodeError as e: except UnicodeDecodeError as e:
self.add_issue(leaf, message='(unicode error) ' + str(e)) self.add_issue(leaf, message='(unicode error) ' + str(e))
except ValueError as e: except ValueError as e:
self.add_issue(leaf, message='(value error) ' + str(e)) self.add_issue(leaf, message='(value error) ' + str(e))
@ErrorFinder.register_rule(value='*') @ErrorFinder.register_rule(value='*')
@@ -539,7 +646,7 @@ class _StarStarCheck(SyntaxRule):
def is_issue(self, leaf): def is_issue(self, leaf):
if leaf.parent.type == 'dictorsetmaker': if leaf.parent.type == 'dictorsetmaker':
comp_for = leaf.get_next_sibling().get_next_sibling() comp_for = leaf.get_next_sibling().get_next_sibling()
return comp_for is not None and comp_for.type == 'comp_for' return comp_for is not None and comp_for.type in _COMP_FOR_TYPES
@ErrorFinder.register_rule(value='yield') @ErrorFinder.register_rule(value='yield')
@@ -558,10 +665,6 @@ class _ReturnAndYieldChecks(SyntaxRule):
and any(self._normalizer.context.node.iter_yield_exprs()): and any(self._normalizer.context.node.iter_yield_exprs()):
if leaf.value == 'return' and leaf.parent.type == 'return_stmt': if leaf.value == 'return' and leaf.parent.type == 'return_stmt':
return True return True
elif leaf.value == 'yield' \
and leaf.get_next_leaf() != 'from' \
and self._normalizer.version == (3, 5):
self.add_issue(self.get_node(leaf), message=self.message_async_yield)
@ErrorFinder.register_rule(type='strings') @ErrorFinder.register_rule(type='strings')
@@ -570,15 +673,16 @@ class _BytesAndStringMix(SyntaxRule):
message = "cannot mix bytes and nonbytes literals" message = "cannot mix bytes and nonbytes literals"
def _is_bytes_literal(self, string): def _is_bytes_literal(self, string):
if string.type == 'fstring':
return False
return 'b' in string.string_prefix.lower() return 'b' in string.string_prefix.lower()
def is_issue(self, node): def is_issue(self, node):
first = node.children[0] first = node.children[0]
if first.type == 'string' and self._normalizer.version >= (3, 0): first_is_bytes = self._is_bytes_literal(first)
first_is_bytes = self._is_bytes_literal(first) for string in node.children[1:]:
for string in node.children[1:]: if first_is_bytes != self._is_bytes_literal(string):
if first_is_bytes != self._is_bytes_literal(string): return True
return True
@ErrorFinder.register_rule(type='import_as_names') @ErrorFinder.register_rule(type='import_as_names')
@@ -587,7 +691,7 @@ class _TrailingImportComma(SyntaxRule):
message = "trailing comma not allowed without surrounding parentheses" message = "trailing comma not allowed without surrounding parentheses"
def is_issue(self, node): def is_issue(self, node):
if node.children[-1] == ',': if node.children[-1] == ',' and node.parent.children[-1] != ')':
return True return True
@@ -611,52 +715,38 @@ class _FutureImportRule(SyntaxRule):
for from_name, future_name in node.get_paths(): for from_name, future_name in node.get_paths():
name = future_name.value name = future_name.value
allowed_futures = list(ALLOWED_FUTURES) allowed_futures = list(ALLOWED_FUTURES)
if self._normalizer.version >= (3, 5): if self._normalizer.version >= (3, 7):
allowed_futures.append('generator_stop') allowed_futures.append('annotations')
if name == 'braces': if name == 'braces':
self.add_issue(node, message = "not a chance") self.add_issue(node, message="not a chance")
elif name == 'barry_as_FLUFL': elif name == 'barry_as_FLUFL':
m = "Seriously I'm not implementing this :) ~ Dave" m = "Seriously I'm not implementing this :) ~ Dave"
self.add_issue(node, message=m) self.add_issue(node, message=m)
elif name not in ALLOWED_FUTURES: elif name not in allowed_futures:
message = "future feature %s is not defined" % name message = "future feature %s is not defined" % name
self.add_issue(node, message=message) self.add_issue(node, message=message)
@ErrorFinder.register_rule(type='star_expr') @ErrorFinder.register_rule(type='star_expr')
class _StarExprRule(SyntaxRule): class _StarExprRule(SyntaxRule):
message = "starred assignment target must be in a list or tuple"
message_iterable_unpacking = "iterable unpacking cannot be used in comprehension" message_iterable_unpacking = "iterable unpacking cannot be used in comprehension"
message_assignment = "can use starred expression only as assignment target" message_assignment = "can use starred expression only as assignment target"
def is_issue(self, node): def is_issue(self, node):
if node.parent.type not in _STAR_EXPR_PARENTS:
return True
if node.parent.type == 'testlist_comp': if node.parent.type == 'testlist_comp':
# [*[] for a in [1]] # [*[] for a in [1]]
if node.parent.children[1].type == 'comp_for': if node.parent.children[1].type in _COMP_FOR_TYPES:
self.add_issue(node, message=self.message_iterable_unpacking) self.add_issue(node, message=self.message_iterable_unpacking)
if self._normalizer.version <= (3, 4):
n = search_ancestor(node, 'for_stmt', 'expr_stmt')
found_definition = False
if n is not None:
if n.type == 'expr_stmt':
exprs = _get_expr_stmt_definition_exprs(n)
else:
exprs = _get_for_stmt_definition_exprs(n)
if node in exprs:
found_definition = True
if not found_definition:
self.add_issue(node, message=self.message_assignment)
@ErrorFinder.register_rule(types=_STAR_EXPR_PARENTS) @ErrorFinder.register_rule(types=_STAR_EXPR_PARENTS)
class _StarExprParentRule(SyntaxRule): class _StarExprParentRule(SyntaxRule):
def is_issue(self, node): def is_issue(self, node):
if node.parent.type == 'del_stmt': if node.parent.type == 'del_stmt':
self.add_issue(node.parent, message="can't use starred expression here") if self._normalizer.version >= (3, 9):
self.add_issue(node.parent, message="cannot delete starred")
else:
self.add_issue(node.parent, message="can't use starred expression here")
else: else:
def is_definition(node, ancestor): def is_definition(node, ancestor):
if ancestor is None: if ancestor is None:
@@ -675,7 +765,10 @@ class _StarExprParentRule(SyntaxRule):
args = [c for c in node.children if c != ','] args = [c for c in node.children if c != ',']
starred = [c for c in args if c.type == 'star_expr'] starred = [c for c in args if c.type == 'star_expr']
if len(starred) > 1: if len(starred) > 1:
message = "two starred expressions in assignment" if self._normalizer.version < (3, 9):
message = "two starred expressions in assignment"
else:
message = "multiple starred expressions in assignment"
self.add_issue(starred[1], message=message) self.add_issue(starred[1], message=message)
elif starred: elif starred:
count = args.index(starred[0]) count = args.index(starred[0])
@@ -712,8 +805,8 @@ class _AnnotatorRule(SyntaxRule):
if not (lhs.type == 'name' if not (lhs.type == 'name'
# subscript/attributes are allowed # subscript/attributes are allowed
or lhs.type in ('atom_expr', 'power') or lhs.type in ('atom_expr', 'power')
and trailer.type == 'trailer' and trailer.type == 'trailer'
and trailer.children[0] != '('): and trailer.children[0] != '('):
return True return True
else: else:
# x, y: str # x, y: str
@@ -725,15 +818,27 @@ class _AnnotatorRule(SyntaxRule):
class _ArgumentRule(SyntaxRule): class _ArgumentRule(SyntaxRule):
def is_issue(self, node): def is_issue(self, node):
first = node.children[0] first = node.children[0]
if self._normalizer.version < (3, 8):
# a((b)=c) is valid in <3.8
first = _remove_parens(first)
if node.children[1] == '=' and first.type != 'name': if node.children[1] == '=' and first.type != 'name':
if first.type == 'lambdef': if first.type == 'lambdef':
# f(lambda: 1=1) # f(lambda: 1=1)
message = "lambda cannot contain assignment" if self._normalizer.version < (3, 8):
message = "lambda cannot contain assignment"
else:
message = 'expression cannot contain assignment, perhaps you meant "=="?'
else: else:
# f(+x=1) # f(+x=1)
message = "keyword can't be an expression" if self._normalizer.version < (3, 8):
message = "keyword can't be an expression"
else:
message = 'expression cannot contain assignment, perhaps you meant "=="?'
self.add_issue(first, message=message) self.add_issue(first, message=message)
if _is_argument_comprehension(node) and node.parent.type == 'classdef':
self.add_issue(node, message='invalid syntax')
@ErrorFinder.register_rule(type='nonlocal_stmt') @ErrorFinder.register_rule(type='nonlocal_stmt')
class _NonlocalModuleLevelRule(SyntaxRule): class _NonlocalModuleLevelRule(SyntaxRule):
@@ -753,58 +858,49 @@ class _ArglistRule(SyntaxRule):
return "Generator expression must be parenthesized" return "Generator expression must be parenthesized"
def is_issue(self, node): def is_issue(self, node):
first_arg = node.children[0] arg_set = set()
if first_arg.type == 'argument' \ kw_only = False
and first_arg.children[1].type == 'comp_for': kw_unpacking_only = False
# e.g. foo(x for x in [], b) for argument in node.children:
return len(node.children) >= 2 if argument == ',':
else: continue
arg_set = set()
kw_only = False
kw_unpacking_only = False
is_old_starred = False
# In python 3 this would be a bit easier (stars are part of
# argument), but we have to understand both.
for argument in node.children:
if argument == ',':
continue
if argument in ('*', '**'): if argument.type == 'argument':
# Python < 3.5 has the order engraved in the grammar first = argument.children[0]
# file. No need to do anything here. if _is_argument_comprehension(argument) and len(node.children) >= 2:
is_old_starred = True # a(a, b for b in c)
continue return True
if is_old_starred:
is_old_starred = False
continue
if argument.type == 'argument': if first in ('*', '**'):
first = argument.children[0] if first == '*':
if first in ('*', '**'): if kw_unpacking_only:
if first == '*': # foo(**kwargs, *args)
if kw_unpacking_only: message = "iterable argument unpacking " \
# foo(**kwargs, *args) "follows keyword argument unpacking"
message = "iterable argument unpacking follows keyword argument unpacking" self.add_issue(argument, message=message)
self.add_issue(argument, message=message) else:
kw_unpacking_only = True
else: # Is a keyword argument.
kw_only = True
if first.type == 'name':
if first.value in arg_set:
# f(x=1, x=2)
message = "keyword argument repeated"
if self._normalizer.version >= (3, 9):
message += ": {}".format(first.value)
self.add_issue(first, message=message)
else: else:
kw_unpacking_only = True arg_set.add(first.value)
else: # Is a keyword argument. else:
kw_only = True if kw_unpacking_only:
if first.type == 'name': # f(**x, y)
if first.value in arg_set: message = "positional argument follows keyword argument unpacking"
# f(x=1, x=2) self.add_issue(argument, message=message)
self.add_issue(first, message="keyword argument repeated") elif kw_only:
else: # f(x=2, y)
arg_set.add(first.value) message = "positional argument follows keyword argument"
else: self.add_issue(argument, message=message)
if kw_unpacking_only:
# f(**x, y)
message = "positional argument follows keyword argument unpacking"
self.add_issue(argument, message=message)
elif kw_only:
# f(x=2, y)
message = "positional argument follows keyword argument"
self.add_issue(argument, message=message)
@ErrorFinder.register_rule(type='parameters') @ErrorFinder.register_rule(type='parameters')
@ErrorFinder.register_rule(type='lambdef') @ErrorFinder.register_rule(type='lambdef')
@@ -846,6 +942,7 @@ class _TryStmtRule(SyntaxRule):
@ErrorFinder.register_rule(type='fstring') @ErrorFinder.register_rule(type='fstring')
class _FStringRule(SyntaxRule): class _FStringRule(SyntaxRule):
_fstring_grammar = None _fstring_grammar = None
message_expr = "f-string expression part cannot include a backslash"
message_nested = "f-string: expressions nested too deeply" message_nested = "f-string: expressions nested too deeply"
message_conversion = "f-string: invalid conversion character: expected 's', 'r', or 'a'" message_conversion = "f-string: invalid conversion character: expected 's', 'r', or 'a'"
@@ -856,7 +953,15 @@ class _FStringRule(SyntaxRule):
if depth >= 2: if depth >= 2:
self.add_issue(fstring_expr, message=self.message_nested) self.add_issue(fstring_expr, message=self.message_nested)
conversion = fstring_expr.children[2] expr = fstring_expr.children[1]
if '\\' in expr.get_code():
self.add_issue(expr, message=self.message_expr)
children_2 = fstring_expr.children[2]
if children_2.type == 'operator' and children_2.value == '=':
conversion = fstring_expr.children[3]
else:
conversion = children_2
if conversion.type == 'fstring_conversion': if conversion.type == 'fstring_conversion':
name = conversion.children[1] name = conversion.children[1]
if name.value not in ('s', 'r', 'a'): if name.value not in ('s', 'r', 'a'):
@@ -876,7 +981,7 @@ class _FStringRule(SyntaxRule):
class _CheckAssignmentRule(SyntaxRule): class _CheckAssignmentRule(SyntaxRule):
def _check_assignment(self, node, is_deletion=False): def _check_assignment(self, node, is_deletion=False, is_namedexpr=False, is_aug_assign=False):
error = None error = None
type_ = node.type type_ = node.type
if type_ == 'lambdef': if type_ == 'lambdef':
@@ -886,19 +991,48 @@ class _CheckAssignmentRule(SyntaxRule):
error = _get_comprehension_type(node) error = _get_comprehension_type(node)
if error is None: if error is None:
if second.type == 'dictorsetmaker': if second.type == 'dictorsetmaker':
error = 'literal' if self._normalizer.version < (3, 8):
error = 'literal'
else:
if second.children[1] == ':':
error = 'dict display'
else:
error = 'set display'
elif first == "{" and second == "}":
if self._normalizer.version < (3, 8):
error = 'literal'
else:
error = "dict display"
elif first == "{" and len(node.children) > 2:
if self._normalizer.version < (3, 8):
error = 'literal'
else:
error = "set display"
elif first in ('(', '['): elif first in ('(', '['):
if second.type == 'yield_expr': if second.type == 'yield_expr':
error = 'yield expression' error = 'yield expression'
elif second.type == 'testlist_comp': elif second.type == 'testlist_comp':
# ([a, b] := [1, 2])
# ((a, b) := [1, 2])
if is_namedexpr:
if first == '(':
error = 'tuple'
elif first == '[':
error = 'list'
# This is not a comprehension, they were handled # This is not a comprehension, they were handled
# further above. # further above.
for child in second.children[::2]: for child in second.children[::2]:
self._check_assignment(child, is_deletion) self._check_assignment(child, is_deletion, is_namedexpr, is_aug_assign)
else: # Everything handled, must be useless brackets. else: # Everything handled, must be useless brackets.
self._check_assignment(second, is_deletion) self._check_assignment(second, is_deletion, is_namedexpr, is_aug_assign)
elif type_ == 'keyword': elif type_ == 'keyword':
error = 'keyword' if node.value == "yield":
error = "yield expression"
elif self._normalizer.version < (3, 8):
error = 'keyword'
else:
error = str(node.value)
elif type_ == 'operator': elif type_ == 'operator':
if node.value == '...': if node.value == '...':
error = 'Ellipsis' error = 'Ellipsis'
@@ -923,44 +1057,88 @@ class _CheckAssignmentRule(SyntaxRule):
assert trailer.type == 'trailer' assert trailer.type == 'trailer'
if trailer.children[0] == '(': if trailer.children[0] == '(':
error = 'function call' error = 'function call'
elif is_namedexpr and trailer.children[0] == '[':
error = 'subscript'
elif is_namedexpr and trailer.children[0] == '.':
error = 'attribute'
elif type_ == "fstring":
if self._normalizer.version < (3, 8):
error = 'literal'
else:
error = "f-string expression"
elif type_ in ('testlist_star_expr', 'exprlist', 'testlist'): elif type_ in ('testlist_star_expr', 'exprlist', 'testlist'):
for child in node.children[::2]: for child in node.children[::2]:
self._check_assignment(child, is_deletion) self._check_assignment(child, is_deletion, is_namedexpr, is_aug_assign)
elif ('expr' in type_ and type_ != 'star_expr' # is a substring elif ('expr' in type_ and type_ != 'star_expr' # is a substring
or '_test' in type_ or '_test' in type_
or type_ in ('term', 'factor')): or type_ in ('term', 'factor')):
error = 'operator' error = 'operator'
elif type_ == "star_expr":
if is_deletion:
if self._normalizer.version >= (3, 9):
error = "starred"
else:
self.add_issue(node, message="can't use starred expression here")
elif not search_ancestor(node, *_STAR_EXPR_PARENTS) and not is_aug_assign:
self.add_issue(node, message="starred assignment target must be in a list or tuple")
self._check_assignment(node.children[1])
if error is not None: if error is not None:
message = "can't %s %s" % ("delete" if is_deletion else "assign to", error) if is_namedexpr:
message = 'cannot use assignment expressions with %s' % error
else:
cannot = "can't" if self._normalizer.version < (3, 8) else "cannot"
message = ' '.join([cannot, "delete" if is_deletion else "assign to", error])
self.add_issue(node, message=message) self.add_issue(node, message=message)
@ErrorFinder.register_rule(type='comp_for') @ErrorFinder.register_rule(type='sync_comp_for')
class _CompForRule(_CheckAssignmentRule): class _CompForRule(_CheckAssignmentRule):
message = "asynchronous comprehension outside of an asynchronous function" message = "asynchronous comprehension outside of an asynchronous function"
def is_issue(self, node): def is_issue(self, node):
# Some of the nodes here are already used, so no else if expr_list = node.children[1]
expr_list = node.children[1 + int(node.children[0] == 'async')]
if expr_list.type != 'expr_list': # Already handled. if expr_list.type != 'expr_list': # Already handled.
self._check_assignment(expr_list) self._check_assignment(expr_list)
return node.children[0] == 'async' \ return node.parent.children[0] == 'async' \
and not self._normalizer.context.is_async_funcdef() and not self._normalizer.context.is_async_funcdef()
@ErrorFinder.register_rule(type='expr_stmt') @ErrorFinder.register_rule(type='expr_stmt')
class _ExprStmtRule(_CheckAssignmentRule): class _ExprStmtRule(_CheckAssignmentRule):
message = "illegal expression for augmented assignment" message = "illegal expression for augmented assignment"
extended_message = "'{target}' is an " + message
def is_issue(self, node): def is_issue(self, node):
for before_equal in node.children[:-2:2]:
self._check_assignment(before_equal)
augassign = node.children[1] augassign = node.children[1]
if augassign != '=' and augassign.type != 'annassign': # Is augassign. is_aug_assign = augassign != '=' and augassign.type != 'annassign'
return node.children[0].type in ('testlist_star_expr', 'atom', 'testlist')
if self._normalizer.version <= (3, 8) or not is_aug_assign:
for before_equal in node.children[:-2:2]:
self._check_assignment(before_equal, is_aug_assign=is_aug_assign)
if is_aug_assign:
target = _remove_parens(node.children[0])
# a, a[b], a.b
if target.type == "name" or (
target.type in ("atom_expr", "power")
and target.children[1].type == "trailer"
and target.children[-1].children[0] != "("
):
return False
if self._normalizer.version <= (3, 8):
return True
else:
self.add_issue(
node,
message=self.extended_message.format(
target=_get_rhs_name(node.children[0], self._normalizer.version)
),
)
@ErrorFinder.register_rule(type='with_item') @ErrorFinder.register_rule(type='with_item')
@@ -992,3 +1170,71 @@ class _ForStmtRule(_CheckAssignmentRule):
expr_list = for_stmt.children[1] expr_list = for_stmt.children[1]
if expr_list.type != 'expr_list': # Already handled. if expr_list.type != 'expr_list': # Already handled.
self._check_assignment(expr_list) self._check_assignment(expr_list)
@ErrorFinder.register_rule(type='namedexpr_test')
class _NamedExprRule(_CheckAssignmentRule):
# namedexpr_test: test [':=' test]
def is_issue(self, namedexpr_test):
# assigned name
first = namedexpr_test.children[0]
def search_namedexpr_in_comp_for(node):
while True:
parent = node.parent
if parent is None:
return parent
if parent.type == 'sync_comp_for' and parent.children[3] == node:
return parent
node = parent
if search_namedexpr_in_comp_for(namedexpr_test):
# [i+1 for i in (i := range(5))]
# [i+1 for i in (j := range(5))]
# [i+1 for i in (lambda: (j := range(5)))()]
message = 'assignment expression cannot be used in a comprehension iterable expression'
self.add_issue(namedexpr_test, message=message)
# defined names
exprlist = list()
def process_comp_for(comp_for):
if comp_for.type == 'sync_comp_for':
comp = comp_for
elif comp_for.type == 'comp_for':
comp = comp_for.children[1]
exprlist.extend(_get_for_stmt_definition_exprs(comp))
def search_all_comp_ancestors(node):
has_ancestors = False
while True:
node = search_ancestor(node, 'testlist_comp', 'dictorsetmaker')
if node is None:
break
for child in node.children:
if child.type in _COMP_FOR_TYPES:
process_comp_for(child)
has_ancestors = True
break
return has_ancestors
# check assignment expressions in comprehensions
search_all = search_all_comp_ancestors(namedexpr_test)
if search_all:
if self._normalizer.context.node.type == 'classdef':
message = 'assignment expression within a comprehension ' \
'cannot be used in a class body'
self.add_issue(namedexpr_test, message=message)
namelist = [expr.value for expr in exprlist if expr.type == 'name']
if first.type == 'name' and first.value in namelist:
# [i := 0 for i, j in range(5)]
# [[(i := i) for j in range(5)] for i in range(5)]
# [i for i, j in range(5) if True or (i := 1)]
# [False and (i := 0) for i, j in range(5)]
message = 'assignment expression cannot rebind ' \
'comprehension iteration variable %r' % first.value
self.add_issue(namedexpr_test, message=message)
self._check_assignment(first, is_namedexpr=True)
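The checks added in `_ExprStmtRule` and `_NamedExprRule` above correspond to errors that CPython itself reports at compile time. The following is a minimal sketch, assuming Python 3.8 or later, that reproduces some of the example snippets from the comments using only the built-in `compile()`. It does not call the library's own API, and the helper name `syntax_error_message` is hypothetical.

```python
def syntax_error_message(source):
    """Return CPython's SyntaxError message for `source`, or None if it compiles."""
    try:
        # "exec" mode only compiles the code; nothing is executed.
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


# Rebinding the iteration variable of a comprehension.
REBIND = "[i := 0 for i, j in range(5)]"
# Assignment expression inside the iterable part of a comprehension.
ITERABLE = "[i+1 for i in (j := range(5))]"
# Assignment expression in a comprehension directly inside a class body.
CLASS_BODY = "class C:\n    [(y := x) for x in range(5)]\n"
# Augmented assignment whose target is a tuple.
AUG_ASSIGN = "(a, b) += 1"
# The same comprehension is accepted at module level.
VALID = "[(y := x) for x in range(5)]"

for snippet in (REBIND, ITERABLE, CLASS_BODY, AUG_ASSIGN, VALID):
    print(repr(snippet), "->", syntax_error_message(snippet))
```

The exact wording of the augmented-assignment message differs between interpreter versions, which is presumably why the rule above keeps both `message` and `extended_message`.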
-159
@@ -1,159 +0,0 @@
# Grammar for Python
# Note: Changing the grammar specified in this file will most likely
# require corresponding changes in the parser module
# (../Modules/parsermodule.c). If you can't make the changes to
# that module yourself, please co-ordinate the required changes
# with someone who can; ask around on python-dev for help. Fred
# Drake <fdrake@acm.org> will probably be listening there.
# NOTE WELL: You should also follow all the steps listed in PEP 306,
# "How to Change Python's Grammar"
# Commands for Kees Blom's railroad program
#diagram:token NAME
#diagram:token NUMBER
#diagram:token STRING
#diagram:token NEWLINE
#diagram:token ENDMARKER
#diagram:token INDENT
#diagram:output\input python.bla
#diagram:token DEDENT
#diagram:output\textwidth 20.04cm\oddsidemargin 0.0cm\evensidemargin 0.0cm
#diagram:rules
# Start symbols for the grammar:
# single_input is a single interactive statement;
# file_input is a module or sequence of commands read from an input file;
# eval_input is the input for the eval() and input() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+
decorated: decorators (classdef | funcdef)
funcdef: 'def' NAME parameters ':' suite
parameters: '(' [varargslist] ')'
varargslist: ((fpdef ['=' test] ',')*
('*' NAME [',' '**' NAME] | '**' NAME) |
fpdef ['=' test] (',' fpdef ['=' test])* [','])
fpdef: NAME | '(' fplist ')'
fplist: fpdef (',' fpdef)* [',']
stmt: simple_stmt | compound_stmt
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | exec_stmt | assert_stmt)
expr_stmt: testlist (augassign (yield_expr|testlist) |
('=' (yield_expr|testlist))*)
augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=')
# For normal assignments, additional restrictions enforced by the interpreter
print_stmt: 'print' ( [ test (',' test)* [','] ] |
'>>' test [ (',' test)+ [','] ] )
del_stmt: 'del' exprlist
pass_stmt: 'pass'
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
break_stmt: 'break'
continue_stmt: 'continue'
return_stmt: 'return' [testlist]
yield_stmt: yield_expr
raise_stmt: 'raise' [test [',' test [',' test]]]
import_stmt: import_name | import_from
import_name: 'import' dotted_as_names
import_from: ('from' ('.'* dotted_name | '.'+)
'import' ('*' | '(' import_as_names ')' | import_as_names))
import_as_name: NAME ['as' NAME]
dotted_as_name: dotted_name ['as' NAME]
import_as_names: import_as_name (',' import_as_name)* [',']
dotted_as_names: dotted_as_name (',' dotted_as_name)*
dotted_name: NAME ('.' NAME)*
global_stmt: 'global' NAME (',' NAME)*
exec_stmt: 'exec' expr ['in' test [',' test]]
assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite
((except_clause ':' suite)+
['else' ':' suite]
['finally' ':' suite] |
'finally' ':' suite))
with_stmt: 'with' with_item ':' suite
# Dave: Python2.6 actually defines a little bit of a different label called
# 'with_var'. However in 2.7+ this is the default. Apply it for
# consistency reasons.
with_item: test ['as' expr]
# NB compile.c makes sure that the default except clause is last
except_clause: 'except' [test [('as' | ',') test]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
# Backward compatibility cruft to support:
# [ x for x in lambda: True, lambda: False if x() ]
# even while also allowing:
# lambda x: 5 if x else 2
# (But not a mix of the two)
testlist_safe: old_test [(',' old_test)+ [',']]
old_test: or_test | old_lambdef
old_lambdef: 'lambda' [varargslist] ':' old_test
test: or_test ['if' or_test 'else' test] | lambdef
or_test: and_test ('or' and_test)*
and_test: not_test ('and' not_test)*
not_test: 'not' not_test | comparison
comparison: expr (comp_op expr)*
comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
expr: xor_expr ('|' xor_expr)*
xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
power: atom trailer* ['**' factor]
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [listmaker] ']' |
'{' [dictorsetmaker] '}' |
'`' testlist1 '`' |
NAME | NUMBER | strings)
strings: STRING+
listmaker: test ( list_for | (',' test)* [','] )
# Dave: Renamed testlist_gexpr to testlist_comp, because in 2.7+ this is the
# default. It's more consistent like this.
testlist_comp: test ( gen_for | (',' test)* [','] )
lambdef: 'lambda' [varargslist] ':' test
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [',']
subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop]
sliceop: ':' [test]
exprlist: expr (',' expr)* [',']
testlist: test (',' test)* [',']
# Dave: Rename from dictmaker to dictorsetmaker, because this is more
# consistent with the following grammars.
dictorsetmaker: test ':' test (',' test ':' test)* [',']
classdef: 'class' NAME ['(' [testlist] ')'] ':' suite
arglist: (argument ',')* (argument [',']
|'*' test (',' argument)* [',' '**' test]
|'**' test)
argument: test [gen_for] | test '=' test # Really [keyword '='] test
list_iter: list_for | list_if
list_for: 'for' exprlist 'in' testlist_safe [list_iter]
list_if: 'if' old_test [list_iter]
gen_iter: gen_for | gen_if
gen_for: 'for' exprlist 'in' or_test [gen_iter]
gen_if: 'if' old_test [gen_iter]
testlist1: test (',' test)*
# not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME
yield_expr: 'yield' [testlist]
-143
@@ -1,143 +0,0 @@
# Grammar for Python
# Note: Changing the grammar specified in this file will most likely
# require corresponding changes in the parser module
# (../Modules/parsermodule.c). If you can't make the changes to
# that module yourself, please co-ordinate the required changes
# with someone who can; ask around on python-dev for help. Fred
# Drake <fdrake@acm.org> will probably be listening there.
# NOTE WELL: You should also follow all the steps listed in PEP 306,
# "How to Change Python's Grammar"
# Start symbols for the grammar:
# single_input is a single interactive statement;
# file_input is a module or sequence of commands read from an input file;
# eval_input is the input for the eval() and input() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+
decorated: decorators (classdef | funcdef)
funcdef: 'def' NAME parameters ':' suite
parameters: '(' [varargslist] ')'
varargslist: ((fpdef ['=' test] ',')*
('*' NAME [',' '**' NAME] | '**' NAME) |
fpdef ['=' test] (',' fpdef ['=' test])* [','])
fpdef: NAME | '(' fplist ')'
fplist: fpdef (',' fpdef)* [',']
stmt: simple_stmt | compound_stmt
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | exec_stmt | assert_stmt)
expr_stmt: testlist (augassign (yield_expr|testlist) |
('=' (yield_expr|testlist))*)
augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=')
# For normal assignments, additional restrictions enforced by the interpreter
print_stmt: 'print' ( [ test (',' test)* [','] ] |
'>>' test [ (',' test)+ [','] ] )
del_stmt: 'del' exprlist
pass_stmt: 'pass'
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
break_stmt: 'break'
continue_stmt: 'continue'
return_stmt: 'return' [testlist]
yield_stmt: yield_expr
raise_stmt: 'raise' [test [',' test [',' test]]]
import_stmt: import_name | import_from
import_name: 'import' dotted_as_names
import_from: ('from' ('.'* dotted_name | '.'+)
'import' ('*' | '(' import_as_names ')' | import_as_names))
import_as_name: NAME ['as' NAME]
dotted_as_name: dotted_name ['as' NAME]
import_as_names: import_as_name (',' import_as_name)* [',']
dotted_as_names: dotted_as_name (',' dotted_as_name)*
dotted_name: NAME ('.' NAME)*
global_stmt: 'global' NAME (',' NAME)*
exec_stmt: 'exec' expr ['in' test [',' test]]
assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite
((except_clause ':' suite)+
['else' ':' suite]
['finally' ':' suite] |
'finally' ':' suite))
with_stmt: 'with' with_item (',' with_item)* ':' suite
with_item: test ['as' expr]
# NB compile.c makes sure that the default except clause is last
except_clause: 'except' [test [('as' | ',') test]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
# Backward compatibility cruft to support:
# [ x for x in lambda: True, lambda: False if x() ]
# even while also allowing:
# lambda x: 5 if x else 2
# (But not a mix of the two)
testlist_safe: old_test [(',' old_test)+ [',']]
old_test: or_test | old_lambdef
old_lambdef: 'lambda' [varargslist] ':' old_test
test: or_test ['if' or_test 'else' test] | lambdef
or_test: and_test ('or' and_test)*
and_test: not_test ('and' not_test)*
not_test: 'not' not_test | comparison
comparison: expr (comp_op expr)*
comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
expr: xor_expr ('|' xor_expr)*
xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
power: atom trailer* ['**' factor]
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [listmaker] ']' |
'{' [dictorsetmaker] '}' |
'`' testlist1 '`' |
NAME | NUMBER | strings)
strings: STRING+
listmaker: test ( list_for | (',' test)* [','] )
testlist_comp: test ( comp_for | (',' test)* [','] )
lambdef: 'lambda' [varargslist] ':' test
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [',']
subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop]
sliceop: ':' [test]
exprlist: expr (',' expr)* [',']
testlist: test (',' test)* [',']
dictorsetmaker: ( (test ':' test (comp_for | (',' test ':' test)* [','])) |
(test (comp_for | (',' test)* [','])) )
classdef: 'class' NAME ['(' [testlist] ')'] ':' suite
arglist: (argument ',')* (argument [',']
|'*' test (',' argument)* [',' '**' test]
|'**' test)
# The reason that keywords are test nodes instead of NAME is that using NAME
# results in an ambiguity. ast.c makes sure it's a NAME.
argument: test [comp_for] | test '=' test
list_iter: list_for | list_if
list_for: 'for' exprlist 'in' testlist_safe [list_iter]
list_if: 'if' old_test [list_iter]
comp_iter: comp_for | comp_if
comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_if: 'if' old_test [comp_iter]
testlist1: test (',' test)*
# not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME
yield_expr: 'yield' [testlist]
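The two grammar files deleted above define rules such as `print_stmt`, `exec_stmt`, the backtick form of `atom`, tuple parameters (`fpdef: NAME | '(' fplist ')'`) and the comma form of `raise_stmt`, so they appear to describe Python 2 syntax. As a rough illustration of what those rules accepted, the sketch below checks that a current Python 3 interpreter rejects each construct. It assumes Python 3 and uses only the built-in `compile()`; the helper `compiles_on_this_interpreter` is a hypothetical name.

```python
def compiles_on_this_interpreter(source):
    """Return True if the running interpreter can compile `source`."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# One snippet per rule that only exists in the deleted grammars.
PYTHON2_ONLY = {
    "print_stmt": "print 'hello'\n",
    "exec_stmt": "exec code in namespace\n",
    "atom (backticks)": "x = `y`\n",
    "fpdef (tuple parameter)": "def f((a, b)):\n    pass\n",
    "raise_stmt (comma form)": "raise ValueError, 'message'\n",
}

results = {
    rule: compiles_on_this_interpreter(snippet)
    for rule, snippet in PYTHON2_ONLY.items()
}

for rule, accepted in results.items():
    print(rule, "accepted" if accepted else "rejected")
```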
@@ -1,14 +1,7 @@
# Grammar for Python # Grammar for Python
# Note: Changing the grammar specified in this file will most likely
# require corresponding changes in the parser module
# (../Modules/parsermodule.c). If you can't make the changes to
# that module yourself, please co-ordinate the required changes
# with someone who can; ask around on python-dev for help. Fred
# Drake <fdrake@acm.org> will probably be listening there.
# NOTE WELL: You should also follow all the steps listed at # NOTE WELL: You should also follow all the steps listed at
# https://docs.python.org/devguide/grammar.html # https://devguide.python.org/grammar/
# Start symbols for the grammar: # Start symbols for the grammar:
# single_input is a single interactive statement; # single_input is a single interactive statement;
@@ -16,39 +9,60 @@
# eval_input is the input for the eval() functions. # eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE! # NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER file_input: stmt* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorator: '@' namedexpr_test NEWLINE
decorators: decorator+ decorators: decorator+
decorated: decorators (classdef | funcdef) decorated: decorators (classdef | funcdef | async_funcdef)
async_funcdef: 'async' funcdef
funcdef: 'def' NAME parameters ['->' test] ':' suite funcdef: 'def' NAME parameters ['->' test] ':' suite
parameters: '(' [typedargslist] ')' parameters: '(' [typedargslist] ')'
typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' typedargslist: (
['*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef]] (tfpdef ['=' test] (',' tfpdef ['=' test])* ',' '/' [',' [ tfpdef ['=' test] (
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef) ',' tfpdef ['=' test])* ([',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]])
| '*' [tfpdef] (',' tfpdef ['=' test])* ([',' ['**' tfpdef [',']]])
| '**' tfpdef [',']]] )
| (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]]
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [','])
)
tfpdef: NAME [':' test] tfpdef: NAME [':' test]
varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' varargslist: vfpdef ['=' test ](',' vfpdef ['=' test])* ',' '/' [',' [ (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
['*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef]] '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef) | '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']) ]] | (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
'*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']
)
vfpdef: NAME vfpdef: NAME
stmt: simple_stmt | compound_stmt stmt: simple_stmt | compound_stmt | NEWLINE
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt) import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) | expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
('=' (yield_expr|testlist_star_expr))*) ('=' (yield_expr|testlist_star_expr))*)
annassign: ':' test ['=' (yield_expr|testlist_star_expr)]
testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [','] testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' | augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=') '<<=' | '>>=' | '**=' | '//=')
# For normal assignments, additional restrictions enforced by the interpreter # For normal and annotated assignments, additional restrictions enforced by the interpreter
del_stmt: 'del' exprlist del_stmt: 'del' exprlist
pass_stmt: 'pass' pass_stmt: 'pass'
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
break_stmt: 'break' break_stmt: 'break'
continue_stmt: 'continue' continue_stmt: 'continue'
return_stmt: 'return' [testlist] return_stmt: 'return' [testlist_star_expr]
yield_stmt: yield_expr yield_stmt: yield_expr
raise_stmt: 'raise' [test ['from' test]] raise_stmt: 'raise' [test ['from' test]]
import_stmt: import_name | import_from import_stmt: import_name | import_from
@@ -65,9 +79,10 @@ global_stmt: 'global' NAME (',' NAME)*
nonlocal_stmt: 'nonlocal' NAME (',' NAME)* nonlocal_stmt: 'nonlocal' NAME (',' NAME)*
assert_stmt: 'assert' test [',' test] assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite] async_stmt: 'async' (funcdef | with_stmt | for_stmt)
while_stmt: 'while' test ':' suite ['else' ':' suite] if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
while_stmt: 'while' namedexpr_test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite try_stmt: ('try' ':' suite
((except_clause ':' suite)+ ((except_clause ':' suite)+
@@ -80,6 +95,7 @@ with_item: test ['as' expr]
except_clause: 'except' [test ['as' NAME]] except_clause: 'except' [test ['as' NAME]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
namedexpr_test: test [':=' test]
test: or_test ['if' or_test 'else' test] | lambdef test: or_test ['if' or_test 'else' test] | lambdef
test_nocond: or_test | lambdef_nocond test_nocond: or_test | lambdef_nocond
lambdef: 'lambda' [varargslist] ':' test lambdef: 'lambda' [varargslist] ':' test
@@ -89,7 +105,7 @@ and_test: not_test ('and' not_test)*
not_test: 'not' not_test | comparison not_test: 'not' not_test | comparison
comparison: expr (comp_op expr)* comparison: expr (comp_op expr)*
# <> isn't actually a valid comparison operator in Python. It's here for the # <> isn't actually a valid comparison operator in Python. It's here for the
# sake of a __future__ import described in PEP 401 # sake of a __future__ import described in PEP 401 (which really works :-)
comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not' comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
star_expr: '*' expr star_expr: '*' expr
expr: xor_expr ('|' xor_expr)* expr: xor_expr ('|' xor_expr)*
@@ -97,38 +113,59 @@ xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)* and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)* shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)* arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)* term: factor (('*'|'@'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power factor: ('+'|'-'|'~') factor | power
power: atom trailer* ['**' factor] power: atom_expr ['**' factor]
atom_expr: ['await'] atom trailer*
atom: ('(' [yield_expr|testlist_comp] ')' | atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' | '[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' | '{' [dictorsetmaker] '}' |
NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False') NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False')
strings: STRING+ testlist_comp: (namedexpr_test|star_expr) ( comp_for | (',' (namedexpr_test|star_expr))* [','] )
testlist_comp: (test|star_expr) ( comp_for | (',' (test|star_expr))* [','] )
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [','] subscriptlist: subscript (',' subscript)* [',']
subscript: test | [test] ':' [test] [sliceop] subscript: test [':=' test] | [test] ':' [test] [sliceop]
sliceop: ':' [test] sliceop: ':' [test]
exprlist: (expr|star_expr) (',' (expr|star_expr))* [','] exprlist: (expr|star_expr) (',' (expr|star_expr))* [',']
testlist: test (',' test)* [','] testlist: test (',' test)* [',']
dictorsetmaker: ( (test ':' test (comp_for | (',' test ':' test)* [','])) | dictorsetmaker: ( ((test ':' test | '**' expr)
(test (comp_for | (',' test)* [','])) ) (comp_for | (',' (test ':' test | '**' expr))* [','])) |
((test [':=' test] | star_expr)
(comp_for | (',' (test [':=' test] | star_expr))* [','])) )
classdef: 'class' NAME ['(' [arglist] ')'] ':' suite classdef: 'class' NAME ['(' [arglist] ')'] ':' suite
arglist: (argument ',')* (argument [','] arglist: argument (',' argument)* [',']
|'*' test (',' argument)* [',' '**' test]
|'**' test)
# The reason that keywords are test nodes instead of NAME is that using NAME # The reason that keywords are test nodes instead of NAME is that using NAME
# results in an ambiguity. ast.c makes sure it's a NAME. # results in an ambiguity. ast.c makes sure it's a NAME.
argument: test [comp_for] | test '=' test # Really [keyword '='] test # "test '=' test" is really "keyword '=' test", but we have no such token.
# These need to be in a single rule to avoid grammar that is ambiguous
# to our LL(1) parser. Even though 'test' includes '*expr' in star_expr,
# we explicitly match '*' here, too, to give it proper precedence.
# Illegal combinations and orderings are blocked in ast.c:
# multiple (test comp_for) arguments are blocked; keyword unpackings
# that precede iterable unpackings are blocked; etc.
argument: ( test [comp_for] |
test ':=' test |
test '=' test |
'**' test |
'*' test )
comp_iter: comp_for | comp_if comp_iter: comp_for | comp_if
comp_for: 'for' exprlist 'in' or_test [comp_iter] sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_for: ['async'] sync_comp_for
comp_if: 'if' test_nocond [comp_iter] comp_if: 'if' test_nocond [comp_iter]
# not used in grammar, but may appear in "node" passed from Parser to Compiler # not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME encoding_decl: NAME
yield_expr: 'yield' [yield_arg] yield_expr: 'yield' [yield_arg]
yield_arg: 'from' test | testlist yield_arg: 'from' test | testlist_star_expr
strings: (STRING | fstring)+
fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME
fstring_expr: '{' (testlist_comp | yield_expr) ['='] [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content*
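Several of the rules on the new side of this grammar diff can be observed with the standard `ast` module. This is a sketch under the assumption of Python 3.8 or later; it inspects CPython's own parser rather than the grammar file itself, so it only illustrates the syntax these rules are meant to describe.

```python
import ast

# namedexpr_test: test [':=' test]
# An assignment expression is accepted directly as a while/if condition.
while_node = ast.parse("while chunk := read():\n    pass\n").body[0]
condition_type = type(while_node.test).__name__

# typedargslist with '/': parameters before the slash are positional-only.
func_node = ast.parse("def f(a, b, /, c, *, d):\n    pass\n").body[0]
positional_only = [arg.arg for arg in func_node.args.posonlyargs]
regular = [arg.arg for arg in func_node.args.args]
keyword_only = [arg.arg for arg in func_node.args.kwonlyargs]

# term: factor (('*'|'@'|'/'|'%'|'//') factor)*
# '@' is parsed as a binary operator.
matmul_op = type(ast.parse("a @ b", mode="eval").body.op).__name__

# fstring_expr: '{' (testlist_comp | yield_expr) ['='] ... '}'
# The optional '=' echoes the expression text before its value.
debug_text = eval('f"{x=}"', {"x": 5})

print(condition_type, positional_only, regular, keyword_only, matmul_op, debug_text)
```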
-134
@@ -1,134 +0,0 @@
# Grammar for Python
# Note: Changing the grammar specified in this file will most likely
# require corresponding changes in the parser module
# (../Modules/parsermodule.c). If you can't make the changes to
# that module yourself, please co-ordinate the required changes
# with someone who can; ask around on python-dev for help. Fred
# Drake <fdrake@acm.org> will probably be listening there.
# NOTE WELL: You should also follow all the steps listed in PEP 306,
# "How to Change Python's Grammar"
# Start symbols for the grammar:
# single_input is a single interactive statement;
# file_input is a module or sequence of commands read from an input file;
# eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+
decorated: decorators (classdef | funcdef)
funcdef: 'def' NAME parameters ['->' test] ':' suite
parameters: '(' [typedargslist] ')'
typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [','
['*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef]]
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef)
tfpdef: NAME [':' test]
varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [','
['*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef)
vfpdef: NAME
stmt: simple_stmt | compound_stmt
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) |
('=' (yield_expr|testlist_star_expr))*)
testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=')
# For normal assignments, additional restrictions enforced by the interpreter
del_stmt: 'del' exprlist
pass_stmt: 'pass'
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
break_stmt: 'break'
continue_stmt: 'continue'
return_stmt: 'return' [testlist]
yield_stmt: yield_expr
raise_stmt: 'raise' [test ['from' test]]
import_stmt: import_name | import_from
import_name: 'import' dotted_as_names
# note below: the ('.' | '...') is necessary because '...' is tokenized as ELLIPSIS
import_from: ('from' (('.' | '...')* dotted_name | ('.' | '...')+)
'import' ('*' | '(' import_as_names ')' | import_as_names))
import_as_name: NAME ['as' NAME]
dotted_as_name: dotted_name ['as' NAME]
import_as_names: import_as_name (',' import_as_name)* [',']
dotted_as_names: dotted_as_name (',' dotted_as_name)*
dotted_name: NAME ('.' NAME)*
global_stmt: 'global' NAME (',' NAME)*
nonlocal_stmt: 'nonlocal' NAME (',' NAME)*
assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite
((except_clause ':' suite)+
['else' ':' suite]
['finally' ':' suite] |
'finally' ':' suite))
with_stmt: 'with' with_item (',' with_item)* ':' suite
with_item: test ['as' expr]
# NB compile.c makes sure that the default except clause is last
except_clause: 'except' [test ['as' NAME]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
test: or_test ['if' or_test 'else' test] | lambdef
test_nocond: or_test | lambdef_nocond
lambdef: 'lambda' [varargslist] ':' test
lambdef_nocond: 'lambda' [varargslist] ':' test_nocond
or_test: and_test ('or' and_test)*
and_test: not_test ('and' not_test)*
not_test: 'not' not_test | comparison
comparison: expr (comp_op expr)*
# <> isn't actually a valid comparison operator in Python. It's here for the
# sake of a __future__ import described in PEP 401
comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
star_expr: '*' expr
expr: xor_expr ('|' xor_expr)*
xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
power: atom trailer* ['**' factor]
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' |
NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False')
strings: STRING+
testlist_comp: (test|star_expr) ( comp_for | (',' (test|star_expr))* [','] )
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [',']
subscript: test | [test] ':' [test] [sliceop]
sliceop: ':' [test]
exprlist: (expr|star_expr) (',' (expr|star_expr))* [',']
testlist: test (',' test)* [',']
dictorsetmaker: ( (test ':' test (comp_for | (',' test ':' test)* [','])) |
(test (comp_for | (',' test)* [','])) )
classdef: 'class' NAME ['(' [arglist] ')'] ':' suite
arglist: (argument ',')* (argument [',']
|'*' test (',' argument)* [',' '**' test]
|'**' test)
# The reason that keywords are test nodes instead of NAME is that using NAME
# results in an ambiguity. ast.c makes sure it's a NAME.
argument: test [comp_for] | test '=' test # Really [keyword '='] test
comp_iter: comp_for | comp_if
comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_if: 'if' test_nocond [comp_iter]
# not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME
yield_expr: 'yield' [yield_arg]
yield_arg: 'from' test | testlist
+5 -4
@@ -9,7 +9,7 @@
# eval_input is the input for the eval() functions. # eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE! # NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER file_input: stmt* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+ decorators: decorator+
@@ -35,7 +35,7 @@ varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
) )
vfpdef: NAME vfpdef: NAME
stmt: simple_stmt | compound_stmt stmt: simple_stmt | compound_stmt | NEWLINE
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt) import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
@@ -140,7 +140,8 @@ argument: ( test [comp_for] |
'*' test ) '*' test )
comp_iter: comp_for | comp_if comp_iter: comp_for | comp_if
comp_for: ['async'] 'for' exprlist 'in' or_test [comp_iter] sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_for: ['async'] sync_comp_for
comp_if: 'if' test_nocond [comp_iter] comp_if: 'if' test_nocond [comp_iter]
# not used in grammar, but may appear in "node" passed from Parser to Compiler # not used in grammar, but may appear in "node" passed from Parser to Compiler
@@ -153,5 +154,5 @@ strings: (STRING | fstring)+
fstring: FSTRING_START fstring_content* FSTRING_END fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME fstring_conversion: '!' NAME
fstring_expr: '{' testlist_comp [ fstring_conversion ] [ fstring_format_spec ] '}' fstring_expr: '{' (testlist_comp | yield_expr) [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content* fstring_format_spec: ':' fstring_content*
+5 -4
@@ -9,7 +9,7 @@
# eval_input is the input for the eval() functions. # eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE! # NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER file_input: stmt* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+ decorators: decorator+
@@ -33,7 +33,7 @@ varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
) )
vfpdef: NAME vfpdef: NAME
stmt: simple_stmt | compound_stmt stmt: simple_stmt | compound_stmt | NEWLINE
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt) import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
@@ -138,7 +138,8 @@ argument: ( test [comp_for] |
'*' test ) '*' test )
comp_iter: comp_for | comp_if comp_iter: comp_for | comp_if
comp_for: ['async'] 'for' exprlist 'in' or_test [comp_iter] sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_for: ['async'] sync_comp_for
comp_if: 'if' test_nocond [comp_iter] comp_if: 'if' test_nocond [comp_iter]
# not used in grammar, but may appear in "node" passed from Parser to Compiler # not used in grammar, but may appear in "node" passed from Parser to Compiler
@@ -151,5 +152,5 @@ strings: (STRING | fstring)+
fstring: FSTRING_START fstring_content* FSTRING_END fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME fstring_conversion: '!' NAME
fstring_expr: '{' testlist [ fstring_conversion ] [ fstring_format_spec ] '}' fstring_expr: '{' (testlist_comp | yield_expr) [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content* fstring_format_spec: ':' fstring_content*
+23 -9
@@ -9,7 +9,7 @@
# eval_input is the input for the eval() functions. # eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE! # NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER file_input: stmt* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
@@ -20,13 +20,25 @@ async_funcdef: 'async' funcdef
funcdef: 'def' NAME parameters ['->' test] ':' suite funcdef: 'def' NAME parameters ['->' test] ':' suite
parameters: '(' [typedargslist] ')' parameters: '(' [typedargslist] ')'
typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [ typedargslist: (
(tfpdef ['=' test] (',' tfpdef ['=' test])* ',' '/' [',' [ tfpdef ['=' test] (
',' tfpdef ['=' test])* ([',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]])
| '*' [tfpdef] (',' tfpdef ['=' test])* ([',' ['**' tfpdef [',']]])
| '**' tfpdef [',']]] )
| (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]] '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]] | '**' tfpdef [',']]]
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]] | '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']) | '**' tfpdef [','])
)
tfpdef: NAME [':' test] tfpdef: NAME [':' test]
varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [ varargslist: vfpdef ['=' test ](',' vfpdef ['=' test])* ',' '/' [',' [ (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
'*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']) ]] | (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
'*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]] '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']]] | '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]] | '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
@@ -34,13 +46,13 @@ varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
) )
vfpdef: NAME vfpdef: NAME
stmt: simple_stmt | compound_stmt stmt: simple_stmt | compound_stmt | NEWLINE
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt) import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) | expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
('=' (yield_expr|testlist_star_expr))*) ('=' (yield_expr|testlist_star_expr))*)
annassign: ':' test ['=' test] annassign: ':' test ['=' (yield_expr|testlist_star_expr)]
testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [','] testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' | augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=') '<<=' | '>>=' | '**=' | '//=')
@@ -69,8 +81,8 @@ assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
async_stmt: 'async' (funcdef | with_stmt | for_stmt) async_stmt: 'async' (funcdef | with_stmt | for_stmt)
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite] if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite] while_stmt: 'while' namedexpr_test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite try_stmt: ('try' ':' suite
((except_clause ':' suite)+ ((except_clause ':' suite)+
@@ -83,6 +95,7 @@ with_item: test ['as' expr]
except_clause: 'except' [test ['as' NAME]] except_clause: 'except' [test ['as' NAME]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
namedexpr_test: test [':=' test]
test: or_test ['if' or_test 'else' test] | lambdef test: or_test ['if' or_test 'else' test] | lambdef
test_nocond: or_test | lambdef_nocond test_nocond: or_test | lambdef_nocond
lambdef: 'lambda' [varargslist] ':' test lambdef: 'lambda' [varargslist] ':' test
@@ -108,7 +121,7 @@ atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' | '[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' | '{' [dictorsetmaker] '}' |
NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False') NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False')
testlist_comp: (test|star_expr) ( comp_for | (',' (test|star_expr))* [','] ) testlist_comp: (namedexpr_test|star_expr) ( comp_for | (',' (namedexpr_test|star_expr))* [','] )
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [','] subscriptlist: subscript (',' subscript)* [',']
subscript: test | [test] ':' [test] [sliceop] subscript: test | [test] ':' [test] [sliceop]
@@ -134,6 +147,7 @@ arglist: argument (',' argument)* [',']
# multiple (test comp_for) arguments are blocked; keyword unpackings # multiple (test comp_for) arguments are blocked; keyword unpackings
# that precede iterable unpackings are blocked; etc. # that precede iterable unpackings are blocked; etc.
argument: ( test [comp_for] | argument: ( test [comp_for] |
test ':=' test |
test '=' test | test '=' test |
'**' test | '**' test |
'*' test ) '*' test )
@@ -153,5 +167,5 @@ strings: (STRING | fstring)+
fstring: FSTRING_START fstring_content* FSTRING_END fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME fstring_conversion: '!' NAME
fstring_expr: '{' testlist [ fstring_conversion ] [ fstring_format_spec ] '}' fstring_expr: '{' (testlist_comp | yield_expr) ['='] [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content* fstring_format_spec: ':' fstring_content*
@@ -1,14 +1,7 @@
# Grammar for Python # Grammar for Python
# Note: Changing the grammar specified in this file will most likely
# require corresponding changes in the parser module
# (../Modules/parsermodule.c). If you can't make the changes to
# that module yourself, please co-ordinate the required changes
# with someone who can; ask around on python-dev for help. Fred
# Drake <fdrake@acm.org> will probably be listening there.
# NOTE WELL: You should also follow all the steps listed at # NOTE WELL: You should also follow all the steps listed at
# https://docs.python.org/devguide/grammar.html # https://devguide.python.org/grammar/
# Start symbols for the grammar: # Start symbols for the grammar:
# single_input is a single interactive statement; # single_input is a single interactive statement;
@@ -16,44 +9,60 @@
# eval_input is the input for the eval() functions. # eval_input is the input for the eval() functions.
# NB: compound_stmt in single_input is followed by extra NEWLINE! # NB: compound_stmt in single_input is followed by extra NEWLINE!
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER file_input: stmt* ENDMARKER
eval_input: testlist NEWLINE* ENDMARKER eval_input: testlist NEWLINE* ENDMARKER
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorator: '@' namedexpr_test NEWLINE
decorators: decorator+ decorators: decorator+
decorated: decorators (classdef | funcdef | async_funcdef) decorated: decorators (classdef | funcdef | async_funcdef)
# NOTE: Reinoud Elhorst, using ASYNC/AWAIT keywords instead of tokens
# skipping python3.5 compatibility, in favour of 3.7 solution
async_funcdef: 'async' funcdef async_funcdef: 'async' funcdef
funcdef: 'def' NAME parameters ['->' test] ':' suite funcdef: 'def' NAME parameters ['->' test] ':' suite
parameters: '(' [typedargslist] ')' parameters: '(' [typedargslist] ')'
typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' typedargslist: (
['*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef]] (tfpdef ['=' test] (',' tfpdef ['=' test])* ',' '/' [',' [ tfpdef ['=' test] (
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef) ',' tfpdef ['=' test])* ([',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]])
| '*' [tfpdef] (',' tfpdef ['=' test])* ([',' ['**' tfpdef [',']]])
| '**' tfpdef [',']]] )
| (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
'*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [',']]]
| '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
| '**' tfpdef [','])
)
tfpdef: NAME [':' test] tfpdef: NAME [':' test]
varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' varargslist: vfpdef ['=' test ](',' vfpdef ['=' test])* ',' '/' [',' [ (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
['*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef]] '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef) | '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']) ]] | (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
'*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']
)
vfpdef: NAME vfpdef: NAME
stmt: simple_stmt | compound_stmt stmt: simple_stmt | compound_stmt | NEWLINE
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | nonlocal_stmt | assert_stmt) import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) | expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
('=' (yield_expr|testlist_star_expr))*) ('=' (yield_expr|testlist_star_expr))*)
annassign: ':' test ['=' (yield_expr|testlist_star_expr)]
testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [','] testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' | augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=') '<<=' | '>>=' | '**=' | '//=')
# For normal assignments, additional restrictions enforced by the interpreter # For normal and annotated assignments, additional restrictions enforced by the interpreter
del_stmt: 'del' exprlist del_stmt: 'del' exprlist
pass_stmt: 'pass' pass_stmt: 'pass'
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
break_stmt: 'break' break_stmt: 'break'
continue_stmt: 'continue' continue_stmt: 'continue'
return_stmt: 'return' [testlist] return_stmt: 'return' [testlist_star_expr]
yield_stmt: yield_expr yield_stmt: yield_expr
raise_stmt: 'raise' [test ['from' test]] raise_stmt: 'raise' [test ['from' test]]
import_stmt: import_name | import_from import_stmt: import_name | import_from
@@ -72,8 +81,8 @@ assert_stmt: 'assert' test [',' test]
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
async_stmt: 'async' (funcdef | with_stmt | for_stmt) async_stmt: 'async' (funcdef | with_stmt | for_stmt)
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite] if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite] while_stmt: 'while' namedexpr_test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
try_stmt: ('try' ':' suite try_stmt: ('try' ':' suite
((except_clause ':' suite)+ ((except_clause ':' suite)+
@@ -86,6 +95,7 @@ with_item: test ['as' expr]
except_clause: 'except' [test ['as' NAME]] except_clause: 'except' [test ['as' NAME]]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
namedexpr_test: test [':=' test]
test: or_test ['if' or_test 'else' test] | lambdef test: or_test ['if' or_test 'else' test] | lambdef
test_nocond: or_test | lambdef_nocond test_nocond: or_test | lambdef_nocond
lambdef: 'lambda' [varargslist] ':' test lambdef: 'lambda' [varargslist] ':' test
@@ -111,8 +121,7 @@ atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' | '[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' | '{' [dictorsetmaker] '}' |
NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False') NAME | NUMBER | strings | '...' | 'None' | 'True' | 'False')
strings: STRING+ testlist_comp: (namedexpr_test|star_expr) ( comp_for | (',' (namedexpr_test|star_expr))* [','] )
testlist_comp: (test|star_expr) ( comp_for | (',' (test|star_expr))* [','] )
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
subscriptlist: subscript (',' subscript)* [','] subscriptlist: subscript (',' subscript)* [',']
subscript: test | [test] ':' [test] [sliceop] subscript: test | [test] ':' [test] [sliceop]
@@ -121,8 +130,8 @@ exprlist: (expr|star_expr) (',' (expr|star_expr))* [',']
testlist: test (',' test)* [','] testlist: test (',' test)* [',']
dictorsetmaker: ( ((test ':' test | '**' expr) dictorsetmaker: ( ((test ':' test | '**' expr)
(comp_for | (',' (test ':' test | '**' expr))* [','])) | (comp_for | (',' (test ':' test | '**' expr))* [','])) |
((test | star_expr) ((test [':=' test] | star_expr)
(comp_for | (',' (test | star_expr))* [','])) ) (comp_for | (',' (test [':=' test] | star_expr))* [','])) )
classdef: 'class' NAME ['(' [arglist] ')'] ':' suite classdef: 'class' NAME ['(' [arglist] ')'] ':' suite
@@ -138,16 +147,25 @@ arglist: argument (',' argument)* [',']
# multiple (test comp_for) arguments are blocked; keyword unpackings # multiple (test comp_for) arguments are blocked; keyword unpackings
# that precede iterable unpackings are blocked; etc. # that precede iterable unpackings are blocked; etc.
argument: ( test [comp_for] | argument: ( test [comp_for] |
test ':=' test |
test '=' test | test '=' test |
'**' test | '**' test |
'*' test ) '*' test )
comp_iter: comp_for | comp_if comp_iter: comp_for | comp_if
comp_for: 'for' exprlist 'in' or_test [comp_iter] sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_for: ['async'] sync_comp_for
comp_if: 'if' test_nocond [comp_iter] comp_if: 'if' test_nocond [comp_iter]
# not used in grammar, but may appear in "node" passed from Parser to Compiler # not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME encoding_decl: NAME
yield_expr: 'yield' [yield_arg] yield_expr: 'yield' [yield_arg]
yield_arg: 'from' test | testlist yield_arg: 'from' test | testlist_star_expr
strings: (STRING | fstring)+
fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME
fstring_expr: '{' (testlist_comp | yield_expr) ['='] [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content*
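The grammar changes above add `namedexpr_test`, the positional-only `/` marker, starred expressions in `return` and annotated assignments, and the `=` specifier in f-string expressions. A minimal sketch of source code those rules are meant to accept follows. It uses CPython's own `ast` module (Python 3.8 or newer) rather than parso, so it illustrates the syntax only, not parso's behaviour; the snippet names and the `parses` helper are made up for this example.

```python
import ast


def parses(source: str) -> bool:
    """Return True if CPython's own parser accepts ``source``."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# One snippet per grammar rule touched in the diff above.
EXAMPLES = {
    "namedexpr_test in if_stmt": "if n := 10:\n    pass\n",
    "namedexpr_test in while_stmt": "while chunk := read():\n    pass\n",
    "'/' in typedargslist": "def f(a, /, b, *, c):\n    pass\n",
    "'/' in varargslist": "g = lambda a, /, b: a + b\n",
    "annassign with testlist_star_expr": "x: tuple = 1, 2\n",
    "return with testlist_star_expr": "def f(rest):\n    return 1, *rest\n",
    "'=' in fstring_expr": "f'{value=}'\n",
}

for rule, source in EXAMPLES.items():
    print(f"{rule}: {parses(source)}")
```

Each snippet is only parsed, never executed, so undefined names such as `read` or `value` do not matter here.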
+2 -10
@@ -24,7 +24,6 @@ A list of syntax/indentation errors I've encountered in CPython.
# Just ignore this one, newer versions will not be affected anymore and # Just ignore this one, newer versions will not be affected anymore and
# it's a limit of 2^16 - 1. # it's a limit of 2^16 - 1.
"too many annotations" # Only python 3.0 - 3.5, 3.6 is not affected.
# Python/ast.c # Python/ast.c
# used with_item exprlist expr_stmt # used with_item exprlist expr_stmt
@@ -54,8 +53,8 @@ A list of syntax/indentation errors I've encountered in CPython.
"iterable unpacking cannot be used in comprehension" # [*[] for a in [1]] "iterable unpacking cannot be used in comprehension" # [*[] for a in [1]]
"dict unpacking cannot be used in dict comprehension" # {**{} for a in [1]} "dict unpacking cannot be used in dict comprehension" # {**{} for a in [1]}
"Generator expression must be parenthesized if not sole argument" # foo(x for x in [], b) "Generator expression must be parenthesized if not sole argument" # foo(x for x in [], b)
"positional argument follows keyword argument unpacking" # f(**x, y) >= 3.5 "positional argument follows keyword argument unpacking" # f(**x, y)
"positional argument follows keyword argument" # f(x=2, y) >= 3.5 "positional argument follows keyword argument" # f(x=2, y)
"iterable argument unpacking follows keyword argument unpacking" # foo(**kwargs, *args) "iterable argument unpacking follows keyword argument unpacking" # foo(**kwargs, *args)
"lambda cannot contain assignment" # f(lambda: 1=1) "lambda cannot contain assignment" # f(lambda: 1=1)
"keyword can't be an expression" # f(+x=1) "keyword can't be an expression" # f(+x=1)
@@ -167,10 +166,3 @@ A list of syntax/indentation errors I've encountered in CPython.
E_OVERFLOW: "expression too long" E_OVERFLOW: "expression too long"
E_DECODE: "unknown decode error" E_DECODE: "unknown decode error"
E_BADSINGLE: "multiple statements found while compiling a single statement" E_BADSINGLE: "multiple statements found while compiling a single statement"
Version specific:
Python 3.5:
'yield' inside async function
Python 3.3/3.4:
can use starred expression only as assignment target
+10 -18
@@ -39,17 +39,14 @@ class Parser(BaseParser):
'for_stmt': tree.ForStmt, 'for_stmt': tree.ForStmt,
'while_stmt': tree.WhileStmt, 'while_stmt': tree.WhileStmt,
'try_stmt': tree.TryStmt, 'try_stmt': tree.TryStmt,
'comp_for': tree.CompFor, 'sync_comp_for': tree.SyncCompFor,
# Not sure if this is the best idea, but IMO it's the easiest way to # Not sure if this is the best idea, but IMO it's the easiest way to
# avoid extreme amounts of work around the subtle difference of 2/3 # avoid extreme amounts of work around the subtle difference of 2/3
# grammar in list comoprehensions. # grammar in list comoprehensions.
'list_for': tree.CompFor,
# Same here. This just exists in Python 2.6.
'gen_for': tree.CompFor,
'decorator': tree.Decorator, 'decorator': tree.Decorator,
'lambdef': tree.Lambda, 'lambdef': tree.Lambda,
'old_lambdef': tree.Lambda,
'lambdef_nocond': tree.Lambda, 'lambdef_nocond': tree.Lambda,
'namedexpr_test': tree.NamedExpr,
} }
default_node = tree.PythonNode default_node = tree.PythonNode
@@ -65,8 +62,8 @@ class Parser(BaseParser):
} }
def __init__(self, pgen_grammar, error_recovery=True, start_nonterminal='file_input'): def __init__(self, pgen_grammar, error_recovery=True, start_nonterminal='file_input'):
super(Parser, self).__init__(pgen_grammar, start_nonterminal, super().__init__(pgen_grammar, start_nonterminal,
error_recovery=error_recovery) error_recovery=error_recovery)
self.syntax_errors = [] self.syntax_errors = []
self._omit_dedent_list = [] self._omit_dedent_list = []
@@ -79,7 +76,7 @@ class Parser(BaseParser):
tokens = self._recovery_tokenize(tokens) tokens = self._recovery_tokenize(tokens)
return super(Parser, self).parse(tokens) return super().parse(tokens)
def convert_node(self, nonterminal, children): def convert_node(self, nonterminal, children):
""" """
@@ -98,12 +95,6 @@ class Parser(BaseParser):
# ones and therefore have pseudo start/end positions and no # ones and therefore have pseudo start/end positions and no
# prefixes. Just ignore them. # prefixes. Just ignore them.
children = [children[0]] + children[2:-1] children = [children[0]] + children[2:-1]
elif nonterminal == 'list_if':
# Make transitioning from 2 to 3 easier.
nonterminal = 'comp_if'
elif nonterminal == 'listmaker':
# Same as list_if above.
nonterminal = 'testlist_comp'
node = self.default_node(nonterminal, children) node = self.default_node(nonterminal, children)
for c in children: for c in children:
c.parent = node c.parent = node
@@ -128,10 +119,10 @@ class Parser(BaseParser):
if self._start_nonterminal == 'file_input' and \ if self._start_nonterminal == 'file_input' and \
(token.type == PythonTokenTypes.ENDMARKER (token.type == PythonTokenTypes.ENDMARKER
or token.type == DEDENT and '\n' not in last_leaf.value or token.type == DEDENT and not last_leaf.value.endswith('\n')
and '\r' not in last_leaf.value): and not last_leaf.value.endswith('\r')):
# In Python statements need to end with a newline. But since it's # In Python statements need to end with a newline. But since it's
# possible (and valid in Python ) that there's no newline at the # possible (and valid in Python) that there's no newline at the
# end of a file, we have to recover even if the user doesn't want # end of a file, we have to recover even if the user doesn't want
# error recovery. # error recovery.
if self.stack[-1].dfa.from_rule == 'simple_stmt': if self.stack[-1].dfa.from_rule == 'simple_stmt':
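The hunk above swaps a containment test (`'\n' not in last_leaf.value`) for a suffix test (`not last_leaf.value.endswith('\n')`). The two differ for any value that contains a line break but does not end with one. The sketch below compares both conditions on hypothetical leaf values; which real leaves triggered the change is not stated in the diff, so the multi-line string is an assumption chosen for illustration.

```python
def old_check(value: str) -> bool:
    # Pre-change condition: no line break anywhere in the leaf's value.
    return '\n' not in value and '\r' not in value


def new_check(value: str) -> bool:
    # Post-change condition: the value merely must not end with a line break.
    return not value.endswith('\n') and not value.endswith('\r')


# Hypothetical leaf values, chosen for illustration only.
samples = ['pass', 'pass\n', '"""a\nb"""']

for value in samples:
    print(repr(value), old_check(value), new_check(value))
```

Only the last sample separates the two checks: it contains a newline in the middle, so the old condition is false while the new one is true.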
@@ -148,7 +139,7 @@ class Parser(BaseParser):
return return
if not self._error_recovery: if not self._error_recovery:
return super(Parser, self).error_recovery(token) return super().error_recovery(token)
def current_suite(stack): def current_suite(stack):
# For now just discard everything that is not a suite or # For now just discard everything that is not a suite or
@@ -210,6 +201,7 @@ class Parser(BaseParser):
o = self._omit_dedent_list o = self._omit_dedent_list
if o and o[-1] == self._indent_counter: if o and o[-1] == self._indent_counter:
o.pop() o.pop()
self._indent_counter -= 1
continue continue
self._indent_counter -= 1 self._indent_counter -= 1
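Several hunks in this file replace `super(Parser, self)` with the zero-argument `super()`. In Python 3 the two forms resolve to the same parent method inside a class body, as this small sketch shows. `BaseParser`, `OldStyle` and `NewStyle` here are stand-in classes written for the example, not parso's real ones.

```python
class BaseParser:
    # Stand-in for a base class; the signature loosely mirrors the diff.
    def __init__(self, grammar, start_nonterminal='file_input', error_recovery=False):
        self.grammar = grammar
        self.start_nonterminal = start_nonterminal
        self.error_recovery = error_recovery


class OldStyle(BaseParser):
    def __init__(self, grammar, error_recovery=True):
        # Explicit two-argument form, as on the left side of the diff.
        super(OldStyle, self).__init__(grammar, error_recovery=error_recovery)


class NewStyle(BaseParser):
    def __init__(self, grammar, error_recovery=True):
        # Zero-argument form, as on the right side of the diff.
        super().__init__(grammar, error_recovery=error_recovery)


old, new = OldStyle('g'), NewStyle('g')
print(vars(old) == vars(new))
```

The zero-argument form also survives a class rename, since the class name is no longer repeated inside the method.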
+74 -36
@@ -1,5 +1,6 @@
import re import re
from contextlib import contextmanager from contextlib import contextmanager
from typing import Tuple
from parso.python.errors import ErrorFinder, ErrorFinderConfig from parso.python.errors import ErrorFinder, ErrorFinderConfig
from parso.normalizer import Rule from parso.normalizer import Rule
@@ -15,16 +16,17 @@ _CLOSING_BRACKETS = ')', ']', '}'
_FACTOR = '+', '-', '~' _FACTOR = '+', '-', '~'
_ALLOW_SPACE = '*', '+', '-', '**', '/', '//', '@' _ALLOW_SPACE = '*', '+', '-', '**', '/', '//', '@'
_BITWISE_OPERATOR = '<<', '>>', '|', '&', '^' _BITWISE_OPERATOR = '<<', '>>', '|', '&', '^'
_NEEDS_SPACE = ('=', '%', '->', _NEEDS_SPACE: Tuple[str, ...] = (
'<', '>', '==', '>=', '<=', '<>', '!=', '=', '%', '->',
'+=', '-=', '*=', '@=', '/=', '%=', '&=', '|=', '^=', '<<=', '<', '>', '==', '>=', '<=', '<>', '!=',
'>>=', '**=', '//=') '+=', '-=', '*=', '@=', '/=', '%=', '&=', '|=', '^=', '<<=',
'>>=', '**=', '//=')
_NEEDS_SPACE += _BITWISE_OPERATOR _NEEDS_SPACE += _BITWISE_OPERATOR
_IMPLICIT_INDENTATION_TYPES = ('dictorsetmaker', 'argument') _IMPLICIT_INDENTATION_TYPES = ('dictorsetmaker', 'argument')
_POSSIBLE_SLICE_PARENTS = ('subscript', 'subscriptlist', 'sliceop') _POSSIBLE_SLICE_PARENTS = ('subscript', 'subscriptlist', 'sliceop')
class IndentationTypes(object): class IndentationTypes:
VERTICAL_BRACKET = object() VERTICAL_BRACKET = object()
HANGING_BRACKET = object() HANGING_BRACKET = object()
BACKSLASH = object() BACKSLASH = object()
@@ -71,7 +73,6 @@ class BracketNode(IndentationNode):
n = n.parent n = n.parent
parent_indentation = n.indentation parent_indentation = n.indentation
next_leaf = leaf.get_next_leaf() next_leaf = leaf.get_next_leaf()
if '\n' in next_leaf.prefix: if '\n' in next_leaf.prefix:
# This implies code like: # This implies code like:
@@ -93,7 +94,7 @@ class BracketNode(IndentationNode):
if '\t' in config.indentation: if '\t' in config.indentation:
self.indentation = None self.indentation = None
else: else:
self.indentation = ' ' * expected_end_indent self.indentation = ' ' * expected_end_indent
self.bracket_indentation = self.indentation self.bracket_indentation = self.indentation
self.type = IndentationTypes.VERTICAL_BRACKET self.type = IndentationTypes.VERTICAL_BRACKET
@@ -111,7 +112,7 @@ class ImplicitNode(BracketNode):
annotations and dict values. annotations and dict values.
""" """
def __init__(self, config, leaf, parent): def __init__(self, config, leaf, parent):
super(ImplicitNode, self).__init__(config, leaf, parent) super().__init__(config, leaf, parent)
self.type = IndentationTypes.IMPLICIT self.type = IndentationTypes.IMPLICIT
next_leaf = leaf.get_next_leaf() next_leaf = leaf.get_next_leaf()
@@ -137,7 +138,7 @@ class BackslashNode(IndentationNode):
self.indentation = parent_indentation + config.indentation self.indentation = parent_indentation + config.indentation
else: else:
# +1 because there is a space. # +1 because there is a space.
self.indentation = ' ' * (equals.end_pos[1] + 1) self.indentation = ' ' * (equals.end_pos[1] + 1)
else: else:
self.indentation = parent_indentation + config.indentation self.indentation = parent_indentation + config.indentation
self.bracket_indentation = self.indentation self.bracket_indentation = self.indentation
@@ -150,7 +151,7 @@ def _is_magic_name(name):
class PEP8Normalizer(ErrorFinder): class PEP8Normalizer(ErrorFinder):
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super(PEP8Normalizer, self).__init__(*args, **kwargs) super().__init__(*args, **kwargs)
self._previous_part = None self._previous_part = None
self._previous_leaf = None self._previous_leaf = None
self._on_newline = True self._on_newline = True
@@ -173,7 +174,7 @@ class PEP8Normalizer(ErrorFinder):
@contextmanager @contextmanager
def visit_node(self, node): def visit_node(self, node):
with super(PEP8Normalizer, self).visit_node(node): with super().visit_node(node):
with self._visit_node(node): with self._visit_node(node):
yield yield
@@ -190,7 +191,8 @@ class PEP8Normalizer(ErrorFinder):
expr_stmt = node.parent expr_stmt = node.parent
# Check if it's simply defining a single name, not something like # Check if it's simply defining a single name, not something like
# foo.bar or x[1], where using a lambda could make more sense. # foo.bar or x[1], where using a lambda could make more sense.
if expr_stmt.type == 'expr_stmt' and any(n.type == 'name' for n in expr_stmt.children[:-2:2]): if expr_stmt.type == 'expr_stmt' and any(n.type == 'name'
for n in expr_stmt.children[:-2:2]):
self.add_issue(node, 731, 'Do not assign a lambda expression, use a def') self.add_issue(node, 731, 'Do not assign a lambda expression, use a def')
elif typ == 'try_stmt': elif typ == 'try_stmt':
for child in node.children: for child in node.children:
@@ -221,7 +223,6 @@ class PEP8Normalizer(ErrorFinder):
if typ in _IMPORT_TYPES: if typ in _IMPORT_TYPES:
simple_stmt = node.parent simple_stmt = node.parent
module = simple_stmt.parent module = simple_stmt.parent
#if module.type == 'simple_stmt':
if module.type == 'file_input': if module.type == 'file_input':
index = module.children.index(simple_stmt) index = module.children.index(simple_stmt)
for child in module.children[:index]: for child in module.children[:index]:
@@ -341,7 +342,7 @@ class PEP8Normalizer(ErrorFinder):
self._newline_count = 0 self._newline_count = 0
def visit_leaf(self, leaf): def visit_leaf(self, leaf):
super(PEP8Normalizer, self).visit_leaf(leaf) super().visit_leaf(leaf)
for part in leaf._split_prefix(): for part in leaf._split_prefix():
if part.type == 'spacing': if part.type == 'spacing':
# This part is used for the part call after for. # This part is used for the part call after for.
@@ -406,7 +407,6 @@ class PEP8Normalizer(ErrorFinder):
and leaf.parent.parent.type == 'decorated': and leaf.parent.parent.type == 'decorated':
self.add_issue(part, 304, "Blank lines found after function decorator") self.add_issue(part, 304, "Blank lines found after function decorator")
self._newline_count += 1 self._newline_count += 1
if type_ == 'backslash': if type_ == 'backslash':
@@ -461,33 +461,62 @@ class PEP8Normalizer(ErrorFinder):
else: else:
should_be_indentation = node.indentation should_be_indentation = node.indentation
if self._in_suite_introducer and indentation == \ if self._in_suite_introducer and indentation == \
node.get_latest_suite_node().indentation \ node.get_latest_suite_node().indentation \
+ self._config.indentation: + self._config.indentation:
self.add_issue(part, 129, "Line with same indent as next logical block") self.add_issue(part, 129, "Line with same indent as next logical block")
elif indentation != should_be_indentation: elif indentation != should_be_indentation:
if not self._check_tabs_spaces(spacing) and part.value != '\n': if not self._check_tabs_spaces(spacing) and part.value != '\n':
if value in '])}': if value in '])}':
if node.type == IndentationTypes.VERTICAL_BRACKET: if node.type == IndentationTypes.VERTICAL_BRACKET:
self.add_issue(part, 124, "Closing bracket does not match visual indentation") self.add_issue(
part,
124,
"Closing bracket does not match visual indentation"
)
else: else:
self.add_issue(part, 123, "Losing bracket does not match indentation of opening bracket's line") self.add_issue(
part,
123,
"Losing bracket does not match "
"indentation of opening bracket's line"
)
else: else:
if len(indentation) < len(should_be_indentation): if len(indentation) < len(should_be_indentation):
if node.type == IndentationTypes.VERTICAL_BRACKET: if node.type == IndentationTypes.VERTICAL_BRACKET:
self.add_issue(part, 128, 'Continuation line under-indented for visual indent') self.add_issue(
part,
128,
'Continuation line under-indented for visual indent'
)
elif node.type == IndentationTypes.BACKSLASH: elif node.type == IndentationTypes.BACKSLASH:
self.add_issue(part, 122, 'Continuation line missing indentation or outdented') self.add_issue(
part,
122,
'Continuation line missing indentation or outdented'
)
elif node.type == IndentationTypes.IMPLICIT: elif node.type == IndentationTypes.IMPLICIT:
self.add_issue(part, 135, 'xxx') self.add_issue(part, 135, 'xxx')
else: else:
self.add_issue(part, 121, 'Continuation line under-indented for hanging indent') self.add_issue(
part,
121,
'Continuation line under-indented for hanging indent'
)
else: else:
if node.type == IndentationTypes.VERTICAL_BRACKET: if node.type == IndentationTypes.VERTICAL_BRACKET:
self.add_issue(part, 127, 'Continuation line over-indented for visual indent') self.add_issue(
part,
127,
'Continuation line over-indented for visual indent'
)
elif node.type == IndentationTypes.IMPLICIT: elif node.type == IndentationTypes.IMPLICIT:
self.add_issue(part, 136, 'xxx') self.add_issue(part, 136, 'xxx')
else: else:
self.add_issue(part, 126, 'Continuation line over-indented for hanging indent') self.add_issue(
part,
126,
'Continuation line over-indented for hanging indent'
)
else: else:
self._check_spacing(part, spacing) self._check_spacing(part, spacing)
@@ -524,7 +553,7 @@ class PEP8Normalizer(ErrorFinder):
else: else:
last_column = part.end_pos[1] last_column = part.end_pos[1]
if last_column > self._config.max_characters \ if last_column > self._config.max_characters \
and spacing.start_pos[1] <= self._config.max_characters : and spacing.start_pos[1] <= self._config.max_characters:
# Special case for long URLs in multi-line docstrings or comments, # Special case for long URLs in multi-line docstrings or comments,
# but still report the error when the 72 first chars are whitespaces. # but still report the error when the 72 first chars are whitespaces.
report = True report = True
@@ -538,7 +567,7 @@ class PEP8Normalizer(ErrorFinder):
part, part,
501, 501,
'Line too long (%s > %s characters)' % 'Line too long (%s > %s characters)' %
(last_column, self._config.max_characters), (last_column, self._config.max_characters),
) )
def _check_spacing(self, part, spacing): def _check_spacing(self, part, spacing):
@@ -573,11 +602,11 @@ class PEP8Normalizer(ErrorFinder):
message = "Whitespace before '%s'" % part.value message = "Whitespace before '%s'" % part.value
add_if_spaces(spacing, 202, message) add_if_spaces(spacing, 202, message)
elif part in (',', ';') or part == ':' \ elif part in (',', ';') or part == ':' \
and part.parent.type not in _POSSIBLE_SLICE_PARENTS: and part.parent.type not in _POSSIBLE_SLICE_PARENTS:
message = "Whitespace before '%s'" % part.value message = "Whitespace before '%s'" % part.value
add_if_spaces(spacing, 203, message) add_if_spaces(spacing, 203, message)
elif prev == ':' and prev.parent.type in _POSSIBLE_SLICE_PARENTS: elif prev == ':' and prev.parent.type in _POSSIBLE_SLICE_PARENTS:
pass # TODO pass # TODO
elif prev in (',', ';', ':'): elif prev in (',', ';', ':'):
add_not_spaces(spacing, 231, "missing whitespace after '%s'") add_not_spaces(spacing, 231, "missing whitespace after '%s'")
elif part == ':': # Is a subscript elif part == ':': # Is a subscript
@@ -602,9 +631,17 @@ class PEP8Normalizer(ErrorFinder):
if param.type == 'param' and param.annotation: if param.type == 'param' and param.annotation:
add_not_spaces(spacing, 252, 'Expected spaces around annotation equals') add_not_spaces(spacing, 252, 'Expected spaces around annotation equals')
else: else:
add_if_spaces(spacing, 251, 'Unexpected spaces around keyword / parameter equals') add_if_spaces(
spacing,
251,
'Unexpected spaces around keyword / parameter equals'
)
elif part in _BITWISE_OPERATOR or prev in _BITWISE_OPERATOR: elif part in _BITWISE_OPERATOR or prev in _BITWISE_OPERATOR:
add_not_spaces(spacing, 227, 'Missing whitespace around bitwise or shift operator') add_not_spaces(
spacing,
227,
'Missing whitespace around bitwise or shift operator'
)
elif part == '%' or prev == '%': elif part == '%' or prev == '%':
add_not_spaces(spacing, 228, 'Missing whitespace around modulo operator') add_not_spaces(spacing, 228, 'Missing whitespace around modulo operator')
else: else:
@@ -621,8 +658,7 @@ class PEP8Normalizer(ErrorFinder):
if spaces and part not in _ALLOW_SPACE and prev not in _ALLOW_SPACE: if spaces and part not in _ALLOW_SPACE and prev not in _ALLOW_SPACE:
message_225 = 'Missing whitespace between tokens' message_225 = 'Missing whitespace between tokens'
#print('xy', spacing) # self.add_issue(spacing, 225, message_225)
#self.add_issue(spacing, 225, message_225)
# TODO why only brackets? # TODO why only brackets?
if part in _OPENING_BRACKETS: if part in _OPENING_BRACKETS:
message = "Whitespace before '%s'" % part.value message = "Whitespace before '%s'" % part.value
@@ -664,7 +700,8 @@ class PEP8Normalizer(ErrorFinder):
self.add_issue(leaf, 711, message) self.add_issue(leaf, 711, message)
break break
elif node.value in ('True', 'False'): elif node.value in ('True', 'False'):
message = "comparison to False/True should be 'if cond is True:' or 'if cond:'" message = "comparison to False/True should be " \
"'if cond is True:' or 'if cond:'"
self.add_issue(leaf, 712, message) self.add_issue(leaf, 712, message)
break break
elif leaf.value in ('in', 'is'): elif leaf.value in ('in', 'is'):
@@ -680,6 +717,7 @@ class PEP8Normalizer(ErrorFinder):
indentation = re.match(r'[ \t]*', line).group(0) indentation = re.match(r'[ \t]*', line).group(0)
start_pos = leaf.line + i, len(indentation) start_pos = leaf.line + i, len(indentation)
# TODO check multiline indentation. # TODO check multiline indentation.
start_pos
elif typ == 'endmarker': elif typ == 'endmarker':
if self._newline_count >= 2: if self._newline_count >= 2:
self.add_issue(leaf, 391, 'Blank line at end of file') self.add_issue(leaf, 391, 'Blank line at end of file')
@@ -694,7 +732,7 @@ class PEP8Normalizer(ErrorFinder):
return return
if code in (901, 903): if code in (901, 903):
# 901 and 903 are raised by the ErrorFinder. # 901 and 903 are raised by the ErrorFinder.
super(PEP8Normalizer, self).add_issue(node, code, message) super().add_issue(node, code, message)
else: else:
# Skip ErrorFinder here, because it has custom behavior. # Skip ErrorFinder here, because it has custom behavior.
super(ErrorFinder, self).add_issue(node, code, message) super(ErrorFinder, self).add_issue(node, code, message)
@@ -718,7 +756,7 @@ class PEP8NormalizerConfig(ErrorFinderConfig):
# TODO this is not yet ready. # TODO this is not yet ready.
#@PEP8Normalizer.register_rule(type='endmarker') # @PEP8Normalizer.register_rule(type='endmarker')
class BlankLineAtEnd(Rule): class BlankLineAtEnd(Rule):
code = 392 code = 392
message = 'Blank line at end of file' message = 'Blank line at end of file'
+2 -2
@@ -6,7 +6,7 @@ from parso.python.tokenize import group
unicode_bom = BOM_UTF8.decode('utf-8') unicode_bom = BOM_UTF8.decode('utf-8')
class PrefixPart(object): class PrefixPart:
def __init__(self, leaf, typ, value, spacing='', start_pos=None): def __init__(self, leaf, typ, value, spacing='', start_pos=None):
assert start_pos is not None assert start_pos is not None
self.parent = leaf self.parent = leaf
@@ -71,7 +71,7 @@ def split_prefix(leaf, start_pos):
value = spacing = '' value = spacing = ''
bom = False bom = False
while start != len(leaf.prefix): while start != len(leaf.prefix):
match =_regex.match(leaf.prefix, start) match = _regex.match(leaf.prefix, start)
spacing = match.group(1) spacing = match.group(1)
value = match.group(2) value = match.group(2)
if not value: if not value:
+21 -17
@@ -1,8 +1,13 @@
from __future__ import absolute_import from __future__ import absolute_import
from enum import Enum
class TokenType(object):
def __init__(self, name, contains_syntax=False): class TokenType:
name: str
contains_syntax: bool
def __init__(self, name: str, contains_syntax: bool = False):
self.name = name self.name = name
self.contains_syntax = contains_syntax self.contains_syntax = contains_syntax
@@ -10,18 +15,17 @@ class TokenType(object):
return '%s(%s)' % (self.__class__.__name__, self.name) return '%s(%s)' % (self.__class__.__name__, self.name)
class TokenTypes(object): class PythonTokenTypes(Enum):
""" STRING = TokenType('STRING')
Basically an enum, but Python 2 doesn't have enums in the standard library. NUMBER = TokenType('NUMBER')
""" NAME = TokenType('NAME', contains_syntax=True)
def __init__(self, names, contains_syntax): ERRORTOKEN = TokenType('ERRORTOKEN')
for name in names: NEWLINE = TokenType('NEWLINE')
setattr(self, name, TokenType(name, contains_syntax=name in contains_syntax)) INDENT = TokenType('INDENT')
DEDENT = TokenType('DEDENT')
ERROR_DEDENT = TokenType('ERROR_DEDENT')
PythonTokenTypes = TokenTypes(( FSTRING_STRING = TokenType('FSTRING_STRING')
'STRING', 'NUMBER', 'NAME', 'ERRORTOKEN', 'NEWLINE', 'INDENT', 'DEDENT', FSTRING_START = TokenType('FSTRING_START')
'ERROR_DEDENT', 'FSTRING_STRING', 'FSTRING_START', 'FSTRING_END', 'OP', FSTRING_END = TokenType('FSTRING_END')
'ENDMARKER'), OP = TokenType('OP', contains_syntax=True)
contains_syntax=('NAME', 'OP'), ENDMARKER = TokenType('ENDMARKER')
)
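The change above replaces the hand-rolled `TokenTypes` container with a real `Enum` whose member values are `TokenType` instances. A minimal, self-contained sketch of that pattern follows. It reuses the names from the diff but is trimmed to three members for illustration, so it is not the module itself:

```python
from enum import Enum


class TokenType:
    name: str
    contains_syntax: bool

    def __init__(self, name: str, contains_syntax: bool = False):
        self.name = name
        self.contains_syntax = contains_syntax

    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)


class PythonTokenTypes(Enum):
    # Trimmed to three members for illustration only.
    STRING = TokenType('STRING')
    NAME = TokenType('NAME', contains_syntax=True)
    OP = TokenType('OP', contains_syntax=True)


# The enum member carries its own name; the wrapped TokenType is in .value.
print(PythonTokenTypes.NAME.name)
print(PythonTokenTypes.NAME.value.contains_syntax)
```

With an `Enum`, the members can be iterated and compared by identity, and the `contains_syntax` flag is reached through `.value`.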
+251 -174
@@ -12,17 +12,19 @@ memory optimizations here.
from __future__ import absolute_import from __future__ import absolute_import
import sys import sys
import string
import re import re
from collections import namedtuple
import itertools as _itertools import itertools as _itertools
from codecs import BOM_UTF8 from codecs import BOM_UTF8
from typing import NamedTuple, Tuple, Iterator, Iterable, List, Dict, \
Pattern, Set
from parso.python.token import PythonTokenTypes from parso.python.token import PythonTokenTypes
from parso._compatibility import py_version from parso.utils import split_lines, PythonVersionInfo, parse_version_string
from parso.utils import split_lines
# Maximum code point of Unicode 6.0: 0x10ffff (1,114,111)
MAX_UNICODE = '\U0010ffff'
STRING = PythonTokenTypes.STRING STRING = PythonTokenTypes.STRING
NAME = PythonTokenTypes.NAME NAME = PythonTokenTypes.NAME
NUMBER = PythonTokenTypes.NUMBER NUMBER = PythonTokenTypes.NUMBER
@@ -37,26 +39,23 @@ FSTRING_START = PythonTokenTypes.FSTRING_START
FSTRING_STRING = PythonTokenTypes.FSTRING_STRING FSTRING_STRING = PythonTokenTypes.FSTRING_STRING
FSTRING_END = PythonTokenTypes.FSTRING_END FSTRING_END = PythonTokenTypes.FSTRING_END
TokenCollection = namedtuple(
'TokenCollection', class TokenCollection(NamedTuple):
'pseudo_token single_quoted triple_quoted endpats whitespace ' pseudo_token: Pattern
'fstring_pattern_map always_break_tokens', single_quoted: Set[str]
) triple_quoted: Set[str]
endpats: Dict[str, Pattern]
whitespace: Pattern
fstring_pattern_map: Dict[str, str]
always_break_tokens: Tuple[str]
BOM_UTF8_STRING = BOM_UTF8.decode('utf-8') BOM_UTF8_STRING = BOM_UTF8.decode('utf-8')
_token_collection_cache = {} _token_collection_cache: Dict[PythonVersionInfo, TokenCollection] = {}
if py_version >= 30:
# Python 3 has str.isidentifier() to check if a char is a valid identifier
is_identifier = str.isidentifier
else:
namechars = string.ascii_letters + '_'
is_identifier = lambda s: s in namechars
def group(*choices, **kwargs): def group(*choices, capture=False, **kwargs):
capture = kwargs.pop('capture', False) # Python 2, arrghhhhh :(
assert not kwargs assert not kwargs
start = '(' start = '('
@@ -70,19 +69,17 @@ def maybe(*choices):
# Return the empty string, plus all of the valid string prefixes. # Return the empty string, plus all of the valid string prefixes.
def _all_string_prefixes(version_info, include_fstring=False, only_fstring=False): def _all_string_prefixes(*, include_fstring=False, only_fstring=False):
def different_case_versions(prefix): def different_case_versions(prefix):
for s in _itertools.product(*[(c, c.upper()) for c in prefix]): for s in _itertools.product(*[(c, c.upper()) for c in prefix]):
yield ''.join(s) yield ''.join(s)
# The valid string prefixes. Only contain the lower case versions, # The valid string prefixes. Only contain the lower case versions,
# and don't contain any permuations (include 'fr', but not # and don't contain any permuations (include 'fr', but not
# 'rf'). The various permutations will be generated. # 'rf'). The various permutations will be generated.
valid_string_prefixes = ['b', 'r', 'u'] valid_string_prefixes = ['b', 'r', 'u', 'br']
if version_info >= (3, 0):
valid_string_prefixes.append('br')
result = set(['']) result = {''}
if version_info >= (3, 6) and include_fstring: if include_fstring:
f = ['f', 'fr'] f = ['f', 'fr']
if only_fstring: if only_fstring:
valid_string_prefixes = f valid_string_prefixes = f
@@ -98,10 +95,6 @@ def _all_string_prefixes(version_info, include_fstring=False, only_fstring=False
# create a list with upper and lower versions of each # create a list with upper and lower versions of each
# character # character
result.update(different_case_versions(t)) result.update(different_case_versions(t))
if version_info <= (2, 7):
# In Python 2 the order cannot just be random.
result.update(different_case_versions('ur'))
result.update(different_case_versions('br'))
return result return result
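The prefix expansion in this hunk can be tried in isolation. The sketch below is a simplified stand-in, not the function from the diff: `expand_prefixes` is a hypothetical helper, the f-string branches are left out, and the use of `itertools.permutations` is an assumption based on the comment that "the various permutations will be generated".

```python
import itertools


def different_case_versions(prefix):
    # Yield every upper/lower-case combination of the given characters.
    for s in itertools.product(*[(c, c.upper()) for c in prefix]):
        yield ''.join(s)


def expand_prefixes(prefixes):
    # Hypothetical helper: start with the empty prefix, then add every
    # ordering and every case variant of each listed prefix.
    result = {''}
    for prefix in prefixes:
        for t in itertools.permutations(prefix):
            result.update(different_case_versions(t))
    return result


prefixes = expand_prefixes(['b', 'r', 'u', 'br'])
print(sorted(prefixes))
```

Because `'br'` is now always in the base list, both orderings (`br`, `rb`) and all case variants are generated without any version check, while `ur` is never produced.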
@@ -118,8 +111,16 @@ def _get_token_collection(version_info):
return result return result
fstring_string_single_line = _compile(r'(?:[^{}\r\n]+|\{\{|\}\})+') unicode_character_name = r'[A-Za-z0-9\-]+(?: [A-Za-z0-9\-]+)*'
fstring_string_multi_line = _compile(r'(?:[^{}]+|\{\{|\}\})+') fstring_string_single_line = _compile(
r'(?:\{\{|\}\}|\\N\{' + unicode_character_name
+ r'\}|\\(?:\r\n?|\n)|\\[^\r\nN]|[^{}\r\n\\])+'
)
fstring_string_multi_line = _compile(
r'(?:\{\{|\}\}|\\N\{' + unicode_character_name + r'\}|\\[^N]|[^{}\\])+'
)
fstring_format_spec_single_line = _compile(r'(?:\\(?:\r\n?|\n)|[^{}\r\n])+')
fstring_format_spec_multi_line = _compile(r'[^{}]+')
def _create_token_collection(version_info): def _create_token_collection(version_info):
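To see what the new single-line f-string pattern accepts, the regex from this hunk can be run on its own. This sketch assumes that plain `re.compile` is an adequate stand-in for the module's `_compile` helper, and `literal_part` is a hypothetical function added only for the demonstration.

```python
import re

unicode_character_name = r'[A-Za-z0-9\-]+(?: [A-Za-z0-9\-]+)*'
fstring_string_single_line = re.compile(
    r'(?:\{\{|\}\}|\\N\{' + unicode_character_name
    + r'\}|\\(?:\r\n?|\n)|\\[^\r\nN]|[^{}\r\n\\])+'
)


def literal_part(text):
    # Hypothetical helper: return the leading literal (non-expression)
    # part of an f-string body, or None if nothing matches.
    match = fstring_string_single_line.match(text)
    return match.group(0) if match else None


# A well-formed named escape is consumed as literal text; the match stops
# at the `{` that opens a real expression.
print(repr(literal_part('\\N{BULLET} x{y}')))
# Leading whitespace inside the braces is not name-shaped, so no match.
print(repr(literal_part('\\N{  A B}x')))
```

The pattern only checks that the name is name-shaped. It does not check that the character name actually exists.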
@@ -128,42 +129,27 @@ def _create_token_collection(version_info):
Whitespace = r'[ \f\t]*' Whitespace = r'[ \f\t]*'
whitespace = _compile(Whitespace) whitespace = _compile(Whitespace)
Comment = r'#[^\r\n]*' Comment = r'#[^\r\n]*'
Name = r'\w+' Name = '([A-Za-z_0-9\u0080-' + MAX_UNICODE + ']+)'
if version_info >= (3, 6): Hexnumber = r'0[xX](?:_?[0-9a-fA-F])+'
Hexnumber = r'0[xX](?:_?[0-9a-fA-F])+' Binnumber = r'0[bB](?:_?[01])+'
Binnumber = r'0[bB](?:_?[01])+' Octnumber = r'0[oO](?:_?[0-7])+'
Octnumber = r'0[oO](?:_?[0-7])+' Decnumber = r'(?:0(?:_?0)*|[1-9](?:_?[0-9])*)'
Decnumber = r'(?:0(?:_?0)*|[1-9](?:_?[0-9])*)' Intnumber = group(Hexnumber, Binnumber, Octnumber, Decnumber)
Intnumber = group(Hexnumber, Binnumber, Octnumber, Decnumber) Exponent = r'[eE][-+]?[0-9](?:_?[0-9])*'
Exponent = r'[eE][-+]?[0-9](?:_?[0-9])*' Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?',
Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?', r'\.[0-9](?:_?[0-9])*') + maybe(Exponent)
r'\.[0-9](?:_?[0-9])*') + maybe(Exponent) Expfloat = r'[0-9](?:_?[0-9])*' + Exponent
Expfloat = r'[0-9](?:_?[0-9])*' + Exponent Floatnumber = group(Pointfloat, Expfloat)
Floatnumber = group(Pointfloat, Expfloat) Imagnumber = group(r'[0-9](?:_?[0-9])*[jJ]', Floatnumber + r'[jJ]')
Imagnumber = group(r'[0-9](?:_?[0-9])*[jJ]', Floatnumber + r'[jJ]')
else:
Hexnumber = r'0[xX][0-9a-fA-F]+'
Binnumber = r'0[bB][01]+'
if version_info >= (3, 0):
Octnumber = r'0[oO][0-7]+'
else:
Octnumber = '0[oO]?[0-7]+'
Decnumber = r'(?:0+|[1-9][0-9]*)'
Intnumber = group(Hexnumber, Binnumber, Octnumber, Decnumber)
Exponent = r'[eE][-+]?[0-9]+'
Pointfloat = group(r'[0-9]+\.[0-9]*', r'\.[0-9]+') + maybe(Exponent)
Expfloat = r'[0-9]+' + Exponent
Floatnumber = group(Pointfloat, Expfloat)
Imagnumber = group(r'[0-9]+[jJ]', Floatnumber + r'[jJ]')
Number = group(Imagnumber, Floatnumber, Intnumber) Number = group(Imagnumber, Floatnumber, Intnumber)
# Note that since _all_string_prefixes includes the empty string, # Note that since _all_string_prefixes includes the empty string,
# StringPrefix can be the empty string (making it optional). # StringPrefix can be the empty string (making it optional).
possible_prefixes = _all_string_prefixes(version_info) possible_prefixes = _all_string_prefixes()
StringPrefix = group(*possible_prefixes) StringPrefix = group(*possible_prefixes)
StringPrefixWithF = group(*_all_string_prefixes(version_info, include_fstring=True)) StringPrefixWithF = group(*_all_string_prefixes(include_fstring=True))
fstring_prefixes = _all_string_prefixes(version_info, include_fstring=True, only_fstring=True) fstring_prefixes = _all_string_prefixes(include_fstring=True, only_fstring=True)
FStringStart = group(*fstring_prefixes) FStringStart = group(*fstring_prefixes)
# Tail end of ' string. # Tail end of ' string.
@@ -186,18 +172,20 @@ def _create_token_collection(version_info):
Bracket = '[][(){}]' Bracket = '[][(){}]'
special_args = [r'\r\n?', r'\n', r'[:;.,@]'] special_args = [r'\.\.\.', r'\r\n?', r'\n', r'[;.,@]']
if version_info >= (3, 0): if version_info >= (3, 8):
special_args.insert(0, r'\.\.\.') special_args.insert(0, ":=?")
else:
special_args.insert(0, ":")
Special = group(*special_args) Special = group(*special_args)
Funny = group(Operator, Bracket, Special) Funny = group(Operator, Bracket, Special)
# First (or only) line of ' or " string. # First (or only) line of ' or " string.
ContStr = group(StringPrefix + r"'[^\r\n'\\]*(?:\\.[^\r\n'\\]*)*" + ContStr = group(StringPrefix + r"'[^\r\n'\\]*(?:\\.[^\r\n'\\]*)*"
group("'", r'\\(?:\r\n?|\n)'), + group("'", r'\\(?:\r\n?|\n)'),
StringPrefix + r'"[^\r\n"\\]*(?:\\.[^\r\n"\\]*)*' + StringPrefix + r'"[^\r\n"\\]*(?:\\.[^\r\n"\\]*)*'
group('"', r'\\(?:\r\n?|\n)')) + group('"', r'\\(?:\r\n?|\n)'))
pseudo_extra_pool = [Comment, Triple] pseudo_extra_pool = [Comment, Triple]
all_quotes = '"', "'", '"""', "'''" all_quotes = '"', "'", '"""', "'''"
if fstring_prefixes: if fstring_prefixes:
@@ -234,17 +222,23 @@ def _create_token_collection(version_info):
fstring_pattern_map[t + quote] = quote fstring_pattern_map[t + quote] = quote
ALWAYS_BREAK_TOKENS = (';', 'import', 'class', 'def', 'try', 'except', ALWAYS_BREAK_TOKENS = (';', 'import', 'class', 'def', 'try', 'except',
'finally', 'while', 'with', 'return') 'finally', 'while', 'with', 'return', 'continue',
'break', 'del', 'pass', 'global', 'assert', 'nonlocal')
pseudo_token_compiled = _compile(PseudoToken) pseudo_token_compiled = _compile(PseudoToken)
return TokenCollection( return TokenCollection(
pseudo_token_compiled, single_quoted, triple_quoted, endpats, pseudo_token_compiled, single_quoted, triple_quoted, endpats,
whitespace, fstring_pattern_map, ALWAYS_BREAK_TOKENS whitespace, fstring_pattern_map, set(ALWAYS_BREAK_TOKENS)
) )
class Token(namedtuple('Token', ['type', 'string', 'start_pos', 'prefix'])): class Token(NamedTuple):
type: PythonTokenTypes
string: str
start_pos: Tuple[int, int]
prefix: str
@property @property
def end_pos(self): def end_pos(self) -> Tuple[int, int]:
lines = split_lines(self.string) lines = split_lines(self.string)
if len(lines) > 1: if len(lines) > 1:
return self.start_pos[0] + len(lines) - 1, 0 return self.start_pos[0] + len(lines) - 1, 0
@@ -258,7 +252,7 @@ class PythonToken(Token):
self._replace(type=self.type.name)) self._replace(type=self.type.name))
class FStringNode(object): class FStringNode:
def __init__(self, quote): def __init__(self, quote):
self.quote = quote self.quote = quote
self.parentheses_count = 0 self.parentheses_count = 0
@@ -281,32 +275,45 @@ class FStringNode(object):
return len(self.quote) == 3 return len(self.quote) == 3
def is_in_expr(self): def is_in_expr(self):
return (self.parentheses_count - self.format_spec_count) > 0 return self.parentheses_count > self.format_spec_count
def is_in_format_spec(self):
return not self.is_in_expr() and self.format_spec_count
def _close_fstring_if_necessary(fstring_stack, string, start_pos, additional_prefix): def _close_fstring_if_necessary(fstring_stack, string, line_nr, column, additional_prefix):
for fstring_stack_index, node in enumerate(fstring_stack): for fstring_stack_index, node in enumerate(fstring_stack):
if string.startswith(node.quote): lstripped_string = string.lstrip()
len_lstrip = len(string) - len(lstripped_string)
if lstripped_string.startswith(node.quote):
token = PythonToken( token = PythonToken(
FSTRING_END, FSTRING_END,
node.quote, node.quote,
start_pos, (line_nr, column + len_lstrip),
prefix=additional_prefix, prefix=additional_prefix+string[:len_lstrip],
) )
additional_prefix = '' additional_prefix = ''
assert not node.previous_lines assert not node.previous_lines
del fstring_stack[fstring_stack_index:] del fstring_stack[fstring_stack_index:]
return token, '', len(node.quote) return token, '', len(node.quote) + len_lstrip
return None, additional_prefix, 0 return None, additional_prefix, 0
def _find_fstring_string(endpats, fstring_stack, line, lnum, pos): def _find_fstring_string(endpats, fstring_stack, line, lnum, pos):
tos = fstring_stack[-1] tos = fstring_stack[-1]
allow_multiline = tos.allow_multiline() allow_multiline = tos.allow_multiline()
if allow_multiline: if tos.is_in_format_spec():
match = fstring_string_multi_line.match(line, pos) if allow_multiline:
regex = fstring_format_spec_multi_line
else:
regex = fstring_format_spec_single_line
else: else:
match = fstring_string_single_line.match(line, pos) if allow_multiline:
regex = fstring_string_multi_line
else:
regex = fstring_string_single_line
match = regex.match(line, pos)
if match is None: if match is None:
return tos.previous_lines, pos return tos.previous_lines, pos
@@ -321,7 +328,9 @@ def _find_fstring_string(endpats, fstring_stack, line, lnum, pos):
new_pos = pos new_pos = pos
new_pos += len(string) new_pos += len(string)
if allow_multiline and (string.endswith('\n') or string.endswith('\r')): # even if allow_multiline is False, we still need to check for trailing
# newlines, because a single-line f-string can contain line continuations
if string.endswith('\n') or string.endswith('\r'):
tos.previous_lines += string tos.previous_lines += string
string = '' string = ''
else: else:
@@ -330,10 +339,12 @@ def _find_fstring_string(endpats, fstring_stack, line, lnum, pos):
return string, new_pos return string, new_pos
def tokenize(code, version_info, start_pos=(1, 0)): def tokenize(
code: str, *, version_info: PythonVersionInfo, start_pos: Tuple[int, int] = (1, 0)
) -> Iterator[PythonToken]:
"""Generate tokens from a the source code (string).""" """Generate tokens from a the source code (string)."""
lines = split_lines(code, keepends=True) lines = split_lines(code, keepends=True)
return tokenize_lines(lines, version_info, start_pos=start_pos) return tokenize_lines(lines, version_info=version_info, start_pos=start_pos)
def _print_tokens(func): def _print_tokens(func):
@@ -342,13 +353,21 @@ def _print_tokens(func):
""" """
def wrapper(*args, **kwargs): def wrapper(*args, **kwargs):
for token in func(*args, **kwargs): for token in func(*args, **kwargs):
print(token) # This print is intentional for debugging!
yield token yield token
return wrapper return wrapper
# @_print_tokens # @_print_tokens
def tokenize_lines(lines, version_info, start_pos=(1, 0)): def tokenize_lines(
lines: Iterable[str],
*,
version_info: PythonVersionInfo,
indents: List[int] = None,
start_pos: Tuple[int, int] = (1, 0),
is_first_token=True,
) -> Iterator[PythonToken]:
""" """
A heavily modified Python standard library tokenizer. A heavily modified Python standard library tokenizer.
@@ -359,20 +378,24 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
def dedent_if_necessary(start): def dedent_if_necessary(start):
while start < indents[-1]: while start < indents[-1]:
if start > indents[-2]: if start > indents[-2]:
yield PythonToken(ERROR_DEDENT, '', (lnum, 0), '') yield PythonToken(ERROR_DEDENT, '', (lnum, start), '')
indents[-1] = start
break break
yield PythonToken(DEDENT, '', spos, '')
indents.pop() indents.pop()
yield PythonToken(DEDENT, '', spos, '')
pseudo_token, single_quoted, triple_quoted, endpats, whitespace, \ pseudo_token, single_quoted, triple_quoted, endpats, whitespace, \
fstring_pattern_map, always_break_tokens, = \ fstring_pattern_map, always_break_tokens, = \
_get_token_collection(version_info) _get_token_collection(version_info)
paren_level = 0 # count parentheses paren_level = 0 # count parentheses
indents = [0] if indents is None:
max = 0 indents = [0]
max_ = 0
numchars = '0123456789' numchars = '0123456789'
contstr = '' contstr = ''
contline = None contline: str
contstr_start: Tuple[int, int]
endprog: Pattern
# We start with a newline. This makes indent at the first position # We start with a newline. This makes indent at the first position
# possible. It's not valid Python, but still better than an INDENT in the # possible. It's not valid Python, but still better than an INDENT in the
# second line (and not in the first). This makes quite a few things in # second line (and not in the first). This makes quite a few things in
@@ -380,47 +403,44 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
new_line = True new_line = True
prefix = '' # Should never be required, but here for safety prefix = '' # Should never be required, but here for safety
additional_prefix = '' additional_prefix = ''
first = True
lnum = start_pos[0] - 1 lnum = start_pos[0] - 1
fstring_stack = [] fstring_stack: List[FStringNode] = []
for line in lines: # loop over lines in stream for line in lines: # loop over lines in stream
lnum += 1 lnum += 1
pos = 0 pos = 0
max = len(line) max_ = len(line)
if first: if is_first_token:
if line.startswith(BOM_UTF8_STRING): if line.startswith(BOM_UTF8_STRING):
additional_prefix = BOM_UTF8_STRING additional_prefix = BOM_UTF8_STRING
line = line[1:] line = line[1:]
max = len(line) max_ = len(line)
# Fake that the part before was already parsed. # Fake that the part before was already parsed.
line = '^' * start_pos[1] + line line = '^' * start_pos[1] + line
pos = start_pos[1] pos = start_pos[1]
max += start_pos[1] max_ += start_pos[1]
first = False is_first_token = False
if contstr: # continued string if contstr: # continued string
endmatch = endprog.match(line) endmatch = endprog.match(line) # noqa: F821
if endmatch: if endmatch:
pos = endmatch.end(0) pos = endmatch.end(0)
yield PythonToken( yield PythonToken(
STRING, contstr + line[:pos], STRING, contstr + line[:pos],
contstr_start, prefix) contstr_start, prefix) # noqa: F821
contstr = '' contstr = ''
contline = None contline = ''
else: else:
contstr = contstr + line contstr = contstr + line
contline = contline + line contline = contline + line
continue continue
while pos < max: while pos < max_:
if fstring_stack: if fstring_stack:
tos = fstring_stack[-1] tos = fstring_stack[-1]
if not tos.is_in_expr(): if not tos.is_in_expr():
string, pos = _find_fstring_string(endpats, fstring_stack, line, lnum, pos) string, pos = _find_fstring_string(endpats, fstring_stack, line, lnum, pos)
if pos == max:
break
if string: if string:
yield PythonToken( yield PythonToken(
FSTRING_STRING, string, FSTRING_STRING, string,
@@ -431,12 +451,15 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
) )
tos.previous_lines = '' tos.previous_lines = ''
continue continue
if pos == max_:
break
rest = line[pos:] rest = line[pos:]
fstring_end_token, additional_prefix, quote_length = _close_fstring_if_necessary( fstring_end_token, additional_prefix, quote_length = _close_fstring_if_necessary(
fstring_stack, fstring_stack,
rest, rest,
(lnum, pos), lnum,
pos,
additional_prefix, additional_prefix,
) )
pos += quote_length pos += quote_length
@@ -444,12 +467,52 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
yield fstring_end_token yield fstring_end_token
continue continue
pseudomatch = pseudo_token.match(line, pos) # in an f-string, match until the end of the string
if not pseudomatch: # scan for tokens if fstring_stack:
string_line = line
for fstring_stack_node in fstring_stack:
quote = fstring_stack_node.quote
end_match = endpats[quote].match(line, pos)
if end_match is not None:
end_match_string = end_match.group(0)
if len(end_match_string) - len(quote) + pos < len(string_line):
string_line = line[:pos] + end_match_string[:-len(quote)]
pseudomatch = pseudo_token.match(string_line, pos)
else:
pseudomatch = pseudo_token.match(line, pos)
if pseudomatch:
prefix = additional_prefix + pseudomatch.group(1)
additional_prefix = ''
start, pos = pseudomatch.span(2)
spos = (lnum, start)
token = pseudomatch.group(2)
if token == '':
assert prefix
additional_prefix = prefix
# This means that we have a line with whitespace/comments at
# the end, which just results in an endmarker.
break
initial = token[0]
else:
match = whitespace.match(line, pos) match = whitespace.match(line, pos)
if pos == 0: initial = line[match.end()]
for t in dedent_if_necessary(match.end()): start = match.end()
yield t spos = (lnum, start)
if new_line and initial not in '\r\n#' and (initial != '\\' or pseudomatch is None):
new_line = False
if paren_level == 0 and not fstring_stack:
indent_start = start
if indent_start > indents[-1]:
yield PythonToken(INDENT, '', spos, '')
indents.append(indent_start)
yield from dedent_if_necessary(indent_start)
if not pseudomatch: # scan for tokens
match = whitespace.match(line, pos)
if new_line and paren_level == 0 and not fstring_stack:
yield from dedent_if_necessary(match.end())
pos = match.end() pos = match.end()
new_line = False new_line = False
yield PythonToken( yield PythonToken(
@@ -460,42 +523,24 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
pos += 1 pos += 1
continue continue
prefix = additional_prefix + pseudomatch.group(1) if (initial in numchars # ordinary number
additional_prefix = '' or (initial == '.' and token != '.' and token != '...')):
start, pos = pseudomatch.span(2)
spos = (lnum, start)
token = pseudomatch.group(2)
if token == '':
assert prefix
additional_prefix = prefix
# This means that we have a line with whitespace/comments at
# the end, which just results in an endmarker.
break
initial = token[0]
if new_line and initial not in '\r\n\\#':
new_line = False
if paren_level == 0 and not fstring_stack:
i = 0
indent_start = start
while line[i] == '\f':
i += 1
# TODO don't we need to change spos as well?
indent_start -= 1
if indent_start > indents[-1]:
yield PythonToken(INDENT, '', spos, '')
indents.append(indent_start)
for t in dedent_if_necessary(indent_start):
yield t
if (initial in numchars or # ordinary number
(initial == '.' and token != '.' and token != '...')):
yield PythonToken(NUMBER, token, spos, prefix) yield PythonToken(NUMBER, token, spos, prefix)
elif pseudomatch.group(3) is not None: # ordinary name
if token in always_break_tokens and (fstring_stack or paren_level):
fstring_stack[:] = []
paren_level = 0
# We only want to dedent if the token is on a new line.
m = re.match(r'[ \f\t]*$', line[:start])
if m is not None:
yield from dedent_if_necessary(m.end())
if token.isidentifier():
yield PythonToken(NAME, token, spos, prefix)
else:
yield from _split_illegal_unicode_name(token, spos, prefix)
elif initial in '\r\n': elif initial in '\r\n':
if any(not f.allow_multiline() for f in fstring_stack): if any(not f.allow_multiline() for f in fstring_stack):
# Would use fstring_stack.clear, but that's not available fstring_stack.clear()
# in Python 2.
fstring_stack[:] = []
if not new_line and paren_level == 0 and not fstring_stack: if not new_line and paren_level == 0 and not fstring_stack:
yield PythonToken(NEWLINE, token, spos, prefix) yield PythonToken(NEWLINE, token, spos, prefix)
@@ -504,7 +549,12 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
new_line = True new_line = True
elif initial == '#': # Comments elif initial == '#': # Comments
assert not token.endswith("\n") assert not token.endswith("\n")
additional_prefix = prefix + token if fstring_stack and fstring_stack[-1].is_in_expr():
# `#` is not allowed in f-string expressions
yield PythonToken(ERRORTOKEN, initial, spos, prefix)
pos = start + 1
else:
additional_prefix = prefix + token
elif token in triple_quoted: elif token in triple_quoted:
endprog = endpats[token] endprog = endpats[token]
endmatch = endprog.match(line, pos) endmatch = endprog.match(line, pos)
@@ -513,7 +563,7 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
token = line[start:pos] token = line[start:pos]
yield PythonToken(STRING, token, spos, prefix) yield PythonToken(STRING, token, spos, prefix)
else: else:
contstr_start = (lnum, start) # multiple lines contstr_start = spos # multiple lines
contstr = line[start:] contstr = line[start:]
contline = line contline = line
break break
@@ -545,20 +595,6 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
elif token in fstring_pattern_map: # The start of an fstring. elif token in fstring_pattern_map: # The start of an fstring.
fstring_stack.append(FStringNode(fstring_pattern_map[token])) fstring_stack.append(FStringNode(fstring_pattern_map[token]))
yield PythonToken(FSTRING_START, token, spos, prefix) yield PythonToken(FSTRING_START, token, spos, prefix)
elif is_identifier(initial): # ordinary name
if token in always_break_tokens:
fstring_stack[:] = []
paren_level = 0
# We only want to dedent if the token is on a new line.
if re.match(r'[ \f\t]*$', line[:start]):
while True:
indent = indents.pop()
if indent > start:
yield PythonToken(DEDENT, '', spos, '')
else:
indents.append(indent)
break
yield PythonToken(NAME, token, spos, prefix)
elif initial == '\\' and line[start:] in ('\\\n', '\\\r\n', '\\\r'): # continued stmt elif initial == '\\' and line[start:] in ('\\\n', '\\\r\n', '\\\r'): # continued stmt
additional_prefix += prefix + line[start:] additional_prefix += prefix + line[start:]
break break
@@ -574,9 +610,13 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
else: else:
if paren_level: if paren_level:
paren_level -= 1 paren_level -= 1
elif token == ':' and fstring_stack \ elif token.startswith(':') and fstring_stack \
and fstring_stack[-1].parentheses_count == 1: and fstring_stack[-1].parentheses_count \
- fstring_stack[-1].format_spec_count == 1:
# `:` and `:=` both count
fstring_stack[-1].format_spec_count += 1 fstring_stack[-1].format_spec_count += 1
token = ':'
pos = start + 1
yield PythonToken(OP, token, spos, prefix) yield PythonToken(OP, token, spos, prefix)
@@ -585,26 +625,63 @@ def tokenize_lines(lines, version_info, start_pos=(1, 0)):
if contstr.endswith('\n') or contstr.endswith('\r'): if contstr.endswith('\n') or contstr.endswith('\r'):
new_line = True new_line = True
end_pos = lnum, max if fstring_stack:
tos = fstring_stack[-1]
if tos.previous_lines:
yield PythonToken(
FSTRING_STRING, tos.previous_lines,
tos.last_string_start_pos,
# Never has a prefix because it can start anywhere and
# include whitespace.
prefix=''
)
end_pos = lnum, max_
# As the last position we just take the maximally possible position. We # As the last position we just take the maximally possible position. We
# remove -1 for the last new line. # remove -1 for the last new line.
for indent in indents[1:]: for indent in indents[1:]:
indents.pop()
yield PythonToken(DEDENT, '', end_pos, '') yield PythonToken(DEDENT, '', end_pos, '')
yield PythonToken(ENDMARKER, '', end_pos, additional_prefix) yield PythonToken(ENDMARKER, '', end_pos, additional_prefix)
+def _split_illegal_unicode_name(token, start_pos, prefix):
+    def create_token():
+        return PythonToken(ERRORTOKEN if is_illegal else NAME, found, pos, prefix)
+
+    found = ''
+    is_illegal = False
+    pos = start_pos
+    for i, char in enumerate(token):
+        if is_illegal:
+            if char.isidentifier():
+                yield create_token()
+                found = char
+                is_illegal = False
+                prefix = ''
+                pos = start_pos[0], start_pos[1] + i
+            else:
+                found += char
+        else:
+            new_found = found + char
+            if new_found.isidentifier():
+                found = new_found
+            else:
+                if found:
+                    yield create_token()
+                    prefix = ''
+                    pos = start_pos[0], start_pos[1] + i
+                found = char
+                is_illegal = True
+
+    if found:
+        yield create_token()
 if __name__ == "__main__":
-    if len(sys.argv) >= 2:
-        path = sys.argv[1]
-        with open(path) as f:
-            code = f.read()
-    else:
-        code = sys.stdin.read()
-
-    from parso.utils import python_bytes_to_unicode, parse_version_string
-
-    if isinstance(code, bytes):
-        code = python_bytes_to_unicode(code)
-
-    for token in tokenize(code, parse_version_string()):
+    path = sys.argv[1]
+    with open(path) as f:
+        code = f.read()
+
+    for token in tokenize(code, version_info=parse_version_string('3.10')):
         print(token)
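Review note: the `_split_illegal_unicode_name` helper added in this file splits a name-shaped token into alternating runs of valid identifier text and illegal characters. The block below is a standalone sketch of that splitting logic, not parso's own code. `Piece` and `split_illegal_unicode_name` are hypothetical stand-ins for `PythonToken` and the private helper, and the token kinds are plain strings rather than parso's token types.

```python
from typing import Iterator, NamedTuple, Tuple


class Piece(NamedTuple):
    """Hypothetical stand-in for PythonToken (kind, text, position, prefix)."""
    kind: str
    text: str
    pos: Tuple[int, int]
    prefix: str


def split_illegal_unicode_name(token: str, start_pos: Tuple[int, int],
                               prefix: str = "") -> Iterator[Piece]:
    found = ""
    is_illegal = False
    pos = start_pos
    for i, char in enumerate(token):
        if is_illegal:
            if char.isidentifier():
                # The illegal run ends here; emit it and start a new name.
                yield Piece("ERRORTOKEN", found, pos, prefix)
                found, is_illegal, prefix = char, False, ""
                pos = (start_pos[0], start_pos[1] + i)
            else:
                found += char
        elif (found + char).isidentifier():
            # Still a valid identifier; keep growing the current name.
            found += char
        else:
            if found:
                yield Piece("NAME", found, pos, prefix)
                prefix = ""
                pos = (start_pos[0], start_pos[1] + i)
            found = char
            is_illegal = True
    if found:
        yield Piece("ERRORTOKEN" if is_illegal else "NAME", found, pos, prefix)


pieces = list(split_illegal_unicode_name("a?b", (1, 4), prefix=" "))
for piece in pieces:
    print(piece)
```

Only the first piece keeps the original prefix. Later pieces get an empty prefix and a column offset by their index in the token, so the pieces together still cover the same source span.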
+126 -91
@@ -1,5 +1,5 @@
""" """
This is the syntax tree for Python syntaxes (2 & 3). The classes represent This is the syntax tree for Python 3 syntaxes. The classes represent
syntax elements like functions and imports. syntax elements like functions and imports.
All of the nodes can be traced back to the `Python grammar file All of the nodes can be traced back to the `Python grammar file
@@ -43,8 +43,11 @@ Parser Tree Classes
""" """
import re import re
try:
from collections.abc import Mapping
except ImportError:
from collections import Mapping
from parso._compatibility import utf8_repr, unicode
from parso.tree import Node, BaseNode, Leaf, ErrorNode, ErrorLeaf, \ from parso.tree import Node, BaseNode, Leaf, ErrorNode, ErrorLeaf, \
search_ancestor search_ancestor
from parso.python.prefix import split_prefix from parso.python.prefix import split_prefix
@@ -53,15 +56,19 @@ from parso.utils import split_lines
_FLOW_CONTAINERS = set(['if_stmt', 'while_stmt', 'for_stmt', 'try_stmt', _FLOW_CONTAINERS = set(['if_stmt', 'while_stmt', 'for_stmt', 'try_stmt',
'with_stmt', 'async_stmt', 'suite']) 'with_stmt', 'async_stmt', 'suite'])
_RETURN_STMT_CONTAINERS = set(['suite', 'simple_stmt']) | _FLOW_CONTAINERS _RETURN_STMT_CONTAINERS = set(['suite', 'simple_stmt']) | _FLOW_CONTAINERS
_FUNC_CONTAINERS = set(['suite', 'simple_stmt', 'decorated']) | _FLOW_CONTAINERS
_FUNC_CONTAINERS = set(
['suite', 'simple_stmt', 'decorated', 'async_funcdef']
) | _FLOW_CONTAINERS
_GET_DEFINITION_TYPES = set([ _GET_DEFINITION_TYPES = set([
'expr_stmt', 'comp_for', 'with_stmt', 'for_stmt', 'import_name', 'expr_stmt', 'sync_comp_for', 'with_stmt', 'for_stmt', 'import_name',
'import_from', 'param' 'import_from', 'param', 'del_stmt', 'namedexpr_test',
]) ])
_IMPORTS = set(['import_name', 'import_from']) _IMPORTS = set(['import_name', 'import_from'])
class DocstringMixin(object): class DocstringMixin:
__slots__ = () __slots__ = ()
def get_doc_node(self): def get_doc_node(self):
@@ -89,9 +96,9 @@ class DocstringMixin(object):
return None return None
class PythonMixin(object): class PythonMixin:
""" """
Some Python specific utitilies. Some Python specific utilities.
""" """
__slots__ = () __slots__ = ()
@@ -167,7 +174,6 @@ class EndMarker(_LeafWithoutNewlines):
__slots__ = () __slots__ = ()
type = 'endmarker' type = 'endmarker'
@utf8_repr
def __repr__(self): def __repr__(self):
return "<%s: prefix=%s end_pos=%s>" % ( return "<%s: prefix=%s end_pos=%s>" % (
type(self).__name__, repr(self.prefix), self.end_pos type(self).__name__, repr(self.prefix), self.end_pos
@@ -179,7 +185,6 @@ class Newline(PythonLeaf):
__slots__ = () __slots__ = ()
type = 'newline' type = 'newline'
@utf8_repr
def __repr__(self): def __repr__(self):
return "<%s: %s>" % (type(self).__name__, repr(self.value)) return "<%s: %s>" % (type(self).__name__, repr(self.value))
@@ -196,25 +201,22 @@ class Name(_LeafWithoutNewlines):
return "<%s: %s@%s,%s>" % (type(self).__name__, self.value, return "<%s: %s@%s,%s>" % (type(self).__name__, self.value,
self.line, self.column) self.line, self.column)
def is_definition(self): def is_definition(self, include_setitem=False):
""" """
Returns True if the name is being defined. Returns True if the name is being defined.
""" """
return self.get_definition() is not None return self.get_definition(include_setitem=include_setitem) is not None
def get_definition(self, import_name_always=False): def get_definition(self, import_name_always=False, include_setitem=False):
""" """
Returns None if there's on definition for a name. Returns None if there's no definition for a name.
:param import_name_alway: Specifies if an import name is always a :param import_name_always: Specifies if an import name is always a
definition. Normally foo in `from foo import bar` is not a definition. Normally foo in `from foo import bar` is not a
definition. definition.
""" """
node = self.parent node = self.parent
type_ = node.type type_ = node.type
if type_ in ('power', 'atom_expr'):
# In `self.x = 3` self is not a definition, but x is.
return None
if type_ in ('funcdef', 'classdef'): if type_ in ('funcdef', 'classdef'):
if self == node.name: if self == node.name:
@@ -222,9 +224,6 @@ class Name(_LeafWithoutNewlines):
return None return None
if type_ == 'except_clause': if type_ == 'except_clause':
# TODO in Python 2 this doesn't work correctly. See grammar file.
# I think we'll just let it be. Python 2 will be gone in a few
# years.
if self.get_previous_sibling() == 'as': if self.get_previous_sibling() == 'as':
return node.parent # The try_stmt. return node.parent # The try_stmt.
return None return None
@@ -233,7 +232,7 @@ class Name(_LeafWithoutNewlines):
if node.type == 'suite': if node.type == 'suite':
return None return None
if node.type in _GET_DEFINITION_TYPES: if node.type in _GET_DEFINITION_TYPES:
if self in node.get_defined_names(): if self in node.get_defined_names(include_setitem):
return node return node
if import_name_always and node.type in _IMPORTS: if import_name_always and node.type in _IMPORTS:
return node return node
@@ -295,21 +294,17 @@ class FStringEnd(PythonLeaf):
__slots__ = () __slots__ = ()
class _StringComparisonMixin(object): class _StringComparisonMixin:
def __eq__(self, other): def __eq__(self, other):
""" """
Make comparisons with strings easy. Make comparisons with strings easy.
Improves the readability of the parser. Improves the readability of the parser.
""" """
if isinstance(other, (str, unicode)): if isinstance(other, str):
return self.value == other return self.value == other
return self is other return self is other
def __ne__(self, other):
"""Python 2 compatibility."""
return not self.__eq__(other)
def __hash__(self): def __hash__(self):
return hash(self.value) return hash(self.value)
@@ -333,7 +328,7 @@ class Scope(PythonBaseNode, DocstringMixin):
__slots__ = () __slots__ = ()
def __init__(self, children): def __init__(self, children):
super(Scope, self).__init__(children) super().__init__(children)
def iter_funcdefs(self): def iter_funcdefs(self):
""" """
@@ -359,8 +354,7 @@ class Scope(PythonBaseNode, DocstringMixin):
if element.type in names: if element.type in names:
yield element yield element
if element.type in _FUNC_CONTAINERS: if element.type in _FUNC_CONTAINERS:
for e in scan(element.children): yield from scan(element.children)
yield e
return scan(self.children) return scan(self.children)
@@ -390,7 +384,7 @@ class Module(Scope):
type = 'file_input' type = 'file_input'
def __init__(self, children): def __init__(self, children):
super(Module, self).__init__(children) super().__init__(children)
self._used_names = None self._used_names = None
def _iter_future_import_names(self): def _iter_future_import_names(self):
@@ -409,18 +403,6 @@ class Module(Scope):
if len(names) == 2 and names[0] == '__future__': if len(names) == 2 and names[0] == '__future__':
yield names[1] yield names[1]
def _has_explicit_absolute_import(self):
"""
Checks if imports in this module are explicitly absolute, i.e. there
is a ``__future__`` import.
Currently not public, might be in the future.
:return bool:
"""
for name in self._iter_future_import_names():
if name == 'absolute_import':
return True
return False
def get_used_names(self): def get_used_names(self):
""" """
Returns all the :class:`Name` leafs that exist in this module. This Returns all the :class:`Name` leafs that exist in this module. This
@@ -442,7 +424,7 @@ class Module(Scope):
recurse(child) recurse(child)
recurse(self) recurse(self)
self._used_names = dct self._used_names = UsedNamesMapping(dct)
return self._used_names return self._used_names
@@ -466,6 +448,9 @@ class ClassOrFunc(Scope):
:rtype: list of :class:`Decorator` :rtype: list of :class:`Decorator`
""" """
decorated = self.parent decorated = self.parent
if decorated.type == 'async_funcdef':
decorated = decorated.parent
if decorated.type == 'decorated': if decorated.type == 'decorated':
if decorated.children[0].type == 'decorators': if decorated.children[0].type == 'decorators':
return decorated.children[0].children return decorated.children[0].children
@@ -483,7 +468,7 @@ class Class(ClassOrFunc):
__slots__ = () __slots__ = ()
def __init__(self, children): def __init__(self, children):
super(Class, self).__init__(children) super().__init__(children)
def get_super_arglist(self): def get_super_arglist(self):
""" """
@@ -510,24 +495,13 @@ def _create_params(parent, argslist_list):
You could also say that this function replaces the argslist node with a You could also say that this function replaces the argslist node with a
list of Param objects. list of Param objects.
""" """
def check_python2_nested_param(node):
"""
Python 2 allows params to look like ``def x(a, (b, c))``, which is
basically a way of unpacking tuples in params. Python 3 has ditched
this behavior. Jedi currently just ignores those constructs.
"""
return node.type == 'fpdef' and node.children[0] == '('
try: try:
first = argslist_list[0] first = argslist_list[0]
except IndexError: except IndexError:
return [] return []
if first.type in ('name', 'fpdef'): if first.type in ('name', 'fpdef'):
if check_python2_nested_param(first): return [Param([first], parent)]
return [first]
else:
return [Param([first], parent)]
elif first == '*': elif first == '*':
return [first] return [first]
else: # argslist is a `typedargslist` or a `varargslist`. else: # argslist is a `typedargslist` or a `varargslist`.
@@ -545,7 +519,7 @@ def _create_params(parent, argslist_list):
if param_children[0] == '*' \ if param_children[0] == '*' \
and (len(param_children) == 1 and (len(param_children) == 1
or param_children[1] == ',') \ or param_children[1] == ',') \
or check_python2_nested_param(param_children[0]): or param_children[0] == '/':
for p in param_children: for p in param_children:
p.parent = parent p.parent = parent
new_children += param_children new_children += param_children
@@ -572,7 +546,7 @@ class Function(ClassOrFunc):
type = 'funcdef' type = 'funcdef'
def __init__(self, children): def __init__(self, children):
super(Function, self).__init__(children) super().__init__(children)
parameters = self.children[2] # After `def foo` parameters = self.children[2] # After `def foo`
parameters.children[1:-1] = _create_params(parameters, parameters.children[1:-1]) parameters.children[1:-1] = _create_params(parameters, parameters.children[1:-1])
@@ -607,8 +581,7 @@ class Function(ClassOrFunc):
else: else:
yield element yield element
else: else:
for result in scan(nested_children): yield from scan(nested_children)
yield result
return scan(self.children) return scan(self.children)
@@ -622,8 +595,7 @@ class Function(ClassOrFunc):
or element.type == 'keyword' and element.value == 'return': or element.type == 'keyword' and element.value == 'return':
yield element yield element
if element.type in _RETURN_STMT_CONTAINERS: if element.type in _RETURN_STMT_CONTAINERS:
for e in scan(element.children): yield from scan(element.children)
yield e
return scan(self.children) return scan(self.children)
@@ -637,8 +609,7 @@ class Function(ClassOrFunc):
or element.type == 'keyword' and element.value == 'raise': or element.type == 'keyword' and element.value == 'raise':
yield element yield element
if element.type in _RETURN_STMT_CONTAINERS: if element.type in _RETURN_STMT_CONTAINERS:
for e in scan(element.children): yield from scan(element.children)
yield e
return scan(self.children) return scan(self.children)
@@ -767,8 +738,8 @@ class ForStmt(Flow):
""" """
return self.children[3] return self.children[3]
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
return _defined_names(self.children[1]) return _defined_names(self.children[1], include_setitem)
class TryStmt(Flow): class TryStmt(Flow):
@@ -791,7 +762,7 @@ class WithStmt(Flow):
type = 'with_stmt' type = 'with_stmt'
__slots__ = () __slots__ = ()
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
""" """
Returns the a list of `Name` that the with statement defines. The Returns the a list of `Name` that the with statement defines. The
defined names are set after `as`. defined names are set after `as`.
@@ -800,12 +771,12 @@ class WithStmt(Flow):
for with_item in self.children[1:-2:2]: for with_item in self.children[1:-2:2]:
# Check with items for 'as' names. # Check with items for 'as' names.
if with_item.type == 'with_item': if with_item.type == 'with_item':
names += _defined_names(with_item.children[2]) names += _defined_names(with_item.children[2], include_setitem)
return names return names
def get_test_node_from_name(self, name): def get_test_node_from_name(self, name):
node = name.parent node = search_ancestor(name, "with_item")
if node.type != 'with_item': if node is None:
raise ValueError('The name is not actually part of a with statement.') raise ValueError('The name is not actually part of a with statement.')
return node.children[0] return node.children[0]
@@ -841,7 +812,7 @@ class ImportFrom(Import):
type = 'import_from' type = 'import_from'
__slots__ = () __slots__ = ()
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
""" """
Returns the a list of `Name` that the import defines. The Returns the a list of `Name` that the import defines. The
defined names are set after `import` or in case an alias - `as` - is defined names are set after `import` or in case an alias - `as` - is
@@ -912,7 +883,7 @@ class ImportName(Import):
type = 'import_name' type = 'import_name'
__slots__ = () __slots__ = ()
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
""" """
Returns the a list of `Name` that the import defines. The defined names Returns the a list of `Name` that the import defines. The defined names
is always the first name after `import` or in case an alias - `as` - is is always the first name after `import` or in case an alias - `as` - is
@@ -969,7 +940,7 @@ class ImportName(Import):
class KeywordStatement(PythonBaseNode): class KeywordStatement(PythonBaseNode):
""" """
For the following statements: `assert`, `del`, `global`, `nonlocal`, For the following statements: `assert`, `del`, `global`, `nonlocal`,
`raise`, `return`, `yield`, `return`, `yield`. `raise`, `return`, `yield`.
`pass`, `continue` and `break` are not in there, because they are just `pass`, `continue` and `break` are not in there, because they are just
simple keywords and the parser reduces it to a keyword. simple keywords and the parser reduces it to a keyword.
@@ -988,6 +959,14 @@ class KeywordStatement(PythonBaseNode):
def keyword(self): def keyword(self):
return self.children[0].value return self.children[0].value
def get_defined_names(self, include_setitem=False):
keyword = self.keyword
if keyword == 'del':
return _defined_names(self.children[1], include_setitem)
if keyword in ('global', 'nonlocal'):
return self.children[1::2]
return []
class AssertStmt(KeywordStatement): class AssertStmt(KeywordStatement):
__slots__ = () __slots__ = ()
@@ -1013,7 +992,7 @@ class YieldExpr(PythonBaseNode):
__slots__ = () __slots__ = ()
def _defined_names(current): def _defined_names(current, include_setitem):
""" """
A helper function to find the defined names in statements, for loops and A helper function to find the defined names in statements, for loops and
list comprehensions. list comprehensions.
@@ -1021,14 +1000,22 @@ def _defined_names(current):
names = [] names = []
if current.type in ('testlist_star_expr', 'testlist_comp', 'exprlist', 'testlist'): if current.type in ('testlist_star_expr', 'testlist_comp', 'exprlist', 'testlist'):
for child in current.children[::2]: for child in current.children[::2]:
names += _defined_names(child) names += _defined_names(child, include_setitem)
elif current.type in ('atom', 'star_expr'): elif current.type in ('atom', 'star_expr'):
names += _defined_names(current.children[1]) names += _defined_names(current.children[1], include_setitem)
elif current.type in ('power', 'atom_expr'): elif current.type in ('power', 'atom_expr'):
if current.children[-2] != '**': # Just if there's no operation if current.children[-2] != '**': # Just if there's no operation
trailer = current.children[-1] trailer = current.children[-1]
if trailer.children[0] == '.': if trailer.children[0] == '.':
names.append(trailer.children[1]) names.append(trailer.children[1])
elif trailer.children[0] == '[' and include_setitem:
for node in current.children[-2::-1]:
if node.type == 'trailer':
names.append(node.children[1])
break
if node.type == 'name':
names.append(node)
break
else: else:
names.append(current) names.append(current)
return names return names
@@ -1038,23 +1025,29 @@ class ExprStmt(PythonBaseNode, DocstringMixin):
type = 'expr_stmt' type = 'expr_stmt'
__slots__ = () __slots__ = ()
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
""" """
Returns a list of `Name` defined before the `=` sign. Returns a list of `Name` defined before the `=` sign.
""" """
names = [] names = []
if self.children[1].type == 'annassign': if self.children[1].type == 'annassign':
names = _defined_names(self.children[0]) names = _defined_names(self.children[0], include_setitem)
return [ return [
name name
for i in range(0, len(self.children) - 2, 2) for i in range(0, len(self.children) - 2, 2)
if '=' in self.children[i + 1].value if '=' in self.children[i + 1].value
for name in _defined_names(self.children[i]) for name in _defined_names(self.children[i], include_setitem)
] + names ] + names
def get_rhs(self): def get_rhs(self):
"""Returns the right-hand-side of the equals.""" """Returns the right-hand-side of the equals."""
return self.children[-1] node = self.children[-1]
if node.type == 'annassign':
if len(node.children) == 4:
node = node.children[3]
else:
node = node.children[1]
return node
def yield_operators(self): def yield_operators(self):
""" """
@@ -1068,8 +1061,14 @@ class ExprStmt(PythonBaseNode, DocstringMixin):
first = first.children[2] first = first.children[2]
yield first yield first
for operator in self.children[3::2]: yield from self.children[3::2]
yield operator
class NamedExpr(PythonBaseNode):
type = 'namedexpr_test'
def get_defined_names(self, include_setitem=False):
return _defined_names(self.children[0], include_setitem)
class Param(PythonBaseNode): class Param(PythonBaseNode):
@@ -1081,7 +1080,7 @@ class Param(PythonBaseNode):
type = 'param' type = 'param'
def __init__(self, children, parent): def __init__(self, children, parent):
super(Param, self).__init__(children) super().__init__(children)
self.parent = parent self.parent = parent
for child in children: for child in children:
child.parent = self child.parent = self
@@ -1142,7 +1141,7 @@ class Param(PythonBaseNode):
else: else:
return self._tfpdef() return self._tfpdef()
def get_defined_names(self): def get_defined_names(self, include_setitem=False):
return [self.name] return [self.name]
@property @property
@@ -1158,6 +1157,13 @@ class Param(PythonBaseNode):
index -= 2 index -= 2
except ValueError: except ValueError:
pass pass
try:
keyword_only_index = self.parent.children.index('/')
if index > keyword_only_index:
# Skip the ` /, `
index -= 2
except ValueError:
pass
return index - 1 return index - 1
def get_parent_function(self): def get_parent_function(self):
@@ -1174,7 +1180,7 @@ class Param(PythonBaseNode):
:param include_comma bool: If enabled includes the comma in the string output. :param include_comma bool: If enabled includes the comma in the string output.
""" """
if include_comma: if include_comma:
return super(Param, self).get_code(include_prefix) return super().get_code(include_prefix)
children = self.children children = self.children
if children[-1] == ',': if children[-1] == ',':
@@ -1189,13 +1195,42 @@ class Param(PythonBaseNode):
return '<%s: %s>' % (type(self).__name__, str(self._tfpdef()) + default) return '<%s: %s>' % (type(self).__name__, str(self._tfpdef()) + default)
-class CompFor(PythonBaseNode):
-    type = 'comp_for'
+class SyncCompFor(PythonBaseNode):
+    type = 'sync_comp_for'
     __slots__ = ()
 
-    def get_defined_names(self):
+    def get_defined_names(self, include_setitem=False):
         """
         Returns the a list of `Name` that the comprehension defines.
         """
         # allow async for
-        return _defined_names(self.children[self.children.index('for') + 1])
+        return _defined_names(self.children[1], include_setitem)
+
+
+# This is simply here so an older Jedi version can work with this new parso
+# version. Can be deleted in the next release.
+CompFor = SyncCompFor
+
+
+class UsedNamesMapping(Mapping):
+    """
+    This class exists for the sole purpose of creating an immutable dict.
+    """
+    def __init__(self, dct):
+        self._dict = dct
+
+    def __getitem__(self, key):
+        return self._dict[key]
+
+    def __len__(self):
+        return len(self._dict)
+
+    def __iter__(self):
+        return iter(self._dict)
+
+    def __hash__(self):
+        return id(self)
+
+    def __eq__(self, other):
+        # Comparing these dicts does not make sense.
+        return self is other
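Review note: `UsedNamesMapping` relies on `collections.abc.Mapping` supplying the read-side methods (`get`, `in`, `keys`, `items`) while defining no `__setitem__`, so callers cannot assign through it. A minimal standalone sketch of the same pattern follows. `ReadOnlyMapping` is a hypothetical name and the dictionary contents are made up for illustration.

```python
from collections.abc import Mapping


class ReadOnlyMapping(Mapping):
    """Hypothetical stand-in for UsedNamesMapping: a read-only view of a dict."""

    def __init__(self, dct):
        self._dict = dct

    def __getitem__(self, key):
        return self._dict[key]

    def __len__(self):
        return len(self._dict)

    def __iter__(self):
        return iter(self._dict)

    def __hash__(self):
        # Identity-based hash, mirroring the class in the diff.
        return id(self)

    def __eq__(self, other):
        # Only identical objects compare equal.
        return self is other


used = ReadOnlyMapping({"foo": ["leaf 1", "leaf 2"]})

try:
    used["bar"] = []          # Mapping defines no __setitem__
    mutated = True
except TypeError:
    mutated = False

print(mutated, "foo" in used, used.get("missing"))
```

The wrapper is read-only rather than deeply immutable: code that still holds the wrapped dict, or the lists stored in it, can change them.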
+22 -15
@@ -1,6 +1,5 @@
from abc import abstractmethod, abstractproperty from abc import abstractmethod, abstractproperty
from parso._compatibility import utf8_repr, encoding, py_version
from parso.utils import split_lines from parso.utils import split_lines
@@ -19,12 +18,12 @@ def search_ancestor(node, *node_types):
return node return node
class NodeOrLeaf(object): class NodeOrLeaf:
""" """
The base class for nodes and leaves. The base class for nodes and leaves.
""" """
__slots__ = () __slots__ = ()
type = None type: str
''' '''
The type is a string that typically matches the types of the grammar file. The type is a string that typically matches the types of the grammar file.
''' '''
@@ -44,8 +43,12 @@ class NodeOrLeaf(object):
         Returns the node immediately following this node in this parent's
         children list. If this node does not have a next sibling, it is None
         """
+        parent = self.parent
+        if parent is None:
+            return None
+
         # Can't use index(); we need to test by identity
-        for i, child in enumerate(self.parent.children):
+        for i, child in enumerate(parent.children):
             if child is self:
                 try:
                     return self.parent.children[i + 1]
@@ -58,8 +61,12 @@ class NodeOrLeaf(object):
children list. If this node does not have a previous sibling, it is children list. If this node does not have a previous sibling, it is
None. None.
""" """
parent = self.parent
if parent is None:
return None
# Can't use index(); we need to test by identity # Can't use index(); we need to test by identity
for i, child in enumerate(self.parent.children): for i, child in enumerate(parent.children):
if child is self: if child is self:
if i == 0: if i == 0:
return None return None
@@ -70,6 +77,9 @@ class NodeOrLeaf(object):
Returns the previous leaf in the parser tree. Returns the previous leaf in the parser tree.
Returns `None` if this is the first element in the parser tree. Returns `None` if this is the first element in the parser tree.
""" """
if self.parent is None:
return None
node = self node = self
while True: while True:
c = node.parent.children c = node.parent.children
@@ -93,6 +103,9 @@ class NodeOrLeaf(object):
Returns the next leaf in the parser tree. Returns the next leaf in the parser tree.
Returns None if this is the last element in the parser tree. Returns None if this is the last element in the parser tree.
""" """
if self.parent is None:
return None
node = self node = self
while True: while True:
c = node.parent.children c = node.parent.children
@@ -153,7 +166,7 @@ class NodeOrLeaf(object):
@abstractmethod @abstractmethod
def get_code(self, include_prefix=True): def get_code(self, include_prefix=True):
""" """
Returns the code that was input the input for the parser for this node. Returns the code that was the input for the parser for this node.
:param include_prefix: Removes the prefix (whitespace and comments) of :param include_prefix: Removes the prefix (whitespace and comments) of
e.g. a statement. e.g. a statement.
@@ -223,7 +236,6 @@ class Leaf(NodeOrLeaf):
end_pos_column = len(lines[-1]) end_pos_column = len(lines[-1])
return end_pos_line, end_pos_column return end_pos_line, end_pos_column
@utf8_repr
def __repr__(self): def __repr__(self):
value = self.value value = self.value
if not value: if not value:
@@ -235,7 +247,7 @@ class TypedLeaf(Leaf):
__slots__ = ('type',) __slots__ = ('type',)
def __init__(self, type, value, start_pos, prefix=''): def __init__(self, type, value, start_pos, prefix=''):
super(TypedLeaf, self).__init__(value, start_pos, prefix) super().__init__(value, start_pos, prefix)
self.type = type self.type = type
@@ -245,7 +257,6 @@ class BaseNode(NodeOrLeaf):
A node has children, a type and possibly a parent node. A node has children, a type and possibly a parent node.
""" """
__slots__ = ('children', 'parent') __slots__ = ('children', 'parent')
type = None
def __init__(self, children): def __init__(self, children):
self.children = children self.children = children
@@ -300,7 +311,6 @@ class BaseNode(NodeOrLeaf):
except AttributeError: except AttributeError:
return element return element
index = int((lower + upper) / 2) index = int((lower + upper) / 2)
element = self.children[index] element = self.children[index]
if position <= element.end_pos: if position <= element.end_pos:
@@ -318,11 +328,8 @@ class BaseNode(NodeOrLeaf):
def get_last_leaf(self): def get_last_leaf(self):
return self.children[-1].get_last_leaf() return self.children[-1].get_last_leaf()
@utf8_repr
def __repr__(self): def __repr__(self):
code = self.get_code().replace('\n', ' ').replace('\r', ' ').strip() code = self.get_code().replace('\n', ' ').replace('\r', ' ').strip()
if not py_version >= 30:
code = code.encode(encoding, 'replace')
return "<%s: %s@%s,%s>" % \ return "<%s: %s@%s,%s>" % \
(type(self).__name__, code, self.start_pos[0], self.start_pos[1]) (type(self).__name__, code, self.start_pos[0], self.start_pos[1])
@@ -332,7 +339,7 @@ class Node(BaseNode):
__slots__ = ('type',) __slots__ = ('type',)
def __init__(self, type, children): def __init__(self, type, children):
super(Node, self).__init__(children) super().__init__(children)
self.type = type self.type = type
def __repr__(self): def __repr__(self):
@@ -358,7 +365,7 @@ class ErrorLeaf(Leaf):
type = 'error_leaf' type = 'error_leaf'
def __init__(self, token_type, value, start_pos, prefix=''): def __init__(self, token_type, value, start_pos, prefix=''):
super(ErrorLeaf, self).__init__(value, start_pos, prefix) super().__init__(value, start_pos, prefix)
self.token_type = token_type self.token_type = token_type
def __repr__(self): def __repr__(self):
+71 -52
@@ -1,29 +1,32 @@
from collections import namedtuple
import re import re
import sys import sys
from ast import literal_eval from ast import literal_eval
from functools import total_ordering
from parso._compatibility import unicode, total_ordering from typing import NamedTuple, Sequence, Union
# The following is a list in Python that are line breaks in str.splitlines, but # The following is a list in Python that are line breaks in str.splitlines, but
# not in Python. In Python only \r (Carriage Return, 0xD) and \n (Line Feed, # not in Python. In Python only \r (Carriage Return, 0xD) and \n (Line Feed,
# 0xA) are allowed to split lines. # 0xA) are allowed to split lines.
_NON_LINE_BREAKS = ( _NON_LINE_BREAKS = (
u'\v', # Vertical Tabulation 0xB '\v', # Vertical Tabulation 0xB
u'\f', # Form Feed 0xC '\f', # Form Feed 0xC
u'\x1C', # File Separator '\x1C', # File Separator
u'\x1D', # Group Separator '\x1D', # Group Separator
u'\x1E', # Record Separator '\x1E', # Record Separator
u'\x85', # Next Line (NEL - Equivalent to CR+LF. '\x85', # Next Line (NEL - Equivalent to CR+LF.
# Used to mark end-of-line on some IBM mainframes.) # Used to mark end-of-line on some IBM mainframes.)
u'\u2028', # Line Separator '\u2028', # Line Separator
u'\u2029', # Paragraph Separator '\u2029', # Paragraph Separator
) )
Version = namedtuple('Version', 'major, minor, micro')
class Version(NamedTuple):
major: int
minor: int
micro: int
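As a side note on the `namedtuple` to `NamedTuple` switch above, here is a minimal sketch of how the two spellings behave at runtime. The class names `OldVersion` and `NewVersion` are made up for the comparison and are not part of parso.

```python
from collections import namedtuple
from typing import NamedTuple

# Hypothetical names, used only to compare the two declaration styles.
OldVersion = namedtuple('OldVersion', 'major, minor, micro')


class NewVersion(NamedTuple):
    major: int
    minor: int
    micro: int


old = OldVersion(0, 8, 1)
new = NewVersion(0, 8, 1)

# Both are plain tuples with named fields; the class-based form only adds
# annotations that a type checker such as mypy can read.
print(tuple(old) == tuple(new))
print(new.major, new.minor, new.micro)
print(NewVersion.__annotations__)
```

The annotations are not enforced at runtime, so the change should matter mainly for static checking.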
def split_lines(string, keepends=False): def split_lines(string: str, keepends: bool = False) -> Sequence[str]:
r""" r"""
Intended for Python code. In contrast to Python's :py:meth:`str.splitlines`, Intended for Python code. In contrast to Python's :py:meth:`str.splitlines`,
looks at form feeds and other special characters as normal text. Just looks at form feeds and other special characters as normal text. Just
@@ -67,7 +70,9 @@ def split_lines(string, keepends=False):
return re.split(r'\n|\r\n|\r', string) return re.split(r'\n|\r\n|\r', string)
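To illustrate the difference described in the `_NON_LINE_BREAKS` comment, the sketch below compares `str.splitlines` with a split on `\n`, `\r\n` and `\r` only. `split_python_lines` is a hypothetical stand-in that mirrors just the `re.split` call shown above; it is not parso's `split_lines` and ignores `keepends`.

```python
import re


def split_python_lines(string):
    # Hypothetical helper: split only on \n, \r\n and \r, like the
    # `re.split` call in the hunk above.
    return re.split(r'\n|\r\n|\r', string)


code = 'a = 1\x0cb = 2\nc = 3\r\nd = 4'

# str.splitlines also breaks on the form feed (\x0c).
print(code.splitlines())
# The regex-based split keeps the form feed as ordinary text.
print(split_python_lines(code))
```

The first call yields four lines and the second three, because the form feed stays inside the first line.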
def python_bytes_to_unicode(source, encoding='utf-8', errors='strict'): def python_bytes_to_unicode(
source: Union[str, bytes], encoding: str = 'utf-8', errors: str = 'strict'
) -> str:
""" """
Checks for unicode BOMs and PEP 263 encoding declarations. Then returns a Checks for unicode BOMs and PEP 263 encoding declarations. Then returns a
unicode object like in :py:meth:`bytes.decode`. unicode object like in :py:meth:`bytes.decode`.
@@ -91,24 +96,33 @@ def python_bytes_to_unicode(source, encoding='utf-8', errors='strict'):
possible_encoding = re.search(br"coding[=:]\s*([-\w.]+)", possible_encoding = re.search(br"coding[=:]\s*([-\w.]+)",
first_two_lines) first_two_lines)
if possible_encoding: if possible_encoding:
return possible_encoding.group(1) e = possible_encoding.group(1)
if not isinstance(e, str):
e = str(e, 'ascii', 'replace')
return e
else: else:
# the default if nothing else has been set -> PEP 263 # the default if nothing else has been set -> PEP 263
return encoding return encoding
if isinstance(source, unicode): if isinstance(source, str):
# only cast str/bytes # only cast str/bytes
return source return source
encoding = detect_encoding() encoding = detect_encoding()
if not isinstance(encoding, unicode): try:
encoding = unicode(encoding, 'utf-8', 'replace') # Cast to unicode
return str(source, encoding, errors)
# Cast to unicode except LookupError:
return unicode(source, encoding, errors) if errors == 'replace':
# This is a weird case that can happen if the given encoding is not
# a valid encoding. This usually shouldn't happen with provided
# encodings, but can happen if somebody uses encoding declarations
# like `# coding: foo-8`.
return str(source, 'utf-8', errors)
raise
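The new `LookupError` branch can be tried in isolation. This is a minimal sketch under the assumption that only the decode step matters; `decode_source` is a hypothetical name and skips the BOM and PEP 263 detection that the real function performs.

```python
def decode_source(source: bytes, encoding: str, errors: str = 'strict') -> str:
    # Hypothetical stand-in for the try/except added in the hunk above.
    try:
        return str(source, encoding, errors)
    except LookupError:
        if errors == 'replace':
            # Unknown codec name, e.g. from `# coding: foo-8`:
            # fall back to UTF-8 instead of failing.
            return str(source, 'utf-8', errors)
        raise


# With errors='replace' the unknown encoding is tolerated.
print(repr(decode_source(b'x = 1\n', 'foo-8', 'replace')))

# With the default errors='strict' the LookupError still propagates.
try:
    decode_source(b'x = 1\n', 'foo-8')
except LookupError as exc:
    print('LookupError:', exc)
```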
def version_info(): def version_info() -> Version:
""" """
Returns a namedtuple of parso's version, similar to Python's Returns a namedtuple of parso's version, similar to Python's
``sys.version_info``. ``sys.version_info``.
@@ -118,11 +132,38 @@ def version_info():
return Version(*[x if i == 3 else int(x) for i, x in enumerate(tupl)]) return Version(*[x if i == 3 else int(x) for i, x in enumerate(tupl)])
def _parse_version(version): class _PythonVersionInfo(NamedTuple):
match = re.match(r'(\d+)(?:\.(\d)(?:\.\d+)?)?$', version) major: int
minor: int
@total_ordering
class PythonVersionInfo(_PythonVersionInfo):
def __gt__(self, other):
if isinstance(other, tuple):
if len(other) != 2:
raise ValueError("Can only compare to tuples of length 2.")
return (self.major, self.minor) > other
super().__gt__(other)
return (self.major, self.minor)
def __eq__(self, other):
if isinstance(other, tuple):
if len(other) != 2:
raise ValueError("Can only compare to tuples of length 2.")
return (self.major, self.minor) == other
super().__eq__(other)
def __ne__(self, other):
return not self.__eq__(other)
def _parse_version(version) -> PythonVersionInfo:
match = re.match(r'(\d+)(?:\.(\d{1,2})(?:\.\d+)?)?((a|b|rc)\d)?$', version)
if match is None: if match is None:
raise ValueError('The given version is not in the right format. ' raise ValueError('The given version is not in the right format. '
'Use something like "3.2" or "3".') 'Use something like "3.8" or "3".')
major = int(match.group(1)) major = int(match.group(1))
minor = match.group(2) minor = match.group(2)
@@ -139,37 +180,15 @@ def _parse_version(version):
return PythonVersionInfo(major, minor) return PythonVersionInfo(major, minor)
@total_ordering def parse_version_string(version: str = None) -> PythonVersionInfo:
class PythonVersionInfo(namedtuple('Version', 'major, minor')):
def __gt__(self, other):
if isinstance(other, tuple):
if len(other) != 2:
raise ValueError("Can only compare to tuples of length 2.")
return (self.major, self.minor) > other
super(PythonVersionInfo, self).__gt__(other)
return (self.major, self.minor)
def __eq__(self, other):
if isinstance(other, tuple):
if len(other) != 2:
raise ValueError("Can only compare to tuples of length 2.")
return (self.major, self.minor) == other
super(PythonVersionInfo, self).__eq__(other)
def __ne__(self, other):
return not self.__eq__(other)
def parse_version_string(version=None):
""" """
Checks for a valid version number (e.g. `3.2` or `2.7.1` or `3`) and Checks for a valid version number (e.g. `3.8` or `3.10.1` or `3`) and
returns a corresponding version info that is always two characters long in returns a corresponding version info that is always two characters long in
decimal. decimal.
""" """
if version is None: if version is None:
version = '%s.%s' % sys.version_info[:2] version = '%s.%s' % sys.version_info[:2]
if not isinstance(version, (unicode, str)): if not isinstance(version, str):
raise TypeError("version must be a string like 3.2.") raise TypeError('version must be a string like "3.8"')
return _parse_version(version) return _parse_version(version)
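The effect of the regex change in `_parse_version` can be checked with the two patterns copied from the diff. This sketch only exercises the patterns; `OLD` and `NEW` are names chosen here, and the rest of the function is not reproduced.

```python
import re

# Patterns copied from the old and new sides of the diff.
OLD = re.compile(r'(\d+)(?:\.(\d)(?:\.\d+)?)?$')
NEW = re.compile(r'(\d+)(?:\.(\d{1,2})(?:\.\d+)?)?((a|b|rc)\d)?$')

for version in ('3', '3.8', '3.10', '3.10.1', '3.9rc1'):
    old_match = OLD.match(version)
    new_match = NEW.match(version)
    print(
        version,
        'old:', old_match.group(1, 2) if old_match else None,
        'new:', new_match.group(1, 2) if new_match else None,
    )
```

The old pattern allows a single digit for the minor version, so it rejects `3.10`; the new one accepts two-digit minors and an optional `a`, `b` or `rc` suffix.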
+2
@@ -1,6 +1,8 @@
[pytest] [pytest]
addopts = --doctest-modules addopts = --doctest-modules
testpaths = parso test
# Ignore broken files inblackbox test directories # Ignore broken files inblackbox test directories
norecursedirs = .* docs scripts normalizer_issue_files build norecursedirs = .* docs scripts normalizer_issue_files build
+3 -3
@@ -18,7 +18,6 @@ from docopt import docopt
from jedi.parser.python import load_grammar from jedi.parser.python import load_grammar
from jedi.parser.diff import DiffParser from jedi.parser.diff import DiffParser
from jedi.parser.python import ParserWithRecovery from jedi.parser.python import ParserWithRecovery
from jedi._compatibility import u
from jedi.common import splitlines from jedi.common import splitlines
import jedi import jedi
@@ -37,14 +36,15 @@ def main(args):
with open(args['<file>']) as f: with open(args['<file>']) as f:
code = f.read() code = f.read()
grammar = load_grammar() grammar = load_grammar()
parser = ParserWithRecovery(grammar, u(code)) parser = ParserWithRecovery(grammar, code)
# Make sure used_names is loaded # Make sure used_names is loaded
parser.module.used_names parser.module.used_names
code = code + '\na\n' # Add something so the diff parser needs to run. code = code + '\na\n' # Add something so the diff parser needs to run.
lines = splitlines(code, keepends=True) lines = splitlines(code, keepends=True)
cProfile.runctx('run(parser, lines)', globals(), locals(), sort=args['-s']) cProfile.runctx('run(parser, lines)', globals(), locals(), sort=args['-s'])
if __name__ == '__main__': if __name__ == '__main__':
args = docopt(__doc__) args = docopt(__doc__)
main(args) main(args)
+23
@@ -1,2 +1,25 @@
[bdist_wheel] [bdist_wheel]
universal=1 universal=1
[flake8]
max-line-length = 100
ignore =
# do not use bare 'except'
E722,
# don't know why this was ever even an option, 1+1 should be possible.
E226,
# line break before binary operator
W503,
[mypy]
disallow_subclassing_any = True
# Avoid creating future gotchas emerging from bad typing
warn_redundant_casts = True
warn_unused_ignores = True
warn_return_any = True
warn_unused_configs = True
warn_unreachable = True
strict_equality = True
+44 -41
@@ -12,44 +12,47 @@ __AUTHOR_EMAIL__ = 'davidhalter88@gmail.com'
readme = open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read() readme = open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read()
setup(name='parso', setup(
version=parso.__version__, name='parso',
description='A Python Parser', version=parso.__version__,
author=__AUTHOR__, description='A Python Parser',
author_email=__AUTHOR_EMAIL__, author=__AUTHOR__,
include_package_data=True, author_email=__AUTHOR_EMAIL__,
maintainer=__AUTHOR__, include_package_data=True,
maintainer_email=__AUTHOR_EMAIL__, maintainer=__AUTHOR__,
url='https://github.com/davidhalter/parso', maintainer_email=__AUTHOR_EMAIL__,
license='MIT', url='https://github.com/davidhalter/parso',
keywords='python parser parsing', license='MIT',
long_description=readme, keywords='python parser parsing',
packages=find_packages(exclude=['test']), long_description=readme,
package_data={'parso': ['python/grammar*.txt']}, packages=find_packages(exclude=['test']),
platforms=['any'], package_data={'parso': ['python/grammar*.txt', 'py.typed', '*.pyi', '**/*.pyi']},
classifiers=[ platforms=['any'],
'Development Status :: 4 - Beta', python_requires='>=3.6',
'Environment :: Plugins', classifiers=[
'Intended Audience :: Developers', 'Development Status :: 4 - Beta',
'License :: OSI Approved :: MIT License', 'Environment :: Plugins',
'Operating System :: OS Independent', 'Intended Audience :: Developers',
'Programming Language :: Python :: 2', 'License :: OSI Approved :: MIT License',
'Programming Language :: Python :: 2.6', 'Operating System :: OS Independent',
'Programming Language :: Python :: 2.7', 'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.3', 'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.4', 'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.6', 'Topic :: Software Development :: Libraries :: Python Modules',
'Programming Language :: Python :: 3.7', 'Topic :: Text Editors :: Integrated Development Environments (IDE)',
'Topic :: Software Development :: Libraries :: Python Modules', 'Topic :: Utilities',
'Topic :: Text Editors :: Integrated Development Environments (IDE)', 'Typing :: Typed',
'Topic :: Utilities', ],
], extras_require={
extras_require={ 'testing': [
'testing': [ 'pytest<6.0.0',
'pytest>=3.0.7', 'docopt',
'docopt', ],
], 'qa': [
}, 'flake8==3.8.3',
) 'mypy==0.782',
],
},
)
+115 -61
@@ -19,14 +19,6 @@ def build_nested(code, depth, base='def f():\n'):
FAILING_EXAMPLES = [ FAILING_EXAMPLES = [
'1 +', '1 +',
'?', '?',
# Python/compile.c
dedent('''\
for a in [1]:
try:
pass
finally:
continue
'''), # 'continue' not supported inside 'finally' clause"
'continue', 'continue',
'break', 'break',
'return', 'return',
@@ -42,7 +34,7 @@ FAILING_EXAMPLES = [
'lambda x=3, y: x', 'lambda x=3, y: x',
'__debug__ = 1', '__debug__ = 1',
'with x() as __debug__: pass', 'with x() as __debug__: pass',
# Mostly 3.6 relevant
'[]: int', '[]: int',
'[a, b]: int', '[a, b]: int',
'(): int', '(): int',
@@ -60,9 +52,38 @@ FAILING_EXAMPLES = [
'f(x=2, y)', 'f(x=2, y)',
'f(**x, *y)', 'f(**x, *y)',
'f(**x, y=3, z)', 'f(**x, y=3, z)',
# augassign
'a, b += 3', 'a, b += 3',
'(a, b) += 3', '(a, b) += 3',
'[a, b] += 3', '[a, b] += 3',
'[a, 1] += 3',
'f() += 1',
'lambda x:None+=1',
'{} += 1',
'{a:b} += 1',
'{1} += 1',
'{*x} += 1',
'(x,) += 1',
'(x, y if a else q) += 1',
'[] += 1',
'[1,2] += 1',
'[] += 1',
'None += 1',
'... += 1',
'a > 1 += 1',
'"test" += 1',
'1 += 1',
'1.0 += 1',
'(yield) += 1',
'(yield from x) += 1',
'(x if x else y) += 1',
'a() += 1',
'a + b += 1',
'+a += 1',
'a and b += 1',
'*a += 1',
'a, b += 1',
'f"xxx" += 1',
# All assignment tests # All assignment tests
'lambda a: 1 = 1', 'lambda a: 1 = 1',
'[x for x in y] = 1', '[x for x in y] = 1',
@@ -110,6 +131,8 @@ FAILING_EXAMPLES = [
r"u'\N{foo}'", r"u'\N{foo}'",
r'b"\x"', r'b"\x"',
r'b"\"', r'b"\"',
'b"ä"',
'*a, *b = 3, 3', '*a, *b = 3, 3',
'async def foo(): yield from []', 'async def foo(): yield from []',
'yield from []', 'yield from []',
@@ -118,6 +141,16 @@ FAILING_EXAMPLES = [
'def x(*): pass', 'def x(*): pass',
'(%s *d) = x' % ('a,' * 256), '(%s *d) = x' % ('a,' * 256),
'{**{} for a in [1]}', '{**{} for a in [1]}',
'(True,) = x',
'([False], a) = x',
'def x(): from math import *',
# str/bytes combinations
'"s" b""',
'"s" b"" ""',
'b"" "" b"" ""',
'f"s" b""',
'b"s" f""',
# Parser/tokenize.c # Parser/tokenize.c
r'"""', r'"""',
@@ -154,11 +187,19 @@ FAILING_EXAMPLES = [
# Now nested parsing # Now nested parsing
"f'{continue}'", "f'{continue}'",
"f'{1;1}'", "f'{1;1}'",
"f'{a=3}'", "f'{a;}'",
"f'{b\"\" \"\"}'", "f'{b\"\" \"\"}'",
] # f-string expression part cannot include a backslash
r'''f"{'\n'}"''',
GLOBAL_NONLOCAL_ERROR = [ 'async def foo():\n yield x\n return 1',
'async def foo():\n yield x\n return 1',
'[*[] for a in [1]]',
'async def bla():\n def x(): await bla()',
'del None',
# Errors of global / nonlocal
dedent(''' dedent('''
def glob(): def glob():
x = 3 x = 3
@@ -257,57 +298,70 @@ GLOBAL_NONLOCAL_ERROR = [
'''), '''),
] ]
if sys.version_info >= (3, 6): if sys.version_info[:2] >= (3, 7):
FAILING_EXAMPLES += GLOBAL_NONLOCAL_ERROR # This is somehow ok in previous versions.
FAILING_EXAMPLES += [ FAILING_EXAMPLES += [
# Raises multiple errors in previous versions. 'class X(base for base in bases): pass',
'async def foo():\n def nofoo():[x async for x in []]',
]
if sys.version_info >= (3, 5):
FAILING_EXAMPLES += [
# Raises different errors so just ignore them for now.
'[*[] for a in [1]]',
# Raises multiple errors in previous versions.
'async def bla():\n def x(): await bla()',
]
if sys.version_info >= (3, 4):
# Before that del None works like del list, it gives a NameError.
FAILING_EXAMPLES.append('del None')
if sys.version_info >= (3,):
FAILING_EXAMPLES += [
# Unfortunately assigning to False and True do not raise an error in
# 2.x.
'(True,) = x',
'([False], a) = x',
# A symtable error that raises only a SyntaxWarning in Python 2.
'def x(): from math import *',
# unicode chars in bytes are allowed in python 2
'b"ä"',
# combining strings and unicode is allowed in Python 2.
'"s" b""',
]
if sys.version_info >= (2, 7):
# This is something that raises a different error in 2.6 than in the other
# versions. Just skip it for 2.6.
FAILING_EXAMPLES.append('[a, 1] += 3')
if sys.version_info[:2] == (3, 5):
# yields are not allowed in 3.5 async functions. Therefore test them
# separately, here.
FAILING_EXAMPLES += [
'async def foo():\n yield x',
'async def foo():\n yield x',
]
else:
FAILING_EXAMPLES += [
'async def foo():\n yield x\n return 1',
'async def foo():\n yield x\n return 1',
] ]
if sys.version_info[:2] < (3, 8):
if sys.version_info[:2] <= (3, 4):
# Python > 3.4 this is valid code.
FAILING_EXAMPLES += [ FAILING_EXAMPLES += [
'a = *[1], 2', # Python/compile.c
'(*[1], 2)', dedent('''\
for a in [1]:
try:
pass
finally:
continue
'''), # 'continue' not supported inside 'finally' clause"
]
if sys.version_info[:2] >= (3, 8):
# assignment expressions from issue#89
FAILING_EXAMPLES += [
# Case 2
'(lambda: x := 1)',
'((lambda: x) := 1)',
# Case 3
'(a[i] := x)',
'((a[i]) := x)',
'(a(i) := x)',
# Case 4
'(a.b := c)',
'[(i.i:= 0) for ((i), j) in range(5)]',
# Case 5
'[i:= 0 for i, j in range(5)]',
'[(i:= 0) for ((i), j) in range(5)]',
'[(i:= 0) for ((i), j), in range(5)]',
'[(i:= 0) for ((i), j.i), in range(5)]',
'[[(i:= i) for j in range(5)] for i in range(5)]',
'[i for i, j in range(5) if True or (i:= 1)]',
'[False and (i:= 0) for i, j in range(5)]',
# Case 6
'[i+1 for i in (i:= range(5))]',
'[i+1 for i in (j:= range(5))]',
'[i+1 for i in (lambda: (j:= range(5)))()]',
# Case 7
'class Example:\n [(j := i) for i in range(5)]',
# Not in that issue
'(await a := x)',
'((await a) := x)',
# new discoveries
'((a, b) := (1, 2))',
'([a, b] := [1, 2])',
'({a, b} := {1, 2})',
'({a: b} := {1: 2})',
'(a + b := 1)',
'(True := 1)',
'(False := 1)',
'(None := 1)',
'(__debug__ := 1)',
# Unparenthesized walrus not allowed in dict literals, dict comprehensions and slices
'{a:="a": b:=1}',
'{y:=1: 2 for x in range(5)}',
'a[b:=0:1:2]',
]
# f-string debugging syntax with invalid conversion character
FAILING_EXAMPLES += [
"f'{1=!b}'",
] ]
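A few of the assignment-expression entries above can be cross-checked against the interpreter itself. The sketch assumes it runs on Python 3.8 or newer and uses only the built-in `compile`; `is_syntax_error` is a hypothetical helper, not part of the test suite.

```python
def is_syntax_error(source: str) -> bool:
    # Hypothetical helper: compile without executing, report SyntaxError.
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return True
    return False


# Taken from the FAILING_EXAMPLES additions above.
rejected = ['(a.b := c)', '(a[i] := x)', '(True := 1)', 'a[b:=0:1:2]']
for source in rejected:
    print(source, is_syntax_error(source))

# A plain parenthesized walrus is accepted on Python 3.8+.
print('(x := 1)', is_syntax_error('(x := 1)'))
```

Because the code is only compiled, the undefined names in the examples do not matter.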
+30 -16
@@ -50,6 +50,11 @@ def find_python_files_in_tree(file_path):
yield file_path yield file_path
return return
for root, dirnames, filenames in os.walk(file_path): for root, dirnames, filenames in os.walk(file_path):
if 'chardet' in root:
# Stuff like chardet/langcyrillicmodel.py is just very slow to
# parse and machine generated, so ignore those.
continue
for name in filenames: for name in filenames:
if name.endswith('.py'): if name.endswith('.py'):
yield os.path.join(root, name) yield os.path.join(root, name)
@@ -102,9 +107,17 @@ class LineCopy:
class FileModification: class FileModification:
@classmethod @classmethod
def generate(cls, code_lines, change_count): def generate(cls, code_lines, change_count, previous_file_modification=None):
if previous_file_modification is not None and random.random() > 0.5:
# We want to keep the previous modifications in some cases to make
# more complex parser issues visible.
code_lines = previous_file_modification.apply(code_lines)
added_modifications = previous_file_modification.modification_list
else:
added_modifications = []
return cls( return cls(
list(cls._generate_line_modifications(code_lines, change_count)), added_modifications
+ list(cls._generate_line_modifications(code_lines, change_count)),
# work with changed trees more than with normal ones. # work with changed trees more than with normal ones.
check_original=random.random() > 0.8, check_original=random.random() > 0.8,
) )
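The chaining idea in the new `generate` (sometimes build on the previous modification list instead of starting fresh) can be sketched in a reduced form. Everything here is a simplification for illustration: `Modification`, the `rng` parameter and the single-edit-per-round behaviour are assumptions, not the fuzzer's actual classes.

```python
import random


class Modification:
    """Hypothetical stand-in for LineDeletion / LineCopy / LineReplacement."""

    def __init__(self, line_nr, new_line):
        self.line_nr = line_nr
        self.new_line = new_line

    def apply(self, lines):
        lines[self.line_nr] = self.new_line


def generate(code_lines, rng, previous=None):
    lines = list(code_lines)
    if previous is not None and rng.random() > 0.5:
        # Keep the earlier edits: replay them, then add a new one on top.
        for modification in previous:
            modification.apply(lines)
        kept = list(previous)
    else:
        kept = []
    new = Modification(rng.randrange(len(lines)), '# fuzz %d\n' % len(kept))
    new.apply(lines)
    return kept + [new]


rng = random.Random(0)
source_lines = ['a = 1\n', 'b = 2\n', 'c = 3\n']
history = None
for _ in range(5):
    history = generate(source_lines, rng, previous=history)
    print(len(history))
```

Roughly half of the rounds extend the previous list, so the number of stacked edits grows and resets over time.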
@@ -122,11 +135,11 @@ class FileModification:
# We cannot delete every line, that doesn't make sense to # We cannot delete every line, that doesn't make sense to
# fuzz and it would be annoying to rewrite everything here. # fuzz and it would be annoying to rewrite everything here.
continue continue
l = LineDeletion(random_line()) ld = LineDeletion(random_line())
elif rand == 2: elif rand == 2:
# Copy / Insertion # Copy / Insertion
# Make it possible to insert into the first and the last line # Make it possible to insert into the first and the last line
l = LineCopy(random_line(), random_line(include_end=True)) ld = LineCopy(random_line(), random_line(include_end=True))
elif rand in (3, 4): elif rand in (3, 4):
# Modify a line in some weird random ways. # Modify a line in some weird random ways.
line_nr = random_line() line_nr = random_line()
@@ -153,23 +166,23 @@ class FileModification:
# we really replace the line with something that has # we really replace the line with something that has
# indentation. # indentation.
line = ' ' * random.randint(0, 12) + random_string + '\n' line = ' ' * random.randint(0, 12) + random_string + '\n'
l = LineReplacement(line_nr, line) ld = LineReplacement(line_nr, line)
l.apply(lines) ld.apply(lines)
yield l yield ld
def __init__(self, modification_list, check_original): def __init__(self, modification_list, check_original):
self._modification_list = modification_list self.modification_list = modification_list
self._check_original = check_original self._check_original = check_original
def _apply(self, code_lines): def apply(self, code_lines):
changed_lines = list(code_lines) changed_lines = list(code_lines)
for modification in self._modification_list: for modification in self.modification_list:
modification.apply(changed_lines) modification.apply(changed_lines)
return changed_lines return changed_lines
def run(self, grammar, code_lines, print_code): def run(self, grammar, code_lines, print_code):
code = ''.join(code_lines) code = ''.join(code_lines)
modified_lines = self._apply(code_lines) modified_lines = self.apply(code_lines)
modified_code = ''.join(modified_lines) modified_code = ''.join(modified_lines)
if print_code: if print_code:
@@ -197,15 +210,12 @@ class FileModification:
class FileTests: class FileTests:
def __init__(self, file_path, test_count, change_count): def __init__(self, file_path, test_count, change_count):
self._path = file_path self._path = file_path
with open(file_path) as f: with open(file_path, errors='replace') as f:
code = f.read() code = f.read()
self._code_lines = split_lines(code, keepends=True) self._code_lines = split_lines(code, keepends=True)
self._test_count = test_count self._test_count = test_count
self._code_lines = self._code_lines self._code_lines = self._code_lines
self._change_count = change_count self._change_count = change_count
with open(file_path) as f:
code = f.read()
self._file_modifications = [] self._file_modifications = []
def _run(self, grammar, file_modifications, debugger, print_code=False): def _run(self, grammar, file_modifications, debugger, print_code=False):
@@ -231,8 +241,12 @@ class FileTests:
def run(self, grammar, debugger): def run(self, grammar, debugger):
def iterate(): def iterate():
fm = None
for _ in range(self._test_count): for _ in range(self._test_count):
fm = FileModification.generate(self._code_lines, self._change_count) fm = FileModification.generate(
self._code_lines, self._change_count,
previous_file_modification=fm
)
self._file_modifications.append(fm) self._file_modifications.append(fm)
yield fm yield fm
+72 -7
@@ -12,13 +12,6 @@ from .__future__ import absolute_import
''r''u'' ''r''u''
b'' BR'' b'' BR''
for x in [1]:
try:
continue # Only the other continue and pass is an error.
finally:
#: E901
continue
for x in [1]: for x in [1]:
break break
@@ -51,3 +44,75 @@ a = 3
def x(b=a): def x(b=a):
global a global a
*foo, a = (1,)
*foo[0], a = (1,)
*[], a = (1,)
async def foo():
await bar()
#: E901
yield from []
return
#: E901
return ''
# With decorator it's a different statement.
@bla
async def foo():
await bar()
#: E901
yield from []
return
#: E901
return ''
foo: int = 4
(foo): int = 3
((foo)): int = 3
foo.bar: int
foo[3]: int
def glob():
global x
y: foo = x
def c():
a = 3
def d():
class X():
nonlocal a
def x():
a = 3
def y():
nonlocal a
def x():
def y():
nonlocal a
a = 3
def x():
a = 3
def y():
class z():
nonlocal a
a = *args, *args
error[(*args, *args)] = 3
*args, *args
@@ -1,2 +0,0 @@
's' b''
u's' b'ä'
@@ -1,3 +0,0 @@
*foo, a = (1,)
*foo[0], a = (1,)
*[], a = (1,)
@@ -1,23 +0,0 @@
"""
Mostly allowed syntax in Python 3.5.
"""
async def foo():
await bar()
#: E901
yield from []
return
#: E901
return ''
# With decorator it's a different statement.
@bla
async def foo():
await bar()
#: E901
yield from []
return
#: E901
return ''
@@ -1,45 +0,0 @@
foo: int = 4
(foo): int = 3
((foo)): int = 3
foo.bar: int
foo[3]: int
def glob():
global x
y: foo = x
def c():
a = 3
def d():
class X():
nonlocal a
def x():
a = 3
def y():
nonlocal a
def x():
def y():
nonlocal a
a = 3
def x():
a = 3
def y():
class z():
nonlocal a
a = *args, *args
error[(*args, *args)] = 3
*args, *args
-14
@@ -1,14 +0,0 @@
import sys
print 1, 2 >> sys.stdout
foo = ur'This is not possible in Python 3.'
# This is actually printing a tuple.
#: E275:5
print(1, 2)
# True and False are not keywords in Python 2 and therefore there's no need for
# a space.
norman = True+False
-29
@@ -1,29 +0,0 @@
"""
Tests ``from __future__ import absolute_import`` (only important for
Python 2.X)
"""
from parso import parse
def test_explicit_absolute_imports():
"""
Detect modules with ``from __future__ import absolute_import``.
"""
module = parse("from __future__ import absolute_import")
assert module._has_explicit_absolute_import()
def test_no_explicit_absolute_imports():
"""
Detect modules without ``from __future__ import absolute_import``.
"""
assert not parse("1")._has_explicit_absolute_import()
def test_dont_break_imports_without_namespaces():
"""
The code checking for ``from __future__ import absolute_import`` shouldn't
assume that all imports have non-``None`` namespaces.
"""
src = "from __future__ import absolute_import\nimport xyzzy"
assert parse(src)._has_explicit_absolute_import()
+126 -23
@@ -2,25 +2,36 @@
Test all things related to the ``jedi.cache`` module. Test all things related to the ``jedi.cache`` module.
""" """
from os import unlink import os
import pytest import pytest
import time
from pathlib import Path
from parso.cache import _NodeCacheItem, save_module, load_module, \ from parso.cache import (_CACHED_FILE_MAXIMUM_SURVIVAL, _VERSION_TAG,
_get_hashed_path, parser_cache, _load_from_file_system, _save_to_file_system _get_cache_clear_lock_path, _get_hashed_path,
_load_from_file_system, _NodeCacheItem,
_remove_cache_and_update_lock, _save_to_file_system,
load_module, parser_cache, try_to_save_module)
from parso._compatibility import is_pypy
from parso import load_grammar from parso import load_grammar
from parso import cache from parso import cache
from parso import file_io
from parso import parse
skip_pypy = pytest.mark.skipif(
is_pypy,
reason="pickling in pypy is slow, since we don't pickle,"
"we never go into path of auto-collecting garbage"
)
@pytest.fixture() @pytest.fixture()
def isolated_jedi_cache(monkeypatch, tmpdir): def isolated_parso_cache(monkeypatch, tmpdir):
""" """Set `parso.cache._default_cache_path` to a temporary directory
Set `jedi.settings.cache_directory` to a temporary directory during test. during the test. """
cache_path = Path(str(tmpdir), "__parso_cache")
Same as `clean_jedi_cache`, but create the temporary directory for monkeypatch.setattr(cache, '_default_cache_path', cache_path)
each test case (scope='function'). return cache_path
"""
monkeypatch.setattr(cache, '_default_cache_path', str(tmpdir))
def test_modulepickling_change_cache_dir(tmpdir): def test_modulepickling_change_cache_dir(tmpdir):
@@ -29,13 +40,13 @@ def test_modulepickling_change_cache_dir(tmpdir):
See: `#168 <https://github.com/davidhalter/jedi/pull/168>`_ See: `#168 <https://github.com/davidhalter/jedi/pull/168>`_
""" """
dir_1 = str(tmpdir.mkdir('first')) dir_1 = Path(str(tmpdir.mkdir('first')))
dir_2 = str(tmpdir.mkdir('second')) dir_2 = Path(str(tmpdir.mkdir('second')))
item_1 = _NodeCacheItem('bla', []) item_1 = _NodeCacheItem('bla', [])
item_2 = _NodeCacheItem('bla', []) item_2 = _NodeCacheItem('bla', [])
path_1 = 'fake path 1' path_1 = Path('fake path 1')
path_2 = 'fake path 2' path_2 = Path('fake path 2')
hashed_grammar = load_grammar()._hashed hashed_grammar = load_grammar()._hashed
_save_to_file_system(hashed_grammar, path_1, item_1, cache_path=dir_1) _save_to_file_system(hashed_grammar, path_1, item_1, cache_path=dir_1)
@@ -54,7 +65,7 @@ def load_stored_item(hashed_grammar, path, item, cache_path):
return item return item
@pytest.mark.usefixtures("isolated_jedi_cache") @pytest.mark.usefixtures("isolated_parso_cache")
def test_modulepickling_simulate_deleted_cache(tmpdir): def test_modulepickling_simulate_deleted_cache(tmpdir):
""" """
Tests loading from a cache file after it is deleted. Tests loading from a cache file after it is deleted.
@@ -68,20 +79,112 @@ def test_modulepickling_simulate_deleted_cache(tmpdir):
way. way.
__ https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileSystemOverview/FileSystemOverview.html __ https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileSystemOverview/FileSystemOverview.html
""" """ # noqa
grammar = load_grammar() grammar = load_grammar()
module = 'fake parser' module = 'fake parser'
# Create the file # Create the file
path = tmpdir.dirname + '/some_path' path = Path(str(tmpdir.dirname), 'some_path')
with open(path, 'w'): with open(path, 'w'):
pass pass
io = file_io.FileIO(path)
save_module(grammar._hashed, path, module, []) try_to_save_module(grammar._hashed, io, module, lines=[])
assert load_module(grammar._hashed, path) == module assert load_module(grammar._hashed, io) == module
unlink(_get_hashed_path(grammar._hashed, path)) os.unlink(_get_hashed_path(grammar._hashed, path))
parser_cache.clear() parser_cache.clear()
cached2 = load_module(grammar._hashed, path) cached2 = load_module(grammar._hashed, io)
assert cached2 is None assert cached2 is None
def test_cache_limit():
def cache_size():
return sum(len(v) for v in parser_cache.values())
try:
parser_cache.clear()
future_node_cache_item = _NodeCacheItem('bla', [], change_time=time.time() + 10e6)
old_node_cache_item = _NodeCacheItem('bla', [], change_time=time.time() - 10e4)
parser_cache['some_hash_old'] = {
'/path/%s' % i: old_node_cache_item for i in range(300)
}
parser_cache['some_hash_new'] = {
'/path/%s' % i: future_node_cache_item for i in range(300)
}
assert cache_size() == 600
parse('somecode', cache=True, path='/path/somepath')
assert cache_size() == 301
finally:
parser_cache.clear()
class _FixedTimeFileIO(file_io.KnownContentFileIO):
def __init__(self, path, content, last_modified):
super().__init__(path, content)
self._last_modified = last_modified
def get_last_modified(self):
return self._last_modified
@pytest.mark.parametrize('diff_cache', [False, True])
@pytest.mark.parametrize('use_file_io', [False, True])
def test_cache_last_used_update(diff_cache, use_file_io):
p = Path('/path/last-used')
parser_cache.clear() # Clear, because then it's easier to find stuff.
parse('somecode', cache=True, path=p)
node_cache_item = next(iter(parser_cache.values()))[p]
now = time.time()
assert node_cache_item.last_used < now
if use_file_io:
f = _FixedTimeFileIO(p, 'code', node_cache_item.last_used - 10)
parse(file_io=f, cache=True, diff_cache=diff_cache)
else:
parse('somecode2', cache=True, path=p, diff_cache=diff_cache)
node_cache_item = next(iter(parser_cache.values()))[p]
assert now < node_cache_item.last_used < time.time()
@skip_pypy
def test_inactive_cache(tmpdir, isolated_parso_cache):
parser_cache.clear()
test_subjects = "abcdef"
for path in test_subjects:
parse('somecode', cache=True, path=os.path.join(str(tmpdir), path))
raw_cache_path = isolated_parso_cache.joinpath(_VERSION_TAG)
assert raw_cache_path.exists()
dir_names = os.listdir(raw_cache_path)
a_while_ago = time.time() - _CACHED_FILE_MAXIMUM_SURVIVAL
old_paths = set()
for dir_name in dir_names[:len(test_subjects) // 2]: # make certain number of paths old
os.utime(raw_cache_path.joinpath(dir_name), (a_while_ago, a_while_ago))
old_paths.add(dir_name)
# nothing should be cleared while the lock is on
assert _get_cache_clear_lock_path().exists()
_remove_cache_and_update_lock() # it shouldn't clear anything
assert len(os.listdir(raw_cache_path)) == len(test_subjects)
assert old_paths.issubset(os.listdir(raw_cache_path))
os.utime(_get_cache_clear_lock_path(), (a_while_ago, a_while_ago))
_remove_cache_and_update_lock()
assert len(os.listdir(raw_cache_path)) == len(test_subjects) // 2
assert not old_paths.intersection(os.listdir(raw_cache_path))
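`test_inactive_cache` ages half of the cache entries by backdating their modification time with `os.utime` and then expects only those entries to be removed. The sketch below shows the same backdating technique with the standard library only. `MAX_AGE` and `remove_stale` are hypothetical stand-ins for illustration, and the lock-file step that the test also exercises is left out.

```python
import os
import tempfile
import time

MAX_AGE = 60 * 60 * 24 * 30  # hypothetical survival window, in seconds


def remove_stale(directory, max_age=MAX_AGE, now=None):
    """Delete files in `directory` whose mtime is older than `max_age`."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.getmtime(path) < now - max_age:
            os.remove(path)
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    for name in "abcdef":
        with open(os.path.join(tmp, name), "w") as f:
            f.write("cached")
    # Backdate half of the files; the extra 10 seconds avoid a boundary race.
    long_ago = time.time() - MAX_AGE - 10
    for name in "abc":
        os.utime(os.path.join(tmp, name), (long_ago, long_ago))
    removed = remove_stale(tmp)
    remaining = sorted(os.listdir(tmp))
```

Backdating with `os.utime` keeps the test fast, since nothing has to wait for real time to pass before the cleanup becomes eligible to run.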
@skip_pypy
def test_permission_error(monkeypatch):
def save(*args, **kwargs):
nonlocal was_called
was_called = True
raise PermissionError
was_called = False
monkeypatch.setattr(cache, '_save_to_file_system', save)
with pytest.warns(Warning):
parse(path=__file__, cache=True, diff_cache=True)
assert was_called
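`test_permission_error` replaces the function that writes to disk with one that raises `PermissionError` and then expects a warning rather than a crash. A minimal sketch of that pattern follows, assuming a hypothetical `try_to_save` wrapper and a hypothetical `failing_save` stand-in. It does not reproduce parso's actual cache code.

```python
import warnings


def failing_save(path, data):
    """Hypothetical stand-in for a writer that cannot access the disk."""
    raise PermissionError


def try_to_save(path, data, save=failing_save):
    """Attempt to persist `data`; downgrade a permission failure to a warning."""
    try:
        save(path, data)
    except PermissionError:
        warnings.warn('Could not write cache file %s' % path, Warning)
        return False
    return True


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    ok = try_to_save('/no/access', b'data')
```

Treating a failed cache write as non-fatal seems reasonable here because the cache is only an optimization and parsing can still succeed without it.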
+534 -57
@@ -1,14 +1,13 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
from textwrap import dedent from textwrap import dedent
import logging import logging
import sys
import pytest import pytest
from parso.utils import split_lines from parso.utils import split_lines
from parso import cache from parso import cache
from parso import load_grammar from parso import load_grammar
from parso.python.diff import DiffParser, _assert_valid_graph from parso.python.diff import DiffParser, _assert_valid_graph, _assert_nodes_are_equal
from parso import parse from parso import parse
ANY = object() ANY = object()
@@ -39,7 +38,7 @@ def _check_error_leaves_nodes(node):
return None return None
class Differ(object): class Differ:
grammar = load_grammar() grammar = load_grammar()
def initialize(self, code): def initialize(self, code):
@@ -69,6 +68,9 @@ class Differ(object):
_assert_valid_graph(new_module) _assert_valid_graph(new_module)
without_diff_parser_module = parse(code)
_assert_nodes_are_equal(new_module, without_diff_parser_module)
error_node = _check_error_leaves_nodes(new_module) error_node = _check_error_leaves_nodes(new_module)
assert expect_error_leaves == (error_node is not None), error_node assert expect_error_leaves == (error_node is not None), error_node
if parsers is not ANY: if parsers is not ANY:
@@ -88,15 +90,15 @@ def test_change_and_undo(differ):
# Parse the function and a. # Parse the function and a.
differ.initialize(func_before + 'a') differ.initialize(func_before + 'a')
# Parse just b. # Parse just b.
differ.parse(func_before + 'b', copies=1, parsers=1) differ.parse(func_before + 'b', copies=1, parsers=2)
# b has changed to a again, so parse that. # b has changed to a again, so parse that.
differ.parse(func_before + 'a', copies=1, parsers=1) differ.parse(func_before + 'a', copies=1, parsers=2)
# Same as before parsers should not be used. Just a simple copy. # Same as before parsers should not be used. Just a simple copy.
differ.parse(func_before + 'a', copies=1) differ.parse(func_before + 'a', copies=1)
# Now that we have a newline at the end, everything is easier in Python # Now that we have a newline at the end, everything is easier in Python
# syntax, we can parse once and then get a copy. # syntax, we can parse once and then get a copy.
differ.parse(func_before + 'a\n', copies=1, parsers=1) differ.parse(func_before + 'a\n', copies=1, parsers=2)
differ.parse(func_before + 'a\n', copies=1) differ.parse(func_before + 'a\n', copies=1)
# Getting rid of an old parser: Still no parsers used. # Getting rid of an old parser: Still no parsers used.
@@ -135,7 +137,7 @@ def test_if_simple(differ):
differ.initialize(src + 'a') differ.initialize(src + 'a')
differ.parse(src + else_ + "a", copies=0, parsers=1) differ.parse(src + else_ + "a", copies=0, parsers=1)
differ.parse(else_, parsers=1, copies=1, expect_error_leaves=True) differ.parse(else_, parsers=2, expect_error_leaves=True)
differ.parse(src + else_, parsers=1) differ.parse(src + else_, parsers=1)
@@ -152,7 +154,7 @@ def test_func_with_for_and_comment(differ):
# COMMENT # COMMENT
a""") a""")
differ.initialize(src) differ.initialize(src)
differ.parse('a\n' + src, copies=1, parsers=2) differ.parse('a\n' + src, copies=1, parsers=3)
def test_one_statement_func(differ): def test_one_statement_func(differ):
@@ -236,7 +238,7 @@ def test_backslash(differ):
def y(): def y():
pass pass
""") """)
differ.parse(src, parsers=2) differ.parse(src, parsers=1)
src = dedent(r""" src = dedent(r"""
def first(): def first():
@@ -247,7 +249,7 @@ def test_backslash(differ):
def second(): def second():
pass pass
""") """)
differ.parse(src, parsers=1) differ.parse(src, parsers=2)
def test_full_copy(differ): def test_full_copy(differ):
@@ -261,10 +263,10 @@ def test_wrong_whitespace(differ):
hello hello
''' '''
differ.initialize(code) differ.initialize(code)
differ.parse(code + 'bar\n ', parsers=3) differ.parse(code + 'bar\n ', parsers=2, expect_error_leaves=True)
code += """abc(\npass\n """ code += """abc(\npass\n """
differ.parse(code, parsers=2, copies=1, expect_error_leaves=True) differ.parse(code, parsers=2, expect_error_leaves=True)
def test_issues_with_error_leaves(differ): def test_issues_with_error_leaves(differ):
@@ -279,7 +281,7 @@ def test_issues_with_error_leaves(differ):
str str
''') ''')
differ.initialize(code) differ.initialize(code)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True) differ.parse(code2, parsers=1, expect_error_leaves=True)
def test_unfinished_nodes(differ): def test_unfinished_nodes(differ):
@@ -299,7 +301,7 @@ def test_unfinished_nodes(differ):
a(1) a(1)
''') ''')
differ.initialize(code) differ.initialize(code)
differ.parse(code2, parsers=1, copies=2) differ.parse(code2, parsers=2, copies=2)
def test_nested_if_and_scopes(differ): def test_nested_if_and_scopes(differ):
@@ -365,7 +367,7 @@ def test_totally_wrong_whitespace(differ):
''' '''
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=4, copies=0, expect_error_leaves=True) differ.parse(code2, parsers=2, copies=0, expect_error_leaves=True)
def test_node_insertion(differ): def test_node_insertion(differ):
@@ -439,7 +441,7 @@ def test_in_class_movements(differ):
""") """)
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1) differ.parse(code2, parsers=1)
def test_in_parentheses_newlines(differ): def test_in_parentheses_newlines(differ):
@@ -484,7 +486,7 @@ def test_indentation_issue(differ):
""") """)
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1) differ.parse(code2, parsers=2)
def test_endmarker_newline(differ): def test_endmarker_newline(differ):
@@ -585,7 +587,7 @@ def test_if_removal_and_reappearence(differ):
la la
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1, copies=4, expect_error_leaves=True) differ.parse(code2, parsers=3, copies=2, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=1) differ.parse(code1, parsers=1, copies=1)
differ.parse(code3, parsers=1, copies=1) differ.parse(code3, parsers=1, copies=1)
@@ -618,8 +620,8 @@ def test_differing_docstrings(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=3, copies=1) differ.parse(code2, parsers=2, copies=1)
differ.parse(code1, parsers=3, copies=1) differ.parse(code1, parsers=2, copies=1)
def test_one_call_in_function_change(differ): def test_one_call_in_function_change(differ):
@@ -649,7 +651,7 @@ def test_one_call_in_function_change(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True) differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=2, copies=1) differ.parse(code1, parsers=2, copies=1)
@@ -711,7 +713,7 @@ def test_docstring_removal(differ):
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1, copies=2) differ.parse(code2, parsers=1, copies=2)
differ.parse(code1, parsers=2, copies=1) differ.parse(code1, parsers=3, copies=1)
def test_paren_in_strange_position(differ): def test_paren_in_strange_position(differ):
@@ -783,7 +785,7 @@ def test_parentheses_before_method(differ):
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True) differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=1) differ.parse(code1, parsers=2, copies=1)
def test_indentation_issues(differ): def test_indentation_issues(differ):
@@ -824,10 +826,10 @@ def test_indentation_issues(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=2, copies=2, expect_error_leaves=True) differ.parse(code2, parsers=3, copies=1, expect_error_leaves=True)
differ.parse(code1, copies=2) differ.parse(code1, copies=1, parsers=2)
differ.parse(code3, parsers=2, copies=1) differ.parse(code3, parsers=2, copies=1)
differ.parse(code1, parsers=1, copies=2) differ.parse(code1, parsers=2, copies=1)
def test_error_dedent_issues(differ): def test_error_dedent_issues(differ):
@@ -860,7 +862,7 @@ def test_error_dedent_issues(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=6, copies=2, expect_error_leaves=True) differ.parse(code2, parsers=3, copies=0, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=0) differ.parse(code1, parsers=1, copies=0)
@@ -892,8 +894,8 @@ Some'random text: yeah
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True) differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=1) differ.parse(code1, parsers=2, copies=1)
def test_many_nested_ifs(differ): def test_many_nested_ifs(differ):
@@ -931,7 +933,6 @@ def test_many_nested_ifs(differ):
differ.parse(code1, parsers=1, copies=1) differ.parse(code1, parsers=1, copies=1)
@pytest.mark.skipif(sys.version_info < (3, 5), reason="Async starts working in 3.5")
@pytest.mark.parametrize('prefix', ['', 'async ']) @pytest.mark.parametrize('prefix', ['', 'async '])
def test_with_and_funcdef_in_call(differ, prefix): def test_with_and_funcdef_in_call(differ, prefix):
code1 = prefix + dedent('''\ code1 = prefix + dedent('''\
@@ -946,7 +947,7 @@ def test_with_and_funcdef_in_call(differ, prefix):
code2 = insert_line_into_code(code1, 3, 'def y(self, args):\n') code2 = insert_line_into_code(code1, 3, 'def y(self, args):\n')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=3, expect_error_leaves=True) differ.parse(code2, parsers=1, expect_error_leaves=True)
differ.parse(code1, parsers=1) differ.parse(code1, parsers=1)
@@ -961,33 +962,29 @@ def test_wrong_backslash(differ):
code2 = insert_line_into_code(code1, 3, '\\.whl$\n') code2 = insert_line_into_code(code1, 3, '\\.whl$\n')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=2, copies=2, expect_error_leaves=True) differ.parse(code2, parsers=3, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=1) differ.parse(code1, parsers=1, copies=1)
def test_comment_change(differ):
differ.initialize('')
def test_random_unicode_characters(differ): def test_random_unicode_characters(differ):
""" """
Those issues were all found with the fuzzer. Those issues were all found with the fuzzer.
""" """
differ.initialize('') differ.initialize('')
differ.parse(u'\x1dĔBϞɛˁşʑ˳˻ȣſéÎ\x90̕ȟòwʘ\x1dĔBϞɛˁşʑ˳˻ȣſéÎ', parsers=1, expect_error_leaves=True) differ.parse('\x1dĔBϞɛˁşʑ˳˻ȣſéÎ\x90̕ȟòwʘ\x1dĔBϞɛˁşʑ˳˻ȣſéÎ', parsers=1,
differ.parse(u'\r\r', parsers=1) expect_error_leaves=True)
differ.parse(u"˟Ę\x05À\r rúƣ@\x8a\x15r()\n", parsers=1, expect_error_leaves=True) differ.parse('\r\r', parsers=1)
differ.parse(u'a\ntaǁ\rGĒōns__\n\nb', parsers=1) differ.parse("˟Ę\x05À\r rúƣ@\x8a\x15r()\n", parsers=1, expect_error_leaves=True)
differ.parse('a\ntaǁ\rGĒōns__\n\nb', parsers=1)
s = ' if not (self, "_fi\x02\x0e\x08\n\nle"):' s = ' if not (self, "_fi\x02\x0e\x08\n\nle"):'
differ.parse(s, parsers=1, expect_error_leaves=True) differ.parse(s, parsers=1, expect_error_leaves=True)
differ.parse('') differ.parse('')
differ.parse(s + '\n', parsers=1, expect_error_leaves=True) differ.parse(s + '\n', parsers=1, expect_error_leaves=True)
differ.parse(u' result = (\r\f\x17\t\x11res)', parsers=2, expect_error_leaves=True) differ.parse(' result = (\r\f\x17\t\x11res)', parsers=1, expect_error_leaves=True)
differ.parse('') differ.parse('')
differ.parse(' a( # xx\ndef', parsers=2, expect_error_leaves=True) differ.parse(' a( # xx\ndef', parsers=1, expect_error_leaves=True)
@pytest.mark.skipif(sys.version_info < (2, 7), reason="No set literals in Python 2.6")
def test_dedent_end_positions(differ): def test_dedent_end_positions(differ):
code1 = dedent('''\ code1 = dedent('''\
if 1: if 1:
@@ -1039,7 +1036,7 @@ def test_random_character_insertion(differ):
# 4 # 4
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=1, parsers=3, expect_error_leaves=True) differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True)
differ.parse(code1, copies=1, parsers=1) differ.parse(code1, copies=1, parsers=1)
@@ -1100,8 +1097,8 @@ def test_all_sorts_of_indentation(differ):
end end
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=1, parsers=4, expect_error_leaves=True) differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True)
differ.parse(code1, copies=1, parsers=3) differ.parse(code1, copies=1, parsers=1, expect_error_leaves=True)
code3 = dedent('''\ code3 = dedent('''\
if 1: if 1:
@@ -1111,7 +1108,7 @@ def test_all_sorts_of_indentation(differ):
d d
\x00 \x00
''') ''')
differ.parse(code3, parsers=2, expect_error_leaves=True) differ.parse(code3, parsers=1, expect_error_leaves=True)
differ.parse('') differ.parse('')
@@ -1128,7 +1125,7 @@ def test_dont_copy_dedents_in_beginning(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True) differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True)
differ.parse(code1, parsers=2) differ.parse(code1, parsers=1, copies=1)
def test_dont_copy_error_leaves(differ): def test_dont_copy_error_leaves(differ):
@@ -1148,7 +1145,7 @@ def test_dont_copy_error_leaves(differ):
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, parsers=1, expect_error_leaves=True) differ.parse(code2, parsers=1, expect_error_leaves=True)
differ.parse(code1, parsers=2) differ.parse(code1, parsers=1)
def test_error_dedent_in_between(differ): def test_error_dedent_in_between(differ):
@@ -1172,7 +1169,7 @@ def test_error_dedent_in_between(differ):
z z
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True) differ.parse(code2, copies=1, parsers=2, expect_error_leaves=True)
differ.parse(code1, copies=1, parsers=2) differ.parse(code1, copies=1, parsers=2)
@@ -1198,8 +1195,8 @@ def test_some_other_indentation_issues(differ):
a a
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=2, parsers=1, expect_error_leaves=True) differ.parse(code2, copies=0, parsers=1, expect_error_leaves=True)
differ.parse(code1, copies=2, parsers=2) differ.parse(code1, copies=1, parsers=1)
def test_open_bracket_case1(differ): def test_open_bracket_case1(differ):
@@ -1239,11 +1236,11 @@ def test_open_bracket_case2(differ):
d d
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=1, parsers=2, expect_error_leaves=True) differ.parse(code2, copies=0, parsers=1, expect_error_leaves=True)
differ.parse(code1, copies=2, parsers=0, expect_error_leaves=True) differ.parse(code1, copies=0, parsers=1, expect_error_leaves=True)
def test_x(differ): def test_some_weird_removals(differ):
code1 = dedent('''\ code1 = dedent('''\
class C: class C:
1 1
@@ -1264,6 +1261,486 @@ def test_x(differ):
omega omega
''') ''')
differ.initialize(code1) differ.initialize(code1)
differ.parse(code2, copies=ANY, parsers=ANY, expect_error_leaves=True) differ.parse(code2, copies=1, parsers=1, expect_error_leaves=True)
differ.parse(code3, copies=ANY, parsers=ANY, expect_error_leaves=True) differ.parse(code3, copies=1, parsers=3, expect_error_leaves=True)
differ.parse(code1, copies=1) differ.parse(code1, copies=1)
def test_async_copy(differ):
code1 = dedent('''\
async def main():
x = 3
print(
''')
code2 = dedent('''\
async def main():
x = 3
print()
''')
differ.initialize(code1)
differ.parse(code2, copies=1, parsers=1)
differ.parse(code1, copies=1, parsers=1, expect_error_leaves=True)
def test_parent_on_decorator(differ):
code1 = dedent('''\
class AClass:
@decorator()
def b_test(self):
print("Hello")
print("world")
def a_test(self):
pass''')
code2 = dedent('''\
class AClass:
@decorator()
def b_test(self):
print("Hello")
print("world")
def a_test(self):
pass''')
differ.initialize(code1)
module_node = differ.parse(code2, parsers=1)
cls = module_node.children[0]
cls_suite = cls.children[-1]
assert len(cls_suite.children) == 3
def test_wrong_indent_in_def(differ):
code1 = dedent('''\
def x():
a
b
''')
code2 = dedent('''\
def x():
//
b
c
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, expect_error_leaves=True)
differ.parse(code1, parsers=1)
def test_backslash_issue(differ):
code1 = dedent('''
pre = (
'')
after = 'instead'
''')
code2 = dedent('''
pre = (
'')
\\if
''') # noqa
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=1, copies=1)
def test_paren_with_indentation(differ):
code1 = dedent('''
class C:
def f(self, fullname, path=None):
x
def load_module(self, fullname):
a
for prefix in self.search_path:
try:
b
except ImportError:
c
else:
raise
def x():
pass
''')
code2 = dedent('''
class C:
def f(self, fullname, path=None):
x
(
a
for prefix in self.search_path:
try:
b
except ImportError:
c
else:
raise
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=3, copies=1)
def test_error_dedent_in_function(differ):
code1 = dedent('''\
def x():
a
b
c
d
''')
code2 = dedent('''\
def x():
a
b
c
d
e
''')
differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
def test_with_formfeed(differ):
code1 = dedent('''\
@bla
async def foo():
1
yield from []
return
return ''
''')
code2 = dedent('''\
@bla
async def foo():
1
\x0cimport
return
return ''
''') # noqa
differ.initialize(code1)
differ.parse(code2, parsers=ANY, copies=ANY, expect_error_leaves=True)
def test_repeating_invalid_indent(differ):
code1 = dedent('''\
def foo():
return
@bla
a
def foo():
a
b
c
''')
code2 = dedent('''\
def foo():
return
@bla
a
b
c
''')
differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
def test_another_random_indent(differ):
code1 = dedent('''\
def foo():
a
b
c
return
def foo():
d
''')
code2 = dedent('''\
def foo():
a
c
return
def foo():
d
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=3)
def test_invalid_function(differ):
code1 = dedent('''\
a
def foo():
def foo():
b
''')
code2 = dedent('''\
a
def foo():
def foo():
b
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
def test_async_func2(differ):
code1 = dedent('''\
async def foo():
return ''
@bla
async def foo():
x
''')
code2 = dedent('''\
async def foo():
return ''
{
@bla
async def foo():
x
y
''')
differ.initialize(code1)
differ.parse(code2, parsers=ANY, copies=ANY, expect_error_leaves=True)
def test_weird_ending(differ):
code1 = dedent('''\
def foo():
a
return
''')
code2 = dedent('''\
def foo():
a
nonlocal xF"""
y"""''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
def test_nested_class(differ):
code1 = dedent('''\
def c():
a = 3
class X:
b
''')
code2 = dedent('''\
def c():
a = 3
class X:
elif
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
def test_class_with_paren_breaker(differ):
code1 = dedent('''\
class Grammar:
x
def parse():
y
parser(
)
z
''')
code2 = dedent('''\
class Grammar:
x
def parse():
y
parser(
finally ;
)
z
''')
differ.initialize(code1)
differ.parse(code2, parsers=3, copies=1, expect_error_leaves=True)
def test_byte_order_mark(differ):
code2 = dedent('''\
x
\ufeff
else :
''')
differ.initialize('\n')
differ.parse(code2, parsers=2, expect_error_leaves=True)
code3 = dedent('''\
\ufeff
if:
x
''')
differ.initialize('\n')
differ.parse(code3, parsers=2, expect_error_leaves=True)
def test_byte_order_mark2(differ):
code = '\ufeff# foo'
differ.initialize(code)
differ.parse(code + 'x', parsers=ANY)
def test_byte_order_mark3(differ):
code1 = "\ufeff#\ny\n"
code2 = 'x\n\ufeff#\n\ufeff#\ny\n'
differ.initialize(code1)
differ.parse(code2, expect_error_leaves=True, parsers=ANY, copies=ANY)
differ.parse(code1, parsers=1)
def test_backslash_insertion(differ):
code1 = dedent('''
def f():
x
def g():
base = "" \\
""
return
''')
code2 = dedent('''
def f():
x
def g():
base = "" \\
def h():
""
return
''')
differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
differ.parse(code1, parsers=2, copies=1)
def test_fstring_with_error_leaf(differ):
code1 = dedent("""\
def f():
x
def g():
y
""")
code2 = dedent("""\
def f():
x
F'''
def g():
y
{a
\x01
""")
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
def test_yet_another_backslash(differ):
code1 = dedent('''\
def f():
x
def g():
y
base = "" \\
"" % to
return
''')
code2 = dedent('''\
def f():
x
def g():
y
base = "" \\
\x0f
return
''')
differ.initialize(code1)
differ.parse(code2, parsers=ANY, copies=ANY, expect_error_leaves=True)
differ.parse(code1, parsers=ANY, copies=ANY)
def test_backslash_before_def(differ):
code1 = dedent('''\
def f():
x
def g():
y
z
''')
code2 = dedent('''\
def f():
x
>\\
def g():
y
x
z
''')
differ.initialize(code1)
differ.parse(code2, parsers=3, copies=1, expect_error_leaves=True)
def test_backslash_with_imports(differ):
code1 = dedent('''\
from x import y, \\
''')
code2 = dedent('''\
from x import y, \\
z
''')
differ.initialize(code1)
differ.parse(code2, parsers=1)
differ.parse(code1, parsers=1)
def test_one_line_function_error_recovery(differ):
code1 = dedent('''\
class X:
x
def y(): word """
# a
# b
c(self)
''')
code2 = dedent('''\
class X:
x
def y(): word """
# a
# b
c(\x01+self)
''')
differ.initialize(code1)
differ.parse(code2, parsers=1, copies=1, expect_error_leaves=True)
def test_one_line_property_error_recovery(differ):
code1 = dedent('''\
class X:
x
@property
def encoding(self): True -
return 1
''')
code2 = dedent('''\
class X:
x
@property
def encoding(self): True -
return 1
''')
differ.initialize(code1)
differ.parse(code2, parsers=2, copies=1, expect_error_leaves=True)
+65 -1
@@ -1,3 +1,5 @@
from textwrap import dedent
from parso import parse, load_grammar from parso import parse, load_grammar
@@ -72,7 +74,7 @@ def test_invalid_token():
def test_invalid_token_in_fstr(): def test_invalid_token_in_fstr():
module = load_grammar(version='3.6').parse('f"{a + ? + b}"') module = load_grammar(version='3.9').parse('f"{a + ? + b}"')
error_node, q, plus_b, error1, error2, endmarker = module.children error_node, q, plus_b, error1, error2, endmarker = module.children
assert error_node.get_code() == 'f"{a +' assert error_node.get_code() == 'f"{a +'
assert q.value == '?' assert q.value == '?'
@@ -83,3 +85,65 @@ def test_invalid_token_in_fstr():
assert error1.type == 'error_leaf' assert error1.type == 'error_leaf'
assert error2.value == '"' assert error2.value == '"'
assert error2.type == 'error_leaf' assert error2.type == 'error_leaf'
def test_dedent_issues1():
code = dedent('''\
class C:
@property
f
g
end
''')
module = load_grammar(version='3.8').parse(code)
klass, endmarker = module.children
suite = klass.children[-1]
assert suite.children[2].type == 'error_leaf'
assert suite.children[3].get_code(include_prefix=False) == 'f\n'
assert suite.children[5].get_code(include_prefix=False) == 'g\n'
assert suite.type == 'suite'
def test_dedent_issues2():
code = dedent('''\
class C:
@property
if 1:
g
else:
h
end
''')
module = load_grammar(version='3.8').parse(code)
klass, endmarker = module.children
suite = klass.children[-1]
assert suite.children[2].type == 'error_leaf'
if_ = suite.children[3]
assert if_.children[0] == 'if'
assert if_.children[3].type == 'suite'
assert if_.children[3].get_code() == '\n g\n'
assert if_.children[4] == 'else'
assert if_.children[6].type == 'suite'
assert if_.children[6].get_code() == '\n h\n'
assert suite.children[4].get_code(include_prefix=False) == 'end\n'
assert suite.type == 'suite'
def test_dedent_issues3():
code = dedent('''\
class C:
f
g
''')
module = load_grammar(version='3.8').parse(code)
klass, endmarker = module.children
suite = klass.children[-1]
assert len(suite.children) == 4
assert suite.children[1].get_code() == ' f\n'
assert suite.children[1].type == 'simple_stmt'
assert suite.children[2].get_code() == ''
assert suite.children[2].type == 'error_leaf'
assert suite.children[2].token_type == 'ERROR_DEDENT'
assert suite.children[3].get_code() == ' g\n'
assert suite.children[3].type == 'simple_stmt'
+111 -36
@@ -7,31 +7,80 @@ from parso.python.tokenize import tokenize
@pytest.fixture @pytest.fixture
def grammar(): def grammar():
return load_grammar(version='3.6') return load_grammar(version='3.8')
@pytest.mark.parametrize( @pytest.mark.parametrize(
'code', [ 'code', [
'{1}', # simple cases
'{1:}', 'f"{1}"',
'', 'f"""{1}"""',
'{1!a}', 'f"{foo} {bar}"',
'{1!a:1}',
'{1:1}', # empty string
'{1:1.{32}}', 'f""',
'{1::>4}', 'f""""""',
'{foo} {bar}',
# empty format specifier is okay
'f"{1:}"',
# use of conversion options
'f"{1!a}"',
'f"{1!a:1}"',
# format specifiers
'f"{1:1}"',
'f"{1:1.{32}}"',
'f"{1::>4}"',
'f"{x:{y}}"',
'f"{x:{y:}}"',
'f"{x:{y:1}}"',
# Escapes # Escapes
'{{}}', 'f"{{}}"',
'{{{1}}}', 'f"{{{1}}}"',
'{{{1}', 'f"{{{1}"',
'1{{2{{3', 'f"1{{2{{3"',
'}}', 'f"}}"',
# New Python 3.8 syntax f'{a=}'
'f"{a=}"',
'f"{a()=}"',
# multiline f-string
'f"""abc\ndef"""',
'f"""abc{\n123}def"""',
# a line continuation inside of an fstring_string
'f"abc\\\ndef"',
'f"\\\n{123}\\\n"',
# a line continuation inside of an fstring_expr
'f"{\\\n123}"',
# a line continuation inside of an format spec
'f"{123:.2\\\nf}"',
# some unparenthesized syntactic structures
'f"{*x,}"',
'f"{*x, *y}"',
'f"{x, *y}"',
'f"{*x, y}"',
'f"{x for x in [1]}"',
# named unicode characters
'f"\\N{BULLET}"',
'f"\\N{FLEUR-DE-LIS}"',
'f"\\N{NO ENTRY}"',
'f"Combo {expr} and \\N{NO ENTRY}"',
'f"\\N{NO ENTRY} and {expr}"',
'f"\\N{no entry}"',
'f"\\N{SOYOMBO LETTER -A}"',
'f"\\N{DOMINO TILE HORIZONTAL-00-00}"',
'f"""\\N{NO ENTRY}"""',
] ]
) )
def test_valid(code, grammar): def test_valid(code, grammar):
code = 'f"""%s"""' % code
module = grammar.parse(code, error_recovery=False) module = grammar.parse(code, error_recovery=False)
fstring = module.children[0] fstring = module.children[0]
assert fstring.type == 'fstring' assert fstring.type == 'fstring'
@@ -40,23 +89,40 @@ def test_valid(code, grammar):
@pytest.mark.parametrize( @pytest.mark.parametrize(
'code', [ 'code', [
'}', # an f-string can't contain unmatched curly braces
'{', 'f"}"',
'{1!{a}}', 'f"{"',
'{!{a}}', 'f"""}"""',
'{}', 'f"""{"""',
'{:}',
'{:}}}', # invalid conversion characters
'{:1}', 'f"{1!{a}}"',
'{!:}', 'f"{1=!{a}}"',
'{!}', 'f"{!{a}}"',
'{!a}',
'{1:{}}', # The curly braces must contain an expression
'{1:{:}}', 'f"{}"',
'f"{:}"',
'f"{:}}}"',
'f"{:1}"',
'f"{!:}"',
'f"{!}"',
'f"{!a}"',
# invalid (empty) format specifiers
'f"{1:{}}"',
'f"{1:{:}}"',
# a newline without a line continuation inside a single-line string
'f"abc\ndef"',
# various named unicode escapes that aren't name-shaped
'f"\\N{ BULLET }"',
'f"\\N{NO ENTRY}"',
'f"""\\N{NO\nENTRY}"""',
] ]
) )
def test_invalid(code, grammar): def test_invalid(code, grammar):
code = 'f"""%s"""' % code
with pytest.raises(ParserSyntaxError): with pytest.raises(ParserSyntaxError):
grammar.parse(code, error_recovery=False) grammar.parse(code, error_recovery=False)
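The named unicode escape cases in `test_valid` and `test_invalid` can be checked against the "name-shaped" regex quoted in the commit message for #160. The sketch below applies that regex to the names used in the tests. Whether the tokenizer uses exactly this pattern is an assumption, and `is_name_shaped` is a hypothetical helper.

```python
import re

# Pattern quoted in the commit message of
# "Support named unicode characters in f-strings (#160)".
NAME_SHAPED = re.compile(r"[A-Za-z0-9\-]+(?: [A-Za-z0-9\-]+)*")


def is_name_shaped(name):
    """Return True if `name` could plausibly be a unicode character name."""
    return NAME_SHAPED.fullmatch(name) is not None


# Names taken from the \N{...} escapes in the test parameters.
valid_names = [
    "BULLET",
    "FLEUR-DE-LIS",
    "NO ENTRY",
    "no entry",
    "SOYOMBO LETTER -A",
    "DOMINO TILE HORIZONTAL-00-00",
]
invalid_names = [
    " BULLET ",    # leading and trailing whitespace
    "NO\nENTRY",   # newline between words
]

valid_results = [is_name_shaped(n) for n in valid_names]
invalid_results = [is_name_shaped(n) for n in invalid_names]
```

The check only tests the shape of the name. It does not confirm that the name actually exists in the unicode database.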
@@ -72,6 +138,8 @@ def test_invalid(code, grammar):
(1, 10), (1, 11), (1, 12), (1, 13)]), (1, 10), (1, 11), (1, 12), (1, 13)]),
('f"""\n {\nfoo\n }"""', [(1, 0), (1, 4), (2, 1), (3, 0), (4, 1), ('f"""\n {\nfoo\n }"""', [(1, 0), (1, 4), (2, 1), (3, 0), (4, 1),
(4, 2), (4, 5)]), (4, 2), (4, 5)]),
('f"\\N{NO ENTRY} and {expr}"', [(1, 0), (1, 2), (1, 19), (1, 20),
(1, 24), (1, 25), (1, 26)]),
] ]
) )
def test_tokenize_start_pos(code, positions): def test_tokenize_start_pos(code, positions):
@@ -79,11 +147,18 @@ def test_tokenize_start_pos(code, positions):
assert positions == [p.start_pos for p in tokens] assert positions == [p.start_pos for p in tokens]
def test_roundtrip(grammar): @pytest.mark.parametrize(
code = dedent("""\ 'code', [
f'''s{ dedent("""\
str.uppe f'''s{
''' str.uppe
""") '''
"""),
'f"foo',
'f"""foo',
'f"abc\ndef"',
]
)
def test_roundtrip(grammar, code):
tree = grammar.parse(code) tree = grammar.parse(code)
assert tree.get_code() == code assert tree.get_code() == code
+13
@@ -118,3 +118,16 @@ def test_carriage_return_at_end(code, types):
assert tree.get_code() == code assert tree.get_code() == code
assert [c.type for c in tree.children] == types assert [c.type for c in tree.children] == types
assert tree.end_pos == (len(code) + 1, 0) assert tree.end_pos == (len(code) + 1, 0)
@pytest.mark.parametrize('code', [
' ',
' F"""',
' F"""\n',
' F""" \n',
' F""" \n3',
' f"""\n"""',
' f"""\n"""\n',
])
def test_full_code_round_trip(code):
assert parse(code).get_code() == code
+2 -2
@@ -20,7 +20,7 @@ def test_parse_version(string, result):
assert utils._parse_version(string) == result assert utils._parse_version(string) == result
@pytest.mark.parametrize('string', ['1.', 'a', '#', '1.3.4.5', '1.12']) @pytest.mark.parametrize('string', ['1.', 'a', '#', '1.3.4.5'])
def test_invalid_grammar_version(string): def test_invalid_grammar_version(string):
with pytest.raises(ValueError): with pytest.raises(ValueError):
load_grammar(version=string) load_grammar(version=string)
@@ -28,4 +28,4 @@ def test_invalid_grammar_version(string):
def test_grammar_int_version(): def test_grammar_int_version():
with pytest.raises(TypeError): with pytest.raises(TypeError):
load_grammar(version=3.2) load_grammar(version=3.8)
+4 -4
@@ -5,14 +5,14 @@ tests of pydocstyle.
import difflib import difflib
import re import re
from functools import total_ordering
import parso import parso
from parso._compatibility import total_ordering
from parso.utils import python_bytes_to_unicode from parso.utils import python_bytes_to_unicode
@total_ordering @total_ordering
class WantedIssue(object): class WantedIssue:
def __init__(self, code, line, column): def __init__(self, code, line, column):
self.code = code self.code = code
self._line = line self._line = line
@@ -42,9 +42,9 @@ def collect_errors(code):
column = int(add_indent or len(match.group(1))) column = int(add_indent or len(match.group(1)))
code, _, add_line = code.partition('+') code, _, add_line = code.partition('+')
l = line_nr + 1 + int(add_line or 0) ln = line_nr + 1 + int(add_line or 0)
yield WantedIssue(code[1:], l, column) yield WantedIssue(code[1:], ln, column)
def test_normalizer_issue(normalizer_issue_case): def test_normalizer_issue(normalizer_issue_case):
+3 -4
@@ -8,12 +8,11 @@ However the tests might still be relevant for the parser.
from textwrap import dedent from textwrap import dedent
from parso._compatibility import u
from parso import parse from parso import parse
def test_carriage_return_splitting(): def test_carriage_return_splitting():
source = u(dedent(''' source = dedent('''
@@ -21,7 +20,7 @@ def test_carriage_return_splitting():
class Foo(): class Foo():
pass pass
''')) ''')
source = source.replace('\n', '\r\n') source = source.replace('\n', '\r\n')
module = parse(source) module = parse(source)
assert [n.value for lst in module.get_used_names().values() for n in lst] == ['Foo'] assert [n.value for lst in module.get_used_names().values() for n in lst] == ['Foo']
@@ -136,7 +135,7 @@ def test_wrong_indentation():
b b
a a
""") """)
#check_p(src, 1) check_p(src, 1)
src = dedent("""\ src = dedent("""\
def complex(): def complex():
+10 -10
@@ -8,13 +8,13 @@ from textwrap import dedent
from parso import parse from parso import parse
def assert_params(param_string, version=None, **wanted_dct): def assert_params(param_string, **wanted_dct):
source = dedent(''' source = dedent('''
def x(%s): def x(%s):
pass pass
''') % param_string ''') % param_string
module = parse(source, version=version) module = parse(source)
funcdef = next(module.iter_funcdefs()) funcdef = next(module.iter_funcdefs())
dct = dict((p.name.value, p.default and p.default.get_code()) dct = dict((p.name.value, p.default and p.default.get_code())
for p in funcdef.get_params()) for p in funcdef.get_params())
@@ -23,23 +23,23 @@ def assert_params(param_string, version=None, **wanted_dct):
def test_split_params_with_separation_star(): def test_split_params_with_separation_star():
assert_params(u'x, y=1, *, z=3', x=None, y='1', z='3', version='3.5') assert_params('x, y=1, *, z=3', x=None, y='1', z='3')
assert_params(u'*, x', x=None, version='3.5') assert_params('*, x', x=None)
assert_params(u'*', version='3.5') assert_params('*')
def test_split_params_with_stars(): def test_split_params_with_stars():
assert_params(u'x, *args', x=None, args=None) assert_params('x, *args', x=None, args=None)
assert_params(u'**kwargs', kwargs=None) assert_params('**kwargs', kwargs=None)
assert_params(u'*args, **kwargs', args=None, kwargs=None) assert_params('*args, **kwargs', args=None, kwargs=None)
def test_kw_only_no_kw(works_ge_py3): def test_kw_only_no_kw(works_in_py):
""" """
Parsing this should be working. In CPython the parser also parses this and Parsing this should be working. In CPython the parser also parses this and
in a later step the AST complains. in a later step the AST complains.
""" """
module = works_ge_py3.parse('def test(arg, *):\n pass') module = works_in_py.parse('def test(arg, *):\n pass')
if module is not None: if module is not None:
func = module.children[0] func = module.children[0]
open_, p1, asterisk, close = func._get_param_nodes() open_, p1, asterisk, close = func._get_param_nodes()
+38 -21
@@ -3,7 +3,6 @@ from textwrap import dedent
import pytest import pytest
from parso._compatibility import u
from parso import parse from parso import parse
from parso.python import tree from parso.python import tree
from parso.utils import split_lines from parso.utils import split_lines
@@ -110,23 +109,15 @@ def test_param_splitting(each_version):
but Jedi does this to simplify argument parsing. but Jedi does this to simplify argument parsing.
""" """
def check(src, result): def check(src, result):
# Python 2 tuple params should be ignored for now.
m = parse(src, version=each_version) m = parse(src, version=each_version)
if each_version.startswith('2'): assert not list(m.iter_funcdefs())
# We don't want b and c to be a part of the param enumeration. Just
# ignore them, because it's not what we want to support in the
# future.
func = next(m.iter_funcdefs())
assert [param.name.value for param in func.get_params()] == result
else:
assert not list(m.iter_funcdefs())
check('def x(a, (b, c)):\n pass', ['a']) check('def x(a, (b, c)):\n pass', ['a'])
check('def x((b, c)):\n pass', []) check('def x((b, c)):\n pass', [])
def test_unicode_string(): def test_unicode_string():
s = tree.String(None, u(''), (0, 0)) s = tree.String(None, '', (0, 0))
assert repr(s) # Should not raise an Error! assert repr(s) # Should not raise an Error!
@@ -135,19 +126,10 @@ def test_backslash_dos_style(each_version):
def test_started_lambda_stmt(each_version): def test_started_lambda_stmt(each_version):
m = parse(u'lambda a, b: a i', version=each_version) m = parse('lambda a, b: a i', version=each_version)
assert m.children[0].type == 'error_node' assert m.children[0].type == 'error_node'
def test_python2_octal(each_version):
module = parse('0660', version=each_version)
first = module.children[0]
if each_version.startswith('2'):
assert first.type == 'number'
else:
assert first.type == 'error_node'
@pytest.mark.parametrize('code', ['foo "', 'foo """\n', 'foo """\nbar']) @pytest.mark.parametrize('code', ['foo "', 'foo """\n', 'foo """\nbar'])
def test_open_string_literal(each_version, code): def test_open_string_literal(each_version, code):
""" """
@@ -189,3 +171,38 @@ def test_no_error_nodes(each_version):
check(child) check(child)
check(parse("if foo:\n bar", version=each_version)) check(parse("if foo:\n bar", version=each_version))
def test_named_expression(works_ge_py38):
works_ge_py38.parse("(a := 1, a + 1)")
def test_extended_rhs_annassign(works_ge_py38):
works_ge_py38.parse("x: y = z,")
works_ge_py38.parse("x: Tuple[int, ...] = z, *q, w")
@pytest.mark.parametrize(
'param_code', [
'a=1, /',
'a, /',
'a=1, /, b=3',
'a, /, b',
'a, /, b',
'a, /, *, b',
'a, /, **kwargs',
]
)
def test_positional_only_arguments(works_ge_py38, param_code):
works_ge_py38.parse("def x(%s): pass" % param_code)
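For comparison, the positional-only marker `/` exercised above can be inspected with the standard library's `ast` module. This sketch assumes it runs on Python 3.8 or newer and uses CPython's own parser rather than parso.

```python
import ast

# "/" ends the positional-only parameters, "*" starts the keyword-only ones.
tree = ast.parse("def x(a, /, b, *, c): pass")
args = tree.body[0].args

positional_only = [a.arg for a in args.posonlyargs]
positional_or_keyword = [a.arg for a in args.args]
keyword_only = [a.arg for a in args.kwonlyargs]
```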
@pytest.mark.parametrize(
'expression', [
'a + a',
'lambda x: x',
'a := lambda x: x'
]
)
def test_decorator_expression(works_ge_py39, expression):
works_ge_py39.parse("@%s\ndef x(): pass" % expression)
+80 -21
@@ -8,7 +8,7 @@ from parso import parse
from parso.python import tree from parso.python import tree
class TestsFunctionAndLambdaParsing(object): class TestsFunctionAndLambdaParsing:
FIXTURES = [ FIXTURES = [
('def my_function(x, y, z) -> str:\n return x + y * z\n', { ('def my_function(x, y, z) -> str:\n return x + y * z\n', {
@@ -26,7 +26,7 @@ class TestsFunctionAndLambdaParsing(object):
@pytest.fixture(params=FIXTURES) @pytest.fixture(params=FIXTURES)
def node(self, request): def node(self, request):
parsed = parse(dedent(request.param[0]), version='3.5') parsed = parse(dedent(request.param[0]), version='3.10')
request.keywords['expected'] = request.param[1] request.keywords['expected'] = request.param[1]
child = parsed.children[0] child = parsed.children[0]
if child.type == 'simple_stmt': if child.type == 'simple_stmt':
@@ -79,16 +79,16 @@ def test_default_param(each_version):
assert not param.star_count assert not param.star_count
def test_annotation_param(each_py3_version): def test_annotation_param(each_version):
func = parse('def x(foo: 3): pass', version=each_py3_version).children[0] func = parse('def x(foo: 3): pass', version=each_version).children[0]
param, = func.get_params() param, = func.get_params()
assert param.default is None assert param.default is None
assert param.annotation.value == '3' assert param.annotation.value == '3'
assert not param.star_count assert not param.star_count
def test_annotation_params(each_py3_version): def test_annotation_params(each_version):
func = parse('def x(foo: 3, bar: 4): pass', version=each_py3_version).children[0] func = parse('def x(foo: 3, bar: 4): pass', version=each_version).children[0]
param1, param2 = func.get_params() param1, param2 = func.get_params()
assert param1.default is None assert param1.default is None
@@ -100,23 +100,14 @@ def test_annotation_params(each_py3_version):
assert not param2.star_count assert not param2.star_count
def test_default_and_annotation_param(each_py3_version): def test_default_and_annotation_param(each_version):
func = parse('def x(foo:3=42): pass', version=each_py3_version).children[0] func = parse('def x(foo:3=42): pass', version=each_version).children[0]
param, = func.get_params() param, = func.get_params()
assert param.default.value == '42' assert param.default.value == '42'
assert param.annotation.value == '3' assert param.annotation.value == '3'
assert not param.star_count assert not param.star_count
def test_ellipsis_py2(each_py2_version):
module = parse('[0][...]', version=each_py2_version, error_recovery=False)
expr = module.children[0]
trailer = expr.children[-1]
subscript = trailer.children[1]
assert subscript.type == 'subscript'
assert [leaf.value for leaf in subscript.children] == ['.', '.', '.']
def get_yield_exprs(code, version): def get_yield_exprs(code, version):
return list(parse(code, version=version).children[0].iter_yield_exprs()) return list(parse(code, version=version).children[0].iter_yield_exprs())
@@ -142,7 +133,7 @@ def test_yields(each_version):
def test_yield_from(): def test_yield_from():
y, = get_yield_exprs('def x(): (yield from 1)', '3.3') y, = get_yield_exprs('def x(): (yield from 1)', '3.8')
assert y.type == 'yield_expr' assert y.type == 'yield_expr'
@@ -172,11 +163,79 @@ def top_function_three():
raise Exception raise Exception
""" """
r = get_raise_stmts(code, 0) # Lists in a simple Function r = get_raise_stmts(code, 0) # Lists in a simple Function
assert len(list(r)) == 1 assert len(list(r)) == 1
r = get_raise_stmts(code, 1) # Doesn't Exceptions list in closures r = get_raise_stmts(code, 1) # Doesn't Exceptions list in closures
assert len(list(r)) == 1 assert len(list(r)) == 1
r = get_raise_stmts(code, 2) # Lists inside try-catch r = get_raise_stmts(code, 2) # Lists inside try-catch
assert len(list(r)) == 2 assert len(list(r)) == 2
@pytest.mark.parametrize(
'code, name_index, is_definition, include_setitem', [
('x = 3', 0, True, False),
('x.y = 3', 0, False, False),
('x.y = 3', 1, True, False),
('x.y = u.v = z', 0, False, False),
('x.y = u.v = z', 1, True, False),
('x.y = u.v = z', 2, False, False),
('x.y = u.v, w = z', 3, True, False),
('x.y = u.v, w = z', 4, True, False),
('x.y = u.v, w = z', 5, False, False),
('x, y = z', 0, True, False),
('x, y = z', 1, True, False),
('x, y = z', 2, False, False),
('x, y = z', 2, False, False),
('x[0], y = z', 2, False, False),
('x[0] = z', 0, False, False),
('x[0], y = z', 0, False, False),
('x[0], y = z', 2, False, True),
('x[0] = z', 0, True, True),
('x[0], y = z', 0, True, True),
('x: int = z', 0, True, False),
('x: int = z', 1, False, False),
('x: int = z', 2, False, False),
('x: int', 0, True, False),
('x: int', 1, False, False),
]
)
def test_is_definition(code, name_index, is_definition, include_setitem):
module = parse(code, version='3.8')
name = module.get_first_leaf()
while True:
if name.type == 'name':
if name_index == 0:
break
name_index -= 1
name = name.get_next_leaf()
assert name.is_definition(include_setitem=include_setitem) == is_definition
def test_iter_funcdefs():
code = dedent('''
def normal(): ...
async def asyn(): ...
@dec
def dec_normal(): ...
@dec1
@dec2
async def dec_async(): ...
def broken
''')
module = parse(code, version='3.8')
func_names = [f.name.value for f in module.iter_funcdefs()]
assert func_names == ['normal', 'asyn', 'dec_normal', 'dec_async']
def test_with_stmt_get_test_node_from_name():
code = "with A as X.Y, B as (Z), C as Q[0], D as Q['foo']: pass"
with_stmt = parse(code, version='3').children[0]
tests = [
with_stmt.get_test_node_from_name(name).value
for name in with_stmt.get_defined_names(include_setitem=True)
]
assert tests == ["A", "B", "C", "D"]
+1
@@ -33,6 +33,7 @@ def test_eof_blankline():
assert_issue('# foobar\n\n') assert_issue('# foobar\n\n')
assert_issue('\n\n') assert_issue('\n\n')
def test_shebang(): def test_shebang():
assert not issues('#!\n') assert not issues('#!\n')
assert not issues('#!/foo\n') assert not issues('#!/foo\n')
+182 -115
@@ -1,11 +1,3 @@
"""Test suite for 2to3's parser and grammar files.
This is the place to add tests for changes to 2to3's grammar, such as those
merging the grammars for Python 2 and 3. In addition to specific tests for
parts of the grammar we've changed, we also make sure we can parse the
test_grammar.py files from both Python 2 and Python 3.
"""
from textwrap import dedent from textwrap import dedent
import pytest import pytest
@@ -29,32 +21,36 @@ def _invalid_syntax(code, version=None, **kwargs):
print(module.children) print(module.children)
def test_formfeed(each_py2_version): def test_formfeed(each_version):
s = u"""print 1\n\x0Cprint 2\n""" s = "foo\n\x0c\nfoo\n"
t = _parse(s, each_py2_version) t = _parse(s, each_version)
assert t.children[0].children[0].type == 'print_stmt' assert t.children[0].children[0].type == 'name'
assert t.children[1].children[0].type == 'print_stmt' assert t.children[1].children[0].type == 'name'
s = u"""1\n\x0C\x0C2\n""" s = "1\n\x0c\x0c\n2\n"
t = _parse(s, each_py2_version) t = _parse(s, each_version)
with pytest.raises(ParserSyntaxError):
s = "\n\x0c2\n"
_parse(s, each_version)
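Form feeds are a plausible source of line-counting bugs because Python's built-in line splitting treats `\x0c` as a line boundary, while a splitter that only looks at `\n` does not. The sketch below shows that difference using only built-in string methods on the same input as the test. It says nothing about how the parser under test splits lines.

```python
source = "foo\n\x0c\nfoo\n"

# str.splitlines() treats the form feed (\x0c) as a line boundary of its own.
via_splitlines = source.splitlines(keepends=True)

# Splitting only on "\n" keeps the form feed inside its line.
via_newline = source.split("\n")
```

The two results disagree on the number of lines, so position information computed from one cannot be reused with the other.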
def test_matrix_multiplication_operator(works_ge_py35): def test_matrix_multiplication_operator(works_in_py):
works_ge_py35.parse("a @ b") works_in_py.parse("a @ b")
works_ge_py35.parse("a @= b") works_in_py.parse("a @= b")
def test_yield_from(works_ge_py3, each_version): def test_yield_from(works_in_py, each_version):
works_ge_py3.parse("yield from x") works_in_py.parse("yield from x")
works_ge_py3.parse("(yield from x) + y") works_in_py.parse("(yield from x) + y")
_invalid_syntax("yield from", each_version) _invalid_syntax("yield from", each_version)
def test_await_expr(works_ge_py35): def test_await_expr(works_in_py):
works_ge_py35.parse("""async def foo(): works_in_py.parse("""async def foo():
await x await x
""") """)
works_ge_py35.parse("""async def foo(): works_in_py.parse("""async def foo():
def foo(): pass def foo(): pass
@@ -63,91 +59,139 @@ def test_await_expr(works_ge_py35):
await x await x
""") """)
works_ge_py35.parse("""async def foo(): return await a""") works_in_py.parse("""async def foo(): return await a""")
works_ge_py35.parse("""def foo(): works_in_py.parse("""def foo():
def foo(): pass def foo(): pass
async def foo(): await x async def foo(): await x
""") """)
@pytest.mark.skipif('sys.version_info[:2] < (3, 5)') @pytest.mark.parametrize(
@pytest.mark.xfail(reason="acting like python 3.7") 'code', [
def test_async_var(): "async = 1",
_parse("""async = 1""", "3.5") "await = 1",
_parse("""await = 1""", "3.5") "def async(): pass",
_parse("""def async(): pass""", "3.5") ]
)
def test_async_var(works_not_in_py, code):
works_not_in_py.parse(code)
def test_async_for(works_ge_py35): def test_async_for(works_in_py):
works_ge_py35.parse("async def foo():\n async for a in b: pass") works_in_py.parse("async def foo():\n async for a in b: pass")
def test_async_with(works_ge_py35): @pytest.mark.parametrize("body", [
works_ge_py35.parse("async def foo():\n async with a: pass") """[1 async for a in b
]""",
"""[1 async
for a in b
]""",
"""[
1
async for a in b
]""",
"""[
1
async for a
in b
]""",
"""[
1
async
for
a
in
b
]""",
""" [
1 async for a in b
]""",
])
def test_async_for_comprehension_newline(works_in_py, body):
# Issue #139
works_in_py.parse("""async def foo():
{}""".format(body))
@pytest.mark.skipif('sys.version_info[:2] < (3, 5)')
@pytest.mark.xfail(reason="acting like python 3.7") def test_async_with(works_in_py):
def test_async_with_invalid(): works_in_py.parse("async def foo():\n async with a: pass")
_invalid_syntax("""def foo():
async with a: pass""", version="3.5")
def test_async_with_invalid(works_in_py):
works_in_py.parse("""def foo():\n async with a: pass""")
def test_raise_3x_style_1(each_version): def test_raise_3x_style_1(each_version):
_parse("raise", each_version) _parse("raise", each_version)
def test_raise_2x_style_2(works_in_py2): def test_raise_2x_style_2(works_not_in_py):
works_in_py2.parse("raise E, V") works_not_in_py.parse("raise E, V")
def test_raise_2x_style_3(works_not_in_py):
works_not_in_py.parse("raise E, V, T")
def test_raise_2x_style_3(works_in_py2):
works_in_py2.parse("raise E, V, T")
def test_raise_2x_style_invalid_1(each_version): def test_raise_2x_style_invalid_1(each_version):
_invalid_syntax("raise E, V, T, Z", version=each_version) _invalid_syntax("raise E, V, T, Z", version=each_version)
def test_raise_3x_style(works_ge_py3):
works_ge_py3.parse("raise E1 from E2") def test_raise_3x_style(works_in_py):
works_in_py.parse("raise E1 from E2")
def test_raise_3x_style_invalid_1(each_version): def test_raise_3x_style_invalid_1(each_version):
_invalid_syntax("raise E, V from E1", each_version) _invalid_syntax("raise E, V from E1", each_version)
def test_raise_3x_style_invalid_2(each_version): def test_raise_3x_style_invalid_2(each_version):
_invalid_syntax("raise E from E1, E2", each_version) _invalid_syntax("raise E from E1, E2", each_version)
def test_raise_3x_style_invalid_3(each_version): def test_raise_3x_style_invalid_3(each_version):
_invalid_syntax("raise from E1, E2", each_version) _invalid_syntax("raise from E1, E2", each_version)
def test_raise_3x_style_invalid_4(each_version): def test_raise_3x_style_invalid_4(each_version):
_invalid_syntax("raise E from", each_version) _invalid_syntax("raise E from", each_version)
# Adapted from Python 3's Lib/test/test_grammar.py:GrammarTests.testFuncdef # Adapted from Python 3's Lib/test/test_grammar.py:GrammarTests.testFuncdef
def test_annotation_1(works_ge_py3): def test_annotation_1(works_in_py):
works_ge_py3.parse("""def f(x) -> list: pass""") works_in_py.parse("""def f(x) -> list: pass""")
def test_annotation_2(works_ge_py3):
works_ge_py3.parse("""def f(x:int): pass""")
def test_annotation_3(works_ge_py3): def test_annotation_2(works_in_py):
works_ge_py3.parse("""def f(*x:str): pass""") works_in_py.parse("""def f(x:int): pass""")
def test_annotation_4(works_ge_py3):
works_ge_py3.parse("""def f(**x:float): pass""")
def test_annotation_5(works_ge_py3): def test_annotation_3(works_in_py):
works_ge_py3.parse("""def f(x, y:1+2): pass""") works_in_py.parse("""def f(*x:str): pass""")
def test_annotation_6(each_py3_version):
_invalid_syntax("""def f(a, (b:1, c:2, d)): pass""", each_py3_version)
def test_annotation_7(each_py3_version): def test_annotation_4(works_in_py):
_invalid_syntax("""def f(a, (b:1, c:2, d), e:3=4, f=5, *g:6): pass""", each_py3_version) works_in_py.parse("""def f(**x:float): pass""")
def test_annotation_8(each_py3_version):
def test_annotation_5(works_in_py):
works_in_py.parse("""def f(x, y:1+2): pass""")
def test_annotation_6(each_version):
_invalid_syntax("""def f(a, (b:1, c:2, d)): pass""", each_version)
def test_annotation_7(each_version):
_invalid_syntax("""def f(a, (b:1, c:2, d), e:3=4, f=5, *g:6): pass""", each_version)
def test_annotation_8(each_version):
s = """def f(a, (b:1, c:2, d), e:3=4, f=5, s = """def f(a, (b:1, c:2, d), e:3=4, f=5,
*g:6, h:7, i=8, j:9=10, **k:11) -> 12: pass""" *g:6, h:7, i=8, j:9=10, **k:11) -> 12: pass"""
_invalid_syntax(s, each_py3_version) _invalid_syntax(s, each_version)
def test_except_new(each_version): def test_except_new(each_version):
@@ -158,27 +202,31 @@ def test_except_new(each_version):
y""") y""")
_parse(s, each_version) _parse(s, each_version)
def test_except_old(works_in_py2):
def test_except_old(works_not_in_py):
s = dedent(""" s = dedent("""
try: try:
x x
except E, N: except E, N:
y""") y""")
works_in_py2.parse(s) works_not_in_py.parse(s)
# Adapted from Python 3's Lib/test/test_grammar.py:GrammarTests.testAtoms # Adapted from Python 3's Lib/test/test_grammar.py:GrammarTests.testAtoms
def test_set_literal_1(works_ge_py27): def test_set_literal_1(works_in_py):
works_ge_py27.parse("""x = {'one'}""") works_in_py.parse("""x = {'one'}""")
def test_set_literal_2(works_ge_py27):
works_ge_py27.parse("""x = {'one', 1,}""")
def test_set_literal_3(works_ge_py27): def test_set_literal_2(works_in_py):
works_ge_py27.parse("""x = {'one', 'two', 'three'}""") works_in_py.parse("""x = {'one', 1,}""")
def test_set_literal_4(works_ge_py27):
works_ge_py27.parse("""x = {2, 3, 4,}""") def test_set_literal_3(works_in_py):
works_in_py.parse("""x = {'one', 'two', 'three'}""")
def test_set_literal_4(works_in_py):
works_in_py.parse("""x = {2, 3, 4,}""")
def test_new_octal_notation(each_version): def test_new_octal_notation(each_version):
@@ -186,8 +234,21 @@ def test_new_octal_notation(each_version):
_invalid_syntax("""0o7324528887""", each_version) _invalid_syntax("""0o7324528887""", each_version)
def test_old_octal_notation(works_in_py2): def test_old_octal_notation(works_not_in_py):
works_in_py2.parse("07") works_not_in_py.parse("07")
def test_long_notation(works_not_in_py):
works_not_in_py.parse("0xFl")
works_not_in_py.parse("0xFL")
works_not_in_py.parse("0b1l")
works_not_in_py.parse("0B1L")
works_not_in_py.parse("0o7l")
works_not_in_py.parse("0O7L")
works_not_in_py.parse("0l")
works_not_in_py.parse("0L")
works_not_in_py.parse("10l")
works_not_in_py.parse("10L")
def test_new_binary_notation(each_version): def test_new_binary_notation(each_version):
@@ -195,28 +256,24 @@ def test_new_binary_notation(each_version):
_invalid_syntax("""0b0101021""", each_version) _invalid_syntax("""0b0101021""", each_version)
def test_class_new_syntax(works_ge_py3): def test_class_new_syntax(works_in_py):
works_ge_py3.parse("class B(t=7): pass") works_in_py.parse("class B(t=7): pass")
works_ge_py3.parse("class B(t, *args): pass") works_in_py.parse("class B(t, *args): pass")
works_ge_py3.parse("class B(t, **kwargs): pass") works_in_py.parse("class B(t, **kwargs): pass")
works_ge_py3.parse("class B(t, *args, **kwargs): pass") works_in_py.parse("class B(t, *args, **kwargs): pass")
works_ge_py3.parse("class B(t, y=9, *args, **kwargs): pass") works_in_py.parse("class B(t, y=9, *args, **kwargs): pass")
def test_parser_idempotency_extended_unpacking(works_ge_py3): def test_parser_idempotency_extended_unpacking(works_in_py):
"""A cut-down version of pytree_idempotency.py.""" """A cut-down version of pytree_idempotency.py."""
works_ge_py3.parse("a, *b, c = x\n") works_in_py.parse("a, *b, c = x\n")
works_ge_py3.parse("[*a, b] = x\n") works_in_py.parse("[*a, b] = x\n")
works_ge_py3.parse("(z, *y, w) = m\n") works_in_py.parse("(z, *y, w) = m\n")
works_ge_py3.parse("for *z, m in d: pass\n") works_in_py.parse("for *z, m in d: pass\n")
def test_multiline_bytes_literals(each_version): def test_multiline_bytes_literals(each_version):
""" s = """
It's not possible to get the same result when using \xaa in Python 2/3,
because it's treated differently.
"""
s = u"""
md5test(b"\xaa" * 80, md5test(b"\xaa" * 80,
(b"Test Using Larger Than Block-Size Key " (b"Test Using Larger Than Block-Size Key "
b"and Larger Than One Block-Size Data"), b"and Larger Than One Block-Size Data"),
@@ -235,17 +292,17 @@ def test_multiline_bytes_tripquote_literals(each_version):
_parse(s, each_version) _parse(s, each_version)
def test_ellipsis(works_ge_py3, each_version): def test_ellipsis(works_in_py, each_version):
works_ge_py3.parse("...") works_in_py.parse("...")
_parse("[0][...]", version=each_version) _parse("[0][...]", version=each_version)
def test_dict_unpacking(works_ge_py35): def test_dict_unpacking(works_in_py):
works_ge_py35.parse("{**dict(a=3), foo:2}") works_in_py.parse("{**dict(a=3), foo:2}")
def test_multiline_str_literals(each_version): def test_multiline_str_literals(each_version):
s = u""" s = """
md5test("\xaa" * 80, md5test("\xaa" * 80,
("Test Using Larger Than Block-Size Key " ("Test Using Larger Than Block-Size Key "
"and Larger Than One Block-Size Data"), "and Larger Than One Block-Size Data"),
@@ -254,24 +311,24 @@ def test_multiline_str_literals(each_version):
_parse(s, each_version) _parse(s, each_version)
def test_py2_backticks(works_in_py2): def test_py2_backticks(works_not_in_py):
works_in_py2.parse("`1`") works_not_in_py.parse("`1`")
def test_py2_string_prefixes(works_in_py2): def test_py2_string_prefixes(works_not_in_py):
works_in_py2.parse("ur'1'") works_not_in_py.parse("ur'1'")
works_in_py2.parse("Ur'1'") works_not_in_py.parse("Ur'1'")
works_in_py2.parse("UR'1'") works_not_in_py.parse("UR'1'")
_invalid_syntax("ru'1'", works_in_py2.version) _invalid_syntax("ru'1'", works_not_in_py.version)
def py_br(each_version): def py_br(each_version):
_parse('br""', each_version) _parse('br""', each_version)
def test_py3_rb(works_ge_py3): def test_py3_rb(works_in_py):
works_ge_py3.parse("rb'1'") works_in_py.parse("rb'1'")
works_ge_py3.parse("RB'1'") works_in_py.parse("RB'1'")
def test_left_recursion(): def test_left_recursion():
@@ -279,12 +336,22 @@ def test_left_recursion():
generate_grammar('foo: foo NAME\n', tokenize.PythonTokenTypes) generate_grammar('foo: foo NAME\n', tokenize.PythonTokenTypes)
def test_ambiguities(): @pytest.mark.parametrize(
with pytest.raises(ValueError, match='ambiguous'): 'grammar, error_match', [
generate_grammar('foo: bar | baz\nbar: NAME\nbaz: NAME\n', tokenize.PythonTokenTypes) ['foo: bar | baz\nbar: NAME\nbaz: NAME\n',
r"foo is ambiguous.*given a PythonTokenTypes\.NAME.*bar or baz"],
with pytest.raises(ValueError, match='ambiguous'): ['''foo: bar | baz\nbar: 'x'\nbaz: "x"\n''',
generate_grammar('''foo: bar | baz\nbar: 'x'\nbaz: "x"\n''', tokenize.PythonTokenTypes) r"foo is ambiguous.*given a ReservedString\(x\).*bar or baz"],
['''foo: bar | 'x'\nbar: 'x'\n''',
with pytest.raises(ValueError, match='ambiguous'): r"foo is ambiguous.*given a ReservedString\(x\).*bar or foo"],
generate_grammar('''foo: bar | 'x'\nbar: 'x'\n''', tokenize.PythonTokenTypes) # An ambiguity with the second (not the first) child of a production
['outer: "a" [inner] "b" "c"\ninner: "b" "c" [inner]\n',
r"outer is ambiguous.*given a ReservedString\(b\).*inner or outer"],
# An ambiguity hidden by a level of indirection (middle)
['outer: "a" [middle] "b" "c"\nmiddle: inner\ninner: "b" "c" [inner]\n',
r"outer is ambiguous.*given a ReservedString\(b\).*middle or outer"],
]
)
def test_ambiguities(grammar, error_match):
with pytest.raises(ValueError, match=error_match):
generate_grammar(grammar, tokenize.PythonTokenTypes)
+2 -7
@@ -1,9 +1,4 @@
try: from itertools import zip_longest
from itertools import zip_longest
except ImportError:
# Python 2
from itertools import izip_longest as zip_longest
from codecs import BOM_UTF8 from codecs import BOM_UTF8
import pytest import pytest
@@ -44,7 +39,7 @@ def test_simple_prefix_splitting(string, tokens):
else: else:
end_pos = start_pos[0], start_pos[1] + len(expected) + len(pt.spacing) end_pos = start_pos[0], start_pos[1] + len(expected) + len(pt.spacing)
#assert start_pos == pt.start_pos # assert start_pos == pt.start_pos
assert end_pos == pt.end_pos assert end_pos == pt.end_pos
start_pos = end_pos start_pos = end_pos
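With the Python 2 fallback gone, `zip_longest` is imported directly from `itertools`. A small sketch of why it suits expected-versus-actual comparisons follows; the token lists are made up for illustration.

```python
from itertools import zip_longest

expected = ["#", " ", "comment"]  # illustrative values only
actual = ["#", " "]

# The shorter iterable is padded with None, so a length mismatch stays visible
# instead of being silently truncated as it would be with zip().
pairs = list(zip_longest(expected, actual))
mismatches = [(e, a) for e, a in pairs if e != a]
```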
+175 -26
@@ -7,6 +7,8 @@ import warnings
import pytest import pytest
import parso import parso
from textwrap import dedent
from parso._compatibility import is_pypy from parso._compatibility import is_pypy
from .failing_examples import FAILING_EXAMPLES, indent, build_nested from .failing_examples import FAILING_EXAMPLES, indent, build_nested
@@ -37,10 +39,30 @@ def test_python_exception_matches(code):
error, = errors error, = errors
actual = error.message actual = error.message
assert actual in wanted assert actual in wanted
# Somehow in Python3.3 the SyntaxError().lineno is sometimes None # Somehow in Python2.7 the SyntaxError().lineno is sometimes None
assert line_nr is None or line_nr == error.start_pos[0] assert line_nr is None or line_nr == error.start_pos[0]
def test_non_async_in_async():
"""
This example doesn't work with FAILING_EXAMPLES, because the line numbers
are not always the same / incorrect in Python 3.8.
"""
# Raises multiple errors in previous versions.
code = 'async def foo():\n def nofoo():[x async for x in []]'
wanted, line_nr = _get_actual_exception(code)
errors = _get_error_list(code)
if errors:
error, = errors
actual = error.message
assert actual in wanted
if sys.version_info[:2] < (3, 8):
assert line_nr == error.start_pos[0]
else:
assert line_nr == 0 # For whatever reason this is zero in Python 3.8+
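These tests compare the reported issue against what CPython itself raises for the same code. One way to obtain that reference message and line number is sketched below. `get_cpython_syntax_error` is a hypothetical helper written for illustration, not the `_get_actual_exception` function used in this file.

```python
def get_cpython_syntax_error(code):
    """Return ("SyntaxError: <msg>", lineno) for invalid code, else None."""
    try:
        compile(code, "<unknown>", "exec")
    except SyntaxError as e:
        # e.msg is the bare message; e.lineno may be None in some cases.
        return "SyntaxError: " + e.msg, e.lineno
    return None


message, line_nr = get_cpython_syntax_error("1 +")
```

The exact wording of `message` differs between interpreter versions, which is why the test accepts a list of wanted messages rather than a single string.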
@pytest.mark.parametrize( @pytest.mark.parametrize(
('code', 'positions'), [ ('code', 'positions'), [
('1 +', [(1, 3)]), ('1 +', [(1, 3)]),
@@ -95,25 +117,9 @@ def _get_actual_exception(code):
assert False, "The piece of code should raise an exception." assert False, "The piece of code should raise an exception."
# SyntaxError # SyntaxError
# Python 2.6 has a bit different error messages here, so skip it. if wanted == 'SyntaxError: assignment to keyword':
if sys.version_info[:2] == (2, 6) and wanted == 'SyntaxError: unexpected EOF while parsing': return [wanted, "SyntaxError: can't assign to keyword",
wanted = 'SyntaxError: invalid syntax' 'SyntaxError: cannot assign to __debug__'], line_nr
if wanted == 'SyntaxError: non-keyword arg after keyword arg':
# The python 3.5+ way, a bit nicer.
wanted = 'SyntaxError: positional argument follows keyword argument'
elif wanted == 'SyntaxError: assignment to keyword':
return [wanted, "SyntaxError: can't assign to keyword"], line_nr
elif wanted == 'SyntaxError: assignment to None':
# Python 2.6 does has a slightly different error.
wanted = 'SyntaxError: cannot assign to None'
elif wanted == 'SyntaxError: can not assign to __debug__':
# Python 2.6 does has a slightly different error.
wanted = 'SyntaxError: cannot assign to __debug__'
elif wanted == 'SyntaxError: can use starred expression only as assignment target':
# Python 3.4/3.4 have a bit of a different warning than 3.5/3.6 in
# certain places. But in others this error makes sense.
return [wanted, "SyntaxError: can't use starred expression here"], line_nr
elif wanted == 'SyntaxError: f-string: unterminated string': elif wanted == 'SyntaxError: f-string: unterminated string':
wanted = 'SyntaxError: EOL while scanning string literal' wanted = 'SyntaxError: EOL while scanning string literal'
elif wanted == 'SyntaxError: f-string expression part cannot include a backslash': elif wanted == 'SyntaxError: f-string expression part cannot include a backslash':
@@ -171,12 +177,13 @@ def test_statically_nested_blocks():
def test_future_import_first(): def test_future_import_first():
def is_issue(code, *args): def is_issue(code, *args, **kwargs):
code = code % args code = code % args
return bool(_get_error_list(code)) return bool(_get_error_list(code, **kwargs))
i1 = 'from __future__ import division' i1 = 'from __future__ import division'
i2 = 'from __future__ import absolute_import' i2 = 'from __future__ import absolute_import'
i3 = 'from __future__ import annotations'
assert not is_issue(i1) assert not is_issue(i1)
assert not is_issue(i1 + ';' + i2) assert not is_issue(i1 + ';' + i2)
assert not is_issue(i1 + '\n' + i2) assert not is_issue(i1 + '\n' + i2)
@@ -187,6 +194,8 @@ def test_future_import_first():
assert not is_issue('""\n%s;%s', i1, i2) assert not is_issue('""\n%s;%s', i1, i2)
assert not is_issue('"";%s;%s ', i1, i2) assert not is_issue('"";%s;%s ', i1, i2)
assert not is_issue('"";%s\n%s ', i1, i2) assert not is_issue('"";%s\n%s ', i1, i2)
assert not is_issue(i3, version="3.7")
assert is_issue(i3, version="3.6")
assert is_issue('1;' + i1) assert is_issue('1;' + i1)
assert is_issue('1\n' + i1) assert is_issue('1\n' + i1)
assert is_issue('"";1\n' + i1) assert is_issue('"";1\n' + i1)
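The 3.6 versus 3.7 split for `from __future__ import annotations` matches the metadata the standard library records for that feature, which can be read as follows.

```python
import __future__

feature = __future__.annotations

# The first release in which the future import is accepted.
optional_since = feature.getOptionalRelease()
```

Here `optional_since[:2]` is `(3, 7)`, so an error for grammar version 3.6 and none for 3.7 is the expected outcome.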
@@ -240,10 +249,7 @@ def test_escape_decode_literals(each_version):
# Finally bytes. # Finally bytes.
error, = _get_error_list(r'b"\x"', version=each_version) error, = _get_error_list(r'b"\x"', version=each_version)
wanted = r'SyntaxError: (value error) invalid \x escape' wanted = r'SyntaxError: (value error) invalid \x escape at position 0'
if sys.version_info >= (3, 0):
# The positioning information is only available in Python 3.
wanted += ' at position 0'
assert error.message == wanted assert error.message == wanted
@@ -255,6 +261,11 @@ def test_too_many_levels_of_indentation():
assert _get_error_list(build_nested('pass', 50, base=base)) assert _get_error_list(build_nested('pass', 50, base=base))
def test_paren_kwarg():
assert _get_error_list("print((sep)=seperator)", version="3.8")
assert not _get_error_list("print((sep)=seperator)", version="3.7")
@pytest.mark.parametrize( @pytest.mark.parametrize(
'code', [ 'code', [
"f'{*args,}'", "f'{*args,}'",
@@ -263,12 +274,52 @@ def test_too_many_levels_of_indentation():
r'fr"\""', r'fr"\""',
r'fr"\\\""', r'fr"\\\""',
r"print(f'Some {x:.2f} and some {y}')", r"print(f'Some {x:.2f} and some {y}')",
# Unparenthesized yield expression
'def foo(): return f"{yield 1}"',
] ]
) )
def test_valid_fstrings(code): def test_valid_fstrings(code):
assert not _get_error_list(code, version='3.6') assert not _get_error_list(code, version='3.6')
@pytest.mark.parametrize(
'code', [
'a = (b := 1)',
'[x4 := x ** 5 for x in range(7)]',
'[total := total + v for v in range(10)]',
'while chunk := file.read(2):\n pass',
'numbers = [y := math.factorial(x), y**2, y**3]',
'{(a:="a"): (b:=1)}',
'{(y:=1): 2 for x in range(5)}',
'a[(b:=0)]',
'a[(b:=0, c:=0)]',
'a[(b:=0):1:2]',
]
)
def test_valid_namedexpr(code):
assert not _get_error_list(code, version='3.8')
@pytest.mark.parametrize(
'code', [
'{x := 1, 2, 3}',
'{x4 := x ** 5 for x in range(7)}',
]
)
def test_valid_namedexpr_set(code):
assert not _get_error_list(code, version='3.9')
@pytest.mark.parametrize(
'code', [
'a[b:=0]',
'a[b:=0, c:=0]',
]
)
def test_valid_namedexpr_index(code):
assert not _get_error_list(code, version='3.10')
@pytest.mark.parametrize( @pytest.mark.parametrize(
('code', 'message'), [ ('code', 'message'), [
("f'{1+}'", ('invalid syntax')), ("f'{1+}'", ('invalid syntax')),
@@ -283,3 +334,101 @@ def test_invalid_fstrings(code, message):
""" """
error, = _get_error_list(code, version='3.6') error, = _get_error_list(code, version='3.6')
assert message in error.message assert message in error.message
@pytest.mark.parametrize(
'code', [
"from foo import (\nbar,\n rab,\n)",
"from foo import (bar, rab, )",
]
)
def test_trailing_comma(code):
errors = _get_error_list(code)
assert not errors
def test_continue_in_finally():
code = dedent('''\
for a in [1]:
try:
pass
finally:
continue
''')
assert not _get_error_list(code, version="3.8")
assert _get_error_list(code, version="3.7")
@pytest.mark.parametrize(
'template', [
"a, b, {target}, c = d",
"a, b, *{target}, c = d",
"(a, *{target}), c = d",
"for x, {target} in y: pass",
"for x, q, {target} in y: pass",
"for x, q, *{target} in y: pass",
"for (x, *{target}), q in y: pass",
]
)
@pytest.mark.parametrize(
'target', [
"True",
"False",
"None",
"__debug__"
]
)
def test_forbidden_name(template, target):
assert _get_error_list(template.format(target=target), version="3")
def test_repeated_kwarg():
# python 3.9+ shows which argument is repeated
assert (
_get_error_list("f(q=1, q=2)", version="3.8")[0].message
== "SyntaxError: keyword argument repeated"
)
assert (
_get_error_list("f(q=1, q=2)", version="3.9")[0].message
== "SyntaxError: keyword argument repeated: q"
)
@pytest.mark.parametrize(
('source', 'no_errors'), [
('a(a for a in b,)', False),
('a(a for a in b, a)', False),
('a(a, a for a in b)', False),
('a(a, b, a for a in b, c, d)', False),
('a(a for a in b)', True),
('a((a for a in b), c)', True),
('a(c, (a for a in b))', True),
('a(a, b, (a for a in b), c, d)', True),
]
)
def test_unparenthesized_genexp(source, no_errors):
assert bool(_get_error_list(source)) ^ no_errors
@pytest.mark.parametrize(
('source', 'no_errors'), [
('*x = 2', False),
('(*y) = 1', False),
('((*z)) = 1', False),
('a, *b = 1', True),
('a, *b, c = 1', True),
('a, (*b), c = 1', True),
('a, ((*b)), c = 1', True),
('a, (*b, c), d = 1', True),
('[*(1,2,3)]', True),
('{*(1,2,3)}', True),
('[*(1,2,3),]', True),
('[*(1,2,3), *(4,5,6)]', True),
('[0, *(1,2,3)]', True),
('{*(1,2,3),}', True),
('{*(1,2,3), *(4,5,6)}', True),
('{0, *(4,5,6)}', True)
]
)
def test_starred_expr(source, no_errors):
assert bool(_get_error_list(source, version="3")) ^ no_errors
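Note on the two parametrized tests above: `test_unparenthesized_genexp` and `test_starred_expr` pair each source with a `no_errors` flag and assert `bool(errors) ^ no_errors`, which passes exactly when "errors were found" and "no errors expected" disagree. Below is a minimal sketch of the same idiom. It uses the built-in `compile()` as a stand-in for parso's `_get_error_list`, which is an assumption made for illustration. `has_syntax_error` and `CASES` are hypothetical names, and CPython's verdict need not match parso's for every row in the tables.

```python
def has_syntax_error(source):
    """Stand-in for bool(_get_error_list(source)): True if CPython rejects it."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return True
    return False


# (source, no_errors) pairs, taken from the test_starred_expr table.
CASES = [
    ("*x = 2", False),     # bare starred assignment target
    ("a, *b = 1", True),   # starred target inside a tuple
    ("[*(1,2,3)]", True),  # unpacking inside a list display
]

for source, no_errors in CASES:
    # XOR is true only when exactly one operand is true, so a single
    # assertion covers both the "must fail" and "must pass" rows.
    assert has_syntax_error(source) ^ no_errors
```

Because one assertion covers both directions, valid and invalid inputs can share a single parametrized table.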
+115 -30
@@ -4,7 +4,6 @@ from textwrap import dedent
import pytest import pytest
from parso._compatibility import py_version
from parso.utils import split_lines, parse_version_string from parso.utils import split_lines, parse_version_string
from parso.python.token import PythonTokenTypes from parso.python.token import PythonTokenTypes
from parso.python import tokenize from parso.python import tokenize
@@ -16,6 +15,7 @@ from parso.python.tokenize import PythonToken
NAME = PythonTokenTypes.NAME NAME = PythonTokenTypes.NAME
NEWLINE = PythonTokenTypes.NEWLINE NEWLINE = PythonTokenTypes.NEWLINE
STRING = PythonTokenTypes.STRING STRING = PythonTokenTypes.STRING
NUMBER = PythonTokenTypes.NUMBER
INDENT = PythonTokenTypes.INDENT INDENT = PythonTokenTypes.INDENT
DEDENT = PythonTokenTypes.DEDENT DEDENT = PythonTokenTypes.DEDENT
ERRORTOKEN = PythonTokenTypes.ERRORTOKEN ERRORTOKEN = PythonTokenTypes.ERRORTOKEN
@@ -30,7 +30,7 @@ FSTRING_END = PythonTokenTypes.FSTRING_END
def _get_token_list(string, version=None): def _get_token_list(string, version=None):
# Load the current version. # Load the current version.
version_info = parse_version_string(version) version_info = parse_version_string(version)
return list(tokenize.tokenize(string, version_info)) return list(tokenize.tokenize(string, version_info=version_info))
def test_end_pos_one_line(): def test_end_pos_one_line():
@@ -107,7 +107,7 @@ def test_tokenize_multiline_I():
fundef = '''""""\n''' fundef = '''""""\n'''
token_list = _get_token_list(fundef) token_list = _get_token_list(fundef)
assert token_list == [PythonToken(ERRORTOKEN, '""""\n', (1, 0), ''), assert token_list == [PythonToken(ERRORTOKEN, '""""\n', (1, 0), ''),
PythonToken(ENDMARKER , '', (2, 0), '')] PythonToken(ENDMARKER, '', (2, 0), '')]
def test_tokenize_multiline_II(): def test_tokenize_multiline_II():
@@ -116,7 +116,7 @@ def test_tokenize_multiline_II():
fundef = '''""""''' fundef = '''""""'''
token_list = _get_token_list(fundef) token_list = _get_token_list(fundef)
assert token_list == [PythonToken(ERRORTOKEN, '""""', (1, 0), ''), assert token_list == [PythonToken(ERRORTOKEN, '""""', (1, 0), ''),
PythonToken(ENDMARKER, '', (1, 4), '')] PythonToken(ENDMARKER, '', (1, 4), '')]
def test_tokenize_multiline_III(): def test_tokenize_multiline_III():
@@ -125,7 +125,7 @@ def test_tokenize_multiline_III():
fundef = '''""""\n\n''' fundef = '''""""\n\n'''
token_list = _get_token_list(fundef) token_list = _get_token_list(fundef)
assert token_list == [PythonToken(ERRORTOKEN, '""""\n\n', (1, 0), ''), assert token_list == [PythonToken(ERRORTOKEN, '""""\n\n', (1, 0), ''),
PythonToken(ENDMARKER, '', (3, 0), '')] PythonToken(ENDMARKER, '', (3, 0), '')]
def test_identifier_contains_unicode(): def test_identifier_contains_unicode():
@@ -135,12 +135,7 @@ def test_identifier_contains_unicode():
''') ''')
token_list = _get_token_list(fundef) token_list = _get_token_list(fundef)
unicode_token = token_list[1] unicode_token = token_list[1]
if py_version >= 30: assert unicode_token[0] == NAME
assert unicode_token[0] == NAME
else:
# Unicode tokens in Python 2 seem to be identified as operators.
# They will be ignored in the parser, that's ok.
assert unicode_token[0] == OP
def test_quoted_strings(): def test_quoted_strings():
@@ -183,19 +178,16 @@ def test_ur_literals():
assert typ == NAME assert typ == NAME
check('u""') check('u""')
check('ur""', is_literal=not py_version >= 30) check('ur""', is_literal=False)
check('Ur""', is_literal=not py_version >= 30) check('Ur""', is_literal=False)
check('UR""', is_literal=not py_version >= 30) check('UR""', is_literal=False)
check('bR""') check('bR""')
# Starting with Python 3.3 this ordering is also possible. check('Rb""')
if py_version >= 33:
check('Rb""')
# Starting with Python 3.6 format strings where introduced. check('fr""')
check('fr""', is_literal=py_version >= 36) check('rF""')
check('rF""', is_literal=py_version >= 36) check('f""')
check('f""', is_literal=py_version >= 36) check('F""')
check('F""', is_literal=py_version >= 36)
def test_error_literal(): def test_error_literal():
@@ -230,20 +222,43 @@ def test_endmarker_end_pos():
@pytest.mark.parametrize( @pytest.mark.parametrize(
('code', 'types'), [ ('code', 'types'), [
# Indentation
(' foo', [INDENT, NAME, DEDENT]), (' foo', [INDENT, NAME, DEDENT]),
(' foo\n bar', [INDENT, NAME, NEWLINE, ERROR_DEDENT, NAME, DEDENT]), (' foo\n bar', [INDENT, NAME, NEWLINE, ERROR_DEDENT, NAME, DEDENT]),
(' foo\n bar \n baz', [INDENT, NAME, NEWLINE, ERROR_DEDENT, NAME, (' foo\n bar \n baz', [INDENT, NAME, NEWLINE, ERROR_DEDENT, NAME,
NEWLINE, ERROR_DEDENT, NAME, DEDENT]), NEWLINE, NAME, DEDENT]),
(' foo\nbar', [INDENT, NAME, NEWLINE, DEDENT, NAME]), (' foo\nbar', [INDENT, NAME, NEWLINE, DEDENT, NAME]),
# Name stuff
('1foo1', [NUMBER, NAME]),
('மெல்லினம்', [NAME]),
('²', [ERRORTOKEN]),
('ä²ö', [NAME, ERRORTOKEN, NAME]),
('ää²¹öö', [NAME, ERRORTOKEN, NAME]),
(' \x00a', [INDENT, ERRORTOKEN, NAME, DEDENT]),
(dedent('''\
class BaseCache:
a
def
b
def
c
'''), [NAME, NAME, OP, NEWLINE, INDENT, NAME, NEWLINE,
ERROR_DEDENT, NAME, NEWLINE, INDENT, NAME, NEWLINE, DEDENT,
NAME, NEWLINE, INDENT, NAME, NEWLINE, DEDENT, DEDENT]),
(' )\n foo', [INDENT, OP, NEWLINE, ERROR_DEDENT, NAME, DEDENT]),
('a\n b\n )\n c', [NAME, NEWLINE, INDENT, NAME, NEWLINE, INDENT, OP,
NEWLINE, DEDENT, NAME, DEDENT]),
(' 1 \\\ndef', [INDENT, NUMBER, NAME, DEDENT]),
] ]
) )
def test_indentation(code, types): def test_token_types(code, types):
actual_types = [t.type for t in _get_token_list(code)] actual_types = [t.type for t in _get_token_list(code)]
assert actual_types == types + [ENDMARKER] assert actual_types == types + [ENDMARKER]
def test_error_string(): def test_error_string():
t1, newline, endmarker = _get_token_list(' "\n') indent, t1, newline, token, endmarker = _get_token_list(' "\n')
assert t1.type == ERRORTOKEN assert t1.type == ERRORTOKEN
assert t1.prefix == ' ' assert t1.prefix == ' '
assert t1.string == '"' assert t1.string == '"'
@@ -304,16 +319,18 @@ def test_brackets_no_indentation():
def test_form_feed(): def test_form_feed():
error_token, endmarker = _get_token_list(dedent('''\ indent, error_token, dedent_, endmarker = _get_token_list(dedent('''\
\f"""''')) \f"""'''))
assert error_token.prefix == '\f' assert error_token.prefix == '\f'
assert error_token.string == '"""' assert error_token.string == '"""'
assert endmarker.prefix == '' assert endmarker.prefix == ''
assert indent.type == INDENT
assert dedent_.type == DEDENT
def test_carriage_return(): def test_carriage_return():
lst = _get_token_list(' =\\\rclass') lst = _get_token_list(' =\\\rclass')
assert [t.type for t in lst] == [INDENT, OP, DEDENT, NAME, ENDMARKER] assert [t.type for t in lst] == [INDENT, OP, NAME, DEDENT, ENDMARKER]
def test_backslash(): def test_backslash():
@@ -324,21 +341,89 @@ def test_backslash():
@pytest.mark.parametrize( @pytest.mark.parametrize(
('code', 'types'), [ ('code', 'types'), [
# f-strings
('f"', [FSTRING_START]), ('f"', [FSTRING_START]),
('f""', [FSTRING_START, FSTRING_END]), ('f""', [FSTRING_START, FSTRING_END]),
('f" {}"', [FSTRING_START, FSTRING_STRING, OP, OP, FSTRING_END]), ('f" {}"', [FSTRING_START, FSTRING_STRING, OP, OP, FSTRING_END]),
('f" "{}', [FSTRING_START, FSTRING_STRING, FSTRING_END, OP, OP]), ('f" "{}', [FSTRING_START, FSTRING_STRING, FSTRING_END, OP, OP]),
(r'f"\""', [FSTRING_START, FSTRING_STRING, FSTRING_END]), (r'f"\""', [FSTRING_START, FSTRING_STRING, FSTRING_END]),
(r'f"\""', [FSTRING_START, FSTRING_STRING, FSTRING_END]), (r'f"\""', [FSTRING_START, FSTRING_STRING, FSTRING_END]),
# format spec
(r'f"Some {x:.2f}{y}"', [FSTRING_START, FSTRING_STRING, OP, NAME, OP, (r'f"Some {x:.2f}{y}"', [FSTRING_START, FSTRING_STRING, OP, NAME, OP,
FSTRING_STRING, OP, OP, NAME, OP, FSTRING_END]), FSTRING_STRING, OP, OP, NAME, OP, FSTRING_END]),
# multiline f-string
('f"""abc\ndef"""', [FSTRING_START, FSTRING_STRING, FSTRING_END]),
('f"""abc{\n123}def"""', [
FSTRING_START, FSTRING_STRING, OP, NUMBER, OP, FSTRING_STRING,
FSTRING_END
]),
# a line continuation inside of an fstring_string
('f"abc\\\ndef"', [
FSTRING_START, FSTRING_STRING, FSTRING_END
]),
('f"\\\n{123}\\\n"', [
FSTRING_START, FSTRING_STRING, OP, NUMBER, OP, FSTRING_STRING,
FSTRING_END
]),
# a line continuation inside of an fstring_expr
('f"{\\\n123}"', [FSTRING_START, OP, NUMBER, OP, FSTRING_END]),
# a line continuation inside of a format spec
('f"{123:.2\\\nf}"', [
FSTRING_START, OP, NUMBER, OP, FSTRING_STRING, OP, FSTRING_END
]),
# a newline without a line continuation inside a single-line string is
# wrong, and will generate an ERRORTOKEN
('f"abc\ndef"', [
FSTRING_START, FSTRING_STRING, NEWLINE, NAME, ERRORTOKEN
]),
# a more complex example
(r'print(f"Some {x:.2f}a{y}")', [ (r'print(f"Some {x:.2f}a{y}")', [
NAME, OP, FSTRING_START, FSTRING_STRING, OP, NAME, OP, NAME, OP, FSTRING_START, FSTRING_STRING, OP, NAME, OP,
FSTRING_STRING, OP, FSTRING_STRING, OP, NAME, OP, FSTRING_END, OP FSTRING_STRING, OP, FSTRING_STRING, OP, NAME, OP, FSTRING_END, OP
]), ]),
# issue #86, a string-like in an f-string expression
('f"{ ""}"', [
FSTRING_START, OP, FSTRING_END, STRING
]),
('f"{ f""}"', [
FSTRING_START, OP, NAME, FSTRING_END, STRING
]),
] ]
) )
def test_fstring(code, types, version_ge_py36): def test_fstring_token_types(code, types, each_version):
actual_types = [t.type for t in _get_token_list(code, version_ge_py36)] actual_types = [t.type for t in _get_token_list(code, each_version)]
assert types + [ENDMARKER] == actual_types assert types + [ENDMARKER] == actual_types
@pytest.mark.parametrize(
('code', 'types'), [
# issue #87, `:=` in the outermost parentheses should be tokenized
# as a format spec marker and part of the format
('f"{x:=10}"', [
FSTRING_START, OP, NAME, OP, FSTRING_STRING, OP, FSTRING_END
]),
('f"{(x:=10)}"', [
FSTRING_START, OP, OP, NAME, OP, NUMBER, OP, OP, FSTRING_END
]),
]
)
def test_fstring_assignment_expression(code, types, version_ge_py38):
actual_types = [t.type for t in _get_token_list(code, version_ge_py38)]
assert types + [ENDMARKER] == actual_types
def test_fstring_end_error_pos(version_ge_py38):
f_start, f_string, bracket, f_end, endmarker = \
_get_token_list('f" { "', version_ge_py38)
assert f_start.start_pos == (1, 0)
assert f_string.start_pos == (1, 2)
assert bracket.start_pos == (1, 3)
assert f_end.start_pos == (1, 5)
assert endmarker.start_pos == (1, 6)
+39 -1
@@ -1,6 +1,11 @@
from codecs import BOM_UTF8 from codecs import BOM_UTF8
from parso.utils import split_lines, python_bytes_to_unicode from parso.utils import (
split_lines,
parse_version_string,
python_bytes_to_unicode,
)
import parso import parso
import pytest import pytest
@@ -63,3 +68,36 @@ def test_utf8_bom():
expr_stmt = module.children[0] expr_stmt = module.children[0]
assert expr_stmt.type == 'expr_stmt' assert expr_stmt.type == 'expr_stmt'
assert unicode_bom == expr_stmt.get_first_leaf().prefix assert unicode_bom == expr_stmt.get_first_leaf().prefix
@pytest.mark.parametrize(
('code', 'errors'), [
(b'# coding: wtf-12\nfoo', 'strict'),
(b'# coding: wtf-12\nfoo', 'replace'),
]
)
def test_bytes_to_unicode_failing_encoding(code, errors):
if errors == 'strict':
with pytest.raises(LookupError):
python_bytes_to_unicode(code, errors=errors)
else:
python_bytes_to_unicode(code, errors=errors)
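Note on `test_bytes_to_unicode_failing_encoding`: the test only pins down that an unknown codec name raises `LookupError` under `errors='strict'` and does not raise under `errors='replace'`. One way to get that behaviour is sketched below. `decode_source` is a hypothetical helper, not parso's `python_bytes_to_unicode`, and the UTF-8 fallback is an assumption the diff does not confirm.

```python
def decode_source(source, encoding, errors="strict"):
    """Hypothetical helper: decode bytes, tolerating unknown codec names."""
    try:
        return source.decode(encoding, errors)
    except LookupError:
        # bytes.decode raises LookupError for an unknown codec name,
        # whatever value `errors` has.
        if errors == "strict":
            raise
        # Assumed fallback for lenient modes: decode as UTF-8 instead.
        return source.decode("utf-8", errors)


text = decode_source(b"# coding: wtf-12\nfoo", "wtf-12", errors="replace")
```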
@pytest.mark.parametrize(
('version_str', 'version'), [
('3', (3,)),
('3.6', (3, 6)),
('3.6.10', (3, 6)),
('3.10', (3, 10)),
('3.10a9', (3, 10)),
('3.10b9', (3, 10)),
('3.10rc9', (3, 10)),
]
)
def test_parse_version_string(version_str, version):
parsed_version = parse_version_string(version_str)
if len(version) == 1:
assert parsed_version[0] == version[0]
else:
assert parsed_version == version
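Note on `test_parse_version_string`: the table implies that a patch component (`3.6.10`) and a pre-release suffix (`a9`, `b9`, `rc9`) are accepted but dropped, leaving only major and minor. A rough regex-based sketch that reproduces the table is given below. `sketch_parse_version` is a hypothetical name, and this is not parso's implementation. For a bare `'3'` it returns a one-element tuple, whereas the test checks only the major component in that case.

```python
import re

# major, optional .minor, optional .patch, optional a/b/rc pre-release suffix
_VERSION = re.compile(r"(\d+)(?:\.(\d+))?(?:\.\d+)?(?:(?:a|b|rc)\d+)?")


def sketch_parse_version(version_str):
    """Hypothetical parser: return (major,) or (major, minor)."""
    match = _VERSION.fullmatch(version_str)
    if match is None:
        raise ValueError("unsupported version string: %r" % version_str)
    major, minor = match.group(1), match.group(2)
    if minor is None:
        return (int(major),)
    # Patch level and pre-release suffix are matched but deliberately ignored.
    return (int(major), int(minor))


for text, expected in [("3", (3,)), ("3.6.10", (3, 6)), ("3.10rc9", (3, 10))]:
    assert sketch_parse_version(text) == expected
```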
-19
@@ -1,19 +0,0 @@
[tox]
envlist = py27, py33, py34, py35, py36, py37, pypy
[testenv]
extras = testing
deps =
py26,py33: pytest>=3.0.7,<3.3
py26,py33: setuptools<37
setenv =
# https://github.com/tomchristie/django-rest-framework/issues/1957
# tox corrupts __pycache__, solution from here:
PYTHONDONTWRITEBYTECODE=1
commands =
pytest {posargs:parso test}
[testenv:cov]
deps =
coverage
commands =
coverage run --source parso -m pytest
coverage report