This is a very intentional change. Previously form feeds were handled very
poorly and sometimes were not counted as indentation. This obviously makes
sense. But at the same time indentation is very tricky to deal with (both for
editors and parso).
Especially in the diff parser this led to a lot of very weird issues. The
decision probably makes sense since:
1. Almost nobody uses form feeds in the first place.
2. People who use form feeds, like Barry Warsaw, often put a newline after them
(e.g. Python's email.__init__).
3. If you write an editor you want to be able to identify a unicode character
with a clear line/column. This would not be the case if form feeds were just
ignored when counting.
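Point 3 can be illustrated with a minimal sketch. The helper
`offset_to_position` is hypothetical and not part of parso or Jedi. It only
shows that when every character, form feeds included, occupies exactly one
column, an offset maps to a single unambiguous position. The 1-based lines and
0-based columns are an assumption made for this illustration.

```python
def offset_to_position(source, offset):
    """Map a character offset to (line, column); hypothetical helper."""
    before = source[:offset]
    # Count "\n" explicitly: str.splitlines() would also split on "\f".
    line = before.count("\n") + 1
    # Every character after the last newline counts as one column,
    # including a form feed.
    column = offset - (before.rfind("\n") + 1)
    return line, column


source = "x = 1\n\f foo()\n"
position = offset_to_position(source, source.index("foo"))
print(position)  # (2, 2): the form feed and the space each take one column
```

If form feeds were skipped while counting, `foo` would be reported at column 1
even though it is the third character on its line.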
Form feeds will still work in Jedi, will not cause parse errors and in general
you should be fine using them. It might just cause Jedi to count them as
indentation **if** you use them like '\f foo()'. This is however confusing for
most editors anyway. It leads to a weird display, e.g. in Vim, even if it's
perfectly valid code in Python.
Since parso is a code analysis parser and not the language's parser, I think
it's fine to ignore this edge case.
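As a quick check of the claim that form feeds do not break valid code, the
following sketch uses CPython's own compiler rather than parso or Jedi. It only
covers the simplest case, a form feed at the very start of a line, which the
language reference says is ignored for indentation calculations.

```python
# A form feed at the start of a line is ignored by CPython when it
# computes indentation, so this module compiles and runs.
source = "\fx = 1\n"
namespace = {}
exec(compile(source, "<form-feed-demo>", "exec"), namespace)
print(namespace["x"])
```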
Line continuation characters are valid inside of strings, but weren't
handled correctly in certain cases with f-strings, due to some small
tokenizer bugs.
This pull request addresses those issues and adds tests to validate
the new logic.
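For reference, here is a minimal sketch of the language behavior the tokenizer
has to match. It runs the source through CPython's compiler, not through
parso's tokenizer, and the variable names are made up for the example. The
f-string contains a backslash followed by a newline, which Python treats as a
line continuation inside the string.

```python
# Source text of an assignment whose f-string contains a backslash
# followed by a newline (a line continuation inside the string).
source = "name = 'parso'\nx = f'hello \\\n{name}'\n"
namespace = {}
exec(compile(source, "<fstring-demo>", "exec"), namespace)
# The backslash-newline pair is dropped from the resulting string.
print(repr(namespace["x"]))
```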
* Don't mutate the standard library token.tok_name dictionary
Fixes #41.
* More robust test that tok_name isn't mutated
This test now works in Python 2.7 and actually tests something in Python 3.7.
It is also better because it tests the whole dictionary instead of just
one token.
* Fix test_tok_name_copied in Python 3.7 and PyPy
Apparently Python 3.7 adds N_TOKENS to the tok_name dictionary, and PyPy
doesn't have NT_OFFSET in it.
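The idea behind these commits can be sketched as follows. This is an
illustration under assumptions, not parso's actual code: `CUSTOM_TOKEN` and its
numbering are hypothetical. The point is to work on a private copy of
`token.tok_name` and to compare the whole dictionary afterwards, so the check
does not depend on whether entries such as N_TOKENS or NT_OFFSET exist in a
given interpreter.

```python
import token

# Snapshot of the standard-library mapping before doing anything.
snapshot = dict(token.tok_name)

# Work on a private copy instead of the shared module-level dict.
tok_name = token.tok_name.copy()

# Hypothetical custom token id, chosen not to collide with existing ones.
CUSTOM_TOKEN = max(tok_name) + 1
tok_name[CUSTOM_TOKEN] = "CUSTOM_TOKEN"

# Compare the whole dictionary, not just one entry.
assert token.tok_name == snapshot
assert CUSTOM_TOKEN not in token.tok_name
```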