Fix encoding problems (Windows, python >= 3) #1469

Merged
menshikh-iv merged 3 commits into develop from windows-enc-fix
Jul 6, 2017

Contributor

@menshikh-iv menshikh-iv commented Jul 6, 2017

Second part of #1441

@menshikh-iv menshikh-iv merged commit e9e223e into develop Jul 6, 2017
@menshikh-iv menshikh-iv deleted the windows-enc-fix branch July 6, 2017 10:40

 d.save_as_text(tmpf)
-with open(tmpf) as file:
+with codecs.open(tmpf, 'r', encoding='utf-8') as file:
Owner

@piskvorky piskvorky Jul 15, 2017

-1: please use smart_open, binary mode rb, and convert to unicode explicitly (avoid codecs).
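A minimal sketch of what this suggestion might look like in a test: open the file in binary mode and decode each line explicitly, so the result does not depend on the platform's default text encoding (which on Windows with Python 3 is often not UTF-8). The helper name `read_unicode_lines` is hypothetical, and the fallback to the built-in `open` is an assumption added only so the sketch runs without `smart_open` installed.

```python
import os
import tempfile

try:
    # smart_open is a third-party package; recent versions expose `open`.
    from smart_open import open as open_file
except ImportError:
    # Assumption: fall back to the built-in so the sketch still runs.
    open_file = open


def read_unicode_lines(path, encoding="utf-8"):
    """Read raw bytes in 'rb' mode and decode every line explicitly."""
    with open_file(path, "rb") as fin:
        return [line.decode(encoding) for line in fin]


# Write a small UTF-8 file, then read it back independently of the locale.
fd, tmpf = tempfile.mkstemp()
os.close(fd)
with open(tmpf, "wb") as fout:
    fout.write("1\tprvé\t1\n2\tslovo\t2\n".encode("utf-8"))

lines = read_unicode_lines(tmpf)
os.remove(tmpf)
```

Decoding by hand keeps the encoding visible at the call site, which appears to be the point of the request to avoid `codecs`.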


 d.save_as_text(tmpf, sort_by_word=False)
-with open(tmpf) as file:
+with codecs.open(tmpf, 'r', encoding='utf-8') as file:
Owner

@piskvorky piskvorky Jul 15, 2017

-1: please use smart_open, binary mode rb, and convert to unicode explicitly (avoid codecs).

-no_num_docs_serialization = "1\tprvé\t1\n2\tslovo\t2\n"
-with open(tmpf, "w") as file:
+no_num_docs_serialization = to_utf8("1\tprvé\t1\n2\tslovo\t2\n")
+with open(tmpf, "wb") as file:
Owner

Prefer smart_open.

-no_num_docs_serialization = "2\n1\tprvé\t1\n2\tslovo\t2\n"
-with open(tmpf, "w") as file:
+no_num_docs_serialization = to_utf8("2\n1\tprvé\t1\n2\tslovo\t2\n")
+with open(tmpf, "wb") as file:
Owner

Prefer smart_open.
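For the write side, a sketch under similar assumptions: encode the text to UTF-8 bytes first, then write them in `"wb"` mode through `smart_open` where it is available. The helper `write_utf8` is hypothetical, and `str.encode("utf-8")` stands in here for the `to_utf8` call in the diff, on the assumption that the two behave the same for `str` input.

```python
import os
import tempfile

try:
    # smart_open is a third-party package; recent versions expose `open`.
    from smart_open import open as open_file
except ImportError:
    # Assumption: fall back to the built-in so the sketch still runs.
    open_file = open


def write_utf8(path, text):
    """Encode text explicitly and write the resulting bytes in binary mode."""
    with open_file(path, "wb") as fout:
        fout.write(text.encode("utf-8"))


fd, tmpf = tempfile.mkstemp()
os.close(fd)
write_utf8(tmpf, "2\n1\tprvé\t1\n2\tslovo\t2\n")

# Inspect the raw bytes to confirm the on-disk encoding.
with open(tmpf, "rb") as fin:
    raw = fin.read()
os.remove(tmpf)
```

Writing bytes rather than text means the file contents are the same on every platform, regardless of the locale's preferred encoding.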

2 participants