Write and read a CSV file in Python

Question


I have a text file containing words in a non-English alphabet. I want to open it, do some preprocessing, save the result as a CSV file, and use that file somewhere else.

The code to read the file and store its lines:

with open('file.txt', encoding="utf-8") as f:
    train = f.read().splitlines()

Then I create a DataFrame and save it with:

df.to_csv('file.csv', index=True, encoding="utf-8")

Up to this point everything seems fine, but when I try to open file.csv with this code:

train = pd.read_csv('file.csv', encoding="utf-8")

I get this:

Process finished with exit code -1073740940 (0xC0000374)

and the script stops without reaching the next lines.

Also, when I open the file with ISO-8859-1 encoding, the read succeeds, but when I print the head of the CSV it just prints question marks ('?').

Does anyone know what is going wrong?

Any kind of help will be appreciated.

python

pandas

csv

unicode

asked on Stack Overflow Aug 11, 2017 by

Ehsan Mehralian • edited Aug 11, 2017 by

daneeq

2 Answers

I tried reproducing it with this code:

import pandas as pd

with open('persian.txt', encoding="utf-8") as f:
    train = f.read().splitlines() 
    df = pd.DataFrame({'text': train})
    df.to_csv('file.csv', index=True, encoding="utf-8")
    train = pd.read_csv('file.csv', encoding="utf-8")

with a txt file containing two lines of sample Persian text. It ran without any problems in Python 3, producing this csv:

    text
0   همهٔ افراد بشر آزاد به دنیا می‌آیند و حیثیت و حقوق شان با هم برابر است
1   همه اندیشه و وجدان دارند و باید در برابر یکدیگر با روح برادری رفتار کنند.

Can you provide more detail on the text properties and the operations you did in the dataframe processing, or identify the line where the reading breaks? You might be producing some invalid characters on the way.
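
One possible way to look for invalid characters and find the line where decoding breaks is to read the file as raw bytes and decode each line separately. This is only a minimal sketch using the standard library: the helper name `find_undecodable_lines` and the sample file are hypothetical, and it only detects bytes that are invalid in the chosen encoding, so it may not explain a crash that happens inside native code.

```python
import os
import tempfile


def find_undecodable_lines(path, encoding="utf-8"):
    """Return (line_number, error_message) pairs for lines that fail to decode."""
    bad = []
    # Read raw bytes so that one bad byte does not abort the whole read.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                bad.append((lineno, str(exc)))
    return bad


# Hypothetical sample file: line 2 contains bytes that are invalid in UTF-8.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "wb") as f:
    f.write("text\n".encode("utf-8"))
    f.write(b"\xff\xfe broken\n")
    f.write("سلام\n".encode("utf-8"))

bad_lines = find_undecodable_lines(sample_path)
for lineno, message in bad_lines:
    print(f"line {lineno}: {message}")
```

Running the same check with `encoding="iso-8859-1"` reports no bad lines, because that codec maps every possible byte to some character. That is why a read with ISO-8859-1 can succeed and still produce the wrong text.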

answered on Stack Overflow Aug 11, 2017 by

daneeq

I was going crazy trying to write Persian text to a CSV file. Finally, this worked for me:

data.to_csv(r'hi.csv', encoding='utf-8-sig')
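
For context, `utf-8-sig` is Python's UTF-8 codec variant that writes a byte order mark (BOM, the bytes EF BB BF) at the start of the file and strips it again when reading. Some spreadsheet programs rely on that marker to recognise a file as UTF-8. The sketch below shows the effect at the byte level. It uses the standard library `csv` module rather than pandas, and the temporary file and sample rows are made up for illustration.

```python
import csv
import os
import tempfile

rows = [["text"], ["سلام دنیا"]]  # made-up sample data

path = os.path.join(tempfile.mkdtemp(), "hi.csv")

# Writing with utf-8-sig puts a BOM in front of the CSV content.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows(rows)

with open(path, "rb") as f:
    first_bytes = f.read(3)
print(first_bytes == b"\xef\xbb\xbf")

# Reading with utf-8-sig removes the BOM again, so the header is clean.
with open(path, encoding="utf-8-sig", newline="") as f:
    read_back = list(csv.reader(f))
print(read_back == rows)

# Reading with plain utf-8 keeps the BOM as an invisible character
# at the start of the first cell.
with open(path, encoding="utf-8", newline="") as f:
    with_bom = list(csv.reader(f))
print(with_bom[0][0] == "\ufefftext")
```

With pandas, the matching read would pass `encoding='utf-8-sig'` to `pd.read_csv` as well, so the BOM does not end up in the first column name.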

answered on Stack Overflow Apr 23, 2021 by

ghazale

User contributions licensed under CC BY-SA 3.0