
CSV file with Arabic characters is displayed as symbols in Excel

I am using Python to extract Arabic tweets from Twitter and save them to a CSV file, but when I open the saved file in Excel, the Arabic text is displayed as symbols. In Python, Notepad, and Word, however, it looks fine. Where is the problem?

Shams asked Feb 15 '20 13:02




3 Answers

This is a problem I run into frequently when opening CSV files that contain Arabic characters in Microsoft Excel. Try the following workaround, which I tested on the latest versions of Microsoft Excel on both Windows and macOS:

  1. Open Excel on a blank workbook

  2. On the Data tab, click the From Text button (if it is not enabled, make sure an empty cell is selected)

  3. Browse to the CSV file and select it

  4. In the Text Import Wizard, change File origin to "Unicode (UTF-8)"

  5. Click Next and, under Delimiters, select the delimiter used in your file, e.g. comma

  6. Click Finish and select where to import the data

The Arabic characters should show correctly.
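
For context on why the File origin setting matters: a CSV saved as UTF-8 without a byte order mark gives Excel no hint about its encoding, so Excel may fall back to a legacy code page. Here is a minimal Python sketch of that effect. It assumes the fallback is Windows-1252, which is only an illustration, since the actual code page depends on the system locale:

```python
# Illustration only: cp1252 is an assumed fallback code page; the one
# Excel actually picks depends on the system locale.
text = "اردو"
raw = text.encode("utf-8")  # the bytes as they are stored in the CSV file

# Decoding the UTF-8 bytes with the wrong code page produces "symbols".
garbled = raw.decode("cp1252", errors="replace")

# Decoding the same bytes as UTF-8 recovers the original text.
restored = raw.decode("utf-8")

print(garbled)
print(restored)
```

The bytes in the file are the same in both cases; only the encoding used to interpret them differs, which is what step 4 corrects.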

mhalshehri answered Sep 19 '22 12:09


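Just use encoding='utf-8-sig' instead of encoding='utf-8'. It writes a byte order mark (BOM) at the start of the file, which lets Excel recognize the file as UTF-8:

import csv

data = "اردو"

# newline='' is what the csv module expects, so it can control line endings itself
with open('example.csv', 'w', encoding='utf-8-sig', newline='') as fh:
    writer = csv.writer(fh)
    writer.writerow([data])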

It worked on my machine.
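
If the CSV already exists as plain UTF-8, the same idea can be applied after the fact by rewriting the file with a BOM. This is a minimal sketch, assuming the source file really is UTF-8. The helper names add_bom and has_bom are hypothetical and not part of any library:

```python
import codecs


def add_bom(src_path, dst_path):
    """Rewrite a UTF-8 text file as UTF-8 with a BOM (utf-8-sig)."""
    # Reading with utf-8-sig strips an existing BOM, so the output
    # never ends up with two of them.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        content = src.read()
    # newline="" on both sides leaves the original line endings untouched.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(content)


def has_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as fh:
        return fh.read(3) == codecs.BOM_UTF8
```

For example, add_bom('tweets.csv', 'tweets_excel.csv') (both file names are placeholders) would produce a copy that Excel should recognize as UTF-8 when opened directly.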

saifhassan answered Sep 17 '22 12:09


The only solution I have found for saving Arabic text from Python into a file that Excel displays correctly is to use pandas and write an .xlsx file instead of a CSV. The .xlsx format stores its text as Unicode, so Excel does not have to guess the encoding, and it has worked far better for me. Here is the code I put together:

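import pandas as pd

def turn_into_csv(data, csver):
    # Despite the name, this writes an .xlsx workbook, not a CSV.
    # data: iterable of tweet dicts with "id" and "full_text" keys.
    ids = []
    texts = []
    for each in data:
        texts.append(each["full_text"])
        ids.append(str(each["id"]))  # as text; Excel loses precision on very long numbers

    df = pd.DataFrame({'ID': ids, 'FULL_TEXT': texts})
    # The context manager saves and closes the workbook on exit;
    # writer.save() and to_excel(encoding=...) were removed in pandas 2.0.
    with pd.ExcelWriter(csver + '.xlsx', engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1')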

Kream answered Sep 19 '22 12:09