import pandas as pd
# Pass the path directly so pandas opens and closes the file itself
df = pd.read_csv('text.csv', encoding='utf-8', engine='python')
# Read row by row into dictionaries; assumes each row has three fields: item_id, info, title
dict_item_id = {}
dict_info = {}
dict_title = {}
dict_item_id_reverse = {}
for i in range(len(df)):
    dict_item_id[i] = df["item_id"][i]
    dict_info[i] = df["info"][i]
    dict_title[i] = df["title"][i]
    dict_item_id_reverse[df["item_id"][i]] = i

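The dictionary key i (the row number) ties together the item_id, info, and title values of the same row, which makes later processing easier. dict_item_id_reverse maps an item_id back to its row number.

The goal is to extract the value of each field from every row.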
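
The same mappings can also be built without an explicit loop. The following is only a minimal sketch, assuming the default row numbering (0, 1, 2, ...) that read_csv produces and unique item_id values; the in-memory sample data is made up for illustration and stands in for text.csv.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for text.csv
sample = io.StringIO("item_id,info,title\n101,red,Apple\n102,yellow,Banana\n")
df = pd.read_csv(sample)

# Series.to_dict() maps each index label (the row number here) to its value
dict_item_id = df["item_id"].to_dict()
dict_info = df["info"].to_dict()
dict_title = df["title"].to_dict()

# Reverse lookup: item_id -> row number
dict_item_id_reverse = {item_id: i for i, item_id in dict_item_id.items()}

# Look up the other fields of a row, starting from an item_id
row = dict_item_id_reverse[102]
print(dict_info[row], dict_title[row])
```

If item_id values repeat, the reverse dictionary keeps only the last row for each value, which is also how the loop version behaves.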

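1. When pandas.read_csv() reads a file whose separator is '::', it emits the following warning:

       Warning: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex)

       Fix: the separator is longer than one character, so pandas treats it as a regular expression, which only the Python engine supports. Pass engine='python' explicitly, as follows: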

header = ['user_id', 'item_id', 'rating', 'timestamp']
df = pd.read_csv("D:/ratings.dat", sep='::', names=header,engine='python')
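
To check that the fix behaves as described, the call can be wrapped in a warning recorder. This is a minimal sketch; the two sample rows are made up for illustration and only imitate the '::'-separated layout used above.

```python
import io
import warnings
import pandas as pd

# Hypothetical sample rows in a '::'-separated layout
raw = "1::1193::5::978300760\n1::661::3::978302109\n"
header = ['user_id', 'item_id', 'rating', 'timestamp']

# Record every warning raised while parsing
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = pd.read_csv(io.StringIO(raw), sep='::', names=header, engine='python')

# Keep only pandas parser warnings; none are expected with engine='python'
parser_warnings = [w for w in caught
                   if issubclass(w.category, pd.errors.ParserWarning)]
print(len(parser_warnings))
print(df.shape)
```

Removing engine='python' from the call should make the ParserWarning quoted above show up in the recorded list.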


 
