The raw data is divided into train.dat and test.da
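Application background
This script was originally written for the MovieLens data. It splits the original data into a training file and a test file (train.dat and test.dat; the script itself writes train.txt and test.txt), mainly so that validation experiments can be run. It is simple and clear, and suitable for beginners. If you don't like it, please go easy on the criticism.

Key Technology
# -*- coding: cp936 -*-
# The original imported sklearn.cross_validation, which was removed in
# scikit-learn 0.20; train_test_split now lives in sklearn.model_selection.
from sklearn.model_selection import train_test_split

c = []
filename = r'Raw.data'  # original data

# Read every line and split it into comma-separated fields
with open(filename) as f:
    for line in f:
        items = line.strip().split(',')
        c.append(items)

# test_size is the fraction of rows that goes to the test set
c_train, c_test = train_test_split(c, test_size=0.1)

with open(r'train.txt', 'w') as out_train:  # training set
    for i in c_train:
        out_train.write(','.join(i) + '\n')

with open(r'test.txt', 'w') as out_test:  # test set
    for i in c_test:
        out_test.write(','.join(i) + '\n')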
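
For readers who want to see what the shuffle-and-split step does, here is a minimal sketch that uses only the Python standard library. It is not the author's method: the function name split_rows, the seed parameter, and the sample rows are assumptions made for illustration, and the rounding of the test-set size may differ from scikit-learn's by a row.

```python
import random


def split_rows(rows, test_size=0.1, seed=0):
    """Shuffle rows and return (train, test) lists.

    split_rows and seed are hypothetical names for this sketch;
    they do not appear in the original script.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1")
    shuffled = list(rows)                    # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed gives a repeatable split
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up rows in a "user,item,rating" layout, already split into fields
rows = [[str(u), str(u * 10), "5"] for u in range(20)]
train, test = split_rows(rows, test_size=0.1, seed=42)
print(len(train), len(test))
```

Fixing the seed means the same rows land in the test set on every run, which helps when comparing validation results across experiments.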