
How do I create a sklearn.datasets.base.Bunch object in scikit-learn from my own data?

In most of the scikit-learn algorithms, the data must be loaded as a Bunch object. In many of the tutorial examples, load_files() or other functions are used to populate the Bunch object. Functions like load_files() expect the data to be present in a certain format, but my data is stored in a different format, namely a CSV file with strings for each field.

How do I parse this and load data in the Bunch object format?

David asked Dec 10 '13 03:12



1 Answer

You can do it like this:

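import numpy as np
# In recent scikit-learn releases Bunch lives in sklearn.utils;
# older releases exposed it as sklearn.datasets.base.Bunch.
from sklearn.utils import Bunch

# Raw samples (here, short text documents)
examples = ['some text', 'another example text', 'example 3']

# One integer class label per sample, in the same order as examples
target = np.array([0, 1, 0], dtype=np.int64)

# Bunch is a dict whose keys can also be read as attributes
dataset = Bunch(data=examples, target=target)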
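
Since the question mentions a CSV file with string fields, here is a minimal sketch of one way to read such a file into the same structure. The column names ("text", "label"), the sample rows, and the helper load_csv_as_bunch are all hypothetical and only there for illustration; it also assumes the file has a header row and that one column holds the class label.

```python
import csv
import io

import numpy as np
from sklearn.utils import Bunch

# Hypothetical CSV content; in practice, pass an open file instead of StringIO.
CSV_TEXT = """text,label
some text,spam
another example text,ham
example 3,spam
"""


def load_csv_as_bunch(fileobj, data_column, target_column):
    """Read a CSV with a header row and return a Bunch with data/target fields."""
    rows = list(csv.DictReader(fileobj))
    data = [row[data_column] for row in rows]
    labels = [row[target_column] for row in rows]
    # Map string labels to integer codes; target_names[i] is the label for code i.
    target_names, target = np.unique(labels, return_inverse=True)
    return Bunch(data=data, target=target, target_names=target_names)


dataset = load_csv_as_bunch(io.StringIO(CSV_TEXT), "text", "label")
print(dataset.data)
print(dataset.target)
print(dataset.target_names)
```

For a real file you would call it as load_csv_as_bunch(open(path, newline=''), ...). The field names data, target and target_names mirror what the built-in loaders return, but a Bunch accepts any keyword arguments, so you can name them whatever suits your data.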
Hugh Perkins answered Sep 28 '22 00:09