How to drop rows in an H2OFrame?

Tags:

python

h2o

I've worked with the h2o R package for quite a while now, but have recently had to move to the Python package.

For the most part, an H2OFrame is designed to work like a pandas DataFrame object. However, there are several hurdles I haven't managed to get over. In pandas, if I want to drop some rows:

df.drop([0,1,2], axis=0, inplace=True)

However, I cannot figure out how to do the same with an H2OFrame:

frame.drop([0,1,2], axis=0)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-0eff75c48e35> in <module>()
----> frame.drop([0,1,2], axis=0)

TypeError: drop() got an unexpected keyword argument 'axis'

Their GitHub source documents that the drop method is only for columns, so the obvious approach doesn't work:

def drop(self, i):
    """Drop a column from the current H2OFrame.

Is there a way to drop rows from an H2OFrame?

asked Jul 12 '16 17:07

TayTay

2 Answers

Currently, the H2OFrame.drop method does not support this, but we have added a ticket to add support for dropping multiple rows (and multiple columns).

In the meantime, you can subset rows by an index:

import h2o
h2o.init(nthreads = -1)

hf = h2o.H2OFrame([[1,3],[4,5],[3,0],[5,5]])  # 4 rows x 2 columns
hf2 = hf[[1,3],:]  # Keep some of the rows by passing an index

Note that the index list, [1,3], is ordered. If you try to pass [3,1] instead, you will get an error. H2O will not reorder the rows, and this is its way of telling you that. If you have a list of out-of-order indexes, just wrap the sorted function around it first.

hf2 = hf[sorted([3,1]),:]  # sorted([3,1]) gives [1,3], which H2O accepts

Lastly, if you prefer, it's also okay to reassign the new subsetted frame to the original frame name, as follows:

hf = hf[[1,3],:]
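
To drop rows with this workaround, rather than list the rows to keep, one option is to compute the complement of the indices you want to remove. This is a minimal sketch, not part of h2o: `rows_to_keep` and `drop_rows_by_subsetting` are hypothetical helper names, and the sketch assumes the frame exposes its row count as `hf.nrow` and accepts the `hf[rows, :]` subsetting shown above.

```python
def rows_to_keep(nrow, drop):
    """Return the ascending row indices in range(nrow) that are not in `drop`."""
    drop_set = set(drop)  # set lookup; also tolerates duplicates and any order
    return [i for i in range(nrow) if i not in drop_set]


def drop_rows_by_subsetting(hf, drop):
    """Hypothetical helper: drop rows by keeping every other row.

    `hf` is assumed to be an H2OFrame (or anything with `.nrow` and
    `hf[rows, :]` indexing). The kept indices are already in ascending order,
    so no extra call to sorted() is needed.
    """
    keep = rows_to_keep(hf.nrow, drop)
    return hf[keep, :]
```

With the four-row frame above, `drop_rows_by_subsetting(hf, [0, 2])` would subset with `hf[[1, 3], :]`, the same call as in the example.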

answered Oct 06 '22 18:10

Erin LeDell

Since this is now supported I wanted to highlight the comment that says how to drop by index:

df = df.drop([0,1,2], axis=0)

With axis=1 (the default), drop removes columns; with axis=0, it removes rows. The signature is:

drop(index, axis=1)

where index is a list of column indices, column names, or row indices to drop; or a string to drop a single column by name; or an int to drop a single column by index.
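
As a rough sketch of how this might be used, assuming an h2o version recent enough that `drop` accepts the `axis` argument: `drop_rows` and `demo` are hypothetical names introduced here for illustration, and sorting the indices first is a conservative assumption carried over from the subsetting answer, not a documented requirement of `drop`.

```python
def drop_rows(frame, row_indices):
    """Hypothetical wrapper: drop rows from an H2OFrame via drop(..., axis=0)."""
    # De-duplicate and sort the indices before passing them on (an assumption,
    # mirroring the ordered-index requirement of row subsetting).
    return frame.drop(sorted(set(row_indices)), axis=0)


def demo():
    """Needs the h2o package and a local cluster; not run automatically."""
    import h2o

    h2o.init()
    hf = h2o.H2OFrame([[1, 3], [4, 5], [3, 0], [5, 5]])
    # Drop the first and third rows; the result is a new frame.
    return drop_rows(hf, [0, 2])
```

As in the one-liner above, the result is assigned back (`df = df.drop(...)`) rather than relying on an in-place change.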

answered Oct 06 '22 20:10

Lauren
