I need to merge two DataFrames on a greater-than-or-equal (>=) condition, but merge only supports equality. Is there any way to do this? Thanks!


Does pandas dataframe merge work with greater or less?

1 Answer

I don't know of a way to express the following with merge or join syntax in pandas:

SELECT * 
FROM a 
INNER JOIN b 
ON a.column1 >= b.column1 AND a.column1 <= b.column2

But the query above can also be written implicitly as:

SELECT * 
FROM a, b 
WHERE a.column1 >= b.column1 AND a.column1 <= b.column2

This is basically the old join syntax and should do exactly the same thing, performance included. It takes the Cartesian product of the two tables (a cross join) and then selects from it using the WHERE condition, which is easy to implement in pandas. It can be heavy on memory, since the intermediate frame has len(a) * len(b) rows, but it should be fast.

First, the FROM a, b clause (we temporarily assign a column with the same value in every row, so we can cross join over it):

df = pd.merge(a.assign(key=0), b.assign(key=0), on='key').drop('key', axis=1)
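On pandas 1.2 or later, the same cross join can also be requested directly with how="cross", without the temporary key. A minimal sketch, assuming toy frames a and b that are invented here for illustration (only the column names follow the question):

```python
import pandas as pd

# Hypothetical toy data; not from the original question.
a = pd.DataFrame({"column1": [1, 5, 9]})
b = pd.DataFrame({"column1": [0, 4], "column2": [3, 8]})

# Temporary-key cross join, as in the line above.
via_key = pd.merge(a.assign(key=0), b.assign(key=0), on="key").drop("key", axis=1)

# Built-in cross join (assumes pandas >= 1.2).
via_cross = a.merge(b, how="cross")

# Both frames have len(a) * len(b) rows.
print(via_key.shape, via_cross.shape)
print(list(via_cross.columns))
```

Either form materializes every pair of rows, so the memory caveat applies to both.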

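Then use boolean indexing (our WHERE clause) to filter the frame. Note that pandas adds the _x and _y suffixes only to column names that appear in both frames, so column2_y exists only if a also has a column2; otherwise refer to it as column2: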

df[(df["column1_x"] >= df["column1_y"]) & (df["column1_x"] <= df["column2_y"])]
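Putting the two steps together, here is a small end-to-end sketch. The frames a and b are made up for illustration, and the add_suffix calls are my own addition so that the column1_x, column1_y and column2_y names exist regardless of which columns the two frames happen to share:

```python
import pandas as pd

# Hypothetical toy data: a holds values, b holds [column1, column2] ranges.
a = pd.DataFrame({"column1": [1, 5, 9]})
b = pd.DataFrame({"column1": [0, 4], "column2": [3, 8]})

# Suffix every column up front so the names are predictable,
# then cross join over a temporary constant key.
left = a.add_suffix("_x").assign(key=0)
right = b.add_suffix("_y").assign(key=0)
df = pd.merge(left, right, on="key").drop("key", axis=1)

# WHERE a.column1 >= b.column1 AND a.column1 <= b.column2
mask = (df["column1_x"] >= df["column1_y"]) & (df["column1_x"] <= df["column2_y"])
result = df[mask].reset_index(drop=True)
print(result)
```

With this data, 1 falls in the range [0, 3] and 5 falls in [4, 8], while 9 matches nothing, so result keeps two of the six candidate rows.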

If you don't want the Cartesian product and only want to compare rows that share the same index in both tables, you can merge on the index like this:

df = a.merge(b, left_index=True, right_index=True)

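or concat on axis 1 if they share the same index (concat aligns on index labels and does not add suffixes itself, so add them explicitly):

df = pd.concat([a.add_suffix("_x"), b.add_suffix("_y")], axis=1)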

Then use boolean indexing again to filter the rows:

df[(df["column1_x"] >= df["column1_y"]) & (df["column1_x"] <= df["column2_y"])]
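For the row-by-row variant, a short sketch under the same assumptions (toy frames invented for illustration, both with a default 0..n-1 index). Suffixes are added explicitly because pd.concat keeps column names unchanged:

```python
import pandas as pd

# Hypothetical toy data with matching default indexes.
a = pd.DataFrame({"column1": [1, 5, 9]})
b = pd.DataFrame({"column1": [0, 6, 4], "column2": [3, 8, 10]})

# Place the frames side by side, aligned on index, with explicit suffixes.
df = pd.concat([a.add_suffix("_x"), b.add_suffix("_y")], axis=1)

# Keep only rows where a's value lies inside b's range on the same index.
same_row = df[(df["column1_x"] >= df["column1_y"]) & (df["column1_x"] <= df["column2_y"])]
print(same_row)
```

Here each row of a is compared only against the row of b with the same index label, so no cross product is built.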

198

answered Sep 24 '22 10:09

umutto


J.Bao
