
Python numpy equivalent of R rep and rep_len functions

I'd like to find the Python (possibly numpy) equivalent of the R rep and rep_len functions.

Question 1: Regarding the rep_len function, say I run,

rep_len(paste('q',1:4,sep=""), length.out = 7)

then the elements of vector ['q1','q2','q3','q4'] will be recycled to fill up 7 spaces and you'll get the output

[1] "q1" "q2" "q3" "q4" "q1" "q2" "q3"

How do I recycle the elements of a list or a 1-D numpy array to fit a predetermined length? From what I've seen, numpy's repeat function lets you specify a number of repetitions, but it doesn't repeat values to fill a predetermined length.

Question 2: Regarding the rep function, say I run,

rep(2000:2004, each = 3, length.out = 14)

then the output is

[1] 2000 2000 2000 2001 2001 2001 2002 2002 2002 2003 2003 2003 2004 2004

How could I make this happen in Python, that is, recycle the elements of a list or numpy array to fit a predetermined length while repeating each element consecutively a predetermined number of times?

I apologize if this question has been asked before; I'm totally new to stack overflow and pretty new to programming in general.

Raw Noob asked Sep 12 '17 02:09



2 Answers

NumPy actually does provide an equivalent of rep_len. It's numpy.resize:

new_arr = numpy.resize(arr, new_len)

Note that the ndarray.resize method (as opposed to the numpy.resize function) pads with zeros instead of repeating elements, so arr.resize(new_len) doesn't do what you want.
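A minimal sketch of that difference, using the vector from the question. The variable names are only illustrative, and refcheck=False is passed so the in-place resize does not fail in interactive sessions that hold extra references to the array:

```python
import numpy as np

# The vector from the question: q1..q4
arr = np.array(['q1', 'q2', 'q3', 'q4'])

# numpy.resize (the function) returns a new array and recycles elements
recycled = np.resize(arr, 7)
print(recycled)  # ['q1' 'q2' 'q3' 'q4' 'q1' 'q2' 'q3']

# ndarray.resize (the method) works in place and pads with zeros
nums = np.array([1, 2, 3, 4])
nums.resize(7, refcheck=False)
print(nums)  # [1 2 3 4 0 0 0]
```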

As for rep, I know of no equivalent. There's numpy.repeat, but it doesn't allow you to limit the length of the output. (There's also numpy.tile for the repeat-the-whole-vector functionality, but again, no length.out equivalent.) You could slice the result, but it would still spend all the time and memory to generate the un-truncated array:

new_arr = numpy.repeat(arr, repetitions)[:new_len]
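One possible way to avoid building the un-truncated array is to compute the source index of each output position directly and use it for fancy indexing. This is only a sketch: rep_each is a hypothetical helper name, not a NumPy function, and it assumes a non-empty input and a positive each.

```python
import numpy as np

def rep_each(arr, each, length_out):
    """Hypothetical helper mimicking R's rep(x, each=..., length.out=...)."""
    arr = np.asarray(arr).ravel()
    # Output position i takes element (i // each), wrapping around the
    # end of arr when length_out exceeds arr.size * each
    idx = (np.arange(length_out) // each) % arr.size
    return arr[idx]

years = np.arange(2000, 2005)
result = rep_each(years, each=3, length_out=14)
print(result)
```

For the question's rep(2000:2004, each = 3, length.out = 14), this allocates only the 14 indices and the 14 output elements.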
user2357112 supports Monica answered Oct 18 '22 03:10



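numpy.repeat() behaves like R's rep() when its each argument is used: every element is repeated in a row before moving on to the next. Whole-vector recycling (each=False in the function below) can be implemented by transposition: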

import numpy as np

def np_rep(x, reps=1, each=False, length=0):
    """Implementation of the functionality of rep() and rep_len() from R.

    Args:
        x: numpy array, which will be flattened
        reps: int, number of times x should be repeated
        each: logical; should each element be repeated reps times before the next
        length: int, length desired; if >0, overrides reps argument
    """
    x = np.asarray(x)
    if length > 0:
        # Smallest number of repetitions that covers the requested length
        reps = int(np.ceil(length / x.size))
    # np.repeat flattens x and repeats each element reps times in a row
    x = np.repeat(x, reps)
    if not each:
        # Reorder from element-wise repeats to whole-vector recycling
        x = x.reshape(-1, reps).T.ravel()
    if length > 0:
        # Truncate to the requested length
        x = x[:length]
    return x

For example, if we set each=True:

np_rep(np.array(['tinny', 'woody', 'words']), reps=3, each=True)

...we get:

array(['tinny', 'tinny', 'tinny', 'woody', 'woody', 'woody', 'words', 'words', 'words'], 
  dtype='<U5')

But when each=False:

np_rep(np.array(['tinny', 'woody', 'words']), reps=3, each=False)

...the result is:

array(['tinny', 'woody', 'words', 'tinny', 'woody', 'words', 'tinny', 'woody', 'words'], 
  dtype='<U5')

Note that x gets flattened, and the result is flattened as well. To implement the length argument, the minimum number of reps needed is calculated, and then the result is truncated to the desired length.
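The same compute-then-truncate idea also works for a plain Python list, which the question mentions as well. A minimal sketch follows; rep_len_list is a hypothetical helper name chosen for illustration, and returning an empty list for empty input is an assumption of this sketch:

```python
import math

def rep_len_list(items, length):
    """Hypothetical helper: recycle a list to exactly `length` elements."""
    if not items:
        return []  # nothing to recycle (assumed behavior for empty input)
    # Minimum number of whole copies needed to reach the target length
    reps = math.ceil(length / len(items))
    # Build the copies, then truncate to the requested length
    return (items * reps)[:length]

labels = ['q' + str(i) for i in range(1, 5)]
print(rep_len_list(labels, 7))  # ['q1', 'q2', 'q3', 'q4', 'q1', 'q2', 'q3']
```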

mwrowe answered Oct 18 '22 01:10
