Merging Lists by Index in Python

Tags:

python

list

Scenario: I retrieved three lists from an N-Triples file, and now I am trying to combine them into a single, organized list.

Original format:

+--------+---------+--------+
| 100021 | hasdata | y      |
+--------+---------+--------+
| 100021 | name    | USER1  |
+--------+---------+--------+
| 100021 | extra1  | typer  |
+--------+---------+--------+
| 100021 | extra2  | reader |
+--------+---------+--------+
| 50003  | hasdata | y      |
+--------+---------+--------+
| 50003  | name    | USER2  |
+--------+---------+--------+
| 50003  | extra1  | reader |
+--------+---------+--------+
| 50003  | extra2  | writer |
+--------+---------+--------+
| 50003  | extra3  | coder  |
+--------+---------+--------+
| 30007  | hasdata | n      |
+--------+---------+--------+
| 30007  | name    | 0001   |
+--------+---------+--------+
| 30007  | extra1  | Null   |
+--------+---------+--------+

While looping over the N-Triples file, I produced three lists (each one is a column of the table above). I am now trying to combine them into something like this:

+--------+---------+-------+--------+--------+--------+
|        | hasdata | name  | extra1 | extra2 | extra3 |
+--------+---------+-------+--------+--------+--------+
| 100021 | y       | USER1 | typer  | reader |        |
+--------+---------+-------+--------+--------+--------+
| 50003  | y       | USER2 | reader | writer | coder  |
+--------+---------+-------+--------+--------+--------+
| 30007  | n       | 0001  | Null   |        |        |
+--------+---------+-------+--------+--------+--------+

So far, I used the function:

def listOfTuples(l1, l2, l3): 
    return list(map(lambda x, y, z:(x,y, z), l1, l2, l3)) 

But this only gave me a direct, index-by-index merge of corresponding items.

Question: I know it is possible to loop through the lists, collect the matching items, and build an array/dataframe manually. Is there a function or package that can do this automatically and in a less convoluted way?

Note: I already have a way to produce the dataframe by looping manually. I just want to know whether there is a more efficient way.

DGMS89 asked Mar 29 '19 16:03


2 Answers

If I understand you correctly, you have a list whose elements are three-item tuples, and you want to put them in another tuple. You can use zip to achieve this.

result = list(zip(list1, zip(*[(l1,l2,l3) for i in list1])))
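On its own, zip only pairs items that share an index, so getting one record per ID still needs a grouping step. Here is a minimal sketch of that step, assuming l1, l2 and l3 are the three column lists from the question (shortened here to a few rows of its data). The helper name group_by_subject is hypothetical.

```python
def group_by_subject(l1, l2, l3):
    """Group index-aligned (subject, predicate, value) items by subject."""
    rows = {}
    # zip walks the three lists in lockstep, yielding one triple per index.
    for subject, predicate, value in zip(l1, l2, l3):
        # Create the per-subject dict on first sight, then store the value.
        rows.setdefault(subject, {})[predicate] = value
    return rows


# A few rows taken from the question's table.
l1 = ["100021", "100021", "50003", "50003", "50003"]
l2 = ["hasdata", "name", "hasdata", "name", "extra3"]
l3 = ["y", "USER1", "y", "USER2", "coder"]

rows = group_by_subject(l1, l2, l3)
print(rows["50003"])
```

Each subject ends up with only the keys that actually appear for it, so missing columns (such as extra3 for 100021) are simply absent instead of being padded.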
Farhood ET answered Nov 08 '22 05:11


You say you want a dataframe, so I'll operate on the assumption that pandas operations are acceptable.

I'm also assuming that the symbols are only your formatting, and not part of the actual data file (in the future, that sort of decoration is unnecessary and even detrimental for this type of question).

Using your given data, I create a DataFrame (with pd.read_csv or similar) and then pivot it:

    col1    col2    col3
0   100021  hasdata y
1   100021  name    USER1
2   100021  extra1  typer
3   100021  extra2  reader
4   50003   hasdata y
5   50003   name    USER2
6   50003   extra1  reader
7   50003   extra2  writer
8   50003   extra3  coder
9   30007   hasdata n
10  30007   name    0001
11  30007   extra1  Null

df.pivot(index='col1',columns='col2',values='col3')

col2    extra1  extra2  extra3  hasdata name
col1                    
30007   Null    NaN     NaN     n       0001
50003   reader  writer  coder   y       USER2
100021  typer   reader  NaN     y       USER1
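If the data is already in three Python lists rather than a file, the same pivot can be fed directly from them. This is a minimal sketch: the list names subjects, predicates and objects are hypothetical stand-ins for the asker's lists, and the column names col1, col2 and col3 follow the frame above. The final reindex is optional and only restores first-seen order, since pivot sorts the labels.

```python
import pandas as pd

# Hypothetical names for the three lists built while reading the file.
subjects = ["100021", "100021", "100021", "100021",
            "50003", "50003", "50003", "50003", "50003",
            "30007", "30007", "30007"]
predicates = ["hasdata", "name", "extra1", "extra2",
              "hasdata", "name", "extra1", "extra2", "extra3",
              "hasdata", "name", "extra1"]
objects = ["y", "USER1", "typer", "reader",
           "y", "USER2", "reader", "writer", "coder",
           "n", "0001", "Null"]

# One row per triple, one column per list.
df = pd.DataFrame({"col1": subjects, "col2": predicates, "col3": objects})

# Reshape from long to wide: one row per ID, one column per attribute.
wide = df.pivot(index="col1", columns="col2", values="col3")

# Optional: put rows and columns back in the order they first appeared.
wide = wide.reindex(index=df["col1"].drop_duplicates(),
                    columns=df["col2"].drop_duplicates())
print(wide)
```

Note that pivot raises a ValueError when the same (col1, col2) pair occurs more than once. In that case pivot_table with an explicit aggfunc is one possible alternative.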
G. Anderson answered Nov 08 '22 05:11