Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to concatenate a list with a nested list?

Tags:

python

I have two lists:

The first one is a regular list which contains links of Sitemaps:

ur = ['https://www.hi.de/hu/sitemap.xml', 
      'https://www.hi.de/ma/sitemap.xml', 
      'https://www.hi.de/au/sitemap.xml', 
      ]

The second list is nested and contains links which were indexed on the sitemaps and a date for every link:

wh = [['No-Date', 'https://www.hi.de/hu/artikel/xxx', ''],        
      ['2019-11-13', 'https://www.hi.de/ma/artikel/xxx'], 
      ['2019-11-12', 'https://www.hi.de/ma/artikel/xxx'], 
      ['2019-11-11', 'https://www.hi.de/au/artikel/xxx']]

Now I want to merge the list with the nedted list based on the sitmap they came from like this:

ui = [['https://www.hi.de/hu/sitemap.xml', 'No-Date', 'https://www.hi.de/hu/artikel/xxx', ''],        
      ['https://www.hi.de/ma/sitemap.xml' '2019-11-13', 'https://www.hi.de/ma/artikel/xxx'], 
      ['https://www.hi.de/ma/sitemap.xml', '2019-11-12', 'https://www.hi.de/ma/artikel/xxx'], 
      ['https://www.hi.de/au/sitemap.xml', '2019-11-11', 'https://www.hi.de/au/artikel/xxx']]

But with my code:

ui = [[(url2, x) for url2 in ur for x in y if url2.rsplit('/', 1)[0] in x] for y in wh]

The date in every sublist gets deleted and additionally the entries are stored in a tuple like this:

...
[[('https://www.hi.de/hu/sitemap.xml', 'https://www.hi.de/hu/artikel/xxx', '')],
...

How can I change the code to get the desired result in the variable ui?

like image 920
gython Avatar asked Nov 18 '19 13:11

gython


People also ask

How do I concatenate a list of lists?

You can concatenate multiple lists into one list by using the * operator. For Example, [*list1, *list2] – concatenates the items in list1 and list2 and creates a new resultant list object. Usecase: You can use this method when you want to concatenate multiple lists into a single list in one shot.

Can a list be nested in another list?

Lists can be nested within other lists, as shown in the following example that details a sequenced plan to relocate. In this case, it's an ordered list inside another one, though you can nest any type of list within any other type (see the dl entry in this chapter for a related note).

Can we concatenate 2 lists in Python?

Python's extend() method can be used to concatenate two lists in Python. The extend() function does iterate over the passed parameter and adds the item to the list thus, extending the list in a linear fashion. All the elements of the list2 get appended to list1 and thus the list1 gets updated and results as output.


Video Answer


2 Answers

You can use a list comprehension that checks for the matching sitemap between two lists to get your desired result:

ur = ['https://www.hi.de/hu/sitemap.xml', 
      'https://www.hi.de/ma/sitemap.xml', 
      'https://www.hi.de/au/sitemap.xml', 
      ]

wh = [['No-Date', 'https://www.hi.de/hu/artikel/xxx', ''],        
      ['2019-11-13', 'https://www.hi.de/ma/artikel/xxx'], 
      ['2019-11-12', 'https://www.hi.de/ma/artikel/xxx'], 
      ['2019-11-11', 'https://www.hi.de/au/artikel/xxx']]

print([[[u] + x] for x in wh for u in ur if x[1].split('/')[3] == u.split('/')[3]])

which outputs:

[['https://www.hi.de/hu/sitemap.xml', 'No-Date', 'https://www.hi.de/hu/artikel/xxx', ''],
 ['https://www.hi.de/ma/sitemap.xml' '2019-11-13', 'https://www.hi.de/ma/artikel/xxx'],
 ['https://www.hi.de/ma/sitemap.xml', '2019-11-12', 'https://www.hi.de/ma/artikel/xxx'],
 ['https://www.hi.de/au/sitemap.xml', '2019-11-11', 'https://www.hi.de/au/artikel/xxx']]
like image 114
Austin Avatar answered Oct 27 '22 00:10

Austin


You can transform ur to a dictionary for easier lookup:

import re
ur = ['https://www.hi.de/hu/sitemap.xml', 'https://www.hi.de/ma/sitemap.xml', 'https://www.hi.de/au/sitemap.xml']
data = [['No-Date', 'https://www.hi.de/hu/artikel/xxx'], ['2019-11-13', 'https://www.hi.de/ma/artikel/xxx'], ['2019-11-12', 'https://www.hi.de/ma/artikel/xxx'], ['2019-11-11', 'https://www.hi.de/au/artikel/xxx']]
d = dict((re.split('/(?=sitemap\.)', i)[0], i) for i in ur)
result = [[d[re.split('/(?=\w{3,}/)', b)[0]], a, b] for a, b in data]

Output:

[['https://www.hi.de/hu/sitemap.xml', 'No-Date', 'https://www.hi.de/hu/artikel/xxx'], 
['https://www.hi.de/ma/sitemap.xml', '2019-11-13', 'https://www.hi.de/ma/artikel/xxx'], 
['https://www.hi.de/ma/sitemap.xml', '2019-11-12', 'https://www.hi.de/ma/artikel/xxx'], 
['https://www.hi.de/au/sitemap.xml', '2019-11-11', 'https://www.hi.de/au/artikel/xxx']]
like image 32
Ajax1234 Avatar answered Oct 27 '22 01:10

Ajax1234