Extract data from lines of a text file

I need to extract data from lines of a text file. The data is name and scoring information formatted like this:

Shyvana - 12/4/5 - Loss - 2012-11-22
Fizz - 12/4/5 - Win - 2012-11-22
Miss Fortune - 12/4/3 - Win - 2012-11-22

This file is generated by another part of my little Python program, where I ask the user for the name, look up the name they enter in a list of names to ensure it's valid, and then ask for kills, deaths, assists, and whether they won or lost. Then I ask for confirmation, write that data to the file on a new line, and append the date at the end as shown. The code that prepares that data:

data = "%s - %s/%s/%s - %s - %s\n" % (
        champname, kills, deaths, assists, winloss, timestamp)

Basically, I want to read that data back in another part of the program, display it to the user, and do calculations with it, like averages over time for a particular name.

I'm new to Python and not very experienced with programming in general, so most of the string splitting and formatting examples I find are too cryptic for me to adapt to what I need here. Could anyone help? I could format the written data differently so token finding would be simpler, but I want it to stay simple to read directly in the file.

Kassandra asked Nov 22 '12 23:11



1 Answer

The following reads everything into a dictionary keyed by player name. The value for each player is itself a dictionary that acts as a record: its named fields hold the items from the line, converted to types suitable for further processing (integers for the stats, a tuple of integers for the date).

info = {}
with open('scoring_info.txt') as input_file:
    for line in input_file:
        # maxsplit=3 keeps the hyphens inside the date from being split
        player, stats, outcome, date = (
            item.strip() for item in line.split('-', 3))
        # '12/4/5' -> {'kills': 12, 'deaths': 4, 'assists': 5}
        stats = dict(zip(('kills', 'deaths', 'assists'),
                          map(int, stats.split('/'))))
        # '2012-11-22' -> (2012, 11, 22)
        date = tuple(map(int, date.split('-')))
        info[player] = dict(zip(('stats', 'outcome', 'date'),
                                (stats, outcome, date)))

print('info:')
for player, record in info.items():
    print('  player %r:' % player)
    for field, value in record.items():
        print('    %s: %s' % (field, value))

# sample usage
player = 'Fizz'
print('\n%s had %s kills in the game' % (player, info[player]['stats']['kills']))

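Output (the order of players and fields depends on the Python version; since Python 3.7, dictionaries preserve insertion order):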

info:
  player 'Shyvana':
    date: (2012, 11, 22)
    outcome: Loss
    stats: {'assists': 5, 'kills': 12, 'deaths': 4}
  player 'Miss Fortune':
    date: (2012, 11, 22)
    outcome: Win
    stats: {'assists': 3, 'kills': 12, 'deaths': 4}
  player 'Fizz':
    date: (2012, 11, 22)
    outcome: Win
    stats: {'assists': 5, 'kills': 12, 'deaths': 4}

Fizz had 12 kills in the game
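
One thing to watch for: info[player] = ... keeps only the most recent line for each player, so a champion played twice ends up with one record. Since the question asks about averages over time, here is a minimal sketch that keeps every game in a list per champion. It assumes Python 3 and the exact "Name - K/D/A - Win|Loss - date" layout from the question. The helper names (parse_line, load_games, average) and the second Fizz game are made up for illustration.

```python
from collections import defaultdict

def parse_line(line):
    """Split one 'Name - K/D/A - Win|Loss - YYYY-MM-DD' line into a dict."""
    # Splitting on ' - ' (with spaces) leaves the hyphens inside the date alone.
    name, kda, outcome, date = line.strip().split(' - ')
    kills, deaths, assists = (int(n) for n in kda.split('/'))
    return {'name': name, 'kills': kills, 'deaths': deaths,
            'assists': assists, 'won': outcome == 'Win', 'date': date}

def load_games(lines):
    """Group every parsed game under its champion name."""
    games = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines instead of failing on them
            record = parse_line(line)
            games[record['name']].append(record)
    return games

def average(records, field):
    """Mean of one numeric field over a list of game records."""
    return sum(r[field] for r in records) / float(len(records))

# In the real program these lines would come from open('scoring_info.txt').
sample = [
    "Shyvana - 12/4/5 - Loss - 2012-11-22\n",
    "Fizz - 12/4/5 - Win - 2012-11-22\n",
    "Fizz - 6/2/9 - Loss - 2012-11-23\n",  # hypothetical extra game
]
games = load_games(sample)
print(average(games['Fizz'], 'kills'))  # 9.0
```

Because a file object yields its lines when iterated, load_games(open_file) works the same way as load_games(sample).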

Alternatively, rather than holding most of the data in dictionaries, which makes nested-field access a little awkward (info[player]['stats']['kills']), you could use a slightly more advanced "generic" class to hold it, which lets you write info2[player].stats.kills instead.

To illustrate, here's almost the same thing using a class I've named Struct because it's somewhat like the C language's struct data type:

class Struct(object):
    """ Generic container object """
    def __init__(self, **kwds): # keyword args define attribute names and values
        self.__dict__.update(**kwds)

info2 = {}
with open('scoring_info.txt') as input_file:
    for line in input_file:
        player, stats, outcome, date = (
            item.strip() for item in line.split('-', 3))
        stats = dict(zip(('kills', 'deaths', 'assists'),
                          map(int, stats.split('/'))))
        victory = (outcome.lower() == 'win') # change to boolean T/F
        date = dict(zip(('year','month','day'), map(int, date.split('-'))))
        info2[player] = Struct(champ_name=player, stats=Struct(**stats),
                               victory=victory, date=Struct(**date))

print('info2:')
for rec in info2.values():
    print('  player %r:' % rec.champ_name)
    print('    stats: kills=%s, deaths=%s, assists=%s' % (
          rec.stats.kills, rec.stats.deaths, rec.stats.assists))
    print('    victorious: %s' % rec.victory)
    print('    date: %d-%02d-%02d' % (rec.date.year, rec.date.month, rec.date.day))

# sample usage
player = 'Fizz'
print('\n%s had %s kills in the game' % (player, info2[player].stats.kills))

Output (player order may vary by Python version):

info2:
  player 'Shyvana':
    stats: kills=12, deaths=4, assists=5
    victorious: False
    date: 2012-11-22
  player 'Miss Fortune':
    stats: kills=12, deaths=4, assists=3
    victorious: True
    date: 2012-11-22
  player 'Fizz':
    stats: kills=12, deaths=4, assists=5
    victorious: True
    date: 2012-11-22

Fizz had 12 kills in the game
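
If you would rather not write your own container class, the standard library can give you the same attribute-style access. The following is only a sketch of that variation, not a replacement for the code above: it swaps Struct for collections.namedtuple and stores the date as a datetime.date, which makes later date comparisons easier. The names Stats, Game, and parse_game are my own choices for illustration.

```python
from collections import namedtuple
from datetime import date

# Immutable record types with named fields (names chosen for this sketch).
Stats = namedtuple('Stats', 'kills deaths assists')
Game = namedtuple('Game', 'champ_name stats victory date')

def parse_game(line):
    """Turn one line of the scoring file into a Game record."""
    # maxsplit=3 stops splitting before the hyphens inside the date.
    player, stats, outcome, day = (
        item.strip() for item in line.split('-', 3))
    return Game(champ_name=player,
                stats=Stats(*map(int, stats.split('/'))),
                victory=(outcome.lower() == 'win'),
                date=date(*map(int, day.split('-'))))

game = parse_game("Fizz - 12/4/5 - Win - 2012-11-22\n")
print(game.stats.kills)       # 12
print(game.victory)           # True
print(game.date.isoformat())  # 2012-11-22
```

Unlike Struct, a namedtuple cannot be modified after it is created, which is usually fine for records read from a log file.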
martineau answered Oct 12 '22 15:10