
How to parse a file for iphone? Should I use NSScanner?

So I am new to iPhone development, but I am trying to learn how to take a CSV file, read it, and save it using Core Data (I think that is the best way?) so that I can display it in a table view on the iPhone. Below is an example of the type of CSV file I am working with:

13,1,1,History,,,,,,1
,263,1,Smith,Bob,Freshman
,317,2,Jones,John,Sophmore
14,2,1,Math,,,,,,1
,311,1,Kim,Mary,Freshman
,352,2,Doe,Fred,Senior

Here the first number is the type of class (i.e. 13 = history), the second number is how many sections (i.e. 1 = 1 class), the third number is the meeting pattern (i.e. 1 = Monday, Wednesday, Friday), and then comes the name of the class. I don't really care about the rest of the class line, so I thought I could just have it ignore those characters.

For the 2nd and 3rd lines, the first number after the comma is the student number, then seat number, then last name, first name, year in school.

So I think I have two main challenges. The first is how to parse the data so that I know how many students are in each class, and who they are, so I can load them into a table view (and add to it later). The second is that I don't know how to associate an integer value with the meeting pattern or class name (i.e. 13 = history, and 1 = Mon/Wed/Fri).

thank you so much for any help

Paul Lapke asked Nov 28 '22 23:11


1 Answer

You packed a lot of questions into a short space!

The answer to your main question is "yes". You should use NSScanner if you have to import a CSV file into your application.

CSV (comma-separated values) is a very tricky file format. At first glance, it looks very simple. Unfortunately, in practice it is anything but simple! It is not even particularly well defined, as it turns out. CSV is the closest thing to a messy "hack" of any type of data file I have ever seen!

CSV files were apparently cooked up by Microsoft to work like BASIC "DATA" statements, something they had been parsing just fine since the mid-1970s. Unfortunately, they should have left it at that. DATA statements were never intended as a file format, just a shortcut to avoid the bother of putting simple data into files in the first place. They were better than hardcoding assignment statements, but that was about it.

The problem with scanning a CSV file is that it uses as its delimiters things that can occur in the data itself. That does not stop it from being usable. It just adds complications, and these complications add complexity to the scanner. The complexity arrives in the form of special cases. You basically have to define a formal grammar for the CSV file.

Naive implementations do not do this. Your course names or one of the two name fields might contain commas. For instance, if there is a "junior" in the class, they might have ", Jr." after part of their name. Or a class might be called "trig, honors". The user might enter that, and the front-end program (maybe just a simple text editor, after all) might not prevent it. So long as the field is wrapped in quotation marks, it is allowed to have commas embedded in it.
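To make the embedded-comma problem concrete, here is a minimal sketch of a quote-aware field splitter next to a naive split. It is written in Python only for brevity; `split_csv_line` is a hypothetical helper name, and the same small state machine can be written with NSScanner. It assumes one record per line (no embedded newlines) and the common convention that a doubled quote inside a quoted field means a literal quote.

```python
def split_csv_line(line):
    """Split one CSV record into fields, honoring double-quoted fields.

    Assumption: the record contains no embedded newlines.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # A doubled quote inside a quoted field is a literal quote.
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False
            else:
                # Commas are ordinary characters while inside quotes.
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ',':
            # An unquoted comma ends the current field.
            fields.append(''.join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append(''.join(current))
    return fields


record = '14,2,1,"trig, honors",,,,,,1'
naive = record.split(',')        # breaks the course name into two fields
aware = split_csv_line(record)   # keeps "trig, honors" as one field
```

The naive split yields one field too many and leaves stray quotation marks in the data, which is exactly the kind of failure described in the next paragraphs.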

I once picked up the ball on a CSV import routine someone had written and checked into our version control as supposedly done. Thing was, it was not. It was blowing up on the fourth record of the file. That record had a field containing something like "XYZ, INC.". Well, the embedded comma was throwing it off.

I wrote a real lexer (scanner) for parsing our CSV import file. That solved the problem. The import routine actually started working then, so I checked it in and marked the "bug" (cough, cough) as "fixed".

As you may have guessed, I think it is highly unlikely the programmer was unaware that his code was unusable, especially given that it did not work on the baseline test data. If you are writing this application for other people to use or going for a high grade, do not finesse the CSV input scanner: do a good job. Anything less would be "faking it".

That is why the best way to handle this issue is to shoot down CSV file support in the requirements stage! Tell stakeholders that tab-delimited files are a much, much simpler format to parse. The only time that CSV has an edge over tab-delimited is the case when embedded newlines and/or tabs need to be supported in any fields of the records.
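As a rough illustration of why tab-delimited input is easier, the sketch below (Python for brevity, with a made-up record modeled on the question's data) shows that a plain split on the tab character is enough, provided no field contains a tab or a newline.

```python
# Hypothetical tab-delimited version of a class line from the question.
record = "14\t2\t1\ttrig, honors\t\t\t\t\t\t1"

# No quoting rules are needed: the comma is just ordinary data here.
fields = record.split("\t")
course_name = fields[3]
```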

However, if your project is such a case, try a different tack so you can avoid CSV: suggest XML. Then write a schema (DTD, XSD, or RNC, which is my personal favorite) so you can validate the input with whatever XML parsing API you are using, if your parsing engine supports validation.

If you are stuck with CSV, then luckily there is a very good tutorial on reading a CSV file with NSScanner, called Writing a parser using NSScanner (a CSV parsing example).

Take a look. As you can see, it is not simple, but neither did the author make it overly complex. You might want to bookmark the whole blog as well as the article. It is a really excellent weblog for Cocoa developers.

Another example of a CSV scanner is found in Cocoa for Scientists (Part XXVI): Parsing CSV Data. Though it does not give as much explanation of the solution-domain problems that CSV presents, nor go into design, you can still see that usable code is going to have to do more than simply split the line up using comma and newline characters.

The next part of your program you will have to think about is Developing with Core Data. Make sure you create your Cocoa iPhone application in the IDE with Core Data checked. Also, do as much of your data modeling and GUI design as possible in Xcode's data model editor and Interface Builder rather than struggling to write a lot of code manually.

To store the records using the Core Data part of Cocoa you will need to define a subclass of NSManagedObject (see NSManagedObject class reference). This is a good time, by the way, to look over the Core Data Class Overview at Cocoa Dev Central. Acquaint yourself there with the fundamental object types in Core Data. The diagrams and explanations will make it really clear how the different Core Data abstractions and classes fit together in your application.

Pick a good business object (problem domain) name like Enrollment, or better yet, CourseRegistration. Be careful not to pick something that sounds like a solution-domain thing. In this case, I would specifically stay away from words like class, registration, or schedule, since those have special meanings in programming. No sense muddying the line between problem domain and solution domain.

If you want to set up a real database, and not just dump the records you read from the CSV import file into a single table, you will probably also want to define NSManagedObject subclasses called Student and Course. You will use the data modeling tool in Xcode to tell Core Data that there is a relationship between these two and CourseRegistration.

There is an example in a tutorial that I cite below that shows you how to set up these relationships.
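Before building the Core Data model, it can help to see the shape of the data. The sketch below is a simplified illustration in Python, not Core Data code: the `Student` and `Course` dataclasses stand in for the managed object subclasses, a plain list replaces the CourseRegistration relationship, and the two lookup tables are hypothetical (only 13 = history and 1 = Mon/Wed/Fri come from the question; 14 = Math is inferred from the sample). It relies on the sample's layout, where a class line has a non-empty first field and a student line starts with an empty one.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables mapping integer codes to display names.
COURSE_TYPES = {13: "History", 14: "Math"}
MEETING_PATTERNS = {1: "Mon/Wed/Fri"}


@dataclass
class Student:
    number: int
    seat: int
    last_name: str
    first_name: str
    year: str


@dataclass
class Course:
    type_code: int
    sections: int
    meeting_pattern: int
    name: str
    students: list = field(default_factory=list)


def group_rows(rows):
    """Attach each student row to the most recent class row above it."""
    courses = []
    for row in rows:
        if row[0] != "":
            # Class line: type, sections, meeting pattern, name; rest ignored.
            courses.append(Course(int(row[0]), int(row[1]), int(row[2]), row[3]))
        elif courses:
            # Student line: student number, seat, last name, first name, year.
            courses[-1].students.append(
                Student(int(row[1]), int(row[2]), row[3], row[4], row[5]))
    return courses


sample = """13,1,1,History,,,,,,1
,263,1,Smith,Bob,Freshman
,317,2,Jones,John,Sophmore
14,2,1,Math,,,,,,1
,311,1,Kim,Mary,Freshman
,352,2,Doe,Fred,Senior"""

# A plain split is enough only because this sample has no quoted fields.
rows = [line.split(",") for line in sample.splitlines()]
courses = group_rows(rows)
pattern = MEETING_PATTERNS.get(courses[0].meeting_pattern, "Unknown")
```

The number of students in a class is then just `len(course.students)`, and the integer codes are translated to names by a dictionary lookup at display time rather than being stored as text.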

Here is a somewhat outdated walkthrough of the steps to create a Core Data application: Build a Core Data App. It is not outdated programming-wise.

It is just that the Xcode IDE and Interface Builder tool have had some of their dialog boxes heavily reworked. The functionality is the same but it makes screenshots in tutorials written a few years or more ago a bit hard to follow sometimes.

If you have not worked with Core Data before, be aware that an Entity is a persistent object type, instances of which get stored in (and loaded from) the database (or file store). Attributes are basically the fields of the entity. Attributes have names and types, just like properties do.

There are some rules you have to follow when subclassing NSManagedObject as well as when using Core Data in general. So I encourage you to read the Introduction to Core Data Programming Guide. At least now that you have gotten a bit of help here and from the tutorials, you will not be hitting it cold.

Conveniently, Cocoa provides you with both an NSScanner class to simplify reading your CSV import file and the Core Data facility you decided to use to persist your data.

As you point out, you are going to want a GUI for editing your dataset. Cocoa uses the Model-View-Controller triad as its GUI design pattern.

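NSTableView might be a good thing to put into your application's GUI to do that. That will give you a table view of the records in your user interface. Note that NSTableView is the Mac (AppKit) class; on the iPhone, the UIKit counterpart is UITableView.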

There is an NSTableView Tutorial you should take a look at over at CocoaDev. The CocoaDev site is quite an excellent resource to guide you through all kinds of areas of Cocoa programming. If you need more help, there is Another NSTableView Tutorial there.

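People often get stuck wondering what to use as the controller with an NSTableView. I suggest reading up on the NSArrayController class. You will probably also want to have a look at some Cocoa Bindings Examples and Hints. Cocoa Bindings, in a nutshell, are a way to keep an attribute in a view in sync with a property of a model object. Be aware that NSArrayController and Cocoa Bindings exist only on the Mac; on the iPhone, NSFetchedResultsController is the usual way to feed Core Data results to a UITableView.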

Most programmers eventually realize that most database applications simply: collect data, move data values around, and initiate and then propagate changes to pieces of it. Model-View-Controller architecture separates your UI from your business objects in a nice, loosely-coupled fashion.

A binding is a declarative mechanism for linking them together. They are still kept loosely coupled though, so do not worry that bindings violate any architectural rules of the MVC design pattern.

Bindings are handy for doing rapid application development using a WYSIWYG GUI-building tool like Interface Builder.

Without bindings, you would have to manually write procedural code to associate the user interface components with the data in the model. Bindings let you handle that concern in the Interface Builder. The result is you wind up constructing your data management application, rather than "coding" it.

If you need a good book to tie up loose ends about how to use Xcode to write Cocoa apps, or Core Data apps in particular, Xcode 3 Unleashed is pretty good.

It does not detail iPhone development but I assume you have access to documentation that will help you address iPhone-specific limitations and features. Targeting the iPhone means that you will have to use the reference-counting approach to memory management, rather than the newer garbage collection approach that was introduced with Objective-C 2.0.

JohnnySoftware answered Dec 20 '22 13:12