
How should I store articles with photos and formatting in a database?

Take a look at some of the articles on Storehouse, such as this one. They're nice and photo-rich, which makes them appealing, but they aren't just text: the photos and other components make them more complicated to store, from my point of view. In short, what's the best way to implement similar articles at the database level for a personal site or project?

By this I mean: say I write an article of maybe 8 paragraphs, some of them grouped as children of <section> elements, and in between these sections there are photos in different layouts (a parallax photo, a full-page photo, and a gallery-style viewer, for example) and maybe even quotes. This generates a reasonably complex HTML structure, more so than a plain-text article that could be stored simply as text in a database and then output between two tags on a webpage. How do I go about storing all this information (HTML tags, class names, etc.) in a manner appropriate to both the database and the application?

I've come up with a few ideas, but I'm not entirely sure on the best practice or what advantages and disadvantages each option holds.

Option 1: Store everything as plaintext inside a database field.

This is the simplest, but also the ugliest. Everything (image tags, class names, and all) is stored as plain text in an article_text field of an article table.

Option 2: Store the article text and formatting inside a database field, then store the images in another table.

This is a hybrid solution of 1 & 3. Basically, you'd reference the image section inside the article like so:

{{ imageSection1 }}

and have your application logic weave the referenced images into the final product. This is easy on the database side, but gets more confusing on the application-logic side.
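
To make the weaving step concrete, here is a minimal sketch in Python of what Option 2 might look like. Everything in it is an assumption for illustration: the `IMAGE_SECTIONS` dict stands in for rows from a separate image table, and the field names (`layout`, `src`, `alt`), the file path, and the `weave` helper are hypothetical.

```python
import re
from html import escape

# Hypothetical stand-in for rows fetched from a separate image table,
# keyed by the placeholder name used inside the article body.
IMAGE_SECTIONS = {
    "imageSection1": {
        "layout": "fullpage",
        "src": "/img/harbour.jpg",
        "alt": "Harbour at dawn",
    },
}

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_image_section(section):
    """Turn one image record into an HTML fragment."""
    return '<figure class="photo photo-{}"><img src="{}" alt="{}"></figure>'.format(
        escape(section["layout"], quote=True),
        escape(section["src"], quote=True),
        escape(section["alt"], quote=True),
    )


def weave(article_text, sections):
    """Replace each {{ name }} placeholder with its rendered section.

    Unknown placeholders are left untouched so a missing image is easy to spot.
    """
    def substitute(match):
        section = sections.get(match.group(1))
        return render_image_section(section) if section else match.group(0)

    return PLACEHOLDER.sub(substitute, article_text)


body = "<p>First paragraph.</p>\n{{ imageSection1 }}\n<p>Second paragraph.</p>"
html = weave(body, IMAGE_SECTIONS)
print(html)
```

The confusing part I mentioned lives in `weave`: the article text and the image table can drift apart, so the application has to decide what to do with placeholders that no longer resolve.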

Option 3: Store everything separately.

Each paragraph is its own entry in an article_collation table, with images, comments, and quotes stored in their own independent tables. This seems like the cleanest way of separating distinct elements, but it makes the program logic hellish and might end up being less effective because of it.


Each one has significant problems IMO, and I'm not sure what to do. Input? Recommendations? Are there any tools that make this easier?

marked-down asked Oct 03 '14 17:10


2 Answers

I would go for Option 3: Store everything separately. It is the most complicated option, but also the most flexible. It allows you to expand your CMS one bit at a time.

A rough outline of the table structure:

article
- article_id (PK)
- title

display
- display_id (PK)
- name

section
- section_id (PK)
- article_id (FK)
- display_id (FK)
- sort

content
- content_id (PK)
- section_id (FK)
- title
- description
- image
- sort

Sample values for display table to begin with:

1: pass-through
2: horizontal-slideshow
3: vertical-sections
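
As a rough sketch only, the outline above could be written out as follows. It uses Python with an in-memory SQLite database so that it runs on its own; the column types, the NOT NULL constraints, and the assumption that `image` holds a path or URL are mine, not part of the outline, and a MySQL schema would differ in detail.

```python
import sqlite3

SCHEMA = """
CREATE TABLE article (
    article_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL
);
CREATE TABLE display (
    display_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    article_id INTEGER NOT NULL REFERENCES article(article_id),
    display_id INTEGER NOT NULL REFERENCES display(display_id),
    sort       INTEGER NOT NULL
);
CREATE TABLE content (
    content_id  INTEGER PRIMARY KEY,
    section_id  INTEGER NOT NULL REFERENCES section(section_id),
    title       TEXT,
    description TEXT,
    image       TEXT,  -- assumed to hold a path or URL, not the binary
    sort        INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# The three sample display values from the answer.
conn.executemany(
    "INSERT INTO display (display_id, name) VALUES (?, ?)",
    [(1, "pass-through"), (2, "horizontal-slideshow"), (3, "vertical-sections")],
)
conn.execute("INSERT INTO article (article_id, title) VALUES (1, 'Article Title')")
conn.execute(
    "INSERT INTO section (section_id, article_id, display_id, sort) VALUES (2, 1, 2, 2)"
)
# Inserted out of order on purpose: the sort column decides the final order.
conn.executemany(
    "INSERT INTO content (content_id, section_id, title, description, image, sort) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (22, 2, "Slide 2", "Lorem ipsum dolor sit amet.", "slide2.jpg", 2),
        (21, 2, "Slide 1", "Lorem ipsum dolor sit amet.", "slide1.jpg", 1),
    ],
)

# One query returns every content row of an article in display order.
rows = conn.execute(
    """
    SELECT d.name, c.title
    FROM section s
    JOIN display d ON d.display_id = s.display_id
    JOIN content c ON c.section_id = s.section_id
    WHERE s.article_id = ?
    ORDER BY s.sort, c.sort
    """,
    (1,),
).fetchall()
print(rows)
```

The two `sort` columns do the work here: one orders sections within an article, the other orders content within a section.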

And here is a sample of what the page looks like in the end (the classes and ids will reveal what the data looks like):

<div id="article-1" class="article">
    <h1 class="title">Article Title</h1>
    <div id="section-1" class="section display-pass-through">
        <div id="content-11" class="content">
            <h2 class="title">Introduction</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
        </div>
    </div>
    <div id="section-2" class="section display-horizontal-slideshow">
        <div id="content-21" class="content">
            <h2 class="title">Slide 1</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
            <img class="image">
        </div>
        <div id="content-22" class="content">
            <h2 class="title">Slide 2</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
            <img class="image">
        </div>
        <div id="content-23" class="content">
            <h2 class="title">Slide 3</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
            <img class="image">
        </div>
    </div>
    <div id="section-3" class="section display-vertical-sections">
        <div id="content-31" class="content">
            <img class="image">
            <h2 class="title">Heading 1</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
        </div>
        <div id="content-32" class="content">
            <img class="image">
            <h2 class="title">Heading 2</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
        </div>
        <div id="content-33" class="content">
            <img class="image">
            <h2 class="title">Heading 3</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
        </div>
    </div>
    <div id="section-4" class="section display-pass-through">
        <div id="content-41" class="content">
            <h2 class="title">Conclusion</h2>
            <div class="description">Lorem ipsum dolor sit amet.</div>
        </div>
    </div>
</div>

You can add additional display options as your requirements grow (single movie and movie playlist are two trending examples). Use semantic markup to display the content regardless of which section it belongs to. You can use JavaScript and CSS classes to control how sections look (e.g. for a horizontal slideshow you can hide all but the first slide using CSS and add prev/next controls using JavaScript).
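
For illustration, here is one way the markup above might be generated once the rows are loaded. This is a sketch under assumptions: the nested dict shape and the function names are hypothetical, and it emits title, description, and image in the same order for every display type, leaving the layout to CSS and JavaScript.

```python
from html import escape


def render_content(item):
    """Render one content row; empty fields are simply skipped."""
    parts = ['<div id="content-{}" class="content">'.format(item["content_id"])]
    if item.get("title"):
        parts.append('<h2 class="title">{}</h2>'.format(escape(item["title"])))
    if item.get("description"):
        parts.append('<div class="description">{}</div>'.format(escape(item["description"])))
    if item.get("image"):
        parts.append('<img class="image" src="{}" alt="">'.format(escape(item["image"], quote=True)))
    parts.append("</div>")
    return "".join(parts)


def render_section(section):
    """Wrap the section's content rows, ordered by their sort value."""
    items = sorted(section["contents"], key=lambda c: c["sort"])
    return '<div id="section-{}" class="section display-{}">{}</div>'.format(
        section["section_id"],
        escape(section["display"], quote=True),
        "".join(render_content(c) for c in items),
    )


def render_article(article):
    """Render the whole article, sections ordered by their sort value."""
    sections = sorted(article["sections"], key=lambda s: s["sort"])
    return '<div id="article-{}" class="article"><h1 class="title">{}</h1>{}</div>'.format(
        article["article_id"],
        escape(article["title"]),
        "".join(render_section(s) for s in sections),
    )


# Hypothetical data, shaped as if the four tables had already been joined.
article = {
    "article_id": 1,
    "title": "Article Title",
    "sections": [
        {"section_id": 2, "display": "horizontal-slideshow", "sort": 2, "contents": [
            {"content_id": 22, "title": "Slide 2", "description": "Lorem ipsum.", "image": "s2.jpg", "sort": 2},
            {"content_id": 21, "title": "Slide 1", "description": "Lorem ipsum.", "image": "s1.jpg", "sort": 1},
        ]},
        {"section_id": 1, "display": "pass-through", "sort": 1, "contents": [
            {"content_id": 11, "title": "Introduction", "description": "Lorem ipsum.", "image": None, "sort": 1},
        ]},
    ],
}

html = render_article(article)
print(html)
```

Adding a new display option then only needs a new row in the display table plus the matching CSS and JavaScript; the renderer itself does not change.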

Note: The above idea can be generalized to use only one self-referencing table. Generalizing too much and using self-referencing tables both have issues of their own. However, there would then be no restriction on how, or how deeply, the content can be nested, much like authoring HTML itself.
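
A minimal sketch of that single-table variant, again in Python with SQLite so it runs standalone. The table name `node` and its columns (`parent_id`, `kind`, `body`, `sort`) are assumptions made up for this example.

```python
import sqlite3
from collections import defaultdict
from html import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (
    node_id   INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(node_id),  -- NULL marks the root
    kind      TEXT NOT NULL,                     -- reused as the CSS class
    body      TEXT,
    sort      INTEGER NOT NULL
);
INSERT INTO node VALUES
    (1, NULL, 'article', NULL, 1),
    (2, 1,    'section', NULL, 1),
    (3, 2,    'content', 'Introduction', 1),
    (4, 1,    'section', NULL, 2),
    (5, 4,    'content', 'Slide 2', 2),
    (6, 4,    'content', 'Slide 1', 1);
""")

# Load every row once and group the rows by parent, already in sort order.
children = defaultdict(list)
for node_id, parent_id, kind, body in conn.execute(
    "SELECT node_id, parent_id, kind, body FROM node ORDER BY sort, node_id"
):
    children[parent_id].append((node_id, kind, body))


def render(node):
    """Render a node and, recursively, everything nested below it."""
    node_id, kind, body = node
    inner = escape(body or "") + "".join(render(child) for child in children[node_id])
    return '<div id="node-{}" class="{}">{}</div>'.format(
        node_id, escape(kind, quote=True), inner
    )


html = "".join(render(root) for root in children[None])
print(html)
```

Nesting depth is now unlimited, but nothing in the schema stops a content node from containing an article, which is one of the issues with generalizing this far.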

Salman A answered Oct 05 '22 23:10


I'm a huge fan of using XML, then converting it to your output format dynamically (using XSLT or the like). This lets you represent your content in a strongly structured fashion while still having a single coherent document to refer to. That helps with things like versioning, using tools like diff, or even packaging the content up and transporting it as a list of files (rather than database tables that need to be re-assembled).

It also allows totally different uses for your content in some future system that is not HTML based (like PDF output).
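
To illustrate the idea, here is a small sketch of turning an XML article into HTML. Note the substitutions: it uses Python's xml.etree.ElementTree with a hand-written mapping instead of XSLT, because the Python standard library has no XSLT processor, and the element names (`article`, `para`, `photo`, `quote`) are a made-up vocabulary, not NLM or any other standard.

```python
import xml.etree.ElementTree as ET

# Made-up source vocabulary, for illustration only.
SOURCE = """<article>
  <title>Article Title</title>
  <para>Lorem ipsum dolor sit amet.</para>
  <photo layout="fullpage" src="harbour.jpg" caption="Harbour at dawn"/>
  <quote>Lorem ipsum.</quote>
</article>"""

# Source elements that map one-to-one onto an HTML element.
TEXT_TAGS = {"title": "h1", "para": "p", "quote": "blockquote"}


def to_html(node):
    """Map one source element to an HTML element."""
    if node.tag in TEXT_TAGS:
        out = ET.Element(TEXT_TAGS[node.tag])
        out.text = node.text
        return out
    if node.tag == "photo":
        out = ET.Element("figure", {"class": "photo-" + node.get("layout", "inline")})
        ET.SubElement(out, "img", {"src": node.get("src", ""), "alt": node.get("caption", "")})
        return out
    raise ValueError("unknown element: " + node.tag)


article = ET.fromstring(SOURCE)
page = ET.Element("div", {"class": "article"})
page.extend([to_html(child) for child in article])

# method="html" serializes void elements such as img without a closing tag.
html = ET.tostring(page, encoding="unicode", method="html")
print(html)
```

Because the source stays plain XML, a second mapping could target a different output format without touching the stored content.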

Take a look at Martin Fowler's article: http://martinfowler.com/articles/writingInXml.html

There are several open "standards" for publishing content in XML that are worth looking at. NLM is quite common. I have experience with it, and have found it to be very (possibly over) complete.

Rob Conklin answered Oct 06 '22 01:10