I work for a news website that stores all of its stories as XML. I know, not the best way to go, but it is what it is. What I'm trying to do is make it possible to search through the XML files from the website. Right now our search feature is entirely Google-powered (it only searches whatever Google has already crawled).
My first thought is to use grep, which works well enough for now but probably won't scale as the archive grows. The other option, which would take a lot more work but should perform much better, is to store parts of the XML files in a relational database.
Given the way our backend is set up, moving to a different storage model would take a long time, so for the time being, this is what we have to work with. Ideas?
Adding some caching might help you scale the grep idea. However, you might consider a solution that doesn't just band-aid the problem today but also takes you closer to a better solution tomorrow. Designing that better solution and implementing it piece by piece over time might do the trick.
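One way to take that piece-by-piece route is sketched below, assuming Python and SQLite are available to you: leave the XML files where they are and build a small side index that only re-parses files whose modification time has changed. The table layout, the function names (open_index, index_stories, search) and the <title> element are all assumptions made for illustration, since I don't know your actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open (or create) the side index. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stories ("
        "path TEXT PRIMARY KEY, mtime REAL, title TEXT, body TEXT)"
    )
    return conn


def index_stories(conn, xml_dir):
    """Index new or changed XML files; return how many were (re)parsed."""
    indexed = 0
    for path in sorted(Path(xml_dir).glob("*.xml")):
        mtime = path.stat().st_mtime
        row = conn.execute(
            "SELECT mtime FROM stories WHERE path = ?", (str(path),)
        ).fetchone()
        if row is not None and row[0] == mtime:
            continue  # unchanged since the last run, skip the parse
        root = ET.parse(path).getroot()
        # Assumes each story has a top-level <title> child (hypothetical).
        title = root.findtext("title", default="")
        # Flatten all text nodes into one searchable string.
        body = " ".join(root.itertext())
        conn.execute(
            "INSERT OR REPLACE INTO stories VALUES (?, ?, ?, ?)",
            (str(path), mtime, title, body),
        )
        indexed += 1
    conn.commit()
    return indexed


def search(conn, term):
    """Substring search over the flattened story text."""
    # Note: % and _ inside term act as LIKE wildcards in this sketch.
    pattern = "%" + term + "%"
    rows = conn.execute(
        "SELECT path, title FROM stories WHERE body LIKE ? ORDER BY path",
        (pattern,),
    )
    return rows.fetchall()
```

You would call index_stories from a cron job or a publish hook and point the site's search box at search. LIKE still scans every row, so this is only a first step; if your SQLite build includes the FTS5 extension, moving the body column into a full-text table later is a natural next piece.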