I have this situation where I have a parent child relationship between two sets of data. I have a parent document collection and a child document collection. The requirement is that the parents and their corresponding children need to be exported into 'a' pdf document. A simple-implementation of the above situation can be as follows(java-ish pseudo code below):
for(Document parentDocument:Documents){
ExportToPdf(parentDocument);
for(Document childDocument:parentDocument.children()){
AppendToParentPdf(childDocument);
}
}
Something as above will probably solve the problem, but all of a sudden the requirements changes and now each of these parents and their corresponding children need to be in separate pdfs, so the above given snippet is modified by changing the AppendToParentPdf()
to ExportToPdf()
follows:
for(Document parentDocument:Documents){
ExportToPdf(parentDocument);
for(Document childDocument:parentDocument.children()){
ExportToPdf(childDocument);
}
}
Going along this way, it will not take long before this seemingly trivial snippet would suffer from some serious code smells.
My questions to SO are:
Are there better representations of parent-child relationships such as the above where instead of brute-forcing my way through all the documents and their children in an O(n^2)
fashion, I can use a different data-structure or technique to traverse the entire structure in a more optimal fashion.
In the scenario that I described above, where the business rules are rather fluid about the way the pdfs should be exported, is there a smarter way to code the nature of the export function? Also the export format is transient. PDFs can give way to *.docs/csvs/xmls et al.
It will be great to get some perspective on this.
Thanks
Are there better representations of parent-child relationships such as the above where instead of brute-forcing my way through all the documents and their children in an O(n^2) fashion.
This is not O(N^2)
. It is O(N)
where N
is the total number of parent and child documents. Assuming that no child has more than one parent document, then you can't significantly improve the performance. Furthermore, the cost of the traversal is probably trivial compared with the cost of the calls that generate the PDFs.
The only case where you might want to consider optimizing is if child documents can be children of multiple parents. In that case, you may want to keep track of the documents that you've already generated PDF's for ... and skip them if you revisit them in the traversal. The test for "have I seen this document before" can be implemented using a HashSet
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With