I've got an interesting problem that's cropped up in a sort of pass based compiler of mine. Each pass knows nothing of other passes, and a common object is passed down the chain as it goes, following the chain of command pattern.
The object that is being passed along is a reference to a file.
Now, during one of the stages, one might wish to associate a large chunk of data, such as that file's SHA512 hash, which requires a reasonable amount of time to compute. However, since that chunk of data is only used in that specific case, I don't want all file references to need to reserve space for that SHA512. However, I also don't want other passes to have to recalculate the SHA512 hash over and over again. For example, someone might only accept files which match a given list of SHA512s, but they don't want that value printed when the file reference gets to the end of the chain, or perhaps they want both, or... .etc.
What I need is some sort of container which contain only one of a given type. If the container does not contain that type, it needs to create an instance of that type and store it somehow. It's basically a dictionary with the type being the thing used to look things up.
Here's what I've gotten so far, the relevant bit being the FileData::Get<t>
method:
class FileData;
// Cache entry interface
struct FileDataCacheEntry
{
virtual void Initalize(FileData&)
{
}
virtual ~FileDataCacheEntry()
{
}
};
// Cache itself
class FileData
{
struct Entry
{
std::size_t identifier;
FileDataCacheEntry * data;
Entry(FileDataCacheEntry *dataToStore, std::size_t id)
: data(dataToStore), identifier(id)
{
}
std::size_t GetIdentifier() const
{
return identifier;
}
void DeleteData()
{
delete data;
}
};
WindowsApi::ReferenceCounter refCount;
std::wstring fileName_;
std::vector<Entry> cache;
public:
FileData(const std::wstring& fileName) : fileName_(fileName)
{
}
~FileData()
{
if (refCount.IsLastObject())
for_each(cache.begin(), cache.end(), std::mem_fun_ref(&Entry::DeleteData));
}
const std::wstring& GetFileName() const
{
return fileName_;
}
//RELEVANT METHOD HERE
template<typename T>
T& Get()
{
std::vector<Entry>::iterator foundItem =
std::find_if(cache.begin(), cache.end(), boost::bind(
std::equal_to<std::size_t>(), boost::bind(&Entry::GetIdentifier, _1), T::TypeId));
if (foundItem == cache.end())
{
std::auto_ptr<T> newCacheEntry(new T);
Entry toInsert(newCacheEntry.get(), T::TypeId);
cache.push_back(toInsert);
newCacheEntry.release();
T& result = *static_cast<T*>(cache.back().data);
result.Initalize(*this);
return result;
}
else
{
return *static_cast<T*>(foundItem->data);
}
}
};
// Example item you'd put in cache
class FileBasicData : public FileDataCacheEntry
{
DWORD dwFileAttributes;
FILETIME ftCreationTime;
FILETIME ftLastAccessTime;
FILETIME ftLastWriteTime;
unsigned __int64 size;
public:
enum
{
TypeId = 42
}
virtual void Initialize(FileData& input)
{
// Get file attributes and friends...
}
DWORD GetAttributes() const;
bool IsArchive() const;
bool IsCompressed() const;
bool IsDevice() const;
// More methods here
};
int main()
{
// Example use
FileData fd;
FileBasicData& data = fd.Get<FileBasicData>();
// etc
}
For some reason though, this design feels wrong to me, namely because it's doing a whole bunch of things with untyped pointers. Am I severely off base here? Are there preexisting libraries (boost or otherwise) which would make this clearer/easier to understand?
As ergosys said already, std::map is the obvious solution to your problem. But I can see you concerns with RTTI (and the associated bloat). As a matter of fact, an "any" value container does not need RTTI to work. It is sufficient to provide a mapping between a type and an unique identifier. Here is a simple class that provides this mapping:
#include <stdexcept>
#include <boost/shared_ptr.hpp>
class typeinfo
{
private:
typeinfo(const typeinfo&);
void operator = (const typeinfo&);
protected:
typeinfo(){}
public:
bool operator != (const typeinfo &o) const { return this != &o; }
bool operator == (const typeinfo &o) const { return this == &o; }
template<class T>
static const typeinfo & get()
{
static struct _ti : public typeinfo {} _inst;
return _inst;
}
};
typeinfo::get<T>()
returns a reference to a simple, stateless singleton which allows comparisions.
This singleton is created only for types T where typeinfo::get< T >() is issued anywhere in the program.
Now we are using this to implement a top type we call value
. value
is a holder for a value_box
which actually contains the data:
class value_box
{
public:
// returns the typeinfo of the most derived object
virtual const typeinfo& type() const =0;
virtual ~value_box(){}
};
template<class T>
class value_box_impl : public value_box
{
private:
friend class value;
T m_val;
value_box_impl(const T &t) : m_val(t) {}
virtual const typeinfo& type() const
{
return typeinfo::get< T >();
}
};
// specialization for void.
template<>
class value_box_impl<void> : public value_box
{
private:
friend class value_box;
virtual const typeinfo& type() const
{
return typeinfo::get< void >();
}
// This is an optimization to avoid heap pressure for the
// allocation of stateless value_box_impl<void> instances:
void* operator new(size_t)
{
static value_box_impl<void> inst;
return &inst;
}
void operator delete(void* d)
{
}
};
Here's the bad_value_cast exception:
class bad_value_cast : public std::runtime_error
{
public:
bad_value_cast(const char *w="") : std::runtime_error(w) {}
};
And here's value:
class value
{
private:
boost::shared_ptr<value_box> m_value_box;
public:
// a default value contains 'void'
value() : m_value_box( new value_box_impl<void>() ) {}
// embedd an object of type T.
template<class T>
value(const T &t) : m_value_box( new value_box_impl<T>(t) ) {}
// get the typeinfo of the embedded object
const typeinfo & type() const { return m_value_box->type(); }
// convenience type to simplify overloading on return values
template<class T> struct arg{};
template<class T>
T convert(arg<T>) const
{
if (type() != typeinfo::get<T>())
throw bad_value_cast();
// this is safe now
value_box_impl<T> *impl=
static_cast<value_box_impl<T>*>(m_value_box.get());
return impl->m_val;
}
void convert(arg<void>) const
{
if (type() != typeinfo::get<void>())
throw bad_value_cast();
}
};
The convenient casting syntax:
template<class T>
T value_cast(const value &v)
{
return v.convert(value::arg<T>());
}
And that's it. Here is how it looks like:
#include <string>
#include <map>
#include <iostream>
int main()
{
std::map<std::string,value> v;
v["zero"]=0;
v["pi"]=3.14159;
v["password"]=std::string("swordfish");
std::cout << value_cast<int>(v["zero"]) << std::endl;
std::cout << value_cast<double>(v["pi"]) << std::endl;
std::cout << value_cast<std::string>(v["password"]) << std::endl;
}
The nice thing about having you own implementation of any
is, that you can very easily tailor it to the features you actually need, which is quite tedious with boost::any. For example, there are few requirements on the types that value can store: they need to be copy-constructible and have a public destructor. What if all types you use have an operator<<(ostream&,T) and you want a way to print your dictionaries? Just add a to_stream method to box and overload operator<< for value and you can write:
std::cout << v["zero"] << std::endl;
std::cout << v["pi"] << std::endl;
std::cout << v["password"] << std::endl;
Here's a pastebin with the above, should compile out of the box with g++/boost: http://pastebin.com/v0nJwVLW
EDIT: Added an optimization to avoid the allocation of box_impl< void > from the heap: http://pastebin.com/pqA5JXhA
You can create a hash or map of string to boost::any. The string key can be extracted from any::type().
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With