Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to obfuscate string constants?

Tags:

We have an application which contains sensitive information and I'm trying my best to secure it. The sensitive information includes:

  1. The main algorithm
  2. The keys for an encryption/decryption algorithm

I've been looking at Obfuscating the code but it doesn't seem to help much as I can still decompile it. However, my biggest concern is that the keys used for encryption of serial numbers etc are clearly visible when you decompile the code, even when it's Obfuscated.

Can anyone suggest how I can secure these strings?

I realise one of the methods might be to remove any decryption from the application itself, while this may be possible in part, there are some features which have to use encryption/decryption - mainly to save a config file and to pass an 'authorisation' token to a DLL to perform a calculation.

like image 306
cusimar9 Avatar asked May 16 '11 13:05

cusimar9


People also ask

What is string obfuscation?

String obfuscation is an established technique used by proprietary, closed-source applications to protect intellectual property. Furthermore, it is also frequently used to hide spyware or malware in applications. In both cases, the techniques range from bit-manipulation over XOR operations to AES encryption.

How do you code obfuscated?

Encrypting some or all of a program's code is one obfuscation method. Other approaches include stripping out potentially revealing metadata, replacing class and variable names with meaningless labels and adding unused or meaningless code to an application script.

What is obfuscation technique?

Obfuscation techniques entail making a design or system more complicated to prevent RE, while also allowing the design or system to have the same functionality as the original.

What method is used to obfuscate the message?

Three of the most common techniques used to obfuscate data are encryption, tokenization, and data masking. Encryption, tokenization, and data masking work in different ways. Encryption and tokenization are reversible in that the original values can be derived from the obfuscated data.


1 Answers

There are ways to do what you want, but it isn't cheap and it isn't easy.

Is it worth it?

When looking at whether to protect software, we first have to answer a number of questions:

  1. How likely is this to happen?
  2. What is the value to someone else of your algorithm and data?
  3. What is the cost to them of buying a license to use your software?
  4. What is the cost to them of replicating your algorithm and data?
  5. What is the cost to them of reverse engineering your algorithm and data?
  6. What is the cost to you of protecting your algorithm and data?

If these produce a significant economic imperative to protect your algorithm/data then you should look into doing it. For instance if the value of the service and cost to customers are both high, but the cost of reverse engineering your code is much lower than the cost of developing it themselves, then people may attempt it.

So, this leads on to your question

  • How do you secure your algorithm and data?

Discouragement

Obfuscation

The option you suggest, obfuscating the code, messes with the economics above - it tries to significantly increase the cost to them (5 above) without increasing the cost to you (6) very much. The research by the Center for Encrypted Functionalities has done some interesting research on this. The problem is that as with DVD encryption it is doomed to failure if there is enough of a differential between 3, 4 and 5 then eventually someone will do it.

Detection

Another option might be a form of Steganography, which allows you to identify who decrypted your data and started distributing it. For instance, if you have 100 different float values as part of your data, and a 1bit error in the LSB of each of those values wouldn't cause a problem with your application, encode a unique (to each customer) identifier into those bits. The problem is, if someone has access to multiple copies of your application data, it would be obvious that it differs, making it easier to identify the hidden message.

Protection

SaaS - Software as a Service

A more secure option might be to provide the critical part of your software as a service, rather than include it in your application.

Conceptually, your application would collect up all of the data required to run your algorithm, package it up as a request to a server (controlled by you) in the cloud, your service would then calculate your results and pass it back to the client, which would display it.

This keeps all of your proprietary, confidential data and algorithms within a domain that you control completely, and removes any possibility of a client extracting either.

The obvious downside is that clients are tied into your service provision, are at the mercy of your servers and their internet connection. Unfortunately many people object to SaaS for exactly these reasons. On the plus side, they are always up to date with bug fixes, and your compute cluster is likely to be higher performance than the PC they are running the user interface on.

This would be a huge step to take though, and could have a huge cost 6 above, but is one of the few ways to keep your algorithm and data completely secure.

Software Protection Dongles

Although traditional Software Protection Dongles would protect from software piracy, they wouldn't protect against algorithms and data in your code being extracted.

Newer Code Porting dongles (such as SenseLock) appear to be able to do what you want though. With these devices, you take code out of your application and port it to the secure dongle processor. As with SaaS, your application would bundle up the data, pass it to the dongle (probably a USB device attached to your computer) and read back the results.

Unlike SaaS, data bandwidth would be unlikely to be an issue, but performance of your application may be limited by the performance of your SDP.

† This was the first example I could find with a google search.

Trusted platform

Another option, which may become viable in the future is to use a Trusted Platform Module and Trusted Execution Technology to secure critical areas of the code. Whenever a customer installs your software, they would provide you with a fingerprint of their hardware and you would provide them with a unlock key for that specific system.

This key would would then allow the code to be decrypted and executed within the trusted environment, where the encrypted code and data would be inaccessible outside of the trusted platform. If anything at all about the trusted environment changed, it would invalidate the key and that functionality would be lost.

For the customer this has the advantage that their data stays local, and they don't need to buy a new dongle to improve performance, but it has the potential to create an ongoing support requirement and the likelihood that your customers would become frustrated with the hoops they had to jump through to use software they have bought and paid for - losing you good will.

Conclusion

What you want to do is not simple or cheap. It could require a big investment in software, infrastructure or both. You need to know that it is worth the investment before you start along this road.

like image 123
Mark Booth Avatar answered Oct 03 '22 00:10

Mark Booth