[Discussion] Encryption support for carbondata files
*Background:* Currently, carbondata files are not encrypted. Anyone who has a
carbon reader can read the carbondata files.
If the data has sensitive information, that data can be encrypted with the
So, that along with carbon reader this key is required to decrypt and read
*Why encryption at file format level?*
As files generated by one application can be used by the other applications
Also, encrypting the data at the application level is a time-consuming process,
as we have a very large amount of data.
and whole carbondata files need to be encrypted from application. This is
Only the columns that have sensitive data need to be encrypted if we support
encryption at the file format level, so that we can have column level
*Note:* Keep in mind that encryption needs extra CPU for crypto key
computation, and decryption also takes time.
So loading and query time will be affected if the user chooses to encrypt the data.
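To make the column-level idea and its cost more concrete, here is a minimal sketch in plain Java, assuming AES-GCM from the standard `javax.crypto` package. The class `ColumnEncryptionSketch`, its method names, and the in-memory column map are hypothetical illustrations and not CarbonData APIs; key management is out of scope here.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: encrypts only the columns marked as sensitive.
final class ColumnEncryptionSketch {
    private static final int IV_BYTES = 12;  // nonce size commonly used with GCM
    private static final int TAG_BITS = 128; // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh 128-bit AES key (assumption: a real system would load it from a key store).
    public static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns IV followed by ciphertext and tag, so the reader can recover the IV.
    public static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plain);
            return ByteBuffer.allocate(iv.length + body.length).put(iv).put(body).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the first 12 bytes and decrypts the rest.
    public static byte[] decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts sensitive columns and passes all other columns through unchanged.
    public static Map<String, byte[]> writeColumns(
            Map<String, byte[]> columns, Set<String> sensitive, SecretKey key) {
        Map<String, byte[]> out = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : columns.entrySet()) {
            boolean protect = sensitive.contains(e.getKey());
            out.put(e.getKey(), protect ? encrypt(key, e.getValue()) : e.getValue());
        }
        return out;
    }
}
```

In this sketch only the sensitive columns pay the CPU cost, and each encrypted value grows by 28 bytes (12-byte IV plus 16-byte tag), which is one reason to encrypt at chunk granularity rather than per value.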
*So, how many of you think this feature has a real-world use case and that
carbon should have it?*
Based on the need of this feature, I can go ahead and explore the
Re: [Discussion] Encryption support for carbondata files
+1 for starting this discussion and +1 for the new feature. It is good to
have a feature like encryption from a security point of view.
During further analysis and design, you can consider the following points.
1. Partial Encryption (column based encryption with configurable encryption
2. Complete file encryption
3. Different encryption algorithm for footer metadata and actual data.
> Instead of supporting encryption, I think carbondata can provide another,
> more general feature:
> A framework that supports hooks while reading/writing a column chunk.
> Users can specify the hooks while creating the table and implement the
> feature they need as a specific instance.
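One way to picture the hook proposal quoted above is the following sketch in plain Java. The names `ColumnChunkHook`, `MaskingHook`, and `HookChain` are hypothetical and do not exist in CarbonData; the XOR mask is only a stand-in transform and is not encryption, so a real hook would call a cipher instead.

```java
import java.util.List;
import java.util.Set;

// Hypothetical hook contract: one callback per direction, applied to a raw column chunk.
interface ColumnChunkHook {
    byte[] onWrite(String column, byte[] chunk);
    byte[] onRead(String column, byte[] chunk);
}

// Placeholder hook: XORs every byte with a fixed mask. NOT secure; illustration only.
final class MaskingHook implements ColumnChunkHook {
    private final Set<String> columns;
    private final byte mask;

    public MaskingHook(Set<String> columns, byte mask) {
        this.columns = columns;
        this.mask = mask;
    }

    @Override
    public byte[] onWrite(String column, byte[] chunk) {
        return columns.contains(column) ? xor(chunk) : chunk;
    }

    @Override
    public byte[] onRead(String column, byte[] chunk) {
        // XOR is its own inverse, so reading applies the same transform.
        return columns.contains(column) ? xor(chunk) : chunk;
    }

    private byte[] xor(byte[] chunk) {
        byte[] out = new byte[chunk.length];
        for (int i = 0; i < chunk.length; i++) {
            out[i] = (byte) (chunk[i] ^ mask);
        }
        return out;
    }
}

// Applies hooks in order on write and in reverse order on read.
final class HookChain {
    private final List<ColumnChunkHook> hooks;

    public HookChain(List<ColumnChunkHook> hooks) {
        this.hooks = hooks;
    }

    public byte[] write(String column, byte[] chunk) {
        for (ColumnChunkHook hook : hooks) {
            chunk = hook.onWrite(column, chunk);
        }
        return chunk;
    }

    public byte[] read(String column, byte[] chunk) {
        for (int i = hooks.size() - 1; i >= 0; i--) {
            chunk = hooks.get(i).onRead(column, chunk);
        }
        return chunk;
    }
}
```

Under this design, encryption would be just one hook implementation among others, configured per table, and the reverse-order read keeps stacked hooks symmetric.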
Encryption algorithms are CPU intensive. Is there any analysis to guarantee
that performance will not be impacted? Does any other file format already
support this, and what is the motivating real-world use case behind this support?