[DISCUSSION] Hive and Presto Write support + Performance improvement
As you all know, Carbon has supported reading carbon tables from
Presto and Hive for a long time now, and it is high time that we start
supporting writes from Presto and Hive in version 2.0.0.
The development will be divided into 2 phases.
*Phase1 (Hive):*
*1. Support an OutputFormat (MapredCarbonOutputFormat) that allows the user
to write data in carbondata format from Hive.*
- Tables would be created in Spark until a way to create the schema
file from Hive is found.
- Tables would support the same folder structure as a transactional table.
- Carbon-specific DDL/DML would not be supported.
*2. Read performance should be better than or equivalent to ORC.*
*Phase2 (Presto): To be done later*
The tasks are the same as for Hive, and any update to the task list would be updated
Any suggestions from the community are appreciated.