[Discussion] Supporting Hive Metastore in Presto CarbonData.
The current Carbon Presto integration adds a new Presto connector that takes
the carbon store folder and lists the databases and tables from its folders.
This implementation has several issues:
1. The DB and table folders always need to be in a specific order, and the
folder names must always match the DB name and table name.
2. A table created in Presto is not reflected directly in other execution
engines like Spark.
3. A DB with a location or a table with a location does not work.
4. There is no access control on tables.
5. There is no interoperability between Hive tables such as ORC or Parquet
and Carbon. For example, joining a Hive table with a Carbon table is not
possible.
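
To make issues 1 and 3 concrete, here is a minimal sketch of folder-based
discovery. It is not the actual connector code: the class FolderCatalog and
its methods are hypothetical names, and it only assumes a store layout of
<store>/<db>/<table>. Because names come purely from the directory tree, a
table stored at any other location is never found.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration only; not taken from the Carbon Presto connector.
public class FolderCatalog {
    // Returns the names of the immediate subdirectories of dir, sorted.
    public static List<String> subFolders(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> children = Files.list(dir)) {
            children.filter(Files::isDirectory)
                    .forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    // Every folder directly under the store root is treated as a database.
    public static List<String> listDatabases(Path storeRoot) throws IOException {
        return subFolders(storeRoot);
    }

    // Every folder under <store>/<db> is treated as a table of that database.
    public static List<String> listTables(Path storeRoot, String database) throws IOException {
        return subFolders(storeRoot.resolve(database));
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway store with one database and two tables.
        Path store = Files.createTempDirectory("carbon-store");
        Files.createDirectories(store.resolve("sales").resolve("orders"));
        Files.createDirectories(store.resolve("sales").resolve("customers"));
        System.out.println(listDatabases(store));
        System.out.println(listTables(store, "sales"));
    }
}
```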
To overcome the above limitations, we can support the Hive Metastore in
Presto Carbon. Basically, instead of creating a new Presto connector for
Carbon, we can extend the HiveConnector and add a new
CarbonPageSourceFactory for reading the data and a FileWriterFactory for
writing the data. Carbon then becomes one of the Hive-supported formats
for Presto. Tables added in Spark will be reflected immediately in Presto
Carbon, and the limitations mentioned above will be solved with this type
of implementation.
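
The idea can be sketched roughly as follows. This is a simplified model, not
the Presto API: TableInfo, PageSourceFactory, the string "readers", and the
format names are all hypothetical stand-ins. It only shows the shape of the
proposal, where table metadata comes from the metastore and the reader is
chosen by the table's storage format, so Carbon sits alongside ORC.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical model of format-based dispatch; not real Presto classes.
public class FormatDispatchSketch {
    // Stand-in for table metadata as it would come from the Hive Metastore.
    public static final class TableInfo {
        final String name;
        final String storageFormat;
        final String location;

        public TableInfo(String name, String storageFormat, String location) {
            this.name = name;
            this.storageFormat = storageFormat;
            this.location = location;
        }
    }

    // A factory returns a reader only for the format it understands.
    public interface PageSourceFactory {
        Optional<String> createPageSource(TableInfo table);
    }

    public static final class OrcPageSourceFactory implements PageSourceFactory {
        public Optional<String> createPageSource(TableInfo table) {
            return "orc".equals(table.storageFormat)
                    ? Optional.of("orc-reader:" + table.location)
                    : Optional.empty();
        }
    }

    // The new piece in the proposal: a Carbon reader registered next to the others.
    public static final class CarbonPageSourceFactory implements PageSourceFactory {
        public Optional<String> createPageSource(TableInfo table) {
            return "carbondata".equals(table.storageFormat)
                    ? Optional.of("carbon-reader:" + table.location)
                    : Optional.empty();
        }
    }

    // Tries each registered factory in turn and uses the first one that accepts the table.
    public static String open(List<PageSourceFactory> factories, TableInfo table) {
        for (PageSourceFactory factory : factories) {
            Optional<String> source = factory.createPageSource(table);
            if (source.isPresent()) {
                return source.get();
            }
        }
        throw new IllegalArgumentException("Unsupported format: " + table.storageFormat);
    }

    public static void main(String[] args) {
        List<PageSourceFactory> factories =
                Arrays.asList(new OrcPageSourceFactory(), new CarbonPageSourceFactory());
        // The location comes from metadata, so it need not match the table name.
        TableInfo carbon = new TableInfo("orders", "carbondata", "/data/anywhere/orders");
        TableInfo orc = new TableInfo("customers", "orc", "/warehouse/customers");
        System.out.println(open(factories, carbon));
        System.out.println(open(factories, orc));
    }
}
```

Since the location is read from metadata rather than derived from folder
names, this shape would also cover a DB or table with a custom location.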
Re: [Discussion] Supporting Hive Metastore in Presto CarbonData.
I really agree with your opinion. In hive-integration, create a
HiveConnector extension to support the Hive Metastore in Presto CarbonData.
I think we could create a HiveConnector, which implements Connector, through
CarbondataConnectorFactory. I'm familiar with the Hive integration, which
could help me develop this extension with you. Will you do development cooperation