The Apache CarbonData community is pleased to announce the release of
version 1.6.1 in The Apache Software Foundation (ASF).
CarbonData is a high-performance data solution that supports various data
analytics scenarios, including BI analysis, ad-hoc SQL queries, fast filter
lookups on detail records, streaming analytics, and so on. CarbonData has
been deployed in many enterprise production environments. In one of the
largest scenarios, it supports queries on a single table with 3 PB of data
(more than 5 trillion records) with a response time of less than 3 seconds!
This release note provides information on the new features, improvements,
and bug fixes of this release.
What’s New in CarbonData Version 1.6.1?
The intention of CarbonData 1.6.1 was to move closer to unified analytics
and to improve stability. In this version of CarbonData, around 40 JIRA
tickets related to improvements and bugs have been resolved. The highlights
are summarized below.
Index Server performance improvements for Full Scan and TPCH Queries
Carbon currently prunes and caches all block/blocklet datamap index
information in the driver. If the cache grows very large (70-80% of the
driver memory), excessive GC in the driver can slow down queries, and the
driver may even run out of memory (OutOfMemory). Moving the indexes out to
a separate JDBCServer reduced the overhead on the primary JDBCServer, but
introduced a delay in fetching the bulk list of pruned blocks from the
Index Server. This release improves that step, and performance is now the
same as running without the Index Server.